Semantic text similarity

Showing 1-19 of 19 results for Semantic text similarity
  • Semantic matching has received much attention in the database, AI, KDD, Web, and Semantic Web communities. Recently, many works have also applied deep learning (DL) to semantic matching. In this paper we survey this fast growing topic. We define the semantic matching problem, categorize its variations into a taxonomy, and describe important applications. We describe DL solutions for important variations of semantic matching. Finally, we discuss future R&D directions.

    pdf38p spiritedaway36 25-11-2021 12 0   Download

  • The Human Phenotype Ontology (HPO) is one of the most popular bioinformatics resources. Recently, HPO-based phenotype semantic similarity has been effectively applied to model patient phenotype data. However, the existing tools are adapted from Gene Ontology (GO)-based term similarity, and the design of their models is not optimized for the unique features of HPO. In addition, existing tools only allow HPO terms as input and only provide pure text-based outputs.

    pdf9p vitzuyu2711 29-09-2021 19 1   Download

  • Semantic similarity measures estimate the similarity between concepts and play an important role in many text processing tasks. Approaches to semantic similarity in the biomedical domain can be roughly divided into knowledge-based and distributional methods.

    pdf13p viwyoming2711 16-12-2020 6 1   Download

  • Neural network-based embedding models are receiving significant attention in the field of natural language processing because they can effectively capture semantic information, representing words, sentences, or even larger text elements in a low-dimensional vector space.

    pdf10p vijisoo2711 27-10-2020 13 1   Download

  • Biomedical literature concerns a wide range of concepts, requiring controlled vocabularies to maintain a consistent terminology across different research groups. However, as new concepts are introduced, biomedical literature is prone to ambiguity, specifically in fields that are advancing more rapidly, for example, drug design and development.

    pdf12p vicolorado2711 23-10-2020 12 1   Download

  • Many disease-causing genes have been identified through different methods, but there are as yet no uniform biomedical named entity (bio-NE) annotations of the disease phenotypes of these genes. Furthermore, semantic similarity comparison between two bio-NE annotations has become important for data integration and system genetics analysis.

    pdf14p vicolorado2711 22-10-2020 10 0   Download

  • Social media networks have evolved into a large repository of short documents, which makes it more challenging to retrieve content from them effectively. Several factors are involved, such as the restricted length of the content, informal use of language (e.g., slang, abbreviations, styles), and the low contextualization of user-generated content. To address these problems, recent studies on context-based information searching have been built on adding semantics to user-generated content in an existing knowledge base.

    pdf20p kequaidan1 05-11-2019 19 2   Download

  • This approach is based on text snippets and page counts. These two measures are taken from the results of a search engine like Google. To achieve the aim of this paper, lexical patterns are extracted from text snippets and word co-occurrence measures are defined using page counts. The results of these two are combined.

    pdf6p byphasse043256 24-03-2019 17 0   Download

  • This paper proposes a method for measuring semantic similarity between words as a new tool for text analysis. The similarity is measured on a semantic network constructed systematically from a subset of the English dictionary, LDOCE (Longman Dictionary of Contemporary English). Spreading activation on the network can directly compute the similarity between any two words in the Longman Defining Vocabulary, and indirectly the similarity of all the other words in LDOCE.

    pdf8p buncha_1 08-05-2013 51 2   Download

  • In this paper, we explore unsupervised techniques for the task of automatic short answer grading. We compare a number of knowledge-based and corpus-based measures of text similarity, evaluate the effect of domain and size on the corpus-based measures, and also introduce a novel technique to improve the performance of the system by integrating automatic feedback from the student answers. Overall, our system significantly and consistently outperforms other unsupervised methods for short answer grading that have been proposed in the past. ...

    pdf9p bunthai_1 06-05-2013 40 2   Download

  • A method of determining the similarity of nouns on the basis of a metric derived from the distribution of subject, verb and object in a large text corpus is described. The resulting quasi-semantic classification of nouns demonstrates the plausibility of the distributional hypothesis, and has potential application to a variety of tasks, including automatic indexing, resolving nominal compounds, and determining the scope of modification. 1. INTRODUCTION A variety of linguistic relations apply to sets of semantically similar words. ...

    pdf8p bungio_1 03-05-2013 51 1   Download

  • This paper proposes a new indicator of text structure, called the lexical cohesion profile (LCP), which locates segment boundaries in a text. A text segment is a coherent scene; the words in a segment are linked together via lexical cohesion relations. LCP records mutual similarity of words in a sequence of text. The similarity of words, which represents their cohesiveness, is computed using a semantic network. Comparison with the text segments marked by a number of subjects shows that LCP closely correlates with the human judgments.

    pdf3p bunmoc_1 20-04-2013 39 3   Download

  • In this paper we present a method to group adjectives according to their meaning, as a first step towards the automatic identification of adjectival scales. We discuss the properties of adjectival scales and of groups of semantically related adjectives and how they imply sources of linguistic knowledge in text corpora. We describe how our system exploits this linguistic knowledge to compute a measure of similarity between two adjectives, using statistical techniques and without having access to any semantic information about the adjectives. ...

    pdf11p bunmoc_1 20-04-2013 56 1   Download

  • For a very long time, it has been considered that the only way of automatically extracting similar groups of words from a text collection for which no semantic information exists is to use document co-occurrence data. But, with robust syntactic parsers that are becoming more frequently available, syntactically recognizable phenomena about word usage can be confidently noted in large collections of texts.

    pdf3p bunmoc_1 20-04-2013 31 1   Download

  • Bootstrapping semantics from text is one of the greatest challenges in natural language learning. We first define a word similarity measure based on the distributional pattern of words. The similarity measure allows us to construct a thesaurus using a parsed corpus. We then present a new evaluation methodology for the automatically constructed thesaurus. The evaluation results show that the thesaurus is significantly closer to WordNet than Roget Thesaurus is.

    pdf7p bunrieu_1 18-04-2013 46 2   Download

  • In this paper, we present a method for the semantic tagging of word chunks extracted from a written transcription of conversations. This work is part of an ongoing project for an information extraction system in the field of maritime Search And Rescue (SAR). Our purpose is to automatically annotate parts of texts with concepts from a SAR ontology. Our approach combines two knowledge sources, a SAR ontology and the Wordsmyth dictionary-thesaurus, and it uses a similarity measure for the classification.

    pdf8p bunbo_1 17-04-2013 39 2   Download

  • In this paper we propose a domain-independent text segmentation method, which consists of three components. Latent Dirichlet allocation (LDA) is employed to compute the semantic distribution of words, and we measure semantic similarity by the Fisher kernel. Finally, the globally best segmentation is obtained by dynamic programming. Experiments on Chinese data sets show that the technique can be effective. By introducing latent semantic information, our algorithm is robust on irregular-sized segments.

    pdf4p hongphan_1 15-04-2013 42 2   Download

  • Hand-coded scripts were used in the 1970s and 1980s as knowledge backbones that enabled inference and other NLP tasks requiring deep semantic knowledge. We propose unsupervised induction of similar schemata called narrative event chains from raw newswire text. A narrative event chain is a partially ordered set of events related by a common protagonist. We describe a three-step process for learning narrative event chains. The first uses unsupervised distributional methods to learn narrative relations between events sharing coreferring arguments.

    pdf9p hongphan_1 15-04-2013 53 1   Download

  • In this paper, we present a unified model for the automatic induction of word senses from text, and the subsequent disambiguation of particular word instances using the automatically extracted sense inventory. The induction step and the disambiguation step are based on the same principle: words and contexts are mapped to a limited number of topical dimensions in a latent semantic word space.

    pdf10p hongdo_1 12-04-2013 37 2   Download
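
One of the results above describes defining word co-occurrence measures from search-engine page counts. The abstract does not say which measures it uses; as a minimal sketch, assuming a Jaccard coefficient over page counts, the idea might look like the following. The function name `web_jaccard`, its parameters, and the `min_cooccurrence` cutoff are hypothetical, and the counts in the usage line are made-up numbers for illustration, not real search results.

```python
def web_jaccard(count_p: int, count_q: int, count_pq: int,
                min_cooccurrence: int = 0) -> float:
    """Jaccard coefficient from page counts (an assumed measure, not the paper's).

    count_p:  page count for the query "P"
    count_q:  page count for the query "Q"
    count_pq: page count for the conjunctive query "P AND Q"
    min_cooccurrence: hypothetical cutoff; joint counts at or below it
                      are treated as noise and scored 0.0
    """
    if count_pq <= min_cooccurrence:
        return 0.0
    # Pages mentioning P or Q (or both), by inclusion-exclusion.
    union = count_p + count_q - count_pq
    if union <= 0:
        return 0.0
    return count_pq / union


if __name__ == "__main__":
    # Illustrative, made-up counts: 100 pages for P, 50 for Q, 25 for both.
    print(web_jaccard(100, 50, 25))
```

A score near 1 would mean the two words almost always appear on the same pages, and a score near 0 that they rarely co-occur. The abstract combines such page-count measures with lexical patterns extracted from snippets, which this sketch does not attempt.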
