Research on the discovery of terms from corpora has focused on word sequences whose recurrent occurrence in a corpus is indicative of their terminological status, and has not addressed the issue of discovering terms when data is sparse. This becomes apparent in the case of noun compounding, which is extremely productive: more than half of the candidate compounds extracted from a corpus are attested only once. We show how evidence about established (i.e.
A novel video smoke detection method using both color and motion
features is presented. The result of optical flow is assumed to be an approximation of
the motion field. Background estimation and a color-based decision rule are used to determine
candidate smoke regions. The Lucas-Kanade optical flow algorithm is used
to calculate the optical flow of the candidate regions. The motion features are then calculated
from the optical flow results and used to differentiate smoke from some other
A Study in Scarlet was Sherlock Holmes' first outing into the literary world. Published in 1887 (after many rejections) it was an immediate success. Conan Doyle's quirky hero, with his cold deductive mind, violin playing and cocaine addiction, fascinated the reading public, and laid the foundation for the many Sherlock Holmes books and short stories that were to follow over the next three decades.
In this study, a novel approach to robust dialogue act detection for error-prone speech recognition in a spoken dialogue system is proposed. First, partial sentence trees are proposed to represent a speech recognition output sentence. Semantic information and the derivation rules of the partial sentence trees are extracted and used to model the relationship between the dialogue acts and the derivation rules.
Despite the rising interest in developing grammatical error detection systems for non-native speakers of English, progress in the field has been hampered by a lack of informative metrics and an inability to directly compare the performance of systems developed by different researchers. In this paper we address these problems by presenting two evaluation methodologies, both based on a novel use of crowdsourcing.
We investigate the unsupervised detection of semi-fixed cue phrases such as “This paper proposes a novel approach...” from unseen text, on the basis of only a handful of seed cue phrases with the desired semantics. The problem, in contrast to bootstrapping approaches for Question Answering and Information Extraction, is that it is hard to find a constraining context for occurrences of semi-fixed cue phrases. Our method uses components of the cue phrase itself, rather than external context, to bootstrap. ...
An approach to automatic detection of syllable boundaries is presented. We demonstrate the use of several manually constructed grammars trained with a novel algorithm combining the advantages of treebank and bracketed corpora training. We investigate the effect of the training corpus size on the performance of our system. The evaluation shows that a hand-written grammar performs better on finding syllable boundaries than does a treebank grammar.
In this paper, we extend the work on using latent cross-language topic models for identifying word translations across comparable corpora. We present a novel precision-oriented algorithm that relies on per-topic word distributions obtained by the bilingual LDA (BiLDA) latent topic model. The algorithm aims at harvesting only the most probable word translations across languages in a greedy fashion, without any prior knowledge about the language pair, relying on a symmetrization process and the one-to-one constraint.
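The greedy, symmetrized, one-to-one harvesting described in this abstract can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the use of cosine similarity over per-topic word probabilities, the `min_sim` threshold, and the stopping rule are all hypothetical. Topics are given as plain dictionaries mapping a word to its probability under that topic, with source and target topics aligned by index.

```python
import math


def topic_vector(word, topics):
    """Unit-length vector of the word's probability under each topic."""
    vec = [topic.get(word, 0.0) for topic in topics]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


def cosine(u, v):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(a * b for a, b in zip(u, v))


def harvest_translations(src_topics, tgt_topics, min_sim=0.9):
    """Greedily collect one-to-one word pairs (hypothetical sketch).

    In each round a pair is accepted only if the two words are each
    other's best match (symmetrization) and their similarity reaches
    min_sim. Accepted words are removed from both vocabularies, which
    enforces the one-to-one constraint.
    """
    src_left = {w for topic in src_topics for w in topic}
    tgt_left = {w for topic in tgt_topics for w in topic}
    src_vecs = {w: topic_vector(w, src_topics) for w in src_left}
    tgt_vecs = {w: topic_vector(w, tgt_topics) for w in tgt_left}

    def sim(s, t):
        return cosine(src_vecs[s], tgt_vecs[t])

    pairs = {}
    while src_left and tgt_left:
        # sorted() only makes tie-breaking deterministic.
        best_tgt = {s: max(sorted(tgt_left), key=lambda t: sim(s, t))
                    for s in src_left}
        best_src = {t: max(sorted(src_left), key=lambda s: sim(s, t))
                    for t in tgt_left}
        accepted = [(s, t) for s, t in best_tgt.items()
                    if best_src[t] == s and sim(s, t) >= min_sim]
        if not accepted:
            break  # no confident mutual-best pair remains
        for s, t in accepted:
            pairs[s] = t
            src_left.discard(s)
            tgt_left.discard(t)
    return pairs
```

Raising `min_sim` trades recall for precision, which is the direction the abstract emphasizes. The actual similarity measure and confidence criterion of the paper may differ from this sketch.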
We apply topic modelling to automatically induce word senses of a target word, and demonstrate that our word sense induction method can be used to automatically detect words with emergent novel senses, as well as token occurrences of those senses. We start by exploring the utility of standard topic models for word sense induction (WSI), with a pre-determined number of topics (=senses). We next demonstrate that a non-parametric formulation that learns an appropriate number of senses per word actually performs better at the WSI task. ...
In pro-drop languages, the detection of explicit subjects, zero subjects and nonreferential impersonal constructions is crucial for anaphora and co-reference resolution. While the identification of explicit and zero subjects has attracted the attention of researchers in the past, the automatic identification of impersonal constructions in Spanish has not been addressed yet and this work is the first such study.
This paper introduces a novel generation system that composes humanlike descriptions of images from computer vision detections. By leveraging syntactically informed word co-occurrence statistics, the generator filters and constrains the noisy detections output from a vision system to generate syntactic trees that detail what the computer vision system sees. Results show that the generation system outperforms state-of-the-art systems, automatically generating some of the most natural image descriptions to date. ...
In information retrieval, genre classification could enable users to sort search results according to their immediate interests. People who go into a bookstore or library are not usually looking simply for information about a particular topic, but rather have requirements of genre as well: they are looking for scholarly articles about hypnotism, novels about the French Revolution, editorials about the supercollider, and so forth.
Mining retrospective events from text streams has been an important research topic. The classic text representation model (i.e., the vector space model) cannot model temporal aspects of documents. To address this, we propose a novel burst-based text representation model, denoted as BurstVSM. BurstVSM maps dimensions to bursty features instead of terms, which allows it to capture both semantic and temporal information.
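One possible reading of a burst-based representation is sketched below in Python. It is an illustration under stated assumptions, not the BurstVSM method itself: the burst detector (a simple mean plus `k` standard deviations threshold), the function names, and the encoding of a bursty feature as a `(term, start, end)` tuple are all hypothetical. Each dimension of a document vector corresponds to a term together with a time interval in which that term is bursty, so a term contributes only when the document's timestamp falls inside one of its bursts.

```python
from collections import Counter
from statistics import mean, pstdev


def detect_bursts(counts, k=1.0):
    """Return (start, end) index ranges where counts exceed mean + k * std.

    `counts` holds one term's frequency per time window. This threshold
    rule is an assumed stand-in for a real burst detection algorithm.
    """
    threshold = mean(counts) + k * pstdev(counts)
    bursts, start = [], None
    for i, c in enumerate(counts):
        if c > threshold and start is None:
            start = i  # a burst begins
        elif c <= threshold and start is not None:
            bursts.append((start, i - 1))  # the burst ended at the previous window
            start = None
    if start is not None:
        bursts.append((start, len(counts) - 1))
    return bursts


def burst_vector(doc_terms, doc_time, term_bursts):
    """Represent a document over bursty features instead of plain terms.

    A term occurrence is counted only if doc_time lies inside one of the
    term's burst intervals; the dimension is the (term, start, end) tuple.
    """
    vec = Counter()
    for term in doc_terms:
        for start, end in term_bursts.get(term, []):
            if start <= doc_time <= end:
                vec[(term, start, end)] += 1
    return dict(vec)
```

Under this sketch, two documents that share a term but were written in different bursts of that term end up with no common dimension, which is how the representation carries temporal as well as semantic information.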
We address a core aspect of the multilingual content synchronization task: the identification of novel, more informative or semantically equivalent pieces of information in two documents about the same topic. This can be seen as an application-oriented variant of textual entailment recognition where: i) T and H are in different languages, and ii) entailment relations between T and H have to be checked in both directions.
In another recent study by our group (Gürlek et al. 2009) similar
salivary ICTP levels were detected in smoker, non-smoker and ex-smoker patient groups
with similar clinical periodontal findings. Smoking status was confirmed by salivary
cotinine analysis but there was no clinically healthy control group in that study and the
number of teeth present, average probing depths and attachment levels were all similar in
the three study groups. There were no significant differences in saliva ICTP concentrations
between the smoker and non-smoker patient groups.
This paper introduces a novel multi-user receiver, the Hopfield network receiver, which combines the fast convergence of a neural network with an asymptotically optimum algorithm.
Recently, spread spectrum communication systems have received wide attention because they have become easier to implement.
When left to himself, however, he would seldom produce any music or attempt any recognized air. Leaning back in his arm-chair of an evening, he would close his eyes and scrape carelessly at the fiddle which was thrown across his knee.”
The Moonstone (1868) by Wilkie Collins is a 19th-century British epistolary novel, generally considered the first detective novel in the English language. The story was originally serialized in Charles Dickens' magazine All the Year Round. The Moonstone and The Woman in White are considered Wilkie Collins' best novels. Besides creating many of the characteristics of detective novels, The Moonstone also represented Collins' social opinions by his treatment of the Indians and the servants in the novel.
We normally think of the eye as an organ for vision, but due to the discovery of
additional nerve connections from recently-detected novel photoreceptor cells in
the eye to the brain, it is now understood how light also mediates and controls a
large number of biochemical processes in the human body. The most important
findings are related to the control of the biological clock and to the regulation of
some important hormones through regular light-dark rhythms. This in turn means
that lighting has a large influence on health, well-being and alertness.
Few fields of medicine have witnessed such impressive progress as the diagnosis and
treatment of liver tumors. Advances in imaging technology, the development of novel
contrast agents, and the introduction of optimized scanning protocols have greatly
facilitated the non-invasive detection and characterization of focal liver lesions. Furthermore,
image-guided techniques for percutaneous tumor ablation have become an
accepted alternative treatment for patients with inoperable liver cancer.