This paper deals with the task of finding generally applicable substitutions for a given input term. We show that the output of a distributional similarity system baseline can be filtered to obtain terms that are not simply similar but frequently substitutable. Our filter relies on the fact that when two terms are in a common entailment relation, it should be possible to substitute one for the other in their most frequent surface contexts.
This design guide provides an overview of the Enterprise Branch Architecture, which is one component
in the overall Cisco Service-Oriented Network Architecture (SONA). SONA is a comprehensive
framework to provide guidelines to accelerate applications, business processes, and profitability. Based
on the Cisco SONA framework, the Enterprise Branch Architecture incorporates networked
infrastructure services, integrated services, and application networking services across typical branch
The project baseline, which is the focus of Chapter 7, arguably falls into place when
planning is complete and the team members have agreed all the scheduled dates. At
this juncture, values are stored, and these include the agreed tasks; the scheduled
start and finish dates for the tasks; team members who will be responsible for
scheduled tasks; and the budgeted cost. All of this is done with the performance,
cost, time and scope (PCTS) of the project in mind.
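As an illustration only, the values stored in a baseline could be captured in a small record like the one below. The names `BaselineTask` and `total_budget`, and the sample tasks, are hypothetical and are not taken from any particular scheduling tool.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class BaselineTask:
    """One agreed task as frozen into the baseline (hypothetical structure)."""
    name: str
    start: date           # scheduled start date
    finish: date          # scheduled finish date
    owner: str            # team member responsible for the task
    budgeted_cost: float  # budgeted cost for the task


def total_budget(tasks):
    """Sum the budgeted cost over all baseline tasks."""
    return sum(t.budgeted_cost for t in tasks)


# Made-up sample data, for illustration only.
baseline = [
    BaselineTask("Design", date(2024, 1, 8), date(2024, 1, 19), "Ana", 4000.0),
    BaselineTask("Build", date(2024, 1, 22), date(2024, 2, 16), "Raj", 9500.0),
]
```

Marking the record as frozen reflects the idea that baseline values are stored once planning is agreed and are then compared against, not overwritten.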
From John Glenn's mission to orbit Earth to the
International Space Station program, space food
research has met the challenge of providing food
that tastes good and travels well in space. To better understand
this process, we can look back through history.
Explorers have always had to face the problem of how to
carry enough food for their journeys. Whether those
explorers are onboard a sailing ship or on the Space
Shuttle, adequate storage space has been a problem.
The Small Business Innovation Research (SBIR) program was created in 1982 by the Small Business
Innovation Development Act. The program is designed to stimulate technology innovation by small businesses,
provide technical and scientific solutions to challenging problems, and encourage the marketing of the resulting
new technologies in the private sector. Federal agencies with more than $100 million in extramural research and
development (R&D) are required to allocate 2.5 percent of their research budgets to small businesses. Such funds
from all federal agencies amounted to approximately $1.
This book is the work of a great team. First I’d like to thank my editor Suzanne Goraj for her
excellent job on the editing process. The production editor Elizabeth Campbell was always
a pleasure to work with and kept the book moving along and on schedule. Thanks also to
technical editor Donald Fuller for his thorough edit and for keeping me honest.
I would like to thank Neil Edde, associate publisher, and James Chellis, who have both helped
develop and nurture the MCSE series of books since the beginning.
Hungarian is a stereotype of morphologically rich and non-configurational languages. Here, we introduce results on dependency parsing of Hungarian that employ an 80K, multi-domain, fully manually annotated corpus, the Szeged Dependency Treebank. We show that the results achieved by state-of-the-art data-driven parsers on Hungarian and English (which is at the other end of the configurational-nonconfigurational spectrum) are quite similar to each other in terms of attachment scores.
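Attachment scores are conventionally the fraction of tokens whose predicted head matches the gold annotation (unlabeled, UAS) or whose head and dependency label both match (labeled, LAS). The sketch below assumes that standard definition; the abstract does not say which variant or punctuation handling was used, and the function name and toy sentence are illustrative.

```python
def attachment_scores(gold, pred):
    """Return (UAS, LAS) for parallel lists of (head_index, label) pairs."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of tokens")
    if not gold:
        raise ValueError("cannot score an empty sentence")
    # UAS counts tokens with the correct head; LAS also requires the label.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    full_hits = sum(g == p for g, p in zip(gold, pred))
    n = len(gold)
    return head_hits / n, full_hits / n


# Toy four-token sentence (made-up data); head 0 denotes the root.
gold = [(2, "nsubj"), (0, "root"), (4, "det"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "det"), (2, "obl")]
uas, las = attachment_scores(gold, pred)
```

In this toy case three of four heads are correct and two of four tokens have both head and label correct.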
FA is a binary variable that indicates whether or not the households of the municipalities had received
any payments from the Familias en Accion program in June 2002, the start date of the survey under study.
Households in some of the municipalities had started to receive payments more than six months before
the baseline interview, but for most of them it was just six months before the interview. The short time
that the program had been operating should be taken into account when analyzing this variable,
especially when the dependent variable is height for age or leg length as they...
We present the results of a large-scale, end-to-end human evaluation of various sentiment summarization models. The evaluation shows that users have a strong preference for summarizers that model sentiment over non-sentiment baselines, but have no broad overall preference between any of the sentiment-based models. However, an analysis of the human judgments suggests that there are identifiable situations where one summarizer is generally preferred over the others.
One of the important observations made during the CLEF 2009 campaign (Ferro and Peters, 2009) related to CLIR was that the use of Statistical Machine Translation (SMT) systems (e.g., Google Translate) for query translation led to important improvements in cross-lingual retrieval performance (the best CLIR performance increased from ~55% of the monolingual baseline in 2008 to more than 90% in 2009 for French and German target languages). However, general-purpose SMT systems are not necessarily adapted for query translation.
In this paper we investigate the use of character-level translation models to support the translation from and to under-resourced languages and textual domains via closely related pivot languages. Our experiments show that these low-level models can be successful even with tiny amounts of training data. We test the approach on movie subtitles for three language pairs and legal texts for another language pair in a domain adaptation task. Our pivot translations outperform the baselines by a large margin. ...
We present a system for the real-time generation of car navigation instructions with landmarks. Our system relies exclusively on freely available map data from OpenStreetMap, organizes its output to fit into the available time until the next driving maneuver, and reacts in real time to driving errors. We show that female users spend significantly less time looking away from the road when using our system compared to a baseline system.
Nucleic acid amplification tests (NAATs) have offered hope for rapid diagnosis of tuberculosis (TB).
However, their efficiency with smear-negative samples has not been widely studied in low-income settings. Here,
we evaluated an in-house PCR assay for diagnosis of smear-negative TB using Lowenstein-Jensen (LJ) culture as the
baseline test. Two hundred and five pulmonary TB (PTB) suspects with smear-negative sputum samples, admitted
to a short-stay emergency ward at Mulago Hospital in Kampala, Uganda, were enrolled.
Spoken dialogue systems would be more acceptable if they were able to produce backchannel continuers such as mm-hmm in naturalistic locations during the user's utterances. Using the HCRC Map Task Corpus as our data source, we describe models for predicting these locations using only limited processing and features of the user's speech that are commonly available, and which therefore could be used as a low-cost improvement for current systems. The baseline model inserts continuers after a predetermined number of words. ...
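A fixed-interval baseline of this kind can be sketched as follows, assuming the utterance is already split into words. The interval `n`, the bracketed output format, and the function names are illustrative choices, not values or code from the paper.

```python
def continuer_positions(words, n):
    """Return the word indices after which the baseline inserts a continuer.

    A continuer is placed after every n-th word of the user's utterance.
    """
    if n <= 0:
        raise ValueError("n must be a positive number of words")
    return list(range(n - 1, len(words), n))


def render(words, n, continuer="mm-hmm"):
    """Interleave the utterance with bracketed continuers for inspection."""
    slots = set(continuer_positions(words, n))
    out = []
    for i, word in enumerate(words):
        out.append(word)
        if i in slots:
            out.append(f"[{continuer}]")
    return " ".join(out)


# Made-up twelve-word utterance for illustration.
utterance = "go past the old mill and then turn left at the bridge".split()
```

With `n = 5`, the continuers fall after "mill" and "at", which shows why a purely count-based baseline tends to land in unnatural places.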
We propose a new formulation of the PP attachment problem as a 4-way classification which takes into account the argument or adjunct status of the PP. Based on linguistic diagnostics, we train a 4-way classifier that reaches an average accuracy of 73.9% (baseline 66.2%). Compared to a sequence of binary classifiers, the 4-way classifier reaches better performance and individuates a verb's arguments more accurately, thus improving the acquisition of a crucial piece of information for many NLP applications. ...
This paper presents the integration of cohesive properties of text with coherence relations, to obtain an adequate representation of text for automatic summarization. A summarizer based on Lexical Chains is enhanced with rhetorical and argumentative structure obtained via Discourse Markers. When evaluated on a newspaper corpus, this integration yields only slight improvement in the resulting summaries and cannot beat a dummy baseline consisting of the first sentence in the document.
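The first-sentence baseline is simple enough to sketch in a few lines. The version below assumes a naive sentence boundary (a period, exclamation mark, or question mark followed by whitespace); the paper's actual sentence splitting is not described, and the function name is illustrative.

```python
import re


def first_sentence_baseline(document):
    """Return the first sentence of a document as its summary.

    Sentence boundaries are approximated by ., ! or ? followed by
    whitespace; a real system would use a proper sentence splitter.
    """
    text = document.strip()
    if not text:
        return ""
    # Split once at the first boundary and keep the leading sentence.
    parts = re.split(r"(?<=[.!?])\s+", text, maxsplit=1)
    return parts[0]
```

For news text this baseline is strong because the opening sentence usually states the main facts of the story.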
This paper describes a system that produces extractive summaries of short works of literary fiction. The ultimate purpose of produced summaries is defined as helping a reader to determine whether she would be interested in reading a particular story. To this end, the summary aims to provide a reader with an idea about the settings of a story (such as characters, time and place) without revealing the plot. The approach presented here relies heavily on the notion of aspect.
This paper addresses the problem of extracting the most important facts from a news article. Our approach uses syntactic, semantic, and general statistical features to identify the most important sentences in a document. The importance of the individual features is estimated using generalized iterative scaling methods trained on an annotated newswire corpus.
We present a Hebrew to English transliteration method in the context of a machine translation system. Our method uses machine learning to determine which terms are to be transliterated rather than translated. The training corpus for this purpose includes only positive examples, acquired semi-automatically. Our classifier reduces more than 38% of the errors made by a baseline method. The identified terms are then transliterated.
This article presents empirical evaluations of aspects of annotation for the linguistic property of animacy in Swedish, ranging from manual human annotation through automatic classification to, finally, an external evaluation in the task of syntactic parsing.