Xem 1-20 trên 111 kết quả Web mining
  • We present a web mining method for discovering and enhancing relationships in which a specified concept (word class) participates. We discover a whole range of relationships focused on the given concept, rather than generic known relationships as in most previous work. Our method is based on clustering patterns that contain concept words and other words related to them. We evaluate the method on three different rich concepts and find that in each case the method generates a broad variety of relationships with good precision. ...

    pdf8p hongvang_1 16-04-2013 14 2   Download

  • In Cross-Language Information Retrieval (CLIR), Out-of-Vocabulary (OOV) detection and translation pair relevance evaluation still remain as key problems. In this paper, an English-Chinese Bi-Directional OOV translation model is presented, which utilizes Web mining as the corpus source to collect translation pairs and combines supervised learning to evaluate their association degree.

    pdf4p hongphan_1 15-04-2013 21 1   Download

  • In this paper, we propose a novel system for translating organization names from Chinese to English with the assistance of web resources. Firstly, we adopt a chunkingbased segmentation method to improve the segmentation of Chinese organization names which is plagued by the OOV problem. Then a heuristic query construction method is employed to construct an efficient query which can be used to search the bilingual Web pages containing translation equivalents.

    pdf9p hongphan_1 14-04-2013 11 2   Download

  • This book evolved from material developed over several years by Anand Raja-raman and Jeff Ullman for a one-quarter course at Stanford. The courseCS345A, titled “Web Mining,” was designed as an advanced graduate course,although it has become accessible and interesting to advanced undergraduates.What the Book Is AboutAt the highest level of description, this book is about data mining. However,it focuses on data mining of very large amounts of data, that is, data so largeit does not fit in main memory. Because of the emphasis on size, many of ourexamples are about the Web or data deri...

    pdf415p konbetocroi 04-01-2013 23 9   Download

  • Due to the growth of computer technologies and web technologies, we can easily collect and store large amounts of text data. We can believe that the data include useful knowledge. Text mining techniques have been studied aggressively in order to extract the knowledge from the data since late 1990s. Even if many important techniques have been developed, the text mining research field continues to expand for the needs arising from various application fields. This book is composed of 9 chapters introducing advanced text mining techniques.

    pdf226p camchuong_1 04-12-2012 30 6   Download

  • This paper presents a new web mining scheme for parallel data acquisition. Based on the Document Object Model (DOM), a web page is represented as a DOM tree. Then a DOM tree alignment model is proposed to identify the translationally equivalent texts and hyperlinks between two parallel DOM trees. By tracing the identified parallel hyperlinks, parallel web documents are recursively mined. Compared with previous mining schemes, the benchmarks show that this new mining scheme improves the mining coverage, reduces mining bandwidth, and enhances the quality of mined parallel sentences.

    pdf8p hongvang_1 16-04-2013 20 3   Download

  • The chapters of this book provide an excellent overview of current research and development activities in the area of web information systems. They supply an in-depth description of different issues in web information systems areas, including web-based information modeling, migration between different media types, web information mining, and web information extraction issues.

    pdf391p vutrung 09-09-2009 131 35   Download

  • Data compression is the technique to reduce the redundancies in data representation in order to decrease data storage requirements and, hence, communication costs when transmitted through a communication network [24, 25]. Reducing the storage requirement is equivalent to increasing the capacity of the storage medium. If the compressed data are properly indexed, it may improve the performance of mining data in the compressed large database as well

    pdf20p thachcotran 04-02-2010 77 25   Download

  • Tham khảo tài liệu 'data mining p1', công nghệ thông tin, cơ sở dữ liệu phục vụ nhu cầu học tập, nghiên cứu và làm việc hiệu quả

    pdf30p thachcotran 04-02-2010 71 20   Download

  • Natural language questions have become popular in web search. However, various questions can be formulated to convey the same information need, which poses a great challenge to search systems. In this paper, we automatically mined 5w1h question reformulation patterns from large scale search log data.

    pdf6p nghetay_1 07-04-2013 18 3   Download

  • Mining bilingual data (including bilingual sentences and terms1) from the Web can benefit many NLP applications, such as machine translation and cross language information retrieval. In this paper, based on the observation that bilingual data in many web pages appear collectively following similar patterns, an adaptive pattern-based bilingual data mining method is proposed.

    pdf9p hongphan_1 14-04-2013 14 3   Download

  • Lots of data is being collected and warehoused Web data, e-commerce purchases at department/ grocery stores Bank/Credit Card transactions Computers have become cheaper and more powerful Competitive Pressure is Strong Provide better, customized services for an edge (e.g. in Customer Relationship Management)

    ppt29p trinh02 18-01-2013 19 2   Download

  • We predict entity type distributions in Web search queries via probabilistic inference in graphical models that capture how entitybearing queries are generated. We jointly model the interplay between latent user intents that govern queries and unobserved entity types, leveraging observed signals from query formulations and document clicks. We apply the models to resolve entity types in new queries and to assign prior type distributions over an existing knowledge base.

    pdf9p nghetay_1 07-04-2013 22 2   Download

  • This paper focuses on mining the hyponymy (or is-a) relation from large-scale, open-domain web documents. A nonlinear probabilistic model is exploited to model the correlation between sentences in the aggregation of pattern matching results. Based on the model, we design a set of evidence combination and propagation algorithms.

    pdf10p hongdo_1 12-04-2013 16 2   Download

  • This paper presents Engkoo 1 , a system for exploring and learning language. It is built primarily by mining translation knowledge from billions of web pages - using the Internet to catch language in motion. Currently Engkoo is built for Chinese users who are learning English; however the technology itself is language independent and can be extended in the future.

    pdf6p hongdo_1 12-04-2013 20 2   Download

  • This paper presents an unsupervised relation extraction method for discovering and enhancing relations in which a specified concept in Wikipedia participates. Using respective characteristics of Wikipedia articles and Web corpus, we develop a clustering approach based on combinations of patterns: dependency patterns from dependency analysis of texts in Wikipedia, and surface patterns generated from highly redundant information related to the Web. Evaluations of the proposed approach on two different domains demonstrate the superiority of the pattern combination over existing approaches. ...

    pdf9p hongphan_1 14-04-2013 18 2   Download

  • Adaptive Information Extraction systems (IES) are currently used by some Semantic Web (SW) annotation tools as support to annotation (Handschuh et al., 2002; Vargas-Vera et al., 2002). They are generally based on fully supervised methodologies requiring fairly intense domain-specific annotation. Unfortunately, selecting representative examples may be difficult and annotations can be incorrect and require time. In this paper we present a methodology that drastically reduce (or even remove) the amount of manual annotation required when annotating consistent sets of pages. ...

    pdf4p bunthai_1 06-05-2013 13 2   Download

  • There are a growing number of popular web sites where users submit and review instructions for completing tasks as varied as building a table and baking a pie. In addition to providing their subjective evaluation, reviewers often provide actionable refinements. These refinements clarify, correct, improve, or provide alternatives to the original instructions. However, identifying and reading all relevant reviews is a daunting task for a user.

    pdf9p nghetay_1 07-04-2013 10 1   Download

  • Argo is a web-based NLP and text mining workbench with a convenient graphical user interface for designing and executing processing workflows of various complexity. The workbench is intended for specialists and nontechnical audiences alike, and provides the ever expanding library of analytics compliant with the Unstructured Information Management Architecture, a widely adopted interoperability framework.

    pdf6p nghetay_1 07-04-2013 19 1   Download

  • In this paper, we present a novel backward transliteration approach which can further assist the existing statistical model by mining monolingual web resources. Firstly, we employ the syllable-based search to revise the transliteration candidates from the statistical model. By mapping all of them into existing words, we can filter or correct some pseudo candidates and improve the overall recall. Secondly, an AdaBoost model is used to rerank the revised candidates based on the information extracted from monolingual web pages. ...

    pdf9p hongphan_1 15-04-2013 15 1   Download

Đồng bộ tài khoản