Mô hình CRFs

Xem 1-16 trên 16 kết quả Mô hình CRFs
  • Nội dung của khóa luận là tìm hiểu mô hình CRF, và ứng dụng của mô hình này trong trích chọn thông tin trong tiếng Việt. Trước hết khóa luận trình bày những khái niệm chung về trích chọn thông thông tin. Đồng thời nêu đến hai hướng tiếp cận để xây dựng một hệ thống trích chọn thông tin cũng như ưu nhược điểm của từng hướng tiếp cận, Đồng thời cũng nêu ra được ứng dụng của trích chọn thông tin trong tiếng Việt như thế nào.

    pdf56p chieu_mua 24-08-2012 174 82   Download

  • Luận văn với đề tài "Phân tách cụm danh từ cơ sở tiếng Việt sử dụng mô hình CRFs" được tổ chức thành bốn chương mà nội dung chính của các chương được giới thiệu như dưới đây.Chương 1: Khái quát về bài toán phân tách cụm danh từ giới thiệu bài toán và các nghiên cứu trước đó cũng như kết quả đã đạt được về bài toán này

    pdf58p sunflower_1 06-09-2012 99 28   Download

  • Bố cục của luận văn chia 4 chương như sau:• Chương 1: Trình bày những kiến thức cơ bản về mô hình trường ngẫu nhiên có điều kiện và phương pháp học máy bán giám sát.• Chương 2: Trình bày về tiêu chuẩn kỳ vọng tổng quát và áp dụng tiêu chuẩn kỳ vọng tổng quát vào mô hình trường ngẫu nhiên có điều kiện.

    pdf51p sunflower_1 04-09-2012 76 29   Download

  • Conditional Random Fields (CRFs) have been applied with considerable success to a number of natural language processing tasks. However, these tasks have mostly involved very small label sets. When deployed on tasks with larger label sets, the requirements for computational resources mean that training becomes intractable. This paper describes a method for training CRFs on such tasks, using error correcting output codes (ECOC). A number of CRFs are independently trained on the separate binary labelling tasks of distinguishing between a subset of the labels and its complement. ...

    pdf8p bunbo_1 17-04-2013 18 3   Download

  • This paper presents techniques to apply semi-CRFs to Named Entity Recognition tasks with a tractable computational cost. Our framework can handle an NER task that has long named entities and many labels which increase the computational cost. To reduce the computational cost, we propose two techniques: the first is the use of feature forests, which enables us to pack feature-equivalent states, and the second is the introduction of a filtering process which significantly reduces the number of candidate states. ...

    pdf8p hongvang_1 16-04-2013 23 2   Download

  • We describe a new loss function, due to Jeon and Lin (2006), for estimating structured log-linear models on arbitrary features. The loss function can be seen as a (generative) alternative to maximum likelihood estimation with an interesting information-theoretic interpretation, and it is statistically consistent. It is substantially faster than maximum (conditional) likelihood estimation of conditional random fields (Lafferty et al., 2001; an order of magnitude or more).

    pdf8p hongvang_1 16-04-2013 21 2   Download

  • Recent work on Conditional Random Fields (CRFs) has demonstrated the need for regularisation to counter the tendency of these models to overfit. The standard approach to regularising CRFs involves a prior distribution over the model parameters, typically requiring search over a hyperparameter space. In this paper we address the overfitting problem from a different perspective, by factoring the CRF distribution into a weighted product of individual “expert” CRF distributions. We call this model a logarithmic opinion pool (LOP) of CRFs (LOP-CRFs).

    pdf8p bunbo_1 17-04-2013 16 2   Download

  • This paper describes discriminative language modeling for a large vocabulary speech recognition task. We contrast two parameter estimation methods: the perceptron algorithm, and a method based on conditional random fields (CRFs). The models are encoded as deterministic weighted finite state automata, and are applied by intersecting the automata with word-lattices that are the output from a baseline recognizer. The perceptron algorithm has the benefit of automatically selecting a relatively small feature set in just a couple of passes over the training data. ...

    pdf8p bunbo_1 17-04-2013 19 2   Download

  • The detection of prosodic characteristics is an important aspect of both speech synthesis and speech recognition. Correct placement of pitch accents aids in more natural sounding speech, while automatic detection of accents can contribute to better wordlevel recognition and better textual understanding. In this paper we investigate probabilistic, contextual, and phonological factors that influence pitch accent placement in natural, conversational speech in a sequence labeling setting.

    pdf7p bunbo_1 17-04-2013 10 2   Download

  • In this paper we present a novel approach for inducing word alignments from sentence aligned data. We use a Conditional Random Field (CRF), a discriminative model, which is estimated on a small supervised training set. The CRF is conditioned on both the source and target texts, and thus allows for the use of arbitrary and overlapping features over these data. Moreover, the CRF has efficient training and decoding processes which both find globally optimal solutions.

    pdf8p hongvang_1 16-04-2013 20 1   Download

  • We present a new semi-supervised training procedure for conditional random fields (CRFs) that can be used to train sequence segmentors and labelers from a combination of labeled and unlabeled training data. Our approach is based on extending the minimum entropy regularization framework to the structured prediction case, yielding a training objective that combines unlabeled conditional entropy with labeled conditional likelihood.

    pdf8p hongvang_1 16-04-2013 16 1   Download

  • This paper proposes a framework for training Conditional Random Fields (CRFs) to optimize multivariate evaluation measures, including non-linear measures such as F-score. Our proposed framework is derived from an error minimization approach that provides a simple solution for directly optimizing any evaluation measure. Specifically focusing on sequential segmentation tasks, i.e. text chunking and named entity recognition, we introduce a loss function that closely reflects the target evaluation measure for these tasks, namely, segmentation F-score. ...

    pdf8p hongvang_1 16-04-2013 20 1   Download

  • There are two decoding algorithms essential to the area of natural language processing. One is the Viterbi algorithm for linear-chain models, such as HMMs or CRFs. The other is the CKY algorithm for probabilistic context free grammars. However, tasks such as noun phrase chunking and relation extraction seem to fall between the two, neither of them being the best fit. Ideally we would like to model entities and relations, with two layers of labels.

    pdf8p hongvang_1 16-04-2013 22 1   Download

  • We proposed a subword-based tagging for Chinese word segmentation to improve the existing character-based tagging. The subword-based tagging was implemented using the maximum entropy (MaxEnt) and the conditional random fields (CRF) methods. We found that the proposed subword-based tagging outperformed the character-based tagging in all comparative experiments. In addition, we proposed a confidence measure approach to combine the results of a dictionary-based and a subword-tagging-based segmentation. ...

    pdf8p hongvang_1 16-04-2013 11 1   Download

  • In order to build a simulated robot that accepts instructions in unconstrained natural language, a corpus of 427 route instructions was collected from human subjects in the office navigation domain. The instructions were segmented by the steps in the actual route and labeled with the action taken in each step. This flat formulation reduced the problem to an IE/Segmentation task, to which we applied Conditional Random Fields. We compared the performance of CRFs with a set of hand-written rules. The result showed that CRFs perform better with a 73.7% success rate. ...

    pdf6p hongvang_1 16-04-2013 21 1   Download

  • Conditional random fields (Lafferty et al., 2001) are quite effective at sequence labeling tasks like shallow parsing (Sha and Pereira, 2003) and namedentity extraction (McCallum and Li, 2003). CRFs are log-linear, allowing the incorporation of arbitrary features into the model. To train on unlabeled data, we require unsupervised estimation methods for log-linear models; few exist. We describe a novel approach, contrastive estimation. We show that the new technique can be intuitively understood as exploiting implicit negative evidence and is computationally efficient. ...

    pdf9p bunbo_1 17-04-2013 18 1   Download


Đồng bộ tài khoản