Statistical feature detection
-
Most two-group statistical tests find broad patterns such as overall shifts in mean, median, or variance. These tests may not have enough power to detect effects in a small subset of samples, e.g., a drug that works well only on a few patients.
19p vibransone 28-03-2024 4 1
-
Single-cell transcriptomics reveals gene expression heterogeneity but suffers from stochastic dropout and characteristic bimodal expression distributions in which expression is either strongly non-zero or non-detectable. We propose a two-part, generalized linear model for such bimodal data that parameterizes both of these features.
13p viaristotle 29-01-2022 13 0
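As an illustration of the two-part idea in the abstract above, here is a minimal sketch in Python. It is not the authors' model: it assumes no covariates, a Bernoulli part for whether a gene is detected at all, and a Gaussian part for the non-zero values. The function names (`fit_two_part`, `log_likelihood`, `lr_statistic`) are hypothetical.

```python
import math

def fit_two_part(values):
    """Fit both parts by maximum likelihood: the detection rate (Bernoulli)
    and the mean/variance of the non-zero values (Gaussian)."""
    positives = [v for v in values if v > 0]
    n, k = len(values), len(positives)
    mean = sum(positives) / k if k else float("nan")
    var = sum((v - mean) ** 2 for v in positives) / k if k else float("nan")
    return {"n": n, "k": k, "rate": k / n, "mean": mean, "var": var}

def log_likelihood(fit):
    """Log-likelihood of the two-part model evaluated at its own estimates."""
    n, k, p, var = fit["n"], fit["k"], fit["rate"], fit["var"]
    ll = 0.0
    # Bernoulli part: detected vs. not detected (contributes 0 when p is 0 or 1).
    if 0 < p < 1:
        ll += k * math.log(p) + (n - k) * math.log(1 - p)
    # Gaussian part on the detected values, evaluated at the MLE.
    # Assumption of this sketch: the degenerate zero-variance case is skipped.
    if k > 0 and var > 0:
        ll += -0.5 * k * (math.log(2 * math.pi * var) + 1)
    return ll

def lr_statistic(group_a, group_b):
    """Likelihood-ratio statistic: separate fits per group vs. one pooled fit."""
    separate = log_likelihood(fit_two_part(group_a)) + log_likelihood(fit_two_part(group_b))
    pooled = log_likelihood(fit_two_part(group_a + group_b))
    return 2 * (separate - pooled)

# Hypothetical expression values; zeros stand for non-detected (dropout) cells.
group_a = [0, 0, 0, 1.0, 1.2]
group_b = [0, 2.0, 2.2, 2.4, 2.1]
stat = lr_statistic(group_a, group_b)
```

A larger statistic indicates that the two groups differ in detection rate, in the level of the detected values, or in both. A full analysis would add covariates and a reference distribution for the statistic.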
-
One challenge facing omics association studies is the loss of statistical power when adjusting for confounders and multiple testing. The traditional statistical procedure involves fitting a confounder-adjusted regression model for each omics feature, followed by multiple testing correction.
18p viarchimedes 26-01-2022 11 0
-
This project proposes two new methods that combine deep neural networks and handcrafted features for damage detection. The first method uses a convolutional neural network (CNN) to extract deep features from time series and a Long Short-Term Memory (LSTM) network to find statistically significant correlations for each lagged feature in the time series data.
6p billyelliot 11-11-2021 27 1
-
Interactions that involve one or more amino acid side chains near the ends of protein helices stabilize helix termini and shape the geometry of the adjacent loops, making a substantial contribution to overall protein structure.
21p vikentucky2711 24-11-2020 10 1
-
Automated skin lesion border examination and analysis techniques have become an important field of research for distinguishing malignant pigmented lesions from benign lesions. An abrupt pigment pattern cutoff at the periphery of a skin lesion is one of the most important dermoscopic features for detection of neoplastic behavior.
21p vioklahoma2711 19-11-2020 6 0
-
Performing statistical tests is an important step in analyzing genome-wide datasets to detect genomic features that are differentially expressed between conditions. Each type of statistical test has its own advantages in characterizing certain aspects of differences between population means and often assumes a relatively simple data distribution (e.g., Gaussian, Poisson, or negative binomial), which may not be well met by the datasets of interest.
19p vioklahoma2711 19-11-2020 12 0
-
Genomic islands are associated with microbial adaptations, carrying genomic signatures different from the host. Some methods perform an overall test to identify genomic islands based on their local features. However, regions of different scales will display different genomic features.
15p vicolorado2711 22-10-2020 4 0
-
The second edition of Atoms, Radiation, and Radiation Protection has several important new features. SI units are employed throughout, the older units being defined but used sparingly. There are two new chapters. One is on statistics for health physics. It starts with the description of radioactive decay as a Bernoulli process and treats sample counting, propagation of error, limits of detection, type-I and type-II errors, instrument response, and Monte Carlo radiation-transport computations.
595p tranthanhkhang93 19-04-2017 61 8
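The description of radioactive decay as a Bernoulli process mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not taken from the book: each atom is assumed to decay independently within the observation time with probability p = 1 - exp(-λt), so the number of decays is binomial. The function names are hypothetical.

```python
import math

def decay_probability(half_life, elapsed):
    """Probability that a single atom decays within `elapsed` time units."""
    decay_constant = math.log(2) / half_life  # lambda = ln 2 / T_half
    return 1.0 - math.exp(-decay_constant * elapsed)

def count_statistics(n_atoms, half_life, elapsed):
    """Mean and standard deviation of the number of decays, treating each
    atom as an independent Bernoulli trial (binomial counting statistics)."""
    p = decay_probability(half_life, elapsed)
    mean = n_atoms * p
    std = math.sqrt(n_atoms * p * (1.0 - p))
    return mean, std

# Hypothetical sample: 1000 atoms observed for exactly one half-life.
mean, std = count_statistics(1000, half_life=10.0, elapsed=10.0)
```

When p is small and the number of atoms is large, the binomial count is well approximated by a Poisson distribution, which is why the standard deviation of a count is often estimated as the square root of the count.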
-
It is important to correct the errors in the results of speech recognition to increase the performance of a speech translation system. This paper proposes a method for correcting errors using the statistical features of character co-occurrence, and evaluates the method. The proposed method comprises two successive correcting processes. The first process uses pairs of strings: the first string is an erroneous substring of the utterance predicted by speech recognition, the second string is the corresponding section of the actual utterance.
5p bunrieu_1 18-04-2013 49 3
-
We present a global discriminative statistical word order model for machine translation. Our model combines syntactic movement and surface movement information, and is discriminatively trained to choose among possible word orders. We show that combining discriminative training with features to detect these two different kinds of movement phenomena leads to substantial improvements in word ordering performance over strong baselines. Integrating this word order model in a baseline MT system results in a 2.4-point improvement in BLEU for English-to-Japanese translation. ...
8p hongvang_1 16-04-2013 45 2
-
We have developed an automated Japanese essay scoring system called Jess. The system needs expert writings rather than expert raters to build the evaluation model. It evaluates essays by detecting statistical outliers in predetermined essay features, relative to many professional writings for each prompt.
8p hongvang_1 16-04-2013 46 1
-
We investigate the automatic detection of sentences containing linguistic hedges using corpus statistics and syntactic patterns. We take Wikipedia as an already annotated corpus using its tagged weasel words which mark sentences and phrases as non-factual. We evaluate the quality of Wikipedia as training data for hedge detection, as well as shallow linguistic features.
4p hongphan_1 15-04-2013 69 2
-
We investigate the influence of information status (IS) on constituent order in German, and integrate our findings into a loglinear surface realisation ranking model. We show that the distribution of pairs of IS categories is strongly asymmetric. Moreover, each category is correlated with morphosyntactic features, which can be automatically detected. We build a loglinear model that incorporates these asymmetries for ranking German string realisations from input LFG F-structures.
9p hongphan_1 14-04-2013 41 3
-
Automatic error detection is desirable in post-processing to improve machine translation quality. Previous work is largely based on confidence estimation using system-based features, such as word posterior probabilities calculated from N-best lists or word lattices. We propose to incorporate two groups of linguistic features, which convey information from outside machine translation systems, into error detection: lexical and syntactic features.
8p hongdo_1 12-04-2013 42 3
-
Some scientists are working on earthquake precursors and forecasting, which remains a major challenge. The chapters in this book are devoted to different aspects of earthquake research and analysis, from theoretical advances to practical applications. The first two chapters cover statistical analysis. About ten chapters in Part II focus on research into earthquake precursors and forecasting. Some propose new methods for early detection, as well as observation of earthquakes through the use of social sensors....
266p lulanphuong 24-03-2012 75 10