High dimensional feature selection

Showing 1-18 of 18 results for High dimensional feature selection

A hybrid multi-filter wrapper feature selection method for software defect predictors

This study proposes a hybrid multi-filter wrapper method for selecting relevant and non-redundant features in software defect prediction (SDP). The proposed hybrid feature selection method is designed to take advantage of filter-filter and filter-wrapper relationships to produce optimal feature subsets, shorten the evaluation cycle, and subsequently improve the overall predictive performance of SDP models in terms of Accuracy, Precision and Recall.

7p longtimenosee10 26-04-2024 2 1 Download

A meta-heuristic approach for enhancing performance of associative classification

Associative Classification is an interesting approach in data mining to create more accurate and easily interpretable predictive systems. This approach is often built on both association rule mining and classification techniques, to find a set of rules called association rules for classification (CAR) of label attributes.

7p viohoyo 25-04-2024 2 2 Download

A comparative analysis of filter-based feature selection methods for software fault prediction

The rapid growth of data has become a huge challenge for software systems. The quality of a fault prediction model depends on the quality of the software dataset. High-dimensional data is the major problem that affects the performance of fault prediction models. To deal with the dimensionality problem, various researchers have proposed feature selection.

7p viplato 05-04-2022 17 1 Download
RHDSI: A novel dimensionality reduction based algorithm on high dimensional feature selection with interactions

This study proposes a novel Dimensionality Reduction based algorithm on High Dimensional feature Selection with Interactions (RHDSI), a new feature selection method that integrates dimensionality reduction and machine learning.

16p guernsey 28-12-2021 5 0 Download
Integrated 3D-QSAR, molecular docking, and molecular dynamics simulation studies on 1,2,3-triazole based derivatives for designing new acetylcholinesterase inhibitors

The molecular feature characteristics provided by the 3D-QSAR contour plots were quite useful for designing compounds of this class and improving their acetylcholinesterase inhibitory activity. Based on these findings, a new series of 1,2,3-triazole based derivatives was designed, among which compound A1, with the highest predicted activity, was subjected to detailed molecular docking and compared to the most active compound.

14p tudichquannguyet 29-11-2021 10 1 Download
Feature clustering for PSO-based feature construction on high dimensional data

This paper proposes a cluster based PSO feature construction approach called ClusPSOFC. The Redundancy-Based Feature Clustering (RFC) algorithm was applied to choose the most informative features from the original data, while PSO was used to construct new features from those selected by RFC.

34p spiritedaway36 28-11-2021 6 1 Download
An AUC-based permutation variable importance measure for random forests

The random forest (RF) method is a commonly used tool for classification with high dimensional data as well as for ranking candidate predictors based on the so-called random forest variable importance measures (VIMs).

11p viwyoming2711 16-12-2020 13 0 Download
Feature weight estimation for gene selection: A local hyperlinear learning approach

Modeling high-dimensional data involving thousands of variables is particularly important for gene expression profiling experiments; nevertheless, it remains a challenging task. One of the challenges is to implement an effective method for selecting a small set of relevant genes buried in high-dimensional irrelevant noise.

13p vikentucky2711 26-11-2020 5 0 Download
Non-specific filtering of beta-distributed data

Non-specific feature selection is a dimension reduction procedure performed prior to cluster analysis of high dimensional molecular data. Not all measured features are expected to show biological variation, so only the most varying are selected for analysis.

14p vikentucky2711 26-11-2020 10 0 Download
Proposal of supervised data analysis strategy of plasma miRNAs from hybridisation array data with an application to assess hemolysis-related deregulation

Plasma miRNAs have potential as cancer biomarkers, but no consolidated guidelines for data mining in this field are available. The purpose of the study was to apply a supervised data analysis strategy in a context where prior knowledge is available, i.e., that of hemolysis-related miRNA deregulation, so as to compare our results with existing evidence.

10p vioklahoma2711 19-11-2020 11 2 Download
Sparse Proteomics Analysis – a compressed sensing-based approach for feature selection and classification of high-dimensional proteomics mass spectrometry data

High-throughput proteomics techniques, such as mass spectrometry (MS)-based approaches, produce very high-dimensional data-sets. In a clinical setting one is often interested in how mass spectra differ between patients of different classes, for example spectra from healthy patients vs. spectra from patients having a particular disease.

20p vioklahoma2711 19-11-2020 12 1 Download
Feature selection for high-dimensional temporal data

Feature selection is commonly employed for identifying collectively-predictive biomarkers and biosignatures; it facilitates the construction of small statistical models that are easier to verify, visualize, and comprehend while providing insight to the human expert.

14p viconnecticut2711 28-10-2020 8 1 Download
Cost-Constrained feature selection in binary classification: Adaptations for greedy forward selection and genetic algorithms

With modern methods in biotechnology, the search for biomarkers has advanced to a challenging statistical task exploring high dimensional data sets. Feature selection is a widely researched preprocessing step to handle huge numbers of biomarker candidates and has special importance for the analysis of biomedical data. Such data sets often include many input features not related to the diagnostic or therapeutic target variable.

21p vicolorado2711 22-10-2020 17 0 Download
GARS: Genetic algorithm for the identification of a robust subset of features in high-dimensional datasets

Feature selection is a crucial step in machine learning analysis. Currently, many feature selection approaches do not ensure satisfactory results, in terms of accuracy and computational time, when the amount of data is huge, such as in ‘Omics’ datasets.

11p vicolorado2711 22-10-2020 11 2 Download
Hellinger distance-based stable sparse feature selection for high-dimensional class-imbalanced data

Feature selection in class-imbalance learning has gained increasing attention in recent years due to the massive growth of high-dimensional class-imbalanced data across many scientific fields. In addition to reducing model complexity and discovering key biomarkers, feature selection is also an effective method of combating overlapping which may arise in such data and become a crucial aspect for determining classification performance.

14p vicolorado2711 22-10-2020 21 2 Download
Profit agent classification using feature selection eigenvector centrality

In this paper we applied a graph-based feature selection method; the graph method identifies the most important nodes, namely those that are interrelated with their neighboring nodes.

11p lucastanguyen 01-06-2020 8 1 Download
An application of feature selection for the fuzzy rule based classifier design with the order based semantics of linguistic terms for high dimensional datasets

The fuzzy rule based classification system (FRBCS) design methods, whose fuzzy rules are in the form of if-then sentences, have been under intensive study in recent years. This paper presents an approach to tackle the high-dimensional dataset problem for a previously proposed hedge algebras based classification method by utilizing a previously proposed feature selection algorithm.

14p dieutringuyen 07-06-2017 48 1 Download
Scientific report: "Co-training for Predicting Emotions with Spoken Dialogue Data"

Natural Language Processing applications often require large amounts of annotated training data, which are expensive to obtain. In this paper we investigate the applicability of Co-training to train classifiers that predict emotions in spoken dialogues. To do so, we first applied the wrapper approach with Forward Selection and Naïve Bayes to reduce the dimensionality of our feature set. Our results show that Co-training can be highly effective when a good set of features is chosen. ...

4p bunbo_1 17-04-2013 47 2 Download