Imbalanced data

Showing 1-17 of 17 results for Imbalanced data
  • Minority oversampling is a standard approach used for adjusting the ratio between the classes on imbalanced data. However, established methods often provide modest improvements in classification performance when applied to data with extremely imbalanced class distribution and to mixed-type data.

    pdf8p vighostrider 25-05-2023 3 2   Download

  • To predict the characteristics of external causes of road traffic accident (RTA) injuries and mortality, we compared performances based on differences in the correction and classification techniques for imbalanced samples.

    pdf10p viferrari 28-11-2022 3 2   Download

  • This sentence-level identification information is used by a teacher network to guide the baseline model’s training by sharing its classifier. Like an instructor, the classifier improves the baseline model’s ability to extract this sentence-level identification information from raw texts, thus benefiting overall performance.

    pdf17p guernsey 28-12-2021 10 0   Download

  • In this paper, we have surveyed some typical facial attribute learning methods. Five major categories of state-of-the-art methods are identified: (1) Traditional learning, (2) Deep Single Task Learning, (3) Deep Multitask Learning, (4) Imbalanced Data Solver, and (5) Facial Attribute Ontology. They range from traditional learning algorithms to deep learning, along with methods that help bridge semantic gaps using ontologies and methods that address data imbalance. For each category of algorithms, the basic theory as well as the strengths, weaknesses, and differences are discussed.

    pdf20p angicungduoc11 18-04-2021 26 1   Download

  • In this paper, we propose a novel solution to this problem by using generative adversarial networks to generate synthesized attack data for IDS. The synthesized attacks are merged with the original data to form the augmented dataset. Three popular machine learning techniques are trained on the augmented dataset.

    pdf13p nguaconbaynhay11 16-04-2021 23 2   Download

  • The paper presents a complete study of the second method with some proposed algorithms. It focuses mainly on binary classification with kNN and SVM for imbalanced data. Experiments and comparisons among related methods will confirm the pros and cons of each method with respect to accuracy and running time.

    pdf20p viguam2711 11-01-2021 10 2   Download

  • The wealth of gene expression values being generated by high throughput microarray technologies leads to complex high dimensional datasets. Moreover, many cohorts have the problem of imbalanced classes where the number of patients belonging to each class is not the same.

    pdf10p viwyoming2711 16-12-2020 15 1   Download

  • This paper presents a data classification problem and methods to improve imbalanced data classification. Biomedical data in particular has a very high imbalance rate, and identifying minority-class samples is very important. Many studies have shown that border elements are important in imbalanced data classification, as in Borderline-SMOTE and Random Under Border Sampling.

    pdf10p tamynhan9 02-12-2020 14 2   Download

  • Aptamer-protein interacting pairs serve a variety of physiological functions and have therapeutic potential in organisms. Rapidly and effectively predicting aptamer-protein interacting pairs is important for designing aptamers that bind to proteins of interest, which will give insight into the mechanisms of aptamer-protein interactions and support the development of aptamer-based therapies.

    pdf13p vioklahoma2711 19-11-2020 14 2   Download

  • The Receiver Operator Characteristic (ROC) curve is well known for evaluating classification performance in the biomedical field. Owing to its superiority in dealing with imbalanced and cost-sensitive data, the ROC curve has been exploited as a popular metric to evaluate and identify disease-related genes (features).

    pdf17p vioklahoma2711 19-11-2020 10 0   Download

  • The random forests algorithm is a type of classifier with prominent universality, a wide application range, and robustness for avoiding overfitting. But there are still some drawbacks to random forests. Therefore, to improve the performance of random forests, this paper seeks to improve imbalanced data processing, feature selection and parameter optimization.

    pdf18p vioklahoma2711 19-11-2020 10 0   Download

  • This study proposed a hybrid machine learning model based on k-nearest neighbors (KNN) and Bayesian optimization (BO), named BO-KNN, for predicting the local damage of reinforced concrete (RC) panels under missile impact loading. In the proposed BO-KNN, the hyperparameters of the KNN were optimized using BO, which is a well-established optimization algorithm. Accordingly, the KNN was trained on an experimental dataset that consists of 254 impact tests to predict four levels (or classes) of damage: perforation, scabbing, penetration, and no damage.

    pdf14p cothumenhmong8 04-11-2020 14 2   Download

  • Mass spectra are usually acquired from the Liquid Chromatography-Mass Spectrometry (LC-MS) analysis for isotope labeled proteomics experiments. In such experiments, the mass profiles of labeled (heavy) and unlabeled (light) peptide pairs are represented by isotope clusters (2D or 3D) that provide valuable information about the studied biological samples in different conditions.

    pdf12p vicolorado2711 23-10-2020 9 1   Download

  • Feature selection in class-imbalance learning has gained increasing attention in recent years due to the massive growth of high-dimensional class-imbalanced data across many scientific fields. In addition to reducing model complexity and discovering key biomarkers, feature selection is also an effective method of combating overlapping which may arise in such data and become a crucial aspect for determining classification performance.

    pdf14p vicolorado2711 22-10-2020 21 2   Download

  • In this paper, to increase the accuracy of prediction models for imbalanced data classification, we propose a new cluster-based sampling method. In tests on a number of datasets, we achieved important results compared with using no data balancing strategy and with previous methods.

    pdf9p koxih_kothogmih5 04-09-2020 22 3   Download

  • In this paper, we present an overview of the imbalanced data classification and the difficulties encountered in current approaches, from which we propose a new method, SMOTE-PLS. To evaluate the effectiveness of this new method, we conducted experiments based on standard cancer data sets from UCI sources, including breast-p, coil2000, leukemia, colon-cancer, and yeast.

    pdf9p koxih_kothogmih5 04-09-2020 4 1   Download

  • One of the main reasons for choosing ARC is its superior ability to handle imbalanced class distributions. It uses association rule mining, making sampling unnecessary in many cases that would otherwise require it. In [WZYY05], ARC has been shown to produce the best result among many algorithms on the data set used for KDD-98 [Kdd98], which has a skewed class distribution. In addition, ARC can handle high dimensionality (the data set has more than 400 variables) without a considerably long running time.

    pdf34p lenh_hoi_xung 21-02-2013 97 5   Download
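
Several of the abstracts listed above refer to minority oversampling as a way to adjust the class ratio. As a rough illustration only, the sketch below shows the simplest variant, random oversampling with replacement, using the Python standard library. It is not the method of any paper in this list; the function name `random_oversample` and the toy data are assumptions made for the example.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group the rows by class label.
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    # Every class is grown to the size of the largest one.
    target = max(len(rows) for rows in by_class.values())

    out_x, out_y = [], []
    for y, rows in by_class.items():
        # Draw the missing rows with replacement from the same class.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for x in rows + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Toy data: 8 majority rows (label 0) and 2 minority rows (label 1).
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8], [5.0], [5.2]]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

Because the added rows are exact copies, this kind of resampling is normally applied to the training split only, so that duplicated rows do not leak into the evaluation data. Methods such as SMOTE and Borderline-SMOTE, mentioned in the abstracts, generate synthetic points instead of copies.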
