Data imbalance
-
This approach can be problematic because software-defined networks (SDN) have different characteristics from traditional computer systems. In this paper, we propose a new machine learning method for SDN intrusion detection. Our method addresses data imbalance, a common problem in machine learning datasets.
14p vigojek 02-02-2024 3 1 Download
-
The ebook "Generational accounting: Theory and application" gives a complete and up-to-date introduction to the theory and practice of the method. It reveals deficiencies of the original residual concept and discusses various measures of intergenerational redistribution based on the recent sustainability approach to generational accounting. An application using data on German public finances provides an in-depth explanation and practical illustration of the technique.
269p loivantrinh 29-10-2023 5 3 Download
-
Part 1 of the book "BSAVA manual of canine and feline clinical pathology" covers: in-house versus external testing, quality assurance and interpretation of laboratory data, introduction to haematology, disorders of erythrocytes, disorders of leucocytes, disorders of haemostasis, disorders of plasma proteins, electrolyte imbalances, blood gas analysis and acid–base disorders, urinalysis, laboratory evaluation of renal disorders, laboratory evaluation of hepatic disease, laboratory evaluation of gastrointestinal disease, and laboratory evaluation of exocrine pancreatic disease.
315p oursky05 23-10-2023 4 3 Download
-
In this paper, we use Genetic Programming (GP) to predict if there will be heavy rain the next day. GP is an evolution-based machine learning methodology that can identify the model’s functional form as well as its numerical coefficients. Our model was trained and evaluated on a data set collected from 17 stations in Vietnam’s provinces.
11p vikissinger 03-03-2022 12 1 Download
-
Allele-specific measurements of transcription factor binding from ChIP-seq data are key to dissecting the allelic effects of non-coding variants and their contribution to phenotypic diversity. However, most methods of detecting an allelic imbalance assume diploid genomes.
17p vialfrednobel 29-01-2022 10 0 Download
-
The specific objectives were: 1) to improve the ensemble classifier through data-level approach (sampling and feature selection); 2) to perform experiments on sampling, feature selection, and ensemble classifier model; and 3) to evaluate the performance of the ensemble classifier.
31p spiritedaway36 28-11-2021 16 4 Download
-
In mammals, sex chromosomes pose an inherent imbalance of gene expression between sexes. In each female somatic cell, random inactivation of one of the X-chromosomes restores this balance. While most genes from the inactivated X-chromosome are silenced, 15–25% are known to escape X-inactivation (termed escapees). The expression levels of these genes are attributed to sex-dependent phenotypic variability.
17p visilicon2711 20-08-2021 9 1 Download
-
In this paper, we survey typical facial attribute learning methods. We identify five major categories of state-of-the-art methods: (1) Traditional learning, (2) Deep Single Task Learning, (3) Deep Multitask Learning, (4) Imbalanced Data Solver, and (5) Facial Attribute Ontology. These range from traditional learning algorithms to deep learning, along with methods that use ontologies to bridge semantic gaps and methods that address data imbalance. For each category of algorithm, the basic theory as well as the strengths, weaknesses, and differences are discussed.
20p angicungduoc11 18-04-2021 27 1 Download
-
The random forest (RF) method is a commonly used tool for classification with high dimensional data as well as for ranking candidate predictors based on the so-called random forest variable importance measures (VIMs).
11p viwyoming2711 16-12-2020 13 0 Download
-
This paper presents a data classification problem and methods to improve imbalanced data classification. Biomedical data in particular has a very high imbalance rate, and identifying minority-class samples is very important. Many studies have shown that border elements are important in imbalanced data classification, as in methods such as Borderline-SMOTE and Random Under Border Sampling.
10p tamynhan9 02-12-2020 14 2 Download
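As a rough illustration of the oversampling idea behind the SMOTE family named in this abstract, the sketch below generates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. It is plain SMOTE-style interpolation, not Borderline-SMOTE (which restricts the seed points to minority samples near the class border) and not the paper's own method. The function name `smote_like_oversample` and its parameters are hypothetical.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length numeric tuples (minority-class samples only).
    k: number of nearest minority neighbours considered for each seed point.
    This is an illustrative sketch, not a reference implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the seed point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the connecting segment.
        gap = rng.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic
```

A Borderline-SMOTE variant would additionally need the majority-class samples, so that only minority points whose neighbourhoods are dominated by the majority class are used as seed points.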
-
Allelic specific expression (ASE) increases our understanding of the genetic control of gene expression and its links to phenotypic variation. ASE testing is implemented through binomial or beta-binomial tests of sequence read counts of alternative alleles at a cSNP of interest in heterozygous individuals.
12p vikentucky2711 26-11-2020 18 0 Download
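The binomial test mentioned in this abstract can be sketched in a few lines. The example below computes an exact two-sided binomial p-value for the reference and alternative read counts at a single heterozygous site, assuming a null reference-allele proportion of 0.5. The function name `ase_binomial_p` and its parameters are hypothetical, and the beta-binomial variant (which models overdispersion) is not covered.

```python
from math import exp, lgamma, log


def ase_binomial_p(ref_reads, alt_reads, p_null=0.5):
    """Exact two-sided binomial p-value for allelic read counts at one site.

    p_null is the expected reference-allele proportion under no imbalance
    and must lie strictly between 0 and 1. Illustrative sketch only.
    """
    if not 0.0 < p_null < 1.0:
        raise ValueError("p_null must be strictly between 0 and 1")
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, so no evidence of imbalance

    def pmf(k):
        # Binomial probability of k reference reads, computed in log space
        # so that large read depths do not overflow.
        log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        return exp(log_coeff + k * log(p_null) + (n - k) * log(1.0 - p_null))

    observed = pmf(ref_reads)
    # Sum the probabilities of all outcomes no more likely than the observed
    # one; the small relative tolerance guards against floating-point ties.
    total = sum(p for p in map(pmf, range(n + 1)) if p <= observed * (1 + 1e-7))
    return min(1.0, total)
```

For instance, a site with 8 reference and 2 alternative reads gives a p-value of 112/1024 ≈ 0.109 under this test, so a 4:1 split at a depth of 10 is not strong evidence of imbalance on its own.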
-
One aspect in which RNA sequencing is more valuable than microarray-based methods is the ability to examine the allelic imbalance of the expression of a gene. This process is often a complex task that entails quality control, alignment, and the counting of reads over heterozygous single-nucleotide polymorphisms.
6p vikentucky2711 24-11-2020 13 1 Download
-
Multiple computational methods for predicting drug-target interactions have been developed to facilitate the drug discovery process. These methods use available data on known drug-target interactions to train classifiers with the purpose of predicting new undiscovered interactions.
10p vioklahoma2711 19-11-2020 10 1 Download
-
The random forests algorithm is a general-purpose classifier with a wide range of applications and good robustness against overfitting. However, random forests still have some drawbacks. To improve their performance, this paper seeks to improve imbalanced data processing, feature selection, and parameter optimization.
18p vioklahoma2711 19-11-2020 10 0 Download
-
Drug-drug interaction (DDI) extraction needs assistance from automated methods to cope with the explosively growing volume of biomedical texts. In recent years, deep neural network based models have been developed to address this need, and they have made significant progress in relation identification.
11p viconnecticut2711 29-10-2020 17 2 Download
-
Incorporating alignment-free features into supervised big data models did not significantly improve ortholog detection in yeast proteomes compared with the classification quality achieved with alignment-based similarity measures alone.
17p viconnecticut2711 28-10-2020 14 1 Download
-
One of the main issues in the automated protein function prediction (AFP) problem is the integration of multiple networked data sources. The UNIPred algorithm was proposed to efficiently integrate the protein networks in a function-specific fashion, taking into account the imbalance that characterizes protein annotations, and to subsequently predict novel hypotheses about unannotated proteins.
19p vijisoo2711 27-10-2020 14 1 Download
-
High-throughput sequencing experiments, which can determine allele origins, have been used to assess genome-wide allele-specific expression. Despite the amount of data generated from high-throughput experiments, statistical methods are often too simplistic to understand the complexity of gene expression.
13p vicolorado2711 23-10-2020 11 1 Download
-
Microarray datasets consist of complex and high-dimensional samples and genes, and generally the number of samples is much smaller than the number of genes. Due to this data imbalance, gene selection is a demanding task for microarray expression data analysis.
15p vicolorado2711 23-10-2020 9 1 Download
-
Feature selection in class-imbalance learning has gained increasing attention in recent years due to the massive growth of high-dimensional class-imbalanced data across many scientific fields. In addition to reducing model complexity and discovering key biomarkers, feature selection is also an effective way to combat class overlap, which may arise in such data and become a crucial factor in classification performance.
14p vicolorado2711 22-10-2020 21 2 Download