High-dimensional data sets
-
Prediction of patient survival from tumor molecular ‘-omics’ data is a key step toward personalized medicine. Cox models fitted to RNA profiling datasets are popular for predicting clinical outcomes. However, these models are applied in a “high-dimensional” setting, in which the number p of covariates (gene expression levels) greatly exceeds both the number n of patients and the number e of events.
16p
vialfrednobel
23-12-2023
-
Automated clinical phenotyping is challenging because word-based features quickly turn it into a high-dimensional problem, in which small, privacy-restricted training datasets can lead to overfitting. Pretrained embeddings might solve this issue by reusing input representation schemes trained on a larger dataset.
8p
visteverogers
24-06-2023
-
Data Mining lecture, Lesson 20. The main topics covered in this chapter include dimensionality reduction, high-dimensional datasets, multi-dimensional scaling, pseudo-projections, and the Monte Carlo algorithm, among others. Please refer to the document for details.
22p
tieuvulinhhoa
22-09-2022
-
The rapid growth of data has become a huge challenge for software systems. The quality of a fault prediction model depends on the quality of the software dataset. High dimensionality is a major problem that degrades the performance of fault prediction models. To deal with the dimensionality problem, various researchers have proposed feature selection.
7p
viplato
05-04-2022
-
Single-molecule multiplex chromatin interaction data are generated by emerging 3D genome mapping technologies such as GAM, SPRITE, and ChIA-Drop. These datasets provide insights into high-dimensional chromatin organization, yet they introduce new computational challenges.
13p
vielonmusk
30-01-2022
-
High-throughput sequencing data are dramatically increasing in volume. Thus, there is an urgent need for efficient tools to perform fast and integrative analysis of multiple data types. An enriched heatmap is a specific form of heatmap that visualizes how genomic signals are enriched over specific target regions. It is commonly used and is efficient at revealing enrichment patterns, especially for high-dimensional genomic and epigenomic datasets.
7p
vibeauty
23-10-2021
-
With the advance of high-throughput sequencing, high-dimensional data are being generated. Detecting dependence or correlation between these datasets is becoming one of the most important issues in multi-dimensional data integration and co-expression network construction.
14p
vijeeni2711
30-06-2021
-
The wealth of gene expression values generated by high-throughput microarray technologies leads to complex, high-dimensional datasets. Moreover, many cohorts suffer from imbalanced classes, where the number of patients belonging to each class is not the same.
10p
viwyoming2711
16-12-2020
-
To leverage the potential of multi-omics studies, exploratory data analysis methods that provide systematic integration and comparison of multiple layers of omics information are required. We describe multiple co-inertia analysis (MCIA), an exploratory data analysis method that identifies co-relationships between multiple high-dimensional datasets.
13p
vikentucky2711
26-11-2020
-
A generalized notion of biclustering involves the identification of patterns across subspaces within a data matrix. This approach is particularly well-suited to analysis of heterogeneous molecular biology datasets, such as those collected from populations of cancer patients.
14p
vikentucky2711
26-11-2020
-
In the last few years, the Non-negative Matrix Factorization (NMF) technique has attracted great interest in the bioinformatics community, since it is able to extract interpretable parts from high-dimensional datasets.
12p
vikentucky2711
26-11-2020
-
Exploratory analysis of multi-dimensional high-throughput datasets, such as microarray gene expression time series, may be instrumental in understanding the genetic programs underlying numerous biological processes.
19p
vikentucky2711
24-11-2020
-
Metabolomics datasets are often high-dimensional, though only a limited number of variables are expected to be informative for a specific research question. The important task of selecting informative variables can therefore become complex. In this paper, we look at discriminating between two groups.
12p
vioklahoma2711
19-11-2020
-
In high-throughput molecular data analysis, the observations included in a dataset commonly form distinct groups; for example, they may have been measured at different times, under different conditions, or even in different labs. These groups are generally referred to as batches.
19p
vioklahoma2711
19-11-2020
-
For clinical genomic studies with high-dimensional datasets, tree-based ensemble methods offer a powerful solution for variable selection and prediction, taking into account the complex interrelationships between explanatory variables.
21p
vioklahoma2711
19-11-2020
-
Recent decades have witnessed an explosion of large-scale biological datasets whose analysis requires the continuous development of innovative algorithms. Many of these high-dimensional datasets are related to large biological networks with few or no experimentally proven interactions.
13p
vioklahoma2711
19-11-2020
-
High-throughput metabolomics makes it possible to measure the relative abundances of numerous metabolites in biological samples, which is useful in many areas of biomedical research. However, missing values (MVs) are common in metabolomics datasets and can arise for both technical and biological reasons.
13p
vioklahoma2711
19-11-2020
-
Detecting patterns in high-dimensional multivariate datasets is non-trivial. Clustering and dimensionality reduction techniques often help in discerning inherent structures. In biological datasets such as microbial community composition or gene expression data, observations can be generated by a continuous, often unknown, process.
15p
viflorida2711
30-10-2020
-
The inclusion of high-dimensional omics data in prediction models has become a well-studied topic in recent decades. Most existing methods, however, do not account for the different types of variables that may be present among the covariates of a single dataset, although in many scenarios the variables can be structured in blocks of different types, e.g., clinical, transcriptomic, and methylation data.
14p
viconnecticut2711
28-10-2020
-
Advances in high-resolution mass spectrometry facilitate the identification of hundreds of metabolites, thousands of proteins and their post-translational modifications. This remarkable progress poses a challenge to data analysis and visualization, requiring methods to reduce dimensionality and represent the data in a compact way.
4p
vicoachella2711
27-10-2020