Benchmark datasets
-
In this study, we generated a whole exome sequencing benchmark dataset using the platinum genome sample NA12878 and developed an intersect-then-combine (ITC) approach to increase the accuracy of calling single nucleotide variants (SNVs) and indels in tumour-normal pairs. We evaluated the effects of alignment, base quality recalibration, mutation caller, and filtering on sensitivity and false positive rate.
-
The proposed algorithm utilizes the migration method to identify local extremes and then relocates the population to explore new solution spaces for further evolution. The MIMEM algorithm is evaluated on the iMOPSE benchmark dataset, and the results demonstrate that it outperforms.
-
This paper presents the first challenge on recognizing textual entailment (RTE), also known as natural language inference (NLI), held in a Vietnamese Language and Speech Processing workshop (VLSP 2021). The challenge aims to determine, for a given pair of sentences, whether the two sentences semantically agree, disagree, or are neutral/irrelevant to each other.
-
This makes comparing the performance among different methods and developing solutions for various problems challenging. To address this issue, we have constructed two standardized datasets of thyroid scintigraphy images for identifying and quantifying the depth. The purpose of designing the models is to establish a benchmark assessment for developing CADx models on the datasets in the future.
-
To create a high-quality electronic health record (EHR)–derived mortality dataset for retrospective and prospective real-world evidence generation. Data Sources/Study Setting. Oncology EHR data, supplemented with external commercial and US Social Security Death Index data, benchmarked to the National Death Index (NDI).
-
Recently, latent representation models, such as Shrink Autoencoder (SAE), have been demonstrated as robust feature representations for one-class learning-based network anomaly detection. In these studies, benchmark network datasets that are processed in laboratory environments to make them completely clean are often employed for constructing and evaluating such models.
-
Chromatin conformation capture (3C)-based technologies have enabled the accurate detection of topological genomic interactions, and the adoption of ChIP techniques to 3C-based protocols makes it possible to identify long-range interactions.
-
Principal component analysis (PCA) is an essential method for analyzing single-cell RNA-seq (scRNA-seq) datasets, but for large-scale scRNA-seq datasets it takes a long time to compute and consumes large amounts of memory.
-
Large-scale single-cell transcriptomic datasets generated using different technologies contain batch-specific systematic variations that present a challenge to batch-effect removal and data integration. With continued growth expected in scRNA-seq data, achieving effective batch integration with available computational resources is crucial.
-
This paper used a Malay dataset of 100 news articles covering the natural disaster and events domain to find the optimal compression rate and its effect on the summary content.
-
This approach provides a unified framework for leveraging the ability of the Transformer’s self-attention mechanism in modeling session sequences while taking into account the user’s main interest in the session. We empirically evaluate the proposed method on two benchmark datasets. The results show that DTER outperforms state-of-the-art session-based recommendation methods on common evaluation metrics.
-
In this paper, a new hybrid meta-heuristic algorithm for FCP, called BHHS, is proposed; it combines the power of existing meta-heuristic frameworks such as Black Hole and Harmony Search. These two algorithms cooperate and support each other. The Black Hole part of the algorithm aims to find good candidate solutions, while the Harmony Search part aims to generate new candidate solutions for Black Hole when the event horizon of the Black Hole occurs. In addition, the new best solutions of these two components are exchanged with each other to improve the quality of the search.
-
It is important to accurately determine the performance of peptide:MHC binding predictions, as this enables users to compare and choose between different prediction methods and provides estimates of the expected error rate.
-
In the last decade, a great number of methods for reconstructing gene regulatory networks from expression data have been proposed. However, very few tools and datasets allow those methods to be evaluated accurately and reproducibly. Hence, we propose here a new tool that performs a systematic, yet fully reproducible, evaluation of transcriptional network inference methods.
-
Alignment of large and diverse sequence sets is a common task in biological investigations, yet there remains considerable room for improvement in alignment quality. Multiple sequence alignment programs tend to reach maximal accuracy when aligning only a few sequences, and their accuracy then diminishes steadily as more sequences are added.
-
There have been great advancements in the field of digital pathology. The surge in development of analytical methods for such data makes it crucial to develop benchmark synthetic datasets for objectively validating and comparing these methods.
-
Predicting drug side effects is an important topic in drug discovery. Although several machine learning methods have been proposed to predict side effects, there is still room for improvement. First, side effect prediction is a multi-label learning task, so multi-label learning techniques can be adopted for it.
-
The automated prediction of the enzymatic functions of uncharacterized proteins is a crucial topic in bioinformatics. Although several methods and tools have been proposed to classify enzymes, most of these studies are limited to specific functional classes and levels of the Enzyme Commission (EC) number hierarchy.
-
Siamese network-based trackers have achieved excellent performance on visual object tracking. Some Siamese network experiments on long-term visual tracking benchmarks achieve state-of-the-art performance, confirming their effectiveness and efficiency. In this work, we study state-of-the-art Siamese networks and then propose a model based on the Siamese architecture to track UAVs in the Anti-UAV Challenge dataset, which includes 100 infrared videos.
-
In this paper, we present a novel global protein-protein interaction network alignment algorithm, which is enhanced with an extended large neighborhood search heuristic. Evaluated on benchmark datasets of yeast, fly, human and worm, the proposed algorithm outperforms state-of-the-art algorithms. Furthermore, its complexity is polynomial, which makes it scalable to large biological networks in practice.