Software Benchmark—Classification Tree Algorithms for Cell Atlases Annotation Using Single-Cell RNA-Sequencing Data

Alaqeeli, Omar and Xing, Li and Zhang, Xuekui (2021) Software Benchmark—Classification Tree Algorithms for Cell Atlases Annotation Using Single-Cell RNA-Sequencing Data. Microbiology Research, 12 (2). pp. 317-334. ISSN 2036-7481

Abstract

Authors: Omar Alaqeeli (http://orcid.org/0000-0003-4030-6648), Li Xing (http://orcid.org/0000-0002-4186-7909), Xuekui Zhang (http://orcid.org/0000-0003-4728-2343)

The classification tree is a widely used machine learning method. It has multiple implementations as R packages: rpart, ctree, evtree, tree and C5.0. These implementations differ in their details, and hence their performance differs from one application to another. We are interested in their performance in classifying cells using single-cell RNA-sequencing data. In this paper, we conducted a benchmark study using 22 single-cell RNA-sequencing data sets. Using cross-validation, we compared the packages' prediction performance based on Precision, Recall, F1-score and Area Under the Curve (AUC). We also compared the Complexity and Run-time of these R packages. Our study shows that rpart and evtree have the best Precision; evtree is the best in Recall, F1-score and AUC; C5.0 prefers more complex trees; and tree is consistently much faster than the others, although its complexity is often higher than theirs.
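The evaluation described in the abstract (cross-validation scored by Precision, Recall, F1-score and AUC) can be illustrated with a minimal Python sketch. It is not the authors' code: it assumes binary labels (1 = positive, 0 = negative), uses contiguous folds, and leaves the classifier itself out. The function names `kfold_indices`, `precision_recall_f1` and `auc` are hypothetical, and the paper's actual fold count and any multi-class handling are not stated on this page.

```python
from typing import List, Sequence, Tuple


def kfold_indices(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into k contiguous folds; return (train, test) pairs."""
    folds = [list(range(i * n // k, (i + 1) * n // k)) for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        # Training set is every index that is not in the held-out fold.
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, test))
    return splits


def precision_recall_f1(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float, float]:
    """Precision, Recall and F1-score for binary labels (assumption: 1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

In a benchmark of this kind, each (train, test) pair from `kfold_indices` would be used to fit one tree and score its held-out predictions, and the per-fold metrics would then be averaged for each package.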
Date: 04 07 2021
Pages: 317-334
Article ID: microbiolres12020022
Funding: Natural Sciences and Engineering Research Council of Canada (http://dx.doi.org/10.13039/501100000038), grant RGPIN-2017-04722; Canada Research Chairs (http://dx.doi.org/10.13039/501100001804), grant 950-231363
License: https://creativecommons.org/licenses/by/4.0/
DOI: 10.3390/microbiolres12020022
Publisher URL: https://www.mdpi.com/2036-7481/12/2/22
Publisher PDF: https://www.mdpi.com/2036-7481/12/2/22/pdf

Item Type: Article
Subjects: OA Library Press > Medical Science
Depositing User: Unnamed user with email support@oalibrarypress.com
Date Deposited: 16 Jun 2023 06:26
Last Modified: 20 Jul 2024 09:25
URI: http://archive.submissionwrite.com/id/eprint/1215
