Packaging and containerization of computational methods

Alser, M. et al. Technology dictates algorithms: recent developments in read alignment. Genome Biol. 22, 249 (2021).

Mangul, S. et al. Systematic benchmarking of omics computational tools. Nat. Commun. 10, 1393 (2019).

Alser, M., Eudine, J. & Mutlu, O. Genome-on-diet: taming large-scale genomic analyses via sparsified genomics. Preprint at arXiv https://doi.org/10.48550/arXiv.2211.08157 (2022).

Meyer, F. et al. Critical assessment of metagenome interpretation: the second round of challenges. Nat. Methods 19, 429–440 (2022).

Cox, R. Surviving software dependencies. Commun. ACM 62, 36–43 (2019).

Mangul, S., Martin, L. S., Eskin, E. & Blekhman, R. Improving the usability and archival stability of bioinformatics software. Genome Biol. 20, 47 (2019).

Mangul, S. et al. Challenges and recommendations to improve the installability and archival stability of omics computational tools. PLoS Biol. 17, e3000333 (2019).

Begley, C. G., Buchan, A. M. & Dirnagl, U. Robust research: institutions must do their part for reproducibility. Nature 525, 25–27 (2015).

Wratten, L., Wilm, A. & Göke, J. Reproducible, scalable, and shareable analysis pipelines with bioinformatics workflow managers. Nat. Methods https://doi.org/10.1038/s41592-021-01254-9 (2021).

Brito, J. J. et al. Recommendations to enhance rigor and reproducibility in biomedical research. Gigascience 9, giaa056 (2020).

Heil, B. J. et al. Reproducibility standards for machine learning in the life sciences. Nat. Methods 18, 1132–1135 (2021).

Baker, M. 1,500 scientists lift the lid on reproducibility. Nature 533, 452–454 (2016).

Malloy, B. A. & Power, J. F. An empirical analysis of the transition from Python 2 to Python 3. Empir. Softw. Eng. 24, 751–778 (2019).

Gosden, J. A. Software compatibility. In Proc. December 9–11, 1968, Fall Joint Computer Conference, Part I—AFIPS ’68 (Fall, Part I) https://doi.org/10.1145/1476589.1476605 (ACM Press, 1968).

Abate, P., Di Cosmo, R., Treinen, R. & Zacchiroli, S. A modular package manager architecture. Inf. Softw. Technol. 55, 459–474 (2013).

Decan, A., Mens, T. & Grosjean, P. An empirical comparison of dependency network evolution in seven software packaging ecosystems. Empir. Softw. Eng. 24, 381–416 (2018).

Boettiger, C. An introduction to Docker for reproducible research. ACM SIGOPS Oper. Syst. Rev. 49, 71–79 (2015).

Silver, A. Software simplified. Nature 546, 173–174 (2017).

Dunn, M. C. & Bourne, P. E. Building the biomedical data science workforce. PLoS Biol. 15, e2003082 (2017).

Florance, V. in Informatics Education in Healthcare: Lessons Learned (ed. Berner, E. S.) 125–133 (Springer, 2020).

Bush, W. S., Wheeler, N., Darabos, C. & Beaulieu-Jones, B. in Biocomputing 2022 412–416 (World Scientific, 2021).

Wu, J. et al. Virtual meetings promise to eliminate geographical and administrative barriers and increase accessibility, diversity and inclusivity. Nat. Biotechnol. https://doi.org/10.1038/s41587-021-01176-z (2021).

Siepel, A. Challenges in funding and developing genomic software: roots and remedies. Genome Biol. 20, 147 (2019).

Gardner, P. P. et al. Sustained software development, not number of citations or journal choice, is indicative of accurate bioinformatic software. Genome Biol. 23, 56 (2022).

Hoffman, D. et al. The BOGUS Linux Release https://bogus.org/ (2003).

Fernández-Sanguino, J. et al. A Brief History of Debian Ch. 4 https://www.debian.org/doc/manuals/project-history/detailed.en.html (2023).

Gunthorpe, J. APT User’s Guide https://www.debian.org/doc/manuals/apt-guide/index.en.html (1998).

Leonard, T. Introduction. Zero Install Docs https://docs.0install.net/basics/ (CERN Web Services, 2003).

Conda documentation. Conda https://docs.conda.io/en/latest/ (2017).

Bicking, I. pip 24.0. PyPI https://pypi.org/project/pip/ (2024).

Parnas, D. L. Designing software for ease of extension and contraction. IEEE Trans. Softw. Eng. SE-5, 128–138 (1979).

Claes, M., Mens, T., Di Cosmo, R. & Vouillon, J. A historical analysis of Debian package incompatibilities. 2015 IEEE/ACM 12th Working Conference on Mining Software Repositories https://doi.org/10.1109/msr.2015.27 (2015).

Dolstra, E., de Jonge, M. & Visser, E. Nix: a safe and policy-free system for software deployment. In LISA 4, 79–92 (2004).

Grüning, B. et al. Bioconda: sustainable and comprehensive software distribution for the life sciences. Nat. Methods 15, 475–476 (2018).

Mancinelli, F. et al. Managing the complexity of large free and open source package-based software distributions. In 21st IEEE/ACM International Conference on Automated Software Engineering (ASE’06) 199–208 (2006).

Gamblin, T. et al. The Spack package manager. In Proc. International Conference for High Performance Computing, Networking, Storage and Analysis on SC ’15. https://doi.org/10.1145/2807591.2807623 (2015).

Hoste, K., Timmerman, J., Georges, A. & De Weirdt, S. EasyBuild: building software with ease. In 2012 SC Companion: High Performance Computing, Networking Storage and Analysis https://doi.org/10.1109/sc.companion.2012.81 (2012).

Dongarra, J. Report on the Fujitsu Fugaku System. Tech. Report No. ICLUT-20-06 (Univ. Tennessee Knoxville Innovative Computing Laboratory, 2020).

Dagnat, F., Simon, G. & Zhang, X. Toward a distributed package management system. In Lococo 2011: Workshop on Logics for Component Configuration (2011).

Kamp, P.-H. & Watson, R. N. M. Jails: confining the omnipotent root. Proc. 2nd Int. SANE Conf. 43, 116 (2000).

Syed, M. H. & Fernandez, E. B. The software container pattern. In Proc. 22nd Conference on Pattern Languages of Programs 24–26 (The Hillside Group, 2015).

da Veiga Leprevost, F. et al. BioContainers: an open-source and community-driven framework for software standardization. Bioinformatics 33, 2580–2582 (2017).

Adair, R. J., Bayles, R. U., Comeau, L. W. & Creasy, R. J. A Virtual Machine System for the 360/40. Tech. Report (International Business Machines Corporation, 1966).

Smith, J. & Nair, R. Virtual Machines: Versatile Platforms for Systems and Processes (Elsevier, 2005).

Angiuoli, S. V. et al. CloVR: a virtual machine for automated and portable sequence analysis from the desktop using cloud computing. BMC Bioinform. https://doi.org/10.1186/1471-2105-12-356 (2011).

Merkel, D. et al. Docker: lightweight Linux containers for consistent development and deployment. Linux J. 2014, 2 (2014).

Cook, J. in Docker for Data Science 103–118 (Apress, 2017).

Kurtzer, G. M., Sochat, V. & Bauer, M. W. Singularity: scientific containers for mobility of compute. PLoS ONE 12, e0177459 (2017).

Huang, D., Cui, H., Wen, S. & Huang, C. Security analysis and threats detection techniques on Docker container. In 2019 IEEE 5th International Conference on Computer and Communications (ICCC) 1214–1220 (2019).

Tomar, A., Jeena, D., Mishra, P. & Bisht, R. Docker security: a threat model, attack taxonomy and real-time attack scenario of DoS. In 2020 10th International Conference on Cloud Computing, Data Science and Engineering (Confluence) 150–155 (2020).

Zahid, F., Kuo, M. M. Y. & Sinha, R. Light-weight active security for detecting DDoS attacks in containerised ICPS. In 2021 18th International Conference on Privacy, Security and Trust (PST) 1–5 (2021).

Martin, A., Raponi, S., Combe, T. & Di Pietro, R. Docker ecosystem—vulnerability analysis. Comput. Commun. 122, 30–43 (2018).

Galaxy Community. The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2022 update. Nucleic Acids Res. 50, W345–W351 (2022).

Zhang, Z. et al. Uniform genomic data analysis in the NCI Genomic Data Commons. Nat. Commun. 12, 1226 (2021).

Seven Bridges Genomics—the biomedical data analysis company. Seven Bridges https://www.sevenbridges.com (2016).

Hornik, K. The comprehensive R archive network. Wiley Interdiscip. Rev. Comput. Stat. 4, 394–398 (2012).

Lawlor, B. & Sleator, R. D. The democratization of bioinformatics: a software engineering perspective. Gigascience 9, giaa063 (2020).

Shirinbab, S., Lundberg, L. & Casalicchio, E. Performance evaluation of containers and virtual machines when running Cassandra workload concurrently. Concurr. Comput. Pract. Exp. 32, e5693 (2020).

Felter, W., Ferreira, A., Rajamony, R. & Rubio, J. An updated performance comparison of virtual machines and Linux containers. In 2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) 171–172 (2015).

BioBuilds home. L7 informatics https://l7informatics.com/resource-center/biobuilds-home/ (2018).

Yuen, D. et al. The Dockstore: enhancing a community platform for sharing reproducible and accessible computational protocols. Nucleic Acids Res. 49, W624–W632 (2021).

Belmann, P. et al. Bioboxes: standardised containers for interchangeable bioinformatics software. Gigascience 4, 47 (2015).

Field, D. et al. Open software for biologists: from famine to feast. Nat. Biotechnol. 24, 801–803 (2006).

Yuen, D. et al. ga4gh/tool-registry-service-schemas: 2.0.1. Zenodo https://zenodo.org/doi/10.5281/zenodo.1193735 (2022).

Dagnat, F. & Simon, G. Toward a distributed package management system. In Lococo 2011: Workshop on Logics for Component Configuration (2011).

Gentleman, R. C. et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 5, R80 (2004).
