nf-core/proteinfamilies: a scalable pipeline for the generation of protein families

Karatzas, Evangelos
Beracochea, Martin
Baltoumas, Fotis A
Aplakidou, Eleni
Richardson, Lorna
Yates, James A Fellows
Lundin, Daniel
community, nf-core
Buluç, Aydin
Kyrpides, Nikos C
et al. (3 more authors)
Abstract
The growth of metagenomics-derived amino acid sequence data has transformed our understanding of protein function, microbial diversity, and evolutionary relationships. However, the vast majority of these proteins remain functionally uncharacterized. Grouping the millions of such uncharacterized sequences with the few experimentally characterized ones allows the transfer of annotations, while the inspection of conserved residues with multiple sequence alignments can provide clues to function, even in the absence of existing functional information. To address the challenges associated with this data surge and the need to group sequences, we present a scalable, open-source, parametrizable Nextflow pipeline (nf-core/proteinfamilies) that generates nascent protein families or assigns new proteins to existing families. The computational benchmarks demonstrated that resource usage scales approximately linearly with input size, and the biological benchmarks showed that the generated protein families closely resemble manually curated families in widely used databases.
Citation
E. Karatzas, M. Beracochea, F.A. Baltoumas, E. Aplakidou, L. Richardson, J.A.F. Yates, et al., "nf-core/proteinfamilies: a scalable pipeline for the generation of protein families," GigaScience, vol. 15, pp. giag009-giag009, 2026, https://doi.org/10.1093/gigascience/giag009.
Source
GigaScience
Keywords
31 Biological Sciences, 3101 Biochemistry and Cell Biology, 3102 Bioinformatics and Computational Biology, Computational Biology, Databases, Protein, Metagenomics, Molecular Sequence Annotation, Proteins, Sequence Alignment, Software
Publisher
Oxford University Press