A Nextflow-Based Automated Pipeline for Viral Assembly and Characterisation (EVEREST)
EVEREST (pipEline for Viral assEmbly and chaRactEriSaTion) is an end-to-end pipeline for virus discovery and characterisation. Implemented in Nextflow, it processes Illumina single- and paired-end reads through five phases: pre-processing, filtering, de novo assembly, refinement, and classification. Reads are first cleaned by trimming, host-sequence removal, duplicate removal, and digital normalisation. The pipeline then assembles viral genomes de novo, clusters similar contigs, captures viral genomes, and assesses their quality. Finally, EVEREST classifies viral contigs against the NCBI (nucleotide) and UniProt (amino acid) databases, providing a framework for identifying and characterising viruses from sequencing data.
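The phase ordering described above can be illustrated with a small Python sketch. This is not EVEREST code: the pipeline itself is written in Nextflow, and the names `PHASES`, `run_phases`, and `remove_exact_duplicates` are hypothetical. Placing duplicate removal under the "filtering" phase is also an assumption, since the description does not say which phase performs it.

```python
from typing import Callable, Iterable

# Phase names and their order are taken from the description above.
PHASES = (
    "pre-processing",
    "filtering",
    "de novo assembly",
    "refinement",
    "classification",
)


def remove_exact_duplicates(reads: Iterable[str]) -> list[str]:
    """Toy stand-in for duplicate removal: keep the first copy of each read."""
    seen: set[str] = set()
    unique: list[str] = []
    for read in reads:
        if read not in seen:
            seen.add(read)
            unique.append(read)
    return unique


def run_phases(data, steps: dict[str, Callable]):
    """Apply one callable per phase, in the documented order.

    Phases with no registered step pass the data through unchanged.
    """
    for phase in PHASES:
        step = steps.get(phase)
        if step is not None:
            data = step(data)
    return data


# Hypothetical in-memory reads; real input would be Illumina FASTQ files.
reads = ["ACGT", "ACGT", "TTGA", "ACGT", "GGCC"]
result = run_phases(reads, {"filtering": remove_exact_duplicates})
print(result)  # ['ACGT', 'TTGA', 'GGCC']
```

Keeping the phase order in one tuple means each step can be swapped or left out without changing the sequence in which the remaining steps run.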
Publisher
Stellenbosch University
Contributor
Agudelo-Romero, P; Sharma, A; Conradie, T; Kicic, A; Caparros-Martin, J; & Stick, S.
Date
2025-03-04
Format
.pdf .yml .md .json .csv .txt .py .png .nf .toml .nf
Language
en
Geographical Location
Global
Academic Group
- Science
Recommended Citation
Agudelo-Romero, P, Sharma, A, Conradie, T, Kicic, A, Caparros-Martin, J & Stick, S. 2025. A Nextflow-Based Automated Pipeline for Viral Assembly and Characterisation (EVEREST). Stellenbosch University. Dataset. DOI: https://doi.org/10.25413/sun.28553732
Sustainable Development Goals (SDGs)
- Goal 9: INDUSTRY, INNOVATION & INFRASTRUCTURE
- Goal 11: SUSTAINABLE CITIES & COMMUNITIES