Stellenbosch University
A Nextflow-Based Automated Pipeline for Viral Assembly and Characterisation (EVEREST)

dataset
posted on 2025-03-07, 11:17 authored by Patricia Agudelo-Romero, Abhinav Sharma, Talya Conradie, Anthony Kicic, Jose A Caparros-Martin, Stephen Stick

EVEREST (pipEline for Viral assEmbly and chaRactEriSaTion) is an end-to-end pipeline for virus discovery and characterization. Implemented in Nextflow, it processes Illumina single- and paired-end reads through five phases: pre-processing, filtering, de novo assembly, refinement, and classification. The pipeline first cleans the data by trimming reads, removing host sequences, eliminating duplicates, and applying digital normalization. It then assembles the reads de novo, clusters similar contigs, captures viral genomes, and assesses their quality. Finally, EVEREST classifies viral contigs against the NCBI (nucleotide) and UniProt (amino acid) databases, providing a framework for identifying and characterizing viruses from sequencing data.
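
The duplicate-elimination and digital-normalization steps named above can be illustrated with a minimal Python sketch. This is not EVEREST's implementation: the record does not say which tools the pipeline uses for these steps, and the function names, the k-mer size, and the coverage cutoff below are all assumptions chosen for a toy example. The normalization shown is the general median k-mer abundance idea, where a read is kept only while the k-mers it contains are still rare among the reads already kept.

```python
from collections import Counter
from statistics import median


def kmers(read, k):
    """Return every overlapping substring of length k in the read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def remove_exact_duplicates(reads):
    """Keep the first occurrence of each read sequence, preserving order."""
    seen = set()
    unique = []
    for read in reads:
        if read not in seen:
            seen.add(read)
            unique.append(read)
    return unique


def digital_normalization(reads, k=4, cutoff=2):
    """Keep a read only if the median count of its k-mers, counted over
    the reads kept so far, is below the cutoff (hypothetical defaults)."""
    counts = Counter()
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read is shorter than k, nothing to count
        if median(counts[km] for km in read_kmers) < cutoff:
            kept.append(read)
            counts.update(read_kmers)
    return kept


if __name__ == "__main__":
    toy_reads = ["AAAACCCC"] * 5 + ["GGGGTTTT"]
    print(remove_exact_duplicates(toy_reads))
    print(digital_normalization(toy_reads, k=4, cutoff=2))
```

In this toy input, normalization caps the over-represented read at the cutoff while leaving the rare read untouched, which is the effect that reduces redundant coverage before assembly. Real pipelines work on FASTQ records with much larger k and use memory-efficient counting structures.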

Publisher

Stellenbosch University

Contributor

Agudelo-Romero, P; Sharma, A; Conradie, T; Kicic, A; Caparros-Martin, J; Stick, S.

Date

2025-03-04

Format

.pdf .yml .md .json .csv .txt .py .png .nf .toml

Language

en

Geographical Location

Global

Academic Group

  • Science

Recommended Citation

Agudelo-Romero, P, Sharma, A, Conradie, T, Kicic, A, Caparros-Martin, J & Stick, S. 2025. A Nextflow-Based Automated Pipeline for Viral Assembly and Characterisation (EVEREST). Stellenbosch University. Dataset. DOI: https://doi.org/10.25413/sun.28553732

Sustainable Development Goals (SDGs)

  • Goal 9: INDUSTRY, INNOVATION & INFRASTRUCTURE
  • Goal 11: SUSTAINABLE CITIES & COMMUNITIES
