nf-core/ampliseq: a versatile, comprehensive, and reproducible amplicon sequencing analysis pipeline

DaS · February 1, 2022, 9:42am

I'd like to advertise nf-core/ampliseq (ampliseq: Introduction, GitHub - nf-core/ampliseq: Amplicon sequencing analysis workflow using DADA2 and QIIME2), a bioinformatics analysis pipeline used for amplicon sequencing.

The pipeline is simple to install (it requires Nextflow and any container software, identical to all other nf-core pipelines), accepts a variety of input data (Illumina, IonTorrent, and PacBio, processed with DADA2), supports many databases out of the box (e.g. SILVA, UNITE, PR2, GTDB), produces intuitive and interactive output (using QIIME2), and reports extensive quality control measures. nf-core/ampliseq runs on almost any compute infrastructure, including laptops (Unix; Windows only with WSL2), compute clusters, and the AWS cloud.

There is comprehensive documentation on how to run any nf-core pipeline (Docs: Getting started) and install required components (Docs: Installation of nf-core dependencies). A video introduction to nf-core/ampliseq can be found at https://youtu.be/a0VOEeAvETs and documentation at ampliseq: Introduction, including usage, pipeline parameters, and example output.

If you have any questions about using the pipeline, open an issue in the GitHub repo and/or join the nf-core community Slack, either in the general help channel (#help) or in the ampliseq-specific channel (#ampliseq).

The pipeline was first released in 2018 as part of the nf-core (https://nf-co.re/) pipeline set and has been continuously maintained since then. The pipeline is a community effort (as is nf-core), and contributions and suggestions for improvements are welcome (and have been frequent in the past).

Nicholas_Bokulich · February 2, 2022, 5:43am

Thanks @DaS ! FYI, I am recategorizing this topic as "Other Bioinformatics Tools", which is the appropriate category for external QIIME 2-compatible tools. The "Community Contributions" category is rather specific to QIIME 2-specific resources (e.g., plugins, tutorials).

DaS · February 3, 2022, 8:38am

Thanks Nicholas, I was not exactly sure which category was appropriate! While I do occasionally browse topics in this interesting forum, I am not that familiar with its divisions.

Bark9299 · March 3, 2023, 5:48am

What type of analyses do you use after the pipeline is done? I am looking into the R packages vegan and phyloseq but am having a hard time finding code that other microbiome people use. I am also fairly new to bioinformatic analyses, so sorry if this question seems easily answered.

DaS · March 3, 2023, 8:21am

Dear @Bark9299, well, that depends on what you want to achieve. ampliseq already produces helpful results & figures (using QIIME2), i.e. barplots, alpha diversity measures, rarefaction, PCoA, Adonis, ANCOM. But for more complicated experimental setups, more downstream analysis tools are required; vegan & phyloseq are good packages for many tasks. I like ANCOM-BC for differential abundance analysis. Random forest can also be helpful. I do not have a lot of experience there, but python & sklearn seem to be good choices.
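
To give an idea of what an alpha diversity measure actually computes, here is a minimal sketch of the Shannon index in plain Python. The sample names and counts are made up for illustration and are not pipeline output, and the function name is my own. It uses the natural log, as vegan's diversity() does by default; other tools may use a different log base, so values are not always directly comparable.

```python
import math


def shannon(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical ASV counts per sample (made-up numbers for illustration only).
samples = {
    "sample_A": [25, 25, 25, 25],  # perfectly even community
    "sample_B": [97, 1, 1, 1],     # dominated by a single ASV
}

for name, counts in samples.items():
    print(name, round(shannon(counts), 3))
```

The even community gets the higher value, while the community dominated by one ASV scores much lower, which is the behaviour you would expect to see reflected in the alpha diversity plots.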

colinbrislawn · March 3, 2023, 6:34pm

I think you found some great software!

Vegan is old-school numerical ecology software. It uses base-R programming style and serves as the reference implementation for many diversity metrics.
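
As an example of the kind of metric vegan implements, Bray-Curtis dissimilarity (the default method of vegan's vegdist()) can be sketched in a few lines of plain Python. This is only an illustration with made-up counts and my own function name; for real analyses, use the reference implementation.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum(|a_i - b_i|) / sum(a_i + b_i)."""
    if len(a) != len(b):
        raise ValueError("both samples must cover the same set of taxa")
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0  # two empty samples: treat as identical
    return sum(abs(x - y) for x, y in zip(a, b)) / total


# Hypothetical counts for the same three taxa in two samples.
sample_1 = [10, 0, 5]
sample_2 = [5, 5, 5]

print(bray_curtis(sample_1, sample_2))
```

The value ranges from 0 (identical counts) to 1 (no shared taxa). A matrix of such pairwise values is what ordination methods like PCoA take as input.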

Phyloseq is more modern in some ways (ggplot2 graphs), but it is no longer under active development. I would also love to have an actively developed replacement for phyloseq!