Long read 16S amplicon Oxford Nanopore data

Shreya · December 20, 2024, 10:47am

Hello, I recently received full-length 16S (V1-V9) ONT data from fecal samples (metagenomics). Can it be analyzed in QIIME 2 the same way as short-read data, using the DADA2 pipeline to generate ASVs? If not, is there a step-by-step method for doing so?

Thanks.

timanix · December 20, 2024, 3:38pm

Hello!
DADA2 is not appropriate for Nanopore data.
If you want to use QIIME 2, you can import the reads and dereplicate them with the vsearch plugin, which gives you representative sequences and a feature (count) table. I would avoid clustering reads into 97% OTUs: with Nanopore long reads it barely reduces the number of unique features but takes a lot of time. Clustering at a lower threshold is not a good option either: what is the point of having long reads if you cluster them into 85% OTUs?
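To make the dereplication step concrete: the idea is simply to merge identical reads and count them per sample. Below is a minimal Python sketch of that idea. It is not the vsearch implementation; the function name `dereplicate`, the input layout (a dict of sample ID to reads), and the use of SHA-1 hashes as feature IDs are assumptions made for illustration.

```python
import hashlib
from collections import Counter, defaultdict


def dereplicate(samples):
    """Merge identical reads across samples (illustrative sketch only).

    samples: dict mapping sample ID -> list of read sequences (hypothetical layout).
    Returns (rep_seqs, table):
      rep_seqs: feature ID -> sequence
      table:    feature ID -> {sample ID: read count}
    """
    rep_seqs = {}
    table = defaultdict(Counter)
    for sample_id, reads in samples.items():
        for read in reads:
            seq = read.upper()  # treat lower/upper case bases as identical
            # Assumption: a hash of the sequence serves as a stable feature ID.
            feature_id = hashlib.sha1(seq.encode("ascii")).hexdigest()
            rep_seqs[feature_id] = seq
            table[feature_id][sample_id] += 1
    return rep_seqs, {fid: dict(counts) for fid, counts in table.items()}


if __name__ == "__main__":
    # Toy reads, not real data.
    toy = {"s1": ["ACGT", "acgt", "ACGA"], "s2": ["ACGT"]}
    rep_seqs, table = dereplicate(toy)
    for fid, seq in rep_seqs.items():
        print(seq, table[fid])
```

Because only exact duplicates are merged, a single sequencing error makes a read its own feature, which is why long noisy reads still leave a very large number of unique features after this step.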
You can still use the dereplicated data directly for taxonomy annotation. I would then collapse the data to a taxonomic level (species or genus) for non-phylogenetic metrics and differential abundance (DA) tests.
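Collapsing to a taxonomic level just means summing the counts of all features that share the same taxonomy string up to that rank. Here is a rough Python sketch, assuming semicolon-separated taxonomy strings and a table stored as nested dicts; the function name `collapse_to_level` and the data layout are hypothetical and not a QIIME 2 API.

```python
from collections import defaultdict


def collapse_to_level(table, taxonomy, level):
    """Sum feature counts that share the same taxonomy up to `level` ranks.

    table:    feature ID -> {sample ID: count}
    taxonomy: feature ID -> 'rank1; rank2; ...' (assumed semicolon-separated)
    level:    number of leading ranks to keep (e.g. 6 for genus in a 7-rank scheme)
    """
    collapsed = defaultdict(lambda: defaultdict(int))
    for feature_id, counts in table.items():
        # Features without an annotation are grouped under 'Unassigned'.
        lineage = taxonomy.get(feature_id, "Unassigned")
        ranks = [r.strip() for r in lineage.split(";") if r.strip()]
        # Lineages shallower than `level` keep whatever ranks they have.
        key = "; ".join(ranks[:level])
        for sample_id, n in counts.items():
            collapsed[key][sample_id] += n
    return {key: dict(counts) for key, counts in collapsed.items()}


if __name__ == "__main__":
    # Toy table and made-up lineages, for illustration only.
    table = {"f1": {"s1": 3}, "f2": {"s1": 2, "s2": 1}, "f3": {"s2": 4}}
    taxonomy = {
        "f1": "d__Bacteria; g__GenusA; s__SpeciesA1",
        "f2": "d__Bacteria; g__GenusA; s__SpeciesA2",
        "f3": "d__Bacteria; g__GenusB; s__SpeciesB1",
    }
    print(collapse_to_level(table, taxonomy, level=2))
```

This is what makes dereplicated Nanopore data usable downstream: thousands of near-identical features end up in a handful of genus or species rows.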

Outside of QIIME 2, you can use either Kraken2, with the standard database or another database built for shotgun data (it does not work so well with 16S databases and Nanopore reads), or the 16S workflow in EPI2ME.

There are also tools such as Natrix2 and NanoCLUST. Natrix2 loses a lot of reads, and NanoCLUST is outdated and no longer supported.

Personally, I ended up writing my own pipeline, NaMeco, which is similar to NanoCLUST but has some differences that I believe make it better. It is not yet published, though you can already run it.

Update:
There is also the QIIME 2 metagenome distribution, which contains the q2-moshpit plugin; it can be used for Kraken2 annotation of Nanopore reads with various databases. However, if I am not mistaken (I tried it with Nanopore data when it was first released and have not tried the latest update), you will need to extract the Kraken2 reports from the QIIME 2 artifact and parse them to get taxonomic abundances.
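For the parsing part, a Kraken2 report is a tab-separated text file, so pulling per-taxon read counts out of it is straightforward. The Python sketch below assumes the standard six-column report layout (percentage, clade reads, direct reads, rank code, taxid, indented name) and that the report has already been exported from the artifact as plain text; the function name `counts_at_rank` is hypothetical.

```python
def counts_at_rank(report_text, rank_code="S"):
    """Return {taxon name: clade read count} for one rank of a Kraken2 report.

    Assumes the standard six tab-separated columns:
    percent, clade reads, direct reads, rank code, taxid, name.
    rank_code 'S' selects species, 'G' genus, and so on.
    """
    counts = {}
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        clade_reads = int(fields[1])   # reads assigned to this taxon or below
        code = fields[3].strip()
        name = fields[5].strip()       # names are indented by taxonomic depth
        if code == rank_code:
            counts[name] = counts.get(name, 0) + clade_reads
    return counts


if __name__ == "__main__":
    # Toy report with made-up taxa and taxids, for illustration only.
    toy_report = "\n".join([
        "100.00\t10\t0\tR\t1\troot",
        " 80.00\t8\t1\tG\t1001\t  Genus A",
        " 50.00\t5\t5\tS\t1002\t    Species A1",
        " 20.00\t2\t2\tS\t1003\t    Species A2",
    ])
    print(counts_at_rank(toy_report, "S"))
    print(counts_at_rank(toy_report, "G"))
```

Running this per sample and merging the resulting dicts gives a species or genus table comparable to the collapsed table from the dereplication route.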

Best,

Nicholas_Bokulich · December 21, 2024, 1:57pm

Hi @Shreya,
It is also possible to classify your reads with Kraken2, as @timanix suggests, using the QIIME 2 plugin q2-moshpit. You can find a draft tutorial here:

Just don't try to use Bracken to normalize the counts, as this method does not appear to be compatible with long-read Nanopore data.

Good luck!