Kraken2 versus QIIME 2 (DADA2) for PacBio HiFi reads

rr220 · July 30, 2022, 3:12am

Hi,

I have PacBio HiFi read data (full-length 16S gene amplicons). I ran it through the Kraken2 pipeline with the Greengenes database in Partek Flow, and I also ran the same dataset through QIIME 2 with DADA2 and the Greengenes database, as described in "Analyzing PacBio HiFi Mock Community 16S Data with QIIME 2" (PacificBiosciences/HiFi-16S-workflow Wiki on GitHub).

The issue is that the two outputs show opposite alpha diversity trends. The Kraken2 pipeline shows a significantly higher Shannon index in Control than in KO, whereas the QIIME 2/DADA2 output shows a lower Shannon index in Control than in KO. Usually, two different approaches do not show contrasting trends: the significance might change, but not the direction.

I need guidance in deciding which output is more reliable for publishing.

Please guide me in figuring this out.

Best,
R

colinbrislawn · August 6, 2022, 3:05am

Hello R,

This is tricky to compare because the components of these pipelines differ in many ways, and it's hard to isolate which one is causing this change. But like you said, two different approaches usually don't show contrasting trends.

That is concerning!

Did you include any mock communities with known compositions you could use to validate these two pipelines?

If not, you could use existing positive controls! When you run the ATCC MSA-1003 mock samples from that PacBio tutorial through Partek, does it produce comparable results, or are they different as well?
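
One way to make the mock-community check concrete is to score each pipeline's profile against the known composition. The snippet below is only a minimal sketch, not part of either pipeline: the taxon names and counts are made-up placeholders, the helper names (`shannon`, `to_relative`, `bray_curtis`) are hypothetical, and it assumes you have already exported per-taxon counts from each tool at the same taxonomic level. Note that tools may use different logarithm bases for the Shannon index, so compare values computed the same way.

```python
import math


def shannon(counts, base=math.e):
    """Shannon index H = -sum(p_i * log(p_i)) over features with nonzero counts."""
    total = sum(counts.values())
    h = 0.0
    for c in counts.values():
        if c > 0:
            p = c / total
            h -= p * math.log(p, base)
    return h


def to_relative(counts):
    """Convert raw counts to relative abundances that sum to 1."""
    total = sum(counts.values())
    return {taxon: c / total for taxon, c in counts.items()}


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance profiles (0 = identical)."""
    taxa = set(a) | set(b)
    num = sum(abs(a.get(t, 0.0) - b.get(t, 0.0)) for t in taxa)
    den = sum(a.get(t, 0.0) + b.get(t, 0.0) for t in taxa)
    return num / den


if __name__ == "__main__":
    # Placeholder data: replace with the documented mock composition and
    # the per-taxon counts exported from each pipeline for the same sample.
    expected = {"taxon_A": 25, "taxon_B": 25, "taxon_C": 25, "taxon_D": 25}
    pipeline_counts = {
        "kraken2": {"taxon_A": 300, "taxon_B": 280, "taxon_C": 250, "taxon_E": 170},
        "qiime2_dada2": {"taxon_A": 260, "taxon_B": 240, "taxon_C": 255, "taxon_D": 245},
    }

    print("expected Shannon:", round(shannon(expected), 3))
    for name, counts in pipeline_counts.items():
        dissimilarity = bray_curtis(to_relative(expected), to_relative(counts))
        print(name, "Shannon:", round(shannon(counts), 3),
              "Bray-Curtis vs expected:", round(dissimilarity, 3))
```

The pipeline whose mock profile sits closer to the expected composition (lower dissimilarity, Shannon index nearer the expected value) is likely the more trustworthy one for the real samples, though a single mock sample is limited evidence.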
