Quality control with Greengenes2

Dear QIIME 2 community,

I used to remove likely non-prokaryotic sequences by running qiime quality-control exclude-seqs against the old Greengenes database (clustered at 88% identity), as discussed here: 16S V3-V4 length : what to do with the sequences of shorter than expected length after DADA2? - #3 by Mehrbod_Estaki, with the following command:

qiime quality-control exclude-seqs \
  --i-query-sequences rep-seqs-merged.qza \
  --i-reference-sequences 88_otus.qza \
  --p-method vsearch \
  --p-perc-identity 0.65 \
  --p-perc-query-aligned 0.5 \
  --p-threads 8 \
  --o-sequence-hits hits.qza \
  --o-sequence-misses misses.qza

This filtering gave me good results: the discarded ASVs were shorter than expected and had strange BLAST hits or no hits at all.

I am now trying to run the same step with Greengenes2, using this command:

qiime quality-control exclude-seqs \
  --i-query-sequences rep-seqs-merged.qza \
  --i-reference-sequences 2024.09.backbone.full-length.fna.qza \
  --p-method vsearch \
  --p-perc-identity 0.65 \
  --p-perc-query-aligned 0.5 \
  --p-threads 8 \
  --o-sequence-hits hits.qza \
  --o-sequence-misses misses.qza

However, misses.qza is now almost empty. I am afraid that sequences that did not align to the previous version of Greengenes now do align, even though they are non-prokaryotic.

Similarly, when I use the RESCRIPt-curated SILVA 138.2, I get half as many misses as I got with the old Greengenes reference.
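
One way to check this concern would be to list the ASVs that were misses against the old reference but are no longer misses against the new one, and then inspect those sequences (for example with BLAST). The following is only a minimal sketch under some assumptions: each misses.qza has been exported to FASTA (for instance with qiime tools export, which should write a dna-sequences.fasta file into the chosen output directory), and the function names and file paths are hypothetical.

```shell
# Hypothetical helpers; the FASTA paths passed in are assumed to be exports
# of the two misses.qza artifacts, they are not produced by this snippet.

# Print the sorted, unique sequence IDs of a FASTA file
# (the first whitespace-delimited word of each header line).
fasta_ids() {
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$1" | LC_ALL=C sort -u
}

# Print IDs that appear in the first FASTA file but not in the second.
ids_only_in_first() {
    first_ids=$(mktemp)
    second_ids=$(mktemp)
    fasta_ids "$1" > "$first_ids"
    fasta_ids "$2" > "$second_ids"
    # comm -23 keeps lines unique to the first (sorted) input.
    LC_ALL=C comm -23 "$first_ids" "$second_ids"
    rm -f "$first_ids" "$second_ids"
}
```

For example, assuming the exports live in directories named old-misses and gg2-misses, ids_only_in_first old-misses/dna-sequences.fasta gg2-misses/dna-sequences.fasta would print the IDs of ASVs that the old Greengenes filter rejected but the Greengenes2 filter kept.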

Any idea about what could be going on?

Thanks a lot for your help!

Hi @pau,

The positive filter used by Deblur, which appears to be what was suggested here, has, to the best of my knowledge, only been evaluated against an older version of Greengenes. Since the goal is only to keep sequences that are putatively 16S, and because the prior version of Greengenes spanned a vast diversity of candidate and recognized phyla, I don’t think it is likely to be necessary to revisit the filter. The exact steps used to determine the filter for Deblur can be found here.

Best,

Daniel
