Changing --p-max-ee-r to retain more sequences after DADA2 filtering

Hi all,

I am running an ITS1 analysis. From what I have read, ITS has to be handled differently because the amplicon length varies.
We trimmed the primers with bbduk and then imported the reads into QIIME 2.

In rawdata.qzv, the quality drops quite sharply for the reverse reads, and the sequence lengths are around 279-300 bp.

rawdata.qzv (275.6 KB)

I then noticed that DADA2 filtered out a lot of reads (see dada2-stats-0.qzv). Because we would like to keep more reads through DADA2, we raised --p-max-ee-r to 6, and more reads did pass the filter (see dada2-stats.qzv).

dada2-stats-0.qzv (1.2 MB)
dada2-stats.qzv (1.2 MB)

I think changing this parameter works better for my data, but can anyone give an opinion on this?
Am I doing it right, or should I stick with the default? I understand that ITS is a bit different from the usual 16S or 18S, but am I missing anything?

Thank you for the help.

Sorry about the radio silence, @afinaa! It’s been super busy in QIIME-land lately. I’m not experienced with ITS sequences personally, so please take the following as brainstorming, not as a recommendation.

Changing max-ee is legitimate, but it should be justified in any publication: you are essentially telling DADA2 to work with reads that carry more expected errors. It can do that, but it could increase the number of spurious (“false”) sequences it produces. This may be the best approach for your data, but exploring some other approaches might be worthwhile.
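
As a rough illustration of what the max-ee threshold measures, here is a minimal Python sketch. It assumes the standard Phred relationship (a score Q implies an error probability of 10^(-Q/10)) and treats a read's expected errors as the sum of those probabilities. The function names and the quality values are made up for illustration; this is only the arithmetic behind the threshold, not DADA2's own code.

```python
def expected_errors(phred_scores):
    """Sum of the per-base error probabilities implied by Phred scores."""
    return sum(10 ** (-q / 10) for q in phred_scores)


def passes_max_ee(phred_scores, max_ee):
    """True if the read's expected errors do not exceed max_ee."""
    return expected_errors(phred_scores) <= max_ee


# Hypothetical reverse read: 200 good bases (Q30) followed by a
# 100-base low-quality tail (Q15). Values are illustrative only.
read = [30] * 200 + [15] * 100

print(f"expected errors: {expected_errors(read):.2f}")
print("passes max-ee 2:", passes_max_ee(read, 2))
print("passes max-ee 6:", passes_max_ee(read, 6))
```

This made-up read carries about 3.4 expected errors, almost all of them from the low-quality tail, so it is dropped at a threshold of 2 and kept at 6. That is the trade-off: a higher threshold keeps more reads, but the reads it adds are the ones with more likely errors.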

Are your bbduk parameters searching for the reverse complements of your primers in addition to the primer sequences themselves? I think it does this by default, but it would be a shame to lose reads to read-through artifacts, where a short ITS amplicon lets the read run into the reverse complement of the opposite primer.
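
To check a few reads by hand for read-through, a small Python sketch like the following may help. The primer and read below are short made-up placeholders, not real ITS primers, and the search is an exact string match, so it will not find primers that contain sequencing errors or match degenerate bases in the read. It is a quick sanity check, not a substitute for a trimming tool.

```python
# IUPAC complement table, including the ambiguity codes used in
# degenerate primers.
COMPLEMENT = str.maketrans(
    "ACGTRYSWKMBDHVN",
    "TGCAYRSWMKVHDBN",
)


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (IUPAC codes allowed)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_readthrough(read, opposite_primer):
    """Index where the opposite primer's reverse complement starts, or -1."""
    return read.upper().find(reverse_complement(opposite_primer))


# Hypothetical placeholder sequences, for illustration only.
primer = "ACGGTTCA"
read = "TTTTGGGGAAAA" + reverse_complement(primer) + "CC"

print(reverse_complement(primer))
print(find_readthrough(read, primer))
```

If the returned index is not -1 for many forward reads (using the reverse primer) or reverse reads (using the forward primer), read-through is likely present and the trimming step should remove it.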

Would you lose too much data by using forward reads only? This would obviously be a compromise, but your forward reads look pretty clean out to about position 255, so you’d still have something to work with.
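
One thing to keep in mind with a fixed truncation length on ITS data: as I understand the documented behavior, DADA2 discards reads that are shorter than the truncation length instead of keeping them at their shorter length. The Python sketch below only mimics that rule so you can estimate how many reads a given cutoff would cost; the function name and the read lengths are made up for illustration.

```python
def truncate_reads(reads, trunc_len):
    """Cut each read to trunc_len bases and drop reads shorter than trunc_len.

    Returns the kept (truncated) reads and the number of reads dropped.
    """
    kept = [r[:trunc_len] for r in reads if len(r) >= trunc_len]
    return kept, len(reads) - len(kept)


# Hypothetical forward-read lengths after primer trimming.
lengths = [300, 279, 260, 240, 190]
reads = ["A" * n for n in lengths]

kept, dropped = truncate_reads(reads, 255)
print(f"kept {len(kept)} reads, dropped {dropped}")
```

If a large share of your primer-trimmed forward reads are shorter than 255 bp, truncating there would discard them, so it is worth checking the length distribution before settling on a value.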

Finally, here are some other approaches to preprocessing ITS sequences for DADA2 (with q2-cutadapt, with q2-ITSxpress, and in R), plus an interesting discussion of the same topic. I’m not sure these will be directly useful, but they contain some good conceptual discussion that might help you diagnose and correct the issue.

I wish I had more experience I could lend - hopefully these resources are useful. If not, let us know how things are going, and someone with more experience than I have may step in to help out.

Good luck!
Chris :leopard:
