DADA2 - All features filtered out of ITS data

I'm working with 301 bp paired-end ITS1 reads. After trimming they are 281 bp.

Here are the quality plots for the trimmed sequences.

Historically, we haven't truncated ITS reads when denoising, to ensure we have full coverage, but I found a post on this forum (can't find the link now) stating that truncating to leave the 12 bp overlap required by DADA2 may improve the filtering results. It did, slightly, but not to an acceptable level.
I denoised with DADA2 using the following command:

# trunc-len values chosen to leave the 12 bp overlap required for merging
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs ./sequences/trimmed_demux_seqs.qza \
  --p-trunc-len-f 280 \
  --p-trunc-len-r 267 \
  --o-table ./Dada2/dada2_table.qza \
  --o-representative-sequences ./Dada2/dada2_rep_seqs.qza \
  --o-denoising-stats ./Dada2/dada2_stats.qza

This still filters out all features. While not ideal, I tried increasing --p-max-ee-r, which significantly improved the number of reads passing the DADA2 filter (from 0% to ~60%), but I am hesitant to do that if it will let more errors through. I'm not really sure how to move forward from here.
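For reference, that test was along these lines (the --p-max-ee-r value of 4 shown here is only an illustrative placeholder, not necessarily the value to keep; the default is 2.0):

# --p-max-ee-r 4 is an example value only; tune it to your data
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs ./sequences/trimmed_demux_seqs.qza \
  --p-trunc-len-f 280 \
  --p-trunc-len-r 267 \
  --p-max-ee-r 4 \
  --o-table ./Dada2/dada2_table.qza \
  --o-representative-sequences ./Dada2/dada2_rep_seqs.qza \
  --o-denoising-stats ./Dada2/dada2_stats.qza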

Is the poor quality toward the end of the reverse reads the reason DADA2 is filtering out all of the features with default parameters?

Does anyone have any suggestions on how to improve DADA2 denoising here?

Thanks in advance!

Hello! Welcome back!

You are right, truncating the ends of the reads may improve merging output by improving the overall quality of the overlapping region. However, the DADA2 developers recommend not truncating ITS amplicons because their length is highly variable, so disabling truncation may actually improve merging for longer amplicons. In addition, truncating the forward reads so close to their full length (280 of 281 bp) can be too strict for your data and cause shorter reads to be filtered out of the dataset.

I think increasing --p-max-ee-r is the right move here, since otherwise you are losing too much of the data. I would go for it and add a corresponding note to the materials and methods when publishing this data.

In addition, you can try setting the --p-min-overlap parameter to a value lower than 12 to see if it recovers some extra reads in the dataset.

So, in summary (a sketch of the combined command follows the list):

  • disable truncation
  • set a lower --p-min-overlap (4-6, for example)
  • increase the allowed error rate (--p-max-ee-r)
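A minimal sketch of what that combined run could look like, assuming your QIIME 2 release is recent enough to expose --p-min-overlap (the min-overlap and max-ee-r values below are example starting points, not recommendations for your exact data):

# trunc-len 0 disables truncation; min-overlap and max-ee-r are example values to tune
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs ./sequences/trimmed_demux_seqs.qza \
  --p-trunc-len-f 0 \
  --p-trunc-len-r 0 \
  --p-min-overlap 6 \
  --p-max-ee-r 4 \
  --o-table ./Dada2/dada2_table.qza \
  --o-representative-sequences ./Dada2/dada2_rep_seqs.qza \
  --o-denoising-stats ./Dada2/dada2_stats.qza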

Hope this helps save more reads for the analysis.

P.S. If you are sure that your reads are long enough to merge the amplicons, you can also truncate the forward reads at a position lower than the current 280, in combination with a decreased min-overlap, and compare the outputs.
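For example, something along these lines for that comparison run (the truncation positions below are placeholders; pick them from your quality plots):

# truncation positions are placeholders; choose them from your quality profiles
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs ./sequences/trimmed_demux_seqs.qza \
  --p-trunc-len-f 260 \
  --p-trunc-len-r 230 \
  --p-min-overlap 6 \
  --o-table ./Dada2/dada2_table_truncated.qza \
  --o-representative-sequences ./Dada2/dada2_rep_seqs_truncated.qza \
  --o-denoising-stats ./Dada2/dada2_stats_truncated.qza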

