I know this has been posted about several times and I have gone through many forum posts with the same issues, but I was hoping someone might be able to help me out on this. Any help would be very much appreciated.

Basically, my issue is low %filtered and %merged after running dada2. I'm looking at the V3V4 region, so as I understand it, my forward and reverse reads together should cover at least (805-341)+20 = 484 bp.
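To make that arithmetic explicit, here is a minimal sketch of the calculation. The variable names are my own, and the 20 bp figure is simply the minimum overlap assumed in the formula above, not a value taken from the DADA2 documentation.

```shell
# Primer positions for the V3V4 region, as quoted in the post.
fwd_pos=341
rev_pos=805
# Assumed minimum overlap between forward and reverse reads (from the formula above).
min_overlap=20

amplicon_span=$(( rev_pos - fwd_pos ))
min_combined=$(( amplicon_span + min_overlap ))

echo "amplicon span: ${amplicon_span} bp"
echo "minimum combined read length: ${min_combined} bp"
```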

So to begin with, the quality scores of my reverse reads are not great:

But I decided to plow on with paired denoising because as someone pointed out, sequencing is expensive, so let's try to use everything I have first. These were my trimming and truncating parameters:

qiime dada2 denoise-paired \
  --i-demultiplexed-seqs Baywide_microbiome/all_samples/Baywide-full-demux-paired-end.qza \
  --p-trim-left-f 18 \
  --p-trim-left-r 18 \
  --p-trunc-len-f 290 \
  --p-trunc-len-r 200 \
  --p-n-threads 48 \
  --o-representative-sequences Baywide_microbiome/all_samples/trim290f-200r/full-rep-seqs-dada2-trim290f-200r-18left.qza \
  --o-table Baywide_microbiome/all_samples/trim290f-200r/full-table-dada2-trim290f-200r-18left.qza \
  --o-denoising-stats Baywide_microbiome/all_samples/trim290f-200r/full-stats-dada2-trim290f-200r-18left.qza

I realise this falls short of the required overlap, since 290-18+200-18 = 454, which might explain my %merged issue. But I don't think I can keep any more of the reads because my reverse reads are not great. Here's a screenshot of my dada2-stats:
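For anyone wanting to try other truncation values, this is a rough sketch of the same check as a reusable shell function. The function name `combined_length` is hypothetical, and it follows the arithmetic used in this post (truncation length minus left trim, summed over both reads) against the 484 bp target stated earlier; it is not how DADA2 itself decides whether reads merge.

```shell
# Combined read length after trimming and truncation, using the post's arithmetic.
# Arguments: trunc-len-f, trunc-len-r, trim-left-f, trim-left-r
combined_length() {
  echo $(( ($1 - $3) + ($2 - $4) ))
}

required=484   # target from the post: (805-341)+20
combined=$(combined_length 290 200 18 18)
shortfall=$(( required - combined ))

if [ "$combined" -ge "$required" ]; then
  echo "combined length ${combined} bp meets the ${required} bp target"
else
  echo "combined length ${combined} bp is short of ${required} bp by ${shortfall} bp"
fi
```

Swapping in different truncation lengths in the `combined_length` call shows quickly whether a given pair of settings would clear the target under this rule of thumb.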

I've sorted from the lowest, but basically my %filter range is from 0 - 71.27%. I wonder about the first four samples, which had very little input to begin with. But even discounting them, my %filter range is from 52.58 - 71.27%.
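In case it helps anyone reproduce this kind of summary, here is a small sketch of how the range could be computed from a stats table exported as TSV (for example with `qiime tools export`). The file below is a made-up, simplified stand-in with only three columns and invented counts, and the `min_input` cutoff of 1000 reads is an arbitrary assumption for deciding which samples count as "very little input".

```shell
# Hypothetical, simplified stats table; the counts are invented for illustration only.
stats_tsv=$(mktemp)
printf 'sample-id\tinput\tfiltered\n'   >  "$stats_tsv"
printf '#q2:types\tnumeric\tnumeric\n'  >> "$stats_tsv"
printf 'S1\t12\t0\n'                    >> "$stats_tsv"
printf 'S2\t50000\t30000\n'             >> "$stats_tsv"
printf 'S3\t40000\t28000\n'             >> "$stats_tsv"
printf 'S4\t20000\t10500\n'             >> "$stats_tsv"

# Skip the header and the #q2:types line, drop low-input samples,
# then report the lowest and highest percentage of reads passing the filter.
range=$(awk -F'\t' -v min_input=1000 '
  NR == 1 || /^#/ { next }
  $2 >= min_input {
    pct = 100 * $3 / $2
    if (n == 0 || pct < lo) lo = pct
    if (n == 0 || pct > hi) hi = pct
    n++
  }
  END { if (n > 0) printf "%.2f %.2f", lo, hi }
' "$stats_tsv")

echo "%filtered range (low-input samples excluded): ${range}"
rm -f "$stats_tsv"
```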

So I thought, ok fine, let's just try the forward reads then. I used basically the same parameters because I figured the forward read looks pretty good.

qiime dada2 denoise-single \
  --i-demultiplexed-seqs Baywide_microbiome/all_samples/only_forward/Baywide-full-demux-single-end.qza \
  --p-trim-left 18 \
  --p-trunc-len 290 \
  --p-n-threads 48 \
  --o-representative-sequences Baywide_microbiome/all_samples/only_forward/rep-seqs-dada2-trim18-290.qza \
  --o-table Baywide_microbiome/all_samples/only_forward/table-dada2-trim18-290.qza \
  --o-denoising-stats Baywide_microbiome/all_samples/only_forward/stats-dada2-trim18-290.qza

However, that didn't really solve my %filtered problem, with it still ranging from 19.17 - 77.75%. I guess the bottom range has improved, but why not the top range? :persevere:

I am pretty new to all this and have been consulting a more experienced user as well, but I figured extra eyes would definitely help.

I am running QIIME 2021.2.0 on Linux (Ubuntu).

I think you can extend the reverse reads quite a bit; personally, I would try truncating at 240.

I think it's safe to say those first 4 samples aren't recoverable, so that puts the effective range at 67%-77%, which I think is perfectly reasonable (IMO) and isn't cause for concern.


I've tried your suggested parameters. I'm guessing you were happy to keep everything else and just truncate the reverse reads at 240? Hopefully, because that's what I did haha. It unfortunately hasn't helped my dada2 stats. If, like you said, we ignore the first 4 samples, my %filter is still 19.42 - 59.92% and %merge 3.73 - 33.11% :weary:

full-stats-dada2-trim290f-240r-18left.qzv (1.2 MB)

@ymt89 - you'll have to play around with this a bit to figure out the best path forward - if you haven't already read the DADA2 paper and docs, I highly recommend you start there:

