DADA2 detected few representative sequences

Hi all,

I am new to QIIME 2 and I am working with single-end reads from Illumina sequencing.

When I ran DADA2, I kept ~70% of the total sequences per sample, but for some samples I obtained only a few representative sequences. I don't understand exactly how DADA2 clusters the sequences. I know that I'm not losing sequences at the chimera-removal step, because if I sum the ASV counts per sample in the feature table, the total equals the number of reads kept after chimera removal.

My line:

qiime dada2 denoise-single --i-demultiplexed-seqs Illumina_V4.qza --p-trunc-len 0 --p-n-threads 20 --o-table table_Illumina_V4.qza --verbose --o-representative-sequences rep_Illumina_V4.qza --o-denoising-stats stats_Illumina_V4.qza

Am I doing something wrong?
Should I add more training examples with the option --p-n-reads-learn?



Welcome to the forum @Max!

You should check out the stats file to figure out where you are losing seqs. I strongly suspect you are losing seqs at the pre-filtering stage:

Unless your reads are pristine, you will probably want to truncate (either with --p-trunc-len or with --p-trunc-q). Otherwise, DADA2 will filter out any read with more than 2 expected errors (the default; see the --p-max-ee parameter), and with no truncation applied that will probably be a lot of reads.
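
To make the expected-errors point concrete, here is a minimal Python sketch. It assumes the standard Phred relationship, where a base with quality Q has error probability 10^(-Q/10), and that a read's expected errors are the sum of those probabilities. The quality scores and the truncation position of 150 are made up for illustration; they are not taken from your data, and this is not DADA2's actual code.

```python
def expected_errors(phred_scores):
    """Sum of per-base error probabilities, with p = 10 ** (-Q / 10)."""
    return sum(10 ** (-q / 10) for q in phred_scores)


# Hypothetical read: 150 good bases (Q35) followed by a 100-base noisy tail (Q15).
read_quals = [35] * 150 + [15] * 100

MAX_EE = 2.0  # the default threshold mentioned above

full_ee = expected_errors(read_quals)         # whole read, no truncation
trunc_ee = expected_errors(read_quals[:150])  # same read truncated at position 150

print(f"full length:      EE = {full_ee:.2f}, kept = {full_ee <= MAX_EE}")
print(f"truncated at 150: EE = {trunc_ee:.2f}, kept = {trunc_ee <= MAX_EE}")
```

With these made-up scores the untruncated read has about 3.2 expected errors and would be dropped, while the truncated one has about 0.05 and would be kept. The interactive quality plot for your demultiplexed reads should help you pick a sensible truncation position for real data.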

Give that a spin and let us know how it goes!

