DADA2 options for getting full-length sequences

Hello everyone,
I'm running QIIME 2 to analyse my data from an Ion S5 sequencer, but I now have a problem at the DADA2 step.
I tried this command:

qiime dada2 denoise-single \
  --i-demultiplexed-seqs demux.qza \
  --p-trim-left 0 \
  --p-trunc-len 0 \
  --o-representative-sequences rep-seqs-dada2.qza \
  --o-table table-dada2.qza

I did a quality check and used FASTQ data that still includes the forward and reverse primer sequences.
Quality check with FastQC:

[Figure: FastQC plot, "Distribution of sequence lengths over all sequences"]

I also set trim-left and trunc-len to 0 because I wanted to keep the full length of the reads I used.
But when I checked the file rep-seqs.qza, there were quite a lot of short sequences ending in a repeat of a single base (‘TTTT..’, ‘AAAA..’, ‘GGGG..’ or ‘CCCC..’), like the ones below.


When I checked the FASTQ file with this command (the 10 bolded bases are the primer sequence, by the way):


there were over 200 reads like the ones below.


I have no idea why I got so many features with short sequences.
Is there any other way to get full-length reads?
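
To put a number on the symptom described above, a small helper can flag sequences that end in a run of a single base. This is only a rough sketch: the function names, the `min_run` threshold of 4, and the toy sequences are all made up for illustration and do not come from the data in this thread.

```python
import re


def ends_in_homopolymer(seq, min_run=4):
    """Return True if seq ends with at least min_run copies of the same base."""
    # ([ACGT]) captures one base; \1{n,} requires at least n further copies of it.
    pattern = r"([ACGT])\1{%d,}$" % (min_run - 1)
    return re.search(pattern, seq) is not None


def count_homopolymer_ends(seqs, min_run=4):
    """Count how many sequences end in a homopolymer run."""
    return sum(ends_in_homopolymer(s, min_run) for s in seqs)


# Toy sequences (hypothetical, not taken from the original FASTQ file).
toy_seqs = ["ACGTACGTTTT", "ACGTACGAAAAA", "ACGTACGTACG"]
print(count_homopolymer_ends(toy_seqs))  # 2
```

The same check could be applied to sequences exported from the representative-sequences artifact or read from the FASTQ file, to see whether the short features share this pattern.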

Thank you in advance :grinning:

Hi there @Choe!

Did you happen to see the documentation for this parameter:

  --p-trunc-q INTEGER             Reads are truncated at the first instance of
                                  a quality score less than or equal to this
                                  value. If the resulting read is then shorter
                                  than `trunc_len`, it is discarded.
                                  [default: 2]

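The trunc-q parameter truncates each read at the first base whose quality score is at or below this value (2 by default), so a single low-quality base partway through a read cuts off everything after it. Perhaps this is what is going on? Maybe try rerunning with --p-trunc-q 0?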
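
To illustrate the documented behaviour, here is a minimal Python sketch of the truncation rule as the help text describes it. It is not DADA2's actual implementation: the function name `truncate_at_quality`, the toy read, and its quality scores are invented for the example, and only the discard part of `trunc_len` is modelled.

```python
def truncate_at_quality(seq, quals, trunc_q=2, trunc_len=0):
    """Cut a read at the first base whose quality score is <= trunc_q.

    Returns the (possibly shortened) sequence, or None if the result
    is shorter than trunc_len and would therefore be discarded.
    """
    for i, q in enumerate(quals):
        if q <= trunc_q:
            seq = seq[:i]  # drop this base and everything after it
            break
    if len(seq) < trunc_len:
        return None
    return seq


# Toy read (hypothetical): quality falls to 2 inside a run of T's.
read = "ACGTACGTTTTTACGT"
quals = [30, 30, 28, 27, 25, 24, 20, 18, 12, 8, 5, 2, 20, 22, 25, 25]

print(truncate_at_quality(read, quals, trunc_q=2))  # ACGTACGTTTT
print(truncate_at_quality(read, quals, trunc_q=0))  # ACGTACGTTTTTACGT
```

With the default of 2 the toy read is cut short and ends in a run of T's; with 0 no score in this example meets the threshold, so the full read is kept.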

Keep us posted! :t_rex: :qiime2:


I just tried rerunning with --p-trunc-q 0, and it works perfectly!
Thanks :slight_smile:

