dada2 denoise-ccs: most reads were filtered out

Hello. I am using dada2 denoise-ccs on PacBio 16S data. After denoising, around 80% of the reads were filtered out. The counts after primer removal look fine, so the loss seems to happen at the filter step.
I have tried widening the length window to 500-2000 bp and increasing --p-max-ee to 10, but neither helped. Below are the command I used and the DADA2 results. I searched the forum but could not solve this issue. Could someone help me, please?

qiime dada2 denoise-ccs --i-demultiplexed-seqs import.qza \
  --p-front AGRGTTYGATYMTGGCTCAG --p-adapter RGYTACCTTGTTACGACTT \
  --p-min-len 500 --p-max-len 2000 --p-n-threads 45 --p-max-ee 10 \
  --o-table table --o-representative-sequences rep-seqs --o-denoising-stats dada2-Stats


I have also attached the FastQC report here; the quality is quite good.

Hello @feixiang1209,

Can you attach the demux visualization?

@colinvwood Hi, please find it attached:
dada2-stats.qzv (1.2 MB)
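
For anyone reading along, here is a minimal sketch of how the per-step loss can be read off a denoising-stats table once it is exported as TSV. The column names are my assumption about the denoise-ccs stats layout, and the counts are made-up illustration values, not the numbers from the attached file.

```python
import csv
import io

# Hypothetical counts in the column layout assumed for an exported
# denoising-stats table; these are NOT the values from dada2-stats.qzv.
STATS_TSV = """sample-id\tinput\tprimer-removed\tfiltered\tdenoised\tnon-chimeric
sampleA\t50000\t48000\t9600\t9000\t8500
sampleB\t52000\t50000\t11000\t10400\t9900
"""

STEPS = ["input", "primer-removed", "filtered", "denoised", "non-chimeric"]


def step_retention(tsv_text):
    """Return {sample: [(step, percent kept relative to the previous step)]}."""
    result = {}
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        counts = [int(row[s]) for s in STEPS]
        kept = []
        for prev, cur, name in zip(counts, counts[1:], STEPS[1:]):
            pct = 100.0 * cur / prev if prev else 0.0
            kept.append((name, round(pct, 1)))
        result[row["sample-id"]] = kept
    return result


retention = step_retention(STATS_TSV)
for sample, steps in retention.items():
    print(sample, steps)
```

Comparing each step with the one before it, rather than with the raw input, makes it easier to see whether the reads are lost at primer removal, at filtering, or later.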

Hello Xiang,

First, these read counts are pretty good! 50k reads per sample gives you a lot to work with!

Second, you could add --p-trim-left 5 to remove the low-quality bases at positions 2 and 5, which could be contributing extra errors.

EDIT: I think Colin V Wood was asking about the quality score visualization that QIIME 2 produces after demultiplexing the data. Do you have that file?

@colinvwood Sorry, I misunderstood. I have attached the demux.qzv here. Thanks.
demux.qzv (305.8 KB)

Thanks for the reply.
Actually, this is just the first small batch. In the second batch an even larger percentage of reads was filtered out, which left very few reads.
I didn't try trimming from the left, but I did increase max-ee to 100 and still got a similar result, so I don't think the loss is related to quality.
Since more samples are waiting for analysis, I would really like to find the cause. I have attached the demux visualization; could you please have a look and advise what the issue might be? Thanks a lot.
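
To make the max-ee reasoning concrete, here is a minimal sketch of the expected-error calculation that DADA2's filter is based on: each base contributes an error probability of 10^(-Q/10), and a read is dropped when the sum exceeds the threshold. The function names and the quality values are hypothetical, chosen only for illustration.

```python
def expected_errors(phred_scores):
    """Sum of per-base error probabilities: EE = sum(10 ** (-Q / 10))."""
    return sum(10 ** (-q / 10) for q in phred_scores)


def passes_max_ee(phred_scores, max_ee):
    """A read is kept when its expected errors do not exceed max_ee."""
    return expected_errors(phred_scores) <= max_ee


# Hypothetical 1,450 bp CCS read: mostly Q40, with 50 weaker Q20 bases.
read_quals = [40] * 1400 + [20] * 50
ee = expected_errors(read_quals)

print(round(ee, 2))
print(passes_max_ee(read_quals, 10))
```

A read like this sits far below a threshold of 10, let alone 100, which is why raising max-ee without any change in the outcome points away from the expected-error filter as the cause.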

QIIME 2 View direct link to demux.qzv

Based on the Interactive Quality Plot, it looks like quality sometimes drops around position 1500.

Perhaps a shorter truncation length?

Thanks for the advice. I tried truncating to 1400 to see whether that improved things, but the filtered percentage is almost the same. I also tried DADA2 in R and got the same result. This is really strange. Could the high number of reads affect the filtering?
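
For reference, a rough sketch of how the length-related filters interact, following the order documented for DADA2's filterAndTrim: the maximum length is checked before truncation, reads shorter than the truncation length are discarded, and the minimum length is checked after truncation. The function name and the read lengths are hypothetical and are not taken from the attached demux.qzv.

```python
from collections import Counter


def filter_outcome(length, trunc_len=0, min_len=500, max_len=2000):
    """Classify one read by length alone (quality is ignored in this sketch)."""
    if max_len and length > max_len:   # max length is checked before truncation
        return "too_long"
    if trunc_len:
        if length < trunc_len:         # reads shorter than trunc_len are dropped
            return "shorter_than_trunc"
        length = trunc_len
    if length < min_len:               # min length is checked after truncation
        return "too_short"
    return "kept"


# Hypothetical read lengths for illustration only.
lengths = [1450, 1462, 1390, 480, 2100, 1455, 1448, 1500]

no_trunc = Counter(filter_outcome(n) for n in lengths)
trunc_1400 = Counter(filter_outcome(n, trunc_len=1400) for n in lengths)

print(dict(no_trunc))
print(dict(trunc_1400))
```

Running the real read-length distribution through a check like this would show whether the length window or the truncation setting alone could account for a loss of this size.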

Hello @feixiang1209,

Sorry for the delay in getting back to you. Were you able to resolve things or are you still experiencing difficulties?