Problem with DADA2 (return code 1)

Hi @nricks,
It sounds like you want to use dada2 to dereplicate your sequences rather than to denoise them, which is what it is intended for. The use of fake quality scores implies this to me, and fake scores would render dada2's output meaningless even if the run did not result in errors. I would recommend using q2-vsearch to dereplicate (and possibly cluster) your sequences instead. I am unfamiliar with analysis of PacBio data, but I would recommend the approach described in that tutorial link rather than dada2. QIIME 2 does not currently contain any methods explicitly designed for quality filtering PacBio data. Would this address your need?
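To make clear what I mean by dereplication: it collapses identical reads into one representative sequence with an abundance count, and it never looks at quality scores. Here is a minimal Python sketch of that idea. The function name `dereplicate` and the choice to ignore case are my own assumptions for illustration, and this is not how vsearch implements it.

```python
from collections import Counter


def dereplicate(sequences):
    """Collapse identical sequences into (sequence, count) pairs.

    Hypothetical helper for illustration only. Case is ignored here,
    which is an assumption of this sketch.
    """
    counts = Counter(seq.upper() for seq in sequences)
    # most_common() sorts by abundance, highest first
    return counts.most_common()


reads = ["ACGT", "acgt", "TTGA"]
for seq, count in dereplicate(reads):
    print(seq, count)
```

Note that no quality information is needed at any point, which is why fake quality scores are unnecessary for this step.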

As far as I know, the error profiling method in dada2 is not tuned for PacBio data, so I would not advise using dada2 here (though I may be wrong; @benjjneb would know better about the suitability of PacBio data for dada2).

Is that the case, or do PacBio data use a different PHRED encoding? A PHRED score of 93 may be off the scale expected by dada2 (Illumina 1.8+ Phred+33 scores do not go up to 93, as far as I know).
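For reference, Phred+33 turns each quality character into a score by subtracting 33 from its ASCII code, so the highest printable character, `~` (ASCII 126), corresponds to a score of 93. A quick Python sketch for checking which scores a quality string actually contains (the function name `decode_phred33` is my own, not part of any library):

```python
def decode_phred33(quality_string):
    """Convert a Phred+33 quality string into a list of integer scores."""
    # Each character's ASCII code minus the offset of 33 gives the score
    return [ord(char) - 33 for char in quality_string]


print(decode_phred33("!I~"))  # → [0, 40, 93]
```

So a score of 93 is the top of what the encoding can represent. Whether a given tool expects scores that high is a separate question.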

:fearful:
This would be highly inadvisable as input to any method that actually uses quality scores, let alone one for denoising (dada2). If you arbitrarily set quality scores for each read, what do you hope to accomplish with dada2?

I hope that helps!
