DADA2: Low non-chimeric input after denoising

Dali_benamor · October 24, 2024, 12:02pm

Hello,
I have two samples generated using 2 x 150 bp reads, targeting the 16S (V3-V4) region and ITS2. The sequencing was performed using the Element AVITI platform (Element Biosciences).
I worked directly with the raw data. The first step was a quality check, and the reads showed good quality.
The second step was denoising, but I only used the forward reads because the reads were too short to be merged. I applied --p-trunc-len 146 and --p-trim-left 6.

The issue is that I ended up with very few non-chimeric reads. Is this normal?

timanix · October 24, 2024, 12:43pm

Hello and welcome to the forum!

AVITI instruments use a different quality score scale, so there is no guarantee that DADA2 will handle the data properly.
However, I remember a discussion about using DADA2 with such data, and the advice was to experiment with the maxEE parameter. From your screenshot, it looks like you lost a lot of reads at the filtering and denoising steps. Try increasing maxEE; hopefully that will increase the output.
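For context, the expected error (EE) that maxEE is compared against is the sum of the per-base error probabilities, 10^(-Q/10), over the read. The sketch below computes it for a single Phred+33 quality string with awk. The function name expected_errors is an illustrative helper, not part of DADA2 or QIIME 2, and the quality string in the usage line is made up.

```shell
# Sketch: expected errors (EE) of one read from its FASTQ quality string.
# Assumes Phred+33 encoding; expected_errors is a hypothetical helper name.
expected_errors() {
  # $1 = quality string, e.g. the 4th line of a FASTQ record
  printf '%s\n' "$1" | awk '
    BEGIN {
      # Build a character -> ASCII code lookup for printable characters.
      for (i = 33; i <= 126; i++) ord[sprintf("%c", i)] = i
    }
    {
      ee = 0
      for (j = 1; j <= length($0); j++) {
        q = ord[substr($0, j, 1)] - 33   # Phred quality score
        ee += 10 ^ (-q / 10)             # per-base error probability
      }
      printf "%.3f\n", ee
    }'
}

# Example: eight Q40 bases ("I") followed by two Q20 bases ("5").
expected_errors 'IIIIIIII55'
```

A read passes the filter only if this sum is at most the chosen maxEE, so high-quality reads usually sit far below the default threshold.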

Best,

Dali_benamor · October 24, 2024, 2:06pm

Hello, thank you for your reply.
Could you please show me what the command line would look like?

timanix · October 24, 2024, 2:20pm

You can always check the description of the plugin here: denoise-single: Denoise and dereplicate single-end sequences (QIIME 2 2024.5.0 documentation)

If you are denoising paired-end reads, check the corresponding settings for the paired-end DADA2 action instead.

Or just run:
qiime dada2 denoise-single --help
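For reference, a denoise-single call that sets --p-max-ee might look like the sketch below. The trim and truncation values are the ones from the first post; demux.qza and the output directory names are hypothetical placeholders, and the max-ee values 2, 4, and 6 are only examples (2 is the plugin default). The loop is a dry run that prints each command instead of running it.

```shell
# Dry run: print one denoise-single command per --p-max-ee value.
# demux.qza and the dada2-maxee-* directories are hypothetical names.
print_denoise_commands() {
  for max_ee in 2 4 6; do
    echo qiime dada2 denoise-single \
      --i-demultiplexed-seqs demux.qza \
      --p-trim-left 6 \
      --p-trunc-len 146 \
      --p-max-ee "$max_ee" \
      --output-dir "dada2-maxee-$max_ee"
  done
}

print_denoise_commands
# To actually run the commands, remove the leading "echo" inside the loop.
```

Writing each run to its own output directory keeps the denoising stats separate, so the per-step read counts can be compared across max-ee values.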

Dali_benamor · October 31, 2024, 4:30pm

Thank you for the suggestion! I tried testing various values for --p-max-ee, specifically 6, 30, and 60, but unfortunately, I ended up with the same result each time.

colinbrislawn · October 31, 2024, 4:41pm

Based on that screenshot, it looks like you have 453k and 417k reads in those two samples?

Those are super big values compared to the Illumina MiSeq, so having 'only' ~30% of the reads remain after all the filtering is still quite good!

High chimera counts can also be caused by a high PCR cycle count. This can be adjusted during the wet lab phase the next time you prep some samples. Remember, it's better to remove all the chimeras than to keep them in.

If I were you, I would continue with the analysis and see how it goes!

A --p-max-ee of 60 is pretty high. An expected error of 60 would mean that a full 43% of the bases in a read could be suspect and the read would still pass the filter.
60/(146-6) = 60/140 = 0.43
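The same arithmetic can be repeated for the other --p-max-ee values tried in this thread (6 and 30) and for the default of 2. This is a small sketch using awk; ee_fraction is an illustrative helper name, not a QIIME 2 command, and the read length is taken as trunc-len minus trim-left.

```shell
# Sketch: fraction of a read that max-ee allows to be "expected errors".
# ee_fraction is a hypothetical helper; args: max_ee trunc_len trim_left.
ee_fraction() {
  awk -v ee="$1" -v trunc="$2" -v trim="$3" \
    'BEGIN { printf "%.2f\n", ee / (trunc - trim) }'
}

# Tabulate max-ee against the allowed fraction for 140 bp reads (146 - 6).
for ee in 2 6 30 60; do
  printf '%s\t%s\n' "$ee" "$(ee_fraction "$ee" 146 6)"
done
```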

On a side note, thank you for bringing AVITI data to the forums!

This is a pretty new platform (2022), so we have some work to do to make sure our tools handle it correctly. The Q-scores in the fastq files mean the same thing, so it should work. But as timanix said, there is no guarantee that DADA2 will handle it properly.

Dali_benamor · November 11, 2024, 2:18pm

Thank you very much, Sir.
So in this case I can use DADA2 on my data? I only need to leave --p-max-ee at its default?

colinbrislawn · November 11, 2024, 2:23pm

Yes, you can test out DADA2 on your data.
Because AVITI reports higher Q-scores than Illumina, the default --p-max-ee is fine.

If you have positive control samples with a known composition, these will help you measure the accuracy of this new platform.

Did this sequencing run include any positive controls?
