QIIME2 non-chimeric reads

Hi Support Team,

I have a problem with the DADA2 output: the number of non-chimeric reads was very low during a dry run. Here is the DADA2 stats visualization.

dental-test-dada2-stats.qzv (1.2 MB)

Note that the non-chimeric read counts were very low.

I did trim the primers, which target the 16S V3-V4 region (341F/785R). To remove them, I used the following command:

Primer trimming with cutadapt (341F/785R), anchored at 5′ with IUPAC matching

qiime cutadapt trim-paired \
  --i-demultiplexed-sequences dental-test-paired-end-demux.qza \
  --p-front-f '^CCTACGGGNGGCWGCAG' \
  --p-front-r '^GACTACHVGGGTATCTAATCC' \
  --p-match-adapter-wildcards \
  --p-match-read-wildcards \
  --p-cores ${SLURM_CPUS_PER_TASK} \
  --o-trimmed-sequences dental-test-demux-trimmed.qza

The result shows solid overall read quality, except for the high sequence count of the reverse reads.
dental-test-demux-trimmed.qzv (329.0 KB)

Afterwards, I ran DADA2 as follows:

Run DADA2

qiime dada2 denoise-paired \
  --i-demultiplexed-seqs dental-test-paired-end-demux.qza \
  --p-trunc-len-f 250 \
  --p-trunc-len-r 230 \
  --p-n-threads ${SLURM_CPUS_PER_TASK} \
  --o-representative-sequences dental-test-asv-sequences-0.qza \
  --o-table dental-test-feature-table-0.qza \
  --o-denoising-stats dental-test-dada2-stats.qza \
  --verbose

There were no errors and the merge results look okay, but I am unsure why the proportion of non-chimeric reads is so low. How can I assess chimeric reads before running DADA2?

Many thanks. As a newcomer to this pipeline, I look forward to your guidance.

Hello Jason,

Welcome to the forums!

You are off to a great start. Your --p-trunc-len-f 250 --p-trunc-len-r 230 settings make sense given this quality:

Most reads pass quality filter and are able to join. At most 34% of reads were not chimeric.
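
If you want to check these percentages outside the visualization, here is a minimal sketch. It assumes the stats artifact has been exported with `qiime tools export` to a tab-separated file whose header includes `input`, `merged`, and `non-chimeric` columns, followed by a `#q2:types` line. The function name `nonchimeric_summary` and the example path are hypothetical.

```shell
# Sketch only: summarize non-chimeric reads per sample from an exported
# DADA2 stats table (assumed tab-separated, with a "#q2:types" second line).
# Output columns: sample-id, % of input that is non-chimeric,
# % of merged reads that is non-chimeric.
nonchimeric_summary() {
  awk -F'\t' '
    # Map header names to column indices so column order does not matter.
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    # Skip the QIIME 2 metadata type directive line.
    /^#q2:types/ { next }
    {
      inp = $(col["input"])
      mer = $(col["merged"])
      nc  = $(col["non-chimeric"])
      printf "%s\t%.1f\t%.1f\n", $1,
        (inp > 0 ? 100 * nc / inp : 0),
        (mer > 0 ? 100 * nc / mer : 0)
    }' "$1"
}

# Example (hypothetical path):
# nonchimeric_summary exported-stats/stats.tsv
```

Comparing the two percentages shows whether reads are lost mainly at the chimera-removal step (low share of merged reads) or earlier in filtering and merging.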

What if the problem is upstream, say, a low extraction yield leading to extra PCR cycles during amplification? That would produce more chimeras, especially if the 'real/native' nucleic acid biomass is low.

The best resource here is a positive control sample with a known composition.
Did you happen to sequence any of these?

To whom it may concern,

I did another dry run of DADA2 after cutadapt, but the results were again not great.

Here is the new DADA2 result after I ran the following command.

qiime dada2 denoise-paired \
  --i-demultiplexed-seqs dental-test-paired-end-demux.qza \
  --p-trunc-len-f 250 \
  --p-trunc-len-r 250 \
  --p-trim-left-f 17 \
  --p-trim-left-r 21 \
  --p-n-threads ${SLURM_CPUS_PER_TASK} \
  --o-representative-sequences dental-test-asv-sequences.qza \
  --o-table dental-test-feature-table-0.qza \
  --o-denoising-stats dental-test-dada2-stats.qza \
  --verbose

Here I used the --p-trim-left-f and --p-trim-left-r options to remove as many nucleotides from the start of each read as there are in the forward and reverse primers (17 and 21). The result looks far better:
dental-test-dada2-stats.qzv (1.2 MB)
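
The trim-left values in that command can be derived from the primer sequences rather than typed by hand. The sketch below is only illustrative: the variable names are hypothetical, and it assumes every read starts with the full primer at position 1, which fixed-length trimming cannot verify.

```shell
# Sketch only: derive DADA2 trim-left values from the primer sequences
# used in this thread, and report how many bases remain per read.
FWD_PRIMER='CCTACGGGNGGCWGCAG'        # 341F
REV_PRIMER='GACTACHVGGGTATCTAATCC'    # 785R
TRUNC_F=250                           # --p-trunc-len-f from the command above
TRUNC_R=250                           # --p-trunc-len-r from the command above

TRIM_LEFT_F=${#FWD_PRIMER}            # primer length in nucleotides
TRIM_LEFT_R=${#REV_PRIMER}

# DADA2 truncates at the original read position, then drops the leading
# bases, so the retained length is trunc-len minus trim-left.
KEPT_F=$((TRUNC_F - TRIM_LEFT_F))
KEPT_R=$((TRUNC_R - TRIM_LEFT_R))

echo "trim-left-f=$TRIM_LEFT_F kept-f=$KEPT_F"
echo "trim-left-r=$TRIM_LEFT_R kept-r=$KEPT_R"
```

The retained lengths matter because the forward and reverse reads still need enough overlap to merge after the primers are removed.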

I really don't know why cutadapt sometimes gives worse non-chimeric read counts. I suspect that some primers in the forward and reverse reads do not match the adapter sequences given in --p-front-f and --p-front-r. I will evaluate this again to see whether I should report it to the QIIME 2 developers.

As most samples have at least 40% non-chimeric reads, can I move on to the next steps?

Thank you.

Hello Jason. It's me again.

That's great to hear! It looks like removing this region at the start of the read works better with dada2 than cutadapt.
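
One thing that may be worth double-checking first: as quoted in this thread, both DADA2 commands read dental-test-paired-end-demux.qza, not the dental-test-demux-trimmed.qza artifact written by cutadapt. If that is what actually ran, the first DADA2 run saw untrimmed primers, and unremoved primers with ambiguous bases are a commonly reported cause of high chimera rates in DADA2. The sketch below shows what a trimmed-input run could look like. The `run` and `trim_then_denoise` helpers and the `DRY_RUN`, `TRUNC_F`, and `TRUNC_R` variables are hypothetical; `--p-discard-untrimmed` is added as an assumption so that reads without a matching primer are dropped instead of passed through.

```shell
# Sketch only: print each command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

trim_then_denoise() {
  threads="${SLURM_CPUS_PER_TASK:-1}"

  # Step 1: remove anchored primers; discard pairs where no primer was found.
  run qiime cutadapt trim-paired \
    --i-demultiplexed-sequences dental-test-paired-end-demux.qza \
    --p-front-f '^CCTACGGGNGGCWGCAG' \
    --p-front-r '^GACTACHVGGGTATCTAATCC' \
    --p-match-adapter-wildcards \
    --p-match-read-wildcards \
    --p-discard-untrimmed \
    --p-cores "$threads" \
    --o-trimmed-sequences dental-test-demux-trimmed.qza || return 1

  # Step 2: denoise the TRIMMED artifact. Trimmed reads are shorter by the
  # primer length, so the truncation lengths may need to be lowered; reads
  # shorter than the truncation length are discarded by DADA2.
  run qiime dada2 denoise-paired \
    --i-demultiplexed-seqs dental-test-demux-trimmed.qza \
    --p-trunc-len-f "${TRUNC_F:-250}" \
    --p-trunc-len-r "${TRUNC_R:-230}" \
    --p-n-threads "$threads" \
    --o-representative-sequences dental-test-asv-sequences.qza \
    --o-table dental-test-feature-table-0.qza \
    --o-denoising-stats dental-test-dada2-stats.qza \
    --verbose
}

# Preview the two commands without running them:
# DRY_RUN=1 trim_then_denoise
```

Comparing the stats from this run with the trim-left run would show whether cutadapt itself is underperforming or whether its output was simply not used.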

Very interesting.

Only if you are happy with the results.
Are you happy with the results you have seen from the DADA2 trimming?

Did you find any positive controls?