False positive in my samples

Dear community,
I have a question about the DADA2 denoising that I performed.
I am working with an ITS1 dataset with more than 400 samples.

Looking at my table.qzv, I noticed that I have a lot of ASVs, but most of them have very low values in the "# of Samples Observed In" column.
Of 9000 ASVs, 99% are present in just 1 sample.
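
The same figure can be cross-checked outside the visualization. The following is a minimal sketch, assuming the feature table has been exported to a tab-separated file with features as rows and samples as columns; the function name and the toy table are made up for illustration, and the export commands in the comments were not part of the original post.

```shell
#!/bin/sh
# Assumption: a real table would first be exported to TSV, for example with
#   qiime tools export --input-path dada2-table.qza --output-path exported
#   biom convert -i exported/feature-table.biom -o table.tsv --to-tsv
# Here a toy table stands in so the sketch is self-contained.

# Count features (rows) that have a non-zero count in exactly one sample.
count_single_sample_features() {
  awk -F'\t' '
    /^#/ { next }                      # skip comment and header lines
    {
      n = 0
      for (i = 2; i <= NF; i++) if ($i + 0 > 0) n++
      total++
      if (n == 1) single++
    }
    END { printf "%d of %d features occur in exactly one sample\n", single, total }
  ' "$1"
}

toy=$(mktemp)
printf '#OTU ID\tS1\tS2\tS3\nasv1\t10\t0\t0\nasv2\t5\t7\t0\nasv3\t0\t0\t3\n' > "$toy"

count_single_sample_features "$toy"
# prints: 2 of 3 features occur in exactly one sample
```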

Here is the protocol that I have used

qiime demux summarize \
  --i-data reads.qza \
  --o-visualization reads.qzv

TRIM CUTADAPT

qiime cutadapt trim-paired \
  --i-demultiplexed-sequences reads.qza \
  --p-adapter-f AYTTAAGCATATCAATAAGCGGAGGCT \
  --p-front-f AACTTTYRRCAAYGGATCWCT \
  --p-adapter-r AGWGATCCRTTGYYRAAAGTT \
  --p-front-r AGCCTCCGCTTATTGATATGCTTAART \
  --o-trimmed-sequences reads-trimmed.qza

DEMUX SUMMARIZE

qiime demux summarize \
  --i-data reads-trimmed.qza \
  --o-visualization reads-trimmed.qzv

qiime dada2 denoise-paired \
  --i-demultiplexed-seqs reads-trimmed.qza \
  --p-trunc-len-f 190 \
  --p-trunc-len-r 140 \
  --o-representative-sequences dada2-rep-seqs.qza \
  --o-table dada2-table.qza \
  --o-denoising-stats dada2-stats.qza

I suspect that the denoising produced a lot of false positives here, because I was expecting more ASVs to be present in multiple samples.

Hi @Edoardo_Scali,
In this piece of your code:

I found that your adapters appear to match those found in this tutorial. This could be the problem! Are you using the adapters from your sequencer or from this tutorial? You will want to use the ones that are relevant to your study. I am happy to continue to help you work through this!
-Hannah

Hi @jphagen,

The problem is that I do not have the adapter sequences.
I did not performed
The only information I have about the adapters comes from the FastQC report, which says that these are "Universal Illumina Adapters".

Is there any way to get the adapter sequence from my fastq data?

I have the primers that have been used. Can I use this information to get the adapter sequence?
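
For reference, in the cutadapt command earlier in this thread the values passed to --p-adapter-f and --p-adapter-r are exactly the reverse complements of the primers passed to --p-front-r and --p-front-f, so those values can be derived from the primers alone. Below is a minimal shell sketch of that derivation, assuming `rev` and `tr` are available; the function name `revcomp` is made up for illustration.

```shell
#!/bin/sh
# Reverse-complement a primer, including IUPAC ambiguity codes.
# Complement pairs: A<->T, C<->G, R<->Y, K<->M, B<->V, D<->H; S, W and N map to themselves.
revcomp() {
  printf '%s\n' "$1" | rev | tr 'ACGTRYKMBDHVSWN' 'TGCAYRMKVHDBSWN'
}

# Reverse primer from the thread (--p-front-r)
revcomp AGCCTCCGCTTATTGATATGCTTAART
# prints: AYTTAAGCATATCAATAAGCGGAGGCT   (the value used for --p-adapter-f)

# Forward primer from the thread (--p-front-f)
revcomp AACTTTYRRCAAYGGATCWCT
# prints: AGWGATCCRTTGYYRAAAGTT         (the value used for --p-adapter-r)
```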

Thank you so much.

I was able to fix the problem by following the two-step instructions that I found in this discussion

After proper removal of the primer sequences, I performed DADA2 denoising and got satisfactory coverage in the "# of Samples Observed In" column.

Thank you @jphagen for pointing out that the primer removal I had performed was causing the error!
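
One possible way to check whether primers are still present after trimming is to count the reads that begin with the primer. This is only a rough sketch, assuming an uncompressed FASTQ file with four lines per record; the function names and the toy reads are made up for illustration, and the primer is the --p-front-f value from the first post.

```shell
#!/bin/sh
# Turn an IUPAC primer into an extended regular expression.
# Only ambiguity codes are rewritten; the replacements contain just A, C, G, T,
# so later substitutions cannot touch earlier ones.
iupac_to_regex() {
  printf '%s\n' "$1" | sed \
    -e 's/R/[AG]/g' -e 's/Y/[CT]/g' -e 's/W/[AT]/g' -e 's/S/[CG]/g' \
    -e 's/K/[GT]/g' -e 's/M/[AC]/g' -e 's/B/[CGT]/g' -e 's/D/[AGT]/g' \
    -e 's/H/[ACT]/g' -e 's/V/[ACG]/g' -e 's/N/[ACGT]/g'
}

# Count reads in a FASTQ file (second argument) that start with the primer (first argument).
count_reads_starting_with() {
  awk 'NR % 4 == 2' "$2" | grep -Ec "^$(iupac_to_regex "$1")" || true
}

# Toy FASTQ: the first read still carries the forward primer, the second does not.
fq=$(mktemp)
cat > "$fq" <<'EOF'
@read1
AACTTTCAACAACGGATCTCTACGT
+
IIIIIIIIIIIIIIIIIIIIIIIII
@read2
GGGGACGTACGT
+
IIIIIIIIIIII
EOF

count_reads_starting_with AACTTTYRRCAAYGGATCWCT "$fq"
# prints: 1
```

A count near zero on the trimmed reads would suggest the primers were removed; a count close to the total number of reads would suggest they were not.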