Archaea 16S amplicon sequencing analysis

Desert · August 7, 2022, 11:14pm

I am analyzing archaeal 16S amplicon sequencing data from sand samples. I used the following primers in the PCR:
F: TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGTTCCGGTTGATCCYGCCGGA
R: GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGWATTACCGCGGCKGCTG
I have reached the step below, but I don't know the size of my PCR product, so I don't know how much I have to trim the samples. I also realize the primers are quite long.

# Denoise the paired-end reads with DADA2 (one command, continued with backslashes)
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs trimmed-paired-end.qza \
  --o-table table-dada2.qza \
  --o-representative-sequences rep-seqs-dada2.qza \
  --p-trunc-len-f 260 \
  --p-trunc-len-r 175 \
  --p-n-threads 6 \
  --verbose \
  --o-denoising-stats table-dada2-stats.qza

any help?

gregcaporaso · August 7, 2022, 11:23pm

Hi @Desert, Welcome to the QIIME 2 Forum!

This step doesn't use the size of your PCR product; it uses the quality information of the sequences. You're trying to remove bases at positions where the sequencing quality was low. The best way to assess that in QIIME 2 is with the demux summarize visualization, which you would generate from your trimmed-paired-end.qza artifact.
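As a rough sketch, the summary could be generated along these lines. The wrapper function name and the output file name are placeholders chosen for illustration; only the input artifact comes from the post above.

```shell
# Sketch: build the interactive quality plots for a demultiplexed artifact.
# summarize_demux is an illustrative wrapper, not a QIIME 2 command.
summarize_demux() {
  # $1 = demultiplexed sequences (.qza), $2 = visualization to write (.qzv)
  qiime demux summarize \
    --i-data "$1" \
    --o-visualization "$2"
}
```

Calling `summarize_demux trimmed-paired-end.qza trimmed-paired-end.qzv` and opening the result with `qiime tools view` should show the forward and reverse quality plots used to pick the trim and trunc values.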

This is discussed in the Moving Pictures tutorial. That tutorial covers single-end data, but the process is the same for paired-end reads: you just need to review the forward and reverse read plots to define the trim and trunc parameters for the forward and the reverse reads. You can also find some discussion of these parameters in the DADA2 tutorial (see "Inspect read quality profiles").

If you do need to trim PCR primers or anything else, see the QIIME 2 cutadapt plugin.
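A minimal sketch of primer removal with the cutadapt plugin might look like the following. It assumes that the reads begin with the locus-specific part of each primer, that is, the part after the leading overhang adapter ending in AGACAG; that is an assumption to check against a few raw reads, not something stated in this thread. The wrapper name `trim_primers` is made up for illustration.

```shell
# Primers exactly as given in the original post (overhang adapter + locus-specific part).
FULL_F=TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGTTCCGGTTGATCCYGCCGGA
FULL_R=GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGWATTACCGCGGCKGCTG

# ASSUMPTION: the reads start at the locus-specific part, so the leading
# overhang adapter is stripped from each primer here. Verify on your own reads.
PRIMER_F=${FULL_F#TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG}
PRIMER_R=${FULL_R#GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG}

# trim_primers is an illustrative wrapper, not a QIIME 2 command.
trim_primers() {
  # $1 = demultiplexed paired-end sequences (.qza), $2 = trimmed output (.qza)
  qiime cutadapt trim-paired \
    --i-demultiplexed-sequences "$1" \
    --p-front-f "$PRIMER_F" \
    --p-front-r "$PRIMER_R" \
    --o-trimmed-sequences "$2"
}
```

With these definitions, `PRIMER_F` is 20 bases long and `PRIMER_R` is 18 bases long. The function would be called with your own input and output artifact names.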

Desert · August 8, 2022, 7:03am

Thank you. If I have long primers, around 40 bases each, can I still do the trimming? Or will it affect the overlap between the F and R reads?

gregcaporaso · August 10, 2022, 1:54am

@Desert, it shouldn't impact the overlapping of forward and reverse reads, since the trimming happens on the 5' end of the sequences, and the overlapping of forward and reverse reads takes place on the 3' end of the reads.
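The arithmetic behind this can be sketched as below. The amplicon length of 400 is a made-up number used only for illustration, since the real product size isn't known in this thread, and the `overlap` helper is hypothetical. The sketch assumes that 5' trimming is done with DADA2's trim-left parameters, where truncation positions are counted on the untrimmed read.

```shell
# overlap TRUNC_F TRUNC_R TRIM_F TRIM_R AMPLICON_LEN
# All lengths are in bases. AMPLICON_LEN is the full amplicon as sequenced.
overlap() {
  kept_f=$(( $1 - $3 ))        # forward read left after 5' trimming
  kept_r=$(( $2 - $4 ))        # reverse read left after 5' trimming
  target=$(( $5 - $3 - $4 ))   # amplicon left once both 5' ends are trimmed
  echo $(( kept_f + kept_r - target ))
}

# 400 is a hypothetical amplicon length, not a measured value.
overlap 260 175 0 0 400     # no 5' trimming
overlap 260 175 20 18 400   # trimming 20 and 18 bases from the 5' ends
```

Both calls print 35: the bases removed from the 5' ends cancel out, so the overlap depends only on the truncation lengths and the amplicon length. If primers are removed beforehand with cutadapt instead, the truncation lengths are counted on the already shortened reads, so they may need to be revisited.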

I recommend reviewing the logs generated by denoise-paired to assess how many reads you're losing due to reads not being joined. I would also denoise with both denoise-single and denoise-paired, and compare the summarized feature tables (the result of running qiime feature-table summarize) to see if there is a big difference in the distribution of reads associated with each sample.
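One way this comparison could be set up is sketched below. The wrapper function names, the output prefixes, and the choice to tabulate the denoising stats with `qiime metadata tabulate` are assumptions made for illustration; the truncation length should come from your own quality plots.

```shell
# Illustrative wrappers; function names and output file names are placeholders.

# Denoise using the forward reads only.
# $1 = demultiplexed artifact, $2 = forward truncation length, $3 = output prefix
denoise_forward_only() {
  qiime dada2 denoise-single \
    --i-demultiplexed-seqs "$1" \
    --p-trunc-len "$2" \
    --o-table "${3}-table.qza" \
    --o-representative-sequences "${3}-rep-seqs.qza" \
    --o-denoising-stats "${3}-stats.qza"
}

# Turn one run's stats and feature table into viewable summaries.
# $1 = feature table (.qza), $2 = denoising stats (.qza), $3 = output prefix
summarize_run() {
  qiime metadata tabulate \
    --m-input-file "$2" \
    --o-visualization "${3}-stats.qzv" &&
  qiime feature-table summarize \
    --i-table "$1" \
    --o-visualization "${3}-table.qzv"
}
```

For example, `denoise_forward_only trimmed-paired-end.qza 260 single` followed by `summarize_run single-table.qza single-stats.qza single`, and `summarize_run table-dada2.qza table-dada2-stats.qza paired` for the paired run, would give two sets of summaries to compare side by side. In the paired stats, the merged counts are the place to look for reads lost at the joining step.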