Import - trim and truncate - overlapping

Greetings, everyone! I'm genuinely excited to be in the presence of experts.

I'm new to QIIME 2 and have read through the forum posts to anticipate potential problems. I've successfully imported my data and am now denoising it with DADA2. Any assistance you could provide would be greatly appreciated.

I have 29 samples, organized into three separate folders by sampling location (one of the folders contains just one sample). The samples target the V4 region, amplified with the 515F and 806R primers, and are expected to yield amplicons of approximately 300–350 bp, according to the website 16S Illumina Amplicon Protocol : earthmicrobiome, which I used for the sequencing information.

I have a few questions:

First, can I incorporate all the samples into a single folder for import, considering that we have a naming system that specifies the sampling location for each sample?
Secondly, if I import all the samples together, should I denoise them collectively using DADA2? The samples are already pre-demultiplexed.
Thirdly, is the calculation for overlapping length performed before or after trim and truncation?

I've also attached the forward and reverse Interactive Quality Plots (IQP) for the three folders.



Hello @Beh_Yaad,

First, can I incorporate all the samples into a single folder for import, considering that we have a naming system that specifies the sampling location for each sample?

So long as no two files have the same name, yes you can. You can keep track of the differences in your metadata.
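
Before merging the folders, a quick check for clashing file names can help. The sketch below is only an illustration: the function name `find_duplicate_names` and the folder names in the comment are hypothetical, and it assumes the reads are stored as `*.fastq.gz` files.

```shell
# Hypothetical helper: print every FASTQ file name that occurs in more than
# one of the given folders. No output means the folders can be merged safely.
find_duplicate_names() {
  # "$@" is the list of folders to scan.
  find "$@" -type f -name '*.fastq.gz' \
    | sed 's|.*/||' \
    | sort \
    | uniq -d
}

# Example call (folder names are placeholders for the three sampling locations):
#   find_duplicate_names site_A site_B site_C
```

If the function prints any names, rename those files (for example by adding the sampling location to the name) before copying everything into one folder.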

Secondly, if I import all the samples together, should I denoise them collectively using DADA2? The samples are already pre-demultiplexed.

Yes, if they were sequenced in the same run.

Thirdly, is the calculation for overlapping length performed before or after trim and truncation?

The forward and reverse reads need to overlap after trimming and truncation. DADA2 defaults to requiring 12 bp of overlap.


Thank you for your clear response.
Regarding the third question, is there a way to detect if there is at least 12 bp of overlap before trimming and truncation?

Hello @Beh_Yaad,

There is no tool that I know of, but you can do some approximate math if you know the length of your primers and where you plan to truncate each read.


I've generated multiple visualization outputs with DADA2, detailed below, each with its corresponding file extension.

rep-seqs-filename.qzv

stats-dada2_filename.qzv

Now, how can I calculate the overlapping length?

Hello @Beh_Yaad,

expected overlap = read 1 length + read 2 length - amplicon size, with both read lengths taken after truncation
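
As a rough sketch of that arithmetic in the shell: the truncation lengths of 200 and 180 below are made-up values for illustration, 350 bp is the upper amplicon estimate mentioned earlier in this thread (the longest amplicon gives the smallest overlap), and 12 bp is the DADA2 default noted above. The function name `expected_overlap` is hypothetical.

```shell
# Expected overlap in bp: forward length + reverse length - amplicon length.
# Arguments: $1 = forward read length after truncation,
#            $2 = reverse read length after truncation,
#            $3 = amplicon length.
expected_overlap() {
  echo $(( $1 + $2 - $3 ))
}

# Hypothetical truncation lengths; 350 bp is the longest expected amplicon.
overlap=$(expected_overlap 200 180 350)

if [ "$overlap" -ge 12 ]; then
  echo "overlap ${overlap} bp: enough for merging"
else
  echo "overlap ${overlap} bp: too short, truncate less"
fi
```

A negative result means the truncated reads would not reach each other at all, so the truncation positions would have to move further out.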

I mean, is there a tool or command that displays the Dada2 output files including "rep-seqs-filename.qzv" and "stats-dada2_filename.qzv" as an Interactive Quality Plot?
like the following command:
qiime feature-table summarize \
  --i-table table.qza \
  --o-visualization table.qzv

According to you, the overlap length can also be calculated before denoising by Dada2, and basically, Dada2 is for eliminating poor-quality reads, and the produced files of DADA2 are to create a table to continue the analysis; right?

Hello @Beh_Yaad,

I mean, is there a tool or command that displays the Dada2 output files including "rep-seqs-filename.qzv" and "stats-dada2_filename.qzv" as an Interactive Quality Plot?

Sequences output by DADA2 no longer have quality scores attached to them, so this isn't possible.

According to you, the overlap length can also be calculated before denoising by Dada2, and basically, Dada2 is for eliminating poor-quality reads, and the produced files of DADA2 are to create a table to continue the analysis; right?

Yes, but the overlap can only be estimated because the amplicon length varies. DADA2 does take quality information into account, among other things. And yes, the feature table output by DADA2 is used in downstream analysis.
