Merged read lengths on DADA2

I would like to ask about the lengths of merged reads generated from DADA2.
I am using QIIME2 version 2023.9 (Amplicon Distribution) for the analysis of the V1-V2 region of the 16S rRNA gene.
After exporting ASV representative sequences to FASTA files, I noticed that some reads are shorter (255 bases) or longer (400 bases) than the expected amplicon size.
I am concerned that the difference in read lengths may affect the further analysis, as some reads only capture a partial region of V1-V2.
Is there any way to control the length of the merged reads?

Thank you very much.

Hello @microbiome_25,

I found this related post. Let us know if this doesn't solve the problem.

Thanks,
Colin

Hello @colinvwood,
Thank you very much for the information.
It solved the problem.

I would like to ask you one more question.
Is it common to set a threshold for the length of ASVs before further analysis, or is it unusual to discard generated ASVs based on read lengths? I am uncertain how variations in read lengths impact downstream analyses in QIIME2.

Thank you very much.

Hello @microbiome_25,

I would say it depends on the threshold. If the threshold is low enough to keep ASVs that are shorter because of biological variation, but high enough to filter out sequencing or PCR artifacts, then filtering by length is reasonable.
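
For illustration, here is a minimal Python sketch of that kind of length filter, applied to representative sequences that have already been exported to FASTA. The function names (`read_fasta`, `length_counts`, `filter_by_length`) and the bounds in the demo are hypothetical placeholders, not recommended values. Looking at the length distribution first is one way to judge where a sensible cutoff might lie for your region.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (id, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def length_counts(records: Dict[str, str]) -> Counter:
    """Count how many sequences have each length."""
    return Counter(len(seq) for seq in records.values())


def filter_by_length(records: Dict[str, str], min_len: int, max_len: int) -> Dict[str, str]:
    """Keep only sequences whose length lies within [min_len, max_len]."""
    return {sid: seq for sid, seq in records.items() if min_len <= len(seq) <= max_len}


if __name__ == "__main__":
    # Toy records standing in for exported ASV sequences (made-up data).
    toy = [">asv1", "ACGTACGTAC", ">asv2", "ACG", ">asv3", "ACGTACGTACGTACGTACGT"]
    records = dict(read_fasta(toy))
    print(sorted(length_counts(records).items()))
    # Placeholder bounds for the toy data only; choose real ones per region.
    kept = filter_by_length(records, min_len=5, max_len=15)
    print(list(kept))
```

If you drop ASVs this way, remember to filter the feature table to the same set of IDs so the two stay consistent.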

Hello @colinvwood,

Thank you very much for your answer.
Is there a reference value for the threshold to discard sequencing or PCR artifacts?
Thank you.

Hello @microbiome_25,

That's a good question. Any such value is going to depend on the region being amplified, and I'm not aware of a reference value. It's worth a Google search, though.