Dear colleagues,
Thank you so much for the fantastic QIIME 2 pipeline; it brought me into the amplicon world.
My name is Yilang Wang, and I am currently a postdoctoral fellow at IUE, CAS, China.
I'm now working on some environmental bacterial amplicon data amplified by different primer sets (V3-V4, V4, V4-V5).
Previously, I processed each dataset separately with the following pipeline: "primer removal" >> "PE read joining" >> "quality-filter q-score" >> "denoising via Deblur". After denoising, I merged the rep-seqs and feature tables, and ended up with about 55,000 ASVs and about 5,000 unique taxonomic assignments against the SILVA database.
At this point, I realized that some ASVs must derive from DNA templates of the same species/organism even though they were amplified with different primer sets. Merging the feature tables cannot actually merge ASVs from the same species/organism, and may even introduce spurious alpha diversity and artificial beta diversity at the ASV level, quite apart from the effects of different primer sets, sampling methods, library construction, sequencing method/depth, etc.
Therefore, I am unsure whether I can proceed with further analysis on the merged feature table at the ASV level, or whether I can only proceed with a table at a taxonomic level (a merged feature table collapsed to species level under the same SILVA database assignment).
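To make the taxonomy-level option concrete, here is a minimal Python sketch of what I mean by collapsing a merged table to species level. The ASV IDs, sample names, counts, and taxonomy strings are all made up for illustration, and the function name `collapse_by_taxonomy` is my own; in QIIME 2 itself I would expect to use the `qiime taxa collapse` action rather than code like this.

```python
from collections import defaultdict


def collapse_by_taxonomy(table, taxonomy):
    """Sum per-sample counts of all ASVs that share a taxonomy label.

    table:    {asv_id: {sample_id: count}}
    taxonomy: {asv_id: taxonomy_label}
    """
    collapsed = defaultdict(lambda: defaultdict(int))
    for asv_id, sample_counts in table.items():
        # ASVs without an assignment are grouped under "Unassigned".
        taxon = taxonomy.get(asv_id, "Unassigned")
        for sample, count in sample_counts.items():
            collapsed[taxon][sample] += count
    return {taxon: dict(samples) for taxon, samples in collapsed.items()}


# Hypothetical toy data: two ASVs from different primer sets that
# received the same species-level assignment, plus one other ASV.
table = {
    "asv_v34_a": {"S1": 10, "S2": 0},
    "asv_v4_a": {"S1": 0, "S2": 7},
    "asv_v4_b": {"S1": 3, "S2": 2},
}
taxonomy = {
    "asv_v34_a": "g__GenusA; s__species_1",
    "asv_v4_a": "g__GenusA; s__species_1",
    "asv_v4_b": "g__GenusB; s__species_2",
}

collapsed = collapse_by_taxonomy(table, taxonomy)
for taxon, counts in collapsed.items():
    print(taxon, counts)
```

In this toy case the two ASVs from different primer sets end up in one row, which is the behaviour I cannot get at the ASV level.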
I also wonder about two hypothetical feature tables that were amplified with the same primer set, such as V4 (515F/806R), but denoised with different Deblur parameters, for example one table with --p-trim-length 130 and --p-left-trim-len 0 and the other with --p-trim-length 120 and --p-left-trim-len 10: would merging them give truly merged ASVs from the same species/organisms?
The QIIME 2 website (the "Fecal microbiota transplant (FMT) study: an exercise" tutorial in the QIIME 2 2022.2.0 documentation) states that "... denoise-single are directly comparable (in this case, the feature id is the md5 hash of the sequence defining the feature)."
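As I understand that statement, two tables can only share a feature when the defining sequences are identical character for character. The following Python sketch illustrates this with the standard `hashlib` module. The toy sequence and the two slicing windows are hypothetical and are not meant to reproduce exactly how Deblur applies its trim parameters; the helper name `feature_id` is my own.

```python
import hashlib


def feature_id(sequence):
    """Return the md5 hex digest of a sequence, used here as a feature ID."""
    return hashlib.md5(sequence.encode("utf-8")).hexdigest()


# Hypothetical toy read (not a real 16S sequence).
read = "TACGTAGGGTGCGAGCGTTAATCGGAATTACTGGGCGTAA"

# Two hypothetical trimming windows applied to the same read.
window_a = read[0:30]   # keep positions 0..29
window_b = read[5:30]   # drop the first 5 nt, keep positions 5..29

id_a = feature_id(window_a)
id_b = feature_id(window_b)

print(id_a)
print(id_b)
print("same feature ID:", id_a == id_b)
```

Because the two windows yield different strings, they hash to different IDs, so a merge keyed on feature ID would keep them as separate features even though they come from the same read.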
After some web searching, I found that benjjneb suggested running the normal DADA2 denoising pipeline, then trimming the ASV sequences to the region shared by the primer sets, and only then merging the feature tables (Comparing data from two Illumina chemistries (16S amplicon sequencing) · Issue #509 · benjjneb/dada2 · GitHub). joey711 raised a warning about this kind of merging (merge_phyloseq of two different phyloseq objects (non matching OTU labels) · Issue #508 · joey711/phyloseq · GitHub).
I am now switching to using cutadapt to extract the shared amplified region (V4, 515F/806R, for the V3-V4, V4, and V4-V5 data) before Deblur denoising. However, I have run into another problem: about half of the q-score artifacts appear to have a trim length of 0-246/253/254 nt, while the others appear to have a longer (>254 nt) trim length. I don't know whether the cutadapt step yields the correct shared V4 region, and I am also concerned about whether subsequent denoising and merging will give truly merged ASVs.
Files with the 0-246/253/254 nt trim length:
fdp.02s.qza.qzv (301.4 KB)
fdp.05pjs.qza.qzv (309.1 KB)
fdp.31pjs.qza.qzv (310.3 KB)
Files with the longer (>254 nt) trim length:
fdp.13pjs.qza.qzv (308.6 KB)
fdp.16pjs.qza.qzv (307.6 KB)
fdp.23s.qza.qzv (302.3 KB)
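To clarify what I expect the cutadapt step to return, here is a small Python sketch that extracts the sequence between a forward primer and the reverse complement of a reverse primer. It is only an illustration under several assumptions: the primer sequences are the commonly published 515F/806R variants (GTGYCAGCMGCCGCGGTAA and GGACTACNVGGGTWTCTAAT), which may differ from the ones in my data; the read is a made-up toy sequence; and the matching is exact on IUPAC codes, whereas cutadapt tolerates mismatches, so this is not cutadapt's algorithm.

```python
import re

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

# Assumed primer sequences (see the note above).
FWD_515F = "GTGYCAGCMGCCGCGGTAA"
REV_806R = "GGACTACNVGGGTWTCTAAT"


def reverse_complement(seq):
    """Reverse-complement a sequence, including IUPAC ambiguity codes."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_regex(primer):
    """Turn a degenerate primer into a regex fragment."""
    return "".join(IUPAC[base] for base in primer)


def extract_between(read, forward, reverse):
    """Return the region between the primers, or None if either is missing."""
    pattern = re.compile(
        primer_regex(forward)
        + "([ACGT]*?)"
        + primer_regex(reverse_complement(reverse))
    )
    match = pattern.search(read)
    return match.group(1) if match else None


# Hypothetical toy read: flank + 515F site + insert + 806R site (rc) + flank.
insert = "TACGTAGGGTGCGAGCGTTAATCGG"
read = ("CCTACGGG" + "GTGCCAGCAGCCGCGGTAA" + insert
        + "ATTAGATACCCTGGTAGTCC" + "AAACGG")

region = extract_between(read, FWD_515F, REV_806R)
print(region, len(region))

# A read that stops before the reverse primer site gives no match.
truncated = "CCTACGGG" + "GTGCCAGCAGCCGCGGTAA" + insert
print(extract_between(truncated, FWD_515F, REV_806R))
```

If both primer sites are present, I would expect every dataset to yield the same kind of region, so the two length groups above make me suspect that one of the primers is not being found in some of my reads.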
(I chose Deblur because it does not require pooling the sequencing data, whereas DADA2 does.)
I hope to receive some advice or comments from you.
Thank you