Hello! I am using a web-tool pipeline for clustering, alignment, and taxonomic assignment because of the computational limitations of my device. However, I noticed that it does not remove chimeras from my reads (I'm using the de novo ABG OTU clustering algorithm). Since the pipeline run gives me the entire alignment and clustering output, including the taxonomy.qza artifact, is it OK to filter the chimeric sequences later, or must I filter chimeras before clustering? My computer fails to run the clustering processes because I only have limited RAM, so I was thinking of running them in the web-tool pipeline and performing chimera removal afterward.
Basically this is my process:
- Make contigs from paired reads
- Dereplicate
- Alignment and OTU Clustering
- Chimera removal
- Filtering table and rep-seqs (rare OTUs, unwanted taxa, etc.)
- Alpha rarefaction
- mafft-fasttree tree construction
- other downstream analysis (diversity metrics, etc.)
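For steps 4 and 5, this is roughly what I'm planning to run locally with the artifacts returned by the web tool (file names like `table.qza` and `rep-seqs.qza` are placeholders; please correct me if the order is wrong):

```shell
# Identify chimeras de novo from the already-clustered OTUs
qiime vsearch uchime-denovo \
  --i-table table.qza \
  --i-sequences rep-seqs.qza \
  --o-chimeras chimeras.qza \
  --o-nonchimeras nonchimeras.qza \
  --o-stats uchime-stats.qza

# Keep only the non-chimeric features in the table and rep-seqs
qiime feature-table filter-features \
  --i-table table.qza \
  --m-metadata-file nonchimeras.qza \
  --o-filtered-table table-nonchimeric.qza

qiime feature-table filter-seqs \
  --i-data rep-seqs.qza \
  --m-metadata-file nonchimeras.qza \
  --o-filtered-data rep-seqs-nonchimeric.qza
```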
I perform steps 1 to 3 using the online web tool and receive the frequency table, sequences, and taxonomy artifacts. Then, with those artifacts, I'm planning to do steps 4 to 8 locally.
I actually tried doing this, and my frequency table shows a considerable drop in the number of features in the table.qzv visualization (which I assume are OTUs). But I am not sure whether it's technically correct or logical, because most of the posts here run chimera removal before clustering.
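For reference, this is how I compared the feature counts before and after chimera removal (file names are again placeholders; the second table is the chimera-filtered one):

```shell
# Summarize both tables to compare the number of features
qiime feature-table summarize \
  --i-table table.qza \
  --o-visualization table.qzv

qiime feature-table summarize \
  --i-table table-nonchimeric.qza \
  --o-visualization table-nonchimeric.qzv
```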
Also, according to the tutorial overview, `q2-vsearch`:

> implements three different OTU clustering strategies: de novo, closed reference, and open reference. All should be preceded by basic quality-score-based filtering and followed by chimera filtering and aggressive OTU filtering (the treacherous trio, a.k.a. the Bokulich method)
Thank you. I hope you are all well.
P.S. The web pipeline I use does not support dada2 denoising, so I'm stuck with OTUs rather than ASVs.