I've got a weird conundrum, and I'm hoping for some hive mind wisdom.
I've got a data set that is 16S V3-V4 Illumina paired-end data, and I'm running it in qiime2-2022.2. Because of several project-related constraints, I'm sort of stuck with this version.
My reads are part of a meta-analysis, so I've been clustering them closed-reference. My current pipeline for the data is:
- Trim primers using cutadapt; keep untrimmed reads since they were trimmed before processing
- Join paired ends using q2-vsearch
- Quality filter with q2-quality-filter using default parameters
- Denoise using deblur denoise-16S, trimming roughly the first 15 nt and truncating to a reasonable length for the ASVs
- Apply a full-length Silva 138.1 feature classifier to the ASVs and check the taxonomy using classify-sklearn
- Cluster the data closed-reference at 99% against the same Silva 138.1 reference sequences I used to build the classifier, using q2-vsearch
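For concreteness, here is roughly how those steps map onto QIIME 2 CLI calls. This is a sketch rather than the exact commands from the run: the file names, the primer arguments, and the trim length of 400 are all placeholders, and `build_pipeline` is just a hypothetical helper that assembles the argument lists without executing anything.

```python
import shlex


def build_pipeline(fwd_primer, rev_primer, trim_length, left_trim=15,
                   perc_identity=0.99):
    """Return one argv list per pipeline step; nothing is executed here."""
    return [
        # 1. Trim primers, keeping reads in which no primer was found.
        ["qiime", "cutadapt", "trim-paired",
         "--i-demultiplexed-sequences", "demux.qza",
         "--p-front-f", fwd_primer,
         "--p-front-r", rev_primer,
         "--p-no-discard-untrimmed",
         "--o-trimmed-sequences", "trimmed.qza"],
        # 2. Join paired ends (the action is called join-pairs in 2022.2).
        ["qiime", "vsearch", "join-pairs",
         "--i-demultiplexed-seqs", "trimmed.qza",
         "--o-joined-sequences", "joined.qza"],
        # 3. Quality filter with default parameters.
        ["qiime", "quality-filter", "q-score",
         "--i-demux", "joined.qza",
         "--o-filtered-sequences", "filtered.qza",
         "--o-filter-stats", "filter-stats.qza"],
        # 4. Denoise, trimming the 5' end and truncating to a fixed length.
        ["qiime", "deblur", "denoise-16S",
         "--i-demultiplexed-seqs", "filtered.qza",
         "--p-left-trim-len", str(left_trim),
         "--p-trim-length", str(trim_length),
         "--o-representative-sequences", "rep-seqs.qza",
         "--o-table", "table.qza",
         "--o-stats", "deblur-stats.qza"],
        # 5. Assign taxonomy to the ASVs with the Silva 138.1 classifier.
        ["qiime", "feature-classifier", "classify-sklearn",
         "--i-classifier", "silva-138.1-classifier.qza",
         "--i-reads", "rep-seqs.qza",
         "--o-classification", "taxonomy.qza"],
        # 6. Closed-reference clustering against the same Silva sequences.
        ["qiime", "vsearch", "cluster-features-closed-reference",
         "--i-table", "table.qza",
         "--i-sequences", "rep-seqs.qza",
         "--i-reference-sequences", "silva-138.1-seqs.qza",
         "--p-perc-identity", str(perc_identity),
         "--o-clustered-table", "table-cr-99.qza",
         "--o-clustered-sequences", "rep-seqs-cr-99.qza",
         "--o-unmatched-sequences", "unmatched-cr-99.qza"],
    ]


# Placeholder arguments only: substitute the real primers and trim length.
for cmd in build_pipeline("FWD_PRIMER", "REV_PRIMER", trim_length=400):
    print(shlex.join(cmd))
```

Each printed line can be pasted into a shell where the qiime2-2022.2 environment is active, or the lists can be handed to `subprocess.run(cmd, check=True)` one at a time.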
When I look at the high-level ASV taxonomy, it looks reasonably good. The community composition reflects the expected environment, there's reasonable variation, and it passes the sniff test.
However, almost none of the representative sequences cluster against the reference database, and the ones I do get to cluster don't make sense (mostly Bacilli for a fecal community).
So far I've tried:
- Switching the primer trimming (no dice)
- Running both single-end and paired-end reads
- Changing the denoising trim length
- Relaxing my clustering identity
- Allowing mixed orientation reads
Thus far, nothing has worked.
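As a sanity check on orientation that doesn't depend on QIIME 2, something like the following could be run on a few exported representative sequences and a reference sequence from the expected taxon. It's a rough sketch: the function names and the k-mer size of 8 are assumptions, and shared k-mers are only a crude stand-in for a real alignment.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse-complement a plain A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq, k=8):
    """Set of all length-k substrings of seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_fraction(query, reference, k=8):
    """Fraction of the query's distinct k-mers that also occur in the reference."""
    query_kmers = kmers(query, k)
    if not query_kmers:
        return 0.0
    return len(query_kmers & kmers(reference, k)) / len(query_kmers)


def guess_orientation(query, reference, k=8):
    """Report which strand of the query shares more k-mers with the reference."""
    fwd = shared_kmer_fraction(query, reference, k)
    rev = shared_kmer_fraction(reverse_complement(query), reference, k)
    if fwd == rev:
        return "ambiguous"
    return "forward" if fwd > rev else "reverse"
```

If most ASVs come back as "reverse" or "ambiguous" against references they should match, that would point at an orientation or trimming problem rather than at the clustering identity.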
I'm hoping someone here might have some brilliant insight?