I’d like to try using q2-clawback to assemble taxonomic weights. In the tutorial you assemble weights from Qiita. But my Illumina sequence data is from the V3-V4 region, and I don’t see that in Qiita. What do you recommend I use to assemble weights? Or should I trim my data to just V4?
You can use summarize-Qiita-metadata-category-and-contexts or query Qiita directly using redbiom to see what “contexts” are available (this will detail the sequence domains available). It looks like there are several thousand V3-V5 samples present, and a smaller number of V3-V4. So you have a few options:
- Use the V3-V5 samples from Qiita (if they are appropriate sample types), and train a V3-V5 classifier.
- Use a custom collection of samples (e.g., from outside of Qiita) to assemble taxonomic weights.
- Trim your reads to V4, though I agree that is a very unappealing option.
We are working on some solutions to make this easier in the future, e.g., use V4 class weights for any other domain, but right now it’s complicated.
I assume the samples from Qiita have to have been processed the same way as my samples (i.e., GreenGenes 97% OTU vs Deblur), is that correct?
That sounds about right. The Qiita context info gives some of those details. You can also check those Qiita studies manually to see how the reads were processed (though theoretically that should be unnecessary; the context is all you need).
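
To make the point about contexts concrete: a context name usually encodes both the processing method and the amplified region, so you can shortlist candidates from a context listing (for example, the output of `redbiom summarize contexts`) before looking at any study by hand. Here is a minimal sketch of that filtering. The helper `contexts_for_region` is hypothetical, and the context names below only illustrate the naming pattern; they are assumptions, not a listing of what Qiita currently holds.

```python
import re

# Illustrative context names that follow the naming pattern redbiom reports.
# ASSUMPTION: these exact strings (especially the trailing IDs) are made up
# for the example; substitute the names from your own context listing.
EXAMPLE_CONTEXTS = [
    "Deblur-Illumina-16S-V4-150nt-780653",
    "Pick_closed-reference_OTUs-Greengenes-Illumina-16S-V4-150nt-bd7d4d",
    "Deblur-Illumina-16S-V34-150nt-000000",
    "Pick_closed-reference_OTUs-Greengenes-Illumina-16S-V35-90nt-000000",
]

# The region token sits right after the marker gene, e.g. "-16S-V34-".
REGION = re.compile(r"-16S-(V\d+)-")


def contexts_for_region(contexts, region, method_prefix=None):
    """Return context names whose region token equals `region`.

    If `method_prefix` is given (e.g. "Deblur"), also require the context
    name to start with it, so processing matches your own samples.
    """
    selected = []
    for name in contexts:
        match = REGION.search(name)
        if match is None or match.group(1) != region:
            continue
        if method_prefix is not None and not name.startswith(method_prefix):
            continue
        selected.append(name)
    return selected


if __name__ == "__main__":
    # Contexts covering V3-V5, regardless of processing method.
    print(contexts_for_region(EXAMPLE_CONTEXTS, "V35"))
    # Contexts covering V4 that were processed with Deblur.
    print(contexts_for_region(EXAMPLE_CONTEXTS, "V4", method_prefix="Deblur"))
```

Matching on the name is only a first pass; the context description is still the place to confirm the region and processing details.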