From the tutorial
--p-f-primer GTGCCAGCMGCCGCGGTAA \
--p-r-primer GGACTACHVGGGTWTCTAAT \
My primers:
FWD:GTGYCAGCMGCCGCGGTAA; REV:GGACTACNVGGGTWTCTAAT
If different primers were used to extract the amplicon region from Silva 119 than my 515F/806R primers, would you recommend that I extract the region again using the primer sequences I used for the V4 region? Or do you think it would not make much of a difference?
I will answer the first part of your question here, and then turn it over to @Nicholas_Bokulich for part two!
We can use provenance to learn about the parameters that were used to create that artifact! If you load up the Silva 119 515F/806R trained classifier at view.qiime2.org:
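As a rough offline alternative to the viewer, the same provenance records can be read straight from the artifact file. The sketch below assumes that a `.qza` file is an ordinary zip archive whose provenance is stored as `action.yaml` files under a `provenance/` directory. The function name `read_action_records` is made up for illustration and is not part of QIIME 2.

```python
import zipfile


def read_action_records(qza_path):
    """Map each provenance action.yaml path inside a .qza archive to its text.

    Assumption: the artifact is a zip archive laid out as
    <uuid>/provenance/action/action.yaml for the artifact itself and
    <uuid>/provenance/artifacts/<ancestor-uuid>/action/action.yaml for
    each upstream artifact.
    """
    records = {}
    with zipfile.ZipFile(qza_path) as archive:
        for name in archive.namelist():
            # Keep only the provenance action records, skip the data payload.
            if "/provenance/" in name and name.endswith("action/action.yaml"):
                records[name] = archive.read(name).decode("utf-8")
    return records
```

Calling `read_action_records("classifier.qza")` (a hypothetical local file name) and printing each value should show the recorded action and parameters for the classifier and for every artifact that went into it.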
Honestly, I don't expect it will make a big difference. It looks like the difference between your primers and those used for the pre-trained classifier is a single degenerate base near the 5' end of each primer. The extract-reads step is not that sensitive (e.g., looking for an exact match) that this will impact what reads are extracted.
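To make the primer comparison concrete, here is a small sketch that lists the positions where two primers differ and checks whether the degenerate code in one covers the base or bases allowed by the other. It uses the standard IUPAC nucleotide codes. The function name `primer_differences` is made up for illustration. This is not how extract-reads matches primers internally.

```python
# IUPAC nucleotide codes mapped to the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_differences(reference, query):
    """List (1-based position, reference code, query code, covers) tuples
    for every position at which two equal-length primers differ.

    `covers` is True when the query code allows every base that the
    reference code allows.
    """
    if len(reference) != len(query):
        raise ValueError("primers must have the same length")
    diffs = []
    for pos, (ref_code, qry_code) in enumerate(zip(reference, query), start=1):
        if ref_code != qry_code:
            covers = set(IUPAC[ref_code]) <= set(IUPAC[qry_code])
            diffs.append((pos, ref_code, qry_code, covers))
    return diffs


# Primer sequences quoted in this thread.
tutorial_fwd = "GTGCCAGCMGCCGCGGTAA"
my_fwd = "GTGYCAGCMGCCGCGGTAA"
tutorial_rev = "GGACTACHVGGGTWTCTAAT"
my_rev = "GGACTACNVGGGTWTCTAAT"

print(primer_differences(tutorial_fwd, my_fwd))  # [(4, 'C', 'Y', True)]
print(primer_differences(tutorial_rev, my_rev))  # [(8, 'H', 'N', True)]
```

In both cases the primers differ at one position, and the more degenerate code (Y or N) includes every base allowed by the tutorial primer at that position.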
If you do have sufficient memory to train your own classifier, I would suggest doing so just for peace of mind. But if you do not, or run into problems training your own classifier, then the pre-trained classifiers we provide should be fine.
@thermokarst, I was wondering if there was some way to see how the files were generated. Obviously I am still figuring out QIIME2 (trying to force myself to do this rather than defaulting to QIIME1.9). This is awesome. Thank you!
Hi @Nicholas_Bokulich, thanks for helping me think about this a bit more. Perhaps I'll give training the classifier myself a try, and otherwise I'll use the pre-trained classifier, hoping that what you suggest
The extract-reads step is not that sensitive (e.g., looking for an exact match) that this will impact what reads are extracted
For example, I am just trying to confirm which Silva reference_reads and reference_taxonomy files were used, so I can use the same ones (or the equivalent from newer Silva versions). Was it rep_set/99/Silva_119_rep_set99.fna (and the associated taxonomy)...?
Hi @ctekellogg, click on the boxes above your currently selected ones. They should correspond to the import steps for the input artifacts used to fit the classifier:
The md5 sum is shown over to the right (a86c94ce8d58ea9154fb88b05c123b02), along with the name of the imported file. You can compute the md5sum of the file in question and compare the hashes to verify whether it is the same file.
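One way to do that comparison, sketched with Python's standard `hashlib` module. The function names `md5_of_file` and `matches_provenance` are made up for illustration.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks so that
    large reference files do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_provenance(path, expected_md5):
    """True if the file's MD5 digest equals the hash shown in provenance."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `matches_provenance(local_path, "a86c94ce8d58ea9154fb88b05c123b02")` returns `True` only if the file at `local_path` (wherever your copy of the candidate Silva file lives) is byte-for-byte the one that was imported.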
Also worth noting: Silva 119 is the latest version of Silva that is easily imported into QIIME 2. See this thread for more details: