Hello,
I have a problem with silva-138.1-ssu-nr99-classifier.qza.
Adding more RAM to my machine finally let me use this SILVA classifier, but now I have a new complication.
My tables and bar plots show only d_Bacteria, d_Eukaryota, and a large share of "Unassigned". I used the SSU primers (SSUF04 GCTTGTCTCAAAGATTAAGCC, SSURmod CCTGCTGCCTTCCTTRGA) on data from an aquatic ecosystem.
I do not have this problem with the other primer pair, EUK (1391F 5′-GTACACACCGCCCGTC-3′, Euk B 5′-TGATCCTTCTGCAGGTTCACCTAC-3′). In both cases I set the Taxonomy Level to "Level 6" in QIIME 2 View.
Forster, D., Filker, S., Kochems, R., Breiner, H. W., Cordier, T., Pawlowski, J., & Stoeck, T. (2019). A comparison of different ciliate metabarcode genes as bioindicators for environmental impact assessments of salmon aquaculture. Journal of Eukaryotic Microbiology, 66(2), 294-308.
1391F 5′-GTACACACCGCCCGTC-3′
Euk B 5′-TGATCCTTCTGCAGGTTCACCTAC-3′
Günther, B., Jourdain, E., Rubincam, L., Karoliussen, R., Cox, S. L., & Arnaud Haond, S. (2022). Feces DNA analyses track the rehabilitation of a free-ranging beluga whale. Scientific Reports, 12(1), 1-7.
I've been discussing this question offline with @SoilRotifer, and we think the issue may be related to read orientation.
During Illumina sequencing, the amplicon primers can be used as sequencing primers so that all the reads are in the 'forward' orientation. In that case R1 always starts with SSUF04 and R2 always starts with SSURmod.
Alternatively, the standard Illumina sequencing primers can be used, so R1 is a mix of reads starting with SSUF04 and reads starting with SSURmod. Roughly half the reads are then 'backwards' after sequencing, which can lead to many unassigned reads.
Do you think that could be going on here?
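One rough way to check this before re-running anything is to look at which primer the R1 reads start with. Below is a minimal Python sketch of that idea, not part of QIIME 2 or RESCRIPt. It assumes the primers have not been trimmed yet and sit at the very start of each read (no heterogeneity spacers or barcodes in front); the function names `primer_regex`, `r1_orientation`, and `tally` are made up for illustration. The primer sequences are the ones quoted above, with the degenerate base R expanded to A or G.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes,
# so degenerate primer bases (e.g. R = A or G) match correctly.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_regex(primer):
    """Compile a primer (possibly degenerate) into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


# Primer sequences as given in the original post.
SSUF04 = primer_regex("GCTTGTCTCAAAGATTAAGCC")
SSURMOD = primer_regex("CCTGCTGCCTTCCTTRGA")


def r1_orientation(read):
    """Classify a read by the primer found at its 5' end (exact match only)."""
    read = read.upper()
    if SSUF04.match(read):
        return "forward"
    if SSURMOD.match(read):
        return "reverse"
    return "unknown"


def tally(reads):
    """Count how many reads fall into each orientation class."""
    counts = {"forward": 0, "reverse": 0, "unknown": 0}
    for read in reads:
        counts[r1_orientation(read)] += 1
    return counts
```

If you feed `tally` a few thousand R1 sequences and get an almost even split between "forward" and "reverse", that points to mixed orientation. A large "unknown" count more likely means the primers were already removed or the assumptions above do not hold, since this sketch tolerates no mismatches.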
If you want to test this, run the rescript orient-seqs action from the SILVA tutorial. If your ASVs then get taxonomy assigned, that's a good step forward!
When you re-run feature-classifier classify-sklearn ... , can you set the option --p-read-orientation same? The default, --p-read-orientation auto, can sometimes be tricked and produce spurious results, so your command should be:
I noticed you mislabeled your output as table_ssu_oriented.qza, when it should probably be repseqs_ssu_oriented.qza.
Anyway, if setting the orientation does not help... then I'd try running the following using the reference sequences and taxonomy files (those used to make the classifier) as input:
VSEARCH will work regardless of orientation. But if you obtain the same spurious results, then I would think the sequencing run did not work as intended. In that case, I would manually BLAST a few of the sequences online with the megablast setting, excluding environmental sequences.