Bos taurus is everywhere in my 18S SILVA 138 taxonomy!

Hello Qiime2 community.

I'm seeing a strange result when I look at 18S taxonomy. My samples are DNA extracted from a test pit dug 1.5 meters deep in East Iceland. Samples were taken every 10 cm. We're looking to see any evidence of human habitation - the test pit was dug right next to a known stone foundation, outside the house.

I'm running QIIME 2 2021.4 and using the SILVA 138 classifier downloaded from the QIIME 2 website (I did not train my own - yet). I'm getting a very strange result when I look at Vertebrates:
Total # Reads: 928204
Reads classified as Bos taurus: 144269 (a full 15% of the entire dataset)
Reads assigned to Vertebrates that are not Bos taurus: 12

Has anyone else seen anything this strange? I'm guessing the problem is somewhere in the assignment of reads step. My next step will be to train my own classifier on my data, but I'm happy to get more advice on this.

For those interested, here is a table of the highest % reads:

| % of total reads | Phylum | Class | Order | Family | Genus | Species | Species |
|---|---|---|---|---|---|---|---|
| 15.54 | p__Vertebrata | c__Actinopterygii | o__Teleostei | f__Teleostei | g__Teleostei | s__Bos_taurus | s__Bos_taurus |
| 15.51 | p__Basidiomycota | c__Agaricomycetes | NA | NA | NA | NA | NA |
| 11.91 | NA | NA | NA | NA | NA | NA | NA |
| 7.48 | p__Peronosporomycetes | c__Peronosporomycetes | o__Peronosporomycetes | f__Peronosporomycetes | NA | NA | NA |
| 5.82 | p__Basidiomycota | c__Agaricomycetes | o__Agaricales | NA | NA | NA | NA |
| 5.79 | p__Phragmoplastophyta | c__Embryophyta | o__Magnoliophyta | f__Magnoliophyta | g__Magnoliophyta | NA | NA |
| 5.72 | p__Mucoromycota | c__Incertae_Sedis | o__Mortierellales | f__Mortierellaceae | g__Mortierella | NA | NA |
| 3.08 | p__Ascomycota | c__Leotiomycetes | o__Helotiales | NA | NA | NA | NA |
| 2.99 | p__Ascomycota | NA | NA | NA | NA | NA | NA |
| 2.40 | p__Cryptomycota | c__LKM11 | o__LKM11 | f__LKM11 | g__LKM11 | NA | NA |
| 2.20 | p__Cercozoa | c__Glissomonadida | o__Glissomonadida | f__Glissomonadida | g__uncultured | NA | NA |
| 1.94 | p__Ascomycota | c__Dothideomycetes | o__Pleosporales | NA | NA | NA | NA |
| 1.59 | p__Ascomycota | c__Archaeorhizomycetes | o__Archaeorhizomycetales | f__Archaeorhizomycetaceae | g__Archaeorhizomyces | s__Archaeorhizomyces_borealis | s__Archaeorhizomyces_borealis |
| 1.24 | p__Basidiomycota | c__Tremellomycetes | o__Filobasidiales | NA | NA | NA | NA |
| 1.09 | p__Cercozoa | NA | NA | NA | NA | NA | NA |
| 1.09 | p__Cercozoa | c__uncultured | o__uncultured | f__uncultured | g__uncultured | s__uncultured_Eimeriidae | |

The code I ran:

qiime feature-classifier classify-sklearn \
  --p-n-jobs 8 \
  --i-classifier silva-138-99-nb-classifier.qza \
  --i-reads Iceland-2019-STP-FWDs-rep-seqs-dada2-run3-18s.qza \
  --o-classification Iceland-2019-STP-taxonomy-silva-run3-18s.qza &

qiime metadata tabulate \
  --m-input-file Iceland-2019-STP-taxonomy-silva-run3-18s.qza \
  --m-input-file Iceland-2019-STP-FWDs-rep-seqs-dada2-run3-18s.qza \
  --o-visualization Iceland-2019-STP-taxonomy-silva-table-run3-18s.qzv

qiime taxa barplot \
  --i-table Iceland-2019-STP-FWDs-table-dada2-run3-18s.qza \
  --i-taxonomy Iceland-2019-STP-taxonomy-silva-run3-18s.qza \
  --m-metadata-file Skalanes-Test-Pit-Metadata.tsv \
  --o-visualization Iceland-2019-rSTP-taxonomy-silva-bar-plots-run3-18s.qzv

Did you manually BLAST the cow sequences and confirm that they are in fact cow?
Did anyone have beef stew or similar for lunch?
Was it present in your control samples?
There are lots of ways to get DNA into a sample without the organism having been present. We occasionally see things like cows, chickens, pigs, dogs, and humans in our samples, especially if the sample does not have much else for the 18S primers to amplify.
It's hard to argue that they are not contaminants unless you've taken specific measures to control for that.
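
To make the manual BLAST check concrete, here is a minimal Python sketch of pulling out the suspect representative sequences so they can be pasted into BLAST. It assumes the taxonomy and rep-seqs artifacts have already been exported to a tab-separated file with `Feature ID`, `Taxon`, and `Confidence` columns and to a plain FASTA file. The feature IDs, sequences, and function names below are invented for illustration.

```python
import csv
import io


def ids_matching_taxon(taxonomy_tsv, needle):
    """Return the feature IDs whose Taxon string contains `needle`."""
    reader = csv.DictReader(taxonomy_tsv, delimiter="\t")
    return {row["Feature ID"] for row in reader if needle in row["Taxon"]}


def filter_fasta(fasta, keep_ids):
    """Yield (feature_id, sequence) for FASTA records whose ID is in keep_ids."""
    seq_id, chunks = None, []
    for line in fasta:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record; emit it if wanted.
            if seq_id in keep_ids:
                yield seq_id, "".join(chunks)
            header = line[1:].split()
            seq_id, chunks = (header[0] if header else None), []
        elif line:
            chunks.append(line)
    # Emit the final record, which has no following header to close it.
    if seq_id in keep_ids:
        yield seq_id, "".join(chunks)


# Toy stand-ins for the exported files; IDs and sequences are invented.
TAXONOMY = io.StringIO(
    "Feature ID\tTaxon\tConfidence\n"
    "asv1\td__Eukaryota; p__Vertebrata; c__Actinopterygii; s__Bos_taurus\t0.98\n"
    "asv2\td__Eukaryota; p__Basidiomycota; c__Agaricomycetes\t0.91\n"
)
FASTA = io.StringIO(">asv1\nACGTACGT\nTTGA\n>asv2\nGGGCCC\n")

suspect_ids = ids_matching_taxon(TAXONOMY, "Bos_taurus")
suspects = list(filter_fasta(FASTA, suspect_ids))
for feature_id, seq in suspects:
    print(f">{feature_id}\n{seq}")
```

With real data, the two `io.StringIO` objects would be replaced by open file handles on the exported files, and the printed FASTA records are what you would submit to BLAST.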

Hi @16sIceland,

If you look closely, the upper-level taxonomy does not fit that of Bos taurus; in fact, it is that of a fish of some kind. The SILVA database we provide was made using the --p-include-species-labels flag, which basically appends the host organism name (i.e. usually the organism from which the community DNA was amplified) as the species label. However, the host organism name is not always that of the sequenced DNA. This is outlined in more detail within this forum post.

Specifically, read the warning we have under the drop-down menu "Species-labels: caveat emptor!" :warning:. That is, SILVA does not currently curate taxonomy down to the species level, and we offer the option to append the organism name because it is sometimes helpful. If you'd like, you can use RESCRIPt to make your own reference database without the organism names appended as the species labels. Or you can use RESCRIPt to make your own database from NCBI (see the tutorials on the RESCRIPt page).
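
As a quick stopgap while a new reference database is being built, the species labels can also be ignored after the fact. The sketch below is not the RESCRIPt approach and does not change how reads were classified; it only trims the `s__` rank from taxonomy strings that have already been assigned, assuming the usual semicolon-separated rank format. The function name is hypothetical.

```python
def drop_species(taxon):
    """Return the taxonomy string with any species-level (s__) rank removed."""
    ranks = [rank.strip() for rank in taxon.split(";")]
    kept = [rank for rank in ranks if rank and not rank.startswith("s__")]
    return "; ".join(kept)


# Example string modelled on the top row of the table in the question.
full = (
    "d__Eukaryota; p__Vertebrata; c__Actinopterygii; o__Teleostei; "
    "f__Teleostei; g__Teleostei; s__Bos_taurus"
)
trimmed = drop_species(full)
print(trimmed)
```

Read this way, the 15% of reads in question are only "some teleost-like vertebrate sequence" as far as SILVA's curated ranks go, which is a more honest summary than the appended organism name.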

It could be that this was a contaminant, or some diet bycatch :thinking:? But the key point is: when you observe an odd result like this, make sure the entire taxonomy is reasonable, then confirm via BLAST or another tool. :mag:
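
One rough way to screen for this kind of mismatch before reaching for BLAST is to compare the first word of the species label with the genus rank. This is a heuristic sketch of my own, not something QIIME 2 or RESCRIPt provides, and the function name is hypothetical. It assumes `g__` and `s__` prefixes and underscore-joined binomials, and it will miss cases where the genus rank is itself a placeholder.

```python
def species_conflicts_with_genus(taxon):
    """Return True if the species label's first word differs from the genus rank."""
    ranks = {}
    for part in taxon.split(";"):
        part = part.strip()
        if "__" in part:
            prefix, name = part.split("__", 1)
            ranks[prefix] = name
    genus, species = ranks.get("g"), ranks.get("s")
    if not genus or not species:
        # Nothing to compare if either rank is missing or empty.
        return False
    return species.split("_")[0].lower() != genus.lower()


# Two strings modelled on rows of the table in the question.
cow = "p__Vertebrata; c__Actinopterygii; o__Teleostei; f__Teleostei; g__Teleostei; s__Bos_taurus"
fungus = "p__Ascomycota; c__Archaeorhizomycetes; g__Archaeorhizomyces; s__Archaeorhizomyces_borealis"

print(species_conflicts_with_genus(cow))     # genus Teleostei vs. species Bos_taurus
print(species_conflicts_with_genus(fungus))  # genus and species label agree
```

Anything flagged this way is only a candidate for a closer look; the BLAST confirmation is still what settles it.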

Hope this helps!