I have a question. I performed a taxonomy classification using 16S with the SILVA database and the Naive Bayes classifier, and I found a small number of environmental bacteria that are not present in stools.
Is this logical? I mean, if the majority of the taxa in the output are from the human gut, but a small proportion of bacteria are not present in the human gut, is that wrong? Or is this something that we know can happen?
I think the answer is generally "it depends." Could you give an example of an environmental bacterium you're finding in the gut that you don't expect?
I can think of cases where I've seen unexpected bacteria that made sense, and cases where the unexpected bacteria turned out to be contamination. So, having more context can help a lot to make this determination.
As I look through the table, it's more or less what I would expect. I might filter things missing a class label, out of concerns for chimeras, but that's neither here nor there. I always recommend a stacked barplot as a comparison for the environment, because I think it's easier to find what you want to see and compare to existing literature. But none of the taxa are glaring at me as not belonging.
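If you do want to try the class-label filter, here is a minimal sketch in plain Python. It assumes SILVA-style taxonomy strings with semicolon-delimited ranks and a `c__` prefix for class; the feature IDs and the `has_class_label` helper are made up for illustration, so adjust them to whatever your classifier output actually looks like.

```python
def has_class_label(taxon: str) -> bool:
    """Return True if the taxonomy string carries a non-empty class rank.

    Assumes semicolon-delimited ranks with a 'c__' prefix for class.
    """
    for rank in taxon.split(";"):
        rank = rank.strip()
        if rank.startswith("c__") and len(rank) > len("c__"):
            return True
    return False


# Hypothetical feature IDs and taxonomy strings, for illustration only.
taxonomy = {
    "feature_1": "d__Bacteria; p__Firmicutes; c__Clostridia; o__Lachnospirales",
    "feature_2": "d__Bacteria; p__Proteobacteria",    # stops at phylum
    "feature_3": "d__Bacteria; p__Bacteroidota; c__",  # empty class rank
}

# Keep only the features that were classified at least to class level.
kept = {fid: tax for fid, tax in taxonomy.items() if has_class_label(tax)}
print(sorted(kept))
```

Features that stop at phylum, or that have an empty class rank, are the ones dropped.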
So, which bacteria is your intuition suggesting are wrong, and why?
Thanks. What is their relative abundance and prevalence in your table? Are they present in a lot of samples or only a few? High or low abundance? Is there secondary metadata that might explain their presence?
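If it helps, those two numbers can be pulled out of a count table with something like the sketch below. It assumes the table is a plain dictionary of sample to taxon to read count; the sample IDs, taxon names, counts, and the `summarize` helper are all placeholders, not values from your data.

```python
# Hypothetical counts: table[sample][taxon] = number of reads.
table = {
    "S1": {"taxon_gut_A": 800, "taxon_gut_B": 190, "taxon_marine_X": 10},
    "S2": {"taxon_gut_A": 600, "taxon_gut_B": 400, "taxon_marine_X": 0},
    "S3": {"taxon_gut_A": 500, "taxon_gut_B": 500, "taxon_marine_X": 0},
    "S4": {"taxon_gut_A": 700, "taxon_gut_B": 300, "taxon_marine_X": 0},
}


def summarize(table, taxon):
    """Return (prevalence, mean relative abundance where present, samples with the taxon)."""
    rel = {}
    for sample, counts in table.items():
        total = sum(counts.values())
        rel[sample] = counts.get(taxon, 0) / total if total else 0.0
    present = [s for s, r in rel.items() if r > 0]
    prevalence = len(present) / len(rel)
    mean_when_present = sum(rel[s] for s in present) / len(present) if present else 0.0
    return prevalence, mean_when_present, present


prevalence, mean_abundance, samples_with_taxon = summarize(table, "taxon_marine_X")
print(prevalence, mean_abundance, samples_with_taxon)
```

A taxon that shows up in one sample at low abundance reads very differently from one that shows up in most samples, which is why both numbers matter.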
For example, if they're primarily found in one person, it's possible you might be seeing them because they're an ocean swimmer or exposed to a marine environment. Small studies I've seen have shown people who surf or swim have different taxa than people who exercise on land due to marine exposure. Anecdotal, but possible.
The other possibility might be contamination. I would look at your reagents and sequencing run. If this is the case, you should expect to see the bacteria across multiple samples. For example, if your samples were sequenced with marine samples on the same run, you might see contamination. (We once found tick bacteria in a disease study! It was super exciting... until we realized the samples had been run with ticks and it was likely index hopping.)
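One rough way to check the run-level pattern is to tabulate how often the taxon appears per sequencing run. This sketch assumes you have a sample-to-run mapping in your metadata; the run IDs, sample IDs, and the `prevalence_by_run` helper are hypothetical.

```python
from collections import defaultdict


def prevalence_by_run(presence, run_of):
    """Fraction of samples in each run where the taxon was detected.

    presence: sample ID -> True/False (taxon detected)
    run_of:   sample ID -> sequencing run ID
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for sample, run in run_of.items():
        totals[run] += 1
        if presence.get(sample, False):
            hits[run] += 1
    return {run: hits[run] / totals[run] for run in totals}


# Hypothetical metadata and detection calls, for illustration only.
run_of = {"S1": "run_A", "S2": "run_A", "S3": "run_B", "S4": "run_B"}
presence = {"S1": True, "S2": True, "S3": False, "S4": False}

print(prevalence_by_run(presence, run_of))
```

If the taxon is in nearly every sample from one run and absent from the others, that points toward the run or reagents rather than the participants. It is only a hint, though, not proof.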
Are there other reasons these marine bacteria might appear in the samples?
In general, my policy is to leave stuff in for diversity unless there's a very good reason to exclude it (mitochondria, spike-ins, poor classification), and then to use prevalence/abundance filtering in my differential abundance testing.
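A minimal sketch of what that prevalence/abundance filter could look like, applied only to the table going into differential abundance testing. The thresholds, the toy counts, and the `filter_taxa` helper are assumptions for illustration, not recommended cutoffs.

```python
def filter_taxa(table, min_prevalence=0.1, min_rel_abundance=0.001):
    """Return taxa that pass both a prevalence and a relative abundance threshold.

    table: sample ID -> {taxon: read count}
    A taxon is kept if it is present in at least `min_prevalence` of samples
    and reaches at least `min_rel_abundance` in some sample.
    """
    samples = list(table)
    totals = {s: sum(table[s].values()) for s in samples}
    taxa = sorted({t for counts in table.values() for t in counts})
    kept = []
    for taxon in taxa:
        rel = [
            table[s].get(taxon, 0) / totals[s] if totals[s] else 0.0
            for s in samples
        ]
        prevalence = sum(r > 0 for r in rel) / len(samples)
        if prevalence >= min_prevalence and max(rel) >= min_rel_abundance:
            kept.append(taxon)
    return kept


# Hypothetical counts, for illustration only.
table = {
    "S1": {"taxon_a": 600, "taxon_b": 399, "taxon_rare": 1},
    "S2": {"taxon_a": 500, "taxon_b": 500},
    "S3": {"taxon_a": 700, "taxon_b": 300},
    "S4": {"taxon_a": 400, "taxon_b": 600},
}

# taxon_rare is in 1 of 4 samples, so a 50% prevalence cutoff drops it.
print(filter_taxa(table, min_prevalence=0.5))
```

The unfiltered table still goes into the diversity analyses; only the differential abundance input gets trimmed.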
I'll also remind you that there are a lot of bacteria out there that are poorly characterized. So, they may have that marine tag because that's where they were found, but they could be happy hanging out in humans because we are salty bags of water and nutrients, too. Although I find some days to be saltier than others.