Taxonomy classification does not match for my positive control (Mock community)

Hello again,

I have a question regarding the taxonomy classifier. I ran "qiime feature-classifier classify-sklearn" with the pre-trained classifier created by Sydney_Morgan, as I mentioned in this post. Although I was able to get taxonomic classifications, when I looked at the resulting taxa bar plot, the mock community does not match the known taxa. I am using the Zymo mock community, which contains two known fungal species, Saccharomyces cerevisiae and Cryptococcus neoformans, neither of which is present in the taxonomy results for the mock community.

I am unsure what I could have done wrong and/or what I should do to fix this. Should I train my own classifier, or rerun the code?

Below is the script I ran, in case it helps, as well as the taxonomy bar plot and the file.

qiime feature-classifier classify-sklearn \
  --i-classifier Train-Feature-Classifier/developer/Unite-ver8-dynamic-classifier.qza \
  --i-reads Fungal-RepSeqs-dada2-1.qza \
  --p-read-orientation reverse-complement \
  --o-classification Fungal-taxonomy-paired-RC.qza

Fungal-taxonomy-paired-RC.qzv (1.6 MB)
Fungal-taxa-bar-plots-RC.qzv (1.6 MB)


I’d say your mock community failed — it looks a lot like your negative control, which suggests to me that maybe it did not amplify sufficiently. You should review your PCR amplification results and sequence read counts for those samples.
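If it helps with the read-count check, here is a rough sketch rather than an official recipe. It assumes you have already exported your feature table to a TSV file (for example with `qiime tools export` followed by `biom convert --to-tsv`) and that the file has the usual `#OTU ID` header row. The function name `sample_read_counts` and the file name `feature-table.tsv` are hypothetical.

```shell
# Sum read counts per sample from a TSV feature table.
# Assumed layout (as written by `biom convert --to-tsv`):
#   optional comment lines starting with "#"
#   a header line: "#OTU ID<TAB>sample1<TAB>sample2..."
#   then one line per feature: feature ID followed by one count per sample
sample_read_counts() {
    awk -F'\t' '
        # Header row: remember the sample names and the number of columns
        /^#OTU ID/ { for (i = 2; i <= NF; i++) name[i] = $i; n = NF; next }
        # Skip any other comment lines
        /^#/       { next }
        # Data rows: accumulate counts column by column
        NF > 1     { for (i = 2; i <= NF; i++) total[i] += $i }
        # Print one "sample<TAB>total" line per sample
        END        { for (i = 2; i <= n; i++) printf "%s\t%d\n", name[i], total[i] }
    ' "$1"
}
```

Something like `sample_read_counts feature-table.tsv | sort -k2,2n` would then list samples from fewest to most reads, so you can see where the mock community and negative controls fall.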

Looking at your cutadapt results from a separate topic, it looks like you do get plenty of reads in the mock community, but these classifications are highly suspicious: they are not remotely close to the expected yeasts. The Pyronema is abundant in many of your samples and could be cross-contamination. Is this a species you expect in your other samples? The Mollisia, on the other hand, looks really bad: it is mostly found only in your negative controls and mock community, which suggests some sort of background contamination. The fact that it is most abundant in those samples and almost never found in the others is very suspicious. Could a contamination event have occurred when you were preparing those samples separately?
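To double-check that the expected yeasts really are absent from the classifications, and not just hidden in the bar plot, you could search the exported taxonomy directly. This is only a sketch under a few assumptions: the taxonomy has been exported to a `taxonomy.tsv` file with a header row and the taxon string in the second column (e.g. via `qiime tools export`), ranks are separated by semicolons, and genera carry a `g__` prefix. The function name `count_genus_hits` is hypothetical.

```shell
# Count how many features are assigned to each genus of interest.
# $1 = taxonomy TSV (header row, then: feature ID <TAB> taxon string <TAB> confidence)
# remaining arguments = genus names without the assumed "g__" prefix
count_genus_hits() {
    file=$1
    shift
    for genus in "$@"; do
        hits=$(awk -F'\t' -v g="g__$genus" '
            NR > 1 {
                # Split the taxon string into ranks and look for an exact genus match
                n = split($2, rank, ";")
                for (i = 1; i <= n; i++) {
                    gsub(/^ +| +$/, "", rank[i])   # trim spaces around each rank
                    if (rank[i] == g) { c++; break }
                }
            }
            END { print c + 0 }
        ' "$file")
        printf '%s\t%s\n' "$genus" "$hits"
    done
}
```

For example, `count_genus_hits taxonomy.tsv Saccharomyces Cryptococcus Pyronema Mollisia` prints one line per genus with the number of features assigned to it. Zero hits for both mock community genera would back up the idea that this is not simply a resolution problem at species level.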

So this really does not look like the usual situation with mock communities (or any samples), where observations fail to match expectations because of a lack of taxonomic resolution. Instead, it looks like a technical error, most likely on the wet-lab side.

None of this is to say that you cannot use your other data. I’d recommend critically reviewing the organisms found there and deciding whether the data look “clean”. The good news is that your real samples do not look like your negative controls, so this is probably not a widespread problem. The bad news is that the mock community sample looks really bad; I’d say that sample is unusable.

I think the problems occurred before or during sequencing, so unfortunately the only way to save this sample is probably to start at the very beginning. Your other samples are probably okay, since they did not suffer the same fate as this sample.

This does not appear to be a classifier issue or bioinformatics issue. Re-running the code will not help.

I’d say give it all a careful review and determine whether, and on what grounds, you can recover the other data even if the mock community cannot be saved. Then pick up the pieces and move on! Again, the other data look fine, and that’s the good news to be had here.

:mushroom: :disappointed: :mushroom:

Bummer. I did see there was contamination because of the negative controls, but we are working with a completely new protocol and this was the test group, so we definitely did something wrong.

The Pyronema is abundant in many of your samples and could be cross-contamination. Is this a species you expect in your other samples?

Yes, Pyronema is expected, as it is a very common fungal group in post-fire samples. I did notice it in the mock and negative controls, which is why I figured contamination had occurred.

The Mollisia, on the other hand, looks really bad: it is mostly found only in your negative controls and mock community, which suggests some sort of background contamination.

I did notice that, but I was hoping I had made an error in the pipeline, since that would have been an easy fix.

Thanks again for being so helpful.

