I’m trying to filter out sequence variants present in my extraction (-) and PCR (-) from the rest of my dataset. I noticed similar posts in this forum and I’ve gone through them to see if any of the recommended fixes worked for me. So far they haven’t, but maybe I missed something obvious.
Using identity-based filtering (this metadata file contains all the samples except the PCR (-) and Ext (-)):
qiime feature-table filter-features \
  --i-table table-final.qza \
  --m-metadata-file BrazilMicrobMetadata_filtered.tsv \
  --o-filtered-table id-filtered-table.qza
I then try to summarize the filtered table using this command:
qiime feature-table summarize \
  --i-table id-filtered-table.qza \
  --o-visualization id-filtered-table.qzv \
  --m-sample-metadata-file BrazilMicrobMetadata_filtered.tsv
And I get this error:
Plugin error from feature-table:
All IDs were filtered out of the Metadata, resulting in an empty Metadata object.
I used ls -ls to verify that none of the files that I use in this command are empty. I’m using QIIME2-2018.2.
After classifying the sequence variants in my dataset and creating a list of taxa that I believe are contaminants, I am still running into difficulty eliminating certain taxa.
Namely, I cannot filter out taxa that don’t have genus or species identified:
However, when I reclassify rep-seqs-final_filtered.qza and then build the taxa barplot, “k__Bacteria; p__Bacteroidetes; __; __; __; __” is still present in the data.
Is there another way to filter out these less specifically identified taxa that I think are contaminants?
Thanks,
Kelly
If you were to run qiime metadata tabulate on taxonomy_unfiltered.qza, do you still see the offending entry? I suspect it will look like this instead: k__Bacteria;p__Bacteroidetes
If I’m remembering correctly, taxa barplot pads out the taxonomy strings to make rendering simpler. In other words, k__Bacteria;p__Bacteroidetes;__;__;__;__ is just a side effect of making the depth match everything else; the underlying string is still k__Bacteria;p__Bacteroidetes.
However, if you saw k__Bacteria;p__Bacteroidetes;c__;o__;f__;g__ (with the prefixes), that would mean there’s an unnamed Greengenes OTU with that taxonomic resolution.
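To make the distinction concrete, here is a small shell sketch (not anything QIIME 2 runs itself) that strips only the prefix-less `;__` padding from a label. The function name `strip_padding` is made up for illustration. Ranks that carry a prefix, such as `c__`, are left alone, since those would be part of the real taxonomy string.

```shell
# Illustrative helper (hypothetical name): drop trailing ";__" padding only.
strip_padding() {
  s="$1"
  # Keep removing a trailing ";__" until none is left.
  while [ "${s%;__}" != "$s" ]; do
    s="${s%;__}"
  done
  printf '%s\n' "$s"
}

# Label as it appears in the barplot (padded for display).
padded=$(strip_padding 'k__Bacteria;p__Bacteroidetes;__;__;__;__')

# Label with real, prefixed (but unnamed) ranks: nothing to strip.
prefixed=$(strip_padding 'k__Bacteria;p__Bacteroidetes;c__;o__;f__;g__')

echo "$padded"
echo "$prefixed"
```

The first label collapses to k__Bacteria;p__Bacteroidetes, while the second comes back unchanged.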
Also, unrelated to what I think is happening, there’s another issue with your command.
If we compare your search string to the list we see that they are not exactly the same:
k__Bacteria;p__Bacteroidetes;__;__;__;__ # Your list
k__Bacteria; p__Bacteroidetes; __; __; __; __ # Your --p-exclude
In particular, your query has a space after each semicolon, and in exact mode the computer treats the two strings as different.
If I’m remembering correctly about the behavior of taxa barplot, your --p-exclude should look more like: --p-exclude 'k__Bacteria;p__Bacteroidetes'.
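Here is a rough sketch of why the two strings don't match, followed by what the exact-mode call might look like. The input file names are taken from earlier in this thread, the output name is invented, and I'm assuming the same table and taxonomy are the ones being filtered. The `command -v` guard is only there so the snippet also runs on a machine without QIIME 2.

```shell
from_list='k__Bacteria;p__Bacteroidetes;__;__;__;__'
from_query='k__Bacteria; p__Bacteroidetes; __; __; __; __'

# Exact matching is a plain string comparison, so the spaces matter.
if [ "$from_list" = "$from_query" ]; then
  verdict='identical'
else
  verdict='different'
fi
echo "raw comparison: $verdict"

# Removing the spaces after each semicolon makes the strings agree.
normalized=$(printf '%s' "$from_query" | sed 's/; */;/g')
echo "normalized query: $normalized"

# Assumed file names; the output name is hypothetical.
if command -v qiime >/dev/null 2>&1 && [ -f table-final.qza ]; then
  qiime taxa filter-table \
    --i-table table-final.qza \
    --i-taxonomy taxonomy_unfiltered.qza \
    --p-mode exact \
    --p-exclude 'k__Bacteria;p__Bacteroidetes' \
    --o-filtered-table table-no-bacteroidetes-only.qza
fi
```

The safest source for the string passed to --p-exclude is the tabulated taxonomy itself, copied verbatim.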
Thanks so much @ebolyen! I tabulated the taxonomy_unfiltered and copied the taxon ID directly from the metadata. After I did this, the targeted taxa were removed from the dataset. Thanks so much for pointing that out.
As a suggestion, it might be helpful to allow people to filter based on feature ID when using --p-mode exact. Feature IDs don’t suffer from minor variations in how they are written, like the presence or absence of spaces. This may already be implemented?
It sure is! That was actually our first form of filtering: qiime feature-table filter-seqs (granted, feature-table might not be the first place one would look for this). There are other filtering operations in q2-feature-table as well which are based on IDs.
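As one possible illustration of ID-based filtering, the sketch below writes a one-column metadata file of feature IDs and passes it to qiime feature-table filter-features with --p-exclude-ids. The IDs, the `features-to-drop.tsv` file name, and the output name are all placeholders, and I'm assuming your QIIME 2 release accepts `feature-id` as the ID column header (older releases may expect a different header).

```shell
# Hypothetical file name and feature IDs; copy real IDs from your tabulated taxonomy.
ids_file='features-to-drop.tsv'
printf 'feature-id\n' > "$ids_file"
printf '%s\n' \
  'hypothetical-feature-id-0001' \
  'hypothetical-feature-id-0002' >> "$ids_file"

# Count the IDs (every line except the header).
n_ids=$(( $(wc -l < "$ids_file") - 1 ))
echo "IDs listed for removal: $n_ids"

# Only attempt the QIIME 2 call if it is installed and the table exists.
if command -v qiime >/dev/null 2>&1 && [ -f table-final.qza ]; then
  qiime feature-table filter-features \
    --i-table table-final.qza \
    --m-metadata-file "$ids_file" \
    --p-exclude-ids \
    --o-filtered-table table-ids-removed.qza
fi
```

Without --p-exclude-ids, the same call would keep only the listed features instead of dropping them.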
Hello everyone, here is a little script that uses "--p-mode exact" to show the microorganisms that are resolved:
1.) only at the kingdom level
2.) at the family level
I ran into a problem again, and if you have already done this, you probably went through the same steps that I did, and then realized you were maybe going down a rabbit hole : )
I need a “--p-mode exact” search for ‘k__Bacteria’, but also a “--p-mode contains” search for all the other taxa! I am a little stuck. Do I have to use “qiime feature-table filter-features” instead and create several tables that I then merge into one? It is getting complicated. Is there an easier way that I am just missing?
If your filtering query is more complex than those supported through qiime taxa filter-table, you should use qiime feature-table filter-features.
Sounds like you are a perfect candidate for this, since your query is becoming a bit complex now, and feature-table filter-features supports SQL-based queries, which are super powerful!
(double-check my spelling in that long taxon string - I am sure I typoed somewhere...)
What this SQL clause is saying is: I want all the features whose taxon string is exactly (=) k__Bacteria. I also want (OR) all the features whose taxon string starts with k__Bacteria; p__Planctomycetes; c__Planctomycetia; o__Pirellulales; f__Pirellulaceae (LIKE, with a % at the end of the query string acting as a wildcard).
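For reference, a clause of the kind described here could be assembled as in the sketch below. This is a reconstruction under assumptions, not necessarily the exact command from the post: `Taxon` is assumed to be the column name that the taxonomy artifact exposes as metadata, and `table.qza`, `taxonomy.qza`, and `filtered-table.qza` are placeholder file names. The `command -v` guard just lets the snippet run where QIIME 2 is not installed.

```shell
# Prefix to match; copy the spelling and spacing from your own tabulated taxonomy.
family='k__Bacteria; p__Planctomycetes; c__Planctomycetia; o__Pirellulales; f__Pirellulaceae'

# Exact match on k__Bacteria, OR anything starting with the family prefix.
where="Taxon='k__Bacteria' OR Taxon LIKE '${family}%'"
echo "$where"

# Placeholder file names; only run if QIIME 2 and the inputs are present.
if command -v qiime >/dev/null 2>&1 && [ -f table.qza ] && [ -f taxonomy.qza ]; then
  qiime feature-table filter-features \
    --i-table table.qza \
    --m-metadata-file taxonomy.qza \
    --p-where "$where" \
    --o-filtered-table filtered-table.qza
fi
```

Building the clause in a variable first makes it easy to echo and proofread the long taxon string before running the filter.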