Feature IDs found in the table are missing from the taxonomy (after filtering)

Hi everyone! This question was answered in other scenarios but I am still confused.

I am simply filtering out archaea and eukaryotes from my 16S rRNA data, and whenever I try to make the viewable bar plots I get this error, followed by a list of my samples: “Feature IDs found in the table are missing from the taxonomy:”

I assume it is because the table and the taxonomy did not filter out the same things, but I am unfamiliar with QIIME 2 and have no idea how to check whether this is the case or how to fix it. I have included my code below. All taxonomy assignments prior to filtering worked without a hitch, but for some reason filtering is breaking everything.

qiime taxa filter-seqs \
  --i-sequences rep-seqs.qza \
  --i-taxonomy taxonomy.qza \
  --p-exclude "d__Archaea,d__Eukaryota " \
  --p-include p__ \
  --o-filtered-sequences filt-rep-seqs.qza
  
qiime feature-classifier classify-sklearn \
  --i-classifier /path/silva-138.2-F04-R22-classifier.qza \
  --i-reads filt-rep-seqs.qza \
  --p-n-jobs 8 \
  --o-classification filt-taxonomy.qza

# use the absolute pathname of the trained database; this assigns taxonomy to ASVs (SILVA is better for microorganisms)

qiime metadata tabulate \
  --m-input-file filt-taxonomy.qza \
  --o-visualization VIEWABLE_filt-taxonomy.qzv   
  
   

qiime taxa filter-table \
  --i-table table.qza \
  --i-taxonomy taxonomy.qza \
  --p-exclude "d__Archaea,d__Eukaryota " \
  --o-filtered-table filt-table.qza

qiime taxa barplot \
  --i-table filt-table.qza \
  --i-taxonomy filt-taxonomy.qza \
  --o-visualization VIEWABLE_filt-taxa-bar-plots.qzv

Hello Jahlen,

Welcome to the forums!

Try this:

qiime taxa barplot \
  --i-table filt-table.qza \
  --i-taxonomy taxonomy.qza \
  --o-visualization VIEWABLE_filt-taxa-bar-plots.qzv

Notice how I've changed the file passed to --i-taxonomy: it is now the original taxonomy.qza instead of filt-taxonomy.qza!

Try that and let me know how it works!
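
For anyone wondering why the original barplot command raised the error, here is a minimal Python sketch with made-up data. The feature IDs (asv1 to asv4), their taxonomy strings, and the `keep` function are all hypothetical; `keep` is a toy stand-in for substring filtering, not QIIME 2's actual implementation. The sketch assumes the likely cause is that filter-seqs was run with `--p-include p__` while filter-table was not, so the table can retain features that the re-classified taxonomy no longer contains.

```python
# Toy taxonomy: feature ID -> taxonomy string (all values are made up).
taxonomy = {
    "asv1": "d__Bacteria; p__Proteobacteria",
    "asv2": "d__Bacteria",                    # no phylum-level annotation
    "asv3": "d__Archaea; p__Crenarchaeota",
    "asv4": "Unassigned",
}


def keep(taxon, include=None, exclude=None):
    """Toy substring filter: keep a taxon if it matches any include term
    (when given) and none of the exclude terms (when given)."""
    if include and not any(term in taxon for term in include):
        return False
    if exclude and any(term in taxon for term in exclude):
        return False
    return True


exclude = ["d__Archaea", "d__Eukaryota"]

# The sequences were filtered with include AND exclude terms ...
seq_ids = {fid for fid, t in taxonomy.items()
           if keep(t, include=["p__"], exclude=exclude)}
# ... but the table was filtered with the exclude terms only.
table_ids = {fid for fid, t in taxonomy.items() if keep(t, exclude=exclude)}

# A taxonomy built from the filtered sequences only covers seq_ids,
# so these table features would have no taxonomy entry.
missing = sorted(table_ids - seq_ids)
print(missing)  # ['asv2', 'asv4']
```

In this toy case the table still holds asv2 and asv4, which never made it into the filtered sequences, and that is the kind of mismatch the error message reports. Pairing the filtered table with the original taxonomy avoids it because the original taxonomy covers every feature.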


We have fantastic amplicon tutorials with lots of examples you can try with your data!

Thank you so much for such a swift reply! I am a little confused about why this works. Wouldn't we want to use the filtered taxonomy for the bar plot? Why use the original taxonomy file? Also, when I check the bar plot, I still see d__Eukaryota;p__Diatomea represented. Is there a way to filter again, or did I miss something?

Hello Jahlen,

Thank you for your patience while I work through your post:

Yeah, that's not what's happening in this code! The feature-classifier is being run with 8 jobs in parallel, for one thing.

Here is how to filter out Archaea and Eukaryota and make a bar plot:

qiime taxa filter-table \
  --i-table table.qza \
  --i-taxonomy taxonomy.qza \
  --p-exclude "d__Archaea,d__Eukaryota" \
  --o-filtered-table table_filtered.qza

qiime taxa barplot \
  --i-table table_filtered.qza \
  --i-taxonomy taxonomy.qza \
  --o-visualization VIEWABLE_table_filtered_barplots.qzv

That uses the existing table.qza and taxonomy.qza files you mentioned in your post. It's as direct as possible.

Unless you want to run classify-sklearn another time to compare databases?
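
On the d__Eukaryota;p__Diatomea that still shows up in the plot: one thing that may be worth checking is the space before the closing quote in the `--p-exclude` value from the original post ("d__Archaea,d__Eukaryota "). The sketch below is only an illustration in plain Python. It assumes the search terms are split on commas and matched as literal substrings without trimming whitespace, which has not been confirmed against the q2-taxa source here, and the taxonomy string is a made-up SILVA-style example.

```python
def split_terms(query, delimiter=","):
    """Split a query string into search terms without trimming whitespace."""
    return query.split(delimiter)


def matches_any(taxon, terms):
    """True if any term occurs as a literal substring of the taxon string."""
    return any(term in taxon for term in terms)


taxon = "d__Eukaryota; p__Diatomea"  # hypothetical taxonomy string

with_space = split_terms("d__Archaea,d__Eukaryota ")
without_space = split_terms("d__Archaea,d__Eukaryota")

# "d__Eukaryota " (with a space) never occurs, because the name is
# followed by ";" in the taxon string.
print(matches_any(taxon, with_space))     # False
print(matches_any(taxon, without_space))  # True
```

If that turns out to be the cause, dropping the trailing space from the `--p-exclude` value and re-running filter-table should remove those features from the table.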

Thank you so much for all your help, I was finally able to make the bar plots! I simply wanted to filter out euks and archaea. A tutorial that was made in my lab had us make new filtered table, taxonomy, and rep-seqs files and then use those for all downstream analysis. I copied the code below. To my understanding, your approach uses the filtered table with the original taxonomy instead of the filtered taxonomy. Why is that? I am a little confused about why that works.

qiime taxa filter-seqs \
  --i-sequences rep-seqs.qza \
  --i-taxonomy taxonomy.qza \
  --p-exclude "d__Archaea,d__Eukaryota " \
  --p-include p__ \
  --o-filtered-sequences filt-rep-seqs.qza
  
qiime feature-classifier classify-sklearn \
  --i-classifier /path/silva-138.2-F04-R22-classifier.qza \
  --i-reads filt-rep-seqs.qza \
  --o-classification filt-taxonomy.qza

# use the absolute pathname of the trained database; this assigns taxonomy to ASVs (SILVA is better for microorganisms)

qiime metadata tabulate \
  --m-input-file filt-taxonomy.qza \
  --o-visualization VIEWABLE_filt-taxonomy.qzv

# makes viewable taxonomy file


qiime taxa filter-table \
  --i-table table.qza \
  --i-taxonomy taxonomy.qza \
  --p-exclude "d__Archaea,d__Eukaryota " \
  --o-filtered-table table_filtered.qza

qiime taxa barplot \
  --i-table table_filtered.qza \
  --i-taxonomy taxonomy.qza \
  --o-visualization VIEWABLE_table_filtered_barplots.qzv

This is my full code. Is making a filt-taxonomy file redundant?

Hi @mainly.microbe,
Good question. It basically comes down to trying to simplify data management for users.

We tend to think of the information in these other artifacts (FeatureData[Taxonomy], FeatureData[Sequence], Phylogeny) as annotations of the features in a feature table, similar to how we think of sample metadata. Since feature tables tend to get filtered in various ways (on both the sample and feature axes), allowing the ids represented in these other artifacts to be a superset of the ones represented in the feature table prevents the need for lots of filtered versions of the other artifacts. For example, you might have two or three different feature tables that you're working with, but if they all use the same FeatureData[Taxonomy] artifact, that's easier to keep track of, and the extra annotations in the FeatureData[Taxonomy] just get ignored. The same goes for sample metadata: there can be extra sample ids in there with respect to a feature table, but all sample ids represented in the feature table must be present in the metadata.
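
The superset rule described above can be sketched in a few lines of plain Python. This is only an illustration with made-up IDs; `check_annotations` is a hypothetical helper written for this example, not part of QIIME 2.

```python
def check_annotations(table_ids, annotation_ids, what="taxonomy"):
    """Raise if the table contains IDs that the annotation lacks.

    Extra IDs in the annotation are fine; they are returned so the
    caller can see what would simply be ignored."""
    missing = set(table_ids) - set(annotation_ids)
    if missing:
        raise ValueError(
            f"IDs found in the table are missing from the {what}: "
            f"{sorted(missing)}"
        )
    return set(annotation_ids) - set(table_ids)


# One taxonomy annotating every feature (made-up IDs).
taxonomy_ids = {"asv1", "asv2", "asv3", "asv4"}

# Several differently filtered tables can all share it.
table_no_archaea = {"asv1", "asv2", "asv4"}
table_abundant_only = {"asv1"}

ignored_a = check_annotations(table_no_archaea, taxonomy_ids)
ignored_b = check_annotations(table_abundant_only, taxonomy_ids)

# The reverse does not hold: a taxonomy narrower than the table fails.
narrow_taxonomy_ids = {"asv1"}
try:
    check_annotations(table_no_archaea, narrow_taxonomy_ids)
    failed = False
except ValueError as err:
    failed = True
    message = str(err)
    print(message)
```

The same check applies to sample metadata by passing sample ids instead of feature ids: extra rows in the metadata are harmless, while a sample in the table with no metadata row is an error.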
