Only show genus taxonomy on ancombc graphic?

Hi @colinvwood ,

Thanks a million!!! :grin: :grin: :grin: I had been trying to do this for a long time and was even writing a script of my own to filter the taxonomy levels to use as input, lol.

Now I have set the --p-significance-label parameter to "p_val", as well as --p-significance-threshold 0.05 and --p-level-delimiter ";". I also noticed I was passing --m-feature-ids-file genus.tsv as metadata, but it was not only unnecessary but also wrong for what I was trying to do. Jeez. :sweat_smile: :melting_face:

Let me also take the opportunity to ask: is it also possible to filter by a category? For example, if I have healthy and bipolar individuals, is it possible to generate graphics for only healthy, only bipolar, and both? Because the graphic already shows the statistically significant features regardless of which individuals they come from, right?

Hello @Liviacmg,

The qiime feature-table filter-seqs command should be able to do that.

Hi @colinvwood ,

Thank you again!!! :slightly_smiling_face: :slightly_smiling_face: :slightly_smiling_face:

Hi @colinvwood ,

I tried to use qiime feature-table filter-seqs with the metadata, like this:

qiime feature-table filter-seqs \
--i-data rep-seqs.qza \
--m-metadata-file metadado-validado.tsv \
--p-where "[Host_disease]='BipolarDisorder'" \
--o-filtered-data bipolar-rep-seqs.qza

But then I got this error:

Plugin error from feature-table:

All features were filtered out of the data.

Debug info has been saved to /tmp/qiime2-q2cli-err-jceyjop0.log

Traceback (most recent call last):
  File "/usr/local/miniconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/q2cli/commands.py", line 468, in __call__
    results = action(**arguments)
  File "", line 2, in filter_seqs
  File "/usr/local/miniconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/qiime2/sdk/action.py", line 274, in bound_callable
    outputs = self.callable_executor(
  File "/usr/local/miniconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/qiime2/sdk/action.py", line 509, in callable_executor
    output_views = self._callable(**view_args)
  File "/usr/local/miniconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/q2_feature_table/_filter.py", line 118, in filter_seqs
    raise ValueError('All features were filtered out of the data.')
ValueError: All features were filtered out of the data.

Then I saw that this post describes exactly the same problem I am facing:

I also tried removing --p-where, but the same error appeared.

I didn't understand the solution reported in the post, for example how this clean-table.qza was generated and what it contains:

qiime feature-table filter-seqs \
--i-data rep-seqs.qza \
--i-table clean-table.qza \
--o-filtered-data clean-rep-seqs.qza

How can I separate out the bipolar group, for example, to generate the ancombc graph with qiime composition da-barplot? I exported the ancombc results and counted in "p_val_slice.csv" that many genera passed p-value < 0.05 for bipolar, but the generated graphic only seems to show the healthy group.

Hello @Liviacmg,

If you feel comfortable doing so, can you attach your metadata file?

Hi @colinvwood ,

Of course, no problem:

metadado-validado.tsv (76.5 KB)

Hello @Liviacmg,

What are the rep-seqs from? DADA2? Could you attach this artifact so I can look at the provenance?

Hi @colinvwood ,

Yes, they are from DADA2. Here is the artifact:

rep-seqs.qza (201.5 KB)

Hello @Liviacmg,

The sequences output directly by DADA2 aren't really relevant here. Filtering your feature table the way you want with qiime feature-table filter-features, then passing the result to da-barplot via --m-feature-ids-file, should work.

Hi @colinvwood ,

Thank you so much again! Well, I tried using qiime taxa collapse (as I did before, to show the taxonomic classification at the genus level):

qiime taxa collapse \
--i-table table-dada2.qza \
--i-taxonomy taxonomy.qza \
--p-level 6 \
--o-collapsed-table genus.qza

Then I used that output as the input to qiime feature-table filter-features, as you said (in this case I filtered only for bipolar):

qiime feature-table filter-features \
--i-table genus.qza \
--m-metadata-file sample-metadata.tsv \
--p-where "[Host_disease]='BipolarDisorder'" \
--o-filtered-table feature-frequency-bipolar.qza

And then I tried to calculate ancombc:

qiime composition ancombc --i-table feature-frequency-bipolar.qza --m-metadata-file sample-metadata.tsv --p-formula Host_disease --o-differentials ancombcbipolar.qza

But then this error appeared:

Plugin error from composition:

('Value provided in reference_levels parameter not associated with any IDs in the feature table. Please make sure the value(s) selected in each column::value pair are associated with IDs present in the feature table. \n\n Value not associated with any IDs in the table: "BipolarDisorder"', ' IDs not found in table: "Index(['SRR7690036', 'SRR7690039', 'SRR7690040', 'SRR7690043', 'SRR7690044',\n 'SRR7690046', 'SRR7690047', 'SRR7690048', 'SRR7690049', 'SRR7690053',\n ...\n 'SRR7690201', 'SRR7690203', 'SRR7690204', 'SRR7690205', 'SRR7690206',\n 'SRR7690209', 'SRR7690210', 'SRR7690211', 'SRR7690212', 'SRR7690213'],\n dtype='object', name='SampleID', length=115)"')

Debug info has been saved to /tmp/qiime2-q2cli-err-ga1w7cqh.log

Debug info:

Traceback (most recent call last):
  File "/usr/local/miniconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/q2cli/commands.py", line 468, in __call__
    results = action(**arguments)
  File "", line 2, in ancombc
  File "/usr/local/miniconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/qiime2/sdk/action.py", line 274, in bound_callable
    outputs = self.callable_executor(
  File "/usr/local/miniconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/qiime2/sdk/action.py", line 509, in callable_executor
    output_views = self._callable(**view_args)
  File "/usr/local/miniconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/q2_composition/_ancombc.py", line 41, in ancombc
    return _ancombc(
  File "/usr/local/miniconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/q2_composition/_ancombc.py", line 207, in _ancombc
    raise ValueError('Value provided in reference_levels'
ValueError: ('Value provided in reference_levels parameter not associated with any IDs in the feature table. Please make sure the value(s) selected in each column::value pair are associated with IDs present in the feature table. \n\n Value not associated with any IDs in the table: "BipolarDisorder"', ' IDs not found in table: "Index(['SRR7690036', 'SRR7690039', 'SRR7690040', 'SRR7690043', 'SRR7690044',\n 'SRR7690046', 'SRR7690047', 'SRR7690048', 'SRR7690049', 'SRR7690053',\n ...\n 'SRR7690201', 'SRR7690203', 'SRR7690204', 'SRR7690205', 'SRR7690206',\n 'SRR7690209', 'SRR7690210', 'SRR7690211', 'SRR7690212', 'SRR7690213'],\n dtype='object', name='SampleID', length=115)"')

Am I doing something wrong?

Hello @Liviacmg,

Can you please explain again what you're trying to accomplish? My understanding is that you've performed differential abundance (DA) analysis between the bipolar disorder and healthy control groups. In what way exactly do you want to filter the barplot figure?

Hi @colinvwood ,

Yes, I am trying to see the results of ancombc for the healthy group and the bipolar group. I have successfully generated the ancombc calculations and graphic, but the output of da-barplot only shows the healthy group:

But the ancombc calculation itself shows that there are also many significant genera with p-value < 0.05 for the bipolar group (shown as Intercept):

That's why I don't know if I need to recalculate the ancombc for each group separately. I don't understand why the ancombc da-barplot is only showing the results for healthy control.

Hello @Liviacmg,

I think there's a fundamental misunderstanding here about what ancombc analysis does. Recalculating ancombc for each group separately does not make sense because the analysis is a comparison between your two groups. There is only one comparison, so what would it mean to calculate ancombc for each of the two groups? You could switch the reference to the other level, I suppose, but the results would not be meaningfully different; they would just have the opposite interpretation. I'm not sure how feature-table filtering is involved in any way.

The p-values assigned to the intercept have a different interpretation than the p-values assigned to the non-reference level. I don't believe they are very informative in this context. If you want to learn more about this, I would recommend doing some searching online or in a stats textbook.
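
As a rough illustration of why the intercept and the group coefficient answer different questions, here is a minimal sketch using an ordinary two-group regression. It is only an analogy, not ANCOM-BC itself (which adds bias correction and other steps), and the log-abundance numbers are made up for the example.

```python
from statistics import mean

# Hypothetical log-abundances of one genus (illustrative numbers only).
reference_group = [5.1, 4.8, 5.3, 5.0]  # samples in the reference level
other_group = [6.2, 6.0, 6.5, 6.1]      # samples in the non-reference level

# Dummy-code the group column: 0 = reference level, 1 = other level.
x = [0] * len(reference_group) + [1] * len(other_group)
y = reference_group + other_group

# Ordinary least squares fit of y = intercept + slope * x.
x_bar, y_bar = mean(x), mean(y)
slope = (
    sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    / sum((xi - x_bar) ** 2 for xi in x)
)
intercept = y_bar - slope * x_bar

# The intercept is the reference group's own mean; the slope is the
# difference between the two groups.
print("intercept:", intercept, "reference mean:", mean(reference_group))
print("slope:", slope, "difference of means:",
      mean(other_group) - mean(reference_group))
```

In this simplified setting, a p-value on the intercept only asks whether the reference group's mean log-abundance differs from zero, while a p-value on the slope asks whether the two groups differ from each other. Only the second question is the differential abundance comparison.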

Hi @colinvwood ,

Thank you so much again!!!

According to the article linked from the QIIME 2 ancombc page (Analysis of compositions of microbiomes with bias correction | Nature Communications, doi:10.1038/s41467-020-17041-7), "One of the deficiencies of ANCOM is that it does not provide p value for individual taxon". So, is ANCOM-BC like an "upgrade" of ANCOM that, through a bootstrap procedure to determine the statistical significance of the W-statistic for each taxon, can assign a p-value to each taxon?

Anyway, I think the solution to my problem was just a matter of perspective:

When you compare two groups, such as bipolar vs. healthy, and you observe that a particular taxon is "increased" in the healthy group, it implicitly means that the same taxon is "decreased" in the bipolar group, and vice versa. In other words, saying a taxon is increased in one group inherently provides information about its status in the other group when only two groups are being compared. So it is not that the bipolar group was being left out; it was considered. What is happening is that the healthy group is being treated as the "reference" for the taxa represented.

So, given the image below, does it mean that all genera enriched in the healthy control group (e.g. g__Faecalibacterium, g__ER4, g__CAG-41...) are depleted in the bipolar group? And that all genera depleted in the healthy control group are enriched in the bipolar group? Does my interpretation make sense?

Hello @Liviacmg,

So, given the image below, does it mean that all genera enriched in the healthy control group (e.g. g__Faecalibacterium, g__ER4, g__CAG-41...) are depleted in the bipolar group? And that all genera depleted in the healthy control group are enriched in the bipolar group? Does my interpretation make sense?

Yes, it's relative.
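
To make "relative" concrete, here is a minimal sketch with made-up mean abundances for a single genus. The numbers and variable names are assumptions for illustration and are not taken from the actual ancombc output; the point is only that swapping which group is the reference flips the sign of the log fold change without changing its size.

```python
import math

# Hypothetical mean abundances of one genus (illustrative numbers only).
mean_healthy = 240.0
mean_bipolar = 60.0


def log_fold_change(group_mean, reference_mean):
    """Natural-log fold change of a group relative to a reference group."""
    return math.log(group_mean / reference_mean)


# The same comparison, expressed with each group as the reference in turn.
lfc_healthy_vs_bipolar = log_fold_change(mean_healthy, mean_bipolar)
lfc_bipolar_vs_healthy = log_fold_change(mean_bipolar, mean_healthy)

print("healthy relative to bipolar:", lfc_healthy_vs_bipolar)
print("bipolar relative to healthy:", lfc_bipolar_vs_healthy)
```

A genus drawn as "enriched" for one group is therefore the same result as "depleted" for the other; there is still only one comparison.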

Hi @colinvwood ,

Thank you for all your support!

Hi @colinvwood ,

Sorry for bothering you again :sweat_smile: :sweat_smile:, but do you know if there is a way to extract only the y-axis labels of the da-barplot that I uploaded above, e.g. as a .tsv file?
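
One possible approach, sketched under assumptions: since the barplot shows the features that pass the significance threshold, the same list could be rebuilt from the exported "p_val_slice.csv". The column names used here ("id", "(Intercept)", "Host_diseaseHealthyControl"), the p-values, and the "feature-id" header are assumptions for illustration, so check them against the real exported file. Note also that da-barplot can apply an effect-size threshold as well, which this sketch does not handle.

```python
import csv
import io

# Stand-in for the exported p_val_slice.csv. Column names and p-values are
# assumptions for illustration; replace with the contents of the real file.
P_VAL_CSV = """id,(Intercept),Host_diseaseHealthyControl
d__Bacteria;g__Faecalibacterium,0.001,0.003
d__Bacteria;g__ER4,0.200,0.010
d__Bacteria;g__CAG-41,0.004,0.700
"""


def significant_feature_ids(csv_text, column, threshold=0.05):
    """Return the IDs whose p-value in `column` is below `threshold`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["id"] for row in reader if float(row[column]) < threshold]


def write_tsv(feature_ids, handle):
    """Write one feature ID per line under a single header column."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["feature-id"])
    for feature_id in feature_ids:
        writer.writerow([feature_id])


ids = significant_feature_ids(P_VAL_CSV, "Host_diseaseHealthyControl")
buffer = io.StringIO()  # swap for open("features.tsv", "w", newline="") to save
write_tsv(ids, buffer)
print(buffer.getvalue())
```

The filter reads the non-intercept column because that is the one describing the comparison between the two groups.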