Differential abundance analysis

Dear all,

I performed taxonomy classification and taxonomic analyses using hitdb as the reference. I then ran a differential abundance analysis using ANCOM and received this message:
Plugin error from composition:

Unable to allocate 82.0 TiB for an array with shape (3357899, 3357899) and data type float64

Debug info has been saved to /tmp/qiime2-q2cli-err-1fcfq_sh.log

Thanks

Hi @M_F!

I'm not really sure what you're trying to do here. Can you please give us more details? Also, filenames don't really mean anything in QIIME 2, since you can name a file anything you want, so please keep that in mind when elaborating on your description. Thanks!

Hello @thermokarst

Thanks for the response. Sorry, my original question was unclear. I performed a differential abundance analysis using ANCOM and received this message:
Plugin error from composition:

Unable to allocate 82.0 TiB for an array with shape (3357899, 3357899) and data type float64

Debug info has been saved to /tmp/qiime2-q2cli-err-1fcfq_sh.log

What can I do? Thanks for your help.

Thanks for clarifying, @M_F.

The key is in the error message itself:

You have a huge number of features in your FeatureTable: 3,357,899. ANCOM is crashing because it builds a features-by-features matrix, so the calculation needs roughly 3,357,899 × 3,357,899 × 8 bytes, which is about 82.0 TiB of memory. That kind of memory is basically impossible to come by on a single machine:

2 TB of RAM on a single computer is about as good as it gets right now, so realistically there is no way for you to perform ANCOM on 3.3 million features.
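As an aside, you can check how many features are in any table before running ANCOM by summarizing it. A minimal sketch, assuming your table is named table.qza:

qiime feature-table summarize \
  --i-table table.qza \
  --o-visualization table-summary.qzv

The resulting visualization reports the number of features and their frequency distribution, which is handy for judging how much filtering is still needed.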

Keep us posted! :t_rex:


@thermokarst

Thank you so much for the response.
I ran the job sample by sample. I tried filtering and collapsing the table, and that job completed successfully. However, I now have a problem running the ANCOM test with the --m-metadata-column option:
qiime composition ancom \
  --i-table comp-gut-table-l6.qza \
  --m-metadata-file sample-metadata.tsv \
  --m-metadata-column Disease \
  --o-visualization l6-ancom-subject.qzv
When I run this command, I receive the following message:
Plugin error from composition:

All values in grouping are unique. This method cannot operate on a grouping vector with only unique values (e.g., there are no ‘within’ variance because each group of samples contains only a single sample).

Debug info has been saved to /tmp/qiime2-q2cli-err-axqj1bma.log

In the table I have a column called Disease (Healthy vs. Affected), and I would like the ANCOM test to be performed for SAM.1, since I ran the job for each sample separately.
What can I do to run the ANCOM test on the Disease column using only the data specific to SAM.1 (sample 1, in the first row)?
Thanks a lot

Hi @M_F.

You can't run ancom on a single sample, because then there will only be a single possible value for the metadata group. Make sense? That is what the error message is telling you:

The "grouping" in this case is your Disease column. If there is only one sample, then there can only be one value, which means that all the values in that column are unique.

Dear @thermokarst

Thank you for the message. I filtered the table and performed the differential abundance analysis using the following commands:

  1. qiime feature-table filter-samples --i-table table.qza --m-metadata-file samples-to-keep.tsv --o-filtered-table H-AFF-filtered-table.qza

  2. qiime composition add-pseudocount --i-table H-AFF-filtered-table.qza --o-composition-table H-AFF-comp-table.qza

  3. qiime composition ancom --i-table H-AFF-comp-table.qza --m-metadata-file samples-to-keep.tsv --m-metadata-column Disease --o-visualization H-AFF-ancom-Disease.qzv

I received this message:

Plugin error from composition:

Unable to allocate 43.5 TiB for an array with shape (2445046, 2445046) and data type float64

Debug info has been saved to /tmp/qiime2-q2cli-err-ntm81n4x.log

How can I analyze this huge amount of data? I filtered the table, but the problem remains. Do I really need all of this information?

Thanks

Hi @M_F!

You will need to filter out a lot more than half the data. Like I said, 2 TB of RAM is about as good as it gets these days, so your target is < 2 TB. Since the matrix is features × features × 8 bytes, that works out to roughly 500,000 features at the absolute most, and in practice you should aim much lower.

Taking a step back, I am pretty skeptical that you will be able to discern any kind of meaningful signal out of a dataset this large. Surely you can reduce the scope somehow, right?
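For example, dropping rare features and collapsing to a higher taxonomic level usually shrinks the feature count by orders of magnitude. A rough sketch, assuming you have a taxonomy.qza from your hitdb classification (the thresholds here are illustrative, not recommendations):

qiime feature-table filter-features \
  --i-table H-AFF-filtered-table.qza \
  --p-min-frequency 10 \
  --p-min-samples 2 \
  --o-filtered-table H-AFF-feature-filtered-table.qza

qiime taxa collapse \
  --i-table H-AFF-feature-filtered-table.qza \
  --i-taxonomy taxonomy.qza \
  --p-level 6 \
  --o-collapsed-table H-AFF-table-l6.qza

Then run add-pseudocount and ancom on the collapsed table, as before.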
