Encountering unknown error in ancombc

Hello fellow qiime2 users,
I've been trying to identify differentially abundant microbial populations in diabetic patients, but I keep running into obstacles that I can't resolve on my own as a novice metagenomic analyst, so I need your help. I'm attaching all the files I used and the errors I got below. For the analysis, I merged my feature and metadata tables, filtered out mitochondrial and chloroplast sequences, and am now trying to use the final FeatureTable[Frequency] with qiime composition ancombc to generate a barplot. I was successful before using the following code:

qiime composition ancombc \
  --i-table filtered_final_table_old.qza \
  --m-metadata-file metadata_final.tsv \
  --p-formula Diabetes_typ \
  --o-differentials differentials.qza \
  --verbose

qiime composition da-barplot \
  --i-data differentials.qza \
  --o-visualization DA_diffbar.qzv

files:
final_filtered_table(old).qza (3.1 MB)
DA_diffbar.qzv (850.4 KB)
differentials.qza (736.5 KB)
metadata_final.tsv (708.0 KB)

However, the barplot I produced didn't show taxon names and was too large to be useful, so I tried to work out how to get past that. I collapsed my table to the phylum and genus levels, but the tables were now in FeatureTable[Composition] format, so they couldn't be used with the plugin/code I used before.
merged_table_genus.qza (673.9 KB)
merged_table_phylum.qza (372.5 KB)

However, now that I'm trying to run the same code I ran before (after a while), I'm getting errors no matter which file I use: filtered or unfiltered feature tables, and with different metadata.



I've also tried other commands that I saw on the forum, such as:
filtered_combo_table.qza (3.1 MB)
combined_deblur_table.qza (2.9 MB)

qiime ancombc ancombc --i-table filtered_combo_table.qza --m-metadata-file metadata_final.tsv --p-formula "Diabetes_typ" --o-differentials da_test.qza --verbose

I also checked conda list, as one post suggests, and found something that may be unusual:

With my deadlines approaching, I'm a bit lost, so I'd be very grateful if someone could help me sort this out. I searched for a step-by-step tutorial for this analysis but couldn't find one in the documentation or on the forum, so a pointer to proper documentation or a tutorial would also help.
Thanks in advance.

UPDATE:
I updated my QIIME 2 environment and was able to resolve the error above by using a collapsed table for the DA test:
DA_gen.qzv (720.1 KB)

However, there were too many samples, so I tried filtering the table using the following code:

qiime feature-table filter-features-conditionally \
  --i-table filtered_combo_table.qza \
  --p-abundance 0.01 \
  --p-prevalence 0.30 \
  --o-filtered-table filtered_table_30.qza

and then collapsed this table:

qiime taxa collapse \
  --i-table filtered_table_30.qza \
  --i-taxonomy combo_taxonomy.qza \
  --p-level 6 \
  --o-collapsed-table col_tab_genus_30.qza

However, now that I'm trying to use this table, there's another error:

I checked the documentation and added a parameter to filter out missing values, if there were any:

qiime composition ancombc \
  --i-table col_tab_genus_30.qza \
  --m-metadata-file metadata_final.tsv \
  --p-formula Diabetes_typ \
  --p-no-filter-missing TRUE \
  --o-differentials gen_diff_30.qza \
  --verbose
but the parameter wasn't accepted for some reason.

Can someone help me with how to shorten the list of features, or why my filtered table isn't working?

Hi @Infection_Biology,

This looks like you filtered all the samples out. Can you summarize col_tab_genus_30.qza and see whether it is empty, as I am hypothesizing?

Yes, you're absolutely correct.

Does that mean I don't have any samples with a minimum 30% prevalence?

However, when I run this code instead, using a manually computed sample count equal to 30% of my samples, it does produce a viable feature table.

qiime feature-table filter-features \
  --i-table filtered_combo_table.qza \
  --p-min-samples 204 \
  --p-min-frequency 50 \
  --o-filtered-table filtered_table_30.qza

qiime feature-table summarize \
  --i-table col_tab_genus_30_freq50.qza \
  --m-sample-metadata-file metadata_final.tsv \
  --o-visualization col_tab_genus_30_freq50.qzv

Pardon my lack of knowledge. Could you kindly point out the differences here?

Hi @Infection_Biology,

I would think that means you don't have any features (ASVs/OTUs) that are present in 30% or more of your samples, or you don't have any features that are above 1% abundance.
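
As a rough illustration of why the two filtering commands in this thread can give such different results, here is a minimal Python sketch run on a made-up table. It is not the QIIME 2 implementation: the function names, feature names, and counts are all hypothetical, and the logic only follows the documented behaviour as I understand it (filter-features-conditionally keeps a feature if its relative abundance is at least --p-abundance in at least --p-prevalence of the samples, whereas filter-features with --p-min-samples and --p-min-frequency only asks how many samples contain the feature at all and what its total count is).

```python
import math

# Hypothetical toy table: feature -> raw counts in each of 10 samples.
counts = {
    "dominant": [1000] * 10,          # abundant everywhere
    "widespread_rare": [5] * 10,      # in every sample, but always below 1%
    "sporadic": [400] + [0] * 9,      # abundant, but in one sample only
}


def filter_conditionally(table, abundance, prevalence):
    """Keep features whose relative abundance is >= `abundance`
    in at least a `prevalence` fraction of samples (assumed semantics)."""
    n_samples = len(next(iter(table.values())))
    sample_totals = [sum(column) for column in zip(*table.values())]
    kept = []
    for feature, row in table.items():
        hits = sum(
            1 for count, total in zip(row, sample_totals)
            if total > 0 and count / total >= abundance
        )
        if hits / n_samples >= prevalence:
            kept.append(feature)
    return kept


def filter_by_counts(table, min_samples, min_frequency):
    """Keep features observed (count > 0) in at least `min_samples` samples
    and with a total count of at least `min_frequency` (assumed semantics)."""
    kept = []
    for feature, row in table.items():
        observed_in = sum(1 for count in row if count > 0)
        if observed_in >= min_samples and sum(row) >= min_frequency:
            kept.append(feature)
    return kept


n_samples = 10
# 30% of the samples, rounded up, as in the manual calculation.
min_samples = math.ceil(n_samples * 30 / 100)

conditional = filter_conditionally(counts, abundance=0.01, prevalence=0.30)
by_counts = filter_by_counts(counts, min_samples=min_samples, min_frequency=50)

print("conditional filter keeps:", conditional)
print("count-based filter keeps:", by_counts)
```

In this toy table the feature that is present in every sample but always under 1% relative abundance survives the count-based filter and is dropped by the conditional one, which is the kind of gap that could explain an empty table from one command and a usable table from the other.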

I would try re-running the filter-features-conditionally command without the --p-abundance parameter. Let's see if that result is more similar to your manual attempt at filtering to 30% prevalence.
