Differential abundance testing with ANCOM-BC

I tried differential abundance testing with ANCOM-BC, following the QIIME 2 tutorial (version 2024.10).

This is my code:
qiime composition ancombc \
  --i-table gut-table.qza \
  --m-metadata-file manifest.tsv \
  --p-formula 'subject' \
  --o-differentials ancombc-subject.qza

I got an error. The detailed output is shown below:

Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.

Command: run_ancombc.R --inp_abundances_path /nfsshare/devika/mytmpdir/tmpo5ca91_r/input.biom.tsv --inp_metadata_path /nfsshare/devika/mytmpdir/tmpo5ca91_r/input.map.txt --md_column_types $

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.3     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.0
✔ purrr     1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()

I used this command to view the error log: cat /nfsshare/devika/mytmpdir/qiime2-q2cli-err-3kr63o5l.log
The detailed error is shown below:

Can anyone please suggest a solution for this error?

Hello @Gaaviya,

The command is failing at the step where the prevalence cutoff is applied to your data. The default prevalence cutoff is 0.1, meaning that taxa present in fewer than 10% of the samples of interest are removed from the dataset. Once this filter has been applied, no taxa are left in your dataset, which is what triggers the error.
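
To make the prevalence filter concrete, here is a minimal shell sketch of the idea. It is not the code ANCOM-BC runs internally; the toy table, the taxon names, and the `prv_cut` variable are all made up for illustration, and the table is whitespace-delimited rather than a real feature table export.

```shell
#!/bin/sh
# Hypothetical toy table: rows are taxa, columns are 12 samples.
# This is NOT the poster's data, just an illustration.
table=$(mktemp)
cat > "$table" <<'EOF'
taxon s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12
taxA 5 0 0 0 0 0 0 0 0 0 0 0
taxB 3 1 0 4 0 2 0 7 0 1 0 0
taxC 0 0 0 0 0 0 0 0 0 0 0 0
EOF

# Assumed cutoff, mirroring the default of 0.1 described above.
prv_cut=0.1

# Prevalence = fraction of samples in which the taxon has a nonzero count.
# Keep a taxon only if its prevalence is at least the cutoff.
kept=$(awk -v cut="$prv_cut" '
  NR == 1 { next }                         # skip the header row
  {
    present = 0
    for (i = 2; i <= NF; i++) if ($i > 0) present++
    prevalence = present / (NF - 1)
    if (prevalence >= cut) print $1
  }' "$table")

echo "kept taxa: $kept"
rm -f "$table"
```

In this toy table only taxB survives: taxA is present in 1 of 12 samples (about 8%) and taxC in none. If every taxon in a real table looks like taxA or taxC, nothing is left after filtering.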

The 10% threshold is not high at all, so it might be the case that you simply don't have many (or possibly any) taxa in the samples of interest.

The first thing I would do is check the metadata to see how the "subject" variable is partitioning your data. Subject often stores unique identifiers for participants in a study. Are you sure this is the most interesting/correct variable to use in your model?
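
One quick way to see how a column partitions the samples is to count samples per level. The sketch below uses a made-up, whitespace-delimited metadata file (the sample IDs and subject labels are hypothetical); for a real tab-separated metadata file you would likely need to add `-F'\t'` to the `awk` call and point it at your own file.

```shell
#!/bin/sh
# Hypothetical toy metadata, not the poster's manifest.tsv.
meta=$(mktemp)
cat > "$meta" <<'EOF'
sample-id subject
s1 subject-1
s2 subject-1
s3 subject-2
s4 subject-3
EOF

# Find the column named "subject" in the header, then count rows per level.
counts=$(awk -v col="subject" '
  NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) c = i; next }
  { n[$c]++ }
  END { for (k in n) print k, n[k] }' "$meta" | sort)

echo "$counts"
rm -f "$meta"
```

If most levels contain only one or two samples, as is typical for a per-participant identifier, the column is probably not a useful grouping variable for the formula.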

The second thing I would do is try lowering the prevalence cutoff to 0 (the `--p-prv-cut` parameter) to see if this resolves things. If it does, though, I would ask myself how useful the results are for taxa that are prevalent in less than 10% of my samples.