ANCOM error - tuple index out of range

Hi all.

I am trying to run an ANCOM analysis on my samples, which consist of 3 different groups. I followed the Moving Pictures tutorial and ran the following command:

qiime composition ancom \
--i-table qiime-result/ancom-result/pseudo-collapse-table.qza \
--m-metadata-file metadata.tsv \
--m-metadata-column group \
--o-visualization qiime-result/ancom-result/ancom-group.qzv

But I got this error:
tuple index out of range

Can anyone help me figure out what is wrong with my code?

Hi @afrinaad,
Your code looks OK to me. Assuming this is regular feature-table count data, my guess is that the issue is with your metadata column ‘group’. Have you validated your metadata file? Can you describe what this column contains? It should contain only your grouping categories, without any unusual symbols or characters.

I have validated my metadata file and everything is good.
Here is what my metadata.tsv looks like.

|sample-id|filepath-forward|filepath-reverse|group|

|W1|/home/11052/W1_R1.fastq.gz|/home/11052/W1_R2.fastq.gz|W|
|W2|/home/11052/W2_R1.fastq.gz|/home/11052/W2_R2.fastq.gz|W|
|W3|/home/11052/W3_R1.fastq.gz|/home/11052/W3_R2.fastq.gz|W|
|H1|/home/11052/H1_R1.fastq.gz|/home/11052/H1_R2.fastq.gz|R|
|H2|/home/11052/H2_R1.fastq.gz|/home/11052/H2_R2.fastq.gz|R|
|H3|/home/11052/H3_R1.fastq.gz|/home/11052/H3_R2.fastq.gz|R|
|HIN1|/home/11052/HIN1_R1.fastq.gz|/home/11052/HIN1_R2.fastq.gz|HIN|
|HIN2|/home/11052/HIN2_R1.fastq.gz|/home/11052/HIN2_R2.fastq.gz|HIN|
|HIN3|/home/11052/HIN3_R1.fastq.gz|/home/11052/HIN3_R2.fastq.gz|HIN|

Hi @afrinaad,
The metadata file looks OK, assuming the | characters you showed are not actually in your file (I doubt they are; otherwise Keemei would have flagged them).
Could you please add the --verbose flag to your command and copy and paste the full error message you get? Also, would you mind sharing the pseudo-collapse-table.qza file? You can DM it if you prefer.

The | characters were added when I pasted the table here, so they are not part of my metadata file.

Below is the full error message:

qiime composition ancom \
> --i-table qiime-result/ancom-result/pseudo-collapse-table.qza \
> --m-metadata-file metadata.tsv \
> --m-metadata-column group \
> --o-visualization qiime-result/ancom-result/ancom-group.qzv --verbose

Traceback (most recent call last):
  File "/home/customercare/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/q2cli/commands.py", line 328, in __call__
    results = action(**arguments)
  File "</home/customercare/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/decorator.py:decorator-gen-474>", line 2, in ancom
  File "/home/customercare/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/qiime2/sdk/action.py", line 240, in bound_callable
    output_types, provenance)
  File "/home/customercare/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/qiime2/sdk/action.py", line 445, in _callable_executor_
    ret_val = self._callable(output_dir=temp_dir, **view_args)
  File "/home/customercare/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/q2_composition/_ancom.py", line 75, in ancom
    transform_function, axis=1, result_type='broadcast')
  File "/home/customercare/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/pandas/core/frame.py", line 6928, in apply
    return op.get_result()
  File "/home/customercare/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/pandas/core/apply.py", line 176, in get_result
    return self.apply_broadcast()
  File "/home/customercare/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/pandas/core/apply.py", line 393, in apply_broadcast
    result = super().apply_broadcast(self.obj.T)
  File "/home/customercare/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/pandas/core/apply.py", line 241, in apply_broadcast
    res = self.f(target[col])
  File "/home/customercare/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/skbio/stats/composition.py", line 465, in clr
    gm = lmat.mean(axis=-1, keepdims=True)
  File "/home/customercare/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/numpy/core/_methods.py", line 138, in _mean
    rcount = _count_reduce_items(arr, axis)
  File "/home/customercare/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/numpy/core/_methods.py", line 57, in _count_reduce_items
    items *= arr.shape[ax]
IndexError: tuple index out of range

Hi @afrinaad,
Thanks for the update. The error message doesn’t give me any obvious clues, but I noticed in your table’s provenance that you go through 2 filtering steps, each removing any feature with a frequency below 32985. You do this once at the ASV level, then collapse to the genus level and filter again. Was this intentional? In a typical dataset this kind of filtering removes almost all of your features, leaving a nearly empty feature table, and that may be what is causing the error. Could you visualize this table to check that it still contains features? My guess is that you meant to remove any samples that have fewer than 32985 reads instead.
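
To illustrate the distinction between filtering features and filtering samples, here is a minimal Python sketch on a made-up toy table. The counts, IDs, and the threshold of 1000 are all hypothetical and are not taken from the poster's data. The sketch does not call QIIME 2; it only mimics the idea of applying one minimum-frequency cutoff to feature totals versus sample totals.

```python
# Toy feature table: {feature_id: {sample_id: count}}. All values are invented.
table = {
    "ASV_a": {"S1": 900, "S2": 850, "S3": 10},
    "ASV_b": {"S1": 60, "S2": 90, "S3": 5},
    "ASV_c": {"S1": 40, "S2": 60, "S3": 5},
}
THRESHOLD = 1000  # hypothetical cutoff, standing in for a large value like 32985


def feature_totals(tbl):
    """Total count of each feature summed across all samples."""
    return {feature: sum(counts.values()) for feature, counts in tbl.items()}


def sample_totals(tbl):
    """Total count (read depth) of each sample summed across all features."""
    totals = {}
    for counts in tbl.values():
        for sample, n in counts.items():
            totals[sample] = totals.get(sample, 0) + n
    return totals


# A feature-level cutoff keeps only features whose total reaches the threshold.
kept_features = [f for f, t in feature_totals(table).items() if t >= THRESHOLD]

# A sample-level cutoff keeps all features but drops shallow samples.
kept_samples = [s for s, t in sample_totals(table).items() if t >= THRESHOLD]

print("features kept:", kept_features)
print("samples kept:", kept_samples)
```

In this toy case the feature-level cutoff leaves a single feature, while the sample-level cutoff leaves every feature in place and only drops the one shallow sample.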

Hi @Mehrbod_Estaki,

Thank you for the prompt reply! I really appreciate it.

I didn't realize that I did the filtering twice. Thank you for pointing it out.

I summarized the filter+collapse+filter table (not the pseudocount table, as it is of type FeatureTable[Composition]), and there are features available for all samples.
filtered-collapsed.qzv (419.3 KB)

I reran the workflow in the following order:
Collapse taxa > filter low abundance > add pseudocount > ANCOM = success
ancom-group.qzv (400.3 KB)

Hi @afrinaad,
Even though ANCOM finished running, this is not a correct use of ANCOM and you should not trust the results you see there. The problem wasn’t that you were filtering twice; it was that you were filtering nearly everything out of your table.
As I mentioned, the issue is that you are removing any feature that doesn’t occur at least 32985 times, which doesn’t make sense here. In fact, if you go to the “feature-detail” tab of the filtered-collapsed.qzv visualization, you’ll see that only a single taxon is left in your table. This is why ANCOM failed to run: there was nothing to compare that feature with. In the second visualization there are only 3 features in your table, so again the results are not reliable.
My recommendation is:
a) Don’t collapse your table; just use the feature table at the ASV level.
b) Filter out any samples with fewer than 2,000 reads.
c) Filter out any features that don’t occur at least 50-100 times.
d) Filter out any features that don’t occur in at least 25% of your samples (you’ll have to calculate the number manually, as the plugin doesn’t take percentages).
Then run ANCOM on this table.
Hope this helps!
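
Step (d) needs a manual calculation. As a rough sketch, assuming all 9 samples listed in the metadata above survive the read-depth filter in step (b), the 25% cutoff can be converted to a whole number of samples as follows. The helper name `min_samples_for_fraction` is made up for illustration, and rounding up is one reasonable choice rather than something specified in this thread.

```python
import math


def min_samples_for_fraction(n_samples, fraction):
    """Smallest whole number of samples that is at least `fraction` of n_samples."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return math.ceil(n_samples * fraction)


# Assumption: 9 samples (W1-W3, H1-H3, HIN1-HIN3), as in the metadata shown above.
n_samples = 9
cutoff = min_samples_for_fraction(n_samples, 0.25)
print(cutoff)
```

For 9 samples this gives 3. That integer is the kind of value to pass to the feature filter's minimum-samples option (`--p-min-samples` in `qiime feature-table filter-features`). Recompute it if step (b) drops any samples.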
