feature-table summarize plugin error and metadata length-of-index mismatch

Hi, for a little context: I'm working with 16S Nanopore data. I already did preprocessing, filtering and clustering with a previous pipeline, PRONAME (GitHub - benn888/PRONAME: PRONAME is an open-source bioinformatics pipeline that allows processing and significantly increasing the accuracy of Nanopore metabarcoding sequencing data.). The results were rep_table.qza (a feature table), rep_seqs.fasta and rep_seqs.qza, all exported from the pipeline.

However, while continuing downstream processing in QIIME 2 (amp-2024.10), I first generated phylogenetic trees (rooted and unrooted). Just before running core diversity metrics, I ran into some errors about the metadata, so I decided to visualise rep_table.qza by exporting it, to check whether the sample IDs in the metadata match it, and then I ran into these problems. I'm sharing the log here:

```
qiime feature-table summarize \
  --i-table rep_table.qza \
  --o-visualization rep_table_summary.qzv
/home/joecliff/anaconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/numpy/core/getlimits.py:542: UserWarning: Signature b'\x00\xd0\xcc\xcc\xcc\xcc\xcc\xcc\xfb\xbf\x00\x00\x00\x00\x00\x00' for <class 'numpy.longdouble'> does not match any known type: falling back to type probe function.
This warnings indicates broken support for the dtype!
  machar = _get_machar(dtype)
Plugin error from feature-table:

  Length of values (0) does not match length of index (20)

cat /tmp/qiime2-q2cli-err-xl65v5zz.log

Traceback (most recent call last):
  File "/home/joecliff/anaconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2cli/commands.py", line 530, in __call__
    results = self._execute_action(
  File "/home/joecliff/anaconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2cli/commands.py", line 602, in _execute_action
    results = action(**arguments)
  File "<decorator-gen-393>", line 2, in summarize
  File "/home/joecliff/anaconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/qiime2/sdk/action.py", line 299, in bound_callable
    outputs = self._callable_executor_(
  File "/home/joecliff/anaconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/qiime2/sdk/action.py", line 619, in _callable_executor_
    ret_val = self._callable(output_dir=temp_dir, **view_args)
  File "/home/joecliff/anaconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2_feature_table/_summarize/_visualizer.py", line 100, in summarize
    sample_summary, sample_frequencies = _frequency_summary(
  File "/home/joecliff/anaconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2_feature_table/_summarize/_visualizer.py", line 367, in _frequency_summary
    frequencies = _frequencies(table, axis=axis)
  File "/home/joecliff/anaconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2_feature_table/_summarize/_visualizer.py", line 363, in _frequencies
    return pd.Series(data=table.sum(axis=axis), index=table.ids(axis=axis))
  File "/home/joecliff/anaconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/pandas/core/series.py", line 575, in __init__
    com.require_length_match(data, index)
  File "/home/joecliff/anaconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/pandas/core/common.py", line 573, in require_length_match
    raise ValueError(
ValueError: Length of values (0) does not match length of index (20)
```

[rep_table.qza|attachment](upload://779U86Arrlg2LAGYB7DioCyXOz5.qza) (7.8 KB)
[sample_metadata.tsv|attachment](upload://mkZI37wRFnzNk6KDoIqMPNSBzXp.tsv) (314 Bytes)
[rep_seqs.qza|attachment](upload://1aQ9Jd1hk80Ty0ZBoDxWiBKC1Jo.qza) (7.1 MB)
[sample_metadata_updated.tsv|attachment](upload://wCoU4n7KwccpeD5FLNTCMIy0VCZ.tsv) (541 Bytes)
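
On the metadata side of the question, here is a minimal Python sketch (standard library only, not part of QIIME 2 or PRONAME) for comparing the sample IDs in a metadata TSV against the sample IDs of a feature table. It assumes the first non-blank line is the header, that the first column holds the IDs, and that rows starting with `#` (such as a `#q2:types` directive) are not samples. The function names and the `barcode01`-style IDs are made up for illustration; the table IDs would have to come from the real table.

```python
import csv
import io


def metadata_sample_ids(tsv_text):
    """Return the IDs found in the first column of a metadata TSV (simplified parsing)."""
    ids = []
    header_seen = False
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row or not row[0].strip():
            continue  # skip blank lines
        first = row[0].strip()
        if not header_seen:
            header_seen = True  # assumption: first non-blank row is the header
            continue
        if first.startswith("#"):
            continue  # directive or comment row, not a sample
        ids.append(first)
    return ids


def compare_ids(metadata_ids, table_ids):
    """Report which IDs are shared and which appear on only one side."""
    meta, table = set(metadata_ids), set(table_ids)
    return {
        "only_in_metadata": sorted(meta - table),
        "only_in_table": sorted(table - meta),
        "shared": sorted(meta & table),
    }


# Hypothetical data, for illustration only.
example_tsv = (
    "sample-id\tgroup\n"
    "#q2:types\tcategorical\n"
    "barcode01\tcontrol\n"
    "barcode02\ttreated\n"
)
report = compare_ids(metadata_sample_ids(example_tsv), ["barcode01", "barcode03"])
print(report)
```

Any ID listed under `only_in_table` or `only_in_metadata` points to a naming mismatch between the two files.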

Hello @Joecliff, I'm not familiar with PRONAME, but it looks like you may have somehow ended up with an empty feature table. Can you DM me your feature table so I can look at it?

Additionally, what do you mean by "exporting"? What did you export exactly?
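
For a quick look at what an archive actually contains before exporting it, a rough sketch like the following may help. It relies only on the fact that a `.qza` file is a zip archive with a top-level UUID directory holding `metadata.yaml` and a `data/` folder. The function name `describe_qza` is hypothetical, and the YAML is read with a naive `key: value` split instead of a real YAML parser.

```python
import zipfile


def describe_qza(path):
    """Summarize a QIIME 2 archive: semantic type, format, and data file sizes."""
    info = {"type": None, "format": None, "data_files": {}}
    with zipfile.ZipFile(path) as archive:
        for member in archive.infolist():
            parts = member.filename.split("/")
            # Assumed layout: <uuid>/metadata.yaml and <uuid>/data/<files>
            if len(parts) == 2 and parts[1] == "metadata.yaml":
                text = archive.read(member).decode("utf-8")
                for line in text.splitlines():
                    key, _, value = line.partition(":")
                    if key.strip() in ("type", "format"):
                        info[key.strip()] = value.strip().strip("'\"")
            elif len(parts) >= 3 and parts[1] == "data" and not member.is_dir():
                # Record the uncompressed size of each payload file.
                info["data_files"]["/".join(parts[2:])] = member.file_size
    return info
```

Calling `describe_qza("rep_table.qza")` should show whether the artifact is typed as a feature table and how large its data payload is. File size alone does not prove that a table is empty, so this is only a first sanity check before opening the BIOM file itself.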

Hi Anthony, thank you for your prompt and direct response. In the refine step of the PRONAME pipeline, high-quality reads are first clustered using VSEARCH (v2.22.1) according to a sequence similarity threshold provided by the user, and singletons are removed. Importantly, within each cluster, the read distribution and the link between each read and its sample of origin are recorded. This ensures that, at the end of the script, an OTU-like table reporting the frequency of every consensus sequence in each sample is generated, which can optionally be output as a .qza file for downstream integration in QIIME 2.
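
To make that description concrete, here is a small Python sketch of how an OTU-like table can be built from read-to-cluster and read-to-sample records, with singleton clusters dropped. This is not PRONAME's actual code; the function name, the `min_cluster_size` parameter and the read, cluster and sample IDs are all invented for illustration.

```python
from collections import Counter, defaultdict


def build_otu_table(read_to_cluster, read_to_sample, min_cluster_size=2):
    """Count reads per (cluster, sample), skipping clusters below the size cutoff."""
    cluster_sizes = Counter(read_to_cluster.values())
    table = defaultdict(Counter)
    for read_id, cluster_id in read_to_cluster.items():
        if cluster_sizes[cluster_id] < min_cluster_size:
            continue  # singleton cluster: removed, as in the step described above
        table[cluster_id][read_to_sample[read_id]] += 1
    return {cluster: dict(counts) for cluster, counts in table.items()}


# Hypothetical records, for illustration only.
reads = {"r1": "c1", "r2": "c1", "r3": "c1", "r4": "c2", "r5": "c3", "r6": "c3"}
samples = {"r1": "s1", "r2": "s1", "r3": "s2", "r4": "s1", "r5": "s2", "r6": "s2"}
otu_table = build_otu_table(reads, samples)
print(otu_table)
```

In this toy case cluster `c2` contains a single read and is dropped. If every cluster were filtered out this way, the resulting table would have no features at all.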

Secondly, when I say that I exported rep_table.qza, I mean that I tried converting it in QIIME 2 to a BIOM file or a QZV for visualization, to see whether it is empty.
Here is the rep_table.qza file.
Both it and rep_seqs.fasta are outputs from PRONAME when the QIIME 2 export option is selected, as shown below:

proname_refine \
  --clusterid 0.90 \
  --inputpath concat_seqs \
  --clusterthreads 6 \
  --medakamodel r1041_e82_400bps_sup_v5.0.0 \
  --chimeradb /opt/db/rEGEN-B/rEGEN-B_sequences.fasta \
  --qiime2import yes

**Finally, just to add more context: the average length of my reads is only about 1463 bp (the 16S region), but I clustered to generate consensus sequences using a database with an average length of around 4600 bp (the entire 16S-ITS-23S operon). Do you reckon this is why about 66% of my reads were flagged as chimeric, and therefore why the feature table is empty?**

File attachments:
rep_seqs.qza (7.1 MB)

Hello @Joecliff, I apologize for the delay, I was out of office.

Are you still encountering this issue? If so, you only sent your FeatureData, not your FeatureTable. I will need the FeatureTable as well, and ideally also the raw data, to properly disentangle what's happening here. I would add that we do not officially support PRONAME, so there is some potential for the issue to be something outside of the QIIME 2 ecosystem and outside of my control.
