I can't import a gene families file from HUMAnN 3 into QIIME2

Dear all,

I'm trying to import data from HUMAnN 3 into QIIME2. I had no problem importing pathway abundances, but I'm struggling with gene families.

I followed the HUMAnN 3 instructions (GitHub - biobakery/humann) and the instructions for importing into QIIME2 (GitHub - gregcaporaso/q2-sapienns).

I'm trying to import the file hmp_subset-genefamilies_rnx-cmp-names.tsv. My script looks like this:


qiime tools import \
        --input-path $input_path/named/hmp_subset-genefamilies_rnx-cmp-names.tsv \
        --output-path $output_path/gene_families/humann-genefamilies.qza \
        --type HumannGeneFamilyTable

And part of the original file looks like this:

|# Gene Family|HUNIMED_2402_107_merged_Abundance-CPM|HUNIMED_2402_111_merged_Abundance-CPM|HUNIMED_2402_117_merged_Abundance-CPM|
|---|---|---|---|
|UNMAPPED|534675|463558|570314|
|UNGROUPED|428955.7696|497956.2546|388147.0901|
|UNGROUPED\|g__Acutalibacter.s__Acutalibacter_muris|0|0|2452.739228|
|UNGROUPED\|g__Akkermansia.s__Akkermansia_muciniphila|19823.28661|10505.00169|19245.46405|
|UNGROUPED\|g__Alistipes.s__Alistipes_indistinctus|0|0|0|
|UNGROUPED\|g__Alistipes.s__Alistipes_shahii|0|0|0|
|UNGROUPED\|g__Anaerotruncus.s__Anaerotruncus_sp_G3_2012|4616.59077|4181.893834|5561.050689|

However, I get this error message:

... is not a(n) HumannGeneFamilyFormat file:

  Expected sample ids (e.g., HUNIMED_2402_107_merged_Abundance-CPM) to end with unit descriptor RPKs
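
The check described in this message can be reproduced outside QIIME2 to see which columns trigger it. The sketch below is only an illustration: it assumes the validation is a plain suffix test on each sample ID in the header row, as the message wording suggests, and the function name `sample_ids_missing_suffix` is hypothetical. It is not the plugin's actual code.

```python
def sample_ids_missing_suffix(header_line, suffix="RPKs", sep="\t"):
    """Return the sample IDs in a header line that do not end with `suffix`.

    Assumption: the first column holds the feature name ("# Gene Family")
    and every remaining column is a sample ID.
    """
    columns = header_line.rstrip("\n").split(sep)
    sample_ids = columns[1:]  # skip the feature-name column
    return [s for s in sample_ids if not s.endswith(suffix)]


# Header row taken from the excerpt of the file shown above.
header = "\t".join([
    "# Gene Family",
    "HUNIMED_2402_107_merged_Abundance-CPM",
    "HUNIMED_2402_111_merged_Abundance-CPM",
    "HUNIMED_2402_117_merged_Abundance-CPM",
])

missing = sample_ids_missing_suffix(header)
print(missing)
```

With the header above, all three sample IDs are reported, because each one ends in `-CPM` rather than `RPKs`.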

I can't figure out what's wrong with this file. Do you have any suggestions?

Thank you very much!
