Dear all,
I'm trying to import data from HUMAnN 3 into QIIME2. I had no problem importing pathway abundances, but I'm struggling with gene families.
I followed the HUMAnN 3 instructions in the biobakery/humann GitHub repository, and the instructions for importing into QIIME2 in the caporaso-lab/q2-sapienns GitHub repository.
The file I'm trying to import is hmp_subset-genefamilies_rnx-cmp-names.tsv. My command looks like this:
qiime tools import \
--input-path $input_path/named/hmp_subset-genefamilies_rnx-cmp-names.tsv \
--output-path $output_path/gene_families/humann-genefamilies.qza \
--type HumannGeneFamilyTable
And a part of the original file looks like:
|# Gene Family|HUNIMED_2402_107_merged_Abundance-CPM|HUNIMED_2402_111_merged_Abundance-CPM|HUNIMED_2402_117_merged_Abundance-CPM|
|---|---|---|---|
|UNMAPPED|534675|463558|570314|
|UNGROUPED|428955.7696|497956.2546|388147.0901|
|UNGROUPED\|g__Acutalibacter.s__Acutalibacter_muris|0|0|2452.739228|
|UNGROUPED\|g__Akkermansia.s__Akkermansia_muciniphila|19823.28661|10505.00169|19245.46405|
|UNGROUPED\|g__Alistipes.s__Alistipes_indistinctus|0|0|0|
|UNGROUPED\|g__Alistipes.s__Alistipes_shahii|0|0|0|
|UNGROUPED\|g__Anaerotruncus.s__Anaerotruncus_sp_G3_2012|4616.59077|4181.893834|5561.050689|
However, I get this error message:
... is not a(n) HumannGeneFamilyFormat file:
Expected sample ids (e.g., HUNIMED_2402_107_merged_Abundance-CPM) to end with unit descriptor RPKs
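To see which column headers trip this check, the header row can be inspected with a short script. This is only a minimal sketch: it assumes the check looks solely at the suffix of each sample ID, as the error message suggests, and takes the `RPKs` suffix from that message. The function name `sample_ids_missing_suffix` is hypothetical and is not part of q2-sapienns or HUMAnN.

```python
def sample_ids_missing_suffix(tsv_text, expected_suffix="RPKs"):
    """Return the sample IDs in the header row that do not end with expected_suffix."""
    # The first line of a HUMAnN table is the tab-separated header.
    header = tsv_text.splitlines()[0].split("\t")
    # The first column is the feature label ("# Gene Family"); the rest are sample IDs.
    return [sid for sid in header[1:] if not sid.endswith(expected_suffix)]


# Two header columns copied from the file above, plus one data row.
example = (
    "# Gene Family\tHUNIMED_2402_107_merged_Abundance-CPM\tHUNIMED_2402_111_merged_Abundance-CPM\n"
    "UNMAPPED\t534675\t463558\n"
)

print(sample_ids_missing_suffix(example))
# prints ['HUNIMED_2402_107_merged_Abundance-CPM', 'HUNIMED_2402_111_merged_Abundance-CPM']
```

For a real file, the text could be read with `open(path).read()` and passed to the same function. An empty list would mean every sample ID already ends with the suffix named in the error.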
I can't figure out what's wrong with this file. Do you have any suggestions?
Thank you very much!