I would like to import a merged feature table with relative abundances (relative frequencies) obtained from MetaPhlAn into QIIME to use the diversity plugin. An example can be seen here. The table is in tab-separated values (TSV) format: merged_cases_profiled_reformatted.txt
If QIIME cannot handle the pipe character that separates taxonomic ranks, I can also reduce the table to a single taxonomic level, e.g. species.
I did not find a matching import option in your Import Tutorial. Can QIIME handle this data structure (i.e. features in rows, samples in columns)?
Traceback (most recent call last):
File "/home/plicht/anaconda3/envs/qiime2-2020.11/lib/python3.6/site-packages/q2cli/builtin/tools.py", line 158, in import_data
view_type=input_format)
File "/home/plicht/anaconda3/envs/qiime2-2020.11/lib/python3.6/site-packages/qiime2/sdk/result.py", line 241, in import_data
validate_level='max')
File "/home/plicht/anaconda3/envs/qiime2-2020.11/lib/python3.6/site-packages/qiime2/sdk/result.py", line 266, in _from_view
recorder=recorder)
File "/home/plicht/anaconda3/envs/qiime2-2020.11/lib/python3.6/site-packages/qiime2/core/transform.py", line 59, in make_transformation
(self._view_type, other._view_type))
Exception: No transformation from <class 'q2_types.feature_data._format.TSVTaxonomyFormat'> to <class 'qiime2.plugin.model.directory_format.BIOMV210DirFmt'>
An unexpected error has occurred:
No transformation from <class 'q2_types.feature_data._format.TSVTaxonomyFormat'> to <class 'qiime2.plugin.model.directory_format.BIOMV210DirFmt'>
The error tells you that your data is not in TSVTaxonomyFormat. That format maps a feature ID to feature metadata (i.e. the taxonomy string). Based on what I'm seeing in the import tutorial, you will need to import your data in BIOM format.
In QIIME 2, the taxonomy is a separate artifact with its own semantic type, which you can act on with or without the sample metadata. It gets incorporated where needed. You can learn more about the semantic type philosophy in QIIME 2 here, but essentially, the data is linked by an ID and kept in two files. In your case, you need to pull the taxonomy out into a two-column .tsv file and import that as a taxonomy semantic type.
You may also want to consider pre-filtering your table so you have a single level before you import. I might also replace the pipe with a semicolon.
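A minimal Python sketch of the two steps above (filter to one level, pull the taxonomy into a two-column file), using only the standard library. It assumes the first column holds the pipe-delimited clade name, every other column is a sample, and any line starting with `#` is a comment; the function names, sample names, and output file names are hypothetical, and your reformatted table may need adjustments.

```python
import csv


def split_metaphlan_table(lines, level_prefix="s__"):
    """Split a merged MetaPhlAn table into a single-level feature table
    and a two-column taxonomy table (both returned as lists of rows)."""
    # Assumption: blank lines and lines starting with '#' carry no data.
    rows = csv.reader(
        (ln for ln in lines if ln.strip() and not ln.startswith("#")),
        delimiter="\t",
    )
    header = next(rows)
    table = [["#OTU ID"] + header[1:]]
    taxonomy = [["Feature ID", "Taxon"]]
    for row in rows:
        ranks = row[0].split("|")
        # Keep only clades whose deepest rank is the requested level.
        if not ranks[-1].startswith(level_prefix):
            continue
        feature_id = ranks[-1]
        table.append([feature_id] + row[1:])
        # Replace the pipe with a semicolon in the taxonomy string.
        taxonomy.append([feature_id, "; ".join(ranks)])
    return table, taxonomy


def write_tsv(path, rows):
    """Write a list of rows as a tab-separated file."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle, delimiter="\t").writerows(rows)


# Tiny made-up example; real input would come from open("your_table.txt").
example = [
    "clade_name\tsampleA\tsampleB",
    "k__Bacteria\t100.0\t100.0",
    "k__Bacteria|p__Firmicutes\t60.0\t25.0",
    "k__Bacteria|p__Firmicutes|s__Species_one\t60.0\t25.0",
    "k__Bacteria|p__Bacteroidetes|s__Species_two\t40.0\t75.0",
]
table, taxonomy = split_metaphlan_table(example)
```

Calling `write_tsv("table.tsv", table)` and `write_tsv("taxonomy.tsv", taxonomy)` would give two files linked by the feature ID. The table file would still have to be converted to BIOM before import, as described above.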
Many thanks for your help @jwdebelius!
So there is no easy procedure for importing the original TSV file into QIIME? Because when importing BIOM, I would have to convert each BIOM file into TSV, collapse it to a specified level, convert it back into BIOM, and then import it.
I am relatively new to the microbiome field and would like to calculate alpha diversity on relative abundance data (the sum of all taxa on a given level [e.g. species] within a sample is 1). Do you know an appropriate metric for alpha diversity? As far as I understand, most approaches require full count data instead of relative abundances. However, e.g. the Simpson and Gini indices should work fine. But when I try to calculate that with qiime diversity alpha, it reports that it requires a frequency table instead of a relative frequency table:
Traceback (most recent call last):
File "/home/plicht/anaconda3/envs/qiime2-2020.11/lib/python3.6/site-packages/q2cli/commands.py", line 329, in call
results = action(**arguments)
File "", line 2, in alpha
File "/home/plicht/anaconda3/envs/qiime2-2020.11/lib/python3.6/site-packages/qiime2/sdk/action.py", line 245, in bound_callable
output_types, provenance)
File "/home/plicht/anaconda3/envs/qiime2-2020.11/lib/python3.6/site-packages/qiime2/sdk/action.py", line 484, in callable_executor
outputs = self._callable(scope.ctx, **view_args)
File "/home/plicht/anaconda3/envs/qiime2-2020.11/lib/python3.6/site-packages/q2_diversity/_alpha/_pipeline.py", line 28, in alpha
vector, = action(table=table, metric=metric)
File "", line 2, in alpha_passthrough
File "/home/plicht/anaconda3/envs/qiime2-2020.11/lib/python3.6/site-packages/qiime2/sdk/action.py", line 208, in bound_callable
self.signature.check_types(**user_input)
File "/home/plicht/anaconda3/envs/qiime2-2020.11/lib/python3.6/site-packages/qiime2/core/type/signature.py", line 342, in check_types
name, spec.qiime_type, parameter.type))
TypeError: Parameter 'table' requires an argument of type FeatureTable[Frequency]. An argument of type FeatureTable[RelativeFrequency] was passed.
Plugin error from diversity:
Parameter 'table' requires an argument of type FeatureTable[Frequency]. An argument of type FeatureTable[RelativeFrequency] was passed.
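For reference, Simpson's index in its 1 - sum(p_i^2) form (often called Gini-Simpson) is defined on proportions, so it can be computed directly from one sample's relative abundances outside of QIIME. The sketch below is plain Python, not the q2-diversity implementation; the function name is hypothetical, and it renormalizes the input so that values given as percentages work as well.

```python
def simpson_index(abundances):
    """Return 1 - sum(p_i^2) for one sample's relative abundances."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("sample has no non-zero abundances")
    # Renormalize so inputs summing to 1 or to 100 give the same result.
    proportions = [value / total for value in abundances]
    return 1.0 - sum(p * p for p in proportions)


# Two equally abundant taxa versus a sample dominated by a single taxon.
even_sample = simpson_index([50.0, 50.0])
dominated_sample = simpson_index([100.0, 0.0])
```

An even community scores higher than a dominated one, and no read counts are needed at any point.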
Sorry for the delay in answering. I am slowly losing my mind.
I think the issue is QIIME-specific. The diversity calculations require a count table because of assumptions made by other metrics. I'm not sure if you can get a frequency table, or spoof one (multiply everything by a constant, say 100,000) and then import the data. It's not a perfect solution, but it might get you there.
Unfortunately, I don't work much with MetaPhlAn, so I don't know if you can do an abundance approximation.
No worries, I am very pleased with your help and the forum in general, as I am new to metagenomics and soaking up information.
Because of how MetaPhlAn works (mapping shotgun reads against a precomputed database of clade-specific marker genes), it is not possible to get full read counts: I) it maps only against a few markers and not whole genomes, and II) each clade has a different number of marker genes. So multiplying a clade's relative abundance by the total number of reads in a sample's library will give a biased result. However, the MetaPhlAn author also proposed your idea of multiplying by a constant and then rounding to the closest integer to get "pseudo-counts".
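The pseudo-count idea could be sketched as follows. This is only an illustration: the function name and the default constant of 100,000 are assumptions taken from the suggestion earlier in the thread, and the input is renormalized per sample first so that percentages and fractions behave the same way.

```python
def to_pseudo_counts(abundances, scale=100_000):
    """Scale one sample's relative abundances and round to integers."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("sample has no non-zero abundances")
    # Python's round() rounds exact halves to the nearest even integer.
    return [int(round(value / total * scale)) for value in abundances]


pseudo_counts = to_pseudo_counts([0.5, 0.3, 0.2])
```

Taxa whose relative abundance is below roughly half of 1/scale round to zero, so the choice of constant affects richness-based metrics.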
I am just searching for alternative analysis approaches that work with relative abundance data, as this would fit the original idea of MetaPhlAn better.
I'd highly recommend reading this paper by Baker et al. 2021. They provide some insight into how they processed and imported MetaPhlAn and HUMAnN data into QIIME 2. You can search more in the biobakery forum too.