BIOM import from MG-RAST

divyaprince321 · January 18, 2021, 2:42pm

Hello all,
I am stuck in my analysis because of memory limits on my laptop, and it is difficult for me to get a laptop with more RAM.
So I have downloaded my BIOM files from MG-RAST and want to import them for further analysis.
However, I don't know whether these BIOM files are in the appropriate format.
I am attaching one BIOM file here.
Please let me know whether I need to convert the format of my files. If so, what is the exact command, or which format should I convert them to?
Please help.
N_10.biom (708.9 KB)

TurboQiimer · January 19, 2021, 1:08am

Hi,
I have no experience with MG-RAST, but I suggest you run one of these commands with your file name:

qiime tools import \
  --input-path feature-table-v100.biom \
  --type 'FeatureTable[Frequency]' \
  --input-format BIOMV100Format \
  --output-path feature-table.qza

OR

qiime tools import \
  --input-path feature-table-v210.biom \
  --type 'FeatureTable[Frequency]' \
  --input-format BIOMV210Format \
  --output-path feature-table-2.qza

Good Luck
Qiimer

divyaprince321 · January 19, 2021, 6:26am

Thank you, sir, for your quick replies.
I tried both commands, and both produce an error. The first one:
qiime tools import --input-path N_20.biom.gz --type FeatureTable[Frequency] --input-format BIOMV100Format --output-path feature-table-N-20.qza
Traceback (most recent call last):
File "/home/qiime2/miniconda/envs/qiime2-2020.8/lib/python3.6/site-packages/q2cli/builtin/tools.py", line 158, in import_data
view_type=input_format)
File "/home/qiime2/miniconda/envs/qiime2-2020.8/lib/python3.6/site-packages/qiime2/sdk/result.py", line 241, in import_data
validate_level='max')
File "/home/qiime2/miniconda/envs/qiime2-2020.8/lib/python3.6/site-packages/qiime2/sdk/result.py", line 267, in _from_view
result = transformation(view, validate_level)
File "/home/qiime2/miniconda/envs/qiime2-2020.8/lib/python3.6/site-packages/qiime2/core/transform.py", line 70, in transformation
new_view = transformer(view)
File "/home/qiime2/miniconda/envs/qiime2-2020.8/lib/python3.6/site-packages/qiime2/core/transform.py", line 221, in wrapped
file_view = transformer(view)
File "/home/qiime2/miniconda/envs/qiime2-2020.8/lib/python3.6/site-packages/q2_types/feature_table/_transformer.py", line 125, in _8
return _table_to_v210(data)
File "/home/qiime2/miniconda/envs/qiime2-2020.8/lib/python3.6/site-packages/q2_types/feature_table/_transformer.py", line 72, in _table_to_v210
data.to_hdf5(fh, generated_by=_get_generated_by())
File "/home/qiime2/miniconda/envs/qiime2-2020.8/lib/python3.6/site-packages/biom/table.py", line 4355, in to_hdf5
data=[i.encode('utf8') for i in ids],
File "/home/qiime2/miniconda/envs/qiime2-2020.8/lib/python3.6/site-packages/biom/table.py", line 4355, in <listcomp>
data=[i.encode('utf8') for i in ids],
AttributeError: 'int' object has no attribute 'encode'

An unexpected error has occurred:

'int' object has no attribute 'encode'
And the second one:
qiime tools import --input-path N_20.biom --type FeatureTable[Frequency] --input-format BIOMV210Format --output-path featuretable-N-20.qza
There was a problem importing N_20.biom:

N_20.biom is not a(n) BIOMV210Format file
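
A note on the first error: the traceback fails at `i.encode('utf8') for i in ids`, which suggests that at least one row or column ID in the table was parsed as an integer rather than a string. The sketch below is one possible workaround, assuming the file is BIOM 1.0 (JSON text) and that the numeric IDs are stored as JSON numbers; neither assumption is confirmed in this thread. The function names `stringify_ids` and `rewrite_biom_json` are hypothetical helpers, not part of QIIME 2 or biom-format.

```python
import json


def stringify_ids(table):
    """Return a copy of a BIOM 1.0 (JSON) table dict with row/column ids cast to str."""
    fixed = dict(table)
    for axis in ("rows", "columns"):
        # Each entry looks like {"id": ..., "metadata": ...}; only "id" is changed.
        fixed[axis] = [
            {**entry, "id": str(entry["id"])} for entry in table.get(axis, [])
        ]
    return fixed


def rewrite_biom_json(src, dst):
    """Read a JSON BIOM file, cast all ids to strings, and write the result to dst."""
    with open(src, encoding="utf-8") as fh:
        table = json.load(fh)
    with open(dst, "w", encoding="utf-8") as fh:
        json.dump(stringify_ids(table), fh)
    return dst
```

If this applies to your file, something like `rewrite_biom_json("N_20.biom", "N_20.fixed.biom")` would give a copy to try with the BIOMV100Format import; both file names here are only examples.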

TurboQiimer · January 19, 2021, 2:30pm

You used N_20.biom.gz, which is a gzip-compressed file! It looks like your BIOM file was already compressed to .gz. The input should be a plain .biom file, so give that a try! I hope it works! If you get stuck, share your results and the admins will help you!
Qiimer
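
To see what kind of file you actually have before choosing --input-format, you can look at its first bytes. This is a minimal sketch, assuming the usual conventions: gzip files start with the bytes 1f 8b, BIOM 2.x files are HDF5 containers (which would match BIOMV210Format), and BIOM 1.0 files are JSON text (which would match BIOMV100Format). The function names are hypothetical and not part of QIIME 2.

```python
import gzip
import shutil

GZIP_MAGIC = b"\x1f\x8b"            # first two bytes of any gzip stream
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"   # 8-byte HDF5 file signature


def sniff_container(path):
    """Guess the container of a BIOM file: 'gzip', 'hdf5', 'json', or 'unknown'."""
    with open(path, "rb") as fh:
        head = fh.read(64)
    if head.startswith(GZIP_MAGIC):
        return "gzip"
    if head.startswith(HDF5_MAGIC):
        return "hdf5"
    if head.lstrip().startswith(b"{"):
        return "json"
    return "unknown"


def gunzip(src, dst):
    """Decompress a .gz file to dst and return dst."""
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

If `sniff_container` reports "gzip", decompress first and sniff the result again; the second answer tells you which of the two import commands above is the more likely fit.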

TurboQiimer · January 19, 2021, 2:33pm

By the way, this is a BIOM file. As an example, you can provide yours like this.
I think you should use a file like this, not a .gz file.

Screenshot from 2021-01-19 15-31-14

divyaprince321 · January 19, 2021, 3:26pm

Thank you very much, sir, for your quick responses.
I am done with OTU clustering.
Moving forward, I need diversity analysis and, most importantly, a relative abundance table along with representative sequences for use elsewhere.
I am attaching the file here for your convenience: ExampleOtuTable.csv (33.0 KB)
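
For reference, a relative abundance table is just the count table with each sample's counts divided by that sample's total. The sketch below shows the arithmetic on a plain dictionary; the data layout (`{sample_id: {feature_id: count}}`) and the function name are assumptions made for illustration, not the layout of the attached CSV.

```python
def relative_abundance(table):
    """Convert per-sample counts to per-sample proportions.

    table: {sample_id: {feature_id: count}}
    Returns a new dict of the same shape whose values sum to 1.0 per sample
    (or are all 0.0 for a sample with no reads).
    """
    result = {}
    for sample, counts in table.items():
        total = sum(counts.values())
        result[sample] = {
            feature: (count / total if total else 0.0)
            for feature, count in counts.items()
        }
    return result


# Hypothetical toy data: two samples, two OTUs.
example = {"S1": {"otu_a": 3, "otu_b": 1}, "S2": {"otu_a": 0, "otu_b": 0}}
rel = relative_abundance(example)
```

Inside QIIME 2, the feature-table plugin has a relative-frequency action that, as far as I know, does the same conversion on a FeatureTable[Frequency] artifact.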

TurboQiimer · January 19, 2021, 3:38pm

To produce an OTU table, you should follow the classifier commands! Pick one of the methods, such as VSEARCH or BLAST+. Take a look at the feature-classifier plugin in the documentation.
Good luck
Qiimer

TurboQiimer · January 19, 2021, 4:06pm

I forgot something. To evaluate diversity, please take a look here!

divyaprince321 · January 19, 2021, 6:27pm

Thank you again, sir (TurboQiimer), for your help.
Slowly I am getting things done. I am now done with diversity, but I have one query about it. I was calculating diversity by giving all the parameters in a single command, which produced an error; I then found on the forum that the p parameters cannot all be passed within a single command. Anyway, that is done, but the main point is that I wanted to calculate the Shannon index and instead got a result labelled Shannon entropy. So the main question is: how do I get the Shannon index only?
Secondly, to get the relative abundance table and rep.seq.fasta (as mentioned above), which files do I have to pass as reference-reads and reference-taxonomy?
I have files from Greengenes: otu_map, rep_seq aligned, and rep_set. Which of these are used?
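
On the Shannon question: as far as I know, "Shannon index" and "Shannon entropy" are two names for the same quantity, H = -sum(p_i * log(p_i)), where p_i is the proportion of feature i in a sample. The sketch below computes it for a single sample from raw counts. The log base is an assumption (base 2 here); some tools use the natural log instead, which changes the value only by a constant factor.

```python
import math


def shannon_entropy(counts, base=2.0):
    """Shannon index/entropy of one sample, given its per-feature counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no observations")
    # Features with zero count contribute nothing to the sum.
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in props) / math.log(base)


# Four equally abundant features: the base-2 value is log2(4).
even_sample = shannon_entropy([10, 10, 10, 10])
```

If a value from another tool does not match, comparing the log base is usually the first thing to check, for example `shannon_entropy(counts, base=math.e)`.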

divyaprince321 · January 20, 2021, 6:52am

Thank You again
I am getting an error in importing my taxonomy files.
qiime tools import --input-path 97_otus_taxonomy.Tsv --output-path 97-otus_taxonomy.qza --type FeatureData[Taxonomy]
There was a problem importing 97_otus_taxonomy.Tsv:

97_otus_taxonomy.Tsv is not a(n) TSVTaxonomyFormat file:

['Feature ID', 'Taxon'] must be the first two header values. The first two header values provided are: ['367523', 'k__Bacteria; p__Bacteroidetes; c__Flavobacteriia; o__Flavobacteriales; f__Flavobacteriaceae; g__Flavobacterium; s__'] (on line 1).

I tried to change it, but all in vain. I tried to attach the file, but it says the maximum file size is exceeded.
Could someone please help me change the format of the file?
If possible, I can send the file through a personal account.
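
The error message asks for a first line whose first two columns are "Feature ID" and "Taxon", while this file starts directly with data (367523 ...). One way to satisfy it is to prepend that header line. This is a minimal sketch that assumes the file is tab-separated with an ID column followed by a taxonomy column; the function name and the output file name are hypothetical.

```python
HEADER = "Feature ID\tTaxon\n"


def add_taxonomy_header(src, dst):
    """Copy a two-column taxonomy TSV to dst, adding the header line if it is missing."""
    with open(src, encoding="utf-8") as fin, open(dst, "w", encoding="utf-8") as fout:
        first = fin.readline()
        # Leave files that already start with the expected header unchanged.
        if not first.startswith("Feature ID\t"):
            fout.write(HEADER)
        fout.write(first)
        for line in fin:
            fout.write(line)
    return dst
```

For example, `add_taxonomy_header("97_otus_taxonomy.Tsv", "97_otus_taxonomy_with_header.tsv")` writes a copy that can then be passed to the same import command.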

divyaprince321 · January 20, 2021, 1:19pm

Hello all,
I ran the command with --input-format HeaderlessTSVTaxonomyFormat and it worked well.
However, I have some doubts: can this lead to any issues in the downstream analysis?