csv to BIOM from dada2 output

lynneitelson · August 10, 2022, 7:00am

Hi!

I tried to convert the DADA2 output CSV to BIOM with the command:

biom convert \
  -i origianl_dont_touch.csv \
  -o table.from_txt_json.biom \
  --table-type="OTU table" \
  --to-json

The output of this is:
Traceback (most recent call last):
  File "/a/home/cc/students/lifesci/lynneitelson/.local/lib/python3.10/site-packages/biom/parse.py", line 671, in load_table
    table = parse_biom_table(fp)
  File "/a/home/cc/students/lifesci/lynneitelson/.local/lib/python3.10/site-packages/biom/parse.py", line 415, in parse_biom_table
    t = Table.from_tsv(file_obj, None, None, lambda x: x)
  File "/a/home/cc/students/lifesci/lynneitelson/.local/lib/python3.10/site-packages/biom/table.py", line 5012, in from_tsv
    t_md_name) = Table._extract_data_from_tsv(lines, **kwargs)
  File "/a/home/cc/students/lifesci/lynneitelson/.local/lib/python3.10/site-packages/biom/table.py", line 5128, in _extract_data_from_tsv
    md_name = header[-1]
IndexError: list index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/a/home/cc/students/lifesci/lynneitelson/.local/bin/biom", line 8, in <module>
    sys.exit(cli())
  File "/a/home/cc/students/lifesci/lynneitelson/.local/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/a/home/cc/students/lifesci/lynneitelson/.local/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/a/home/cc/students/lifesci/lynneitelson/.local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/a/home/cc/students/lifesci/lynneitelson/.local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/a/home/cc/students/lifesci/lynneitelson/.local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/a/home/cc/students/lifesci/lynneitelson/.local/lib/python3.10/site-packages/biom/cli/table_converter.py", line 113, in convert
    table = load_table(input_fp)
  File "/a/home/cc/students/lifesci/lynneitelson/.local/lib/python3.10/site-packages/biom/parse.py", line 673, in load_table
    raise TypeError("%s does not appear to be a BIOM file!" % f)
TypeError: origianl_dont_touch.csv does not appear to be a BIOM file!

I attached the CSV file. The error says it doesn't appear to be a BIOM file, but I am trying to convert it into BIOM.

ASV.csv (1.5 MB)

Thank you!

Lynne

lizgehret · August 12, 2022, 8:04pm

Hi @lynneitelson,

Thanks for reaching out!

I don't have much experience using the biom convert method, but I did look over their documentation, and I am wondering if they only support conversion from TSV or TXT format (as opposed to CSV format). Here is the documentation that I'm looking at, in case you haven't gone through this already on your end:
http://biom-format.org/documentation/biom_conversion.html

I'd start by modifying your file to TSV format and see if that works for you. Hope this helps!

Cheers

lynneitelson · August 14, 2022, 11:34am

Thanks for the reply!

I tried to run the same code with a txt format and got the same output. Is it possible that the headers of the csv could be a problem?

Thank you,

Lynne

lizgehret · August 18, 2022, 6:28pm

Hi @lynneitelson,

That's a great question - although I suspect the primary issue is due to your file format. When you modified your file to .txt format, did you just modify the file extension? Or did you change the separation between data from commas to tabs? The file extension is essentially meaningless if the contents aren't consistent with the file type that your extension refers to.
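
A quick way to check whether the contents really changed is to look at which separator the header line uses. Below is a minimal Python sketch of that check; `detect_delimiter` and the sample strings are hypothetical names and data made up for illustration, not part of biom-format or the attached file.

```python
def detect_delimiter(text, candidates=",\t;"):
    """Guess the field delimiter from the first non-empty line of text."""
    first_line = next((ln for ln in text.splitlines() if ln.strip()), "")
    # Count how often each candidate separator appears in that line.
    counts = {d: first_line.count(d) for d in candidates}
    best = max(counts, key=counts.get)
    # Return None when no candidate separator appears at all.
    return best if counts[best] > 0 else None


# Hypothetical contents: a file renamed to .txt that is still comma-separated,
# and a file whose fields are genuinely separated by tabs.
renamed_only = "ASV,sample1,sample2\nASV_1,10,0\n"
truly_tsv = "ASV\tsample1\tsample2\nASV_1\t10\t0\n"

print(repr(detect_delimiter(renamed_only)))
print(repr(detect_delimiter(truly_tsv)))
```

For a real file, read the first few kilobytes with `open(path, newline="").read(4096)` and pass that text in. If the answer is a comma, the file is still a CSV regardless of its extension.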

With that being said, I am wondering what your big picture goals are with this data - are you attempting to run further analysis on your DADA2 output in QIIME 2? Or another bioinformatics tool? And are you able to provide your original QIIME 2 output from DADA2 for us to take a look at?

Thanks!

lynneitelson · August 31, 2022, 6:29am

species_Taxonoy_Table.csv (1.5 MB)

Hi!

Thanks for the reply!
I just used an online converter from CSV to TXT, so it sounds like that could be the issue. How do you recommend modifying the file?

My goal is to use phyloseq and make a relative abundance graph from the data.

Thank you!

Lynne

lizgehret · September 6, 2022, 10:40pm

Hi @lynneitelson,

It can be hard to trust many of those free online file converters - oftentimes they will just modify the file extension (e.g. change the filename from example_file.csv to example_file.txt) without making any actual changes to the file contents.
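
If the converter only renamed the file, the commas can be swapped for tabs with a few lines of Python. This is a rough sketch using only the standard `csv` module, under some assumptions: the first column holds the feature IDs, every other column is a per-sample count, and the first header cell can be relabelled `#OTU ID`, the label biom-format's classic tab-delimited tables use. The function name `csv_to_biom_tsv` and the sample data are made up for illustration.

```python
import csv
import io


def csv_to_biom_tsv(src, dst, id_header="#OTU ID"):
    """Rewrite comma-separated rows from src as tab-separated rows in dst.

    The first header cell is replaced with id_header (an assumption about
    what the downstream tool expects); all other cells are copied unchanged.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
    header = next(reader)
    writer.writerow([id_header] + header[1:])
    for row in reader:
        if row:  # skip completely blank lines
            writer.writerow(row)


# Hypothetical input standing in for a real CSV file.
src = io.StringIO("ASV,sample1,sample2\nASV_1,10,0\nASV_2,3,7\n")
dst = io.StringIO()
csv_to_biom_tsv(src, dst)
print(dst.getvalue())
```

With real files, open both with `newline=""` and pass the file objects in, then point `biom convert -i` at the new file. DADA2 sequence tables written from R usually have samples as rows, so the table may need transposing first, and any taxonomy columns mixed in with the counts would need to be handled separately.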

However, since you will need an OTU table to create your relative abundance graph, you can skip converting your CSV to BIOM format altogether. The otu_table() function in phyloseq takes a numeric matrix of counts as its input (together with the taxa_are_rows argument), so you can read your CSV file with read_csv(), which returns a data frame. Once the IDs are set as row names and only the numeric count columns remain, convert the data frame with as.matrix() and pass it to otu_table(). You should then be ready to create your relative abundance graph from the resulting OTU table.

Hope this helps!
Cheers

lynneitelson · September 7, 2022, 7:35am

Thanks!
I tried to convert it back to a phyloseq object, however, it won't accept the non-numeric names of taxa.
I also added the OTU table csv.
data_filtered.csv (1.4 MB)

Thank you,

Lynne

ebolyen · September 9, 2022, 10:58pm

I think you'll need to slice up your dataframe a little bit. It appears you have both taxonomy and the otu table combined together.

I will also mention that the DADA2 tutorial includes a section on handing the data off to phyloseq. You might read that and see where you can adapt your workflow to make this a bit easier. (I presume you are doing this all in R already.)
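
As a rough illustration of that slicing (in Python rather than R, and not the DADA2 tutorial's method), the sketch below separates the columns whose values are all numeric from those that are not. It assumes the first column holds the ASV IDs and that the table is rectangular; `split_table`, the column names, and the sample rows are hypothetical.

```python
import csv
import io


def is_number(value):
    """Return True if value parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def split_table(src):
    """Split a combined table into (counts, taxonomy), each keeping the ID column."""
    rows = [r for r in csv.reader(src) if r]
    header, body = rows[0], rows[1:]
    columns = range(1, len(header))  # column 0 is assumed to hold the IDs
    # A column is treated as counts only if every value in it is numeric.
    count_idx = [i for i in columns if all(is_number(r[i]) for r in body)]
    tax_idx = [i for i in columns if i not in count_idx]

    def pick(indices):
        return [[r[0]] + [r[i] for i in indices] for r in rows]

    return pick(count_idx), pick(tax_idx)


# Hypothetical combined table: two sample columns and two taxonomy columns.
combined = io.StringIO(
    "ASV,sample1,sample2,Kingdom,Genus\n"
    "ASV_1,10,0,Bacteria,Bacteroides\n"
    "ASV_2,3,7,Bacteria,NA\n"
)
counts, taxonomy = split_table(combined)
print(counts)
print(taxonomy)
```

The two pieces could then be written out as separate CSV files and read into R, with the counts going to otu_table() and the taxonomy to tax_table() as a character matrix. Note that a count column containing blanks or NA would be misclassified as taxonomy by this simple check.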
