CSV to a qiime2 readable format

Hey there,
I was given an OTU table in CSV format and told to play with it. I would like to get a weighted UniFrac analysis out of it, and eventually some heatmaps. I understand that I cannot plug a CSV straight into QIIME 2, so how can I get it into a format that QIIME 2 recognises? Would the output be a feature table? A rooted tree? I’m super lost, SOS.

Hi @coralgal! We don't currently support importing CSV or TSV formatted OTU tables. We have an open issue tracking this feature, and we'll follow up here when it's available in a release!

In the meantime, you can use the biom convert tool (which comes installed with QIIME 2) to convert your CSV OTU table into a .biom file. See the biom convert tutorial for details; you'll first need to convert your CSV file to TSV. Both JSON and HDF5 versions of the .biom file format are supported by the QIIME 2 importer. Once you have a .biom file, follow this section of the importing guide to import your .biom file into a FeatureTable[Frequency] artifact (or FeatureTable[RelativeFrequency] if your data are relative abundances instead of counts).
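For the CSV-to-TSV step, here is a minimal Python sketch that uses only the standard library. It assumes the CSV is comma-delimited with a single header row, and it renames the first header cell to #OTU ID, which biom convert expects for tab-separated input. The name csv_to_tsv is made up for illustration; it is not part of biom or QIIME 2. The function works on text rather than file paths so that the example is self-contained.

```python
import csv
import io


def csv_to_tsv(csv_text, id_header="#OTU ID"):
    """Convert CSV text to TSV text and rename the first header cell.

    Hypothetical helper: assumes one header row and comma delimiters.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows or not rows[0]:
        raise ValueError("input has no header row")
    rows[0][0] = id_header  # first cell of the header row
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return out.getvalue()


# Tiny made-up table: two OTUs, two samples.
example = "OTU,sample1,sample2\n1,10,0\n2,3,7\n"
print(csv_to_tsv(example), end="")
```

To use it on real files, read the CSV with open(...).read() and write the returned string to a .txt or .tsv file. Any non-abundance columns (taxonomy, for example) are carried over unchanged, so they would need to be removed or handled separately before conversion.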

Once you have a feature table artifact, you can use diversity core-metrics-phylogenetic or diversity beta-phylogenetic to perform weighted UniFrac analyses. See the Moving Pictures tutorial for examples of alpha and beta diversity analyses. To produce heatmaps of your feature table, check out feature-table heatmap.

The imported .biom file will be a FeatureTable[Frequency] artifact. If you have a phylogenetic tree, where tree tips correspond to features in the feature table (e.g. OTUs), you can import a Newick file following this section of the importing guide. Only an example of importing an unrooted tree is shown, but if you have a rooted tree the process is largely the same: you'd just use --type Phylogeny[Rooted]. Once you have a rooted phylogeny, you can use the tree in phylogenetic diversity analyses such as weighted UniFrac (the Moving Pictures tutorial describes the tree building and rooting steps).

If you're new to QIIME 2 or these types of analyses, I recommend checking out the Getting Started guide. The tutorials are a great way to not only learn QIIME 2, but also to discover different techniques for analyzing and visualizing the data you have on hand. Let us know if you have follow-up questions (preferably in separate forum topics).


Hi there,
Thanks for your reply!
When converting to the BIOM format I am told that my .txt file is not in BIOM format. To be clear, my OTU table contains the following columns:
OTU (number), Percentage match with Greengenes (title is Green), Sample ID (s), Kingdom, Phylum, Class, Order, Family, Genus, Species. I followed the instructions on the biom conversion page. I also tried another version, which I modified so that it has none of the phylogenetic info, but got a similar error:

(qiime2-2017.11) qiime2@qiime2core2017-11:~/gbstrial2$ biom convert  -i otutableshadingwithphylo.txt -o table.from_txt_json.biom --table-type="OTU table" --to-json 
Traceback (most recent call last):
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/bin/biom", line 6, in <module>
    sys.exit(biom.cli.cli())
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/biom_format-2.1.6-py3.5-linux-x86_64.egg/biom/cli/table_converter.py", line 114, in convert
    table = load_table(input_fp)
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/biom_format-2.1.6-py3.5-linux-x86_64.egg/biom/parse.py", line 652, in load_table
    with biom_open(f) as fp:
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/contextlib.py", line 59, in __enter__
    return next(self.gen)
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/biom_format-2.1.6-py3.5-linux-x86_64.egg/biom/util.py", line 444, in biom_open
    raise ValueError("The file '%s' is empty and can't be parsed" % fp)
ValueError: The file 'otutableshadingwithphylo.txt' is empty and can't be parsed

Is there anything else I can do? Any ideas? Thanks!

Hi @coralgal! The biom convert tool requires that files follow some strict formatting guidelines:

  • The file must be in TSV format (i.e. tab-separated values). CSV files, Excel files, etc. won’t work but you can easily convert those file types to TSV (e.g. by exporting from Excel).

  • The first cell in the file must be #OTU ID

  • The first column contains “feature” identifiers (e.g. OTU IDs, Greengenes IDs, ASV IDs, etc.)

  • Each subsequent column contains the abundances of features within a sample. Each column name corresponds to a sample ID.

Here’s an example of a file containing two features (feature1 and feature2) and three samples (sample1, sample2, and sample3):

#OTU ID   sample1  sample2  sample3
feature1      1.0      3.0      0.0
feature2      0.0      1.0      5.0

Can you try formatting your data in this manner and then convert it with biom convert? If you’re having issues after that, feel free to send me your TSV file and I can take a closer look.
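The formatting rules above can also be checked mechanically before running biom convert. The sketch below is a rough pre-flight check, not biom's own parser: check_otu_tsv is a hypothetical helper, and it only tests the first cell, tab delimiting, and that every row has as many fields as the header.

```python
def check_otu_tsv(text):
    """Return a list of formatting problems found in a TSV OTU table.

    Hypothetical helper; checks structure only, not the abundance values.
    """
    problems = []
    header = None
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if header is None:
            header = fields
            if header[0] != "#OTU ID":
                problems.append(
                    "first cell is %r, expected '#OTU ID'" % header[0])
            if len(header) < 2:
                problems.append(
                    "no sample columns found (is the file tab-separated?)")
            continue
        if len(fields) != len(header):
            problems.append("line %d has %d fields, header has %d"
                            % (lineno, len(fields), len(header)))
    if header is None:
        problems.append("file is empty")
    return problems


# The example table from this post, with real tab characters.
good = ("#OTU ID\tsample1\tsample2\tsample3\n"
        "feature1\t1.0\t3.0\t0.0\n"
        "feature2\t0.0\t1.0\t5.0\n")
print(check_otu_tsv(good))
```

An empty list means none of these particular checks failed; it does not guarantee that biom convert will accept the file.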

Hi there,
I managed to sort this out, but in fact it wasn’t a problem with the format of the table. It was because the curl import was downloading an empty file from my Google Drive link. I looked in the file, which was empty, then copied and pasted my data into it and saved it. It has since been working! Bazinga!
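In case it helps anyone else who hits the "is empty and can't be parsed" error after downloading a file, here is a small sketch of a check to run before biom convert. The function name is_effectively_empty is made up for this example, and the demo uses a temporary file rather than a real download.

```python
import os
import tempfile


def is_effectively_empty(path):
    """Return True if the file has zero bytes or contains only whitespace."""
    if os.path.getsize(path) == 0:
        return True
    with open(path, "r") as handle:
        return handle.read().strip() == ""


# Demo with a temporary zero-byte file standing in for a failed download.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "table.txt")
    with open(path, "w"):
        pass  # create the file without writing anything
    print(is_effectively_empty(path))  # True
```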


Hi there,
I have tried to do this again with another subset of my data, and ended up with this error:
biom convert -i otutableshadingwithoutphylo.txt -o table.from_txt_hdf5.biom --table-type="OTU table" --to-hdf5
Traceback (most recent call last):
  File "/home/qiime2/miniconda/envs/qiime2-2017.12/lib/python3.5/site-packages/biom_format-2.1.6-py3.5-linux-x86_64.egg/biom/table.py", line 4556, in _extract_data_from_tsv
    values = list(map(dtype, fields[1:]))
ValueError: could not convert string to float:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/qiime2/miniconda/envs/qiime2-2017.12/lib/python3.5/site-packages/biom_format-2.1.6-py3.5-linux-x86_64.egg/biom/parse.py", line 654, in load_table
    table = parse_biom_table(fp)
  File "/home/qiime2/miniconda/envs/qiime2-2017.12/lib/python3.5/site-packages/biom_format-2.1.6-py3.5-linux-x86_64.egg/biom/parse.py", line 408, in parse_biom_table
    t = Table.from_tsv(fp, None, None, lambda x: x)
  File "/home/qiime2/miniconda/envs/qiime2-2017.12/lib/python3.5/site-packages/biom_format-2.1.6-py3.5-linux-x86_64.egg/biom/table.py", line 4412, in from_tsv
    t_md_name) = Table._extract_data_from_tsv(lines, **kwargs)
  File "/home/qiime2/miniconda/envs/qiime2-2017.12/lib/python3.5/site-packages/biom_format-2.1.6-py3.5-linux-x86_64.egg/biom/table.py", line 4560, in _extract_data_from_tsv
    raise TypeError(msg % (lineno, badidx+1, badval))
TypeError: Invalid value on line 1, column 3, value

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/qiime2/miniconda/envs/qiime2-2017.12/bin/biom", line 6, in <module>
    sys.exit(biom.cli.cli())
  File "/home/qiime2/miniconda/envs/qiime2-2017.12/lib/python3.5/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/qiime2/miniconda/envs/qiime2-2017.12/lib/python3.5/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/qiime2/miniconda/envs/qiime2-2017.12/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/qiime2/miniconda/envs/qiime2-2017.12/lib/python3.5/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/qiime2/miniconda/envs/qiime2-2017.12/lib/python3.5/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/home/qiime2/miniconda/envs/qiime2-2017.12/lib/python3.5/site-packages/biom_format-2.1.6-py3.5-linux-x86_64.egg/biom/cli/table_converter.py", line 114, in convert
    table = load_table(input_fp)
  File "/home/qiime2/miniconda/envs/qiime2-2017.12/lib/python3.5/site-packages/biom_format-2.1.6-py3.5-linux-x86_64.egg/biom/parse.py", line 656, in load_table
    raise TypeError("%s does not appear to be a BIOM file!" % f)
TypeError: otutableshadingwithoutphylo.txt does not appear to be a BIOM file!

The first lines of it are
#OTU ID SXsCr145 SXsCr042 SXsCr044 SXsCr001 SXsCr260 SXsCr088 SXsCr081 SXsCr002 SXsCr092 SXsCr036 SXsCr255 SXsCr003 SXsCr113 SXsCr110 SXsCr104 SXsCr127 SXsCr258 SXsCr267 SXsPn001 ASdCr134 AWtCr053 ASdCr075 ASdCr005 ASdCr257 AwtCr076 AWtCr274 AWtCr136 SXsCr015 SXsCr511 SXsCr498 SXsCr510 SXsCr007 SXsCr006 SXsCr013 SXsCr009 SXsCr021 SXsCr509 SXsCr364 SXsCr369 SXsCr355 SXsCr302 SXsCr359 SXsCr294 SXsCr025 SXsCr011 SXsCr018 SXsCr020 SXsCr360 SXsCr299 SXsCr512 SXsCr520 SXsCr514 SXsCr518 SXsCr522 SXsCr312 SXsCr311 SXsCr363 SXsCr375
1 7838 9928 5285 5128 6950 4632 4408 6206 2740 3294 4281 2329 4206 2485 1560 778 4298 1407 226 63 46 64 31 118 27 89 45 2832 6329 1622 6136 2789 1516 372 2119 3591 6568 2116 3235 1759 428 1888 1873 611 376 308 412 1082 200 5084 85 806 805 517 42 23 25 48
2 802 565 948 1197 834 1665 1773 746 837 1193 1007 1926 1215 1360 984 1298 794 917 2071 165 49 90 67 116 55 95 43 1277 1475 862 1172 2664 645 5657 1052 3435 1158 958 2204 1911 2103 1523 6635 4563 1345 1664 163 3092 3561 1845 1290 2031 416 2670 146 110 71 532
3 444 2543 2940 1546 1683 2863 1429 979 2026 2856 2913 2021 3437 1836 4724 2385 2785 2185 1628 78 43 85 41 124 41 56 59 1098 2108 245 950 1217 438 2027 1170 3896 985 2523 3779 2943 1936 2257 4906 3116 506 711 136 2814 2789 1880 504 1382 339 1711 125 46 76 133
4 672 404 1319 988 1186 2277 1050 2238 757 573 959 2261 1263 1416 1433 1175 2018 1979 1595 151 76 114 69 153 44 135 56 4280 2610 480 807 2614 495 4533 921 3421 1161 410 2581 2366 1642 1852 2948 3206 298 494 83 1902 1438 823 458 1335 365 2437 119 94 117 126
5 980 3672 1409 65 577 3886 558 899 152 1083 2580 690 2226 963 883 228 2747 18734 183 23 815 36 9 26 371 927 354 5864 678 141 82 159 325 40 930 399 886 675 5479 2668 394 671 1823 74 46 57 194 172 96 393 13 8 39 2 1 0 0 1
6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 30 0 0 0 37 37 9 24 30 45184 60 18 39 17 21 35 33 28 19 54 71 22 18 17 68 34 0 0 0 0 0 0 0 0 0
7 660 295 1081 1510 1563 2389 1076 1441 1228 709 965 3095 579 873 1634 2798 1756 1768 1493 221 148 172 84 250 102 214 137 1633 3368 631 883 1226 561 8278 1380 3906 1315 441 1608 1594 1316 2106 4388 8374 607 1193 109 3642 2494 930 1049 3017 383 2884 230 173 189 229

Is there anything else I can do? Could it be because it is in .txt form and not .tsv?

The file extension (.txt vs .tsv) shouldn't matter to the biom convert command. Since the forum may have mangled the contents of the file you posted, can you please upload the file, either in this forum topic or as a direct message to me? Sending the entire file would be best, if that is possible. If uploading via the forum doesn't work, you could share the file via Dropbox, Google Drive, etc. Thanks!
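While you're getting the file ready to share, one thing you could try locally: the traceback reports an invalid value but prints nothing after "value", which might point to an empty or non-numeric cell somewhere in the table. That is only a guess. The sketch below lists every abundance cell that Python cannot read as a number. The name find_bad_cells is hypothetical, the header is assumed to be on line 1, and the line and column numbers it reports are 1-based positions in the file, which may not match the way biom counts them.

```python
def find_bad_cells(text):
    """Return (line, column, value) for each abundance cell that is not numeric.

    Hypothetical helper: assumes tab-separated text with the header on line 1
    and feature IDs in the first column.
    """
    bad = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if lineno == 1 or not line.strip():
            continue  # skip the header row and blank lines
        fields = line.split("\t")
        # Column 1 holds the feature ID, so abundances start at column 2.
        for colno, value in enumerate(fields[1:], start=2):
            try:
                float(value)
            except ValueError:
                bad.append((lineno, colno, value))
    return bad


# Made-up table with one empty cell and one text cell.
table = "#OTU ID\ts1\ts2\n1\t5\t\n2\tx\t3\n"
print(find_bad_cells(table))
```

An empty result means every cell after the first column parsed as a number; if the file is not really tab-separated, each row will look like a single ID with no abundance cells, so this check will not flag it.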
