Export .biom file including taxon assignments and sample metadata

After picking taxa for my reads in qiime 1, I usually did the differential abundance testing with phyloseq. I’d like to continue doing that until I am really familiar with the testing and visualization options in qiime 2.

I’m looking for any suggestions about getting qiime 2 artifacts into phyloseq, preferably using qiime 2 tools to construct a fully annotated .biom file.

I understand that there are also biom tools for adding metadata to .biom files, and that phyloseq has lots of import options, but I’m not having much luck with that yet.

I know I can export my feature table to .biom like this:

qiime tools export table.qza --output-dir exported-feature-table

(qiime2-2017.10) ~$ biom head -i exported-feature-table/feature-table.biom
# Constructed from biom file
#OTU ID Cow108-Pd1-w5d7-Feces   Cow108-Pd1-w5d7-Fluid   Cow108-Pd1-w6d1-Feces   Cow108-Pd1-w6d1-Fluid   Cow108-Pd1-w6d7-Feses
63e5c6324a03998e1cce27444295679a        74.0    0.0     117.0   0.0     87.0

but it doesn’t include taxon assignments or sample metadata.

I can export the taxon assignments like this:

(qiime2-2017.10) ~/$ less exported-feature-table/taxonomy.tsv
Feature ID      Taxon   Confidence
949459eaddbbae44bf50e88057b4da60        k__Bacteria; p__Firmicutes; c__Clostridia; o__Clostridiales     0.9544522399761912

And of course I have sample metadata:

(qiime2-2017.10) ~/gressley/20171113pm$ head SampleMetadata.tsv
#SampleID              Period  Type         Day   Treatment  Cow  DayVal  Description
Cow108-Pd1-w5d7-Feces  1       Feces        w5d7  Bov        108  42      Cow108-Pd1-w5d7-Feces
Cow108-Pd1-w5d7-Fluid  1       Rumen Fluid  w5d7  Bov        108  42      Cow108-Pd1-w5d7-Fluid
Cow108-Pd1-w6d1-Feces  1       Feces        w6d1  Bov        108  43      Cow108-Pd1-w6d1-Feces
Cow108-Pd1-w6d1-Fluid  1       Rumen Fluid  w6d1  Bov        108  43      Cow108-Pd1-w6d1-Fluid
Cow108-Pd1-w6d7-Feses  1       Feces        w6d7  Bov        108  49      Cow108-Pd1-w6d7-Feses
Cow108-Pd1-w6d7-Fluid  1       Rumen Fluid  w6d7  Bov        108  49      Cow108-Pd1-w6d7-Fluid
Cow108-Pd2-w5d7-Feces  2       Feces        w5d7  Con        108  42      Cow108-Pd2-w5d7-Feces
Cow108-Pd2-w5d7-Fluid  2       Rumen Fluid  w5d7  Con        108  42      Cow108-Pd2-w5d7-Fluid
Cow108-Pd2-w6d1-Feces  2       Feces        w6d1  Con        108  43      Cow108-Pd2-w6d1-Feces

Hi @mamillerpa!

QIIME 2 doesn’t currently support exporting this directly (please see this thread for more discussion around that topic).

@jairideout wrote up a nice little post on manually adding metadata to your feature table. If you haven’t had a chance to peek at that, I would recommend starting there!
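
Before adding metadata by hand, it may help to confirm that the sample IDs in the metadata file match the sample columns of the feature table, since a mismatch (a typo in one ID, say) is easy to miss. Here is a minimal sketch, assuming the table has first been converted to TSV (for example with `biom convert --to-tsv`). The file names `table.tsv` and `metadata.tsv` and their contents are made-up stand-ins, not the files from this thread.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in files; real ones would come from a TSV export
# of the feature table and from your own sample metadata.
work=$(mktemp -d)
printf '# Constructed from biom file\n#OTU ID\tS1-Feces\tS1-Fluid\tS2-Feses\n' > "$work/table.tsv"
printf 'feat1\t74.0\t0.0\t87.0\n' >> "$work/table.tsv"
printf '#SampleID\tType\nS1-Feces\tFeces\nS1-Fluid\tRumen Fluid\nS2-Feces\tFeces\n' > "$work/metadata.tsv"

# Sample IDs in the table: the "#OTU ID" header line minus its first field.
grep -m 1 '^#OTU ID' "$work/table.tsv" | cut -f 2- | tr '\t' '\n' | sort > "$work/table_ids.txt"

# Sample IDs in the metadata: first column, skipping the header line.
tail -n +2 "$work/metadata.tsv" | cut -f 1 | sort > "$work/metadata_ids.txt"

# comm -23 keeps lines unique to the first file, comm -13 those unique to the second.
only_in_table=$(comm -23 "$work/table_ids.txt" "$work/metadata_ids.txt")
only_in_metadata=$(comm -13 "$work/table_ids.txt" "$work/metadata_ids.txt")

printf 'Only in table: %s\n' "$only_in_table"
printf 'Only in metadata: %s\n' "$only_in_metadata"
```

If both lists come back empty, the IDs agree. Anything listed is worth fixing before the metadata is attached.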

We don’t have a phyloseq plugin (yet?), but if you haven’t had a chance to play with q2-gneiss yet, I highly recommend it!

Let us know how it goes! 🦖

Dear all, I wanted to export complete QIIME 2 output as .biom tables for use in R (phyloseq), and this is what I came up with after several hours of trying. You will need to adjust both scripts to your own needs and file paths, and I would strongly advise against running them without understanding them. Tested with macOS sed and QIIME 2 2017.11.

In a bash script do:

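# input files
# ----------
# NOTE: "$bpth" (the base path used throughout) is not defined in this
# snippet; set it to your own project directory before running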
qdir="Zenodo/Qiime"
mdir="Zenodo/Manifest"
intab="070_18S_feature_table.qza" 
intre="110_18S_tree_mdp_root.qza"
intax="170_18S_taxonomy.qza"
inseq="070_18S_represe_seqs.qza"
mapp="05_metadata.txt"

# output files
# ----------
conv="Zenodo/Conversion"
ottre="200_18S_tree_mdp_root.tre"

# conversion steps
# ----------------
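# export files from Qiime (2017.11 syntax; later QIIME 2 releases use
# --input-path and --output-path for this command)
printf 'Exporting Qiime files from %s/%s/ to %s/%s/...\n' "$bpth" "$qdir" "$bpth" "$conv"
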
qiime tools export "$bpth"/"$qdir"/"$intab" --output-dir "$bpth"/"$conv"/
qiime tools export "$bpth"/"$qdir"/"$intax" --output-dir "$bpth"/"$conv"/
qiime tools export "$bpth"/"$qdir"/"$inseq" --output-dir "$bpth"/"$conv"/
# a .qza is a zip archive; extract only the Newick tree rather than every member
unzip -p "$bpth"/"$qdir"/"$intre" '*/data/tree.nwk' > "$bpth"/"$conv"/"$ottre"

# modifying taxonomy file to match exported feature table
printf "Replacing 1st line...\n"
# the header must be tab-separated; $'...' yields real tab characters in bash
new_header=$'#OTUID\ttaxonomy\tconfidence'
sed -i.bak "1 s/^.*$/$new_header/" "$bpth"/"$conv"/taxonomy.tsv

# adding taxonomy information to .biom file 
printf "Adding taxonomy information...\n"
biom add-metadata \
  -i "$bpth"/"$conv"/feature-table.biom \
  -o "$bpth"/"$conv"/feature-table-w-taxonomy.biom \
  --observation-metadata-fp "$bpth"/"$conv"/taxonomy.tsv \
  --observation-header OTUID,taxonomy,confidence \
  --sc-separated taxonomy

# adding sample metadata to .biom file
# (--observation-header is dropped here: no observation metadata file is passed)
printf "Adding metadata information...\n"
biom add-metadata \
  -i "$bpth"/"$conv"/feature-table-w-taxonomy.biom \
  -o "$bpth"/"$conv"/feature-table-w-taxonomy-w-md.biom \
  --sample-metadata-fp "$bpth"/"$mdir"/"$mapp"
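
Because the taxonomy file is read as a tab-separated table, it may be worth checking the rewritten header on a throwaway copy before running the `biom add-metadata` calls. The sketch below rewrites the first line with real tab characters and counts the fields. The temporary path and the single taxonomy row are illustrative stand-ins, not real output.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway stand-in for the exported taxonomy.tsv (hypothetical content).
tmp=$(mktemp -d)
tax="$tmp/taxonomy.tsv"
printf 'Feature ID\tTaxon\tConfidence\n' > "$tax"
printf 'feat1\tk__Bacteria; p__Firmicutes\t0.95\n' >> "$tax"

# Build the header with real tab characters (bash ANSI-C quoting).
new_header=$'#OTUID\ttaxonomy\tconfidence'

# Replace only line 1; the -i.bak form works with both BSD (macOS) and GNU sed.
sed -i.bak "1 s/^.*$/$new_header/" "$tax"

# Count tab-separated fields on the header and on the first data row.
header_fields=$(awk -F '\t' 'NR == 1 { print NF }' "$tax")
row_fields=$(awk -F '\t' 'NR == 2 { print NF }' "$tax")
printf 'header fields: %s, data fields: %s\n' "$header_fields" "$row_fields"
```

If the two counts differ, the header was probably written with spaces instead of tabs, and the taxonomy column would not line up with the data rows.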

In R do:

# load packages
library("ape")          # read tree file
library("Biostrings")   # read fasta file
library("phyloseq")     # filtering and utilities for such objects
library("biomformat") # perhaps unnecessary 

bim_fpath <- "/.../feature-table-w-taxonomy-w-md.biom"
tre_fpath <- "/.../200_18S_tree_mdp_root.tre"
seq_fpath <- "/.../dna-sequences.fasta"

# read data into R
phsq <- import_biom(bim_fpath)
tre <- ape::read.tree(tre_fpath)
fas <- Biostrings::readDNAStringSet(seq_fpath)

# construct object  
phsq <- merge_phyloseq(phsq, tre, fas)
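
If `ape::read.tree` complains about the tree file, one thing to check is whether the file pulled out of the `.qza` archive contains only a Newick string, since the archive also holds provenance and metadata files alongside the tree. Below is a rough heuristic in shell, not a Newick validator. The function name `looks_like_newick` and both sample files are made up for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Two hypothetical files: a clean tree, and a tree preceded by stray metadata lines.
tmp=$(mktemp -d)
printf '((feat1:0.1,feat2:0.2):0.05,feat3:0.3);\n' > "$tmp/good.tre"
printf 'uuid: 1234\ntype: Phylogeny[Rooted]\n((feat1:0.1,feat2:0.2):0.05,feat3:0.3);\n' > "$tmp/bad.tre"

# Heuristic only: the first non-empty line should start with "(" and the
# last non-empty line should end with ";".
looks_like_newick() {
  local first last
  first=$(awk 'NF { print; exit }' "$1")
  last=$(awk 'NF { line = $0 } END { print line }' "$1")
  [[ $first == \(* && $last == *\; ]]
}

for f in "$tmp/good.tre" "$tmp/bad.tre"; do
  if looks_like_newick "$f"; then
    printf '%s: looks like Newick\n' "${f##*/}"
  else
    printf '%s: unexpected content, inspect before loading\n' "${f##*/}"
  fi
done
```

A file that fails this check is worth opening in a text editor before it goes into `merge_phyloseq`.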

Thanks,

Paul
