How to handle downstream analysis of .BIOM files and feature tables?

Sean_K_Bay · January 2, 2018, 5:39pm

Hi,

QIIME 2 is running beautifully on my virtual box and I just completed the Moving Pictures Tutorial …

Now I want to repeat this using my own data, which consists of non-barcoded, paired-end, demultiplexed FASTQ files (R1/R2) that were quality filtered using QIIME 1 by my sequencing facility.

I now have a normalised and filtered .biom file, which I successfully imported using

qiime tools import \
  --input-path otu_table_norm_filtered.biom \
  --output-path feature-table.qza \
  --source-format BIOMV100Format \
  --type 'FeatureTable[Frequency]'

and summarized

biom summarize-table -i otu_table_norm_filtered.biom --qualitative -o otu_table_norm_filtered_qual_summary.txt

Num samples: 4
Num observations: 344
Observations/sample summary:

Min: 147.000
Max: 331.000
Median: 230.500
Mean: 234.750
Std. dev.: 68.893
Sample Metadata Categories: None provided
Observation Metadata Categories: None provided
Observations/sample detail:

SB3537: 147.000
SB3536: 199.000
SB3534: 262.000
SB3535: 331.000

All good so far…at this stage the only artefact I have from my BIOM is:
feature-table.qza

But to continue with phylogenetic diversity analysis I need FeatureData[Taxonomy]. Presumably I can get this from the taxonomy .tab file that was provided with the initial .biom file?

If so, how would I import a tab-delimited file, and what format (column headings) should it be in?

Many thanks in advance.

Cheers,
Sean

wasade · January 2, 2018, 7:28pm

Hi @Sean_K_Bay,

That’s great to hear!

For the phylogenetic analyses, you’ll need to provide a phylogenetic tree or generate one from FeatureData[Sequence]. While taxonomy and phylogeny are related, they aren’t one-to-one, and the taxonomy does not include estimates of divergence. In the Moving Pictures tutorial, the phylogeny is estimated using a de novo reconstruction from the sequence fragments. However, your trajectory here will depend on how the FeatureTable[Frequency] was constructed. If the table was produced using closed-reference OTU picking in QIIME 1, then you should be able to import the existing reference tree as Phylogeny[Rooted] and provide that to downstream analyses. Similarly, if the OTUs were assessed using open-reference OTU picking, then you can import the existing rep_set.tre file as Phylogeny[Rooted] and proceed.
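
As a rough sketch of the import step described above: this assumes the QIIME 1 tree is a Newick file named rep_set.tre in the current directory, and the output name rooted-tree.qza is only a placeholder. The guard skips the import when qiime or the tree file is not available, so the snippet is safe to run anywhere.

```shell
# Assumed file names; adjust to match your own QIIME 1 output.
TREE="rep_set.tre"
OUT="rooted-tree.qza"

if command -v qiime >/dev/null 2>&1 && [ -f "$TREE" ]; then
  # Import the existing Newick tree as a rooted phylogeny artifact.
  qiime tools import \
    --input-path "$TREE" \
    --output-path "$OUT" \
    --type 'Phylogeny[Rooted]'
else
  echo "skipping import: qiime or $TREE not available"
fi
```

The tip labels in the tree need to match the feature IDs in the table for downstream phylogenetic diversity steps to work.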

Hope that helps, let me know how it goes!

Best,
Daniel

Sean_K_Bay · January 3, 2018, 3:34pm

Thanks Daniel,

I finally worked it out and was able to complete the Moving Pictures tutorial steps using my own data.

Maybe my approach is slightly convoluted, but it appears to work. Here is what I did:

For FeatureData[Sequence] I created a spreadsheet in Excel consisting of two columns, OTU ID and RepSeq, saved it as tab-delimited rep_seq.txt, then converted it to FASTA and saved it as rep_seq.fna.

qiime tools import \
  --input-path rep_seq.fna \
  --output-path rep-seqs.qza \
  --type 'FeatureData[Sequence]'
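
The post does not say how the tab-delimited file was converted to FASTA. One possible way, as a minimal sketch: it assumes a two-column file with a heading row, and the demo file name, OTU IDs, and sequences below are made-up placeholders rather than real data.

```shell
# Demo input with placeholder OTU IDs and sequences (not real data).
printf 'OTU ID\tRepSeq\nOTU_1\tACGTACGT\nOTU_2\tTTGGCCAA\n' > rep_seq_demo.txt

# Skip the heading row, strip any Windows carriage returns left by Excel,
# then write each row as a FASTA record: ">ID" on one line, sequence on the next.
awk -F'\t' 'NR > 1 { sub(/\r$/, ""); if (NF >= 2) printf ">%s\n%s\n", $1, $2 }' \
  rep_seq_demo.txt > rep_seq_demo.fna

cat rep_seq_demo.fna
```

The resulting .fna file is what would be passed to --input-path in the import command above.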

For FeatureData[Taxonomy] I did as above with two columns, OTU ID and Taxon, saved as tab-delimited taxonomy.txt.

qiime tools import \
  --type 'FeatureData[Taxonomy]' \
  --source-format HeaderlessTSVTaxonomyFormat \
  --input-path taxonomy.txt \
  --output-path ref-taxonomy.qza
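
One thing that may be worth checking, given the format name HeaderlessTSVTaxonomyFormat: if taxonomy.txt still carries the OTU ID / Taxon heading row from the spreadsheet, that row would presumably be read as a feature. A minimal sketch for dropping it before import, using a made-up demo file (the file names and taxonomy strings are placeholders):

```shell
# Demo input with placeholder OTU IDs and taxonomy strings (not real data).
printf 'OTU ID\tTaxon\nOTU_1\tk__Bacteria; p__Firmicutes\nOTU_2\tk__Bacteria; p__Proteobacteria\n' \
  > taxonomy_demo.txt

# Drop the first line only if it looks like the heading row,
# and strip any Windows carriage returns from the remaining lines.
awk 'NR == 1 && /^OTU ID\t/ { next } { sub(/\r$/, ""); print }' \
  taxonomy_demo.txt > taxonomy_demo_headerless.txt

cat taxonomy_demo_headerless.txt
```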

I also created a metadata file containing information about my samples, sites, conditions, etc.
I actually did this in Keemei and exported it as Metadata.tsv.

qiime metadata tabulate \
  --m-input-file Metadata.tsv \
  --o-visualization tabulated-sample-metadata.qzv

One thing I still need to figure out is how to format text in this forum… ah well, maybe one day.

Thanks again,

Sean

system · February 3, 2018, 9:34pm

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.