Final OTU tables with DNA sequences included

Hi. Thanks to all of you for your awesome machine-learning algorithm applied to DNA sequencing! I am just wondering about the final OTU tables. I would like them to contain the taxonomic IDs and counts, which I am able to produce so far. However, it would be wonderful to also have the corresponding sequences, aligned with their tax IDs and per-sample counts. Is there a way of doing this? I am able to convert my classifier into a table, but am not sure how to merge this with our actual sequencing fragments. Any hints? Thank you! :smiley:

Hey Carla!

What method did you use to produce your OTU tables?

I first turned all of my feature-tables into grouped biom artifacts with this command:

qiime feature-table group \
    --i-table feature-table-v7v9.qza \
    --p-axis sample \
    --m-metadata-column new-id \
    --m-metadata-file s_clotMetadata.txt \
    --p-mode sum \
    --o-grouped-table v7v9-biom.qza

Then after I finished doing this for each amplicon I merged everything:

qiime feature-table merge \
    --p-overlap-method sum \
    --i-tables v1v4-biom.qza \
    --i-tables v1v5-biom.qza \
    --i-tables v1v8-biom.qza \
    --i-tables v1v9-biom.qza \
    --i-tables v3v4-biom.qza \
    --i-tables v3v5-biom.qza \
    --i-tables v3v8-biom.qza \
    --i-tables v3v9-biom.qza \
    --i-tables v4v4-biom.qza \
    --i-tables v4v5-biom.qza \
    --i-tables v4v8-biom.qza \
    --i-tables v4v9-biom.qza \
    --i-tables v6v4-biom.qza \
    --i-tables v6v5-biom.qza \
    --i-tables v6v8-biom.qza \
    --i-tables v6v9-biom.qza \
    --i-tables v7v4-biom.qza \
    --i-tables v7v5-biom.qza \
    --i-tables v7v8-biom.qza \
    --i-tables v7v9-biom.qza \
    --o-merged-table clots-merged.qza

I would love to have a table with the sequences, taxonomies, and abundances. Is this possible with qiime2? Thank you!

Great!

Abundances

In order to get an OTU table, you can export your merged QIIME 2 artifact: qiime tools export --input-path clots-merged.qza --output-path clots-merged-export. Note that --output-path is a directory; the table is written inside it as feature-table.biom.

You can then convert this biom file to a tsv with: biom convert -i clots-merged-export/feature-table.biom -o clots-merged.tsv --to-tsv.

You can then use this OTU table to get abundances for each OTU in each sample.
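
If you just want a quick look at the abundances from the command line, something like the sketch below might help. It assumes the TSV written by biom convert starts with a "# Constructed from biom file" comment line followed by a "#OTU ID" header row, which is what I would expect but is worth checking on your own file. The sample file and the otu_totals helper are made up for illustration; they are not part of QIIME 2 or biom.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for clots-merged.tsv so the sketch runs without QIIME 2.
abund_demo=$(mktemp -d)
printf '# Constructed from biom file\n#OTU ID\tS1\tS2\nf1\t10.0\t0.0\nf2\t3.0\t7.0\n' \
    > "$abund_demo/clots-merged.tsv"

# otu_totals FILE: print "feature-id<TAB>total count" for every OTU row.
otu_totals() {
    awk -F'\t' '
        /^#/ { next }                        # skip the comment and header lines
        {
            total = 0
            for (i = 2; i <= NF; i++) total += $i   # sum over all sample columns
            printf "%s\t%d\n", $1, total
        }
    ' "$1"
}

otu_totals "$abund_demo/clots-merged.tsv"
```

On your real table you would call otu_totals clots-merged.tsv instead of the demo file.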

Taxonomies

You can then classify your sequences using one of the feature-classifiers listed here: https://docs.qiime2.org/2020.6/plugins/available/feature-classifier/ and similarly export the resulting qiime artifact for a table of classifications for each sequence.
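
Once both tables are exported, one way to line the classifications up with the abundances outside of QIIME 2 is a small awk join on the feature ID. This is only a sketch: it assumes the exported taxonomy is a taxonomy.tsv with a "Feature ID / Taxon / Confidence" header and that the OTU table looks like the biom convert output above. The demo files, the join_taxonomy helper, and the "NA" placeholder for unclassified features are all hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical demo inputs so the sketch runs without QIIME 2.
tax_demo=$(mktemp -d)
printf 'Feature ID\tTaxon\tConfidence\nf1\tk__Bacteria; p__Firmicutes\t0.99\n' \
    > "$tax_demo/taxonomy.tsv"
printf '# Constructed from biom file\n#OTU ID\tS1\tS2\nf1\t10.0\t0.0\nf2\t3.0\t7.0\n' \
    > "$tax_demo/clots-merged.tsv"

# join_taxonomy TAXONOMY_TSV OTU_TSV: append a Taxon column to the OTU table.
join_taxonomy() {
    awk -F'\t' -v OFS='\t' '
        NR == FNR { if (FNR > 1) tax[$1] = $2; next }   # first file: remember taxon per feature ID
        /^# Constructed/ { next }                       # drop the biom comment line
        /^#OTU ID/ { print $0, "Taxon"; next }          # extend the header row
        { print $0, (($1 in tax) ? tax[$1] : "NA") }    # NA = no classification found (assumption)
    ' "$1" "$2"
}

join_taxonomy "$tax_demo/taxonomy.tsv" "$tax_demo/clots-merged.tsv"
```

Because the lookup is keyed on the feature ID, the two files do not need to be sorted the same way.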

I have a pure-Q2 solution that will give you the visualization you are after. Better yet, you can also merge other feature data, like the sequence, and create a searchable table... I think including the sequence in the output is something that you wanted that is not addressed in @michaelsilverstein's great advice:

So here is how to do it:

First transpose your feature table (so that feature IDs are the index, i.e., row IDs):

qiime feature-table transpose \
    --i-table table.qza \
    --o-transposed-feature-table table-transposed.qza  

Then you can merge this feature table with feature metadata (taxonomy, sequences, anything else you like) like so:

qiime metadata tabulate \
    --m-input-file taxonomy.qza \
    --m-input-file sequences.qza \
    --m-input-file table-transposed.qza \
    --o-visualization table-with-metadata.qzv

Note that the order of the --m-input-file arguments determines the order in which their columns appear in the output.

Maybe you would even like to include feature importances in that output visualization?

qiime metadata tabulate \
    --m-input-file taxonomy.qza \
    --m-input-file sequences.qza \
    --m-input-file feature-importances.qza \
    --m-input-file table-transposed.qza \
    --o-visualization table-with-metadata.qzv

Give that a spin and let me know what you think :grin:

It is WAY cool but can it be used directly in plugins like DEICODE? :upside_down_face:

Best,
Carla

No, but this is unrelated to your original question. DEICODE takes a FeatureTable[Frequency] as input, so additional information like sequence, taxonomy, etc., is not allowed as input, no matter how you create that table.

That said, note that the visualization I described above will allow you to download the contents as a TSV, e.g., to import into R, Excel, etc., to manually modify or use as input for other methods outside of QIIME 2.
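
One thing to watch for when importing that download elsewhere: as far as I know, QIIME 2 metadata TSVs carry a second "#q2:types" row describing the column types, which R or Excel would read as a data row. A minimal sketch for dropping it is below. The demo file, its column names, and the strip_q2_types helper are hypothetical; check the first few lines of your own download before relying on this.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for the TSV downloaded from table-with-metadata.qzv.
dl_demo=$(mktemp -d)
printf 'id\tTaxon\tSequence\tS1\n#q2:types\tcategorical\tcategorical\tnumeric\nf1\tk__Bacteria\tACGT\t10.0\n' \
    > "$dl_demo/metadata.tsv"

# strip_q2_types FILE: print the table without the "#q2:types" directive row.
strip_q2_types() {
    grep -v '^#q2:types' "$1"
}

strip_q2_types "$dl_demo/metadata.tsv" > "$dl_demo/metadata-clean.tsv"
cat "$dl_demo/metadata-clean.tsv"
```

The cleaned file keeps the header row and every feature row, so it should load as an ordinary tab-separated table.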