Combine the taxonomy table with the ASV count table

I want to use the RDP database, so I exported rep-seqs.qza to a FASTA file and used the assignTaxonomy function in R to assign taxonomy.
I need to create a file like the file below:


How can I create a file that also includes the sample names and their counts? And how do I import it into QIIME 2 later?


Hey there @SarJur!

Great! This is one of the design goals of QIIME 2 - you should be able to supplement your QIIME 2 analysis with other tools, via exporting and importing from QIIME 2.

I think we can get you very close to something that looks like that, using just QIIME 2!

I'll start off with a few assumptions: you need a feature table, representative sequences, and taxonomy data. It sounds like you have all of that on hand, so let's start with getting it all in order. In QIIME 2 we opt for keeping these kinds of data files separate, because this allows for more portability and flexibility downstream in your analysis.

Representative Sequences

You mentioned above that you have representative sequences (FeatureData[Sequence]) from QIIME 2, so I don't think you'll need to import anything here, but in case you do, the importing guide has a section on this type of data.

Feature Table

I am assuming you have a FeatureTable[Frequency] artifact from QIIME 2 (produced at the same step that made your representative sequences, above). If you don't, you'll need to import it as well.

Taxonomy Data

This is what you computed using the RDP database outside of QIIME 2. You'll need to import it as a FeatureData[Taxonomy] artifact. The specifics are up in the air here: we will need to see an example of the taxonomy data you computed outside of QIIME 2 in order to help with this step.
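Until we see the real file, here is only a rough sketch of what that import might look like. It assumes the R output has been written out as a two-column, tab-separated file with no header: the feature ID (the same IDs used in rep-seqs.qza) and the taxonomic ranks joined with "; ". The file name rdp-taxonomy.tsv and its two rows are invented for illustration, and your actual file may well need a different layout.

```shell
# Hypothetical two-column file: feature ID <TAB> semicolon-joined taxonomy.
# The IDs must match those in rep-seqs.qza; these rows are made up.
printf 'feature-1\tBacteria; Firmicutes; Bacilli\n'  > rdp-taxonomy.tsv
printf 'feature-2\tBacteria; Proteobacteria\n'      >> rdp-taxonomy.tsv

# Sanity check: every line should have exactly two tab-separated fields.
awk -F'\t' 'NF != 2 { bad++ } END { exit (bad > 0) }' rdp-taxonomy.tsv \
  && echo "taxonomy file looks OK"

# Import as FeatureData[Taxonomy] (skipped if QIIME 2 is not on the PATH).
if command -v qiime >/dev/null 2>&1; then
  qiime tools import \
    --type 'FeatureData[Taxonomy]' \
    --input-format HeaderlessTSVTaxonomyFormat \
    --input-path rdp-taxonomy.tsv \
    --output-path taxonomy.qza
fi
```

If the IDs in the first column don't match the feature IDs in your feature table and representative sequences, the later merge won't line up, so that is worth checking first.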

Example

I'll use the Moving Pictures dataset to demonstrate.

First I'll download the data I'll need. You should already have all of this available, if you followed my suggestions above regarding the representative sequences, feature table, and taxonomy data.

wget https://docs.qiime2.org/2021.2/data/tutorials/moving-pictures/rep-seqs.qza
wget https://docs.qiime2.org/2021.2/data/tutorials/moving-pictures/table.qza
wget https://docs.qiime2.org/2021.2/data/tutorials/moving-pictures/taxonomy.qza

The next thing I'll do is transpose the feature table, to get it in the orientation you have in your screenshot, above:

qiime feature-table transpose \
  --i-table table.qza \
  --o-transposed-feature-table transposed-table.qza

I think we have everything in order now - time to generate the final table! We can do this by running a tool called metadata tabulate, which will merge one or more files that can be "viewed" as metadata. As you might have guessed, the three inputs we are working with above are all viewable as metadata.

qiime metadata tabulate \
  --m-input-file rep-seqs.qza \
  --m-input-file taxonomy.qza \
  --m-input-file transposed-table.qza \
  --o-visualization merged-data.qzv

qiime tools export \
  --input-path merged-data.qzv \
  --output-path merged-data

Finally, I'll open up the merged-data/metadata.tsv file in my favorite spreadsheet editor:

There we go! Things aren't identical to what you posted above: there are some extra columns, and the taxonomy is consolidated into one column, rather than one per taxonomic level. If you need the data in precisely that format, I would suggest editing this spreadsheet to get what you need. Unfortunately we can't really help with those specifics - we are QIIME 2 experts here, not Excel experts.
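If a spreadsheet isn't convenient, the one-column-per-level split could also be sketched with awk. This is only an illustration under a few assumptions: the exported file is tab-separated, it has a column named Taxon holding ranks separated by semicolons, and its second line is the `#q2:types` row. The Rank1..Rank7 headers and the file names sample.tsv and split-taxonomy.tsv are made up; a tiny sample file stands in for merged-data/metadata.tsv.

```shell
# Tiny stand-in for merged-data/metadata.tsv (made-up ID, sequence and count).
printf 'id\tSequence\tTaxon\tConfidence\tL1S8\n' > sample.tsv
printf '#q2:types\tcategorical\tcategorical\tnumeric\tnumeric\n' >> sample.tsv
printf 'f1\tACGT\tk__Bacteria; p__Firmicutes; c__Bacilli\t0.99\t12\n' >> sample.tsv

# Replace the single Taxon column with seven rank columns (Rank1..Rank7 are
# hypothetical header names); ranks that are missing are left empty.
awk -F'\t' -v OFS='\t' '
NR == 1 {
    # locate the Taxon column by name rather than by position
    for (i = 1; i <= NF; i++) if ($i == "Taxon") t = i
}
/^#q2:types/ { next }    # drop the QIIME 2 column-type line
{
    n = split($t, ranks, /; */)
    out = ""; sep = ""
    for (i = 1; i <= NF; i++) {
        if (i == t) {
            for (r = 1; r <= 7; r++) {
                cell = (NR == 1) ? ("Rank" r) : ((r <= n) ? ranks[r] : "")
                out = out sep cell; sep = OFS
            }
        } else {
            out = out sep $i; sep = OFS
        }
    }
    print out
}' sample.tsv > split-taxonomy.tsv
```

To try it on the real export, point the awk command at merged-data/metadata.tsv instead of sample.tsv. Seven ranks is a guess based on kingdom-to-species taxonomies; adjust the loop bound if your RDP output has a different depth.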

My final note is a question to you - what are you looking to do with this file? I wonder if you have a specific tool (outside of QIIME 2) you want to use, and it needs this format? Or perhaps you're just looking for this kind of file as more of a summary, an overview of the data?

Let us know!


