I’m using qiime2-2023.5 installed via conda, and I am trying to extract some information from my QIIME 2 artifacts. What I need is similar to the information I already have from the rep-seqs, taxonomy and table visualizations, but with added detail in the samples column.
At the moment I have the table visualization created by:
qiime feature-table summarize \
--i-table cd_16s_table.qza \
--o-visualization cd_16s_table-viz.qzv
Which gives me this:
And the table of taxonomy and rep-sequences created by the command:
qiime metadata tabulate \
--m-input-file cd_16s_taxonomy_results.qza \
--m-input-file cd_16s_rep-seqs.qza \
Which results in this:
What I would like is something with four columns: Feature-ID, Sequence, Taxon, Samples observed in, where ‘Samples observed in’ is not a number but a list of the names of the samples the feature is found in. So the table I need would look like the following:
A similar question was asked here, and someone mentioned using sample-id instead of feature-id as the index. However, I would like to retain Feature-ID.
qiime feature-table tabulate-seqs will get you feature-id, sequence, and taxon in one table. You will need to pass in your rep-seqs and your taxonomy.
qiime feature-table tabulate-seqs can add metadata as well, so if you had metadata containing the 'samples observed in' info, it could create this table.
Unfortunately, there is no easy way to create a 'samples observed in' list as metadata in QIIME 2. This might be something you could do in R or in Python.
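As a rough illustration of the Python route, here is a minimal sketch that builds the 'samples observed in' list for each feature. It assumes the feature table has already been exported and converted to a tab-separated file with a "# Constructed from biom file" comment line followed by a "#OTU ID" header row. The function name samples_observed_in and the example data are made up for illustration.

```python
import csv


def samples_observed_in(tsv_text):
    """Map each feature ID to the list of samples where its count is > 0."""
    # Drop blank lines and the leading "# Constructed from biom file" comment.
    # The header row starts with "#OTU ID" (no space after '#'), so it is kept.
    lines = [ln for ln in tsv_text.splitlines()
             if ln.strip() and not ln.startswith("# ")]
    reader = csv.reader(lines, delimiter="\t")
    sample_ids = next(reader)[1:]  # first column is the feature ID
    observed = {}
    for row in reader:
        feature_id, counts = row[0], row[1:]
        observed[feature_id] = [
            sample for sample, count in zip(sample_ids, counts)
            if float(count) > 0
        ]
    return observed


# Hypothetical example data in the assumed layout
example = (
    "# Constructed from biom file\n"
    "#OTU ID\tSampleA\tSampleB\tSampleC\n"
    "asv1\t12.0\t0.0\t3.0\n"
    "asv2\t0.0\t0.0\t7.0\n"
)

for feature_id, samples in samples_observed_in(example).items():
    print(feature_id, ", ".join(samples), sep="\t")
```

With a real file you would pass in the file's contents, e.g. open("feature-table.tsv").read(). The resulting lists could then be written out as a feature metadata column if you want them alongside the sequences and taxonomy.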
I hope that helps!
Thanks very much. Just in case anyone else finds this and needs some other format, or needs to extract which samples had which ASVs: I actually decided to export my taxonomy and table information:
qiime tools export \
--input-path cd_16s_table.qza \
qiime tools export \
--input-path cd_16s_taxonomy_results.qza \
Then I converted the feature table, as it's in BIOM format:
biom convert -i feature-table.biom -o feature-table.tsv --to-tsv
I'm writing some Python code to merge the files on Feature ID. I'll just keep the per-sample readout from feature-table.tsv (much more sensible!). So my table will eventually have the columns Feature-ID, Sequence, Taxonomy, SampleA, SampleB, SampleC, etc., where the sample columns contain the number of times that ASV was found in that sample.
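For anyone wanting a starting point, the merge could look something like the sketch below. It is not the script from this thread. It assumes the exported taxonomy is a TSV with "Feature ID" and "Taxon" columns, that the sequences are available as a FASTA file whose record IDs are the feature IDs, and that the converted table has a "#OTU ID" header row. All function names and example data here are hypothetical.

```python
import csv


def read_fasta(lines):
    """Return {record_id: sequence} from FASTA lines (multi-line records allowed)."""
    seqs, current = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            seqs[current] = ""
        elif current is not None:
            seqs[current] += line
    return seqs


def read_taxonomy(lines):
    """Return {feature_id: taxon} from a TSV with 'Feature ID' and 'Taxon' columns."""
    reader = csv.DictReader(lines, delimiter="\t")
    return {row["Feature ID"]: row["Taxon"] for row in reader}


def merge_on_feature_id(table_lines, taxonomy, seqs):
    """Build rows: Feature-ID, Sequence, Taxonomy, then one count column per sample."""
    # Skip blank lines and the "# Constructed from biom file" comment line.
    rows = [ln for ln in table_lines if ln.strip() and not ln.startswith("# ")]
    reader = csv.reader(rows, delimiter="\t")
    sample_ids = next(reader)[1:]
    merged = [["Feature-ID", "Sequence", "Taxonomy"] + sample_ids]
    for row in reader:
        fid = row[0]
        # Missing sequence or taxonomy entries are left empty rather than dropped.
        merged.append([fid, seqs.get(fid, ""), taxonomy.get(fid, "")] + row[1:])
    return merged


# Hypothetical example data in the assumed layouts
table_tsv = ("# Constructed from biom file\n"
             "#OTU ID\tSampleA\tSampleB\n"
             "asv1\t12.0\t0.0\n"
             "asv2\t0.0\t7.0\n")
taxonomy_tsv = ("Feature ID\tTaxon\tConfidence\n"
                "asv1\tk__Bacteria; p__Firmicutes\t0.99\n"
                "asv2\tk__Bacteria\t0.87\n")
fasta = ">asv1\nACGT\nTTGA\n>asv2\nGGCC\n"

merged = merge_on_feature_id(
    table_tsv.splitlines(),
    read_taxonomy(taxonomy_tsv.splitlines()),
    read_fasta(fasta.splitlines()),
)
for row in merged:
    print("\t".join(row))
```

Keeping the per-sample counts as columns means the 'samples observed in' list can still be derived later by checking which counts are non-zero.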