How to add taxonomy to feature-table.qza and get the OTU ID of each taxon and the number of OTUs

Hi @Yanfei-Geng,
I think I understand - it sounds like you’d like to know which genbank ids were most similar to your sequences (in the rep-seqs.qza file). Is that right? We unfortunately don’t have a way to get that information at the moment, though it is something we would like to add. This is in part because, with taxonomy assignment, typically more than one of the reference sequences is informing each individual taxonomy assignment.

If I understand what you’re looking for, you may be able to get this information using vsearch directly (not through QIIME 2). vsearch will already be installed in your QIIME 2 environment. A command that I’ve used for this before is:

vsearch --db dna-sequences.fasta \
  --usearch_global queries.fasta \
  --alnout out.aln \
  --blast6out out.bl6 \
  --id 0.0 \
  --maxaccepts 10 \
  --qmask none

This will do a BLAST-like search of all of the sequences in queries.fasta against the sequences in dna-sequences.fasta, and will report up to 10 matches for each sequence in queries.fasta (that limit comes from --maxaccepts 10; --id 0.0 means no minimum percent identity is required for a match). The alignments are written to out.aln and a tab-separated summary of the hits to out.bl6. We plan to make this accessible as a method in QIIME 2, but we haven’t done that yet. If this does do what you’re looking for, can you let me know? That type of feedback helps us to prioritize new functionality.
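If it helps, here is a minimal Python sketch for pulling the closest reference ids per query out of out.bl6. It assumes the file has the standard 12 tab-separated BLAST tabular columns (query id, target id, percent identity, and so on). The function name top_hits and the example rows (q1, AB000001, etc.) are made up for illustration and are not real search results.

```python
import csv
import io
from collections import defaultdict


def top_hits(handle, n=3):
    """Return {query_id: [(target_id, percent_identity), ...]}, best hits first.

    Assumes 12 tab-separated BLAST tabular columns, where column 1 is the
    query id, column 2 the target id, and column 3 the percent identity.
    """
    hits = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank or malformed lines and queries reported without a hit.
        if len(row) < 12 or row[1] == "*":
            continue
        hits[row[0]].append((row[1], float(row[2])))
    # Sort each query's hits by percent identity, highest first, and keep n.
    return {
        query: sorted(found, key=lambda hit: hit[1], reverse=True)[:n]
        for query, found in hits.items()
    }


# Invented example rows (not real output); for real data use open("out.bl6").
example = (
    "q1\tAB000002\t97.6\t250\t6\t0\t1\t250\t1\t250\t-1\t0\n"
    "q1\tAB000001\t99.2\t250\t2\t0\t1\t250\t1\t250\t-1\t0\n"
    "q2\tAB000003\t88.0\t250\t30\t0\t1\t250\t1\t250\t-1\t0\n"
)
best = top_hits(io.StringIO(example), n=1)
print(best)
```

Keep in mind that several reference sequences can be equally good matches, so the single top hit is not necessarily the only sequence that would inform a taxonomy assignment.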

Dear @gregcaporaso,
What I am looking for is exactly what you mentioned in your answer. I am working on a project on the diets of wild animals. The plants in the animal feces were analysed with vsearch, but the rumen bacterial community was analysed with QIIME 2. So it is hard for me to run a Mantel test between plant diversity and bacterial diversity. Thank you again for your help.

Great!

I'm happy to try to help with that if you explain the issue that you're having. This should definitely be possible to achieve with QIIME 2 if you have a feature table where the features are plant taxa (e.g., species identifiers) and the samples are the same (i.e., have the same identifiers) as the ones for which you have bacterial data.
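To illustrate why the sample identifiers need to match, here is a rough pure-Python sketch of a Mantel test on two distance matrices keyed by sample id. It is only a sketch under stated assumptions, not what QIIME 2 runs internally (QIIME 2 has its own `qiime diversity mantel` command): it uses Pearson correlation with a two-sided permutation test, and the sample ids (S1 to S5), the toy distances, and the helper names flatten, pearson, mantel and distances are all hypothetical.

```python
import math
import random


def flatten(dm, ids):
    """Upper-triangle distances of dm (dict of dicts) in the order given by ids."""
    return [dm[a][b] for i, a in enumerate(ids) for b in ids[i + 1:]]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mantel(dm1, dm2, permutations=999, seed=0):
    """Return (r, p) for two symmetric distance matrices keyed by sample id."""
    ids = sorted(set(dm1) & set(dm2))  # only samples present in both matrices
    if len(ids) < 3:
        raise ValueError("need at least 3 shared sample ids")
    y = flatten(dm2, ids)
    observed = pearson(flatten(dm1, ids), y)
    rng = random.Random(seed)
    shuffled = list(ids)
    extreme = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # relabel the samples of dm1
        if abs(pearson(flatten(dm1, shuffled), y)) >= abs(observed):
            extreme += 1
    # Two-sided p-value; the +1 counts the observed arrangement itself.
    return observed, (extreme + 1) / (permutations + 1)


def distances(coords):
    """Toy symmetric distance matrix built from 1-D coordinates."""
    return {a: {b: abs(x - z) for b, z in coords.items()} for a, x in coords.items()}


# Hypothetical data: S5 exists only in the bacterial matrix and is ignored.
plant_dm = distances({"S1": 0, "S2": 1, "S3": 3, "S4": 7})
bacteria_dm = distances({"S1": 0, "S2": 2, "S3": 6, "S4": 14, "S5": 40})
r, p = mantel(plant_dm, bacteria_dm)
print(r, p)
```

Only samples whose ids appear in both matrices contribute, which is why the plant and bacterial tables need to use the same sample identifiers.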

What confuses me now is no longer the Mantel test; I finally found a way to do that. I wonder how I can get an OTU table, rather than the taxon table for every sample (like in the image)? Best.

Hi @Yanfei-Geng, I think the OTU table that you’re looking for should be the feature-table-2.qza artifact that you’ve been working with. You could export that with qiime tools export, and then convert the resulting .biom file to tab-separated text with biom convert --to-tsv. Would that get you what you need? If not, can you describe what the features are in your feature-table-2.qza file, and what you would like the features to be in the table you’re trying to generate?

Hi @gregcaporaso, the output I get from the approach you suggested looks like this.

But the table I want looks like the following. Is this possible?

Hi @gregcaporaso,
I found something in alpha-rarefaction.qzv.

After downloading it, the table looks like this. Did I understand this table correctly? Thank you so much.

@Yanfei-Geng, I don’t think the data from alpha-rarefaction is exactly what you’re looking for. Does the following get you what you need? The table.qza file that I’m starting with is available here.

$ qiime tools export table.qza --output-dir ./
$ biom convert -i feature-table.biom -o feature-table.tsv --to-tsv
$ biom head -i feature-table.tsv
# Constructed from biom file
#OTU ID	L1S105	L1S140	L1S208	L1S257	L1S281
4b5eeb300368260019c1fbc7a3c718fc	2222.0	0.0	0.0	0.0	0.0
fe30ff0f71a38a39cf1717ec2be3a2fc	5.0	0.0	0.0	0.0	0.0
d29fe3c70564fc0f69f2c03e0d1e5561	0.0	0.0	0.0	0.0	0.0
868528ca947bc57b69ffdf83e6b73bae	0.0	2276.0	2156.0	1205.0	1772.0
154709e160e8cada6bfb21115acc80f5	812.0	1176.0	713.0	407.0	242.0 

Here, the first column contains feature or OTU ids (e.g., 4b5eeb300368260019c1fbc7a3c718fc), the first row contains sample ids (e.g., L1S105), and the values in the table indicate the number of times each feature was observed in each sample. So, in this case, feature 4b5eeb300368260019c1fbc7a3c718fc was observed 2222 times in sample L1S105, and zero times in each of the other samples.
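If you want to work with that table programmatically, the layout described above can be read with a few lines of Python. This is just a sketch: it assumes the tab-separated layout shown in the output above (a comment line, then a header row starting with #OTU ID), and the function name read_feature_table is my own. The example rows are copied from the output above.

```python
import csv
import io


def read_feature_table(handle):
    """Return (sample_ids, counts) where counts[feature_id][sample_id] is a float."""
    sample_ids = None
    counts = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        if row[0].startswith("#"):
            # The header row names the samples; other comment lines are skipped.
            if row[0] == "#OTU ID":
                sample_ids = row[1:]
            continue
        if sample_ids is None:
            raise ValueError("no '#OTU ID' header row found before the data")
        counts[row[0]] = dict(zip(sample_ids, (float(v) for v in row[1:])))
    return sample_ids, counts


# Rows copied from the example output; for a real file use open("feature-table.tsv").
example = "\n".join([
    "# Constructed from biom file",
    "#OTU ID\tL1S105\tL1S140\tL1S208\tL1S257\tL1S281",
    "4b5eeb300368260019c1fbc7a3c718fc\t2222.0\t0.0\t0.0\t0.0\t0.0",
    "fe30ff0f71a38a39cf1717ec2be3a2fc\t5.0\t0.0\t0.0\t0.0\t0.0",
    "d29fe3c70564fc0f69f2c03e0d1e5561\t0.0\t0.0\t0.0\t0.0\t0.0",
    "868528ca947bc57b69ffdf83e6b73bae\t0.0\t2276.0\t2156.0\t1205.0\t1772.0",
    "154709e160e8cada6bfb21115acc80f5\t812.0\t1176.0\t713.0\t407.0\t242.0",
])
sample_ids, counts = read_feature_table(io.StringIO(example))

# Number of features (OTUs) observed at least once in each sample.
observed = {
    sample: sum(1 for per_sample in counts.values() if per_sample[sample] > 0)
    for sample in sample_ids
}
print(len(counts), observed)
```

Here len(counts) is the total number of features in the table, and observed gives the number of features seen at least once in each sample.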
