# What is the simplest way of working out how many representative sequences exist in each sample of a QIIME2 experiment?

I understand that I can obtain the total number of rep-seqs (or what used to be called OTUs) across the entire experiment, and the taxon (if any) that they are applied to, but for the life of me I can’t work out a simple way of finding out how many total rep-seqs exist in each individual sample (since some rep-seqs occur in some samples but not others).

Addendum: I understand that I can get this information from the CSV file that is derived from the barplots, but this only gives me the number of rep-seqs in each sample that have been assigned to a taxonomic group. All the “unassigned” rep-seqs are lumped into a single column, but within the “unassigned” group there will be multiple different rep-seqs that have not been assigned to a taxonomic group.

Off the top of my head, I don’t think there is an immediate built-in way to get this in qiime2, but there are simple enough workarounds. What you are describing is essentially the ‘richness’ of each sample.

1. If you are comfortable with R (or even Excel), you can export your feature table and count the non-zero entries for each sample. That also gives you the flexibility to do whatever else you want with the data.
2. If you would rather stay within qiime2, you can get the same values from the `diversity alpha` command with the metric set to `observed_otus` (renamed `observed_features` in newer releases). Exporting the resulting artifact gives you a table with the per-sample ‘rep-seqs’ numbers.
Hope that helps.
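
For option 1, here is a rough sketch of the counting step in Python rather than R. It assumes the feature table has already been exported and converted to a biom-style TSV (one comment line, then a header row, with features as rows and samples as columns). The function name `richness_per_sample` and the tiny example table are made up for illustration, so adjust the parsing if your export looks different.

```python
import io

import pandas as pd


def richness_per_sample(tsv_source):
    """Count the features with a non-zero count in each sample.

    Assumes a biom-style TSV: one leading comment line, then a header row,
    with features as rows and samples as columns.
    """
    table = pd.read_csv(tsv_source, sep="\t", skiprows=1, index_col=0)
    # Features are rows here, so sum the True values down each sample column.
    return (table > 0).sum(axis=0)


# Made-up table standing in for a real exported feature table.
example_tsv = io.StringIO(
    "# Constructed from biom file\n"
    "#OTU ID\tsampleA\tsampleB\tsampleC\n"
    "feat1\t10\t0\t3\n"
    "feat2\t0\t0\t7\n"
    "feat3\t5\t2\t0\n"
)

richness = richness_per_sample(example_tsv)
print(richness.to_csv(header=False))
```

With a real file you would pass its path instead of the `io.StringIO` object. Unassigned features are counted like any other, because the count is done on the feature table itself and not on the taxonomy-collapsed barplot data.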

Hi Dan,
You can use the Artifact API to do this fairly easily.
The Python script would look like:

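#!/usr/bin/env python
from qiime2 import Artifact
from pandas import DataFrame

# Load the feature table as a DataFrame (samples as rows, features as columns).
# 'table.qza' is a placeholder; point it at your own feature table artifact.
table = Artifact.load('table.qza').view(DataFrame)
# Count the features with a non-zero count in each sample.
rep_seq_count = (table > 0).sum(axis=1)
print(rep_seq_count.to_csv())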

You can put that in a file and run it with `python filename`; it will print a CSV of the samples and the number of rep-seqs in each.
Devin

Absolutely fantastic. Much appreciated.

Thanks, this is a good alternative to try.
