OTU identifiers

Hi,

I'm curious about something:
why are some OTU identifiers long alphanumeric strings, while others are short and purely numeric, as in the earlier QIIME 1?

Thanks a lot

Michela

Good morning Michela,

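Those long identifiers are SHA-1 hashes of the read sequence itself. A sequentially numbered ID such as ASV_1 will point to a different sequence in each study. An ID that is the SHA-1 hash of the sequence will always point to the same sequence in any study, making meta-analysis much more elegant!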

Colin

Yes, I was curious too, but what determines that string? Is the OTU string just produced by a random number generator? Ben

Here’s how to find hashes manually:

>read header
ACTGATGATC

Compute the SHA-1 hash of the sequence:
echo "ACTGATGATC" | shasum -a 1

Now use that hash as the read ID:

>ae4131fcdcd212ff7f94874d53e3f8cffa543533
ACTGATGATC
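
For anyone who would rather script this than hash reads one at a time, here is a minimal Python sketch of the same idea. The function names `sha1_id` and `relabel_fasta` are hypothetical, made up for this illustration, and are not part of QIIME. One detail to be aware of: `echo` appends a newline, so the shell command above hashes the sequence plus a trailing newline. Whether a given tool hashes with or without that newline (or normalizes case first) is an assumption you should check against your own IDs; the `trailing_newline` flag below lets you try both.

```python
import hashlib


def sha1_id(sequence: str, trailing_newline: bool = False) -> str:
    """Return the SHA-1 hex digest of a sequence, for use as a read ID.

    With trailing_newline=True the input bytes match what
    `echo "SEQ" | shasum -a 1` hashes, since echo appends a newline.
    """
    text = sequence + "\n" if trailing_newline else sequence
    return hashlib.sha1(text.encode("ascii")).hexdigest()


def relabel_fasta(fasta_text: str) -> str:
    """Replace every FASTA header with the SHA-1 hash of its sequence.

    Sequences wrapped over several lines are joined before hashing.
    Assumes '>' only appears at the start of header lines.
    """
    records = []
    for record in fasta_text.split(">")[1:]:
        lines = record.strip().splitlines()
        # lines[0] is the old header; the remaining lines are sequence
        sequence = "".join(line.strip() for line in lines[1:])
        records.append(f">{sha1_id(sequence)}\n{sequence}\n")
    return "".join(records)


if __name__ == "__main__":
    print(relabel_fasta(">read header\nACTGATGATC\n"), end="")
```

The hash depends only on the bytes fed in, so the same sequence gets the same ID in every study, while any change to the sequence (including upper versus lower case) gives a different ID.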