How to cluster ASVs of different lengths into OTUs?

Hi everyone,
I want to work with 99% ASVs/OTUs from COI primers. My ASVs show small length variations after merging, and I want to cluster them at 99% similarity. Currently, ASVs that are 100% identical but differ in length are treated as separate OTUs, and I would like to merge them. Any idea how I can do this?
Thanks!


You need to trim your ASVs. Depending on how short the shortest sequences are, you can either trim everything to the minimum length (as was done in the American Gut study) or discard the sequences that are too short.
In this thread, the OP asks a similar question about trimming.
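
As a rough illustration of the two options above, here is a minimal Python sketch. The function names, the toy sequences, and the choice to trim from the 3' end are assumptions made for the example, not something prescribed in this thread. It only makes sense if all sequences start at the same position, so that length differences sit at one end.

```python
from collections import defaultdict


def trim_or_discard(seqs, min_len):
    """Discard sequences shorter than min_len and trim the rest to min_len.

    Assumption: every sequence starts at the same position, so trimming
    the 3' end leaves the same region for all of them.
    """
    kept = {}
    for name, seq in seqs.items():
        if len(seq) < min_len:
            continue  # too short: discard
        kept[name] = seq[:min_len]  # keep the first min_len bases
    return kept


def collapse_identical(seqs):
    """Group sequence names by identical sequence."""
    groups = defaultdict(list)
    for name, seq in seqs.items():
        groups[seq].append(name)
    return dict(groups)


# Toy data (hypothetical): asv1 and asv2 differ only in length.
asvs = {"asv1": "ACGTACGTAC", "asv2": "ACGTACGT", "asv3": "ACGTA"}
trimmed = trim_or_discard(asvs, min_len=8)
groups = collapse_identical(trimmed)
print(groups)
```

After trimming, sequences that differed only in length become identical and fall into the same group, which is the merging behaviour asked about in the first post.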



Thanks Crusher083 for your prompt response!
I have already filtered out reads that are too short or too long. My target gene is normally 313 bp, and I keep reads >300 bp and <320 bp. I don't think I can trim my sequences as they are; I should probably align them first so that I keep the same portion of the gene for all my OTUs?


Yes, exactly. I assumed you had already aligned them, since you know their similarity.
Basically, you need to align your sequences using any MSA software, and then trim the poorly aligned ends.
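
To make the "trim poorly aligned ends" step concrete, here is a small Python sketch. It assumes the MSA has already been produced by some aligner and loaded as a dict of equal-length gapped strings; the function name, the `max_gap_frac` threshold, and the toy alignment are hypothetical. It keeps the block between the first and last column whose gap fraction is acceptable, leaving internal gaps untouched.

```python
def trim_alignment_ends(aligned, max_gap_frac=0.0, gap_char="-"):
    """Trim leading/trailing alignment columns that contain too many gaps.

    aligned: dict mapping sequence name -> gapped sequence (equal lengths).
    Columns inside the kept block are not modified.
    """
    names = list(aligned)
    rows = [aligned[n] for n in names]
    length = len(rows[0])
    if any(len(r) != length for r in rows):
        raise ValueError("all aligned sequences must have the same length")

    def column_ok(col):
        gaps = sum(1 for r in rows if r[col] == gap_char)
        return gaps / len(rows) <= max_gap_frac

    good = [c for c in range(length) if column_ok(c)]
    if not good:
        return {n: "" for n in names}
    start, end = good[0], good[-1] + 1
    return {n: aligned[n][start:end] for n in names}


# Toy alignment (hypothetical): ragged ends, one internal gap in asv3.
msa = {
    "asv1": "--ACGTTGCA-",
    "asv2": "GGACGTTGCAT",
    "asv3": "-GACG-TGCA-",
}
trimmed_msa = trim_alignment_ends(msa)
for name, seq in trimmed_msa.items():
    print(name, seq)
```

Removing the gap characters from the trimmed rows afterwards gives unaligned sequences that all cover the same region, which can then be clustered.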



Hi @marioncdl and @crusher083,

Hopefully it's okay if I offer another suggestion?

Although I've not used it myself, you might also look into dbOTU. The idea is that you keep single-nucleotide resolution or length variation when you need it, and cluster when you don't. It's potentially a nice balance between innate biological variation and sequencing mistakes.



Thanks @jwdebelius for your suggestion! I think it will do what I want! I tried it on a subset of my dataset and it seems to work. I'm currently running it on my full dataset, but it has been running for a full day without writing any output... I just requested that the plugin be installed on a server, so I hope it will work! :crossed_fingers: :crossed_fingers: :crossed_fingers:



Hi @crusher083, I have been looking for a way to do this as mentioned above, but I am stuck...
I know the similarity of my OTUs because, after assigning my OTUs, I saw that many of them were assigned at 99% to the same sequence. I put some of these sequences into Geneious Prime (I try to use as much freeware as possible, but I haven't found an equivalent yet :confused: ) and saw that they were 99% similar but had small variations in length...
So I tried to align my sequences with "qiime alignment mafft", but I get aligned sequences that are 675 bp long...
How can I proceed?
Thanks for your help!


For MSA, try online MAFFT, with one of their built-in viewers.

That online viewer will also let you play with MAFFT settings to see if you can get an alignment that works for you. MSAs are very sensitive to settings, so you may have to experiment to get a useful MSA, then pass those settings into QIIME 2.
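
If a viewer is not handy, a quick numerical look at the alignment can also help when comparing settings. The sketch below is only an illustration: the function name, the 50% gap cutoff, and the toy alignment are assumptions. It reports the number of columns, how many columns are mostly gaps, and where each sequence starts and ends, which can show whether a few sequences are stretching the alignment well beyond the expected amplicon length.

```python
def alignment_summary(aligned, gap_char="-", gappy_cutoff=0.5):
    """Return (n_columns, n_mostly_gap_columns, per-sequence spans).

    A span is (first, last) non-gap column index, or None for an
    all-gap row. aligned maps name -> gapped sequence (equal lengths).
    """
    rows = list(aligned.values())
    n_cols = len(rows[0])
    gappy = sum(
        1
        for c in range(n_cols)
        if sum(r[c] == gap_char for r in rows) / len(rows) > gappy_cutoff
    )
    spans = {}
    for name, row in aligned.items():
        cols = [i for i, ch in enumerate(row) if ch != gap_char]
        spans[name] = (cols[0], cols[-1]) if cols else None
    return n_cols, gappy, spans


# Toy alignment (hypothetical): "c" does not overlap the other two,
# so the alignment is twice as long as any single sequence.
toy = {
    "a": "ACGTAC------",
    "b": "ACGTAC------",
    "c": "------GTACGT",
}
n_cols, n_gappy, spans = alignment_summary(toy)
print(n_cols, n_gappy, spans)
```

Sequences whose span sits far from the others are good candidates to inspect by eye before deciding where to trim.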

(This does not solve your feature creation problem. Let's see how dbOTU works for you!)
