I would like to confirm that the representative sequence (rep-seq, with its unique feature ID) is actually the consensus sequence representing each cluster of sequences with 97% similarity. Is this correct? In other words, the rep-seq is not one of the sequences from the cluster it was picked for; it is the consensus sequence of that cluster. If so, this explains why I could not find the rep-seq in the original fastq files.


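Not quite. As described in the plugin documentation, the representative sequences are the centroid sequence of each OTU cluster. In vsearch, a centroid is the sequence that seeded the cluster, so it is one of the sequences that was clustered rather than a consensus computed from the cluster's members. It may still be missing from your original fastq files if the reads were trimmed, joined, or otherwise modified before clustering.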
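For anyone comparing the two terms, here is a toy sketch of how a centroid differs from a consensus sequence. It is not how vsearch is implemented: the cluster data is made up, the sequences are assumed to be equal-length and already aligned, and picking the most abundant member as the centroid is a simplification of a size-ordered seed. The names `pick_centroid` and `build_consensus` are hypothetical helpers, not part of any plugin.

```python
from collections import Counter

# Hypothetical toy cluster: dereplicated sequence -> abundance (made-up data).
cluster = {
    "TCGTACGT": 4,
    "ACCTACGT": 3,
    "ACGTACGA": 3,
}

def pick_centroid(cluster):
    """Return the most abundant member (simplified stand-in for a cluster seed)."""
    return max(cluster, key=cluster.get)

def build_consensus(cluster):
    """Return the abundance-weighted majority base at each position."""
    length = len(next(iter(cluster)))
    consensus = []
    for i in range(length):
        votes = Counter()
        for seq, abundance in cluster.items():
            votes[seq[i]] += abundance
        consensus.append(votes.most_common(1)[0][0])
    return "".join(consensus)

centroid = pick_centroid(cluster)
consensus = build_consensus(cluster)

print(centroid, centroid in cluster)    # TCGTACGT True
print(consensus, consensus in cluster)  # ACGTACGT False
```

In this toy case the centroid is always one of the cluster's own sequences, while the consensus happens to match none of them.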

(I am assuming you are asking about q2-vsearch, since you mention 97% similarity sequences. If not, please specify what method you are referring to.)

I hope that helps.

