UniFrac and Phylogenetic Methods with Full-length 16S-ITS-23S Sequences

This is a fascinating question!

Is this hypothetical, or have you already sequenced this? What are your primers and sequencing platform? :face_with_peeking_eye:

Yeah, this feels like it would break something, but does it?
Steps: reads -> MSA -> tree building
Tools: Nanopore -> MAFFT -> FastTree
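
For what it's worth, here is a rough sketch of how those three steps could be wired together from Python. It assumes `mafft` and `FastTree` are on your PATH (the FastTree executable is sometimes installed as `fasttree` instead), that the reads are nucleotide FASTA, and the function names and file names are placeholders I made up. Treat it as a starting point, not a recommended workflow.

```python
import subprocess


def build_commands(reads_fasta, aligned_fasta):
    """Return the MAFFT and FastTree command lines as argument lists.

    Both tools write their result to stdout, so the output files are
    handled by the caller rather than passed as arguments.
    """
    # --auto lets MAFFT pick an alignment strategy based on data size.
    mafft_cmd = ["mafft", "--auto", reads_fasta]
    # -nt: nucleotide input; -gtr: generalized time-reversible model.
    fasttree_cmd = ["FastTree", "-nt", "-gtr", aligned_fasta]
    return mafft_cmd, fasttree_cmd


def run_pipeline(reads_fasta, aligned_fasta, tree_newick):
    """Align the reads, then build a tree from the alignment."""
    mafft_cmd, fasttree_cmd = build_commands(reads_fasta, aligned_fasta)
    with open(aligned_fasta, "w") as out:
        subprocess.run(mafft_cmd, stdout=out, check=True)
    with open(tree_newick, "w") as out:
        subprocess.run(fasttree_cmd, stdout=out, check=True)


# Hypothetical file names, shown only to illustrate the call:
# run_pipeline("reads.fasta", "aligned.fasta", "tree.nwk")
```

Note that both programs will happily finish even if the alignment is mostly gaps, so a successful run says nothing about whether the tree is meaningful.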

The regions missing from some reads would show up as gaps in the MSA. That is especially dangerous because you would still get an MSA without any error, but it may be totally useless for inferring relatedness. Would a clever scoring matrix for the MSA solve this problem?
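
One cheap sanity check before trusting the tree is to look at how gappy each alignment column is. The snippet below is a minimal sketch in plain Python: it takes aligned sequences as equal-length strings with `-` for gaps and reports columns where more than a chosen fraction of reads are gapped. The toy alignment and the 0.5 threshold are made up for illustration, not taken from real data.

```python
def column_gap_fractions(alignment):
    """Return the fraction of '-' characters in each alignment column."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    n_rows = len(alignment)
    return [
        sum(row[i] == "-" for row in alignment) / n_rows
        for i in range(length)
    ]


def gappy_columns(alignment, threshold=0.5):
    """Return indices of columns whose gap fraction exceeds the threshold."""
    fractions = column_gap_fractions(alignment)
    return [i for i, frac in enumerate(fractions) if frac > threshold]


# Toy alignment: three of the four reads are missing the last region.
toy = [
    "ACGTACGT----",
    "ACGTACGTTTGA",
    "ACGTACGT----",
    "ACGT--------",
]

print(gappy_columns(toy))  # [8, 9, 10, 11]
```

If long stretches of columns come back as mostly gaps, the tree is probably being driven by which regions each read covers rather than by actual sequence similarity.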

You may have found this already, but if not, check out this long discussion about multi-region sequencing. All their regions are sequenced separately, which is equivalent to your proposal of cutting each region out of the full-length read. They don't know which regions are connected together, but that's not an issue for you!

Thank you for bringing this question to the forums.