Hello again Louise,
Cool! So it sounds like the goal is the downstream biology, rather than algorithm development.
With that in mind, I think it makes sense to focus on an elegant way to integrate these 40 studies, without getting sidetracked by differences between hypervariable regions.
Sure. De novo (Latin for 'anew') OTU clustering makes new OTUs based on the reads provided. Closed-ref OTU clustering does not make new OTUs at all; it simply counts matches to existing OTUs provided in a database.
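If it helps to see the difference concretely, here is a toy sketch in Python. It is not how vsearch works internally: the `identity` function is a made-up stand-in that just compares equal-length strings position by position, and the reads and reference IDs are invented for illustration.

```python
def identity(a, b):
    """Fraction of matching positions (toy stand-in for a real alignment)."""
    n = min(len(a), len(b))
    return sum(x == y for x, y in zip(a, b)) / n if n else 0.0


def de_novo(reads, threshold=0.97):
    """Greedy clustering: a read that matches no existing centroid founds a new OTU."""
    counts = {}  # centroid sequence -> number of reads assigned to it
    for read in reads:
        for centroid in counts:
            if identity(read, centroid) >= threshold:
                counts[centroid] += 1
                break
        else:
            counts[read] = 1  # no centroid matched, so this read becomes a new OTU
    return counts


def closed_ref(reads, reference, threshold=0.97):
    """Count hits to existing reference OTUs; reads without a hit are discarded."""
    counts = {}
    unmatched = 0
    for read in reads:
        best_id, best_score = None, 0.0
        for ref_id, ref_seq in reference.items():
            score = identity(read, ref_seq)
            if score > best_score:
                best_id, best_score = ref_id, score
        if best_id is not None and best_score >= threshold:
            counts[best_id] = counts.get(best_id, 0) + 1
        else:
            unmatched += 1  # never creates a new OTU
    return counts, unmatched


# Hypothetical reads and a hypothetical two-entry reference database
reads = ["ACGTACGTAC", "ACGTACGTAC", "TTTTGGGGCC", "GGGGCCCCAA"]
reference = {"ref_1": "ACGTACGTAC", "ref_2": "TTTTGGGGCC"}

de_novo_counts = de_novo(reads)
closed_counts, n_unmatched = closed_ref(reads, reference)
```

In this toy data, de novo clustering keeps the fourth read as its own OTU, while closed-ref drops it because nothing in the reference matches.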
Take a look at this thread. Their Ion Torrent data also spans different regions, so they are dealing with the same underlying problem as you. They also consider using closed-ref OTUs for these reasons:
This is essentially ‘counting database hits’ so
- resulting OTUs are 100% biased by the database
- resulting OTUs are 100% consistent with the database
- resulting OTUs are literally just the ones from the database
Modern ASV methods aim to be just as consistent without introducing database bias, but for this project we are knowingly using this strong bias to normalize across regions.
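Here is a small sketch of why that bias is useful for integration, assuming each study has already been clustered closed-ref against the same database. The study names, sample names, and counts are all hypothetical; the point is only that the feature IDs are reference IDs, so tables from different regions share a key and can be merged directly.

```python
from collections import Counter

# Hypothetical closed-ref tables: sample -> counts keyed by reference OTU ID
study_v4 = {"sample_A": Counter({"ref_1": 10, "ref_2": 5})}
study_v1v3 = {"sample_B": Counter({"ref_1": 7, "ref_3": 2})}


def merge_tables(*tables):
    """Combine per-study tables; features line up because IDs come from the database."""
    merged = {}
    for table in tables:
        for sample, counts in table.items():
            merged.setdefault(sample, Counter()).update(counts)
    return merged


merged = merge_tables(study_v4, study_v1v3)

# Features observed in both samples, even though the amplicon regions differ
shared = set(merged["sample_A"]) & set(merged["sample_B"])
```

With de novo OTUs or ASVs, the feature IDs would be derived from the reads themselves, so sequences from different regions would never share an ID.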

However, the resulted number of OTUs (~900 features) after clustered against HOMD is much smaller than those clustered against Greengenes and Silva (~ 2500)
This makes sense. You can only get OTUs that are in that specific database, so this bias is huge. Hopefully the alpha and beta diversity results are similar.

(the results of taxonomy bar plots, alpha & beta analyses look similar)
Oh, thank goodness!

Is this because the reference sequence provided by HOMD is already very small? Or I did something wrong? If the former, will it have any adverse effects on downstream analyses (e.g. Differential taxa)?
I'm not sure about the HOMD database, but it makes sense that a special-use database would have fewer taxa than a general-purpose database like Silva. Any OTUs not in the database will be totally missing from all downstream analysis, but that's the downside of this strong normalization across regions.
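One way to gauge how much is being lost is to check what fraction of reads survive closed-ref clustering against each database. A minimal sketch, with made-up counts and a hypothetical helper name:

```python
def fraction_retained(matched_counts, total_reads):
    """Share of input reads that hit some reference OTU."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return sum(matched_counts.values()) / total_reads


# Hypothetical numbers: 1000 input reads, 900 of which matched the database
example_counts = {"ref_1": 600, "ref_2": 300}
retained = fraction_retained(example_counts, 1000)
```

If a similar share of reads is retained with HOMD as with Silva or Greengenes, the smaller feature count mostly reflects a smaller database rather than lost data.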
Now that I know more about your study, I want to go back to the beginning:
I think you chose the right method to move forward quickly: closed-ref, plus maybe some ASV analysis of specific taxa.
Do you want to craft that vsearch command?
Do you have any other theory questions about closed-ref OTUs?
Colin