Hi @biojack,
I recommend checking the 3 papers that I suggested to you here:
All of those mockrobiota mock communities were used in at least the first paper. We ran quite a few benchmarks there and in the other 2 papers, which also describe various metrics for quantitatively assessing the accuracy of (a) methods and (b) databases.
90% genus-level detection sounds about right. 30% species-level detection sounds low, but that depends on the quality and length of the mock community data, and I cannot remember the details off-hand...
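In case it helps to pin down what "detection" means when comparing numbers, here is a minimal sketch of one simple way to compute it: the fraction of expected taxa, truncated to a given rank, that also show up in the observed classifications. This is not necessarily the exact metric used in the papers. The function names (`truncate`, `detection_rate`), the semicolon-delimited 7-rank taxonomy format, and the placeholder taxa are all assumptions for illustration.

```python
def truncate(taxonomy, level):
    """Cut a semicolon-delimited taxonomy string to its first `level` ranks.

    Returns None if the string is shallower than `level`
    (e.g. a genus-only assignment when species is requested).
    """
    ranks = [r.strip() for r in taxonomy.split(";") if r.strip()]
    if len(ranks) < level:
        return None
    return ";".join(ranks[:level])


def detection_rate(expected, observed, level):
    """Fraction of expected taxa at `level` that appear among observed taxa."""
    exp = {truncate(t, level) for t in expected} - {None}
    obs = {truncate(t, level) for t in observed} - {None}
    if not exp:
        raise ValueError("no expected taxa are annotated down to this level")
    return len(exp & obs) / len(exp)


# Placeholder taxa, not real mock community data.
expected = [
    "k__A;p__B;c__C;o__D;f__E;g__G1;s__s1",
    "k__A;p__B;c__C;o__D;f__E;g__G2;s__s2",
]
observed = [
    "k__A;p__B;c__C;o__D;f__E;g__G1;s__s1",
    "k__A;p__B;c__C;o__D;f__E;g__G2",  # classified to genus only
]

genus_rate = detection_rate(expected, observed, level=6)
species_rate = detection_rate(expected, observed, level=7)
print(f"genus: {genus_rate:.0%}, species: {species_rate:.0%}")
```

Note that this sketch treats an empty rank label such as `s__` as a real annotation, so those would need to be stripped first if your reference database uses them.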
Good luck!