Welcome to the forum, @bloman2!
This is a great General Discussion topic, and those are some great ideas.
How are you performing this comparison? If you are not already doing so, you could use quantitative methods for it, e.g., to measure how accurately your mock community results match the expected composition. See the "visualizers" section of this plugin:
https://docs.qiime2.org/2021.2/plugins/available/quality-control/
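To give a rough sense of what such a quantitative comparison involves, here is a minimal Python sketch of two common observed-vs-expected measures: Bray-Curtis dissimilarity, and the taxon accuracy and detection rates. This is not the plugin's code. The function names and the taxon labels are made up for illustration, and it assumes both compositions are already relative abundances at the same taxonomic level.

```python
def bray_curtis(expected, observed):
    """Bray-Curtis dissimilarity between two relative-abundance dicts.

    0.0 means identical compositions; 1.0 means no shared abundance.
    """
    taxa = set(expected) | set(observed)
    num = sum(abs(expected.get(t, 0.0) - observed.get(t, 0.0)) for t in taxa)
    den = sum(expected.get(t, 0.0) + observed.get(t, 0.0) for t in taxa)
    return num / den if den else 0.0


def accuracy_and_detection(expected, observed):
    """Return (taxon accuracy rate, taxon detection rate).

    Accuracy rate: fraction of observed taxa that were expected.
    Detection rate: fraction of expected taxa that were observed.
    """
    exp = {t for t, a in expected.items() if a > 0}
    obs = {t for t, a in observed.items() if a > 0}
    shared = len(exp & obs)
    tar = shared / len(obs) if obs else 0.0
    tdr = shared / len(exp) if exp else 0.0
    return tar, tdr


# Hypothetical four-member even mock community and one observed result.
expected = {"taxon_A": 0.25, "taxon_B": 0.25, "taxon_C": 0.25, "taxon_D": 0.25}
observed = {"taxon_A": 0.30, "taxon_B": 0.20, "taxon_C": 0.40, "taxon_E": 0.10}

print(bray_curtis(expected, observed))
print(accuracy_and_detection(expected, observed))
```

In this toy example one expected taxon is missed and one unexpected taxon appears, so both rates drop below 1 even though the shared taxa are roughly in the right proportions.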
We have a tutorial demonstrating how to use some of these actions here:
Mock communities never look perfect (because of the various sources of bias, including those you list), but these methods can give a sense of overall quality. In practice, though, the first time you use a mock community you cannot tell whether its accuracy reflects the quality of the mock community itself or the quality of the run. So you put the same community on every run you do, and you can then detect when the quality degrades (indicating a run error), as shown in this paper:
This sort of process (using mock communities as a "canary in the coal mine") would accomplish this idea:
As for this idea:
There is definitely room for improvement. I plan to update q2-quality-control soon, including adding this method (which uses negative controls, not mock communities):
https://benjjneb.github.io/decontam/vignettes/decontam_intro.html
I think I saw a publication in the past 2 years that used mock communities in a similar way, but I cannot find it now. One issue with using mock communities for this is that they can have their own problems (misannotation, poor construction, poor quantification, contamination, etc.) that are separate from the sequencing run. These can of course be controlled and validated, and validated mock communities can be purchased commercially (maybe this is what you are using). Even so, they poorly reflect the diversity of real samples, so there may be issues with using mock communities for "denoising" data (data correction), as you describe.
So, long story short: I recommend combining negative and positive controls as run standards. Negative controls can certainly be used for decontaminating data, and mock communities for assessing data "quality" (with the caveats considered above).
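For the positive-control side of that recommendation, the idea of tracking the same mock community across runs could be sketched roughly as below. This is only an illustration under my own assumptions, not an existing QIIME 2 action: the function names, the choice of the first few runs as a baseline, and the "mean plus 3 standard deviations" cutoff are all hypothetical and would need tuning for real data.

```python
from statistics import mean, stdev


def bray_curtis(expected, observed):
    """Bray-Curtis dissimilarity between two relative-abundance dicts."""
    taxa = set(expected) | set(observed)
    num = sum(abs(expected.get(t, 0.0) - observed.get(t, 0.0)) for t in taxa)
    den = sum(expected.get(t, 0.0) + observed.get(t, 0.0) for t in taxa)
    return num / den if den else 0.0


def flag_degraded_runs(expected, runs, baseline_n=3, n_sd=3.0):
    """Flag runs whose mock community drifts away from the baseline runs.

    runs: chronologically ordered list of (run_id, observed_composition).
    The first `baseline_n` runs define normal behavior (an assumption);
    later runs are flagged when their dissimilarity to the expected
    composition exceeds the baseline mean + n_sd * standard deviation.
    """
    if baseline_n < 2 or len(runs) < baseline_n:
        raise ValueError("need at least two baseline runs")
    scores = [(run_id, bray_curtis(expected, obs)) for run_id, obs in runs]
    baseline = [score for _, score in scores[:baseline_n]]
    cutoff = mean(baseline) + n_sd * stdev(baseline)
    return [run_id for run_id, score in scores[baseline_n:] if score > cutoff]


# Hypothetical two-member mock community sequenced on five runs.
expected = {"taxon_A": 0.5, "taxon_B": 0.5}
runs = [
    ("run1", {"taxon_A": 0.55, "taxon_B": 0.45}),
    ("run2", {"taxon_A": 0.54, "taxon_B": 0.46}),
    ("run3", {"taxon_A": 0.56, "taxon_B": 0.44}),
    ("run4", {"taxon_A": 0.55, "taxon_B": 0.45}),
    ("run5", {"taxon_A": 0.80, "taxon_B": 0.20}),
]
print(flag_degraded_runs(expected, runs))
```

Note that the baseline absorbs the mock community's own constant bias, so only a change relative to earlier runs is flagged. That matches the point above: the first run alone cannot separate mock community quality from run quality.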
P.S. Here is some relevant past discussion from the forum: