Using any software from Ryan Wick is a good idea, in my opinion!
The last time I ran EPI2ME was over 8 months ago, and the classifier under the hood was Centrifuge. A quick glance at the Nanopore Community forum doesn’t suggest that anything’s changed, but that was just a cursory search. It might be worth posting your questions on that forum too.
In my humble opinion, there are a few things to consider before trying to shoehorn the Nanopore EPI2ME workflow into QIIME. First, note that EPI2ME is set up for speed, and that’s exactly why it’s using Centrifuge. It’s a fast read classifier that matches sequences against a compressed index of a database (an FM-index built on the Burrows-Wheeler transform) - it’s not a global alignment like you’d get from something like VSEARCH (see their paper for more details). This means you can rapidly classify loads of sequences; that’s great for real-time sequencing when you are shooting for a 30,000-foot view. I’m not so sure it’s what you want if you’re going to calculate alpha or beta diversity though, especially if you haven’t corrected your raw reads, because uncorrected errors can show up as spurious taxa and inflate those diversity estimates.
To further complicate matters, it’s important to note that Centrifuge (and therefore EPI2ME) is not using the same database typical to most QIIME users - it’s not Greengenes, it’s NCBI. Does that matter to you? Could your resulting classifications be different in part because of a database that is perhaps less well curated? Note that you can run Centrifuge with your .fastq files directly without using EPI2ME at all, and you can build whatever database you want for Centrifuge to work with. It might be interesting for you to test how their default NCBI database compares with something like Greengenes. I’d certainly like to know.
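If it helps, here is a rough sketch of what running Centrifuge outside of EPI2ME might look like. It only assembles the two command lines (index build, then classification) so you can inspect them before running anything. The flags are as I remember them from the Centrifuge manual, so please confirm with `centrifuge --help` and `centrifuge-build --help`. Every file name below is a hypothetical placeholder, and note that a Greengenes-based index would first need its taxonomy converted into the NCBI-style files Centrifuge expects (sequence-ID-to-taxid table, nodes file, names file), which I have not shown.

```python
import shlex


def centrifuge_build_cmd(reference_fasta, conversion_table, taxonomy_tree,
                         name_table, index_prefix, threads=4):
    """Assemble a centrifuge-build command for a custom database.

    conversion_table: tab-separated sequence ID -> taxonomy ID map
    taxonomy_tree:    NCBI-style nodes.dmp
    name_table:       NCBI-style names.dmp
    """
    return [
        "centrifuge-build",
        "-p", str(threads),
        "--conversion-table", conversion_table,
        "--taxonomy-tree", taxonomy_tree,
        "--name-table", name_table,
        reference_fasta,
        index_prefix,
    ]


def centrifuge_classify_cmd(index_prefix, fastq, out_tsv, report_tsv, threads=4):
    """Assemble a centrifuge command that classifies unpaired reads."""
    return [
        "centrifuge",
        "-x", index_prefix,           # index prefix produced by centrifuge-build
        "-U", fastq,                  # unpaired reads (FASTQ is the default input)
        "-S", out_tsv,                # per-read classification output
        "--report-file", report_tsv,  # per-taxon summary
        "-p", str(threads),
    ]


# Hypothetical file names, for illustration only.
build = centrifuge_build_cmd("custom_16S.fasta", "seqid2taxid.map",
                             "nodes.dmp", "names.dmp", "custom_16S")
classify = centrifuge_classify_cmd("custom_16S", "basecalled_reads.fastq",
                                   "classification.tsv", "report.tsv")

print(shlex.join(build))
print(shlex.join(classify))
```

Once the printed commands look right, you could paste them into a shell or hand the lists to `subprocess.run(cmd, check=True)`. Running the same reads against both the default index and your custom one gives you the side-by-side comparison.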
One other thing to circle back to: EPI2ME is probably not correcting your reads prior to classification, and this is absolutely something to resolve if that’s the case. It always seemed like the prepackaged workflows through EPI2ME were a few versions behind their standalone software, so I’d suspect that even though you ran the data through Guppy, you probably could improve your read and consensus accuracy with Nanopolish. I can’t say for sure though, because it’s not clear which version of Guppy you’re running - if you can post the specific versions of the software you’ve used that’ll help. See Ryan’s preprint about basecaller comparisons - you’ll find that Guppy certainly is the way to go if you’re using the most recent version, but the larger improvements to cleaning up the noisy reads can also come from training the basecaller model with your own data ahead of time.
Let’s circle back to your original question:
The short answer is of course you can perform the tests; the question you’re going to wrestle with is whether the results are worth considering if you go about it the EPI2ME way using Centrifuge, or whether you want to take those fastq files and make more of a manual effort to classify things with a different approach. Both this dog study and this sludge paper use MinION 16S data, but both use something other than EPI2ME to get their data classified.
What you’d probably want to do - shout out to @Nicholas_Bokulich here - is to run a 16S experiment with one or several mock communities. Until you have a known community, you’re just guessing at which method is better. One step in that direction is this paper which did this for the ZymoBIOMICS (Zymo Research) mock community, but they did full shotgun metagenomics, not just 16S, so you can’t really use it as a benchmark for what you’re doing. But hey, that’s good news - an opportunity for an experiment!
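To make the mock community idea concrete, here is a minimal sketch of how you might score one classifier's output against the known composition. It reports taxon-level precision and recall plus the Bray-Curtis dissimilarity between expected and observed relative abundances. The function names, taxon labels, and numbers are all made up for illustration; swap in your real mock composition and the per-taxon read counts from whichever classifier you are testing.

```python
def relative_abundance(counts):
    """Convert raw counts (or proportions) to fractions that sum to 1."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return {taxon: value / total for taxon, value in counts.items()}


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two abundance profiles (0 = identical)."""
    taxa = set(p) | set(q)
    numerator = sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in taxa)
    denominator = sum(p.get(t, 0.0) + q.get(t, 0.0) for t in taxa)
    return numerator / denominator


def score_against_mock(expected, observed_counts):
    """Compare classifier output with the known mock community composition."""
    exp = relative_abundance(expected)
    obs = relative_abundance(observed_counts)
    true_positives = set(exp) & set(obs)
    return {
        "precision": len(true_positives) / len(obs),  # detected taxa that are real
        "recall": len(true_positives) / len(exp),     # real taxa that were detected
        "bray_curtis": bray_curtis(exp, obs),
        "false_positives": sorted(set(obs) - set(exp)),
        "missed": sorted(set(exp) - set(obs)),
    }


# Hypothetical even mock community of four taxa, and hypothetical read counts
# in which one member is missed and one spurious taxon appears.
expected = {"taxon_A": 25, "taxon_B": 25, "taxon_C": 25, "taxon_D": 25}
observed = {"taxon_A": 300, "taxon_B": 200, "taxon_C": 400, "taxon_X": 100}

for name, value in score_against_mock(expected, observed).items():
    print(name, value)
```

Running the same scoring over the EPI2ME/Centrifuge output and over your alternative pipeline, at the same taxonomic rank, gives you numbers to compare instead of a guess.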
I think you want to check in on the Nanopore forums first before QIIME to get a sense of how to tackle that question. There are hundreds of Nanopore users doing 16S work - connecting with those folk might be your best bet to get help with workflows tackling questions of diversity. A few Twitter folks to consider following: Arwyn Edwards (@arwynedwards), Devin Drown (@ArcticBiology), Mads Albertsen (@MadsAlbertsen85)… there are many others.