Hi @chaeyeon,
Thanks for your interest in the QIIME 2 shotgun distribution. A couple of disclaimers before I answer your question.
First, remember that the QIIME 2 shotgun distribution is in alpha release, and should be considered suitable for testing purposes only. We do not yet consider it "production ready", and we recommend confirming your findings with other shotgun workflows. We are very interested to hear about the results of any such comparisons.
Second, filtering of human and other host-associated reads from shotgun metagenomics data is an active area of research. For human hosts in particular, updated approaches need to be developed and validated that filter based on the human pangenome, rather than a single human genome sequence. I don't have a recommendation for a tool for this right now, but I know that some are in development. Recent publications (1, 2) suggest that personally identifying information can be present in human-associated shotgun metagenome data, even when following current best practices for host read removal. QIIME 2 has not solved this problem.
All of that said, you can use qiime quality-control filter-reads to filter reads that hit the host genome based on alignment to that genome. The qiime quality-control bowtie2-build action can help you build a bowtie2 database for use with filter-reads.
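As a rough sketch of those two steps (not a validated workflow): the file names below are hypothetical placeholders, and the `run` helper and `DRY_RUN` variable are my own additions so that the commands are only printed unless you set `DRY_RUN=0`. I'm assuming your host reference is already imported as a sequence artifact and your reads are demultiplexed. Check `--help` for each action in your installed version before relying on the options.

```shell
set -eu

# Hypothetical artifact names; substitute your own.
HOST_GENOME="host-genome.qza"      # host reference sequences (assumed already imported)
READS="demux.qza"                  # demultiplexed shotgun reads (assumed already imported)
HOST_DB="host-bowtie2-db.qza"      # bowtie2 index artifact that will be created
FILTERED="host-filtered-reads.qza" # reads remaining after host removal

# Helper (not part of QIIME 2): print each command, and execute it only
# when DRY_RUN=0 is set in the environment.
DRY_RUN="${DRY_RUN:-1}"
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Step 1: build a bowtie2 database from the host reference.
run qiime quality-control bowtie2-build \
  --i-sequences "$HOST_GENOME" \
  --o-database "$HOST_DB"

# Step 2: align reads to the host database and drop the reads that hit it
# (--p-exclude-seqs removes matching reads instead of keeping them).
run qiime quality-control filter-reads \
  --i-demultiplexed-sequences "$READS" \
  --i-database "$HOST_DB" \
  --p-exclude-seqs \
  --o-filtered-sequences "$FILTERED"
```

Running it as-is only prints the two commands, so you can review them first; `DRY_RUN=0` executes them.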
I also recommend a second round of filtering with qiime taxa filter-seqs after taxonomy assignment with kraken2. I would apply this to only include features that are assigned to a bacterial phylum (see here), though note that this is not perfect due to the issues pointed out in the first publication I referenced above.
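A sketch of what that taxonomy-based filter could look like, with some loud assumptions: the file names are hypothetical, the `run` helper and `DRY_RUN` variable are my own additions (commands are only printed unless `DRY_RUN=0`), and the search string in `INCLUDE_TERM` is a guess at how bacterial phylum-level labels appear in your taxonomy. Please inspect your own taxonomy artifact and adjust the string before trusting the result, since a string that doesn't match your labels would filter out everything.

```shell
set -eu

# Hypothetical artifact names; substitute your own.
SEQS="sequences.qza"               # feature sequences to filter
TAXONOMY="taxonomy.qza"            # taxonomy assignments for those features (e.g. from kraken2)
FILTERED="bacteria-only-seqs.qza"  # output artifact

# ASSUMPTION: taxonomy labels contain this substring for features assigned
# to a bacterial phylum. Verify against your own labels and adjust.
INCLUDE_TERM="d__Bacteria;p__"

# Helper (not part of QIIME 2): print each command, and execute it only
# when DRY_RUN=0 is set in the environment.
DRY_RUN="${DRY_RUN:-1}"
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Keep only features whose taxonomy label contains INCLUDE_TERM.
run qiime taxa filter-seqs \
  --i-sequences "$SEQS" \
  --i-taxonomy "$TAXONOMY" \
  --p-include "$INCLUDE_TERM" \
  --o-filtered-sequences "$FILTERED"
```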
In summary, you can do this filtering with QIIME 2, but the current best practices in general (not only in QIIME 2) are known to be insufficient. Developing and validating new best practices is ongoing work.
I hope this helps!