ITS analysis for PacBio-generated data sets

Keegan-Evans · February 1, 2023, 4:58pm

Any time questions about workflows and pipelines come up, there are a couple of great resources on our website that often answer the basic ones and help get you oriented to the problem: QIIME 2 for Experienced Microbiome Researchers and Overview of QIIME 2 plugin workflows.

That being said, your case is somewhat complicated by a few things that are a little outside of typical QIIME 2 workflows, namely your PacBio sequencing data and your targeting of ITS.

For quality control and denoising of PacBio sequencing data, the best tool that we have right now is denoise-ccs, found in q2-DADA2 (see the plugin docs). As far as I am aware, there are not really "forward" or "reverse" reads with this technology. Instead, the genetic material is sequenced continuously in a loop until the quality scores rise to an acceptable level, so what you are seeing in your raw sequencing data makes a lot of sense. The DADA2 method also dereplicates the reads as needed. You will want to import your raw sequencing data with the type SampleData[SequencesWithQuality] to feed it into q2-DADA2.
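In case it helps, here is a rough sketch of the import and denoising steps described above. It assumes your reads are already demultiplexed into one `.fastq.gz` file per sample, and that a tab-separated manifest with `sample-id` and `absolute-filepath` columns is what you import. The helper names, output file names, and the way sample IDs are derived from file names are my own placeholders, and the primer arguments are deliberately left for you to fill in. The flags are written from memory, so please check them against `qiime tools import --help` and `qiime dada2 denoise-ccs --help` for your QIIME 2 version. The functions only build the command lines; nothing is executed.

```python
import csv
from pathlib import Path


def write_manifest(fastq_dir, manifest_path):
    """Write a single-end manifest (sample-id, absolute-filepath) for CCS reads."""
    rows = []
    for fq in sorted(Path(fastq_dir).glob("*.fastq.gz")):
        # Assumption: the sample ID is the file name without ".fastq.gz".
        sample_id = fq.name[: -len(".fastq.gz")]
        rows.append((sample_id, str(fq.resolve())))
    with open(manifest_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["sample-id", "absolute-filepath"])
        writer.writerows(rows)
    return rows


def import_command(manifest_path, output_qza="demux.qza"):
    """Build the import command for SampleData[SequencesWithQuality]."""
    return [
        "qiime", "tools", "import",
        "--type", "SampleData[SequencesWithQuality]",
        "--input-path", str(manifest_path),
        "--output-path", output_qza,
        "--input-format", "SingleEndFastqManifestPhred33V2",
    ]


def denoise_ccs_command(demux_qza, front_primer, adapter_primer):
    """Build a denoise-ccs command; the primers are whatever your amplicon used."""
    return [
        "qiime", "dada2", "denoise-ccs",
        "--i-demultiplexed-seqs", demux_qza,
        "--p-front", front_primer,
        "--p-adapter", adapter_primer,
        # Output names below are placeholders, not required names.
        "--o-table", "table.qza",
        "--o-representative-sequences", "rep-seqs.qza",
        "--o-denoising-stats", "denoising-stats.qza",
    ]
```

Inside an activated QIIME 2 environment you could pass each returned list to `subprocess.run(..., check=True)`, or just print it with `" ".join(...)` and paste it into your shell. denoise-ccs also has length and quality filtering options that are worth reviewing for ITS, since amplicon length varies a lot between taxa.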

After running through q2-DADA2, you should be left with a FeatureTable[Frequency] and a FeatureData[Sequence] that you can use for any of the typical downstream QIIME 2 workflows. To look at ITS specifically, check out the ITS Fungal Analysis tutorial and the q2-itsxpress tutorial.