Importing and analysing PacBio HiFi (CCS) V1-V9 reads

splaisan · February 1, 2021, 10:20am

I have Sequel V1-V9 HiFi data (demultiplexed) which I would like to import and process for 16S analysis.
How should I import this FASTQ data? I am currently using the command below, but I fear that only the forward reads will be considered when aligning to the database. Also, the Sequel FASTQ headers are not Illumina-formatted, which makes the import complain.

Is there a dedicated import type/format that would better fit PB HiFi data?
Is there a tutorial for V1-V9 PB analysis somewhere?

Thanks for feedback

sample-id,absolute-filepath,direction
smpl1,$PWD/reads/demultiplex.bc1005--bc1033.Q20.fastq,forward
smpl2,$PWD/reads/demultiplex.bc1005--bc1035.Q20.fastq,forward


qiime tools import \
  --input-path read_manifest.csv \
  --input-format SingleEndFastqManifestPhred33 \
  --type 'SampleData[SequencesWithQuality]' \
  --output-path read_demux-seq.qza
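
With many barcode pairs, a manifest like the one above can be generated rather than typed by hand. The following is only a sketch: `make_manifest` is a hypothetical helper, and deriving the sample id from the barcode pair in the file name (stripping the `demultiplex.` prefix and `.Q20` suffix) is an assumption based on the two file names shown, not a PacBio or QIIME 2 convention.

```shell
#!/usr/bin/env bash
# Sketch: write a single-end CSV manifest for demultiplexed PacBio FASTQ files.
# make_manifest is a hypothetical helper; the sample-id rule is an assumption.
make_manifest() {
  local reads_dir fq name
  # The manifest column is "absolute-filepath", so resolve the directory first.
  reads_dir="$(cd "$1" && pwd)" || return 1
  echo "sample-id,absolute-filepath,direction"
  for fq in "$reads_dir"/*.fastq; do
    [ -e "$fq" ] || continue          # skip when the glob matches nothing
    name="$(basename "$fq" .fastq)"
    name="${name#demultiplex.}"       # assumed prefix from the file names above
    name="${name%.Q20}"               # assumed suffix from the file names above
    # CCS reads are single-stranded consensus reads, listed here as "forward".
    echo "${name},${fq},forward"
  done
}
```

Usage would be along the lines of `make_manifest reads > read_manifest.csv`, followed by the same `qiime tools import` call.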

andrewsanchez · February 4, 2021, 8:11pm

Hi @splaisan,

Have you tried searching the forum for previous discussions about PacBio? From what I have gathered, dada2 itself currently supports long reads, but porting that functionality to q2-dada2 is still underway. Until q2-dada2 is upgraded to wrap a newer version of dada2, you can process long-read data in dada2 directly and then import those results into QIIME 2. That's what I have seen recommended elsewhere on the forum, at any rate. I hope that helps!
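
As a rough sketch of the "import those results" step: assuming the dada2 run in R has already written a tab-separated feature table and a FASTA file of representative sequences, the import could look like the function below. The function name `import_dada2_results`, its arguments, and the output file names are hypothetical; the TSV is assumed to be in the classic BIOM text layout (features as rows, samples as columns) that `biom convert` accepts.

```shell
#!/usr/bin/env bash
# Sketch: import dada2 outputs (produced outside QIIME 2) as QIIME 2 artifacts.
# import_dada2_results and all file names are hypothetical placeholders.
import_dada2_results() {
  local seqtab_tsv="$1" rep_seqs_fna="$2" out_dir="$3"
  mkdir -p "$out_dir" || return 1

  # Convert the tab-separated feature table to BIOM v2.1 (HDF5).
  biom convert -i "$seqtab_tsv" -o "$out_dir/table.biom" \
    --table-type="OTU table" --to-hdf5 || return 1

  # Import the BIOM table as a feature table artifact.
  qiime tools import \
    --input-path "$out_dir/table.biom" \
    --type 'FeatureTable[Frequency]' \
    --input-format BIOMV210Format \
    --output-path "$out_dir/table.qza" || return 1

  # Import the representative sequences (FASTA, ids matching the table rows).
  qiime tools import \
    --input-path "$rep_seqs_fna" \
    --type 'FeatureData[Sequence]' \
    --output-path "$out_dir/rep-seqs.qza" || return 1
}
```

A call such as `import_dada2_results dada2-table.tsv rep-seqs.fna imported` would then leave `table.qza` and `rep-seqs.qza` in `imported/`, ready for downstream steps such as taxonomy classification.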

refs:

github.com/qiime2/q2-dada2

Add long-read support

opened 09:17PM - 20 Sep 18 UTC

closed 08:50PM - 15 Nov 22 UTC

benjjneb

**Improvement Description** Add long-read support

**Current Behavior** We've added support for PacBio long-read amplicon sequencing to the devel version of the R package and it seems to work quite well. Preprint: High-throughput amplicon sequencing of the full-length 16S rRNA gene with single-nucleotide resolution.

**Proposed Behavior** I think it will make sense to add this as a tech-specific `denoise-pacbio` command in the plugin. This is similar to the `denoise-pyro` approach already in the plugin, with the purpose of having a dedicated command being to automatically turn on the right flags and options for PacBio data rather than relying on the user to do so. There is a downside in the repetition of much of the code between the different `denoise-[technology]` commands.

**References** 1. Preprint: [High-throughput amplicon sequencing of the full-length 16S rRNA gene with single-nucleotide resolution](https://www.biorxiv.org/content/early/2018/08/15/392332).
