qiime2 2024.5 and 16S PacBio import and filtering

Hi,
I am trying to analyse PacBio 16S data. I followed the guidelines from the GitHub page below, running seqtk and then cutadapt before importing the already demultiplexed reads into qiime2/2022.8: PacBio CCS Amplicon SOP v1 (qiime2) · LangilleLab/microbiome_helper Wiki · GitHub
However, I could only retain 20% of the reads. Below are the commands I used. My question is: would it be better to use qiime2 2024.5? And if yes, do I still need to run the same seqtk step before importing the reads into qiime?
Sorry, I cannot find a good source on PacBio data and the 2024 version of qiime.

outside qiime
parallel -j 4 'seqtk seq -r {} > raw_data_rc/{/.}_rc.fastq' ::: raw_data/*.fastq
parallel -j 4 --link 'cat {1} {2} > raw_data_cat/{1/.}_cat.fastq' ::: raw_data/*.fastq ::: raw_data_rc/*_rc.fastq

parallel -j 4 'cutadapt -g AGRGTTYGATYMTGGCTCAG...AAGTCGTAACAAGGTARCY --discard-untrimmed --no-indels -j 1 -m 1200 -M 1800 -o trimmed_reads/{/.}_trim.fastq {}' ::: raw_data_cat/*_cat.fastq

using qiime2/2022.8
qiime dada2 denoise-single --i-demultiplexed-seqs reads_qza/trimmed_reads.qza --p-trunc-len 0 --p-max-ee 3 --p-n-threads 4 --p-n-reads-learn 100000 --output-dir dada2_output

Hi @lida56,
I am not quite sure why you aren't retaining more of your reads. If you could attach your dada2-stats.qzv, I can see which step is causing the low retention, and that would give me the info I need to provide more suggestions!
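
In case it helps, here is a rough sketch of how that visualization could be produced. It assumes the denoise-single run above wrote its results to dada2_output/, in which case the stats artifact should be named denoising_stats.qza. The output name dada2-stats.qzv is just the one I asked for; adjust the paths to your setup.

```shell
# Assumption: the denoise-single command above was run with --output-dir dada2_output,
# so the per-sample denoising stats should sit in this file.
STATS=dada2_output/denoising_stats.qza

# Only run the conversion when qiime and the stats artifact are actually available.
if command -v qiime >/dev/null 2>&1 && [ -f "$STATS" ]; then
  # Turn the stats artifact into a table that can be opened at view.qiime2.org.
  qiime metadata tabulate \
    --m-input-file "$STATS" \
    --o-visualization dada2-stats.qzv
else
  echo "qiime or $STATS not found; nothing to do"
fi
```

The resulting table lists, per sample, how many reads survive each stage (input, filtering, denoising, chimera removal), which is usually enough to see where the loss happens.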

Also, have you tried our dada2 denoise-ccs command? It is designed specifically for PacBio sequences and could mean fewer preparation steps. denoise-ccs: Denoise and dereplicate single-end Pacbio CCS (QIIME 2 2024.5.0 documentation)
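
As a starting point, a call could look roughly like the sketch below. Treat it as an illustration rather than a tested recipe: reads_qza/raw_reads.qza and dada2_ccs_output are hypothetical names, the input is assumed to be your raw (not reverse-complemented, not cutadapt-trimmed) reads imported as SampleData[SequencesWithQuality], and the length and max-ee values are simply copied from your commands. I am also assuming that --p-adapter takes the reverse primer written 5' to 3', which is the reverse complement of the 3' adapter in your cutadapt call; please confirm that against `qiime dada2 denoise-ccs --help` before relying on it.

```shell
# Primer sequences taken from the cutadapt command earlier in this thread.
FWD_PRIMER=AGRGTTYGATYMTGGCTCAG
CUTADAPT_3P=AAGTCGTAACAAGGTARCY

# Reverse-complement the cutadapt 3' adapter (handling the IUPAC codes R/Y and M/K)
# to get the reverse primer written 5'->3'.
REV_PRIMER=$(printf '%s\n' "$CUTADAPT_3P" | rev | tr 'ACGTRYMK' 'TGCAYRKM')
echo "reverse primer: $REV_PRIMER"

# Hypothetical input artifact: raw CCS reads imported without the seqtk/cutadapt steps.
READS=reads_qza/raw_reads.qza

run_denoise_ccs() {
  # ASSUMPTION: --p-adapter expects the reverse primer 5'->3'; check the --help text.
  qiime dada2 denoise-ccs \
    --i-demultiplexed-seqs "$READS" \
    --p-front "$FWD_PRIMER" \
    --p-adapter "$REV_PRIMER" \
    --p-min-len 1200 \
    --p-max-len 1800 \
    --p-max-ee 3 \
    --p-n-threads 4 \
    --output-dir dada2_ccs_output
}

# Only attempt the run when qiime and the input artifact are present.
if command -v qiime >/dev/null 2>&1 && [ -f "$READS" ]; then
  run_denoise_ccs
else
  echo "qiime or $READS not found; skipping denoise-ccs"
fi
```

Since denoise-ccs removes the primers and re-orients reads itself, the seqtk and cutadapt steps should not be needed with this route.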
