I have many samples (~650) sequenced on the PacBio platform. RStudio cannot run the dada function because of the software's memory limit. The samples total 30 GB, and I need to use the pool=TRUE parameter. So I thought about splitting the workflow between QIIME 2 and RStudio: the first part, up to
qiime dada2 denoise-ccs --i-demultiplexed-seqs ./samples.qza \
would be executed in QIIME 2, and I would then export the output .qza files and convert them to .rds to complete the remaining steps, up to the phyloseq object, in RStudio.
My questions are:
1- Could I use pool = TRUE in qiime2 as follows:
qiime dada2 denoise-ccs \
  --i-demultiplexed-seqs ./samples.qza \
  --o-table dada2-ccs_table.qza \
  --o-representative-sequences dada2-ccs_rep.qza \
  --o-denoising-stats dada2-ccs_stats.qza \
  --p-min-len 1300 \
  --p-max-len 1600 \
  --p-front ACACTGACGACATGGTTCTACAAGAGTTTGATCMTGGCTCAG \
  --p-adapter TACGGTAGCAGAGACTTGGTCTTACGGYTACCTTGTTAYGACTT \
  --p-pooling-method TRUE \
  --p-n-threads 8 \
  --verbose
2- How can I convert the ASV table and rep-seq .qza files into .rds so that I can assign taxonomy, construct the tree, and eventually create the phyloseq object in R? (I already have the R code but need to convert the files.) Should I export the files in BIOM format, convert them to CSV, and then to .rds to continue with the DADA2 R script?
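
To make question 2 concrete, here is a minimal sketch of the export route I am asking about. It assumes the artifact names from the denoise-ccs command above. The `exported` output directory is a placeholder I made up, and the script only runs the export if `qiime` and both .qza files are actually present.

```shell
#!/usr/bin/env bash
# Sketch only: export the DADA2 artifacts to plain files that R can read.
# Artifact names are taken from the denoise-ccs command above;
# "exported" is a hypothetical output directory.
TABLE_QZA="dada2-ccs_table.qza"
REPSEQ_QZA="dada2-ccs_rep.qza"
OUT_DIR="exported"

if command -v qiime >/dev/null 2>&1 && [ -f "$TABLE_QZA" ] && [ -f "$REPSEQ_QZA" ]; then
    # Feature table -> $OUT_DIR/feature-table.biom
    qiime tools export --input-path "$TABLE_QZA" --output-path "$OUT_DIR"
    # Representative sequences -> $OUT_DIR/dna-sequences.fasta
    qiime tools export --input-path "$REPSEQ_QZA" --output-path "$OUT_DIR"
    # BIOM -> tab-separated text that can be loaded in R
    biom convert -i "$OUT_DIR/feature-table.biom" \
        -o "$OUT_DIR/feature-table.tsv" --to-tsv
else
    echo "qiime or the .qza inputs were not found; nothing exported" >&2
fi
```

The idea would then be to read feature-table.tsv and dna-sequences.fasta into R and save the resulting objects with saveRDS(). Whether that is the right route, or whether there is a more direct one, is what I would like to know.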