I have been running DADA2 denoise-paired for more than 12 hours, using several threads, on a subsample of 16 samples. My real set consists of 350 samples. The process is currently using 425% CPU. I imported the demultiplexed files like this:
qiime tools import \
--type 'SampleData[PairedEndSequencesWithQuality]' \
--input-path "mypath" \
--output-path bobcat_sas_paired-end-demux.qza \
--source-format PairedEndFastqManifestPhred33
The 16 samples total ~4.5 GB and the resulting .qza file is 981 MB.
I inspected the file:
And then ran:
(qiime2-2018.6) ubuntu@aerosol0:~/mdw/Line_files$ qiime dada2 denoise-paired \
--i-demultiplexed-seqs bobcat_sas_paired-end-demux.qza \
--o-table bobcatsastable.qza \
--o-representative-sequences bobcatsas_rep-seqs.qza \
--o-denoising-stats bobcatsas_denoising-stats.qza \
--verbose --p-n-threads 0 --p-trunc-len-r 0 --p-trunc-len-f 0
Unfortunately the remote connection broke, so I am not seeing any output from --verbose. However, DADA2 is still running. Based on other reports, the running time seems unusually long. What run time can I expect for the full set of 350 samples? Could the problem be related to the import into QIIME? Any suggestions would be helpful.
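For reference, the lost --verbose output could be kept across a dropped connection by starting the job detached with nohup and sending its output to a log file. The following is only a minimal sketch: the helper name run_detached and the log file name dada2_run.log are hypothetical, and the qiime options are simply the ones from the command above.

```shell
#!/usr/bin/env bash
# Sketch: start a long-running command detached from the terminal and
# keep everything it prints in a log file.
# run_detached and dada2_run.log are hypothetical names for illustration.

run_detached() {
  local log="$1"
  shift
  # nohup makes the command ignore the hangup signal sent when the SSH
  # session drops; stdout and stderr both go to the log file.
  nohup "$@" > "$log" 2>&1 < /dev/null &
  echo "started PID $! (log: $log)"
}

# Only launch the real job if qiime is available in this environment.
if command -v qiime > /dev/null 2>&1; then
  run_detached dada2_run.log \
    qiime dada2 denoise-paired \
    --i-demultiplexed-seqs bobcat_sas_paired-end-demux.qza \
    --o-table bobcatsastable.qza \
    --o-representative-sequences bobcatsas_rep-seqs.qza \
    --o-denoising-stats bobcatsas_denoising-stats.qza \
    --verbose --p-n-threads 0 --p-trunc-len-r 0 --p-trunc-len-f 0
fi
```

After reconnecting, `tail -f dada2_run.log` would show the progress messages written so far.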