Hey everyone, new to the forums here. I’m using QIIME2 to assign taxonomy to both my 16S and 18S data. I’ve successfully assigned taxonomy for the 16S data, but now that I’m trying to do the same for my 18S data, I’m running into an issue with dada2: it takes around a day just to process one of my samples.
The data contains Illumina paired-end 18S reads with the adapters trimmed off, 150 MB for both the forward and reverse reads. The command I’m running is:
qiime dada2 denoise-paired --i-demultiplexed-seqs paired-end-demux.qza --p-trunc-len-f 250 --p-trunc-len-r 160 --p-trim-left-r 0 --o-table {output.table} --o-representative-sequences {output.seqs} --o-denoising-stats {output.stats} --p-n-threads 0
Denoising only the forward reads leads to the same result.
I’ve loaded a sample in R to see where the problem is, and it gets stuck on learning the error rates:
errF <- learnErrors(derepFs, nbases=2e6, multithread=TRUE)
errR <- learnErrors(derepRs, nbases=2e6, multithread=TRUE)
53884560 total bases in 224519 reads from 1 samples will be used for learning the error rates.
I’ve found similar topics on the matter such as:
Based on these, I’ve upgraded to the latest versions and made sure all cores are being used. I’ve also checked with the source of the samples, and they should only contain fungal sequences without much contamination.
I’ve tried deblur as well against the SILVA 132 18S 99% reference set, but after 2 hours I killed that process too. The exact command was: qiime deblur denoise-other --i-demultiplexed-seqs paired-end-demux.qza --i-reference-seqs ref-18S_SILVA_132_99.qza --o-table deblur-table.qza --o-representative-sequences rep-seqs.qza --o-stats stats.qza --p-trim-length -1 --p-jobs-to-start 4
I know denoising can simply take several hours or days if I have a lot of unique sequences. If that is the case here, are there any alternatives to the QIIME2 denoisers that would work at the scale of this project? I’m looking to process 20 of these samples in one day.
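To check whether the number of unique sequences really is the bottleneck, the reads can be counted straight from a FASTQ file. This is only a minimal sketch: it assumes the standard four-lines-per-record FASTQ layout, `count_unique_reads` is a helper name made up for illustration (it is not part of QIIME2 or dada2), and `sample_R1.fastq.gz` is a placeholder file name.

```shell
# Count total reads and distinct sequences in a FASTQ stream on stdin.
# Assumes 4 lines per record (header, sequence, "+", qualities),
# so the sequence is always the 2nd line of each group of 4.
count_unique_reads() {
    awk 'NR % 4 == 2 { total++; if (!seen[$0]++) unique++ }
         END { printf "reads=%d unique=%d\n", total, unique }'
}

# Example call (placeholder file name, left commented out):
#   gzip -dc sample_R1.fastq.gz | count_unique_reads
```

If the unique count comes out close to the total read count, almost every read is distinct, which would fit the long runtimes described above.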
I’m running QIIME2 version 2019.7 in a conda environment on Ubuntu inside VirtualBox.
Thanks in advance!