One of my datasets was sequenced on an Illumina NovaSeq 6000 and another on an Illumina MiSeq. Running DADA2 on them wasn't successful, so I tried deblur instead. Before getting to the error I faced while running deblur, is there a way to use DADA2 on such data?
Both datasets showed the same error with deblur but different errors with DADA2.
When I tried to run this command for one of them:
qiime deblur denoise-16S \
This is the error that happened to
and this one happened when I tried to remove the spaces (but it didn't work)
although the files are there in the same directory, including the files from the filtering step
When I tried to run the same data through DADA2, this is what I got for the two of them:
This is the paired-end dataset from the Illumina NovaSeq 6000 (but only the forward reads are being used)
and this is the second one (single-end, from the Illumina MiSeq)
Is there a way to solve this error regarding the error rates? I would prefer to use DADA2.
I am putting down here all other info in case this will help define the problem
- The data is from NCBI bioprojects and it's 16S rRNA microbiome data
- The machine is a Mac with a processor of 2.9 GHz Dual-Core Intel Core i7 and a memory of 16 GB 1600 MHz DDR3
- The R version is 4.2.2 (2022-10-31)
- The Qiime2 version is: q2cli version 2022.8.0
Any help is very much appreciated.
So, first error:
Please refer to the bash tutorial "Split Long Bash Command into Multiple Lines in a Script". Your command is incorrect. Refer to the "Moving Pictures" tutorial (QIIME 2 2022.11.1 documentation) for an example of correct commands.
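For reference, here is a minimal sketch of how a continued command is laid out. The backslash has to be the very last character on each line (no trailing space), and the final line carries no backslash. `print_args` is a hypothetical helper used only to show how the shell splits the arguments; the file names and the trim length are placeholders, not values from this thread. The option names follow the deblur example in the "Moving Pictures" tutorial.

```shell
# print_args: hypothetical helper that prints each argument it receives
# on its own line, so you can see how the shell split a continued command.
print_args() {
  for arg in "$@"; do
    printf '%s\n' "$arg"
  done
}

# Same layout as a multi-line QIIME 2 call. File names and the trim length
# are placeholders. Once the arguments look right, the same lines can be
# used with "qiime deblur denoise-16S" in place of "print_args".
deblur_args=$(print_args \
  --i-demultiplexed-seqs demux-filtered.qza \
  --p-trim-length 120 \
  --p-sample-stats \
  --o-representative-sequences rep-seqs-deblur.qza \
  --o-table table-deblur.qza \
  --o-stats deblur-stats.qza)

printf '%s\n' "$deblur_args"
```

If a backslash is followed by a space, or a line in the middle is missing its backslash, the shell ends the command early and tries to run the next line as a separate command, which is a common cause of "no such option" or "command not found" errors.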
DADA2 1st error: as written in the log, you have too few reads for error estimation. The default is 1M reads.
DADA2 2nd error: as written in the log, your quality scores are wrong. Either the data are corrupted or the import was incorrect.
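For the first DADA2 error, a quick way to check how many reads a FASTQ file really holds is to count its lines and divide by four. This is only a sketch: it assumes standard four-line FASTQ records (no wrapped sequence lines), and `count_fastq_reads` and the example file name are hypothetical. In `qiime dada2 denoise-single`, the number of reads used to train the error model is set with `--p-n-reads-learn`.

```shell
# count_fastq_reads: hypothetical helper, prints the number of records in a
# FASTQ file (plain or gzip-compressed), assuming 4 lines per record.
count_fastq_reads() {
  case "$1" in
    *.gz) gzip -dc "$1" ;;
    *)    cat "$1" ;;
  esac | awk 'END { print int(NR / 4) }'
}

# Example (the file name is a placeholder):
# count_fastq_reads sample_R1.fastq.gz
```

Comparing this number with the read counts in the demux summary can show whether reads were lost between download and import.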
Thank you for your response. However, for the first DADA2 error, the reads are already over 1 million. Here are the fastq stats:
As for the second DADA2 error, the importing step was done successfully with no errors reported, and the data is already published so I don't think it can be corrupted?
I am attaching the demux file for your reference. Your help is greatly appreciated.
single-end-demux.qzv (292.5 KB)
Welcome to the public data world!
All bases have the same quality scores - something went wrong during the data upload.
There are two options:
- they uploaded quality-controlled and not raw data
- something went wrong during the upload
The only way to know is to ask the authors directly. For now, I would label the dataset as "corrupted".
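You can also confirm this on the downloaded files themselves, outside QIIME 2. The sketch below counts how many different quality characters appear in a FASTQ file. It assumes four-line FASTQ records, and `count_distinct_quals` and the example file name are hypothetical.

```shell
# count_distinct_quals: hypothetical helper, prints how many different
# quality characters occur in a FASTQ file (plain or gzip-compressed).
# Assumes 4-line records, so every 4th line is a quality string.
count_distinct_quals() {
  case "$1" in
    *.gz) gzip -dc "$1" ;;
    *)    cat "$1" ;;
  esac | awk 'NR % 4 == 0 {
      n = length($0)
      for (i = 1; i <= n; i++) seen[substr($0, i, 1)] = 1
    }
    END {
      count = 0
      for (c in seen) count++
      print count
    }'
}

# Example (the file name is a placeholder):
# count_distinct_quals sample.fastq.gz
```

A result of 1 means every base carries the same score, which matches what the demux report shows. A handful of values would be consistent with binned quality scores, and raw reads with unbinned scores usually show many more.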
Thank you for the clarification. However, I just want to highlight that the two errors are for two different datasets. I have read somewhere on the forum that DADA2 can't work with reads from higher-throughput sequencers such as the Illumina NovaSeq 6000 and Illumina MiSeq (which is my case). The dataset with the first DADA2 error worked fine with deblur, but the second showed an error which I am still trying to fix. Does the fact that the dataset ran successfully with deblur mean that it's not corrupted? Or does this have nothing to do with it?
At this point I am totally lost: the demux report you sent doesn't look like raw data. If it was downloaded from SRA, there are many things that could have gone wrong. I have personally pointed out a few to the SRA maintainers, but the database is too big to fix everything.
It might be that the sequences were just quality-controlled beforehand, but I don't know.
I have processed MiSeq sequences with DADA2 before and had no problems with it.
Deblur does not rely on quality scores but only on the sequences. That is why it can be used to co-analyze samples from different runs, as well as samples from the NovaSeq platform (which does bin the quality scores). So you should be able to process all the samples together with deblur.
Sorry, I have lost track of the error now. What is the latest error you are working on? Let's see if we can help more.
Cheers and happy new year