Error while running deblur

EnyaroHatsonveski · December 27, 2022, 12:46am

One of my datasets was sequenced on an Illumina NovaSeq 6000 and another on an Illumina MiSeq. Running dada2 wasn't successful, so I tried deblur instead. Before getting to the error I hit while running deblur: is there a way to use dada2 on such data?

Both showed the same error using the deblur tool but different errors using dada2.
When I tried to run this command for one of them:
qiime deblur denoise-16S \
--i-demultiplexed-seqs ch1-demux-filtered.qza
--p-trim-length 249
--o-representative-sequences ch1-rep-seqs-deblur.qza
--o-table ch1-table-deblur.qza
--p-sample-stats
--o-stats ch1-deblur-stats.qza

This is the error that happened to

and this one happened when I tried to remove the spaces (but it didn't work)

although the files are there in the same directory, including the files from the filtering step

When I tried to run the same data through dada2, this is what I faced for the two of them:
This is the paired-end dataset from the Illumina NovaSeq 6000 (but only the forward reads are being used)

and this is the second one (single-end, from the Illumina MiSeq)

Is there a way to solve this error regarding the error rates? I would prefer to use dada2.

Here is all the other information, in case it helps define the problem:

The data is from NCBI bioprojects and it's 16S rRNA microbiome data
The machine is a Mac with a processor of 2.9 GHz Dual-Core Intel Core i7 and a memory of 16 GB 1600 MHz DDR3
The R version is 4.2.2 (2022-10-31)
The Qiime2 version is: q2cli version 2022.8.0

Any help is very much appreciated.

crusher083 · December 27, 2022, 8:27am

So, first error:
Your command is incorrect: as posted, only the first line ends with a line-continuation backslash, so the shell treats the remaining lines as separate commands. Please refer to the bash tutorial "Split Long Bash Command into Multiple Lines in a Script", and to the "Moving Pictures" tutorial (QIIME 2 2022.11.1 documentation) for an example of correct commands.
DADA2 1st error: as written in the log, you have too few reads for error estimation. The default number of reads used to learn the error model is 1M.
DADA2 2nd error: as written in the log, your quality scores are wrong. Either the data are corrupted or the import was done incorrectly.
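
As a minimal sketch of the line-continuation fix for the first error, the deblur command from the question can be written with a backslash at the end of every line except the last, with nothing (not even a space) after each backslash. The file names are the ones from the question. The `echo` and the `preview` variable are only an illustrative way to check how the shell joins the lines without actually running QIIME 2; they are not part of the original command.

```shell
# Preview only: echo prints the joined command instead of running it.
# Every line but the last ends with a backslash and no trailing spaces.
preview=$(echo qiime deblur denoise-16S \
  --i-demultiplexed-seqs ch1-demux-filtered.qza \
  --p-trim-length 249 \
  --o-representative-sequences ch1-rep-seqs-deblur.qza \
  --o-table ch1-table-deblur.qza \
  --p-sample-stats \
  --o-stats ch1-deblur-stats.qza)

# If the continuations are correct, this is a single line.
echo "$preview"
```

If the preview comes out as one line containing all the options, the same text without the `echo` wrapper should reach `qiime` as a single command.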

Cheers,
Valentyn

EnyaroHatsonveski · December 31, 2022, 11:20am

Thank you for your response. However, regarding the first DADA2 error: the dataset already has over 1 million reads. Here are the fastq stats:

As for the second DADA2 error, the importing step was done successfully with no errors reported, and the data is already published so I don't think it can be corrupted?
I am attaching the demux file for your reference. Your help is greatly appreciated.
single-end-demux.qzv (292.5 KB)

crusher083 · December 31, 2022, 11:41am

Welcome to the public data world!

All bases have the same quality scores - something went wrong during the data upload.
There are two options:

they uploaded quality-controlled and not raw data
something went wrong during the upload

The only way to know is to ask the authors directly. For now, I would label the dataset as "corrupted".
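
One rough way to check the "same quality scores" observation outside of the demux visualization is to count how many distinct quality characters a FASTQ file contains. The sketch below is only an illustration: the two-read FASTQ is made up, `count_distinct_quals` is a hypothetical helper name, and it assumes the standard four-lines-per-record layout.

```shell
# Made-up two-read FASTQ for illustration only; every base has quality "F".
demo_fastq=$(mktemp)
cat > "$demo_fastq" <<'EOF'
@read1
ACGTACGT
+
FFFFFFFF
@read2
TTGACCAA
+
FFFFFFFF
EOF

# Hypothetical helper: print how many distinct quality characters a FASTQ uses.
# Assumes four lines per record, so the quality string is every 4th line.
count_distinct_quals() {
  awk 'NR % 4 == 0 {
         for (i = 1; i <= length($0); i++) seen[substr($0, i, 1)] = 1
       }
       END { n = 0; for (c in seen) n++; print n }' "$1"
}

n_quals=$(count_distinct_quals "$demo_fastq")
echo "distinct quality characters: $n_quals"
rm -f "$demo_fastq"
```

A count of 1 means every base carries the same score, which is what you would not expect from raw reads. For a real dataset, decompress the file first (for example with `gunzip -c`) and pass the resulting FASTQ path to the helper.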

Cheers
V

EnyaroHatsonveski · January 1, 2023, 3:48am

Thank you for the clarification. However, I just want to highlight that the two errors are for two different datasets. I have read somewhere on the forum that DADA2 can't work with reads from higher-throughput sequencers such as the Illumina NovaSeq 6000 and Illumina MiSeq (which is my case). The dataset with the first DADA2 error worked fine with deblur, but the second showed an error which I am still trying to fix. Does the fact that the dataset ran successfully through deblur mean that it's not corrupted? Or does this have nothing to do with it?

crusher083 · January 1, 2023, 8:53am

At this point I am totally lost. The demux report you sent doesn't look like raw data. If it was downloaded from SRA, there are many things that could have gone wrong. I have personally pointed out a few to the SRA maintainers, but the database is too big to fix everything.
It might be that the sequences were just quality controlled beforehand, but I don't know.

I have processed MiSeq sequences in DADA2 before and had no problems with it.

Cheers
V

llenzi · January 3, 2023, 9:56am

Hi @EnyaroHatsonveski,

deblur does not rely on quality scores, only on the sequences. That is why it can be used to co-analyze samples from different runs, as well as samples from the NovaSeq platform (which bins the quality scores). So you should be able to process all the samples together with deblur.

Sorry, I have lost track of the errors now. What is the latest error you are working on? Let's see if we can help more.
Cheers and happy new year
Luca
