How do I denoise paired-end reads?

What's the problem here?

I have paired_reads.qzv open in QIIME 2 View, but when I tried to denoise the reads with the following code, it didn't work.

qiime dada2 denoise-paired \
      --i-demultiplexed-seqs fondue-output/paired_reads.qza \
      --p-trunc-len-f 64 \
      --p-trunc-len-r 0 \
      --o-table fondue-output/dada2_table.qza \
      --o-representative-sequences fondue-output/dada2_rep_set.qza \
      --o-denoising-stats fondue-output/dada2_stats.qza

Can anyone help? The project ID for this sequence data is "PRJNA329477".



Hi,

To help, we need the information from the log file in /tmp/. Please attach that file here.

Cheers,
Valentyn


Sorry, I couldn't access the log file in /tmp/.
I ran the code as shown below.

I'm following the downstream analysis section of the fondue tutorial.

The code there denoises single-end reads:

qiime dada2 denoise-single \
      --i-demultiplexed-seqs fondue-output/single_reads.qza \
      --p-trunc-len 120 \
      --o-table fondue-output/dada2_table.qza \
      --o-representative-sequences fondue-output/dada2_rep_set.qza \
      --o-denoising-stats fondue-output/dada2_stats.qza

What would the code be for paired-end denoising, either for this specific analysis or in general?
Here is a picture of the paired_reads.qzv file.


Dear @adamova @crusher083, could you help me out? I'm confused about how to interpret the paired_reads.qzv file and denoise the reads. I tried the denoising code below, which I found, but it produced the error that follows.

Code

qiime dada2 denoise-paired \
      --i-demultiplexed-seqs fondue-output/single_reads.qza \
      --p-trunc-len-f 150 \
      --p-trunc-len-r 150 \
      --p-trim-left-f 6 \
      --p-trim-left-r 6 \
      --p-trunc-q 2 \
      --o-table fondue-output/dada2_table.qza \
      --o-representative-sequences fondue-output/dada2_rep_set.qza \
      --o-denoising-stats fondue-output/dada2_stats.qza

Error report

Hi @turtle,

I believe that the new error (Invalid value) is not linked to the initial error (Plugin error from dada2).

As for the new error (Invalid value):
To denoise a paired-reads sequencing file with dada2 denoise-paired you need to provide it with a file that actually contains paired reads (in --i-demultiplexed-seqs). It seems that the input you provided in your latest sample code fondue-output/single_reads.qza is a single-reads sequence file, not a paired-reads sequence file. Hence, you are being informed that the input sequence type is invalid.
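One way to check this before denoising is to look at the artifact's semantic type with `qiime tools peek`. The snippet below is a minimal sketch, not part of the tutorial: `is_paired_end` is a hypothetical helper name, and it assumes the peek output contains a `Type:` line such as `SampleData[PairedEndSequencesWithQuality]` for paired-end data.

```shell
# Hypothetical helper: reads `qiime tools peek` output on stdin and
# succeeds only if the artifact holds paired-end reads.
is_paired_end() {
    grep -qF 'SampleData[PairedEndSequencesWithQuality]'
}

# Only runs when QIIME 2 is activated and the artifact exists.
if command -v qiime >/dev/null 2>&1 && [ -f fondue-output/paired_reads.qza ]; then
    if qiime tools peek fondue-output/paired_reads.qza | is_paired_end; then
        echo "paired-end artifact: suitable input for denoise-paired"
    else
        echo "not a paired-end artifact: denoise-paired will reject it"
    fi
fi
```

Running the same check on fondue-output/single_reads.qza should take the second branch, which matches the "Invalid value" error described above.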

As for the initial error (Plugin error from dada2):
If this error occurs again, we need the log file whose path is printed after "Debug info has been saved to", as @crusher083 mentioned earlier.
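If the path has scrolled out of the terminal, the log can usually be found again by looking for the newest matching file. This is a rough sketch under two assumptions: that the log name follows the pattern `qiime2-q2cli-err-*.log`, and that it was written to `$TMPDIR` or `/tmp`. `latest_qiime_log` is a hypothetical helper name.

```shell
# Hypothetical helper: print the newest QIIME 2 CLI error log in a
# directory (defaults to $TMPDIR, or /tmp when TMPDIR is unset).
latest_qiime_log() {
    dir="${1:-${TMPDIR:-/tmp}}"
    # ls -t sorts newest first; errors are silenced when nothing matches.
    ls -t "$dir"/qiime2-q2cli-err-*.log 2>/dev/null | head -n 1
}

log="$(latest_qiime_log)"
if [ -n "$log" ]; then
    echo "Newest log: $log"
    tail -n 40 "$log"    # show the end of the log, where the failure is often reported
else
    echo "No qiime2-q2cli-err-*.log file found"
fi
```

The file it prints is the one to attach to the thread.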

I hope this helps,
Anja

Thank you @adamova.
But whenever I try to denoise the paired_reads.qza file, an error occurs. This is the code I ran to denoise the sequences:

Code

qiime dada2 denoise-paired \
      --i-demultiplexed-seqs fondue-output/paired_reads.qza \
      --p-trim-left-f 20 \
      --p-trunc-len-f 142 \
      --p-trim-left-r 20 \
      --p-trunc-len-r 150 \
      --p-trunc-q 2 \
      --o-table fondue-output/dada2_table.qza \
      --o-representative-sequences fondue-output/dada2_rep_set.qza \
      --o-denoising-stats fondue-output/dada2_stats.qza \
      --p-n-threads 4 \
      --verbose

The error shown is:

The paired_reads.qzv file is attached here in case you want to look through it:
paired_reads.qzv (324.7 KB)

I also tried removing the existing fondue environment and creating a new one, but the same error occurs.

My guess now would be RAM requirements. It seems you're running this on a laptop or home computer, whereas with large datasets it's preferable to run QIIME 2 on a cluster.
I don't know the details of the DADA2 implementation, but my rough estimate is 30 samples × 4 GB each ≈ 120 GB at peak for this dada2 action, since all the sequences need to be loaded into RAM at some point to build a general error model. These look like metagenomic sequencing data.
Maybe @misialq has better insight regarding that?
[EDIT]: the study in the q2-fondue tutorial is an amplicon sequencing study, so I was wrong about that.

Cheers,
Valentyn

Hello,

As @crusher083 suggested, error code -9 is indeed a memory error. However, dada2's memory requirements are usually not as high as suggested and would not scale in that fashion. Usually 8 GB RAM is sufficient for a "typical" analysis, but this may vary.

I notice that you are using multiple threads, which will reduce runtime, but at the cost of higher RAM (as multiple processes are running in parallel). So you might consider reducing the number of threads and/or shutting down other processes running on your computer to preserve RAM for this job.
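As a sketch of that suggestion, the wrapper below rebuilds the failing command from earlier in the thread with the thread count as a parameter. `denoise_paired` and the `RUN` switch are hypothetical names introduced for illustration; by default the function only prints the command so it can be reviewed first.

```shell
# Hypothetical wrapper: same parameters as the failing run above,
# with the number of threads passed as the first argument.
denoise_paired() {
    threads="$1"
    set -- qiime dada2 denoise-paired \
        --i-demultiplexed-seqs fondue-output/paired_reads.qza \
        --p-trim-left-f 20 \
        --p-trunc-len-f 142 \
        --p-trim-left-r 20 \
        --p-trunc-len-r 150 \
        --p-trunc-q 2 \
        --o-table fondue-output/dada2_table.qza \
        --o-representative-sequences fondue-output/dada2_rep_set.qza \
        --o-denoising-stats fondue-output/dada2_stats.qza \
        --p-n-threads "$threads" \
        --verbose
    echo "$*"                        # show the exact command
    if [ "${RUN:-0}" = 1 ]; then     # execute only when RUN=1 is set
        "$@"
    fi
}

denoise_paired 1    # dry run: prints the single-thread command
```

Setting `RUN=1` in an activated QIIME 2 environment and calling `denoise_paired 1` again runs the job with one thread. That should lower peak memory at the cost of a longer runtime.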

Perhaps I missed this in the discussion above, but dada2 should only be used for denoising amplicon data from the same marker gene, NOT for shotgun metagenomes. If you have shotgun metagenome data, you should not use dada2.

Good luck!

