Dear @adamova @crusher083, can you help me out? I'm quite confused about how to interpret the paired_reads.qzv file and how to denoise the data. I added some code I found to denoise, but it showed the following error ....
I believe that the new error (Invalid value) is not linked to the initial error (Plugin error from dada2).
As for the new error (Invalid value):
To denoise a paired-end sequencing file with dada2 denoise-paired, you need to provide it with a file that actually contains paired reads (via --i-demultiplexed-seqs). The input you provided in your latest sample code, fondue-output/single_reads.qza, is a single-read sequence file, not a paired-read sequence file. Hence you are being told that the input sequence type is invalid.
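For illustration, a minimal sketch of a denoise-paired call on the paired-end artifact instead (the fondue-output/paired_reads.qza path mirrors the naming in this thread; the truncation lengths and output names are placeholders you should choose yourself from the quality plots in paired_reads.qzv):

```shell
# Sketch: denoise a *paired-end* artifact with DADA2.
# Truncation lengths below are placeholders -- pick them from the
# interactive quality plot in your paired_reads.qzv visualization.
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs fondue-output/paired_reads.qza \
  --p-trunc-len-f 240 \
  --p-trunc-len-r 200 \
  --o-table table.qza \
  --o-representative-sequences rep-seqs.qza \
  --o-denoising-stats denoising-stats.qza
```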
As for the initial error (Plugin error from dada2):
In case this error occurs again, we need the file located in the folder indicated after "Debug info has been saved to", as mentioned by @crusher083 before.
Thank you @adamova
But whenever I try to denoise the paired_reads.qza file, an error occurs. The following is the code I ran to denoise the sequences:
My guess now would be RAM requirements. It seems like you're running it on a laptop/home computer, whereas, given the size of the data, it's preferable to run qiime2 on a cluster.
I don't know the details of the DADA2 implementation, but my rough prediction is 30 samples * 4 GB each ≈ 120 GB at peak for this dada2 action, as all the sequences need to be loaded into RAM at some point to build a general error model. It looks like these are metagenomic sequencing data.
Maybe @misialq has better insight regarding that?
[EDIT]: the study in the q2-fondue tutorial is an amplicon sequencing study; I was wrong.
As @crusher083 suggested, error code -9 is indeed a memory error. However, dada2's memory requirements are usually not as high as suggested and would not scale in that fashion. Usually 8 GB RAM is sufficient for a "typical" analysis, but this may vary.
I notice that you are using multiple threads, which will reduce runtime, but at the cost of higher RAM usage (as multiple processes run in parallel). So you might consider reducing the number of threads and/or shutting down other processes running on your computer to preserve RAM for this job.
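As a sketch of what that change looks like (file names and truncation lengths here are placeholders, not taken from your actual command):

```shell
# Sketch: restrict DADA2 to a single thread to cap peak RAM use.
# --p-n-threads 1 is the dada2 plugin's default; higher values run
# parallel workers and multiply memory consumption accordingly.
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs fondue-output/paired_reads.qza \
  --p-trunc-len-f 240 \
  --p-trunc-len-r 200 \
  --p-n-threads 1 \
  --o-table table.qza \
  --o-representative-sequences rep-seqs.qza \
  --o-denoising-stats denoising-stats.qza
```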
Perhaps I missed this in the discussion above, but dada2 should only be used for denoising amplicon data from the same marker gene, NOT for shotgun metagenomes. If you have shotgun metagenome data, you should not use dada2.