Metadata file & DADA2


I am struggling to generate the barcode and metadata file from my FASTQ files. I have dual-index multiplexed data (Illumina MiSeq) that does not have the barcode in the header. Therefore, I imported it as Casava 1.8 paired-end demultiplexed FASTQ because that format does not require the barcode. The output file is demux-paired-end.qza; I then ran the QIIME summarize step, which produced demux.qzv (demultiplexed reads). Next, I am following the “Atacama soil microbiome” tutorial and want to denoise with DADA2, but I do not have the metadata file needed to generate demux.qza. In addition, this is my link after demultiplexing reads. I would greatly appreciate it if you could guide me.

Thank you very much.

Best Regards,

Hi @Susie,

You only need the metadata file (containing a list of barcodes) if your data are not yet demultiplexed. Casava 1.8 data is by definition already demultiplexed. So if this is the correct format, you can proceed with the dada2 step described in that tutorial.

Thanks for providing the demux quality visualization. This really helps clarify — your files are indeed Casava 1.8 data, and are therefore already demultiplexed. You can input the QZA that you used to make that file directly to dada2 without problem. That is your demux.qza file. You will not need a metadata file.

I hope that helps!

Hi Nicholas,

Thanks very much for your reply.

I ran dada2 on my Mac and it ran for two hours without producing any output. Then I tried option 2, deblur, and it works. I am not sure whether this is the correct way to do it. Could you please advise me whether this is correct? I would still like to use option 1, dada2, to see whether the results differ. Based on the quality plot for demux.qza, what are the correct trim and trunc-len values? I like QIIME 2 so much!

I really appreciate your advice.

Best Regards,

Option 1 (it takes a lot of RAM):
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs demux-paired-end.qza \
  --p-trim-left-f 30 \
  --p-trim-left-r 18 \
  --p-trunc-len-f 248 \
  --p-trunc-len-r 75 \
  --p-n-reads-learn 100000 \
  --o-representative-sequences rep-seqs.qza \
  --o-table table.qza

Option 2:
qiime deblur denoise-16S \
  --i-demultiplexed-seqs demux-filtered.qza \
  --p-trim-length 100 \
  --o-representative-sequences rep-seqs-deblur.qza \
  --o-table table-deblur.qza \
  --o-stats deblur-stats.qza

Could you clarify what you mean by no output? Are you sure the job finished running? (dada2 can take a lot of RAM and long runtimes on some datasets). Is the issue that you get an output but it does not contain any sequences? If so, your trimming parameters are probably incorrect.
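One quick way to tell "still running" apart from "finished with nothing useful" is to check whether the output artifacts exist yet. A minimal sketch, assuming the output names from Option 1 (rep-seqs.qza and table.qza); `check_outputs` is a hypothetical helper, not part of QIIME 2:

```shell
# check_outputs: hypothetical helper, not a QIIME 2 command.
# Reports whether each expected output file exists and is non-empty.
check_outputs() {
    rc=0
    for f in "$@"; do
        if [ -s "$f" ]; then
            echo "$f: present"
        else
            echo "$f: missing or empty"
            rc=1
        fi
    done
    return "$rc"
}

# File names taken from Option 1 above.
check_outputs rep-seqs.qza table.qza || echo "dada2 has not written all of its outputs yet"
```

A file being present only shows that the job finished. To see whether it actually contains sequences, the table can be inspected afterwards, for example with `qiime feature-table summarize`.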

Your forward reads look good — I would push it to a trunc-len of 230. But the reverse reads have an early quality drop-off (~80 nt) that makes me worry about using those at all.

I would personally just use the forward reads and discard the reverse reads. But you could try using different reverse trunc-len settings to see how many merged reads you can recover with dada2.
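When trying different reverse trunc-len values, it can help to estimate first whether the truncated reads would still overlap enough to merge. The sketch below uses an amplicon length of 300 nt and a minimum overlap of 12 nt purely as illustrative placeholders (this thread does not state the amplicon length), so substitute the values for your primers and DADA2 version. `expected_overlap` is a hypothetical helper name.

```shell
# expected_overlap TRUNC_F TRUNC_R AMPLICON_LEN
# The forward read covers the first TRUNC_F bases of the amplicon and the
# reverse read covers the last TRUNC_R bases, so whatever exceeds the
# amplicon length is the overlap. trim-left removes bases from the 5' ends
# and does not change this number.
expected_overlap() {
    echo $(( $1 + $2 - $3 ))
}

amplicon_len=300   # ASSUMPTION: placeholder, not given in this thread
min_overlap=12     # ASSUMPTION: check the value your DADA2 version uses

# Forward trunc-len of 230 as suggested above; a few candidate reverse values.
for trunc_r in 75 80 120; do
    overlap=$(expected_overlap 230 "$trunc_r" "$amplicon_len")
    if [ "$overlap" -ge "$min_overlap" ]; then
        echo "trunc-len-r=$trunc_r: overlap $overlap nt, merging is plausible"
    else
        echo "trunc-len-r=$trunc_r: overlap $overlap nt, too short to merge"
    fi
done
```

A negative or very small overlap means the pairs cannot be merged at that setting, which is one reason the forward-only route may be the safer choice here.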

You can set trim-left to 0, unless you still have primers included in your reads (you can also trim these off each end with q2-cutadapt). It looks like the 5’ ends of your reads are fine.
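If primers are still present, a q2-cutadapt call along these lines could remove them before denoising. This is only a sketch: the primer sequences are placeholders (the thread does not say which primers were used), the output name demux-trimmed.qza is made up for illustration, and `show_or_run` is a hypothetical wrapper that prints the command unless RUN=1 is set.

```shell
# show_or_run: hypothetical wrapper. Prints the command by default and only
# executes it when RUN=1, so the sketch can be pasted without side effects.
show_or_run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else printf '%s\n' "$*"; fi
}

# PLACEHOLDERS: replace with the primers actually used for this run.
FWD_PRIMER="NNNNNNNNNNNNNNNNNNN"
REV_PRIMER="NNNNNNNNNNNNNNNNNNNN"

trim_primers() {
    show_or_run qiime cutadapt trim-paired \
        --i-demultiplexed-sequences demux-paired-end.qza \
        --p-front-f "$FWD_PRIMER" \
        --p-front-r "$REV_PRIMER" \
        --o-trimmed-sequences demux-trimmed.qza   # hypothetical output name
}

trim_primers
```

The trimmed artifact would then take the place of demux-paired-end.qza in the denoising step, with trim-left left at 0.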


Hi Nicholas,

Thanks so much for your response.

Yes, I agree that my trimming parameters are probably incorrect.

Based on your advice:

qiime dada2 denoise-paired \
  --i-demultiplexed-seqs demux-paired-end.qza \
  --p-trim-left-f 0 \
  --p-trunc-len-f 230 \
  --o-representative-sequences rep-seqs.qza \
  --o-table table.qza

I will run this using the forward reads and discarding the reverse reads. I will keep you updated on the progress.
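For the forward-only route, `denoise-paired` is not the natural fit, since it also expects a reverse truncation length. One way to sketch it is with `qiime dada2 denoise-single`, which should accept a paired-end artifact and use only the forward reads. The `--o-denoising-stats` output is required in recent QIIME 2 releases but may not exist in older ones, so treat that line as an assumption about your version. `show_or_run` is a hypothetical wrapper that prints the command unless RUN=1 is set.

```shell
# show_or_run: hypothetical wrapper. Prints the command by default and only
# executes it when RUN=1, so the sketch can be pasted without side effects.
show_or_run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else printf '%s\n' "$*"; fi
}

# Forward-only denoising with the values suggested in this thread
# (trim-left 0, trunc-len 230).
denoise_forward_only() {
    # ASSUMPTION: --o-denoising-stats depends on the QIIME 2 version;
    # drop that argument if your release does not accept it.
    show_or_run qiime dada2 denoise-single \
        --i-demultiplexed-seqs demux-paired-end.qza \
        --p-trim-left 0 \
        --p-trunc-len 230 \
        --o-representative-sequences rep-seqs.qza \
        --o-table table.qza \
        --o-denoising-stats denoising-stats.qza
}

denoise_forward_only
```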

Thank you very much!

Best Regards,
