Qiime dada2 denoise-paired denoise

Jeongsu_Kim · March 2, 2018, 12:06pm

I am working with MiSeq paired-end sequence files.
I ran DADA2, and the results suggest that most of the sequences were denoised.
My questions are:

The output shows input, filtered, denoised, and merged sequence counts for only 6 samples (out of 70 in total). Does that mean the remaining samples are fine? Is there a way to see the input, filtered, denoised, and merged counts for the remaining samples?
Is it okay to go on to the next step of the analysis with these files: table-dada2.qza and rep-seqs-dada2.qza?

I am attaching the command and the result.

(qiime2-2018.2) Jeongsus-MacBook-Pro:SM_raw Jeongsu$ qiime dada2 denoise-paired --verbose --i-demultiplexed-seqs demux-paired-end.qza --p-trunc-len-f 220 --p-trunc-len-r 220 --p-trim-left-f 10 --p-trim-left-r 10 --o-representative-sequences rep-seqs-dada2.qza --o-table table-dada2.qza
Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.

Command: run_dada_paired.R /var/folders/vh/l4ckn9311hx8yrvvbfpnh6zh0000gn/T/tmp9isl57vm/forward /var/folders/vh/l4ckn9311hx8yrvvbfpnh6zh0000gn/T/tmp9isl57vm/reverse /var/folders/vh/l4ckn9311hx8yrvvbfpnh6zh0000gn/T/tmp9isl57vm/output.tsv.biom /var/folders/vh/l4ckn9311hx8yrvvbfpnh6zh0000gn/T/tmp9isl57vm/filt_f /var/folders/vh/l4ckn9311hx8yrvvbfpnh6zh0000gn/T/tmp9isl57vm/filt_r 220 220 10 10 2.0 2 consensus 1.0 1 1000000

R version 3.4.1 (2017-06-30)
Loading required package: Rcpp
DADA2 R package version: 1.6.0

Filtering ......................................................................
Learning Error Rates
2a) Forward Reads
Initializing error rates to maximum possible estimate.
Sample 1 - 170760 reads in 40057 unique sequences.
Sample 2 - 72976 reads in 14057 unique sequences.
Sample 3 - 80375 reads in 17519 unique sequences.
Sample 4 - 91867 reads in 19942 unique sequences.
Sample 5 - 36676 reads in 7725 unique sequences.
Sample 6 - 40860 reads in 6791 unique sequences.
Sample 7 - 74639 reads in 16195 unique sequences.
Sample 8 - 51861 reads in 8882 unique sequences.
Sample 9 - 57126 reads in 12585 unique sequences.
Sample 10 - 59322 reads in 13621 unique sequences.
Sample 11 - 59887 reads in 12912 unique sequences.
Sample 12 - 53675 reads in 15168 unique sequences.
Sample 13 - 67513 reads in 14291 unique sequences.
Sample 14 - 46384 reads in 7620 unique sequences.
Sample 15 - 91381 reads in 15922 unique sequences.
selfConsist step 2
selfConsist step 3
selfConsist step 4
selfConsist step 5
selfConsist step 6
selfConsist step 7
Convergence after 7 rounds.
2b) Reverse Reads
Initializing error rates to maximum possible estimate.
Sample 1 - 170760 reads in 41084 unique sequences.
Sample 2 - 72976 reads in 17249 unique sequences.
Sample 3 - 80375 reads in 20455 unique sequences.
Sample 4 - 91867 reads in 22107 unique sequences.
Sample 5 - 36676 reads in 7559 unique sequences.
Sample 6 - 40860 reads in 7705 unique sequences.
Sample 7 - 74639 reads in 17660 unique sequences.
Sample 8 - 51861 reads in 9785 unique sequences.
Sample 9 - 57126 reads in 12761 unique sequences.
Sample 10 - 59322 reads in 14496 unique sequences.
Sample 11 - 59887 reads in 14477 unique sequences.
Sample 12 - 53675 reads in 14563 unique sequences.
Sample 13 - 67513 reads in 15693 unique sequences.
Sample 14 - 46384 reads in 8816 unique sequences.
Sample 15 - 91381 reads in 17443 unique sequences.
selfConsist step 2
selfConsist step 3
selfConsist step 4
selfConsist step 5
selfConsist step 6
Convergence after 6 rounds.
Denoise remaining samples .......................................................
The sequences being tabled vary in length.
Remove chimeras (method = consensus)
                               input  filtered  denoised  merged  non-chimeric
A1_S68_L001_R1_001.fastq.gz   199077    170760    170760     128           122
A10_S3_L001_R1_001.fastq.gz    80574     72976     72976      40            40
A11_S1_L001_R1_001.fastq.gz    92433     80375     80375      42            42
A12_S72_L001_R1_001.fastq.gz  103531     91867     91867      69            69
A14_S75_L001_R1_001.fastq.gz   41777     36676     36676      33            33
A15_S61_L001_R1_001.fastq.gz   46939     40860     40860      49            49
Write output
Saved FeatureTable[Frequency] to: table-dada2.qza
Saved FeatureData[Sequence] to: rep-seqs-dada2.qza

Mehrbod_Estaki · March 2, 2018, 12:16pm

Hi @Jeongsu_Kim,

The short answer is yes, all your samples have been processed and everything appears to have worked as expected, so you should be able to carry on with your analysis!
As for why the DADA2 output appears to cover only a subset of samples, see this recent discussion, which answers the same question.
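
For reading the per-sample numbers in the verbose output, here is a minimal sketch that turns the six rows printed above into percentages of input reads retained at each step. The `stats` variable and the `summarize_stats` helper are hypothetical names introduced only for this illustration; the rows are copied from the log. The commented-out command at the end is one way to get final per-sample counts for every sample from the feature table; the output filename `table-dada2.qzv` is an assumption.

```shell
#!/bin/sh
# Rows copied from the verbose DADA2 output in the question.
# Columns: sample, input, filtered, denoised, merged, non-chimeric
stats='A1_S68_L001_R1_001.fastq.gz 199077 170760 170760 128 122
A10_S3_L001_R1_001.fastq.gz 80574 72976 72976 40 40
A11_S1_L001_R1_001.fastq.gz 92433 80375 80375 42 42
A12_S72_L001_R1_001.fastq.gz 103531 91867 91867 69 69
A14_S75_L001_R1_001.fastq.gz 41777 36676 36676 33 33
A15_S61_L001_R1_001.fastq.gz 46939 40860 40860 49 49'

# Hypothetical helper: for each row, print the share of input reads
# that remain after filtering, merging, and chimera removal.
summarize_stats() {
  awk '{
    printf "%s filtered=%.1f%% merged=%.3f%% nonchimeric=%.3f%%\n",
           $1, 100 * $3 / $2, 100 * $5 / $2, 100 * $6 / $2
  }'
}

printf '%s\n' "$stats" | summarize_stats

# Final per-sample counts for all samples can be viewed by summarizing
# the feature table (output filename is an assumption):
# qiime feature-table summarize --i-table table-dada2.qza --o-visualization table-dada2.qzv
```

The helper only reformats numbers that are already in the log, so it covers just the six samples shown there; the summarized feature table is where the counts for the remaining samples would appear.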

Good luck with the rest of it!

Jeongsu_Kim · March 2, 2018, 12:51pm

Thanks for your quick answer!
