We are working through our first trial microbiome study after reading tutorials and forums for weeks! We received multiplexed raw data from BaseSpace, which includes read 1 (forward, R1) and read 2 (reverse, R2) files in fastq.gz format. We have basic questions about the correct order of the pre-processing steps to get us started. Is this the correct order?
A) import paired-end reads to create QIIME artifact
*B) join paired-end reads (do we need to worry about correctly orienting the R1 and R2 reads?)
C) demultiplex data and cut barcodes/primers with “demux-paired” and “trim-paired” commands which use cutadapt plugin
E) quality filter and denoise with DADA2
We are a bit confused whether we even want to join the paired-end reads, as it seems that you can use the “demux-paired” command followed by “trim-paired” command to demultiplex and trim barcodes/adapters/primers from the PE reads (using the mapping file). Can we just skip step B above? I saw in another forum thread that DADA2 will join paired-end reads, so we shouldn’t use joined data as the input for DADA2, is that correct? We’d appreciate any clarification of the correct order of steps for our data.
I would skip step B if you are planning on using DADA2. More on that shortly...
That is exactly the case - DADA2 works best with reads that have not already been joined or quality-filtered, because part of the processing in DADA2 joins the reads for you. In some cases, where the overlap is insufficient or the reverse reads are of poor quality, it can be advantageous to run DADA2 on just the forward reads, but we will cross that bridge when we get there.
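To make "insufficient overlap" a bit more concrete, here is a rough back-of-the-envelope check in Python. It is only a sketch: the function names are made up for illustration, the amplicon and truncation lengths are hypothetical numbers rather than values from your run, and the minimum overlap of 12 is an assumption based on what I understand to be the default in DADA2's mergePairs, so check it against the version you run.

```python
def expected_overlap(amplicon_len: int, trunc_len_f: int, trunc_len_r: int) -> int:
    """Bases shared by the forward and reverse reads after truncation.

    A negative value means the two reads leave a gap and cannot be merged.
    """
    return trunc_len_f + trunc_len_r - amplicon_len


def can_merge(amplicon_len: int, trunc_len_f: int, trunc_len_r: int,
              min_overlap: int = 12) -> bool:
    """True if the truncated reads overlap by at least min_overlap bases.

    min_overlap=12 is an assumed default, not a value taken from this thread.
    """
    return expected_overlap(amplicon_len, trunc_len_f, trunc_len_r) >= min_overlap


if __name__ == "__main__":
    # Hypothetical short amplicon (253 nt), both reads truncated to 150 nt.
    print(expected_overlap(253, 150, 150), can_merge(253, 150, 150))
    # Hypothetical long amplicon (460 nt), reads truncated to 250 and 200 nt.
    print(expected_overlap(460, 250, 200), can_merge(460, 250, 200))
```

If the second kind of case describes your data after you pick truncation lengths, that is when denoising only the forward reads becomes worth considering.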
For now, I would recommend proceeding with A, C, E (btw, where did D go??), with one caveat. A data-integrity bug was just identified this week with q2-cutadapt's trim-paired method. I would recommend waiting until the 2018.6 release comes out before proceeding (we are on track for releasing later today).
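For reference, the A, C, E order could look roughly like the sketch below. It only builds and prints the commands, it does not run them. The file names, the barcode column, the primer sequences and the truncation lengths are all placeholders, and the option names are written as I recall them from q2-cutadapt and q2-dada2, so please confirm each one with `--help` in the release you install.

```python
import shlex


def build_pipeline(raw_dir, metadata, barcode_column,
                   fwd_primer, rev_primer, trunc_len_f, trunc_len_r):
    """Return (label, argv) pairs for steps A, C and E. Step B (joining) is skipped."""
    return [
        # A) import the multiplexed paired-end reads (barcodes still in the reads)
        ("A import", ["qiime", "tools", "import",
                      "--type", "MultiplexedPairedEndBarcodeInSequence",
                      "--input-path", raw_dir,
                      "--output-path", "multiplexed-seqs.qza"]),
        # C) demultiplex using the barcodes listed in the mapping file
        ("C demux-paired", ["qiime", "cutadapt", "demux-paired",
                            "--i-seqs", "multiplexed-seqs.qza",
                            "--m-forward-barcodes-file", metadata,
                            "--m-forward-barcodes-column", barcode_column,
                            "--o-per-sample-sequences", "demux.qza",
                            "--o-untrimmed-sequences", "untrimmed.qza"]),
        # C) trim the primers from the forward and reverse reads
        ("C trim-paired", ["qiime", "cutadapt", "trim-paired",
                           "--i-demultiplexed-sequences", "demux.qza",
                           "--p-front-f", fwd_primer,
                           "--p-front-r", rev_primer,
                           "--o-trimmed-sequences", "trimmed.qza"]),
        # E) quality filter, denoise and join the pairs with DADA2
        ("E denoise-paired", ["qiime", "dada2", "denoise-paired",
                              "--i-demultiplexed-seqs", "trimmed.qza",
                              "--p-trunc-len-f", str(trunc_len_f),
                              "--p-trunc-len-r", str(trunc_len_r),
                              "--output-dir", "dada2-out"]),
    ]


if __name__ == "__main__":
    # Every argument below is a placeholder; substitute your own values.
    steps = build_pipeline(
        raw_dir="raw-reads",               # hypothetical directory holding the R1/R2 files
        metadata="mapping.tsv",            # hypothetical mapping file name
        barcode_column="BarcodeSequence",  # hypothetical column name
        fwd_primer="GTGCCAGCMGCCGCGGTAA",  # example primer only, use your own
        rev_primer="GGACTACHVGGGTWTCTAAT", # example primer only, use your own
        trunc_len_f=200,                   # choose from your quality plots
        trunc_len_r=180,                   # choose from your quality plots
    )
    for label, argv in steps:
        print("# " + label)
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Note that nothing in this list joins the reads: the trimmed, still-paired reads go straight into DADA2.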
Thank you so much! We will wait for the release of the newest version and try what you suggest. It’s good sometimes to just get confirmation that we’re understanding this all correctly! It’s like learning a new language!