We are working through our first trial microbiome study after reading tutorials and forums for weeks! We received multiplexed raw data from BaseSpace, which includes read 1 (forward, R1) and read 2 (reverse, R2) files in fastq.gz format. We have some basic questions about the order of the pre-processing steps. Is this the correct order?
A) import paired-end reads to create QIIME artifact
*B) join paired-end reads (do we need to worry about correctly orienting the R1 and R2 reads?)
C) demultiplex data and cut barcodes/primers with the “demux-paired” and “trim-paired” commands from the cutadapt plugin
D) quality filter and denoise with DADA2
We are unsure whether we should join the paired-end reads at all, since it seems that the “demux-paired” command followed by the “trim-paired” command can demultiplex the PE reads and trim barcodes/adapters/primers from them (using the mapping file). Can we just skip step B above? We saw in another forum thread that DADA2 joins paired-end reads itself, so joined data should not be used as its input; is that correct? We’d appreciate any clarification of the correct order of steps for our data.
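To make the question concrete, here is a rough sketch of the order we have in mind if the join step (B) is skipped. It only builds and prints the QIIME 2 commands; nothing is executed. All file names, the mapping-file column name, the primer strings, and the truncation lengths are placeholders we made up, and the import type assumes the barcodes are still at the start of the reads, which may not be true for our data.

```python
import shlex


def build_pipeline():
    """Return the proposed QIIME 2 commands, in order, as argv lists.

    Nothing is executed. Every path, column name, primer, and truncation
    length below is a placeholder (an assumption, not taken from real data).
    """
    return [
        # Import the multiplexed paired-end reads (assumes barcodes are in-sequence).
        ["qiime", "tools", "import",
         "--type", "MultiplexedPairedEndBarcodeInSequence",
         "--input-path", "raw_reads/",
         "--output-path", "multiplexed.qza"],
        # Demultiplex using the barcodes listed in the mapping file.
        ["qiime", "cutadapt", "demux-paired",
         "--i-seqs", "multiplexed.qza",
         "--m-forward-barcodes-file", "mapping.tsv",
         "--m-forward-barcodes-column", "barcode-sequence",
         "--o-per-sample-sequences", "demux.qza",
         "--o-untrimmed-sequences", "untrimmed.qza"],
        # Trim primers from the still-unjoined R1 and R2 reads.
        ["qiime", "cutadapt", "trim-paired",
         "--i-demultiplexed-sequences", "demux.qza",
         "--p-front-f", "FWD_PRIMER_PLACEHOLDER",
         "--p-front-r", "REV_PRIMER_PLACEHOLDER",
         "--o-trimmed-sequences", "trimmed.qza"],
        # Quality filter, denoise, and merge pairs inside DADA2.
        # 0 means "no truncation"; real values would come from quality plots.
        ["qiime", "dada2", "denoise-paired",
         "--i-demultiplexed-seqs", "trimmed.qza",
         "--p-trunc-len-f", "0",
         "--p-trunc-len-r", "0",
         "--o-table", "table.qza",
         "--o-representative-sequences", "rep-seqs.qza",
         "--o-denoising-stats", "denoising-stats.qza"],
    ]


if __name__ == "__main__":
    for step, cmd in enumerate(build_pipeline(), start=1):
        print(f"# step {step}")
        print(shlex.join(cmd))
```

The point of the sketch is that the unjoined, demultiplexed, primer-trimmed reads go straight into DADA2, with no separate join step anywhere.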