Hello qiime2 community,
I have a couple of questions regarding paired-end reads and the joining/merging step with dada2.
I am working with fastq files from a paired-end MiSeq 16S amplicon run (V4 region). The files have a layout I have not seen before: both R1.fastq and R2.fastq seem to contain a mixture of orientations (each file contains both forward and reverse reads).
For example, within the R1.fastq file, the first sequence ID is
And the second sequence ID is
I have searched on the qiime2 forum and the closest thread relating to my issue was this one posted by Martin in December 2017. I am wondering if we used the same sequencing center.
However, his main problem was that he still had barcodes in his sequences, which made it difficult for him to import the data into qiime2. Mine do not seem to have any non-biological sequences (barcodes, primers, or linkers) within the sequences themselves.
I went ahead and imported and ran a complete analysis with these files using qiime2, version 2019.4, and everything ran smoothly. I obtained a feature table that seems consistent with what I was expecting from the samples I worked with.
I have two main questions:
- Based on the sequence ID, am I correct in interpreting that these files have both forward and reverse reads?
- If these files are mixed, would the joining step during dada2 (the denoiser I used for this analysis) correctly pair sequences from the R1.fastq and R2.fastq file, even if the files themselves have a mix of forward and reverse reads?
I have always thought that the R1.fastq must only contain forward reads, and R2.fastq must only contain reverse reads, and that all sequences in the R2.fastq file are reverse complemented during the joining step. I have never heard of mixed files such as the ones I am working with, and am hoping someone out there has experience with this type of data and can provide me with some clarification.
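In case it helps frame the two questions, here is a rough sketch of how the headers could be checked outside of qiime2. It assumes 4-line fastq records and Illumina-style headers, where the read number is either the first field after the space (e.g. `1:N:0:1`) or a trailing `/1` or `/2`. My real headers may not follow either convention. The function names and the toy records below are made up for illustration and are not part of qiime2 or dada2. The sketch only tallies the read-number field in one file and counts positions where the read names in the two files disagree; it says nothing about what dada2 does internally.

```python
import io
from collections import Counter
from itertools import zip_longest


def fastq_headers(handle):
    """Yield each record's header line without the leading '@' (assumes 4-line records)."""
    for i, line in enumerate(handle):
        if i % 4 == 0:
            yield line.rstrip("\n")[1:]


def read_number(header):
    """Return the read-number field ('1' or '2'), or None if the header style is not recognised."""
    parts = header.split(" ", 1)
    if len(parts) == 2 and ":" in parts[1]:
        # Assumed style: "<name> <read>:<filtered>:<control>:<index>"
        return parts[1].split(":", 1)[0]
    if header.endswith(("/1", "/2")):
        # Assumed older style: "<name>/1" or "<name>/2"
        return header[-1]
    return None


def read_name(header):
    """Return the part of the header that should be shared by both mates."""
    name = header.split(" ", 1)[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def tally_read_numbers(handle):
    """Count how many records in one file carry each read number."""
    return Counter(read_number(h) for h in fastq_headers(handle))


def count_mismatched_pairs(r1_handle, r2_handle):
    """Count positions where the two files hold differently named reads (or one file ends early)."""
    mismatches = 0
    for h1, h2 in zip_longest(fastq_headers(r1_handle), fastq_headers(r2_handle)):
        if h1 is None or h2 is None or read_name(h1) != read_name(h2):
            mismatches += 1
    return mismatches


# Toy records (invented, not my real data): read numbers are mixed within each file,
# but the names still line up position by position.
r1_text = (
    "@M01:1:000-A:1:1101:100:200 1:N:0:1\nACGT\n+\nIIII\n"
    "@M01:1:000-A:1:1101:101:201 2:N:0:1\nTTGA\n+\nIIII\n"
)
r2_text = (
    "@M01:1:000-A:1:1101:100:200 2:N:0:1\nCCGT\n+\nIIII\n"
    "@M01:1:000-A:1:1101:101:201 1:N:0:1\nGGTA\n+\nIIII\n"
)

print(tally_read_numbers(io.StringIO(r1_text)))
print(count_mismatched_pairs(io.StringIO(r1_text), io.StringIO(r2_text)))
```

For real files, the `io.StringIO(...)` objects would be replaced with `open("R1.fastq")` and `open("R2.fastq")` (or `gzip.open(..., "rt")` for compressed files). If a tally like this showed both read numbers within R1.fastq while the mismatch count stayed at zero, that would match what I think I am seeing.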
Thank you for any info you can provide!