I am trying to compare my results to those of the company that generated the reads. These are the processing/QIIME 1 steps they took:
- Trimmomatic or Cutadapt to cut out low-quality bases (Phred score < 30) and adapters
- Check that reads are at least 70% high quality (I'm not sure what counts as “high”) and ≥ 50 bp long
- Check for primers on each read*
- fastq-join to stitch together the reads with primers*
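To make the second step concrete, here is a minimal Python sketch of how I read that check. It assumes that “high quality” means a Phred score of at least 30 (matching the trimming threshold above) and that the FASTQ files use Phred+33 encoding. Both are my assumptions, not something the company confirmed, and the function names are made up for illustration.

```python
def decode_phred33(quality_line):
    """Convert a FASTQ quality string (assumed Phred+33) to integer scores."""
    return [ord(ch) - 33 for ch in quality_line]


def passes_quality_filter(phred_scores, min_phred=30, min_fraction=0.70, min_length=50):
    """Return True if the read is long enough and enough of its bases are high quality.

    Assumption: a base is "high quality" when its Phred score is >= min_phred.
    """
    n = len(phred_scores)
    if n < min_length:
        return False  # read is shorter than the 50 bp minimum
    high = sum(1 for q in phred_scores if q >= min_phred)
    return high / n >= min_fraction


# 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
good_read = decode_phred33("I" * 45 + "#" * 15)   # 60 bp, 75% high quality
short_read = decode_phred33("I" * 40)             # all high quality, but only 40 bp
noisy_read = decode_phred33("I" * 30 + "#" * 30)  # 60 bp, 50% high quality

print(passes_quality_filter(good_read))
print(passes_quality_filter(short_read))
print(passes_quality_filter(noisy_read))
```

If the company used a different cutoff for “high”, only `min_phred` would need to change.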
I’ve run the same data through QIIME 2, but I get substantially fewer sequences per sample. I used DADA2 (forward truncation: 270; reverse truncation: 200). I’m not sure how to reproduce the quality/trimming steps above in QIIME 2. Here are the forward/reverse reads.
*Also, I believe the primers must be removed before running DADA2, which also joins the reads. I’m not sure how to incorporate all of these steps.
And for an example of the paired end read differences, after the above processing/DADA2:
Company: My data:
I’m assuming it has to do with the truncation parameters I chose, but any other tips or suggestions would be most appreciated.
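For what it's worth, this is the arithmetic I am using to sanity-check the truncation lengths: a rough sketch, not a definitive rule. The amplicon lengths below are hypothetical placeholders (I have not stated my real amplicon length here), and the 12-base minimum overlap is what I understand DADA2's default merge requirement to be, so treat that as an assumption too.

```python
def merged_overlap(trunc_len_f, trunc_len_r, amplicon_length):
    """Bases shared by the forward and reverse reads after truncation.

    A negative value means the truncated reads do not reach each other.
    """
    return trunc_len_f + trunc_len_r - amplicon_length


def overlap_is_sufficient(trunc_len_f, trunc_len_r, amplicon_length, min_overlap=12):
    """Check the overlap against an assumed minimum needed for merging."""
    return merged_overlap(trunc_len_f, trunc_len_r, amplicon_length) >= min_overlap


# My truncation settings, tested against two hypothetical amplicon lengths.
for amplicon_length in (440, 460):
    overlap = merged_overlap(270, 200, amplicon_length)
    ok = overlap_is_sufficient(270, 200, amplicon_length)
    print(f"amplicon {amplicon_length} bp: overlap {overlap} bp, sufficient: {ok}")
```

If the overlap comes out too small for the real amplicon, many read pairs would fail to merge, which could explain the lower sequence counts.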