Hi! I'm looking for guidance on whether my data prep is appropriate for running both DADA2 and Deblur. I have paired-end 16S sequences (FASTQ files, with read 1 and read 2 as separate files for each sample), so the data were already demultiplexed when I got them back from Illumina. I was able to import the FASTQ files, trim primers with cutadapt (the script ran without errors, though I'm still working out whether I trimmed the correct primer sequences, because I modeled my script on a former postdoc's; I have another post about this if it's relevant!), merge the reads with qiime vsearch join-pairs, and quality-filter the joined reads with the code below.
qiime quality-filter q-score \
  --i-demux step.01c.joined.qza \
  --p-min-quality 4 \
  --p-quality-window 3 \
  --verbose \
  --o-filtered-sequences step.01d.joined.filtered.qza \
  --o-filter-stats step.01d.joined.filtered.stats.qza
echo "backend: Agg" > ~/.config/matplotlib/matplotlibrc
qiime demux summarize \
  --i-data step.01d.joined.filtered.qza \
  --o-visualization step.01d.joined.filtered.qzv
Below are the three outputs I got, one after each of the steps.
My output looked like this after trimming with cutadapt:
After merging the reads (join-pairs with vsearch):
After quality-filtering the joined reads:
This is my first time analyzing any 16S sequencing data; previously I have only done the wet-lab library work. What should I be taking away from these plots about the data? Which characteristics of these plots do I need to incorporate into my next script for denoising? (The investigator would like me to run two separate denoising jobs, one with Deblur and one with DADA2, because they want to see how the ASV output compares to the OTU output before moving forward.)
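To help frame the question, here is a rough sketch of how the two denoising jobs might be laid out. It assumes DADA2 is given the primer-trimmed but still unjoined paired-end reads (qiime dada2 denoise-paired does its own quality filtering and read merging), while Deblur is given the joined, quality-filtered reads from above. The file name trimmed.paired.qza, the output directory names, the DRY_RUN switch, and all length values are placeholders for illustration only; the real truncation and trim lengths would have to be read off the quality plots.

```shell
# Sketch only: names and lengths marked "placeholder" are NOT taken from real data.
TRIMMED_PAIRED="trimmed.paired.qza"              # placeholder: cutadapt output, reads not yet joined
JOINED_FILTERED="step.01d.joined.filtered.qza"   # joined + q-score filtered reads from the step above
TRUNC_LEN_F=0      # placeholder: forward truncation position (0 = no truncation in DADA2)
TRUNC_LEN_R=0      # placeholder: reverse truncation position (0 = no truncation in DADA2)
TRIM_LENGTH=250    # placeholder: Deblur trims every joined read to this length

# With DRY_RUN=1 (the default here) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# DADA2 job: input is the unjoined paired-end artifact.
denoise_dada2() {
  run qiime dada2 denoise-paired \
    --i-demultiplexed-seqs "$TRIMMED_PAIRED" \
    --p-trunc-len-f "$TRUNC_LEN_F" \
    --p-trunc-len-r "$TRUNC_LEN_R" \
    --output-dir dada2-out
}

# Deblur job: input is the joined, quality-filtered artifact.
denoise_deblur() {
  run qiime deblur denoise-16S \
    --i-demultiplexed-seqs "$JOINED_FILTERED" \
    --p-trim-length "$TRIM_LENGTH" \
    --p-sample-stats \
    --output-dir deblur-out
}

denoise_dada2
denoise_deblur
```

With DRY_RUN=1 the script only prints the two commands. Setting DRY_RUN=0 would actually run them, which needs an activated QIIME 2 environment and output directories that do not exist yet.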
Many thanks for any assistance!