Hi, I'm analyzing a set of 16S rRNA samples from human faeces for the first time.
The data are V3-V4 amplicons from paired-end sequencing (2x300 bp). The sequences were delivered already demultiplexed, so I created a manifest file and imported them into QIIME2 (conda). There are 68 samples, each with ~100,000 sequences of 301 bp.
So far, I have managed to import the sequences and generate a .qzv file.
paired-end-demux.qzv (312.6 KB)
1. At this point, should I trim the first 16 bp of the FORWARD reads (median Q score < 28) to ensure better quality downstream? How do I do this?
According to this tutorial (demultiplexing flowchart), the next step is
2. Should I use the parameter --p-truncqual at this point? If I don't trim my forward reads first (as described in question 1) and set it to, for example, Q30, would this mess up the sequences and the process, given the quality of my data? Is it better not to use this parameter?
- Should I keep the default
3. What is the expected overlap of the sequences? How does this affect the default value of --p-minovlen (10)? When and why should I change this parameter?
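For context on question 3, here is my own back-of-the-envelope estimate of the overlap. It is only a sketch: the ~460 bp amplicon length is an assumption (a figure commonly quoted for V3-V4 that I have not confirmed for my primer set), and the function name and the truncation positions are made up for illustration. As far as I understand, the overlap sits at the 3' ends of the reads, so trimming 16 bp from the 5' end of the forward read should not change it, whereas truncating at the 3' end does.

```python
def expected_overlap(fwd_kept: int, rev_kept: int, amplicon_len: int) -> int:
    """Bases covered by both reads of a pair.

    fwd_kept / rev_kept are the positions (counted from each read's start)
    up to which the read is kept after 3' truncation.
    A negative result means the reads no longer meet (a gap).
    """
    return fwd_kept + rev_kept - amplicon_len


READ_LEN = 301        # read length in my data
AMPLICON_LEN = 460    # ASSUMPTION: typical V3-V4 amplicon, not verified for my primers
MIN_OVERLAP = 10      # default of --p-minovlen as quoted in question 3

# Hypothetical 3' truncation positions, for illustration only
scenarios = {
    "no truncation": (READ_LEN, READ_LEN),
    "truncate to 280 / 220": (280, 220),
    "truncate to 240 / 200": (240, 200),
}

for label, (fwd, rev) in scenarios.items():
    overlap = expected_overlap(fwd, rev, AMPLICON_LEN)
    verdict = "enough" if overlap >= MIN_OVERLAP else "too short"
    print(f"{label}: overlap = {overlap} bp ({verdict} for a minimum of {MIN_OVERLAP})")
```

If this reasoning is right, untruncated reads would overlap by about 142 bp, so the default minimum of 10 would only become a concern if quality truncation cut the reads back heavily.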
I'm aiming to use Deblur and to generate ASVs. Any advice prior to reaching this step?
Thanks in advance for your time and help.