Hi, I'm analyzing a set of 16S rRNA samples obtained from human faeces, and this is my first time doing this kind of analysis.
Short description
V3-V4 amplicons, paired-end sequencing (2x300 bp). The sequences were delivered already demultiplexed, so I created a manifest file and imported them into QIIME 2 (conda). There are 68 samples, each with ~100,000 sequences of 301 bp.
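In case it helps anyone reproduce the import step, here is a minimal sketch of how a paired-end manifest could be generated. It assumes the tab-separated manifest layout with the header sample-id / forward-absolute-filepath / reverse-absolute-filepath, and a hypothetical file naming scheme of <sample-id>_R1.fastq.gz and <sample-id>_R2.fastq.gz; the function names and paths are illustrative only, not the exact script used here.

```python
import csv
import io
import os
import re

# Assumption: files are named <sample-id>_R1.fastq.gz / <sample-id>_R2.fastq.gz.
READ_PATTERN = re.compile(r"^(?P<sample>.+)_R(?P<read>[12])\.fastq\.gz$")

# Header of the tab-separated paired-end manifest (assumed layout).
HEADER = ["sample-id", "forward-absolute-filepath", "reverse-absolute-filepath"]


def build_manifest_rows(fastq_paths):
    """Pair forward/reverse files by sample ID and return one row per sample."""
    pairs = {}
    for path in fastq_paths:
        match = READ_PATTERN.match(os.path.basename(path))
        if match is None:
            continue  # ignore files that do not follow the assumed naming scheme
        pairs.setdefault(match["sample"], {})[match["read"]] = os.path.abspath(path)

    rows = []
    for sample in sorted(pairs):
        reads = pairs[sample]
        if "1" not in reads or "2" not in reads:
            raise ValueError(f"sample {sample!r} is missing a forward or reverse file")
        rows.append([sample, reads["1"], reads["2"]])
    return rows


def manifest_text(fastq_paths):
    """Render the manifest as tab-separated text, header first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(HEADER)
    writer.writerows(build_manifest_rows(fastq_paths))
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    example = ["/data/S1_R1.fastq.gz", "/data/S1_R2.fastq.gz"]
    print(manifest_text(example), end="")
```

With 68 samples, the resulting file should contain 68 rows plus the header, which is an easy sanity check before importing.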
So far, I have managed to import the sequences and generate a .qzv file.
paired-end-demux.qzv (312.6 KB)
1. At this point, should I trim the first 16 bp of the FORWARD reads (median Q score < 28) to ensure better quality downstream? How do I do this?
Regarding joining of paired reads
According to this tutorial (demultiplexing flowchart), the next step is vsearch join-pairs.
2. Should I use the parameter --p-truncqual at this point? If I don't trim my forward reads beforehand (as described in question 1) and set, for example, a Q of 30, would this mess up the sequences and the process, given the quality of my data? Is it better not to use this parameter?
- Should I keep the default --p-minlen value (1)?
3. What is the expected overlap of the sequences? How does this affect the default value of --p-minovlen (10)? When and why should I change this parameter?
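To frame the overlap question, a rough estimate is forward read length + reverse read length - amplicon length. Below is a minimal sketch of that arithmetic. The amplicon length of 460 bp is only an assumption (the real value depends on the primers, which are not stated above), and the function name is illustrative.

```python
def expected_overlap(forward_len, reverse_len, amplicon_len):
    """Bases covered by both reads when each is sequenced inward from one amplicon end."""
    return forward_len + reverse_len - amplicon_len


READ_LEN = 301              # read length reported in this post
ASSUMED_AMPLICON_LEN = 460  # assumption: placeholder V3-V4 amplicon length
MIN_OVERLAP = 10            # default --p-minovlen value quoted above

overlap = expected_overlap(READ_LEN, READ_LEN, ASSUMED_AMPLICON_LEN)
print(f"expected overlap: {overlap} bp")
print(f"at least the minimum overlap of {MIN_OVERLAP} bp: {overlap >= MIN_OVERLAP}")
```

Under these assumptions the estimate comes out to 142 bp. Any bases truncated from the 3' end of either read reduce the overlap by the same amount, so the read lengths passed in should be the lengths after truncation.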
Regarding Quality Control
4. Why is the quality-filter q-score-joined method said to be deprecated? Why is it still recommended in the tutorial? What would be the alternative? Or is it OK to keep using it?
Regarding Denoising
I'm aiming to use Deblur to generate ASVs. Any advice before I reach this step?
Thanks in advance for your time and help.