Hello everyone,
One of the issues I'm experiencing in my analysis comes from low sequencing quality. I believe the sequencing company I sent my samples to does not have an optimized protocol for the reverse reads: the quality plot is very poor and shows the same pattern as another analysis run with the same company on different samples a few years ago.
I performed an initial analysis with QIIME 2 and obtained the following read counts and quality plot:
||forward reads |reverse reads|
|---|---|---|
|Minimum |32764 |32764|
|Median |59342.0 |59342.0|
|Mean |66898.966667 |66898.966667|
|Maximum |165126 |165126|
|Total |4013938 |4013938|
After filtering with DADA2, the percentage of merged reads is below 1%.
I decided to try using two filtering tools separately. First, I used Trimmomatic, and then I made another attempt with Cutadapt.
Trimmomatic
To keep a quality score of at least 20, I had to truncate the sequences in DADA2 at positions 230 (forward) and 152 (reverse).
After running Trimmomatic, I observed an improvement in the quality plot without significantly affecting the number of reads. This allowed me to run DADA2 with truncation lengths of 285F/232R. However, the resulting table still has very low merge percentages, with a maximum of 6.66%.
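One thing worth checking with these numbers is whether the truncated forward and reverse reads still overlap enough to be merged. The sketch below does that arithmetic for the two truncation settings mentioned above. It assumes DADA2's default minimum overlap of 12 bases; the amplicon length of 460 is a hypothetical placeholder (the real value depends on the primers and target region, which are not stated here), and the function names are made up for illustration.

```python
# Rough overlap check for paired-end merging (illustrative sketch only).

MIN_OVERLAP = 12  # DADA2's default minimum overlap when merging pairs


def expected_overlap(trunc_f: int, trunc_r: int, amplicon_len: int) -> int:
    """Bases shared by the forward and reverse reads after truncation.

    A negative value means the two reads do not reach each other at all.
    """
    return trunc_f + trunc_r - amplicon_len


def can_merge(trunc_f: int, trunc_r: int, amplicon_len: int,
              min_overlap: int = MIN_OVERLAP) -> bool:
    """True if the truncated reads overlap by at least min_overlap bases."""
    return expected_overlap(trunc_f, trunc_r, amplicon_len) >= min_overlap


# HYPOTHETICAL amplicon length, used only to make the example runnable.
# Replace it with the real length for the primers and region in use.
AMPLICON_LEN = 460

# Truncation lengths (forward, reverse) taken from the two runs described above.
SETTINGS = {
    "DADA2 only": (230, 152),
    "after Trimmomatic": (285, 232),
}

for label, (trunc_f, trunc_r) in SETTINGS.items():
    overlap = expected_overlap(trunc_f, trunc_r, AMPLICON_LEN)
    verdict = "enough" if can_merge(trunc_f, trunc_r, AMPLICON_LEN) else "not enough"
    print(f"{label}: {trunc_f}F/{trunc_r}R -> overlap {overlap} ({verdict})")
```

If the overlap comes out below 12 for the real amplicon length, a low merge rate would be expected no matter how the reads were filtered beforehand.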
After processing my sequencing data with Trimmomatic, I obtained a file containing only the reads that were successfully paired. However, when I proceeded with the analysis in QIIME 2, the percentage of merged pairs was unexpectedly low. This is puzzling to me, since Trimmomatic specifically keeps sequences that can be paired.
I have carefully reviewed the analysis steps and checked that the parameters used in QIIME 2 are appropriate for my data, including the pairing settings for the sequences processed by Trimmomatic. I have also considered the possibility of biological variability in my samples.
Demultiplexed sequence counts summary
||forward reads|reverse reads|
|---|---|---|
|Minimum|29362|29362|
|Median|55000.0|55000.0|
|Mean|61351.066667|61351.066667|
|Maximum|154057|154057|
|Total|3681064|3681064|
I don't understand why all my sequences are considered chimeric, and I don't know if I am doing anything wrong.
Files with just DADA2:
demuxPrueba.qzv (322.8 KB)
denoising_stats_sintrim..qzv (1.2 MB)
Files trimmed with Trimmomatic:
denoising_stats_trimmed.qzv (1.2 MB)
demuxtrimeado.qzv (324.4 KB)
Could you please help me understand why QIIME 2 is unable to merge the sequences, even though Trimmomatic has already selected them as paired? I would greatly appreciate any insights or guidance you can provide to resolve this issue.