I tried to denoise my fastq.gz file using DADA2 denoise-pyro in the Galaxy environment. But when I set trunc_len to 1574 (where the quality starts to drop below 20 at the 50th percentile), it gives me this error:
error
An error occurred with this dataset:
This plugin encountered an error:
No reads passed the filter. trunc_len (1574) may be longer than read lengths, or other arguments (such as max_ee or trunc_q) may be preventing reads from passing the filter.
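For context, here is a minimal sketch of the length rule that this message points at. It assumes that reads shorter than trunc_len are discarded and longer reads are cut to exactly that length, and that trunc_len = 0 means no truncation. The function name and the toy reads are made up for illustration; this is not the plugin's actual code.

```python
def truncate_reads(reads, trunc_len):
    """Keep only reads at least trunc_len long, cut to exactly trunc_len.

    trunc_len == 0 is treated as "no truncation or length filtering"
    (an assumption about how the plugin behaves).
    """
    if trunc_len == 0:
        return list(reads)
    return [r[:trunc_len] for r in reads if len(r) >= trunc_len]


# Toy reads of 300, 400 and 450 bases (hypothetical lengths, not real data).
toy_reads = ["A" * 300, "A" * 400, "A" * 450]

kept_385 = truncate_reads(toy_reads, 385)    # only the 400 and 450 base reads remain
kept_1574 = truncate_reads(toy_reads, 1574)  # no read is long enough
print(len(kept_385), len(kept_1574))
```

Under this rule, a trunc_len longer than nearly every read leaves nothing to denoise, which would match the error above.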
Here is my visualization after running Cutadapt trim and demux summarize:
https://view.qiime2.org/visualization/?type=html&src=https%3A%2F%2Fcancer.usegalaxy.org%2Fdisplay_application%2F31a3451e5d1dd80b%2Fq2view%2Fmain_q2view%2Fc545c7623fd22aa1%2Fdata%2Fgalaxy_31a3451e5d1dd80b.qzv
I also analyzed the .tsv file with ChatGPT to get an idea of where to truncate, and it gave me this:
ChatGPT:
Summary of the Analysis of the First TSV File:
- Data Characteristics: The file contained summary statistics for the quality scores of sequencing reads at each position. It was used to understand the quality distribution across the sequence lengths.
- Quality Score Observation: The analysis revealed that the median quality score fell below a conservative threshold (initially set at 20, then adjusted to 15 for Ion Torrent data) right from the beginning of the sequences.
- Challenge in Determining Truncation Point: Due to the consistently low quality scores from the start, it was challenging to determine an appropriate truncation point based on quality scores alone.
Summary of Truncation Parameters Suggestions:
- Context of 16S rRNA Sequencing: Considering that you are working with 16S rRNA gene sequences from Ion Torrent data, the typical lengths of these regions (like V3-V4) range from approximately 400 to 500 base pairs.
- Initial Trim (trimLeft): DADA2's recommendation for Ion Torrent data includes trimming the first 15 bases from each read (trimLeft=15) to remove low-quality bases at the start.
- Suggested Truncation Lengths (truncLen):
- Given the typical length of 16S rRNA regions, and after accounting for the initial trimming of 15 bases, I suggested experimenting with truncation points (truncLen) of 385, 435, and 485 bases. These lengths aim to capture the complete 16S rRNA regions while considering the quality profile of Ion Torrent sequencing data.
- Iterative Adjustment Approach: Due to the low-quality scores and the nature of Ion Torrent data, it was advised to start with these truncation points and then iteratively adjust them based on the results, focusing on balancing the sequence quality and quantity.
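The iterative approach above can be made a little more concrete. This is a minimal sketch, assuming the per-position summary provides a read count and a median quality for each position (I believe the demux summarize TSV is laid out this way, but that is an assumption). The numbers below are invented, not my data, and `pick_trunc_len` is a hypothetical helper. It picks the longest position that a chosen fraction of reads still reaches while the median quality stays at or above a threshold.

```python
def pick_trunc_len(counts, medians, min_fraction=0.8, min_quality=15):
    """Return the largest 1-based position where both conditions still hold:
    - at least min_fraction of all reads are that long, and
    - the median quality at every position up to it is >= min_quality.
    Returns 0 if even the first position fails.
    """
    total = counts[0]  # every read covers position 1
    best = 0
    for pos, (count, median) in enumerate(zip(counts, medians), start=1):
        if median < min_quality or count / total < min_fraction:
            break
        best = pos
    return best


# Invented example: 10 positions; few reads reach the last ones,
# even though the median quality there looks high.
counts = [100, 100, 100, 95, 90, 85, 60, 30, 5, 1]
medians = [30, 30, 29, 28, 27, 26, 25, 24, 30, 32]

print(pick_trunc_len(counts, medians))
```

The point of including the read count is that a median computed at a very late position may rest on only a handful of unusually long reads, so quality alone can suggest a truncation length that almost no read reaches.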
Based on this, I tried 385 as the truncation length, but here are the results:
Input - 5200
Filtered - 473
% passed the filter - 9.1%
Denoised - 1
Non-chimeric - 1
This is not a usable result.
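As a quick arithmetic check on the counts above, this small sketch recomputes each step as a percentage of the input reads. The dictionary keys are just labels chosen here. With these counts, the filtered step comes to roughly 9.1% of the input.

```python
# Read counts reported for the run with a truncation length of 385.
stats = {"input": 5200, "filtered": 473, "denoised": 1, "non-chimeric": 1}


def percent_of_input(stats):
    """Express every step after 'input' as a percentage of the input reads."""
    total = stats["input"]
    return {
        step: round(100 * n / total, 2)
        for step, n in stats.items()
        if step != "input"
    }


rates = percent_of_input(stats)
print(rates)
```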
Can anyone help me find a way out of this? (This analysis is for my university research and is crucial to me.)