analysing trimmed data

kopelol · October 27, 2024, 6:37am

Hello.

I'd like to analyse 454 sequence data downloaded from SRA.

My questions are:

1) Are the quality scores of 454 reads interpreted differently from those of other sequencing platforms? According to the paper, the data has been filtered based on quality values, but the FASTQ file still contains reads with low quality.

2) How should ASVs and OTUs be classified? Additionally, which method would be the most suitable?

3) Is there a way to perform clustering while ignoring the quality score, assuming the data has been filtered?

Here are the details.

Data

Information about the data
-sequencer
454 GS FLX Titanium
-region
16S V1V2
-filtering & trimming
reads with quality value <25 were removed
primer sequences were removed
possibly chimeric sequences were removed
-analysis
3000 reads were randomly selected from filter-passed reads.
-16S
clustering OTUs using UCLUST
blast against their own database using GLSEARCH
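
As a rough illustration of the filtering and subsampling steps listed above, here is a minimal Python sketch. It is not the paper's actual pipeline. It assumes that "quality value <25" means the mean Phred score per read and that the scores are Phred+33 encoded. The function names (parse_fastq, mean_phred, filter_and_subsample) are hypothetical.

```python
import random


def parse_fastq(lines):
    """Yield (header, sequence, quality) tuples from 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # skip blank lines between or after records
        seq = next(it).strip()
        next(it)  # '+' separator line
        qual = next(it).strip()
        yield header.strip(), seq, qual


def mean_phred(qual, offset=33):
    """Mean Phred score of one quality string (Phred+33 assumed)."""
    return sum(ord(c) - offset for c in qual) / len(qual)


def filter_and_subsample(records, min_mean_q=25, n=3000, seed=0):
    """Keep reads whose mean quality is >= min_mean_q, then pick n at random.

    ASSUMPTION: the threshold is applied to the per-read mean; the paper
    may have used a different rule (e.g. a per-base or windowed score).
    """
    passed = [r for r in records if r[2] and mean_phred(r[2]) >= min_mean_q]
    if len(passed) <= n:
        return passed
    return random.Random(seed).sample(passed, n)
```

If the threshold was applied to the mean, individual bases below Q25 would still remain in the filter-passed reads, which may be one reason the downloaded FASTQ still shows low-quality positions.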

The FASTQ data obtained from SRA was already trimmed and included 3000 reads.

I'd like to analyse this data using QIIME 2, but the quality of the reads looked strange.
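
To put a number on "strange", one option is to count how many reads pass a Q25 threshold on their mean score versus on every single base. The Python sketch below assumes Phred+33 encoding and strict 4-line FASTQ records. The function names and the file name in the usage comment are hypothetical.

```python
from statistics import mean


def phred33(qual):
    """Decode a Phred+33 quality string into a list of integer scores."""
    return [ord(c) - 33 for c in qual]


def summarize_quality(quality_strings, threshold=25):
    """Count reads passing the threshold by mean score and by minimum score."""
    n_reads = 0
    pass_mean = 0
    pass_min = 0
    for qual in quality_strings:
        scores = phred33(qual)
        if not scores:
            continue  # ignore empty quality lines
        n_reads += 1
        if mean(scores) >= threshold:
            pass_mean += 1
        if min(scores) >= threshold:
            pass_min += 1
    return {
        "reads": n_reads,
        "mean_ge_threshold": pass_mean,
        "all_bases_ge_threshold": pass_min,
    }


def quality_lines(path):
    """Yield the quality line (4th line) of each record in a FASTQ file."""
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i % 4 == 3:
                yield line.rstrip("\n")


# Usage (hypothetical file name):
# print(summarize_quality(quality_lines("S1.fastq")))
```

A large gap between the two counts would suggest the original filter looked at an average per read rather than at each base.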

import

qiime tools import \
  --type SampleData[SequencesWithQuality] \
  --input-path manufest.txt \
  --output-path S1.qza \
  --input-format SingleEndFastqManifestPhred33V2

qiime demux summarize \
  --i-data S1.qza \
  --o-visualization S1.qzv

qiime dada2 denoise-single \
  --i-demultiplexed-seqs S1.qza \
  --p-trim-left 0 \
  --p-trunc-len 0 \
  --o-representative-sequences rep-seqs-dada2.qza \
  --o-table table-dada2.qza \
  --o-denoising-stats stats-dada2.qza

DADA2 error

Learning Error Rates
77896 total bases in 195 reads from 1 samples will be used for learning the error rates.
Error rates could not be estimated (this is usually because of very few reads).

Best regards,