How to find which Phred encoding my data uses

Hi all - I was wondering if there is a straightforward way of finding out if a specific fastq file is encoded in Phred 33 or 64…
I mean - is there a way of telling from looking at the file itself? Should I run it by specific software?
I would like to have a way of checking that instead of changing my qiime2 code until I stop getting an error…

Thanks!

Hey there @ErikaGanda!

I think the safest way to figure it out is to ask the sequencing center, or whoever provided the sequence data. The problem is that the two encodings overlap: Phred 33 quality characters typically span ASCII 33 to 74, while Phred 64 characters typically start at ASCII 64, so really low quality Phred 64 data might look like high quality Phred 33 data (or vice versa). Depending on the scores in your file, you might still be able to work it out from the signal: if there are quality characters outside the range of overlap, that points you in the right direction. Anything below the Phred 64 range can only be Phred 33, and characters well above the usual Phred 33 range suggest Phred 64.
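If you want to eyeball this yourself, here is a minimal Python sketch of that "look outside the overlap" idea. It is only a heuristic, not how any particular tool does it: the function name `guess_phred_offset` and the `max_records` parameter are made up for illustration, and the cutoffs (below ASCII 59 means Phred 33, above ASCII 74 suggests Phred 64) are assumptions based on typical Illumina-style score ranges. Data with unusually high Phred 33 scores would be misclassified, so treat the result as a hint.

```python
import gzip
from itertools import islice


def guess_phred_offset(path, max_records=10000):
    """Heuristically guess the Phred offset of a FASTQ file.

    Returns 33, 64, or None when every quality character seen
    falls inside the range where the two encodings overlap.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    lowest, highest = 255, 0
    with opener(path, "rt") as handle:
        # Assumes strict 4-line FASTQ records, so the quality string
        # is every 4th line, starting at the 4th line of the file.
        for line in islice(handle, 3, max_records * 4, 4):
            codes = [ord(ch) for ch in line.rstrip("\r\n")]
            if codes:
                lowest = min(lowest, min(codes))
                highest = max(highest, max(codes))
    # Assumed cutoff: Phred 64 style encodings do not go below ASCII 59.
    if lowest < 59:
        return 33
    # Assumed cutoff: typical Phred 33 data tops out around ASCII 74.
    if highest > 74:
        return 64
    return None  # ambiguous (or no quality lines found)
```

Calling `guess_phred_offset("path-to-your-file.fastq")` returns 33 or 64 when the file contains a telltale character, and `None` when it cannot decide, in which case asking the data provider is still the safest route.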

@Robert_Edgar has a nice write-up on this, here (as well as some suggested tools to assist you): https://drive5.com/usearch/manual/quality_score.html

q2-demux’s summarize visualization will also attempt to warn you if it thinks the wrong Phred offset was selected on import.

VSEARCH has a convenient function for this (and VSEARCH is installed as a dependency of some QIIME 2 plugins so you should already have it in your environment, making this a convenient one to use):

vsearch --fastq_chars path-to-your-file.fastq

In addition to other info, it outputs a “Guess” for the Phred offset.

Great! Thank you @Nicholas_Bokulich and @thermokarst
You all are awesome!
