fastqc or similar tool to analyze and compile sequencing data quality

MBugay · January 16, 2025, 6:48pm

Hi everyone,

For brief context, I received sequencing data back (Illumina MiSeq 2x250) and randomly selected 3 samples (out of 240) to check in fastqc. I noticed a pattern: in the reverse read, there is a major drop in quality at position 6 in the "Per base sequence quality" plot. In the "Per base N content" plot, there is also increased N at the same position. I find it a little strange that all 3 randomly selected samples show the same pattern.

I wanted to ask if there is a tool (similar to fastqc) where I can analyze and summarize the information of all 240 samples.

I haven't seen this before in other sequencing data that I have received, so I'd also like to understand if this is cause for concern. Thanks!

Example below: [fastqc plots for the Forward and Reverse reads]

colinbrislawn · January 16, 2025, 7:48pm

Hello again MBugay,

Because fastq files are the de facto standard for raw sequencing data, there are a lot of tools that summarize and visualize their quality!

Within Qiime2, there's qiime demux summarize which makes this output. Click on the tab called Interactive Quality Plot to see a similar graph to the one fastqc makes.

where I can analyze and summarize the information of all 240 samples.

Yes, the Qiime2 plugin provides summary stats across all samples you have imported! So if you import 240 samples, you will see the q-score box plot for all 240 samples at once!

You could also check out vsearch fastq-stats to get the stats, but you would have to make the graphs yourself...
https://docs.qiime2.org/2024.10/plugins/available/vsearch/fastq-stats/
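
To give a rough idea of what "getting the stats" per position involves, here is a minimal Python sketch. It is not how QIIME 2 or vsearch compute their summaries; it simply pools reads from several FASTQ files and reports the mean Phred score and the fraction of N calls at each read position. It assumes four-line FASTQ records and Phred+33 quality encoding, and the function name `per_position_stats` is made up for this example.

```python
import gzip
from collections import defaultdict


def per_position_stats(fastq_paths, phred_offset=33):
    """Pool reads from many FASTQ files and summarize each read position.

    Returns a list of (position, mean_quality, n_fraction) tuples with
    1-based positions. Assumes four-line records and Phred+33 qualities.
    """
    qual_sum = defaultdict(int)  # position -> summed quality scores
    n_count = defaultdict(int)   # position -> number of 'N' base calls
    depth = defaultdict(int)     # position -> number of reads covering it

    for path in fastq_paths:
        # Open gzip-compressed or plain-text FASTQ based on the extension.
        opener = gzip.open if str(path).endswith(".gz") else open
        with opener(path, "rt") as handle:
            seq = ""
            for line_no, line in enumerate(handle):
                field = line_no % 4
                if field == 1:    # sequence line
                    seq = line.rstrip("\r\n")
                elif field == 3:  # quality line
                    qual = line.rstrip("\r\n")
                    for pos, (base, q) in enumerate(zip(seq, qual), start=1):
                        qual_sum[pos] += ord(q) - phred_offset
                        n_count[pos] += base in "Nn"
                        depth[pos] += 1

    return [
        (pos, qual_sum[pos] / depth[pos], n_count[pos] / depth[pos])
        for pos in sorted(depth)
    ]
```

Passing it the paths of all 240 reverse-read files at once would show whether the dip at position 6 is shared across the whole run, although for real data the dedicated tools above are likely the better choice.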

I've seen this before. This is why plugins like DADA2 allow you to trim bases from the start of the read! Chopping off the first 6 bases from R2 is a good option!
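
For illustration only, this is what trimming from the start of a read does to a single FASTQ record. DADA2 handles this internally (in `qiime dada2 denoise-paired` the reverse-read setting is `--p-trim-left-r`), so the sketch below is not its implementation; the helper `trim_left` and the example read are made up.

```python
def trim_left(record, n_bases=6):
    """Drop the first n_bases from a FASTQ record.

    `record` is a (header, sequence, quality) tuple; sequence and quality
    are trimmed together so they stay the same length.
    """
    header, seq, qual = record
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    return header, seq[n_bases:], qual[n_bases:]


# A made-up reverse read with a low-quality N at position 6.
read = ("@example_read/2", "ACGTANGGCTTA", "IIIII#IIIIII")
print(trim_left(read))  # ('@example_read/2', 'GGCTTA', 'IIIIII')
```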

MBugay · January 16, 2025, 9:21pm

Hi Colin,

Thank you for the prompt response!

And thank you for the reminder about the tools already available within qiime2! I will definitely check out demux summarize and vsearch fastq-stats.

I saw a post on SEQanswers about MultiQC and AfterQC, which seemed interesting, but I am currently unfamiliar with them.

I've seen this before. This is why plugins like DADA2 allow you to trim bases from the start of the read! Chopping off the first 6 bases from R2 is a good option!

Great! Glad to know it's not a cause for concern.

Btw, I always find your responses so helpful!!

colinbrislawn · January 16, 2025, 10:02pm

I was just working with MultiQC today! It's a pretty cool tool if you already know how to use Nextflow.