How to choose the same truncation length for different runs when their quality differs so much


I want to analyze many samples from 7 different sequencing runs. After importing them, I looked at the quality plots to choose where to truncate, but they differ so much between runs that I can't decide on a single position.

For most of the runs I would truncate at 200. However, there is one run where the quality only drops at 300 and another where I would truncate at 90. Given this variability, what would you do?

I have tried several times to attach some figures, but it is not working. I will try again tomorrow, in case I haven't explained it well.

Besides that, which characteristics actually affect these quality plots? How can I find out why the quality of one sequencing run is much lower?

Thank you so much!

Hi @MiriamGorostidi,

Thanks for reaching out, and apologies for the delayed response!

Can you provide us with some additional details on your analysis pipeline thus far? What commands have you run? If you have it, can you share your interactive quality plot? This is typically the best way to determine where to trim/truncate your data. Thanks! :lizard:

Hello @lizgehret

I imported the samples with the following commands:

qiime tools import \
  --type 'SampleData[SequencesWithQuality]' \
  --input-path samples-manifest.tsv \
  --input-format SingleEndFastqManifestPhred33V2 \
  --output-path /samples.qza

qiime demux summarize \
  --i-data samples.qza \
  --o-visualization samples-demux.qzv

I'm trying to upload the interactive quality plot of each run, but it is not working. The upload starts but stays at 0%.

Could you give me an e-mail address so I can send you a document?

Thank you so much!

Hi @MiriamGorostidi,

Thanks for providing those details! Something that @Mehrbod_Estaki suggested is using Deblur for quality control here, since you won't have to worry about your quality scores when picking a truncation length. If there is a reason you need to stick with DADA2 and want to keep the truncation length the same across all of your runs, it's best to pick your parameters based on your worst run. Hope this helps!
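As a rough sketch of the "worst run" idea: collect the truncation position you would pick for each run from its quality plot, take the smallest, and reuse it everywhere. The values and the file names (`run1.qza`, `run2.qza`, and so on) below are illustrative placeholders, not taken from real data. The Deblur part assumes 16S amplicon data, which hasn't been confirmed in this thread. The script only prints the commands (dry run), so nothing is executed until you remove the `echo`.

```shell
#!/bin/sh
# Truncation positions chosen by eye from each run's quality plot.
# Illustrative values only (the thread mentions runs near 200, 300 and 90).
trunc_lengths="200 200 300 90"

# "Worst run" rule: the smallest position is used for every run.
min_len=$(printf '%s\n' $trunc_lengths | sort -n | head -n 1)
echo "shared truncation length: $min_len"

# Dry run: print one DADA2 command per run (hypothetical file names).
# Each run is denoised on its own here, with the same --p-trunc-len.
for run in run1 run2; do
  echo qiime dada2 denoise-single \
    --i-demultiplexed-seqs "${run}.qza" \
    --p-trunc-len "$min_len" \
    --o-table "${run}-table.qza" \
    --o-representative-sequences "${run}-rep-seqs.qza" \
    --o-denoising-stats "${run}-stats.qza"
done

# Deblur alternative (assumes 16S data): quality-filter first, then denoise.
# Deblur still needs a single --p-trim-length shared by all reads.
echo qiime quality-filter q-score \
  --i-demux run1.qza \
  --o-filtered-sequences run1-filtered.qza \
  --o-filter-stats run1-filter-stats.qza
echo qiime deblur denoise-16S \
  --i-demultiplexed-seqs run1-filtered.qza \
  --p-trim-length "$min_len" \
  --p-sample-stats \
  --o-table run1-table-deblur.qza \
  --o-representative-sequences run1-rep-seqs-deblur.qza \
  --o-stats run1-deblur-stats.qza
```

Whether a shared length as short as the worst run's is acceptable depends on how much of the amplicon you can afford to lose, so treat the number as a starting point rather than a rule.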

Cheers :lizard:


Hi @lizgehret

Thank you for your advice! Is Deblur recommended whenever you work with multiple sequencing runs, or only when the quality plots differ this much between runs?

Yes, it helps a lot!

Thank you :slight_smile:

Hi @MiriamGorostidi,

Apologies for the late reply! There isn't a hard and fast answer to this, but @wasade has a great breakdown of the differences between DADA2 and Deblur to help you make that assessment for your particular dataset and analysis, which I've linked below:

Hope this helps! Cheers :lizard: