How to choose the same truncation length for different runs when their quality differs so much

MiriamGorostidi · November 7, 2022, 4:23pm

Hello!

I want to analyze many samples from 7 different sequencing runs. After importing them, I looked at the quality plots to choose where to truncate, but they differ so much between runs that I can't decide with confidence where to cut.

For most of the runs I would truncate at 200. However, there is one run where the quality only drops at 300 and another where I would truncate at 90. Given this variability, what would you do?

I have tried several times to attach some figures, but it is not working. I will try again tomorrow, in case I haven't explained it well.

Besides that, which factors actually affect these quality plots? How can I find out why the quality of one sequencing run is much lower than the others?

Thank you so much!

lizgehret · November 16, 2022, 7:37pm

Hi @MiriamGorostidi,

Thanks for reaching out, and apologies for the delayed response!

Can you provide us with some additional details on your analysis pipeline thus far? What commands have you run? If you have it, can you share your interactive quality plot? This is typically the best way to determine where to trim/truncate your data. Thanks!

MiriamGorostidi · November 22, 2022, 12:36pm

Hello @lizgehret

I imported the samples with the following commands:

qiime tools import \
  --type 'SampleData[SequencesWithQuality]' \
  --input-path samples-manifest.tsv \
  --input-format SingleEndFastqManifestPhred33V2 \
  --output-path /samples.qza

qiime demux summarize \
  --i-data samples.qza \
  --o-visualization samples-demux.qzv

I'm trying to upload the interactive quality plot for each run, but it is not working. The upload starts but stays at 0%.

Could you give me an e-mail address so I can send you a document?

Thank you so much!

lizgehret · November 23, 2022, 7:49pm

Hi @MiriamGorostidi,

Thanks for providing those details! @Mehrbod_Estaki suggested using Deblur for quality control here, since you won't have to worry about your quality scores when picking a truncation length. If there is a reason you need to stick with DADA2 and want to keep the truncation length the same across all of your runs, it'll be best to pick your parameters based on your worst run. Hope this helps!
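As a rough sketch of the "worst run" approach: the per-run values below are assumptions based on the numbers mentioned earlier in this thread (200 for most runs, 300 for one, 90 for one), and the output file names are placeholders. The shared truncation length is simply the smallest per-run value, and the DADA2 command is printed rather than executed so it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical per-run truncation lengths, read off each run's interactive
# quality plot (assumed values: five runs at 200, one at 300, one at 90).
RUN_TRUNC_LENS="200 200 200 200 200 300 90"

# The worst run is the one that supports the shortest truncation length,
# so the shared value is the minimum across all runs.
SHARED_TRUNC_LEN=$(printf '%s\n' $RUN_TRUNC_LENS | sort -n | head -n 1)
echo "shared truncation length: $SHARED_TRUNC_LEN"

# Build the DADA2 command as a string and print it instead of running it.
# Output artifact names are placeholders, not taken from the thread.
DADA2_CMD="qiime dada2 denoise-single \
  --i-demultiplexed-seqs samples.qza \
  --p-trunc-len $SHARED_TRUNC_LEN \
  --o-table table.qza \
  --o-representative-sequences rep-seqs.qza \
  --o-denoising-stats denoising-stats.qza"
echo "$DADA2_CMD"
```

Keep in mind the trade-off: a short shared value such as 90 discards good bases from the higher-quality runs, while a longer value would drop reads from the worst run, because DADA2 discards reads shorter than the truncation length.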

Cheers

MiriamGorostidi · November 24, 2022, 7:57am

Hi @lizgehret

Thank you for your advice! Is Deblur recommended whenever you work with multiple sequencing runs, or only when the quality plots differ this much between runs?

Yes, it helps a lot!

Thank you

lizgehret · December 6, 2022, 7:12pm

Hi @MiriamGorostidi,

Apologies for the late reply! There isn't a hard and fast answer to this, but @wasade has a great breakdown of the differences between DADA2 and Deblur to help you make that assessment for your particular dataset and analysis, which I've linked below:

Hope this helps! Cheers
