Deblur analysis of single-end-reads (joined reads with in-sequence primers)

Dear Qiime2 team,

I have some questions regarding Deblur, in particular the quality filtering step.

A bit of background: I have 2×251 bp paired-end data (515F-806R). I extracted the barcodes and joined the reads in QIIME 1 with default parameters. The mean length of my joined reads is 287 ± 10 bp.

I then imported the data into QIIME 2 as single-end reads and demultiplexed them (I have attached the quality plot). Here I already have a question: why do some positions show a box, whereas others show only a dashed line?

Since both primers were still in my sequences, I have trimmed them at the same time with cutadapt trim-single (I have attached the quality plot of these reads as well as the summary).

I then ran Deblur with --p-trim-length 273 (although some reads were shorter than this). After checking the demux-filter-stats file, I realized that no reads were truncated and none were too short after truncation. Is this normal?

Thank you!


Hi @Linnie,

Thank you for reaching out!

I’m not sure about the quality plots; perhaps it is just an artifact of the visualization? @thermokarst, do you know?

Just to verify, was q2-deblur run with --p-sample-stats? I want to make sure, as the stats are disabled by default because they can be expensive to gather. I would not expect reads to be too short after truncation, as the truncation clips the read to the length given by --p-trim-length. But given the parameters described and the plots, I would expect there to be > 0 reads that were trimmed. As a sanity check, if you rerun using --p-trim-length 200, do the stats report quite a few reads being trimmed?

Best,
Daniel

Those appear to be valid quality box plots, but with no variation in the boxes at those positions, so they are just effectively showing whiskers. If you hover over one of those positions, the 7-number summary in the table below the graph should show some values that support this. Thanks! :t_rex:
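To illustrate why a box can collapse to a line, here is a minimal sketch (not the code q2-demux uses for its plots). It computes a seven-number summary for a single position and checks whether the 25th and 75th percentiles coincide. The percentile set (2, 9, 25, 50, 75, 91, 98), the nearest-rank method, and the toy quality scores are all assumptions made for illustration.

```python
import math

# Assumed percentile set for the seven-number summary (illustrative).
PERCENTILES = (2, 9, 25, 50, 75, 91, 98)


def nearest_rank(sorted_values, pct):
    """Return the pct-th percentile of sorted_values (nearest-rank method)."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def seven_number_summary(scores):
    """Map each percentile in PERCENTILES to its value for one position."""
    ordered = sorted(scores)
    return {p: nearest_rank(ordered, p) for p in PERCENTILES}


def box_is_flat(summary):
    """True when the box (25th to 75th percentile) has zero height."""
    return summary[25] == summary[75]


# Toy data: most reads share one quality score at this position,
# so the box collapses while the whiskers still span a range.
mostly_uniform = [20] * 10 + [38] * 80 + [40] * 10
# Toy data: scores vary, so the box has visible height.
varied = [30, 32, 34, 35, 36, 37, 38, 38, 39, 40]

flat_summary = seven_number_summary(mostly_uniform)
varied_summary = seven_number_summary(varied)
```

In the first toy position the 25th, 50th, and 75th percentiles are identical, so only the whiskers would be visible; in the second the box has height.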


Hi @wasade,
thank you for your answer!

Yes, I did use that flag. However, that made me think of something, and I am afraid I have misled you. Sorry about that.

I see that no reads have been truncated in demux-filter-stats.qzv, that is, before deblur denoise. After Deblur I get a different statistics table, but it does not say how many sequences were trimmed or how many were dropped because they did not reach the desired length. Is there any way to get these numbers?

Thanks!
L.

No worries! You could likely infer the number of reads dropped by comparing the demux and deblur stats, but I don’t think we explicitly describe this right now. I also do not believe we explicitly track the number of reads trimmed. I’ve created an issue for q2-deblur about tracking those stats. In the near term, the solutions I can think of would require a little bit of programming or scripting. If these numbers are valuable for you, I’d be happy to propose a solution that operates out-of-band from q2-deblur.
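A minimal sketch of what such an out-of-band count could look like, assuming the reads have been exported to standard four-line FASTQ files (for example with qiime tools export) and that reads shorter than the trim length are dropped while longer reads are clipped. The function names and category labels are hypothetical, and this only reflects the length step, not any other filtering Deblur performs.

```python
import gzip
from collections import Counter


def read_lengths(fastq_path):
    """Yield the sequence length of each record in a four-line FASTQ file.

    Files ending in .gz are opened with gzip; others as plain text.
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # second line of each record is the sequence
                yield len(line.rstrip("\r\n"))


def classify(length, trim_length):
    """Label a read by how a fixed trim length would affect it (assumed behavior)."""
    if length < trim_length:
        return "dropped_too_short"
    if length > trim_length:
        return "trimmed"
    return "unchanged"


def tally(lengths, trim_length):
    """Count reads in each category for the given trim length."""
    counts = Counter({"dropped_too_short": 0, "trimmed": 0, "unchanged": 0})
    for length in lengths:
        counts[classify(length, trim_length)] += 1
    return dict(counts)


# Toy lengths standing in for one sample's reads.
example_counts = tally([250, 273, 290, 300], trim_length=273)
```

For a real sample, something like tally(read_lengths("sample.fastq.gz"), 273) would give the per-category counts, where the file name is only a placeholder.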

Note that the truncation reported by q2-quality-filter is likely quality-based, not read-length-based.

Best,
Daniel

Thank you so much, Daniel!
I will try doing some scripting to get those numbers. Thanks for opening an issue about it!
Best,
L.
