Filter by size after overlapping reads (16S) - region V3-V4

Lucas_Carvalho · January 23, 2018, 5:03pm

Hello everyone,

I have a question. After overlapping the reads of the V3-V4 region, the merged sequences have different sizes, between 440 bp and 465 bp. As far as I understand, QIIME only supports reads of the same size in the next steps. How can I trim the reads to the same size? The idea is to trim the overlapped reads so as to maximize sequence similarity.

In my analysis, a histogram showed that 440 bp reads are more abundant than reads of other lengths. So I started trimming to 400 bp, counting from the first base of each overlapped read. The result depends on how the reads are distributed over the region, which is a deficiency of this strategy.
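
To make the strategy concrete, here is a minimal sketch in plain Python (not a QIIME command). The function names and the toy reads are made up for illustration; real joined V3-V4 reads would be roughly 440 to 465 bp.

```python
from collections import Counter


def length_histogram(reads):
    """Count how many reads have each length."""
    return Counter(len(read) for read in reads)


def truncate_reads(reads, length):
    """Keep the first `length` bases of each read; drop reads shorter than that."""
    return [read[:length] for read in reads if len(read) >= length]


# Toy joined reads (hypothetical data, much shorter than real amplicons).
reads = ["ACGTACGTAC", "ACGTACGTACGG", "ACGTACGTACGGTT", "ACGTACG"]

hist = length_histogram(reads)
trimmed = truncate_reads(reads, 10)

print(sorted(hist.items()))  # read lengths and their counts
print(trimmed)               # every remaining read is exactly 10 bases long
```

Because truncation always starts at the first base, everything past the cut-off is discarded from the 3' end, and any read shorter than the cut-off is lost entirely.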

Thanks to all.

Nicholas_Bokulich · January 23, 2018, 11:58pm

Hi @Lucas_Carvalho,

Short answer: length trimming is not required in any step in QIIME2, but for paired-end 16S reads I would recommend following the Atacama soils tutorial to process your data, i.e., denoising with dada2. In that case reads are joined after denoising and optional trimming; in fact, dada2 can only be run on paired data this way, and reads cannot be joined before being passed to dada2. Unless your data are of a type that is not yet supported in QIIME2 (e.g., dual-index reads), you should be able to follow those steps to get the job done right.

A bit of length variation is normal. I'm not sure what the exact range should be (it depends on your primers), but 440-465 bp sounds typical.

Where did you get the information that QIIME only supports reads of the same size? Many different processing pipelines are possible in QIIME2, and none of them require trimming (e.g., trimming can be disabled in deblur and dada2). If you are reading that trimming is required, please let me know where and we can get that fixed.
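
As a rough sketch of what disabling truncation in dada2 could look like, the snippet below only assembles a command line and does not run anything. It assumes the `--p-trunc-len-f` and `--p-trunc-len-r` options of `qiime dada2 denoise-paired`, where a value of 0 is understood to mean "no truncation"; the helper name and the file names (`demux.qza`, `dada2-out`) are hypothetical, so please check `qiime dada2 denoise-paired --help` for your installed release.

```python
def dada2_paired_command(demux_qza, output_dir, trunc_len_f=0, trunc_len_r=0):
    """Build (but do not run) a `qiime dada2 denoise-paired` command line.

    Assumption: a truncation length of 0 disables truncation.
    """
    return [
        "qiime", "dada2", "denoise-paired",
        "--i-demultiplexed-seqs", demux_qza,  # unjoined paired-end reads
        "--p-trunc-len-f", str(trunc_len_f),  # forward reads
        "--p-trunc-len-r", str(trunc_len_r),  # reverse reads
        "--output-dir", output_dir,
    ]


# Hypothetical input artifact and output directory.
cmd = dada2_paired_command("demux.qza", "dada2-out")
print(" ".join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run` once the paths point at real data.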

I hope that solves your issue!
