Joining paired ends non-overlapping

I just have a quick question for clarification. In the command
qiime vsearch join-pairs, for the option
--p-allowmergestagger
does this allow paired ends that don’t overlap to still be combined? And if so, what does it put between the combined reads? A string of NNNs?

Hi @nricks,

The allowmergestagger option actually deals with the opposite of the situation you described. It is useful when your amplicon targets are shorter than the sequencing reads, so that the read from one direction extends past the start of the opposite read.

From the vsearch documentation:
“--fastq_allowmergestagger
When using --fastq_mergepairs, allow to merge staggered read pairs. Staggered pairs are pairs where the 3’ end of the reverse read has an overhang to the left of the 5’ end of the forward read. This situation can occur when a very short fragment is sequenced. The 3’ overhang of the reverse read is not included in the merged sequence. The opposite option is the --fastq_nostagger option. The default is to discard staggered pairs.”
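
To make the three situations concrete, here is a minimal sketch that classifies a read pair purely from its geometry. It is not how vsearch decides anything internally: the function name, the labels, and the example lengths are made up for illustration, and it ignores base quality, mismatches, and any minimum-overlap threshold.

```python
def classify_pair(fragment_len, fwd_len, rev_len):
    """Classify a read pair from fragment and read lengths (illustrative only).

    Coordinates are on the forward strand; the fragment spans [0, fragment_len).
    The forward read starts at 0, the reverse read ends at fragment_len.
    """
    # Position of the reverse read's 3' end in forward-strand coordinates.
    rev_3prime = fragment_len - rev_len

    # Portion of the fragment covered by both reads.
    overlap_start = max(0, rev_3prime)
    overlap_end = min(fwd_len, fragment_len)
    overlap = overlap_end - overlap_start

    if overlap <= 0:
        return "non-overlapping"   # nothing to merge on
    if rev_3prime < 0:
        return "staggered"         # reverse read overhangs the forward read's 5' end
    return "overlapping"           # the ordinary mergeable case


# Hypothetical 2 x 250 bp run with three different fragment lengths.
for fragment_len in (150, 400, 520):
    print(fragment_len, classify_pair(fragment_len, 250, 250))
```

Only the "staggered" case is affected by allowmergestagger; the "non-overlapping" case has no shared bases, so the option does nothing for it.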

So if I wanted to join together paired ends that didn’t overlap I would use the --fastq_nostagger option? Or is that just the default option to discard staggered pairs?

Hi @nricks,
First, just to clarify: the quotation in my previous post refers to vsearch’s own options, which qiime2 wraps using slightly different syntax, so make sure that in qiime2 you use the appropriate syntax as shown here.

And unfortunately not. The allowmergestagger option only controls whether staggered pairs are merged or discarded. I am not aware of any tools in qiime2 that allow merging of non-overlapping reads. It looks like the non-qiime2 version of vsearch has some notes on these scenarios, but I have no experience with the matter.

Even if some custom tool outside of qiime does this by inserting arbitrary bases between the reads, and perhaps aligning the result to some reference database, I would still be very hesitant about trusting this type of data. Those insert sizes would be estimates at best, and you would need a very reliable a priori fragment-length distribution for them to be useful. If you are missing an overlap region, I think your best bet is to use either your forward or your reverse reads and discard the other. It’s not ideal, but it would be much more reliable, and hassle-free!
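
As a rough illustration of why the insert size is only an estimate, the sketch below computes how many unsequenced bases would sit between two non-overlapping reads for a few candidate fragment lengths. The function name, the 250 bp read lengths, and the fragment lengths are all hypothetical; the point is only that the gap depends entirely on a fragment length you cannot observe from the reads themselves.

```python
def implied_gap(fragment_len, fwd_len, rev_len):
    """Unsequenced bases between the two reads; 0 if they touch or overlap."""
    return max(0, fragment_len - fwd_len - rev_len)


# Hypothetical 2 x 250 bp reads and three equally plausible fragment lengths.
fwd_len = rev_len = 250
for fragment_len in (520, 560, 600):
    gap = implied_gap(fragment_len, fwd_len, rev_len)
    print(f"fragment {fragment_len} bp -> gap of {gap} bases")
```

The same read pair implies a different amount of padding for each assumed fragment length, which is why any padded "merge" is only as good as the fragment-length estimate behind it.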
