Best practices in comparing sequences from your study to a previous publication

Hi all, my apologies if this is a silly question; I am new to microbiome analysis.

I have completed my analysis of paired-end Illumina reads amplified with the 16S V3-V4 primer pair (341F-805R), using QIIME 2 with DADA2 and following the Amplicon SOP v2 (qiime2 2022.11) from the LangilleLab/microbiome_helper wiki on GitHub. I would now like to repeat this analysis but include sequences from another study to compare against my groups.

This study is linked here: Salivary Gluten Degradation and Oral Microbial Profiles in Healthy Individuals and Celiac Disease Patients - PubMed

The publication states that they also used paired-end reads, so I expected to download FASTQ files containing forward and reverse reads that I could pass through QIIME 2 together with my own sequences. However, when I downloaded the files, it appears that the reads had already been joined before submission.

My questions are as follows:

  1. Do I need to split the joined reads from the publication into forward/reverse reads and then continue with QIIME 2?
  2. Or is it appropriate to pass the joined sequences through QIIME 2 as single-end reads, merge the resulting ASV table with the DADA2 ASV table from my study, and then proceed with normalization?
  3. Will treating one set as single-end and the other as paired-end impact the results? If so, is there a more appropriate way to do this?

Please feel free to share your expertise. Thank you.


Not a silly question at all!

Regarding question 1: is it even possible to split joined reads back into forward and reverse reads? I have never heard of such an approach.

As for treating one dataset as single-end and the other as paired-end, I guess it will intensify the batch effect between the studies.

To avoid increasing the batch effect when comparing the studies, I would merge the paired reads from your own study with VSEARCH and then run them, together with the already-joined reads from the other study, through the Deblur pipeline for denoising.
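
Since Deblur expects all reads to be trimmed to one common length, it may help to look at the read-length distribution of both datasets first. Here is a minimal Python sketch for that, using only the standard library. The function names and the file names in the usage note are hypothetical, and it assumes plain four-line FASTQ records (optionally gzipped); it is a quick sanity check, not part of any QIIME 2 workflow.

```python
import gzip
from typing import Iterable, Iterator


def fastq_lengths(lines: Iterable[str]) -> Iterator[int]:
    """Yield the sequence length of each record in a 4-line FASTQ stream."""
    for i, line in enumerate(lines):
        if i % 4 == 1:  # the second line of every record holds the sequence
            yield len(line.strip())


def length_summary(lines: Iterable[str]) -> dict:
    """Return read count plus min / median / max read length."""
    lengths = sorted(fastq_lengths(lines))
    n = len(lengths)
    if n == 0:
        return {"reads": 0}
    return {
        "reads": n,
        "min": lengths[0],
        "median": lengths[n // 2],  # upper median for even n
        "max": lengths[-1],
    }


def summarize_file(path: str) -> dict:
    """Summarize a FASTQ file; .gz files are opened with gzip (assumption)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return length_summary(handle)
```

For example, you could call `summarize_file("published_study_sample.fastq.gz")` and `summarize_file("my_study_merged_sample.fastq.gz")` (both placeholder names). If the published reads are mostly in the range of a full V3-V4 amplicon (roughly 400 to 470 nt) rather than a typical single read length such as 250 or 300 nt, that is consistent with them having been joined already. The shorter of the two datasets then gives a rough upper bound for a shared trim length.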