How to join sequences from different libraries sequenced on MiSeq

Hi @Ariadna,
Got it, I understand your setup now. Thank you for the sample metadata (map) file, that clarifies things.

This should be easy to process, but you must merge the libraries downstream, after denoising, rather than up front. Let's walk through what you have done so far:

Since you re-used the same barcodes in each of your 5 libraries, you cannot merge them before importing. So do not concatenate your libraries together. In all of the steps I describe below, you will process each library separately until we get to the merging stage.
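To see why concatenating first would be ambiguous, here is a minimal Python sketch that lists the barcodes appearing in more than one library. The column names `library` and `barcode-sequence`, and the example rows, are assumptions for illustration only; substitute the names from your own map file.

```python
import csv
import io
from collections import defaultdict


def barcodes_shared_across_libraries(metadata_tsv, library_col="library",
                                     barcode_col="barcode-sequence"):
    """Return {barcode: sorted list of libraries} for barcodes used in 2+ libraries."""
    libraries_by_barcode = defaultdict(set)
    reader = csv.DictReader(io.StringIO(metadata_tsv), delimiter="\t")
    for row in reader:
        libraries_by_barcode[row[barcode_col]].add(row[library_col])
    return {bc: sorted(libs)
            for bc, libs in libraries_by_barcode.items()
            if len(libs) > 1}


# Hypothetical merged map file: two libraries re-using the same barcode.
example = (
    "sample-id\tbarcode-sequence\tlibrary\n"
    "L1.S1\tACGTACGT\tlib1\n"
    "L1.S2\tTTGGCCAA\tlib1\n"
    "L2.S1\tACGTACGT\tlib2\n"
)

print(barcodes_shared_across_libraries(example))
# prints {'ACGTACGT': ['lib1', 'lib2']}
```

Any barcode reported here would map to more than one sample if the libraries were pooled before demultiplexing, which is why each library has to be demultiplexed on its own.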

importing

Where are the barcodes? Are they in-line with the reads, or somewhere else, e.g., in the header lines? Do you have a barcode on one end only or on each end? This all matters because it determines the format that you should import as. You have two options:

  1. Barcodes are in the sequences, only on one end (forward or reverse): import and demultiplex each library separately using q2-cutadapt.
  2. It looks like you already have some code for extracting barcodes, so you could just slice off the barcodes outside of QIIME 2 as you have done, but do it for each library separately. Then import and process as EMP paired-end format, as you began to do.

demultiplexing

At this point you can demultiplex as you did previously. As long as you keep your libraries separate, everything should be okay. Note that you will need to make a separate sample metadata (map) file for each library for the purposes of demultiplexing, because the redundant barcodes are breaking things! But keep the merged metadata file for downstream analysis.
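A minimal sketch of producing the per-library map files from the merged one is shown below. It assumes the merged file is tab-separated, has a column (here hypothetically named `library`) that says which library each sample belongs to, has a `barcode-sequence` column, and has no `#q2:types` comment line; adjust these names to match your file.

```python
import csv
import io
from collections import defaultdict


def split_metadata_by_library(metadata_tsv, library_col="library",
                              barcode_col="barcode-sequence"):
    """Return {library: TSV text}, one table per library, header preserved."""
    reader = csv.DictReader(io.StringIO(metadata_tsv), delimiter="\t")
    fieldnames = reader.fieldnames
    rows_by_library = defaultdict(list)
    for row in reader:
        rows_by_library[row[library_col]].append(row)

    tables = {}
    for library, rows in rows_by_library.items():
        # Within a single library every barcode must be unique.
        barcodes = [r[barcode_col] for r in rows]
        if len(set(barcodes)) != len(barcodes):
            raise ValueError(f"duplicate barcode within {library}")
        out = io.StringIO()
        writer = csv.DictWriter(out, fieldnames=fieldnames,
                                delimiter="\t", lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)
        tables[library] = out.getvalue()
    return tables


# Hypothetical merged map file with two libraries sharing one barcode.
merged = (
    "sample-id\tbarcode-sequence\tlibrary\n"
    "L1.S1\tACGTACGT\tlib1\n"
    "L1.S2\tTTGGCCAA\tlib1\n"
    "L2.S1\tACGTACGT\tlib2\n"
)

per_library = split_metadata_by_library(merged)
for library, text in per_library.items():
    print(f"--- {library}-metadata.tsv ---")
    print(text, end="")
```

Each value in `per_library` can be written to its own file and used for demultiplexing that library, while the original merged file stays untouched for downstream analysis.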

Then denoise/OTU cluster each library separately.

merging

At this point you will have a separate feature table and representative sequences artifact for each library. You can use the q2-feature-table actions merge (for the feature tables) and merge-seqs (for the representative sequences) to combine these before continuing with your analysis.
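As a rough sketch, the two merge commands for five libraries could be assembled like this. The file names (`lib1-table.qza`, `lib1-rep-seqs.qza`, and so on) are hypothetical, and the option names are those used by recent QIIME 2 releases, where an input option can be repeated; older releases accepted exactly two inputs, so check `qiime feature-table merge --help` for your version. The script only builds and prints the commands, it does not run them.

```python
import shlex


def build_merge_commands(libraries):
    """Build the two merge commands as argument lists (nothing is executed)."""
    merge_tables = ["qiime", "feature-table", "merge"]
    merge_seqs = ["qiime", "feature-table", "merge-seqs"]
    for lib in libraries:
        # Hypothetical per-library artifact names; use your own paths.
        merge_tables += ["--i-tables", f"{lib}-table.qza"]
        merge_seqs += ["--i-data", f"{lib}-rep-seqs.qza"]
    merge_tables += ["--o-merged-table", "merged-table.qza"]
    merge_seqs += ["--o-merged-data", "merged-rep-seqs.qza"]
    return merge_tables, merge_seqs


libraries = [f"lib{i}" for i in range(1, 6)]
tables_cmd, seqs_cmd = build_merge_commands(libraries)
print(shlex.join(tables_cmd))
print(shlex.join(seqs_cmd))
```

To actually run them, each list could be passed to `subprocess.run(cmd, check=True)` inside an activated QIIME 2 environment. As far as I know, merging feature tables expects sample IDs not to overlap between tables by default, so make sure the sample IDs are unique across your libraries.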

So you were on the right track, but you need to keep everything separate until after denoising since you re-use the same barcodes multiple times.

I hope that helps!
