QIIME2 Demultiplexing Sequence

Hi, friends in the QIIME2 forum! I am sorry to disturb you, but I am having trouble demultiplexing multiplexed sequencing files for QIIME2 analysis, and it is confusing me a lot.

I have two 16S multiplexed sequencing files, one labeled as BD_1.fq.gz and the other as BD_2.fq.gz.
The contents of these files are shown in the images.


The sequences in these two files contain barcodes at the beginning of each sequence, which are used for sample demultiplexing.

I also have a barcode file for demultiplexing the samples. It looks like this:

sample | 341F_barcode(5'-3') | 806R_barcode(5'-3')
s1 | XXXXX | XXXXX
s2 | XXXXX | XXXXX
s3 | XXXXX | XXXXX
.....

From the tutorials, I learned that QIIME2 plugins cannot be used for demultiplexing samples from dual-barcoded multiplex sequencing. However, I cannot find any tutorials on using the bcl2fastq software for demultiplexing multiplex sequencing samples.

Therefore, I would like to know how I can use my barcode information and multiplex sequencing files to demultiplex and obtain individual sample files.

Any method or software would be acceptable.

Thank you all for your help and support. I truly appreciate your time and assistance in addressing my query.

I believe it is now supported; see "demux-paired: Demultiplex paired-end sequence data with barcodes in-sequence" in the QIIME 2 2024.5.0 documentation.

You'll need to set up a metadata file. If you have the sample sheet used for sequencing, you could tweak that.
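
To make concrete what "barcodes in-sequence" means for a paired-end run, here is a minimal pure-Python sketch of the idea: a read pair is assigned to a sample when read 1 starts with that sample's forward barcode and read 2 starts with its reverse barcode. This is only an illustration and not how cutadapt is implemented (it matches exactly, with no error tolerance), and the barcode sequences and the `assign_sample` helper are made up for the example.

```python
from typing import Dict, Optional, Tuple

# Hypothetical barcode table: sample ID -> (forward barcode, reverse barcode).
# These sequences are invented for illustration only.
BARCODES: Dict[str, Tuple[str, str]] = {
    "s1": ("ACGTAC", "TTGCAA"),
    "s2": ("GATCGA", "CCATGG"),
}


def assign_sample(
    read1: str,
    read2: str,
    barcodes: Dict[str, Tuple[str, str]] = BARCODES,
) -> Optional[str]:
    """Return the sample whose barcodes start both reads, or None.

    Exact prefix matching only; a real demultiplexer usually
    tolerates a small number of mismatches.
    """
    for sample, (fwd, rev) in barcodes.items():
        if read1.startswith(fwd) and read2.startswith(rev):
            return sample
    return None
```

Pairs that return `None` correspond roughly to what a demultiplexer would report as unassigned reads.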

Hi Jono,

Thank you so much for your response to my question on the QIIME2 forum. I really appreciate your help and the time you took to provide such detailed information.

I have a few more questions regarding your suggestions.

I have thoroughly reviewed the document you recommended, but there are still some aspects that I find unclear.
I have a paired-end primer file containing barcodes. The format is as follows:

sample | primer_F | primer_R
s1 | barcode_F + primer_F | barcode_R + primer_R
s2 | barcode_F + primer_F | barcode_R + primer_R
s3 | barcode_F + primer_F | barcode_R + primer_R
....

Firstly, I would like to know what file I need to input for the --m-forward-barcodes-file parameter. Should it be a file consisting of only the sample column and the first primer column from the paired-end primer file like this?
sample | primer_F
s1 | barcode_F + primer_F
s2 | barcode_F + primer_F
s3 | barcode_F + primer_F
...
Similarly, for the --m-reverse-barcodes-file parameter, should it be a file consisting of the sample column and the second primer column?

Secondly, I am also unsure what the --m-forward-barcodes-column parameter requires. Is it the name of the forward primer column in the paired-end sample file, which is then used to demultiplex the samples?

Additionally, I have seen in many sample demultiplexing tutorials that it is necessary to reverse the order of the forward and reverse primers, perform the demultiplexing again, and then merge the results. Is this step necessary in cutadapt?

Finally, is the demultiplexed sample file I obtain the per-sample-sequences.qza file? If so, can I use this file directly for subsequent dada2 and taxonomic annotation analysis?

Thank you again for your assistance. I look forward to your response.

Best regards,

Zhang Chengwei

I have a paired-end primer file containing barcodes. The format is as follows:

sample | primer_F | primer_R
s1 | barcode_F + primer_F | barcode_R + primer_R
s2 | barcode_F + primer_F | barcode_R + primer_R
s3 | barcode_F + primer_F | barcode_R + primer_R

I am unsure, as I do not use cutadapt for demultiplexing, but I think you just want the barcodes here, not the primers.

I don't see why you can't use the same metadata file for forward and reverse reads, and then have --m-forward-barcodes-column primer_F and --m-reverse-barcodes-column primer_R to denote the correct columns.
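
As a rough sketch of that setup, the Python below writes one tab-separated metadata file holding both barcode columns and builds the matching `qiime cutadapt demux-paired` argument list. The column names `barcode_F` and `barcode_R`, the barcode sequences, and the file names are assumptions for illustration. The `sample-id` header is used because QIIME 2 metadata needs a recognised ID column name, which a plain `sample` header is not, as far as I know. The flags are the ones I understand the 2024.5 documentation to list, so please check them against `qiime cutadapt demux-paired --help`.

```python
import csv
import io
from typing import Iterable, List, Tuple

# Hypothetical rows: (sample ID, forward barcode, reverse barcode).
# Barcodes only, with the primer part already removed.
ROWS: List[Tuple[str, str, str]] = [
    ("s1", "ACGTAC", "TTGCAA"),
    ("s2", "GATCGA", "CCATGG"),
]


def metadata_tsv(
    rows: Iterable[Tuple[str, str, str]],
    fwd_col: str = "barcode_F",
    rev_col: str = "barcode_R",
) -> str:
    """Return the metadata table as tab-separated text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["sample-id", fwd_col, rev_col])
    writer.writerows(rows)
    return buf.getvalue()


def demux_command(
    seqs_qza: str,
    metadata_path: str,
    fwd_col: str = "barcode_F",
    rev_col: str = "barcode_R",
) -> List[str]:
    """Build the argument list; the same file serves both barcode columns."""
    return [
        "qiime", "cutadapt", "demux-paired",
        "--i-seqs", seqs_qza,
        "--m-forward-barcodes-file", metadata_path,
        "--m-forward-barcodes-column", fwd_col,
        "--m-reverse-barcodes-file", metadata_path,
        "--m-reverse-barcodes-column", rev_col,
        "--o-per-sample-sequences", "per-sample-sequences.qza",
        "--o-untrimmed-sequences", "untrimmed.qza",
    ]


if __name__ == "__main__":
    with open("metadata.tsv", "w", newline="") as handle:
        handle.write(metadata_tsv(ROWS))
    # Print the command rather than running it, so it can be reviewed first.
    print(" ".join(demux_command("multiplexed-seqs.qza", "metadata.tsv")))
```

Here `multiplexed-seqs.qza` stands for the two multiplexed files after they have been imported into QIIME 2; the name is a placeholder.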

I'm unsure whether you need to reverse complement the reverse barcodes. If you get unusual results (i.e. low numbers of reverse reads), I would try reverse complementing them, but I'd try without first.
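
If it does come to that, reverse complementing a barcode column is simple to script. Below is a small Python sketch; the `reverse_complement` helper name is my own, and it also handles the IUPAC ambiguity codes in case primer sequences with degenerate bases are passed through it.

```python
# Complement table for A/C/G/T plus the IUPAC ambiguity codes.
_COMPLEMENT = str.maketrans(
    "ACGTRYSWKMBDHVN",
    "TGCAYRSWMKVHDBN",
)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (upper-cased)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Hypothetical reverse barcodes, for illustration only.
    for barcode in ["TTGCAA", "CCATGA"]:
        print(barcode, reverse_complement(barcode))
```

Applying the function to every value in the reverse barcode column and re-running the demultiplexing gives a quick way to compare read counts between the two orientations.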

Thank you very much for your help.

I am now trying to use cutadapt to demultiplex the mixed sequencing samples. I also have a question: what method do you usually use to demultiplex the type of mixed sequencing samples I mentioned earlier?

Thank you again for your assistance.

Our samples usually arrive, either from a facility or from our own lab, already demultiplexed as two fastq.gz files per sample (one forward, one reverse). If need be, for example when the sample sheet for the run was wrong, I run bcl2fastq on the MiSeq run folder itself.

Dear Jono,

I hope this message finds you well.

I am writing to express my heartfelt gratitude for your prompt and helpful response to my query on the QIIME2 forum. Your guidance was invaluable and has significantly helped me overcome the challenges I was facing.

It is encouraging to see such a supportive community, and I am inspired to contribute back in the future.

Thank you once again for your time and assistance.

Best regards,

Zhang Chengwei
