Replace barcode sequence with sample-Id

Hello,
I would like to analyze my MiSeq data with QIIME 2, but I am encountering challenges.
I have demultiplexed reads that are barcoded in the reverse, as shown below:

$ head Undetermined_R1
@M03511:188:000000000-BN34F:1:2106:8954:1142 1:N:0:NTNNNCCTAAAT
TGGTAGTCCATGCCGTAAACGATNAGNNNNCGCCCTTGGTCTACNCNNACCAGGGGCCNNNNNNNCGCGTGAAACACTCCGCNNGGGGAGNACGGNTNCAAGACCNNANCTCAAAGGAATTGACGGGGNNCNNNACAAGCGGTNGANNNNNNNNNTTAATTCGATGATACNNNNNNAACCTTACCANGGCNNNNNNNNAGACTGACCNNTNNGNNNNCAGATCTNNNGNNNNACAGTTTANNNNNNNNTNNNTNGTTGTCGNNNNNNNGTGCCGTGAGGTGGNNNNNNNNNTCNNNTAACG
+
CCCCCGGGGGGGGGGGGGGGGGG#=C####::DFGGGGGGGGGG#:##::FGGGGGGG#######::CFFGGGGGGGGGGGG##:CFGGG#:CFG#:#:CFGGGG##:#::DFGGGGGGGGGGGGGGG##:###::AFFGGGF#:6#########88@FGFGGGGGGGGF######66@BFGGGEG#6>:########44=EFCFG<##/##1####2:C8DF###/####22:DGDG:###########0#0.<:?D:#######(-((49?F0<?B?(#########–###,–2,

$ head Undetermined_R2
@M03511:188:000000000-BN34F:1:2106:8954:1142 2:N:0:NTNNNCCTAAAT
TCAAAGTTTGCNNNNNNNGTNTTGTTAGAGNNCNNNNNNNNNNTNNCTGGCAANNNNNNNCAGGGGTTGCGCTNNNNNNNNGACTTAACCTGACNNNNNNNNGCACGAGCTGACGNCANNNNNGCNNCACCTTGTAANNNGTNNTGCGAAANNTCTNNNNNNNAATCGGTCANNNNNNNTTTAAGCCTTGGTAAGGTTCCTCGCGTANNNNNNNATTAAACNNNNNNNTCCNCCCNNNNNNNNGGCCCCCNNCAANTCCTTTGGNTTNCGNNNNNGCAACCGTTNNNNNNNGGCGGNGNNT
+
CCC<CGGGGGG#######:=#=CFGGGGGG##:##########6##::DFGGG#######::CFGGGGGGGGE########::CAFGGGGG=FF########9:AFFGDGFGGGG#9A#####88##88DFGGGGGG###66##86AEFEG##+6@#######55@F9EEEA#######44=CFGGGGGGGGGGGGGGGGFFFF4?1#######21:=>A*#######(2-#-4(########-(-8?49##–(#–4044:(#-4#((#####(–(449?(#######(—3#-##(

I do not have a barcode.fastq file but I do know the sequence of my barcodes (96 barcode sequences, 12nt)

$ head map_file.txt
#SampleID BarcodeSequence LinkerPrimerSequence Description
T0.A.MockArea TCCCTTGTCTCC AGATCGGAAGAGCACACGTCTGAACTCCAGTCA

I am able to import the sequences using a manifest; however, I am not able to rename/append the SampleID in the sequence headers for downstream analysis, and it seems to me that this step is only done by qiime demux emp-paired.
Also, I would like to select only the 96 barcode sequences from the pool of reads that I have in the fastq.
Is there a workaround for this issue using QIIME 2?
Many thanks for the help

Hi @M.Amine.Hassan!

Are you sure your reads are demultiplexed? Just double-checking, because those filenames imply otherwise.
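One quick way to check is to tally the index sequences recorded at the end of each FASTQ header. This is only a sketch: it assumes Illumina-style headers like the ones posted above (index after the last colon), and `count_indexes` is a helper name made up for this example. Many different indexes in one file suggests the reads are still multiplexed.

```shell
# Tally the index sequence found after the last ":" of each FASTQ header.
# Assumes 4-line FASTQ records and Illumina-style headers,
# e.g. "@M03511:...:1142 1:N:0:NTNNNCCTAAAT".
count_indexes() {
  awk 'NR % 4 == 1 { n = split($0, parts, ":"); print parts[n] }' "$1" \
    | sort | uniq -c | sort -rn
}
```

For example, `count_indexes Undetermined_R1.fastq | head` lists the most frequent indexes first.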

If they are demultiplexed, you should be able to import them using the manifest format, and you can rename the sample IDs using the first column (sample-id) of the manifest file. This column is the sample-id after importing! That is how you go about re-identifying your samples.
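For already demultiplexed data, the manifest needs one forward and one reverse line per sample. Here is a rough sketch of generating it, assuming (hypothetically) that the per-sample files are named `<sample-id>_R1.fastq` and `<sample-id>_R2.fastq` and that you pass an absolute directory path; `make_manifest` is just an illustrative helper, not part of QIIME 2.

```shell
# Print a CSV manifest with the same header used in this thread.
# Assumed layout: <dir>/<sample-id>_R1.fastq and <dir>/<sample-id>_R2.fastq.
make_manifest() {
  echo "sample-id,absolute-filepath,direction"
  for f in "$1"/*_R1.fastq; do
    # The sample ID is the file name without the _R1.fastq suffix.
    sample=$(basename "$f" _R1.fastq)
    echo "$sample,$f,forward"
    echo "$sample,${f%_R1.fastq}_R2.fastq,reverse"
  done
}
```

Something like `make_manifest "$PWD/per-sample" > manifest` would then give each sample its own ID in the first column.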

Once you have imported your demuxed reads, you can use q2-cutadapt to trim off your barcodes, and any other primers that might be present.
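The trimming step could look roughly like the sketch below. The primer sequences are arguments you must supply yourself (none are taken from this thread), and the artifact names and the `trim_primers` wrapper are placeholders chosen for the example.

```shell
# Sketch: trim primers from demultiplexed paired-end reads with q2-cutadapt.
# $1 = demultiplexed artifact, $2 = forward primer, $3 = reverse primer.
trim_primers() {
  demux=$1; fwd_primer=$2; rev_primer=$3
  qiime cutadapt trim-paired \
    --i-demultiplexed-sequences "$demux" \
    --p-front-f "$fwd_primer" \
    --p-front-r "$rev_primer" \
    --o-trimmed-sequences trimmed-demux.qza
}
```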

Let us know how that goes for you! :t_rex:

I am sorry, but I think I have misstated the question.
The manifest renames the file.fastq.

$ head manifest
sample-id,absolute-filepath,direction
Run1,$PWD/Undetermined_R1.fastq,forward.

When I generate the qzv, I can see Run1 but no hint of the 96 samples.

What I am specifically asking is this: in the downstream analysis (filtering, OTU table) I cannot identify the 96 sample IDs provided in map_file.txt (i.e. Mock1, Condition1, etc.)

$ head map_file.txt
#SampleID BarcodeSequence LinkerPrimerSequence Description
Mock1 TCCCTTGTCTCC AGATCGGAAGAGCACACGTCTGAACTCCAGTCA
Condition1 TCCCTTGTCTCC AGATCGGAAGAGCACACGTCTGAACTCCAGTCA

At no step described so far did I use map_file.txt to indicate which barcode sequence corresponds to which sample (Mock/Condition), as is done in the Moving Pictures tutorial:

qiime demux emp-single \
  --i-seqs emp-single-end-sequences.qza \
  --m-barcodes-file sample-metadata.tsv \
  --m-barcodes-column BarcodeSequence \
  --o-per-sample-sequences demux.qza

Unless you are suggesting that I should split my reads by barcode sequence, each into its own fastq file (e.g. all reads for Mock1 in Mock1.fasta, etc.), and then import all of these as sequences using a manifest?

Hi @M.Amine.Hassan!

Did you see my question above? I will copy it here in case:

The manifest format is only for already demultiplexed data. It really sounds like your reads aren't demultiplexed, but if you could confirm one way or the other, that would be great. If they aren't demultiplexed, we should be able to import as MultiplexedPairedEndBarcodeInSequence, and follow along with the q2-cutadapt community tutorial to demultiplex these reads.
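As a rough sketch of that route (not a drop-in recipe): the import expects a directory holding `forward.fastq.gz` and `reverse.fastq.gz`, and `cutadapt demux-paired` then matches barcodes from a metadata column against the reads. The input file names, the output artifact names, and the `demux_with_cutadapt` wrapper are assumptions for illustration. It also assumes the 12 nt barcode is still present in the read sequence itself; if it only appears in the header, this will not work as-is.

```shell
# Sketch: import multiplexed paired-end reads, then demultiplex with q2-cutadapt.
# $1/$2 = R1/R2 FASTQ files, $3 = metadata (mapping) file,
# $4 = barcode column name, $5 = optional staging directory.
demux_with_cutadapt() {
  r1=$1; r2=$2; metadata=$3; column=$4; workdir=${5:-muxed-pe-barcode-in-seq}

  # The import format expects exactly these two gzipped files in one directory.
  mkdir -p "$workdir"
  gzip -c "$r1" > "$workdir/forward.fastq.gz"
  gzip -c "$r2" > "$workdir/reverse.fastq.gz"

  qiime tools import \
    --type MultiplexedPairedEndBarcodeInSequence \
    --input-path "$workdir" \
    --output-path multiplexed-seqs.qza

  # Reads whose barcode matches no row of the metadata go to untrimmed.qza.
  qiime cutadapt demux-paired \
    --i-seqs multiplexed-seqs.qza \
    --m-forward-barcodes-file "$metadata" \
    --m-forward-barcodes-column "$column" \
    --o-per-sample-sequences demux.qza \
    --o-untrimmed-sequences untrimmed.qza
}
```

With the files from this thread that would be something like `demux_with_cutadapt Undetermined_R1.fastq Undetermined_R2.fastq map_file.txt BarcodeSequence`. The sample IDs in the resulting artifact come from the first column of the metadata file.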

This is similar to what was suggested to you above:

but I don't think your multiplexed reads are in EMP format, so we can't use the command suggested to you.

Keep us posted! :t_rex:

Indeed, the reads aren't demultiplexed and I could follow the q2-cutadapt community tutorial.
However, this raises another issue.
Thanks for the help!
