Hello, I am running QIIME 2 2021.4, installed via conda. Here are the exact commands I am running:
qiime tools import \
  --type EMPSingleEndSequences \
  --input-path (folder name) \
  --output-path multiplexed.qza
Here is the error I am receiving: Plugin error from demux: Golay decoding requires 12nt barcodes. The barcode attempting to be decoded (CCTGTTTGCTC) is of length 11nt.
I tried looking through other threads on the forum but couldn't figure out the fix. This is the exact code I've been running for dozens of other 16S runs with no issue, and I'm not sure why the error is appearing now. The only thing that has changed between this and other runs is that I downloaded the Undetermined_R1 and Undetermined_R2 files from Illumina's Basespace rather than from the MiSeq it was run on (which I don't have access to).
Hi @ThatGuySam, As @jwdebelius noted in reply to your post on Twitter, the barcode shown in the error message is 11 bases long. If those are your actual barcodes, you can suppress error correction by passing the --p-no-golay-error-correction parameter to qiime demux emp-single. If you had a copy/paste error with one or more of your barcodes, such that you pasted in only 11 bases, you should be able to correct that and move on.
I actually have no idea where it is getting the barcode (CCTGTTTGCTC) from. I threw in a dummy metadata sheet with only one of the samples from this run as a test and ensured the barcode length was 12nt and it gave me the same error. I also went to the sample sheet and ensured the barcodes there were all 12nt in length as well.
First, can you run the command that's giving you the error with the --verbose option, and post both the command and the full error message that you receive?
Next, can you run the following command on the barcodes file that you imported into QIIME 2? Here I'm assuming that it's called barcodes.fastq.gz. (Note that this is the file you're importing, not a .qza that you imported.)
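The command itself is not shown in this thread (a later reply refers to it as a gzip -cd command). As a rough Python equivalent, here is a minimal sketch that peeks at the first few barcode reads and tallies their lengths. The function names are made up for illustration, and it assumes a standard four-line-per-record FASTQ file that is gzip-compressed.

```python
import gzip
from collections import Counter
from itertools import islice


def first_reads(path, n=5):
    """Return the sequence lines of the first n records in a gzipped FASTQ file."""
    with gzip.open(path, "rt") as handle:
        # Assumes 4 lines per record; the sequence is the 2nd line of each record.
        return [
            line.strip()
            for i, line in enumerate(islice(handle, n * 4))
            if i % 4 == 1
        ]


def barcode_length_counts(path, max_reads=10000):
    """Tally sequence lengths among the first max_reads records."""
    lengths = Counter()
    with gzip.open(path, "rt") as handle:
        for i, line in enumerate(islice(handle, max_reads * 4)):
            if i % 4 == 1:
                lengths[len(line.strip())] += 1
    return lengths


# Example usage (the file name is the one assumed in the post above):
# print(first_reads("barcodes.fastq.gz"))
# print(barcode_length_counts("barcodes.fastq.gz"))
```

If nearly every read comes back with length 11, the 11 nt barcode in the error message is coming from the reads themselves rather than from the sample metadata.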
@ThatGuySam, yes, it looks like that's where the error is coming from. So the barcode reads that you have in the fastq.gz file are 11 bases long, while the barcodes in your sample metadata are 12 bases long. I've seen this happen when the sequencing center only ran 11 cycles for the barcode reads. You'll have to check with them - if it was an error on their part, they may re-do the run for you.
Alternatively, you can check to see if your barcodes are all unique in the first 11 bases. If they are, you can adapt your sample metadata file to include only the first 11 bases of each barcode, and re-run qiime demux emp-single with the --p-no-golay-error-correction parameter.
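One way to do that uniqueness check is sketched below. The function name and the sample IDs and barcodes in the usage comment are hypothetical. It also assumes the barcode reads and the metadata barcodes are in the same orientation, so that the missing base is the last one of each metadata barcode; if they are reverse complements of each other, the missing base would be at the other end.

```python
from collections import defaultdict


def truncate_barcodes(sample_barcodes, length=11):
    """Return {sample_id: barcode[:length]}.

    Raises ValueError if any barcode is shorter than `length` or if two
    samples would share the same truncated barcode.
    """
    samples_by_prefix = defaultdict(list)
    for sample_id, barcode in sample_barcodes.items():
        if len(barcode) < length:
            raise ValueError(f"{sample_id}: barcode {barcode!r} is shorter than {length} nt")
        samples_by_prefix[barcode[:length]].append(sample_id)

    # Any prefix shared by more than one sample makes demultiplexing ambiguous.
    collisions = {p: ids for p, ids in samples_by_prefix.items() if len(ids) > 1}
    if collisions:
        raise ValueError(f"Barcodes are not unique after truncation: {collisions}")

    return {sample_id: bc[:length] for sample_id, bc in sample_barcodes.items()}


# Example usage with made-up sample IDs and barcodes:
# truncate_barcodes({"sampleA": "AGCCTTCGTCGC", "sampleB": "TCCATACCGGAA"})
```

The returned values would then go into the barcode column of the sample metadata file in place of the 12 nt barcodes.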
Having just re-run it with our samples, I am now receiving this error: Plugin error from demux:
No sequences were mapped to samples. Check that your barcodes are in the correct orientation (see the rev_comp_barcodes and/or rev_comp_mapping_barcodes options). If barcodes are NOT Golay format set golay_error_correction to False.
I will say, the output from Illumina this time was an Undetermined_R1 and Undetermined_R2 file. It is possible that this is causing our errors, as we normally work with an Undetermined_R1 and Undetermined_I1 file. Our I1 file normally contains our barcodes. I figured that the I1 and R2 would accomplish the same thing because I saw tutorials with the input being the Undetermined_R1 and Undetermined_R2, but this must not be the case.
It sounds like you don't have an index / barcode read file this time. Try running that same gzip -cd command on the R1, R2, and I1 (if you have it) fastq.gz files to review the data and figure out where your barcodes are. If it looks like you do have barcodes in one of those files, experiment with the parameters that are suggested in that most recent error message to get your barcode reads and sample metadata barcodes in a consistent orientation.
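To get a feel for which orientation applies before experimenting with those parameters, a quick check along these lines may help. This is only a sketch with hypothetical function names: it counts exact matches between a handful of reads and the metadata barcodes, both as given and reverse-complemented, and it ignores sequencing errors.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse-complement a DNA sequence made of A, C, G, T, N."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def orientation_matches(reads, barcodes):
    """Count reads that exactly match a barcode as given vs. reverse-complemented."""
    as_given = {b.upper() for b in barcodes}
    rev_comp = {reverse_complement(b) for b in barcodes}
    counts = {"as_given": 0, "reverse_complement": 0}
    for read in reads:
        read = read.upper()
        if read in as_given:
            counts["as_given"] += 1
        if read in rev_comp:
            counts["reverse_complement"] += 1
    return counts


# Example usage: `reads` would be sequences taken from the candidate barcode
# file, and `barcodes` the barcode column of the sample metadata.
# print(orientation_matches(reads, barcodes))
```

If most matches fall under reverse_complement, the options named in the error message (on the command line, --p-rev-comp-barcodes or --p-rev-comp-mapping-barcodes for qiime demux emp-single) are the ones to try. If neither orientation matches, the file being checked probably does not contain the barcodes at all.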