NextSeq data isn’t demultiplexing well, and I’m not sure how to proceed. Barcodes in our MiSeq runs are usually reverse complemented, but with the NextSeq runs they’re on the forward reads. The barcodes are 12 base pairs, but not the usual Golay barcodes from the EMP guidelines. Wondering if someone could help point me the way to see how I can get these imported correctly. Thank you very much. Ben
We formatted the data as multiplexed forward and reverse fastq.gz files from the NextSeq, with the different samples contained within these files. This is the way we processed the EMP MiSeq files, but I think there’s some incompatibility in the way QIIME 2 reads the NextSeq headers/files.
Praying to the mods, unqueue me.
Thanks mods, may you be blessed with the rains down in Africa
Hi @ben,
Just to get the ball rolling with troubleshooting this.
Is there a specific reason that led you to think this? Can you provide some examples of your reads?
You mention that the barcodes are on your forward reads, but are there any on the reverse reads as well, or just on the forward?
Did you run a dual-index run but only have barcodes in one direction?
Have you checked to see if all your reads are actually in the right orientation and not somehow mixed? This is probably not the case, but if so it could lead to some wonky demultiplexing.
Sure, these are stool samples in the links; fewer than 1,000 assigned reads per sample seems very unlikely.
No barcodes in reverse
Only barcodes on the forward
All sequences are processed by the center we send them to, they are usually and consistently excellent
pairing in Qiime1.9.1 gives me 30,000,000 reads for the entire run (6 plates)
Assignments are similar in Qiime1.9.1 and Qiime2
So, I’ve actually tried running the NextSeq data in QIIME 1.9.1; here are the split_libraries_fastq.py results:
modifiers:
-barcode 12
Quality filter results
Total number of input sequences: 29073029
Barcode not in mapping file: 28580648
Read too short after quality truncation: 19635
Count of N characters exceeds limit: 144
Illumina quality digit = 0: 0
Barcode errors exceed max: 0
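For anyone reading along, a quick sanity check on the log above may help put the numbers in perspective. This is only a sketch: the counts are copied from the log, the variable names are illustrative, and it assumes each discarded read is counted in exactly one category, which the log itself does not state.

```python
# Counts copied from the split_libraries_fastq.py log above.
total_input = 29_073_029
barcode_not_in_mapping = 28_580_648
too_short_after_truncation = 19_635
too_many_n = 144

# Assumption: every discarded read falls into exactly one category.
discarded = barcode_not_in_mapping + too_short_after_truncation + too_many_n
retained = total_input - discarded

# Share of all input reads whose barcode was not found in the mapping file.
unmatched_fraction = barcode_not_in_mapping / total_input

print(f"reads retained after filtering: {retained}")
print(f"barcode not in mapping file: {unmatched_fraction:.1%}")
```

Under that assumption, the vast majority of reads are lost at the barcode-matching step rather than at quality filtering, which points at the barcodes (or the index read) rather than at read quality.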
Thanks for the update @ben. It’s obvious that you should be getting more read assignments and that the issue is at the demultiplexing step.
Did you happen to check for possible mixed orientation of your reads as I mentioned above? Could you provide a subsample of your reads?
If you ran a dual-index run but only have barcodes in one direction, you may run into an issue as described here. That may be something you can also discuss with your sequencing facility if unsure.
Hi Colin, thanks, no it didn’t demultiplex correctly either. Using the joined reads command I ended up with an approximately 16 gigabyte file of joined reads, and a 23 gigabyte file of forward and reverse unjoined reads.
From this, I tried to demultiplex, which is what you see above:
Quality filter results
Total number of input sequences: 29073029
Barcode not in mapping file: 28580648
Read too short after quality truncation: 19635
Count of N characters exceeds limit: 144
Illumina quality digit = 0: 0
Barcode errors exceed max: 0
From the NextSeq data I was only able to get 30,000,000 joined sequences, of which only 70,000? were demultiplexed correctly.
Agreed, and yes, I saw that - not sure why 30,000,000 sequences are missing barcodes from the barcode file. I can provide the cat from the three fastq and the barcode file if that would help.
There’s also this interesting bit: https://www.biostars.org/p/317492/ where the index orientation will depend on the reverse complement of the adaptor? This is a rabbit hole, we’re speaking with our sequencing core. Ben
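Since index orientation keeps coming up in this thread, here is a minimal sketch of one way to test it: compare a sample of observed barcodes against the mapping-file barcodes both as listed and reverse complemented. The function names and the example barcodes are made up for illustration; they are not from this run or from any QIIME tool.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    complement = str.maketrans("ACGTN", "TGCAN")
    return seq.upper().translate(complement)[::-1]


def orientation_counts(observed, expected):
    """Count observed barcodes that match the expected set as listed,
    as reverse complements, or not at all."""
    expected_fwd = {bc.upper() for bc in expected}
    expected_rc = {reverse_complement(bc) for bc in expected_fwd}
    counts = {"as_listed": 0, "reverse_complement": 0, "neither": 0}
    for bc in observed:
        bc = bc.upper()
        if bc in expected_fwd:
            counts["as_listed"] += 1
        elif bc in expected_rc:
            counts["reverse_complement"] += 1
        else:
            counts["neither"] += 1
    return counts


# Hypothetical barcodes, for illustration only.
expected = ["ACGTACGTACGA", "TTGGCCAATTGC"]
observed = ["ACGTACGTACGA", "TCGTACGTACGT", "GCAATTGGCCAA", "GGGGGGGGGGGG"]
print(orientation_counts(observed, expected))
```

If most barcodes land in "reverse_complement", the mapping file orientation is the likely culprit; if most land in "neither", the problem is more likely upstream, for example in the index read itself.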
Small update: we solved the problem. For some reason the primers, which work excellently on the MiSeq, are not working on the NextSeq. We are contacting Illumina and addressing the issue. Thanks for your help.
I worked this out by "cat"ing the seq.fna file and making sure that barcodes were present, then "cat"ing the barcode file and finding a lot of trash barcodes. I asked the core to run bcl2fastq and look at the most abundant barcodes, and it turns out there were a lot of issues with their barcodes. Essentially, we think that the index read failed.
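For others who want to do a similar check, the inspection described above can be roughly sketched as a tally of the most frequent sequences in the barcode FASTQ. This is an illustrative sketch rather than the exact commands used here; the function name, its parameters, and the file name in the usage comment are assumptions, and it expects standard four-line FASTQ records.

```python
import gzip
from collections import Counter
from itertools import islice


def top_barcodes(path, n=10, max_reads=None):
    """Return the n most common sequences in a four-line-per-record FASTQ file.

    Works on plain or gzip-compressed files; max_reads optionally limits
    how many records are scanned.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    counts = Counter()
    with opener(path, "rt") as handle:
        # The sequence is the second line of every four-line record.
        seq_lines = islice(handle, 1, None, 4)
        if max_reads is not None:
            seq_lines = islice(seq_lines, max_reads)
        for line in seq_lines:
            counts[line.strip()] += 1
    return counts.most_common(n)


# Usage (hypothetical file name):
# for barcode, count in top_barcodes("barcodes.fastq.gz", n=20, max_reads=1_000_000):
#     print(barcode, count)
```

If the top entries are dominated by homopolymers or sequences that are absent from the mapping file, that is consistent with a failed index read.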
Thank you, we contacted Illumina and there are slight differences between the NextSeq and MiSeq protocols which may lead to variations. These are being worked out now. I guess this is a cautionary tale for those trying to switch between platforms. Illumina has been excellent in helping and our core is great, so hopefully we will get to the bottom of this. Ben