I am running qiime2-2021.8 in an Ubuntu VirtualBox VM. The files I am using are the forward, reverse, barcodes, and sample-metadata files from the Atacama soils tutorial. I am attempting to first merge the reads, then remove adapters, then trim the file using NGmerge. After this step I would like to import the result into qiime2 and demultiplex it.
#After installing the required files I run
NGmerge/NGmerge -1 forward.fastq.gz -2 reverse.fastq.gz -o NGmerge/sequenced -a -m 20 -e 50
NGmerge/NGmerge -1 NGmerge/sequenced_1.fastq.gz -2 NGmerge/sequenced_2.fastq.gz -o qiime/multi-sequences
mv multi-sequences.gz sequences.fastq.gz
qiime tools import
#Up to now everything runs smoothly. Then I run
qiime demux emp-single
The error I get is as follows.
Anyone have ideas for how I can fix this?
Welcome to the forums!
The error of 'mismatched sequence IDs' is thrown when the sequence IDs don't match between your forward, reverse, and/or index files. Some software for joining paired-end reads is careful to keep all your reads in order. Other software jumbles them up, and I think that's what happened here.
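One way to check whether the reads really are out of order is to walk two FASTQ files side by side and compare the read IDs record by record. Here is a minimal sketch of that idea. The function names are made up for illustration, and it assumes plain 4-line FASTQ records where the read ID is the header text before the first space. Older headers that end in /1 and /2 would need that suffix stripped first.

```python
import gzip
from itertools import zip_longest


def read_ids(path):
    """Yield the read ID of each FASTQ record (header up to the first space, without '@')."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # Every 4th line, starting with the first, is a record header.
            if line_number % 4 == 0 and line.strip():
                yield line[1:].split()[0]


def first_id_mismatch(path_a, path_b):
    """Return (record_index, id_a, id_b) for the first disagreement, or None if all IDs match in order.

    If one file has fewer records, the missing side is reported as None.
    """
    pairs = zip_longest(read_ids(path_a), read_ids(path_b))
    for index, (id_a, id_b) in enumerate(pairs):
        if id_a != id_b:
            return index, id_a, id_b
    return None
```

For example, first_id_mismatch("barcodes.fastq.gz", "sequences.fastq.gz") would return None if both files list the same reads in the same order (those file names are only an example). A result with None on one side means one file has fewer reads than the other, which also happens when a merger drops read pairs it could not join.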
Yes! If you import your data into Qiime2 before joining, then join using a Qiime2 plugin, this problem will be solved. Bonus: some plugins like DADA2 will trim, denoise, and join your reads all with one command.
I think this is the easiest way forward, unless you wanted to use NGmerge. (We can get that working too, if you would like!)
(This also avoids some spookiness, like importing EMPSingleEndSequences that are secretly JoinedSequencesWithQuality.)
Colin, thanks for the quick response! For the sake of simplicity, I agree that QIIME2 is the easier solution. However, I am trying to compare various pipelines to determine which one I'd like to use going forward. So, being able to run through a pipeline with NGmerge is one of my priorities. Also, would it be feasible to implement NGmerge as a plugin for QIIME2, or is it simpler to keep the process outside of the QIIME2 environment?
How would I confirm that NGmerge jumbles the reads and if it does, how should I go about unscrambling them?
@colinbrislawn and I had a brief chat and we think you should be able to do the following, somewhat roundabout, approach:
Import the raw paired-end reads into QIIME 2, demultiplex them, then export the demuxed paired reads. From here you can merge the paired reads on a per-sample basis with NGmerge. Finally, you can re-import these merged reads as the JoinedSequencesWithQuality type using the Manifest format, or another format that assumes the data are already demuxed. Of course, you'd lose provenance in between the import/export steps.
This would also limit you to using deblur within QIIME 2 to analyze your merged reads. Nothing would stop you from running dada2 denoise-single (assuming you import the merged sequences as SequencesWithQuality), but it'd violate the assumptions of dada2 denoise-single and may return spurious ASVs.
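The per-sample merging step could be scripted along these lines. This is only a sketch under some assumptions: that the exported demuxed files follow the Casava-style naming <sample-id>_<barcode-id>_L001_R1_001.fastq.gz (and R2), that NGmerge lives at NGmerge/NGmerge as in the first post, and that the merged output name <sample-id>.fastq.gz is acceptable. The function name is hypothetical, and only the -1, -2, and -o options already used above appear here. Check how your NGmerge version names and compresses its output.

```python
from pathlib import Path


def ngmerge_commands(export_dir, merged_dir, ngmerge="NGmerge/NGmerge"):
    """Build one NGmerge command (an argument list) per sample that has both R1 and R2 files."""
    commands = {}
    for forward in sorted(Path(export_dir).glob("*_R1_001.fastq.gz")):
        # Assumed layout: <sample-id>_<barcode-id>_L001_R1_001.fastq.gz
        parts = forward.name.rsplit("_", 4)
        sample_id = parts[0]
        reverse = forward.with_name("_".join(parts[:3] + ["R2", parts[4]]))
        if not reverse.exists():
            continue  # no matching reverse file, so there is nothing to merge
        # Hypothetical output name; one merged file per sample.
        output = Path(merged_dir) / f"{sample_id}.fastq.gz"
        commands[sample_id] = [
            ngmerge, "-1", str(forward), "-2", str(reverse), "-o", str(output),
        ]
    return commands
```

Each command could then be run with subprocess.run(cmd, check=True). The function returns the commands without running them, so you can inspect them first and add any other NGmerge options you are benchmarking.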
Hope this helps!
I agree with @SoilRotifer, and also wanted to 'qiime-in'
Benchmarking third-party software is a great use case for doing it 'outside' of the Qiime2 ecosystem. Keep-It-Super-Simple
And of course, you could eventually bring the best performing program into the Qiime2 ecosystem with a plugin!
Yes! When you are ready, check this out: Developing a plug-in for dummies (QIIME 2 Developer Documentation)
Thanks for the help and suggestions so far. So I'm actually working with jmlayton on this.
I was curious about using the Manifest format to re-import the merged reads into qiime, and I am trying to figure out the format the manifest needs to be in. I was able to put together the sample-id and the absolute path, but the specifics of how to call qiime tools import with a manifest of merged samples are eluding me. Is there a way to specify the import for merged sequences, or must it be done with forward and reverse sequences?
Hi @Connor_Herron, you just need to make a tab-delimited manifest file as outlined here. Specifically, you'd make your manifest file look like:
Assuming our manifest file is named merged-reads-manifest-file.tsv, you'd then run:
qiime tools import \
--type 'SampleData[SequencesWithQuality]' \
--input-path merged-reads-manifest-file.tsv \
--output-path merged-demux.qza \
--input-format SingleEndFastqManifestPhred33V2
Note: we need to trick QIIME 2 by importing your merged data with a SingleEndFastqManifest... format. You may need to change Phred33V2 to Phred64V2 if the import does not work.
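If you have many samples, the manifest can be generated rather than typed by hand. Below is a minimal sketch, assuming one merged file per sample named <sample-id>.fastq.gz in a single directory (that naming and the function name are assumptions, not anything QIIME 2 requires). It writes the two tab-separated columns that the single-end V2 manifest formats expect, sample-id and absolute-filepath.

```python
import csv
from pathlib import Path


def write_manifest(merged_dir, manifest_path, suffix=".fastq.gz"):
    """Write a tab-delimited single-end manifest and return the number of samples listed."""
    files = sorted(Path(merged_dir).glob(f"*{suffix}"))
    with open(manifest_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["sample-id", "absolute-filepath"])
        for path in files:
            # Assumed naming: the sample ID is the file name minus the suffix.
            sample_id = path.name[: -len(suffix)]
            writer.writerow([sample_id, str(path.resolve())])
    return len(files)
```

For example, write_manifest("merged", "merged-reads-manifest-file.tsv") would produce the file used in the import command above, provided the merged reads are in a directory called merged (a made-up name).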
From here you can run deblur (not DADA2, as mentioned previously) and/or OTU clustering.