Issue with Paired End Reads/Demultiplexing- qzv overview and quality score plot missing samples?

Hello everyone!

I’m very new to qiime2, and I’ve been running into an issue demultiplexing my sequence data. When I take my data through an EMP paired-end concatenation and demultiplexing process, the resulting qzv file, which should show me an overview and quality scores for 50 samples, only shows me three samples… I’m wondering where the other 47 went?

Essentially, I’ve been working with a dataset that contains sequences for 50 samples (each has a forward, reverse and barcode associated with it).

The commands I ran, in order, were:

  1. I imported data from our remote illumina server into our campus remote qiime2 server:
    mv ORIGINALFILENAME.fastq.gz forward.fastq.gz
    mv ORIGINALFILENAME.fastq.gz reverse.fastq.gz
    mv ORIGINALFILENAME.fastq.gz barcodes.fastq.gz

I saved these in their own directory folder.

  2. I imported my mapping file as a .txt file and changed it to a .tsv file.
     I saved this one level above the directory containing the forward, reverse and barcode reads.

  3. I concatenated this data:
    qiime tools import \
      --type EMPPairedEndSequences \
      --input-path DIRECTORY \
      --output-path FILENAME.qza

  4. I demultiplexed the data:
    qiime demux emp-paired \
      --m-barcodes-file FILENAME.tsv \
      --m-barcodes-category BarcodeSequence \
      --i-seqs FILENAME.qza \
      --o-per-sample-sequences demux \
      --p-rev-comp-mapping-barcodes

  5. I visualized this qza file:
    qiime demux summarize \
      --i-data demux.qza \
      --o-visualization demux.qzv

  6. I copied this qzv file to my desktop and opened it in the qiime2 viewer website… this is where I noticed only 3 samples came up and the other 47 were missing.

What we have done to troubleshoot:

  1. We know from the qiime1 workflow with this dataset that there isn’t an issue with the mapping file, at least none that we are aware of.
  2. The internet could have gone down during the concatenation steps, but whenever I suspected that, I removed the file in question and re-imported it into qiime2… i.e. I’m fairly sure that it wasn’t an issue with broken data getting imported.
  3. I’ve tried re-importing this data twice, under different folders. Both times I’ve ended up with the same problem of samples missing from the qzv visualization.
  4. We have confirmed our barcode, forward and reverse read files look okay.

What we are wondering:

  1. Is there a “built in” filtering step that occurs with the demux command? So that the resulting qzv file will only show reads of a certain quality and other samples are removed? If that is the case, is there a way I can work around this?
  2. Has anyone encountered a similar issue and what did you do?
  3. Is there a better way to concatenate and demultiplex paired end data? (We used Earth Microbiome Project primers.)
  4. Could it possibly be something with the qiime2 view website that is only showing me part of my data?
  5. Should the mapping file be formatted differently for qiime2 as compared to qiime1?

Thank you all for your assistance!!

Thanks for the great post @njnealon! Your level of detail is incredibly useful in figuring out what is going on.

It is likely that your barcodes are not reverse-complemented, so if you leave off that last parameter (`--p-rev-comp-mapping-barcodes`), things should match (this is the most common reason you see only a couple of samples when you were expecting a great many more).

If you've already tried that, then something else is going on, but let me know if that works.
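One rough way to check which orientation applies before re-running the demux step is to compare the mapping-file barcodes against the barcode reads both as written and reverse-complemented. The following is only a minimal Python sketch, not part of QIIME 2: the function names are hypothetical, the toy barcodes are made up, and it assumes a standard gzipped FASTQ with four lines per record.

```python
import gzip
from collections import Counter

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_barcode_reads(path, limit=100_000):
    """Yield up to `limit` sequences from a gzipped FASTQ file.

    Assumes the usual 4-line FASTQ record layout (hypothetical helper).
    """
    with gzip.open(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number >= 4 * limit:
                break
            if line_number % 4 == 1:  # second line of each record is the sequence
                yield line.strip()


def count_orientation_hits(mapping_barcodes, observed_barcodes):
    """Count barcode reads that exactly match the mapping barcodes
    as written versus after reverse-complementing them."""
    observed = Counter(observed_barcodes)
    as_written = set(mapping_barcodes)
    rev_comp = {reverse_complement(b) for b in mapping_barcodes}
    return {
        "as_written": sum(n for bc, n in observed.items() if bc in as_written),
        "reverse_complemented": sum(n for bc, n in observed.items() if bc in rev_comp),
    }


# Toy data, made up for illustration only.
mapping = ["AAGT", "CCGA"]
reads = ["AAGT", "AAGT", "CCGA", "TTTT"]
hits = count_orientation_hits(mapping, reads)
print(hits)  # {'as_written': 3, 'reverse_complemented': 0}
```

With real data, the observed barcodes would come from something like `read_barcode_reads("barcodes.fastq.gz")` and the mapping barcodes from the BarcodeSequence column. If nearly all hits fall under "as_written", the rev-comp parameter should probably be left off; if they fall under "reverse_complemented", it should probably stay.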

If it isn't the rev-comp parameter, then confirming that your barcode, forward and reverse read files look okay is the next most crucial step, so thanks for checking that!

Answering your other questions:

On a "built in" filtering step: no, there isn't one. However, the barcode matching is currently an exact match (no alignment or error correction is attempted), so that acts as a kind of "implicit" quality filter. We've found that in practice this hasn't mattered, although error correction is something we'll likely implement at some point.
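To illustrate why exact barcode matching behaves like an implicit filter, here is a small Python sketch. It is an assumption-laden illustration rather than QIIME 2's actual implementation: `assign_samples`, the sample IDs and the barcodes are all hypothetical. A barcode read with even one miscalled base finds no sample and is simply dropped.

```python
def assign_samples(barcode_to_sample, barcode_reads):
    """Assign each barcode read to a sample by exact lookup.

    Returns per-sample read counts and the number of reads whose
    barcode matched nothing (no error correction is attempted).
    """
    assigned = {}
    unassigned = 0
    for barcode in barcode_reads:
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            unassigned += 1  # e.g. a sequencing error in the barcode
        else:
            assigned[sample] = assigned.get(sample, 0) + 1
    return assigned, unassigned


# Toy data, made up for illustration only.
barcode_to_sample = {"AAGT": "S1", "CCGA": "S2"}
barcode_reads = ["AAGT", "AAGA", "CCGA", "CCGA"]  # "AAGA" has one mismatch
assigned, unassigned = assign_samples(barcode_to_sample, barcode_reads)
print(assigned, unassigned)
```

Only reads are lost this way, not whole samples, unless every barcode for a sample fails to match (as happens when the orientation is wrong).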

I'm not entirely sure what you mean by concatenate, but I'm guessing you mean the instrument does not attempt to demultiplex the barcodes? What you've described so far (the EMP protocol) should be well supported by QIIME 2.

The QIIME 2 View website isn't clever enough to show only part of your data :slight_smile: It just shows you what is inside of the .qzv.

A QIIME 1 mapping file will always work in QIIME 2, and any changes we've made have ultimately loosened some of the restrictions. You can read more about that here. But it shouldn't have any impact on your particular analysis.
