Unknown paired-ends fastq import issue

I have been given three fastq.gz files: Undetermined_S0_L001_I1_001.fastq.gz (barcodes), Undetermined_S0_L001_R1_001.fastq.gz (forward), and Undetermined_S0_L001_R2_001.fastq.gz (reverse). I tried importing them with the EMP protocol. When I then demultiplex and visualize the summary of the resulting SampleData[PairedEndSequencesWithQuality].qza, the sequence counts and per-sample frequencies are very low.

Could this be an error in the importing process or in my steps?

Hi @georgia,

After discussing with @ebolyen, we have 2 theories:

  • You may be getting tripped up by the --p-revcomp-barcodes parameter, which is discussed here: Demux results in the loss of data, or
  • Your barcodes are not actually EMP (Golay) barcodes. You would need to check with the person who did the wet-lab work to verify this.
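One quick way to test the first theory outside of QIIME 2 is to count how many index reads match the sample barcodes as written versus reverse-complemented. The following is only a minimal sketch: the function names are hypothetical, `sample_barcodes` is assumed to be the barcode column copied from your own metadata file, and only exact matches are counted (no Golay error correction), so the numbers will slightly undercount.

```python
import gzip
from collections import Counter
from itertools import islice

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fastq_sequences(path, limit=100_000):
    """Yield up to `limit` sequence lines from a gzipped FASTQ file."""
    with gzip.open(path, "rt") as handle:
        # In a 4-line FASTQ record the sequence is the second line.
        for seq in islice(handle, 1, limit * 4, 4):
            yield seq.strip()


def orientation_report(index_reads, sample_barcodes):
    """Count exact matches of index reads to the barcodes in both orientations."""
    expected = {b.upper() for b in sample_barcodes}
    expected_rc = {reverse_complement(b) for b in expected}
    counts = Counter()
    for read in index_reads:
        read = read.upper()
        counts["total"] += 1
        if read in expected:
            counts["as_is"] += 1
        if read in expected_rc:
            counts["reverse_complement"] += 1
    return counts


# Example usage (file name taken from the question; barcodes are yours to supply):
# reads = read_fastq_sequences("Undetermined_S0_L001_I1_001.fastq.gz")
# print(orientation_report(reads, sample_barcodes=["..."]))
```

If one orientation matches a large share of the reads, that suggests which setting of the reverse-complement parameter to use. If neither orientation matches more than a handful, that is a hint toward the second theory.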

Hope this helps.

I’ve run the demultiplexing both with and without --p-revcomp-barcodes, and the output looks wrong either way. It must be the format the barcodes are actually in.
Is there any other way to work out what format or semantic type they are for importing into qiime2, without asking the person who performed the work on them?

To be fair, the easiest option is to ask the person/group that processed those samples.

Now, another option is to check which primers were used and see whether they match one of the EMP primer sets for 16S. If they do, it’s highly probable that the barcodes are actually EMP; if they don’t, it’s highly probable that they are not. Another hint is whether the primer sits at the beginning of the forward/reverse reads: if it does not, that points to the EMP protocol. As a follow-up step for confirmation, the references below that link should have a list of all possible barcodes, and you should be able to confirm yours by matching against them.
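The two read-level hints above can be checked roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, the primer sequences in `EMP_16S_PRIMERS` are the commonly cited EMP 16S V4 pair (515F/806R) and should be verified against the EMP protocol pages, and the 12-nucleotide expectation reflects the Golay barcode length. Reads are passed in as plain strings, for example the sequence lines from the R1, R2 and I1 files.

```python
import re
from collections import Counter

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Assumption: commonly cited EMP 16S V4 primers; confirm before relying on them.
EMP_16S_PRIMERS = {
    "515F": "GTGYCAGCMGCCGCGGTAA",
    "806R": "GGACTACNVGGGTWTCTAAT",
}


def primer_regex(primer):
    """Compile a degenerate primer into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def primer_start_fraction(reads, primer, window=5):
    """Fraction of reads whose first primer match starts within `window` bases."""
    pattern = primer_regex(primer)
    total = hits = 0
    for read in reads:
        total += 1
        match = pattern.search(read.upper())
        if match and match.start() <= window:
            hits += 1
    return hits / total if total else 0.0


def barcode_length_counts(index_reads):
    """Tally index read lengths; EMP Golay barcodes are 12 nt long."""
    return Counter(len(read) for read in index_reads)
```

A fraction near zero for both primers, together with index reads that are almost all 12 nt long, would be consistent with the EMP protocol. A high fraction means the primers are still present at the start of the reads, which points away from it.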
