Phred33V2 manifest import error

Hello! I have used QIIME2 many times and have never run into this error, despite having used Paired End Phred33V2 data types every time. I am trying to import Paired End Phred33V2 forward and reverse reads into a demux.qza file using a manifest. The manifest is as follows ( | indicates the presence of a tab):

sample-id forward-absolute-filepath reverse-absolute-filepath
4.18.16 /home/qiime2/Desktop/16its/fasta/041816-MSdITSfusion_R1.fastq /home/qiime2/Desktop/16its/fasta/041816-MSdITSfusion_R2.fastq
7.11.16 /home/qiime2/Desktop/16its/fasta/071116-MSdITSfusion_R1.fastq /home/qiime2/Desktop/16its/fasta/071116-MSdITSfusion_R1.fastq
7.25.16 /home/qiime2/Desktop/16its/fasta/072516-MSdITSfusion_R1.fastq /home/qiime2/Desktop/16its/fasta/072516-MSdITSfusion_R1.fastq
8.1.16 /home/qiime2/Desktop/16its/fasta/080116-MSdITSfusion_R1.fastq /home/qiime2/Desktop/16its/fasta/080116-MSdITSfusion_R1.fastq
8.8.16 /home/qiime2/Desktop/16its/fasta/080816-MSdITSfusion_R1.fastq /home/qiime2/Desktop/16its/fasta/080816-MSdITSfusion_R1.fastq
8.11.16 /home/qiime2/Desktop/16its/fasta/081116-MSdITSfusion_R1.fastq /home/qiime2/Desktop/16its/fasta/081116-MSdITSfusion_R1.fastq
8.15.16 /home/qiime2/Desktop/16its/fasta/081516-MSdITSfusion_R1.fastq /home/qiime2/Desktop/16its/fasta/081516-MSdITSfusion_R1.fastq
8.29.16 /home/qiime2/Desktop/16its/fasta/082916-MSdITSfusion_R1.fastq /home/qiime2/Desktop/16its/fasta/082916-MSdITSfusion_R1.fastq

I am using this script to import the data, as I have many times in the past (this script is in fact copied from my scripts with only the file names changed):

qiime tools import \
--type 'SampleData[PairedEndSequencesWithQuality]' \
--input-path $PWD/16its-manifest.txt \
--output-path $PWD/16its-pairedend-demux.qza \
--input-format PairedEndFastqManifestPhred33V2

I am in the directory with the manifest. I am receiving this error each time I attempt to import the data:

There was a problem importing /home/qiime2/Desktop/16its/16its-manifest.txt:
/home/qiime2/Desktop/16its/16its-manifest.txt is not a(n) PairedEndFastqManifestPhred33V2 file:
Filepath on line 2 and column "reverse-absolute-filepath" (sample "7.11.16") has already been registered on line 2 and column "forward-absolute-filepath" (sample "7.11.16").

I am quite confused with this error message. First, I am quite sure that the data is in the correct format for an "--input-format PairedEndFastqManifestPhred33V2 " argument, as I am using data from the same sequencing company I always use. Second, I am confused why QIIME2 has an issue with me importing forward and reverse reads under the same name, as that is the whole point of using a Paired End Fastq format. Why is QIIME2 telling me that the name "7.11.16" has already been registered when only the forward read for that sample has been read? As you can see from the manifest, 7.11.16 is not the first sample in the manifest, so if this was an error with the format of the manifest why was the error not returned on the first sample?

I tried to see if perhaps they sent data in Phred64V2 format, but that did not work and returned the same error message.
I also tried using a .tsv format for the manifest, but that also returned the same error message.
I am using QIIME2 2019.4 on a Windows 10 machine inside a VM session, as I usually do. I am using this version because the classifiers I have were trained using 2019.4, and training a new classifier on a more recent version of QIIME2 is too computationally demanding for my laptop, which is all I have access to under work-from-home restrictions.

Thank you all for your consideration and time assisting me with this error, have a pleasant day and stay safe!

Hi @jameshopkins,

This does not look like a format issue, but a naming issue.

As you've noted, the issue is that you are using duplicate names. Line 1 looks okay (it has the R1 and R2 naming conventions for forward and reverse reads), but line 2 and the lines after it use the exact same filepath for forward and reverse.

Not to my knowledge. The issue is that you are using the exact same files, i.e., you are passing each file in twice as its forward and reverse read, so QIIME 2 is kindly informing you that you are not passing in the correct file pairs. The goal of PE formats is not to pair each read with itself (as you are doing if you use the same file as F and R), but to pair together the PE reads from separate files. So in your case if the forward read is:
/home/qiime2/Desktop/16its/fasta/071116-MSdITSfusion_R1.fastq

your reverse filepath should probably be:
/home/qiime2/Desktop/16its/fasta/071116-MSdITSfusion_R2.fastq

It's not that the file name has been registered, but the filepath has been registered already. QIIME 2 keeps track of what filepaths have been used so that you do not accidentally duplicate filenames.

Because the format is not faulty, but the filepaths are duplicated on every line except the first. So 7.11.16 is the first sample encountered with duplicate filepaths. 4.18.16 does not raise an error because it has the proper (non-duplicated) PE filenames listed.
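
To illustrate the kind of check described above, here is a minimal Python sketch that scans a tab-separated manifest and reports any filepath that appears more than once. This is not QIIME 2's actual implementation: the function name `find_duplicate_filepaths`, the shortened `/data/...` paths, and the line numbering (data rows counted from 1 with the header excluded, chosen to match the error message above) are all assumptions made for illustration.

```python
import csv
import io

# Column names taken from the manifest header in the original post.
PATH_COLUMNS = ("forward-absolute-filepath", "reverse-absolute-filepath")


def find_duplicate_filepaths(manifest_text):
    """Return one tuple per filepath that was already seen earlier.

    Each tuple is (line, column, sample, first_line, first_column).
    Hypothetical helper, not part of QIIME 2.
    """
    seen = {}  # filepath -> (line, column) where it first appeared
    duplicates = []
    reader = csv.DictReader(io.StringIO(manifest_text), delimiter="\t")
    for line_no, row in enumerate(reader, start=1):
        for column in PATH_COLUMNS:
            path = row[column]
            if path in seen:
                duplicates.append(
                    (line_no, column, row["sample-id"], *seen[path])
                )
            else:
                seen[path] = (line_no, column)
    return duplicates


# Shortened, hypothetical paths that mirror the pattern in the manifest above.
manifest = (
    "sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n"
    "4.18.16\t/data/041816_R1.fastq\t/data/041816_R2.fastq\n"
    "7.11.16\t/data/071116_R1.fastq\t/data/071116_R1.fastq\n"
)

duplicates = find_duplicate_filepaths(manifest)
for dup in duplicates:
    print(dup)
```

With this sample input, the first row passes because its two paths differ, and the second row is flagged because its reverse path repeats its own forward path.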

So fix those filenames so that the reverse read column has all "R2" instead of "R1" and all will be well.
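
If you would rather not edit each line by hand, the reverse column could be regenerated from the forward column. The sketch below assumes every forward file ends in `_R1.fastq` and that its mate sits alongside it with an `_R2.fastq` suffix, as the filenames in this thread suggest; the function name `fix_reverse_column` and the shortened `/data/...` paths are hypothetical. It only rewrites the text, so you should still confirm that the R2 files actually exist before importing.

```python
import csv
import io

FORWARD_SUFFIX = "_R1.fastq"  # assumed naming convention from the thread
REVERSE_SUFFIX = "_R2.fastq"


def fix_reverse_column(manifest_text):
    """Rebuild the reverse-read column from the forward-read column.

    Hypothetical helper: raises ValueError if a forward path does not
    end in the assumed suffix, rather than guessing.
    """
    reader = csv.DictReader(io.StringIO(manifest_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        forward = row["forward-absolute-filepath"]
        if not forward.endswith(FORWARD_SUFFIX):
            raise ValueError(f"unexpected forward filename: {forward}")
        stem = forward[: -len(FORWARD_SUFFIX)]
        row["reverse-absolute-filepath"] = stem + REVERSE_SUFFIX
        writer.writerow(row)
    return out.getvalue()


# Shortened, hypothetical paths that mirror the pattern in the manifest above.
broken = (
    "sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n"
    "7.11.16\t/data/071116_R1.fastq\t/data/071116_R1.fastq\n"
)
fixed = fix_reverse_column(broken)
print(fixed)
```

The corrected text can then be written back to the manifest file and passed to the same import command.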

Good luck!

Thank you Nicholas! It's apparent that I just can't read correctly. I appreciate your thoroughness! Stay safe!
