Problem importing data - QIIME 2 2019.4

Hello everyone!

I’ve been having problems importing my data into QIIME 2 version 2019.4, which is installed in Oracle VirtualBox. I searched the forum first, and my problem is very similar to the one in this post, but I can’t seem to find the answer:

I have demultiplexed paired-end data sequenced on an Illumina MiSeq. The command I’m using is the following:

qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path /home/qiime2/Documents/seq-felipe \
  --input-format CasavaOneEightSingleLanePerSampleDirFmt \
  --output-path felipe-imported.qza

And the error message I’m getting is:

There was a problem importing /home/qiime2/Documents/seq-felipe:

Unrecognized file (/home/qiime2/Documents/seq-felipe/Sample_2019-FR-LIB-C14G_L001_R2_001_fastqc.html) for CasavaOneEightSingleLanePerSampleDirFmt.

But this file doesn’t seem to be any different from any of my other files. If someone can help me, please, I will be really grateful!!! Cheers, Felipe.

Hi @Felipe_Rocha,

The culprit file appears to be an .html file, likely some sort of summary report from the sequencing facility. The expected file format here is .fastq or .fastq.gz, so any other file in the input directory will be rejected.


Hello @Mehrbod_Estaki,

thank you for your answer! Do you mean that for the importing step I don’t need to have any of my quality reports for each read?

For each sample I have an .html file and a .zip file, which together are the summary report you are talking about. I thought both were necessary for the importing step so that they would be available when I analyse the quality of my reads.

I’ll try to import with only my fastq.gz data, and then move forward to the read quality analysis. I’ll let you know if this works out.

Hi @Felipe_Rocha,
Correct, you don’t need those .zip or .html files, unless the facility has done something weird to your .fastq files and removed or replaced the quality scores with something else. I doubt this is the case.
A .fastq file is essentially a .fasta file that also carries the quality scores for each read, so those files are really all you need to import. The other files you have are, I imagine, a convenience produced by the sequencer or facility.
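
To make the cleanup step concrete, here is a minimal Python sketch that sorts a directory listing into read files and everything else. The function name `split_reads_from_extras` and the example filenames are hypothetical, not part of QIIME 2; the only assumption is that reads end in .fastq or .fastq.gz, as described above.

```python
FASTQ_SUFFIXES = (".fastq", ".fastq.gz")


def split_reads_from_extras(filenames):
    """Return (reads, extras): names ending in a FASTQ suffix, and all others."""
    reads, extras = [], []
    for name in sorted(filenames):
        # str.endswith accepts a tuple of suffixes to test against.
        if name.endswith(FASTQ_SUFFIXES):
            reads.append(name)
        else:
            extras.append(name)
    return reads, extras


# Hypothetical listing modelled on the report file named in the error message.
listing = [
    "Sample_2019-FR-LIB-C14G_L001_R2_001.fastq.gz",
    "Sample_2019-FR-LIB-C14G_L001_R2_001_fastqc.html",
    "Sample_2019-FR-LIB-C14G_L001_R2_001_fastqc.zip",
]
reads, extras = split_reads_from_extras(listing)
```

Passing `os.listdir()` of the input directory would show which files to move elsewhere before running the import.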


I went through the tutorial and I can’t understand why my directory is not a CasavaOneEightSingleLanePerSampleDirFmt. Is it because all my file names start with Sample_ instead of a different prefix for each sample?

@Mehrbod_Estaki, thank you so much for helping me!!!

I moved those files out and now I have only the .fastq files in my directory. Now I get a different error. I think I have a clue about what is going on, but I’d rather ask you.

My script and the error message:

(qiime2-2019.4) ~/Documents/seq-felipe$ qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path /home/qiime2/Documents/seq-felipe --input-format CasavaOneEightSingleLanePerSampleDirFmt --output-path felipe-imported.qza

There was a problem importing /home/qiime2/Documents/seq-felipe:
/home/qiime2/Documents/seq-felipe is not a(n) CasavaOneEightSingleLanePerSampleDirFmt:
Duplicate samples in forward reads: {'Sample'}

My files in the directory are named like this:

Thanks for the help. Best wishes!


Hi @Felipe_Rocha - your filenames don’t match the CasavaOneEightSingleLanePerSampleDirFmt, you are missing the “barcode” field. You will need to use a manifest format, instead.
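
A rough Python sketch of why the missing field produces that particular error. It assumes the name is read as five underscore-separated fields counted from the right (sample ID, barcode, lane, read, set number), which is consistent with the error message but is not a copy of the actual QIIME 2 code. `parse_casava_name` is a hypothetical helper, the first filename is inferred from the report file named in the error, and the second is made up.

```python
def parse_casava_name(filename):
    """Split a Casava-1.8-style filename into five fields, counting from the right."""
    sample_id, barcode, lane, read, set_number = filename.rsplit("_", maxsplit=4)
    return {
        "sample_id": sample_id,
        "barcode": barcode,
        "lane": lane,
        "read": read,
        "set_number": set_number,
    }


# With no barcode field, the real sample name slides into the barcode slot
# and every file ends up with the sample ID "Sample".
names = [
    "Sample_2019-FR-LIB-C14G_L001_R1_001.fastq.gz",
    "Sample_2019-FR-LIB-C15G_L001_R1_001.fastq.gz",  # hypothetical second sample
]
sample_ids = [parse_casava_name(n)["sample_id"] for n in names]

# A name in the layout the tutorial uses keeps its own sample ID.
tutorial_style = parse_casava_name("L2S357_15_L001_R1_001.fastq.gz")
```

Under that assumption, both forward reads report the same sample ID, which would explain "Duplicate samples in forward reads: {'Sample'}".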


Thank you, @thermokarst!!! and also thanks @Mehrbod_Estaki!!!

I made a manifest file containing all my samples and their absolute filepath and it worked :heart_eyes:
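
In case it helps someone else, here is a minimal Python sketch of how such a manifest could be generated instead of typed by hand. It assumes the comma-separated layout with the header `sample-id,absolute-filepath,direction` and file names ending in `_L001_R1_001.fastq.gz` or `_L001_R2_001.fastq.gz`. The function name `build_manifest`, the regular expression, and the example file names are all hypothetical.

```python
import re
from pathlib import PurePosixPath

# Everything before the lane field is treated as the sample ID (an assumption).
READ_PATTERN = re.compile(
    r"^(?P<sample>.+)_L\d{3}_R(?P<read>[12])_\d{3}\.fastq(\.gz)?$"
)
DIRECTIONS = {"1": "forward", "2": "reverse"}


def build_manifest(directory, filenames):
    """Return manifest text with one line per read file found in filenames."""
    lines = ["sample-id,absolute-filepath,direction"]
    for name in sorted(filenames):
        match = READ_PATTERN.match(name)
        if match is None:
            continue  # skip anything that is not a read file, e.g. reports
        path = PurePosixPath(directory) / name
        lines.append(f"{match['sample']},{path},{DIRECTIONS[match['read']]}")
    return "\n".join(lines) + "\n"


# Hypothetical listing; the directory is the one used in this thread.
manifest = build_manifest(
    "/home/qiime2/Documents/seq-felipe",
    [
        "Sample_A_L001_R1_001.fastq.gz",
        "Sample_A_L001_R2_001.fastq.gz",
        "Sample_A_L001_R2_001_fastqc.html",
    ],
)
```

The text would be saved to a file and given to `qiime tools import` as the input path, with a manifest input format such as `PairedEndFastqManifestPhred33` if the quality scores use the Phred 33 offset, which is itself an assumption to check for your data.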

Now I will continue to the next steps, and it probably won’t be long until I’m here again haha.

Cheers, Felipe.
