qiime tools import error "+_.+_L[0-9][0-9][0-9]_R[12]_001\\.fastq\\.gz"

I am trying to use some data from NCBI to build taxonomies, but first I am working on the import step for my fastq data. The reads are single-end with quality scores. Here is the command:

qiime tools import --type SampleData[SequencesWithQuality] --input-path ./Manifest.csv --output-path single-end-demux.qza --input-format SingleEndFastqManifestPhred33

My Manifest.csv file is like this:
sample-id,absolute-filepath,direction
/home/gomeza/davi2714/Ashok_data/SRP023463/fastq/SRR870254,/home/gomeza/davi2714/Ashok_data/SRP023463/fastq/SRR870254.fastq,forward
/home/gomeza/davi2714/Ashok_data/SRP023463/fastq/SRR870350,/home/gomeza/davi2714/Ashok_data/SRP023463/fastq/SRR870350.fastq,forward
/home/gomeza/davi2714/Ashok_data/SRP023463/fastq/SRR870351,/home/gomeza/davi2714/Ashok_data/SRP023463/fastq/SRR870351.fastq,forward
/home/gomeza/davi2714/Ashok_data/SRP023463/fastq/SRR870352,/home/gomeza/davi2714/Ashok_data/SRP023463/fastq/SRR870352.fastq,forward
/home/gomeza/davi2714/Ashok_data/SRP023463/fastq/SRR870353,/home/gomeza/davi2714/Ashok_data/SRP023463/fastq/SRR870353.fastq,forward

Finally the fastq files themselves look like this generally:
@SRR870404.1 GSXJRPF01EAVZZ length=402
TCAGAACGCACGCTAGCATGCTGCCTCCCGTAGGAGTTTGGACCGTGTCTCAGTTCCAATGTGGGGGACCTTCCTCTCAGAACCCCTATCCATCGTTGACTAGGTGGGCCGTTACCCCGCCTACTATCTAATGGAACGCATCCCACTCGTCTACCGGAAAATAACCTTTAATCATGCGGACATGTGAACTCATGATGCCATCCTGGATTAATCTTCCTTTCAGAAGGCTGGCCAAGAGTAGACGGCAGGTTGGATACGTGTTACTCACCCGTGCGCCGGTCGCCATCAGCCTTAGCAAGCTAAGACCATGCTGCCCCTCGACTTGCATGTGTTAAGCCTGTAGCTAGCGTTCATCCTGAGCCAGGATCAAACTCTGACTGAGCGGGCTGGCAAGGCGCATAG
+SRR870404.1 GSXJRPF01EAVZZ length=402
IIIIIIIIIIIIIIIIIIIIIIIIIIBBBIIIIIIIHHHIIIHHHIIIIIIIIIIIIIIIII;;;;;[email protected]=AA<A971111126AA;[email protected]:73-005.77?AEEIIIIIBB?;;44;;CCC<<7??IIIIIIIIIIIIIIIIIIIIIIIIIIIIIHEEI;;;;BBEEEEBBA???IIIIIIIIII;;;DIIIIIIIIIIIIII???C???EEEEICBDHGGGIIIIIIIIIIIIIIIIIIIIIIHHHHIIIIIIIIHHHIIIIIICC[email protected]@A9422EA@@@777797EIIFI

The error I get is:
There was a problem importing Manifest.csv:

Missing one or more files for SingleLanePerSampleSingleEndFastqDirFmt: '.+_.+_L[0-9][0-9][0-9]_R[12]_001\.fastq\.gz

Any ideas on how to fix this? I have checked for any weird spacing in the Manifest.csv file, and I have checked whether the fastq files are somehow corrupted.

Thanks for the consideration!

Hello @misterman. I am not affiliated with the QIIME team and just stumbled across this post. I am not sure what the problem is, but my gut feeling is that you’d be better off, also downstream, if you removed all the / characters from the sample-id field. The base name without the extension would probably be enough, and you can generate it in bash from the value in the first column like this:

$ s=/the/path/foo.txt
$ s=${s##*/}        # strip the directory part, leaving foo.txt
$ echo "${s%.*}"    # strip the extension
foo

Adapted from https://stackoverflow.com/questions/2664740/extract-file-basename-without-path-and-extension-in-bash. Or, of course, just do it manually if there are not too many lines. Hope that helps.
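
As a rough sketch of the same idea applied to a whole manifest, the function below rewrites only the first column and leaves the rest alone. The function name fix_manifest_ids and the output file name are made up for illustration, and it assumes a simple three-column manifest with no quoted commas and files ending in .fastq or .fastq.gz:

```shell
# Hypothetical helper (not part of QIIME 2): read a manifest on stdin and
# write it to stdout with sample-id reduced to the file's base name.
fix_manifest_ids() {
  local header id path direction
  # Pass the header line through unchanged; stop quietly on empty input.
  IFS= read -r header || [ -n "$header" ] || return 0
  printf '%s\n' "$header"
  # The "|| [ -n ... ]" part also handles a last line without a newline.
  while IFS=, read -r id path direction || [ -n "$id" ]; do
    [ -z "$id" ] && continue   # skip blank lines
    id=${id##*/}               # drop the directory part
    id=${id%.gz}               # drop a trailing .gz, if present
    id=${id%.fastq}            # drop a trailing .fastq, if present
    printf '%s,%s,%s\n' "$id" "$path" "$direction"
  done
}
```

Something like `fix_manifest_ids < Manifest.csv > Manifest.fixed.csv` should then turn the first data row of the manifest above into `SRR870254,/home/gomeza/davi2714/Ashok_data/SRP023463/fastq/SRR870254.fastq,forward`.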

Good morning @misterman,

Have you validated your sample metadata using Keemei? I ask because I can see your absolute-filepaths and directions, but I don’t see sample names in your table. Keemei will point out issues like this and help you make corrections.
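
This is not a replacement for Keemei, just a rough command-line sketch of the kind of checks involved. The function name check_manifest is hypothetical, and it assumes a plain three-column manifest (sample-id, absolute-filepath, direction) with a header row and no quoted commas:

```shell
# Hypothetical helper (not part of QIIME 2 or Keemei): print a message for
# each suspicious manifest row and return non-zero if any were found.
check_manifest() {
  local manifest=$1 line=0 bad=0 id path direction
  while IFS=, read -r id path direction || [ -n "$id" ]; do
    line=$((line + 1))
    [ "$line" -eq 1 ] && continue               # skip the header row
    [ -z "$id$path$direction" ] && continue     # skip blank lines
    case $id in
      ''|*/*) echo "line $line: sample-id '$id' is empty or looks like a path"; bad=1 ;;
    esac
    [ -f "$path" ] || { echo "line $line: file not found: $path"; bad=1; }
    case $direction in
      forward|reverse) ;;
      *) echo "line $line: unexpected direction '$direction'"; bad=1 ;;
    esac
  done < "$manifest"
  return "$bad"
}
```

Running `check_manifest Manifest.csv` on the manifest above would be expected to flag every row, since each sample-id is a full path.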

Let me know what you find!

Colin

Thank you! Sometimes a fresh pair of eyes is needed. I am missing the sample-ids. Thank you!

Hi @misterman - just curious, can you please confirm what version of QIIME 2 you are running? You can run qiime info for the details. Thanks!

The version I am running is QIIME 2 2019.4.
