What if I don't have the barcodes file?

Hello,
I'm using QIIME 2 to analyze my human gut microbiome 16S rRNA sequence data. Below is the command I use to import my data:

qiime tools import \
  --type EMPSingleEndSequences \
  --input-path emp-single-end-sequences \
  --output-path emp-single-end-sequences.qza

But it fails with a ValueError: Missing one or more files for EMPSingleEndDirFmt: 'barcodes.fastq.gz'

The problem is that I don't even have a barcodes file.

Note that when I try to import my data as a SampleData[SequencesWithQuality] type, as follows:

qiime tools import   --type SampleData[SequencesWithQuality] --input-path ./SampleData/   --output-path SampleData.qza

I get the following error:

Traceback (most recent call last):
  File "/Users/yinghe/miniconda3/envs/qiime2-2017.4/bin/qiime", line 6, in <module>
    sys.exit(q2cli.__main__.qiime())
  File "/Users/yinghe/miniconda3/envs/qiime2-2017.4/lib/python3.5/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/Users/yinghe/miniconda3/envs/qiime2-2017.4/lib/python3.5/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/Users/yinghe/miniconda3/envs/qiime2-2017.4/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/yinghe/miniconda3/envs/qiime2-2017.4/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/yinghe/miniconda3/envs/qiime2-2017.4/lib/python3.5/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/yinghe/miniconda3/envs/qiime2-2017.4/lib/python3.5/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/Users/yinghe/miniconda3/envs/qiime2-2017.4/lib/python3.5/site-packages/q2cli/tools.py", line 62, in import_data
    view_type=source_format)
  File "/Users/yinghe/miniconda3/envs/qiime2-2017.4/lib/python3.5/site-packages/qiime2/sdk/result.py", line 191, in import_data
    return cls._from_view(type_, view, view_type, provenance_capture)
  File "/Users/yinghe/miniconda3/envs/qiime2-2017.4/lib/python3.5/site-packages/qiime2/sdk/result.py", line 216, in _from_view
    result = transformation(view)
  File "/Users/yinghe/miniconda3/envs/qiime2-2017.4/lib/python3.5/site-packages/qiime2/core/transform.py", line 57, in transformation
    self.validate(view)
  File "/Users/yinghe/miniconda3/envs/qiime2-2017.4/lib/python3.5/site-packages/qiime2/core/transform.py", line 114, in validate
    view.validate()
  File "/Users/yinghe/miniconda3/envs/qiime2-2017.4/lib/python3.5/site-packages/qiime2/plugin/model/directory_format.py", line 168, in validate
    getattr(self, field)._validate_members(collected_paths)
  File "/Users/yinghe/miniconda3/envs/qiime2-2017.4/lib/python3.5/site-packages/qiime2/plugin/model/directory_format.py", line 104, in _validate_members
    self.pathspec))
ValueError: Missing one or more files for SingleLanePerSampleSingleEndFastqDirFmt: '.+_.+_L[0-9][0-9][0-9]_R[12]_001\\.fastq\\.gz'

I'm not exactly sure what this error means, or how to get started figuring it out.
Any pointers would be greatly appreciated.

Thanks,
dullgreen

Hey @yinghe,

What kind of data do you have? Do you have a separate file for every sample (i.e. demultiplexed)?

If so, you should use the new fastq manifest formats. Here is the tutorial for using them. Alternatively, you could rename all of your files to match what SingleLanePerSampleSingleEndFastqDirFmt expects (which is what your last error is trying to communicate).
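
To make that last error a bit more concrete, here is a minimal Python sketch that checks filenames against the pattern quoted in your traceback. This is only an illustration, not QIIME 2's own validation code: the helper name `matches_casava` and the example filenames are made up, and I am assuming the whole filename has to match the pattern.

```python
import re

# Pattern copied from the ValueError message. The doubled backslashes in the
# traceback are just Python's repr escaping, so a single backslash is used here.
CASAVA_PATTERN = re.compile(r".+_.+_L[0-9][0-9][0-9]_R[12]_001\.fastq\.gz")


def matches_casava(filename):
    """Return True if the entire filename fits the pattern (assumed full match)."""
    return CASAVA_PATTERN.fullmatch(filename) is not None


# Hypothetical filenames, for illustration only.
for name in ["L2S357_15_L001_R1_001.fastq.gz", "sample1.fastq.gz"]:
    print(name, matches_casava(name))
```

A name like 'sample1.fastq.gz' fails because it lacks the underscore-separated fields (something_something_L###_R1-or-R2_001) that the pattern requires.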

If your data isn’t already demultiplexed and you aren’t using the EMP protocol (i.e. no barcodes file), then you’ll need to contact your sequencing facility and find out how you should demultiplex your reads into per-sample sequences.

Yes, I have two separate .gz files for every sample. I think one is the forward read and the other is the reverse read.

I took just the forward files, renamed them in the 'Casava 1.8 single-end demultiplexed fastq' style, which looks like 'L2S357_15_L001_R1_001.fastq.gz', and used the command below:

qiime tools import \
  --type 'SampleData[SequencesWithQuality]' \
  --input-path casava-18-single-end-demultiplexed \
  --source-format CasavaOneEightSingleLanePerSampleDirFmt \
  --output-path demux-single-end.qza

Problem solved: I was able to import my data into QIIME 2.

Thank you for your prompt help, and please tell me if I'm doing this right.

That should work!

If you want to use your paired-end data, you can do the same thing; just change the --type to 'SampleData[PairedEndSequencesWithQuality]'.
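
For the renaming step with paired files, a rough sketch of how the forward and reverse names could be built is below. It only computes the target names and does not touch any files. The function names, the placeholder `barcode_id` of "0", and the default lane of 1 are my own assumptions for illustration; the only thing taken from this thread is the overall shape of the name, as in 'L2S357_15_L001_R1_001.fastq.gz'.

```python
def casava_name(sample_id, read, barcode_id="0", lane=1):
    """Build a name shaped like 'L2S357_15_L001_R1_001.fastq.gz'.

    read is 1 for forward and 2 for reverse. barcode_id and lane are
    placeholder fields (assumed values) that only need to fit the pattern.
    """
    if read not in (1, 2):
        raise ValueError("read must be 1 (forward) or 2 (reverse)")
    return "{}_{}_L{:03d}_R{}_001.fastq.gz".format(sample_id, barcode_id, lane, read)


def paired_names(sample_id, barcode_id="0", lane=1):
    """Return the (forward, reverse) target names for one sample."""
    return (casava_name(sample_id, 1, barcode_id, lane),
            casava_name(sample_id, 2, barcode_id, lane))


print(paired_names("L2S357", barcode_id="15"))
# ('L2S357_15_L001_R1_001.fastq.gz', 'L2S357_15_L001_R2_001.fastq.gz')
```

You would still need to work out which of your two files per sample is forward and which is reverse before renaming them.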
