So I am testing out the Moving Pictures tutorial to familiarise myself with QIIME 2 and Deblur. However, I am running into issues importing my data: I am able to download the barcodes, but I get this response when I try to download the sequences using wget:
(qiime2) taylorrayne@Taylors-MacBook-Air qiime2-moving-pictures-tutorial % wget \
-O "emp-single-end-sequences/sequences.fastq.gz" \
"https://data.qiime2.org/2022.8/tutorials/moving-pictures/emp-single-end-sequences/sequences.fastq.gz"
emp-single-end-sequences/sequences.fastq.gz: No such file or directory
I tried to download the sequences using the browser option instead, but then I ran the import command and got this:
(qiime2) taylorrayne@Taylors-MacBook-Air qiime2-moving-pictures-tutorial % qiime tools import \
--type EMPSingleEndSequences \
--input-path emp-single-end-sequences \
--output-path emp-single-end-sequences.qza
Usage: qiime tools import [OPTIONS]
Import data to create a new QIIME 2 Artifact. See https://docs.qiime2.org/
for usage examples and details on the file types and associated semantic
types that can be imported.
Options:
--type TEXT The semantic type of the artifact that will be
created upon importing. Use --show-importable-types
to see what importable semantic types are available
in the current deployment. [required]
--input-path PATH Path to file or directory that should be imported.
[required]
--output-path ARTIFACT Path where output artifact should be written.
[required]
--input-format TEXT The format of the data to be imported. If not
provided, data must be in the format expected by the
semantic type provided via --type.
--show-importable-types Show the semantic types that can be supplied to
--type to import data into an artifact.
--show-importable-formats
Show formats that can be supplied to --input-format
to import data into an artifact.
--help Show this message and exit.
There was a problem with the command:
(1/1) Invalid value for '--input-path': Path 'emp-single-end-sequences' does
not exist.
I think my issue is a quick fix of properly getting the data and naming it accordingly ... but I am confused, haha.
Thanks for reaching out! Happy to lend a hand here.
Based on your second code block, it seems like the emp-single-end-sequences directory doesn't exist locally on your machine (at least in your working directory where you're running those commands). Let's try wget again, and then have you run the rest of the subsequent commands after that's working.
Were you able to run the other wget commands successfully, or did you have this issue with all of them? The command itself looks correct, so there might be an issue with your conda environment.
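One thing worth ruling out first: wget's -O option writes to the path you give it but does not create missing parent directories, so a "No such file or directory" message like the one above often just means the emp-single-end-sequences directory doesn't exist yet in your working directory. A minimal sketch of the check, assuming you run it from your tutorial directory:

```shell
# Create the download directory if it does not already exist.
# With -p this does nothing (and reports no error) when the directory is already there.
mkdir -p emp-single-end-sequences

# Confirm the directory exists before re-running the wget command.
ls -ld emp-single-end-sequences
```

After that, re-running the wget command from your post should be able to write the file into that directory.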
Thank you so much for following up with me on this.
After running the wget command again, as you suggested, I had success! I do still have some questions, though, mainly concerning data input/output.
If I try to run my own data through deblur using qiime2, how might I change the paths in the following commands?
% qiime tools import \
--type EMPSingleEndSequences \
--input-path trail1stock\ ; name of directory that has my fastq file
--output-path trail1stock-seq.qza
I think I might have messed up the output path, but I am not really sure. I've done some googling but haven't come up with anything.
Do you see what might be the issue?
It looks like you're missing a space between your input path name and the backslash at the end of that line. Without it, the shell joins the two lines directly, which is why QIIME 2 thinks your input path name is trail1stock--output-path and reports that the --output-path parameter is missing.
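To see why the space matters: a backslash at the very end of a line tells the shell to continue the command on the next line, and the backslash and the line break are simply removed. Without a space in front of the backslash, the last word on one line and the first word on the next become a single word. Here's a small sketch of that, using a made-up helper called show_args (not part of QIIME 2) that prints each argument it receives in brackets; out.qza is just a placeholder name:

```shell
# Hypothetical helper: print every argument on its own line, wrapped in brackets.
show_args() { printf '[%s]\n' "$@"; }

# No space before the backslash, and the next line starts at column 0:
# "trail1stock" and "--output-path" are glued into one argument.
joined=$(show_args --input-path trail1stock\
--output-path out.qza)

# A space before the backslash keeps the two words separate.
separate=$(show_args --input-path trail1stock \
--output-path out.qza)

echo "Without the space:"
echo "$joined"
echo "With the space:"
echo "$separate"
```

The first call receives three arguments, one of which is trail1stock--output-path; the second receives four, with trail1stock and --output-path as separate arguments.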
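You can copy/paste this adjusted command below, and it should work for you:
qiime tools import \
--type EMPSingleEndSequences \
--input-path trail1stock \
--output-path trail1stock-seq.qza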
Perfect! That fixed it ... almost.
So now I am dealing with a few issues related to the EMPSingleEndSequences type.
I ran your command above, but got:
There was a problem importing trail1stock:
Missing one or more files for EMPSingleEndDirFmt: 'sequences.fastq.gz'
So I tried renaming my data files, but then got:
There was a problem importing trail1stock:
trail1stock/sequences.fastq.gz is not a(n) FastqGzFormat file:
File is uncompressed
Which makes sense. Do you know how to solve this issue - either by re-routing EMPSingleEndSequences to take my original file or by some other means?
Take a look at our importing guide for EMPSingleEndSequences, and make sure that your data is in the format described there. If that is all in order, the only remaining issue is the one you mentioned in your previous response, the "File is uncompressed" error.
If your files are not compressed (i.e. gzipped), they will not be accepted. Simply adding .gz to the end of your filename will not solve this: a file extension doesn't mean anything unless the file's contents are actually in the format that the extension represents.
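You can check this for yourself with gzip -t, which tests whether a file really is gzip-compressed. A quick sketch, using a throwaway directory and a made-up one-read FASTQ file (the check_dir name and the file contents are just for illustration, not real data):

```shell
# Throwaway directory and a tiny placeholder FASTQ file (not real data).
check_dir=$(mktemp -d)
printf '@read1\nACGT\n+\nIIII\n' > "$check_dir/sequences.fastq"

# Renaming changes only the name; the contents are still plain text.
mv "$check_dir/sequences.fastq" "$check_dir/sequences.fastq.gz"

# gzip -t exits with a non-zero status when the file is not actually gzip data.
if gzip -t "$check_dir/sequences.fastq.gz" 2>/dev/null; then
  rename_result="compressed"
else
  rename_result="not compressed"
fi
echo "after renaming only: $rename_result"
```

This reports "not compressed", which is the same situation the importer is complaining about.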
You'll need to run the gzip command on both your sequences and barcodes files in order to create the correct compressed file format (which will result in the sequences.fastq.gz and barcodes.fastq.gz files that are required). Here's an example of what that command looks like on OSX:
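A minimal sketch, run on placeholder files in a throwaway directory so it's safe to try as-is (demo_dir and the one-read files are made up for illustration):

```shell
# Throwaway directory with tiny placeholder files standing in for real data.
demo_dir=$(mktemp -d)
printf '@read1\nACGT\n+\nIIII\n' > "$demo_dir/sequences.fastq"
printf '@read1\nAAAA\n+\nIIII\n' > "$demo_dir/barcodes.fastq"

# gzip compresses each file in place and appends .gz to its name,
# so sequences.fastq becomes sequences.fastq.gz (the original is replaced).
gzip "$demo_dir/sequences.fastq" "$demo_dir/barcodes.fastq"

# gzip -t verifies that both results are valid gzip files.
gzip -t "$demo_dir/sequences.fastq.gz" "$demo_dir/barcodes.fastq.gz" \
  && echo "both files are gzip-compressed"

# List the directory to see the new file names.
ls "$demo_dir"
```

For your own data, the equivalent would be running gzip on trail1stock/sequences.fastq and trail1stock/barcodes.fastq, assuming your uncompressed files already have those names.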