Various errors when importing data

Hi, I am very new to QIIME2 and I am having trouble importing the data from the Moving Pictures Tutorial. I am using QIIME2 version 2019.1 through VirtualBox on a Mac. I've looked at the forum posts for similar errors, but I still haven't been able to progress.

I've tried following the tutorial's directions for making a directory and downloading the files:
mkdir emp-single-end-sequences
cd emp-single-end-sequences
(I've tried a variety of directory names, including qiime2-moving-pictures-tutorial, but this was the name I used for my most recent attempt.)

I downloaded the metadata, sequences, and barcodes from the tutorial and put them in a desktop folder of the same name.

I've used the dir command to see what files are in my directory:
$ dir
which gave me:
emp-single-end-sequences:barcodes.fastq.gz
emp-single-end-sequences:sequences.fastq
emp-single-end-sequences:sequences.fastq.gz
sample-metadata.tsv

But when I import (and I've tried many variations), it gives me an error in some way:

#1: $ qiime tools import --type EMPSingleEndSequences --input-path emp-single-end-sequences --output-path emp-single-end-sequences.qza
gives me:
Error: Invalid value for "input-path" : Path "emp-single-end-sequences" does not exist

#2: or the same command gives me: There was a problem importing emp-single-end-sequences: Missing one or more files for EMPSingleEndDirFmt: 'sequences.fastq.gz'

#3: $ qiime tools import --type EMPSingleEndSequences --input-path emp-single-end-sequences:sequences.fastq.gz --output-path emp-single-end-sequences.qza
gives me:
There was a problem importing emp-single-end-sequences:sequences.fastq.gz: Importing 'EMPSingleEndDirFmt' requires a directory, not emp-single-end-sequences:sequences.fastq.gz

I don't know what the problems are in each case. Did I download the data correctly (should I use wget or curl instead for VirtualBox?). Is the data not in the directory even if QIIME says it is? Am I not properly giving the program a source to pull the files from?

Any help will be very much appreciated. Thank you!

Good morning @ET1335,

Welcome to the QIIME 2 forums! :qiime2:

I'm glad you got QIIME 2 installed and found the tutorial. Your perseverance through all these errors is impressive! :trophy: I think you are really close, so let's see if we can sort this out.

That's what I think is going on too. Let's move around the directories a bit and see what we find.

Running this command will take you back to your home directory
cd
and running this command
pwd
will show you the absolute file path of the directory you are in (so, the absolute file path of your home directory). pwd stands for Print Working Directory, by the way.

Cool. Navigate to that directory using the cd command. Your command might look like this:
cd Desktop/emp-single-end-sequences
Just like you have done before, take a look at the files in your folder. Instead of dir, try using ls
ls
and post the results!


I should explain my strategy here: Every error you received was about the folder names being close, but not quite right. I want to explore the folders you do have, so we can craft a command that is 100% perfect. (Computers are not very flexible, :man_shrugging: :robot:)
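
As a rough sketch of that strategy, the snippet below checks a folder for the exact file names mentioned in the import error. `check_emp_dir` is a made-up helper name, not part of QIIME 2, and the scratch directory from `mktemp` only imitates the listing reported earlier in this thread so that the example can run anywhere.

```shell
# Report whether a directory holds the two files the EMP single-end
# importer looks for. check_emp_dir is a hypothetical helper name.
check_emp_dir() {
    dir=$1
    missing=0
    for name in sequences.fastq.gz barcodes.fastq.gz; do
        if [ -f "$dir/$name" ]; then
            echo "found:   $name"
        else
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}

# Demo on a scratch directory that mimics the listing in this thread:
# the names are close to what the importer wants, but not identical.
demo=$(mktemp -d)
touch "$demo/emp-single-end-sequences:sequences.fastq.gz" "$demo/sequences.fastq"
ls "$demo"
check_emp_dir "$demo" || echo "The names are close, but not an exact match."
```

On the real machine you would call `check_emp_dir ~/Desktop/emp-single-end-sequences` instead of using the scratch directory.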

Colin

Hi Colin,

Thank you for your response.

Typing in cd Desktop/emp-single-end-sequences and then pwd
gives me:
/home/qiime2/Desktop/emp-single-end-sequences

$ ls
gives me:
barcodes.fastq
emp-single-end-sequences:barcodes.fastq.gz (in red text)
emp-single-end-sequences:sequences.fastq.gz (in red text)
sequences.fastq

I'm assuming (or hoping, at least) that these are the files I downloaded from the Moving Pictures Tutorial, and the "sequences" file is the one I need to import first. However, when I am in my emp-single-end-sequences directory and I try the import command, it still doesn't work, giving me either the "Invalid value" or "requires a directory" error.

If I refer to the complete directory path for the input path, here's what I get:

$ qiime tools import --type EMPSingleEndSequences --input-path /home/qiime2/Desktop/emp-single-end-sequences --output-path emp-single-end-sequences
gives me:
There was a problem importing /home/qiime2/Desktop/emp-single-end-sequences:
Missing one or more files for EMPSingleEndDirFmt: 'sequences.fastq.gz'

I don't seem to have a file called sequences.fastq.gz - I only have sequences.fastq and emp-single-end-sequences:sequences.fastq.gz. Is it a problem with file naming/format?

I am a complete novice when it comes to computational biology, so this may just be me not setting up the directory/downloading the files properly. Thank you for your help!

Yes, the results of your ls command appear to confirm this.

I would suggest deleting the emp-single-end-sequences directory (using the Ubuntu GUI) and starting over. This time, read each step in the instructions carefully, and pay close attention to the order of the commands.

Hello again,

Again we are so close! Looks like the files are downloaded, but maybe you have two copies??

EDIT: With Matt's help, I think I know what's going on! Does your folder look like this?

- home
-- qiime2
--- Desktop
---- emp-single-end-sequences
       barcodes.fastq
       emp-single-end-sequences:barcodes.fastq.gz (in red text)
       emp-single-end-sequences:sequences.fastq.gz (in red text)
       sequences.fastq

So you have two extra files, with the : in the middle.
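
If those colon-named files turn out to be complete downloads, one option is to rename them so that only the part after the colon remains. The loop below is a sketch under that assumption; it works on a scratch directory rather than the real one, and it skips any file whose target name is already taken. Re-downloading from scratch is the safer route if there is any doubt about the files.

```shell
# Scratch directory standing in for ~/Desktop/emp-single-end-sequences.
work=$(mktemp -d)
touch "$work/emp-single-end-sequences:barcodes.fastq.gz" \
      "$work/emp-single-end-sequences:sequences.fastq.gz"

# Rename every file whose name contains a colon, keeping only the part
# after the first colon. Skip the rename if the target already exists.
for path in "$work"/*:*; do
    [ -e "$path" ] || continue        # the glob matched nothing
    file=${path##*/}                  # drop the directory part
    target="$work/${file#*:}"         # drop everything up to the first ':'
    if [ -e "$target" ]; then
        echo "skipping $file: ${target##*/} already exists"
    else
        mv -- "$path" "$target"
    fi
done
ls "$work"
```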


Matt has a suggestion:

But I like this suggestion of his:

Have you used the Ubuntu GUI to view these files and folders like you would on a normal Windows machine? This would let you see :eyes: the files in the folders, and I think this would be informative.

So, open up your current directory like this
xdg-open .
or try this
nautilus .

Then explore all your files and report back! :file_folder: :open_file_folder:

Colin

Hi Colin and Matt,

Thank you for your responses. It seems I have multiple copies of the files, some of which are zipped, as noted by the text in red.

I've tried to start over by following the directions on the Moving Pictures tutorial page to the letter in order. Now my directory path is /home/qiime2/qiime2-moving-pictures-tutorial/emp-single-end-sequences. The first directory qiime2-moving-pictures-tutorial contains both the metadata file and the emp-single-end-sequences directory.

I'm trying to avoid downloading extra files this time, but when I use the links on the tutorial to get the sequences and barcodes files, I always get one zipped and one unzipped file of each (for example, the sequences download link gives me sequences.fastq.gz and sequences.fastq).

When I try to import any of the above files, I get the various errors described in my first post.

I'm not sure what you mean when you say to use the Ubuntu GUI to look at my files and/or delete my original emp-single-end-sequences directory. The folder on my desktop lists the files as
barcodes.fastq
barcodes.fastq.gz (with another barcodes.fastq inside)
sequences.fastq
sequences.fastq.gz (with another sequences.fastq inside)

Which files do I actually need? Why am I downloading both zipped and unzipped versions of the files from the download links on the tutorial page?

Putting the xdg-open command into the qiime2 terminal gives me the following, which I don't think is helpful:
xdg-open - opens a file or URL in the user's preferred application
Synopsis
xdg-open { file | URL }
xdg-open { --help | --manual | --version }
Use 'man xdg-open' or 'xdg-open --manual' for additional info

Meanwhile, the nautilus command simply brings me to the file menu within VirtualBox.

I'm so sorry to keep bothering you, but I really don't know what's going on.

Thanks for your help!

How are you downloading the files? It sounds like you might be clicking the links in the browser to download them, which may in turn be unzipping the files automatically. Try using the wget commands from the tutorial instead.
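
One way to tell whether a browser has silently unzipped a download is to ask gzip to test the file. The sketch below uses `gzip -t` on two scratch files; `is_gzipped` is a hypothetical helper name, and the tiny FASTQ record is invented purely for the demo.

```shell
# is_gzipped is a hypothetical helper: it succeeds only if the file passes
# gzip's integrity test, i.e. it really contains gzip-compressed data.
is_gzipped() {
    gzip -t "$1" 2>/dev/null
}

# Demo with two scratch files: one genuinely compressed, and one plain-text
# file that merely carries a .gz extension.
tmp=$(mktemp -d)
printf '@read1\nACGT\n+\nIIII\n' > "$tmp/sequences.fastq"
gzip -c "$tmp/sequences.fastq" > "$tmp/sequences.fastq.gz"
cp "$tmp/sequences.fastq" "$tmp/fake.fastq.gz"

for f in "$tmp/sequences.fastq.gz" "$tmp/fake.fastq.gz"; do
    if is_gzipped "$f"; then
        echo "${f##*/}: compressed"
    else
        echo "${f##*/}: NOT gzip data"
    fi
done
```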

Good morning,

I think you are making good progress!

Looking at the files on your desktop is exactly what I meant! Good job.

OK good! Nautilus is the file menu, so I'm glad this is working as intended.

It sounds like you are using both the file menu and the terminal to make progress in the tutorial. Have you tried downloading and seeing your files without using the file menu? :see_no_evil: This means using the terminal to download using wget, going to folders with cd, viewing their contents using ls, and finding your current location using pwd.

I think you are making good progress and we are getting close! I think using the terminal first and the file menu second will help focus your work. QIIME 2 is all about using the terminal! :qiime2:

Colin

Colin and Matt,

Great news! Import was finally successful after tweaking both the directory and the files.

I was previously telling QIIME2 to look for the directory emp-single-end-sequences while it was in that directory, instead of starting in the directory outside of it, qiime2-moving-pictures-tutorial.

Additionally, there were duplicate files in the emp-single-end-sequences directory (as you both suggested). I had barcodes.fastq.gz and sequences.fastq.gz as well as their unzipped versions, which may have been messing with the program, because when I deleted the unzipped files and made sure I was in the correct directory, the import worked!
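
The cleanup described above can be sketched as a loop that deletes an uncompressed .fastq file only when a .gz copy with the same name sits next to it. This is an illustration on a scratch directory with empty placeholder files, not a record of the exact commands that were run.

```shell
# Scratch directory mimicking the duplicated downloads described above.
dup=$(mktemp -d)
touch "$dup/barcodes.fastq" "$dup/barcodes.fastq.gz" \
      "$dup/sequences.fastq" "$dup/sequences.fastq.gz"

# Remove an uncompressed .fastq only when its .gz twin is present, so
# nothing is lost if only one copy of a file exists.
for plain in "$dup"/*.fastq; do
    [ -e "$plain" ] || continue       # the glob matched nothing
    if [ -f "$plain.gz" ]; then
        rm -- "$plain"
        echo "removed ${plain##*/}"
    fi
done
ls "$dup"
```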

So the correct directory and command were
~/qiime2-moving-pictures-tutorial$ qiime tools import --type EMPSingleEndSequences --input-path emp-single-end-sequences --output-path emp-single-end-sequences.qza
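
Because emp-single-end-sequences is a relative path, the command only works when that directory is visible from wherever the terminal currently is. A minimal sketch of a guard for this, assuming the directory layout from this thread (`import_emp` is a hypothetical wrapper, not a QIIME 2 command):

```shell
# import_emp is a hypothetical wrapper: it refuses to call qiime unless the
# input directory is visible from the current working directory.
import_emp() {
    input_dir=emp-single-end-sequences
    if [ ! -d "$input_dir" ]; then
        echo "No '$input_dir' directory inside $(pwd); cd to its parent first." >&2
        return 1
    fi
    qiime tools import \
        --type EMPSingleEndSequences \
        --input-path "$input_dir" \
        --output-path emp-single-end-sequences.qza
}

# Usage (inside the QIIME 2 environment):
#   cd ~/qiime2-moving-pictures-tutorial && import_emp
```

Run from inside emp-single-end-sequences itself, the wrapper stops with a message instead of handing qiime a path that does not exist.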

Thank you both so much for your help. I will continue the Moving Pictures tutorial and return to the forums when other problems arise.

Thanks,
Emily

Glad you got this working, Emily! :clap:

Colin
