Problems importing sequences from SRA

Dear QIIME 2 users, I'm trying to import paired-end sequences downloaded from SRA, but I'm having problems with the import command. For example:
**SRA files (one per sample): e.g., experiment SRX2730405 (SRX2730405.txt, 3.3 KB)
**Manifest file: Map-imput_Narish2017.csv (164 Bytes)
**Command:
qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path Map-imput_Naris2017.csv \
  --output-path Narish2017-demux.qza \
  --input-format PairedEndFastqManifestPhred64

I have already tried several values for --type and --input-format, but could not identify the problem. Please help me identify and correct the error.
Also, I would appreciate any recommendation on how to download data from SRA. I just identified and downloaded each experiment as .fastq; is that correct?

Hi @Dasiel,

I suspect the issue is in the manifest file. I don’t see a file extension listed as part of the path. I would expect those paths to end in .fastq (or maybe .fastq.gz). If you run the following command, what do you see?

ls /mnt/c/Dowload_linux/set-sequence/

Otherwise, could you post what error you are seeing specifically?
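
The path check suggested above can also be scripted. The following is a minimal sketch, assuming the comma-separated manifest layout with the header `sample-id,absolute-filepath,direction`. The function name `find_missing_paths` and the example rows are hypothetical; the example rows are not the contents of the attached manifest.

```python
import csv
import io
import os


def find_missing_paths(manifest_text, exists=os.path.exists):
    """Return (sample-id, path) pairs whose file cannot be found.

    Assumes a CSV manifest with the columns
    sample-id, absolute-filepath, direction.
    """
    # Drop blank lines and '#' comment lines before parsing.
    lines = [ln for ln in manifest_text.splitlines()
             if ln.strip() and not ln.startswith("#")]
    reader = csv.DictReader(io.StringIO("\n".join(lines)))
    missing = []
    for row in reader:
        # Expand variables such as $PWD, which manifests sometimes use.
        path = os.path.expandvars(row["absolute-filepath"])
        if not exists(path):
            missing.append((row["sample-id"], path))
    return missing


# Hypothetical manifest rows, for illustration only.
example = """sample-id,absolute-filepath,direction
SRX2730405,/mnt/c/Dowload_linux/set-sequence/SRX2730405_1.fastq,forward
SRX2730405,/mnt/c/Dowload_linux/set-sequence/SRX2730405_2.fastq,reverse
"""

for sample_id, path in find_missing_paths(example):
    print(f"{sample_id}: file not found: {path}")
```

Any path printed here is one the import would also fail to open, so a missing extension or a mistyped filename shows up before running `qiime tools import`.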


Evan, thank you very much for your help.
Yes, you're right, I must include .fastq at the end of the file path.

But it still doesn't work. Here are the ls output and the error output.

I see that the R1 and R2 sequences are in the same file (SRA run). Is it necessary to separate them in order to import them?
I was reading about the FASTQ de-interlacer for paired-end reads (on Galaxy). What do you suggest?

Hey there @Dasiel! It looks like there might be an issue related to the manifest file itself. I noticed you are running an older version of QIIME 2 (2018.8). If you upgrade to 2018.11 you will see a more detailed error message about the manifest file that can hopefully set you in the right direction.

One thing I notice is that the filename in your manifest says “SRX2730405_1.fastq”, but the closest filename in the dir is called “SRX2730405.fastq” (note the missing _1 at the end).

Are they interlaced, or pre-joined?

Hello Matthew, thanks for your observation. Normally I work on the lab server, and we have the current version, but I will also update my laptop, thanks.

The sequences are interlaced. What do you recommend?

Unfortunately we don’t have a mechanism in QIIME 2 for dealing with interlaced reads — if you are able to deinterlace using an external tool (see this link for a suggestion) you could then import those deinterlaced reads using the manifest format you started working with above. Sorry!
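
For readers without access to an external tool, deinterlacing can be sketched in a few lines of Python. This is an illustration, not the tool linked above. It assumes every record is exactly four lines and that mates strictly alternate (R1, R2, R1, R2, ...). The function name `deinterleave` and the output filenames in the comment are hypothetical.

```python
import io
import itertools


def deinterleave(src, fwd, rev):
    """Split an interleaved FASTQ stream into forward and reverse streams.

    src is an iterable of lines; fwd and rev are writable text streams.
    Returns the number of read pairs written.
    """
    src = iter(src)
    pairs = 0
    while True:
        # One pair = two 4-line FASTQ records = 8 lines.
        block = list(itertools.islice(src, 8))
        if not block:
            break
        if len(block) != 8:
            raise ValueError("input does not hold a whole number of read pairs")
        fwd.writelines(block[:4])  # first record of the pair -> forward
        rev.writelines(block[4:])  # second record of the pair -> reverse
        pairs += 1
    return pairs


# Small in-memory demonstration with one made-up read pair.
demo = io.StringIO("@read1/1\nACGT\n+\nIIII\n@read1/2\nTTGA\n+\nIIII\n")
forward, reverse = io.StringIO(), io.StringIO()
print(deinterleave(demo, forward, reverse), "pair(s) written")

# With real files (hypothetical names), the call would look like:
# with open("SRX2730405.fastq") as src, \
#         open("SRX2730405_1.fastq", "w") as fwd, \
#         open("SRX2730405_2.fastq", "w") as rev:
#     deinterleave(src, fwd, rev)
```

The two output files can then be listed in the manifest as the forward and reverse reads for that sample.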

Great, thanks to this exchange I now understand the problem! Interestingly, not all the samples (runs) in this set are interlaced, so I'm checking them individually. I'm using the FASTQ de-interlacer (Galaxy version). Thanks for your important support to the users of QIIME 2. Cheers
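
Checking each run by hand can be partly automated. The sketch below is a heuristic, not a definitive test: it assumes an interlaced file stores each pair as two consecutive records with the same read name, optionally ending in /1 and /2. The names `read_id` and `looks_interleaved` are hypothetical.

```python
import itertools


def read_id(header):
    """Return the read name from a FASTQ header, without a /1 or /2 suffix."""
    name = header.strip().lstrip("@").split(" ")[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def looks_interleaved(src, max_pairs=1000):
    """Guess whether a FASTQ stream is interlaced.

    Inspects up to max_pairs consecutive record pairs and reports True
    only if the two records in every inspected pair share a read name.
    """
    src = iter(src)
    checked = 0
    while checked < max_pairs:
        block = list(itertools.islice(src, 8))  # two 4-line records
        if not block:
            break
        if len(block) < 8 or read_id(block[0]) != read_id(block[4]):
            return False
        checked += 1
    return checked > 0


# Made-up records: two mates sharing one name, as in an interlaced file.
records = ["@SRR0000001.1 1 length=4\n", "ACGT\n", "+\n", "IIII\n",
           "@SRR0000001.1 1 length=4\n", "TTGA\n", "+\n", "IIII\n"]
print(looks_interleaved(records))
```

A file that passes this check is a candidate for deinterlacing. A file that fails it may be single-end or already split, so it is worth looking at its first few headers before deciding.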

