importing paired end split fastq data

lilycrook · September 30, 2020, 4:41am

Hi there!

I am very new to bioinformatics and I am analysing datasets from different studies. I am collecting these datasets from NCBI using the SRA Toolkit. My reads are paired-end and I have managed to split them, so I have the forward and reverse files.

I am going through the 'Importing files tutorial' and I am thinking I can just replicate the code line given under the demultiplexed Casava data type. But, to be honest, I have no idea what to do.

Any suggestions on how I can go about this?

Also, I have created my metadata table in Excel and saved it as a TSV document. I became a bit confused when reading the validation section of the metadata tutorial. Is this the line I will need to use?

Cheers,

Lily

jwdebelius · September 30, 2020, 9:44am

Hi @lilycrook,

I think your best bet for importing the paired-end SRA reads is a manifest. Your file names do not match the specific Casava format, so rather than trying to make that work, you'll find the manifest easier.
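For readers following along, here is a rough sketch of how such a manifest could be generated rather than typed by hand. It assumes the split files are named <run>_1.fastq and <run>_2.fastq (optionally gzipped), which is an assumption about the naming and not something stated in this thread. The function name make_manifest is made up for illustration. The three column headers are the ones the PairedEndFastqManifestPhred33V2 format expects.

```shell
# Sketch: print a QIIME 2 paired-end manifest (V2, tab-separated) for a
# directory of split FASTQ files.
# ASSUMPTION: files are named <run>_1.fastq[.gz] and <run>_2.fastq[.gz];
# adjust the patterns if your split files are named differently.
make_manifest() {
    local reads_dir fwd rev base sample suffix
    # The manifest needs absolute file paths.
    reads_dir="$(cd "$1" && pwd)" || return 1
    printf 'sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n'
    for fwd in "$reads_dir"/*_1.fastq*; do
        [ -e "$fwd" ] || continue              # glob matched nothing
        base="$(basename "$fwd")"
        sample="${base%%_1.fastq*}"            # e.g. SRR123 from SRR123_1.fastq.gz
        suffix="${base#"${sample}_1"}"         # .fastq or .fastq.gz
        rev="$reads_dir/${sample}_2${suffix}"
        if [ -e "$rev" ]; then
            printf '%s\t%s\t%s\n' "$sample" "$fwd" "$rev"
        else
            echo "warning: no reverse file for $sample, skipping" >&2
        fi
    done
}
```

It would be used as `make_manifest /path/to/reads > manifest.txt`. Samples without a matching reverse file are skipped with a warning instead of producing a half-filled row.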

This is just a notice to let you know that new functionality is coming soon. It doesn't affect you now, but in the future you will be able to validate your metadata on the command line rather than having to use Keemei in Google Sheets, which is great for anyone who can't upload their metadata to Google.

Best,
Justine

lilycrook · October 1, 2020, 4:05am

Hi Justine,

Thank you for your quick response.
I have created a manifest table in Excel and saved it as a tab-delimited file (txt) containing all the file paths for my reads.

This is the command I am trying to use to import the manifest:

qiime tools import
--type 'SampleData[SequencesWithQuality]'
--input-path manifest.txt
--output-path Paired-end-demux.qza
--input-format PairedEndFastqManifestPhred33V2

I have also created a shared folder where I have saved the folder of read files and the manifest file.

This is what I get from the command above:

Am I on the right path at all?

Cheers,

Lily

jwdebelius · October 1, 2020, 7:45am

Hi @lilycrook,

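You’re close! The QIIME command is warning you that it can’t find your arguments, because each line is being read as a separate command. You can fix this by ending every line except the last with a \ character, which continues the command onto the next line. So, your command should look like: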

qiime tools import \
--type 'SampleData[SequencesWithQuality]' \
--input-path manifest.txt \
--output-path Paired-end-demux.qza \
--input-format PairedEndFastqManifestPhred33V2

Another piece of (entirely unsolicited) advice: if you replace the spaces ( ) in your file paths with dashes (-) or underscores (_), it will be easier to navigate on your Linux system. The shell treats an unquoted space as a separator between arguments rather than as part of a file path. It won't affect the import, but it will make things harder later.
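As an illustration of that renaming, a small sketch like the following could replace spaces with underscores for every file in one directory. The function name replace_spaces is hypothetical, and it only touches file names directly inside the given directory, not the directory names above them.

```shell
# Sketch: rename files in one directory so spaces become underscores.
# Only the file names directly inside "$1" are changed.
replace_spaces() {
    local dir path name new
    dir="$1"
    for path in "$dir"/*\ *; do
        [ -e "$path" ] || continue             # no names with spaces found
        name="$(basename "$path")"
        new="$(printf '%s' "$name" | tr ' ' '_')"
        # -n: never overwrite a file that already has the new name
        mv -n -- "$path" "$dir/$new"
    done
}
```

After running something like `replace_spaces "/path/to/my reads"`, any manifest that lists the old file names would need its paths updated to match.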

Best,
Justine

lilycrook · October 3, 2020, 6:13am

Thank you Justine. I have taken your advice and changed the file paths, and it seems to have worked. However, I am now getting a different error, and I am not sure what it means.

Could you help me out, please?

Cheers,

Lily

jwdebelius · October 3, 2020, 2:00pm

Hi @lilycrook,

It looks like both you and I missed something. The semantic type SampleData[SequencesWithQuality] covers single-end (forward-only) reads. For paired-end data, I think you need SampleData[PairedEndSequencesWithQuality].
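Putting the pieces from this thread together, the import would presumably look something like the sketch below. The wrapper function import_paired_end and the QIIME_BIN variable are made up for illustration, so the command can be dry-run without QIIME 2 installed. The type, format, and option names are the ones already used in this thread.

```shell
# Sketch: import paired-end reads listed in a manifest.
# QIIME_BIN is a hypothetical override for dry runs (e.g. QIIME_BIN=echo);
# it is not a QIIME 2 setting. By default the real qiime command is called.
import_paired_end() {
    local manifest output
    manifest="$1"
    output="$2"
    "${QIIME_BIN:-qiime}" tools import \
        --type 'SampleData[PairedEndSequencesWithQuality]' \
        --input-path "$manifest" \
        --output-path "$output" \
        --input-format PairedEndFastqManifestPhred33V2
}
```

For example, `import_paired_end manifest.txt Paired-end-demux.qza` would run the import, while `QIIME_BIN=echo import_paired_end manifest.txt Paired-end-demux.qza` only prints the arguments that would be passed.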

Best,
Justine

system · November 3, 2020, 8:00pm

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.