Dear Qiimers,
I am working on data that are unusual for me, and I need some help importing the reads correctly. I have 16S V3-V4 Illumina data that seem to have been demultiplexed and trimmed already; judging by the headers, each sample was also sequenced on multiple lanes, which were then merged. My usual command:
qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-format CasavaOneEightSingleLanePerSampleDirFmt
does not work:
Missing one or more files for CasavaOneEightSingleLanePerSampleDirFmt: '.+_.+_L[0-9][0-9][0-9]_R[12]_001\\.fastq\\.gz'
So I tried
--input-format CasavaOneEightLanelessPerSampleDirFmt
The import step seemed to have worked, but then I got the same error in the next step (cutadapt).
Could you please advise on the best way to import the data?
Thank you for your kind attention,
Max
EDIT: I am using q2 v 2019.7
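A quick way to see why the import complains is to test the file names against the pattern quoted in the error message. The sketch below is only illustrative: the regular expression is copied from the message above, the file names are hypothetical, and anchoring the pattern to the whole file name with `fullmatch` is an assumption about how QIIME 2 applies it.

```python
import re

# Pattern copied from the error message above. In a raw string the dots are
# escaped with a single backslash; the message shows them double-escaped.
CASAVA_SINGLE_LANE = re.compile(r".+_.+_L[0-9][0-9][0-9]_R[12]_001\.fastq\.gz")


def matches_casava_single_lane(filename: str) -> bool:
    """Return True if the whole file name fits the single-lane Casava 1.8 pattern."""
    return CASAVA_SINGLE_LANE.fullmatch(filename) is not None


def non_matching(filenames):
    """Return the file names that do not fit the pattern, in input order."""
    return [name for name in filenames if not matches_casava_single_lane(name)]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    examples = [
        "sampleA_S1_L001_R1_001.fastq.gz",  # lane field present
        "sampleA_S1_R1_001.fastq.gz",       # lanes merged, no L### field
        "sampleA_R1.fastq.gz",              # custom naming
    ]
    for name in examples:
        print(name, matches_casava_single_lane(name))
```

Files whose names lack the `_L###_` lane field, as is likely after lanes have been merged, cannot satisfy this pattern, which is consistent with the error reported above.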
jwdebelius
(Justine Debelius)
November 12, 2019, 9:38pm
Hi @mstagliamonte,
Have you tried the manifest format? It requires a bit more work up front when importing, but I think it will make your life easier overall.
There has been some development of automatic manifest generation in both R and Python that might be of interest.
Hello all!
Since joining this community my coding skills have improved quite a lot. When I first started I made this tutorial here, which is functional but not ideal. I decided to update it to be a bit more useful for people here using Python 3. Hope it helps you all.
https://github.com/Micro-Biology/BasicBashCode/blob/master/BasicScripts/Q2_manifest_maker.py
Hosted on GitHub so I can edit it; feel free to use it however you want. If you add other formats, please share below.
Any feedback please …
Here I present my tutorial for making a "Manifest.csv" automatically using an R conda environment, to be used to import data according to the "Fastq manifest formats" part of the Importing data tutorial.
I wrote this script because we intended to run a lot of analyses through QIIME2 and I did not want to manually make a manifest file to import it every time.
This guide assumes you have a version of conda installed, and that files are named samplename.R1.fastq.gz for forward reads and sam…
Best,
Justine
Dear @jwdebelius,
Thank you for your kind answer. The script for making manifest files is also a great idea!
I was looking at the manifest file tutorial, and one of the fields requires the lane number. The files I received have already been merged by sample, so there are at least 2 lanes in each fastq file. Do you have any advice on how to include this information in the manifest file? Or should I resort to splitting the reads by lane and then importing them?
Best,
Max
EDIT: Never mind, I read through the tutorial again; maybe I don't need that column in the manifest. I will give it a try and see how it goes.
ben
(benjamin w.)
November 13, 2019, 2:26pm
Yes the manifest file should look like this:
Code should look like this:
qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path /gpfs/scratch/wub02/projects/milan.elastase.run/fastq/manifest.txt \
  --output-path /gpfs/scratch/wub02/projects/milan.elastase.run/QIIME2_2_import/paired-end-demux.qza \
  --input-format PairedEndFastqManifestPhred33V2
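The manifest itself is not reproduced in this thread. As a rough sketch of how one could be generated for `PairedEndFastqManifestPhred33V2`, the Python below writes a tab-separated file with the columns `sample-id`, `forward-absolute-filepath` and `reverse-absolute-filepath`. The naming scheme `<sample-id>_R1.fastq.gz` / `<sample-id>_R2.fastq.gz` and the helper names `collect_pairs` and `write_manifest` are assumptions made for illustration; adjust the pattern to your own files.

```python
import csv
import re
from pathlib import Path

# Assumed (hypothetical) naming scheme: <sample-id>_R1.fastq.gz and <sample-id>_R2.fastq.gz
READ_PATTERN = re.compile(r"(?P<sample>.+)_R(?P<read>[12])\.fastq\.gz")

# Column names of the paired-end V2 manifest format.
HEADER = ["sample-id", "forward-absolute-filepath", "reverse-absolute-filepath"]


def collect_pairs(fastq_dir):
    """Group read files by sample ID: {sample: {"1": forward_path, "2": reverse_path}}."""
    pairs = {}
    for path in sorted(Path(fastq_dir).iterdir()):
        match = READ_PATTERN.fullmatch(path.name)
        if match:
            pairs.setdefault(match["sample"], {})[match["read"]] = path.resolve()
    return pairs


def write_manifest(fastq_dir, manifest_path):
    """Write a tab-separated paired-end manifest and return the number of samples."""
    pairs = collect_pairs(fastq_dir)
    # Refuse to write a manifest if any sample lacks its forward or reverse file.
    incomplete = [s for s, reads in pairs.items() if set(reads) != {"1", "2"}]
    if incomplete:
        raise ValueError(f"Samples missing a forward or reverse file: {incomplete}")
    with open(manifest_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(HEADER)
        for sample, reads in pairs.items():
            writer.writerow([sample, str(reads["1"]), str(reads["2"])])
    return len(pairs)
```

Calling, for example, `write_manifest("fastq", "fastq/manifest.txt")` (both paths hypothetical) would produce a file that can be passed to `--input-path` in the command above. No lane column is involved, so reads merged across lanes need no special handling here.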
Great!
I have run all the way to deblur and everything seems to work. Thank you for guiding me through it; I appreciate your kind help.
Best,
Max