Importing Casava 1.8 paired-end demultiplexed fastq.gz files

Good morning,

I am trying to import Casava 1.8 paired-end demultiplexed FASTQ data, downloaded from Illumina BaseSpace, into QIIME 2 version 2021.4 in a conda environment on macOS. However, I am unsure of the syntax I need to use. I am trying to import the whole folder, but I am not sure which parts of the import template from the QIIME 2 docs I need to change. This is the command I ran, followed by the error it produced:

qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path /Users/SoilMicrobiologyWSU/Nick/M06339_Run131_Neely_MS3064010-242560318/FASTQ_Generation_2021-04-04_08_49_18Z-398026631 \
  --input-format CasavaOneEightSingleLanePerSampleDirFmt \
  --output-path demux-paired-end.qza
There was a problem importing /Users/SoilMicrobiologyWSU/Nick/M06339_Run131_Neely_MS3064010-242560318/FASTQ_Generation_2021-04-04_08_49_18Z-398026631:

  Unrecognized file (/Users/SoilMicrobiologyWSU/Nick/M06339_Run131_Neely_MS3064010-242560318/FASTQ_Generation_2021-04-04_08_49_18Z-398026631/B4_ITS2_L001-ds.73754145408645b6bd9901fe7f209eae/B4-ITS2_S326_L001_R1_001.fastq) for CasavaOneEightSingleLanePerSampleDirFmt.

I tried editing the Casava line to match my data, but that failed as well. I am also trying to import the entire folder, which holds both my 16S and ITS data files. Is this the correct approach, or do I need to list every individual file in a format like sample ID, absolute path, direction? I am very new to Python/QIIME and have reviewed tutorials online, but I am still unsure where to modify the template code. Any help or resources would be appreciated. Thank you!
-Nick

Hi @Nick.622,

I think the problem is not the path but the file format. You are trying to import uncompressed .fastq files, and for the Casava format they need to be gzip-compressed as .fastq.gz files. But I am not sure about that :sweat_smile:.
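If compression does turn out to be the issue, a minimal Python sketch like the one below could gzip the files for you. It walks the run folder, including the per-sample subfolders, and writes a .fastq.gz copy of every .fastq file into one flat output folder. The flat layout is my assumption: as far as I know, the Casava format expects all the files to sit directly in the input folder. The function name `gzip_fastqs` and both folder arguments are made up for illustration.

```python
import gzip
import shutil
from pathlib import Path


def gzip_fastqs(source_dir, dest_dir):
    """Write a gzip-compressed copy of every .fastq under source_dir into dest_dir.

    Hypothetical helper, not part of QIIME 2. Files keep their original
    names with ".gz" appended, and the originals are left untouched.
    """
    source = Path(source_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    written = []
    # rglob descends into subfolders; sorted() collects the list up front.
    for fastq in sorted(source.rglob("*.fastq")):
        # Files with identical names in different subfolders would
        # overwrite each other here, so names are assumed to be unique.
        target = dest / (fastq.name + ".gz")
        with open(fastq, "rb") as raw, gzip.open(target, "wb") as packed:
            shutil.copyfileobj(raw, packed)
        written.append(target)
    return written
```

You would call it as `gzip_fastqs("/path/to/run_folder", "/path/to/flat_folder")` with your own paths, then point `--input-path` at the flat folder.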

I was looking at the Importing section of the docs and found the part on "Fastq manifest formats"; maybe it can help you?

Sorry I can't be more helpful.

Best,

Elsa


I think I am confused about the manifest file. Do I need to create an Excel sheet saved as a .tsv with three columns: sample-id, forward-absolute-filepath, and reverse-absolute-filepath? And then paste in the file paths preceded by $PWD? Thanks!

Hi @Nick.622,

In the manifest file, you describe the location of your input data. So, for paired-end samples, you only need those three columns! And yes, the file format is ".tsv". The importing section says:

The manifest file is a tab-separated (i.e., .tsv) text file. The first column defines the Sample ID, while the second (and optional third) column defines the absolute filepath to the forward (and optional reverse) reads. All of the rules and behavior of this format are inherited from the QIIME 2 Metadata format.

Personally, I prefer writing out the full absolute file path, but I guess the $PWD approach works too!
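If typing the paths by hand gets tedious, a small Python script could build the manifest instead of Excel. The sketch below is only an illustration under some assumptions: the reads are already gzipped, the file names follow the Casava 1.8 pattern seen in the error message (for example B4-ITS2_S326_L001_R1_001.fastq.gz), and the sample ID is everything before the `_S326` part. The function name `build_manifest` and the regular expression are mine, not something from QIIME 2.

```python
import csv
import re
from pathlib import Path

# Assumed Casava 1.8 naming: <sample>_S<number>_L<lane>_R<1|2>_001.fastq.gz
CASAVA_NAME = re.compile(
    r"^(?P<sample>.+)_S\d+_L\d{3}_R(?P<read>[12])_001\.fastq\.gz$"
)


def build_manifest(fastq_dir, manifest_path):
    """Write a paired-end manifest .tsv and return the sample IDs it lists.

    Hypothetical helper. Samples missing either the forward (R1) or the
    reverse (R2) file are skipped.
    """
    pairs = {}
    for path in sorted(Path(fastq_dir).rglob("*.fastq.gz")):
        match = CASAVA_NAME.match(path.name)
        if match is None:
            continue  # ignore files that do not follow the assumed pattern
        # resolve() gives the absolute path that the manifest requires
        pairs.setdefault(match["sample"], {})[match["read"]] = str(path.resolve())

    complete = sorted(s for s, reads in pairs.items() if len(reads) == 2)
    with open(manifest_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(
            ["sample-id", "forward-absolute-filepath", "reverse-absolute-filepath"]
        )
        for sample in complete:
            writer.writerow([sample, pairs[sample]["1"], pairs[sample]["2"]])
    return complete
```

Calling `build_manifest("/path/to/reads", "manifest.tsv")` with your own folder should give a file you can pass to `qiime tools import` as `--input-path`. I believe the matching `--input-format` is `PairedEndFastqManifestPhred33V2`, but please check that against the importing docs for your version.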

Also, I think this post is interesting (in case you are wondering about the differences between metadata and manifest files):

I hope I have been helpful!!

Best,

Elsa


Awesome, thank you for the answer!

Nick
