Issues with converting fastq.gz files to qza files with the updated version of qiime2-2022.2

rr220 · March 1, 2022, 9:54pm

I have a total of 40 samples, each with both forward and reverse .fastq.gz files. The file naming format is M-CON-10_S10_L001_R1_001.fastq.gz and M-CON-10_S10_L001_R2_001.fastq.gz, where R1 = forward and R2 = reverse. I have saved all of these in a folder named rawreads.
When I type the command mentioned below:
qiime tools import --type EMPPairedEndSequences --input-path rawreads --output-path rawreads.qza

I get the following error message, even though no files are missing:

There was a problem importing rawreads:
Missing one or more files for EMPPairedEndDirFmt: 'forward.fastq.gz'

jwdebelius · March 1, 2022, 10:00pm

Hi @rr220,

Welcome to the QIIME 2 forum!

It sounds like your reads have already been demultiplexed. I would recommend checking the importing data tutorial. My favorite way to import paired end demultiplexed sequences is to use the manifest format.
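
For readers following along, here is a minimal sketch of the manifest route, not a definitive recipe. The original command failed because EMPPairedEndSequences expects still-multiplexed data (files named forward.fastq.gz, reverse.fastq.gz, and barcodes.fastq.gz), whereas this folder holds one file pair per sample. The helper names make_manifest and import_manifest are hypothetical, the sample ID is assumed to be the part of the file name before _S<number>, and Phred33 quality scores are assumed.

```shell
#!/usr/bin/env bash
# Sketch: build a paired-end manifest from a folder of
# *_R1_001.fastq.gz / *_R2_001.fastq.gz files, then import it.
# make_manifest and import_manifest are hypothetical helpers, not QIIME 2 commands.

make_manifest() {
  local dir="$1" abs_dir fwd rev base sample
  abs_dir="$(cd "$dir" && pwd)"   # the manifest needs absolute file paths

  # Tab-separated header row used by the V2 paired-end manifest format.
  printf 'sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n'

  for fwd in "$abs_dir"/*_R1_001.fastq.gz; do
    [ -e "$fwd" ] || continue     # skip the literal pattern when nothing matches
    rev="${fwd%_R1_001.fastq.gz}_R2_001.fastq.gz"
    if [ ! -e "$rev" ]; then
      echo "warning: no reverse file for $fwd" >&2
      continue
    fi
    base="$(basename "$fwd")"
    # Assumption: the sample ID is everything before the last _S<number> part.
    sample="${base%_S[0-9]*}"
    printf '%s\t%s\t%s\n' "$sample" "$fwd" "$rev"
  done
}

import_manifest() {
  # $1 = manifest file, $2 = output artifact. Phred33 is an assumption here.
  qiime tools import \
    --type 'SampleData[PairedEndSequencesWithQuality]' \
    --input-path "$1" \
    --output-path "$2" \
    --input-format PairedEndFastqManifestPhred33V2
}

# Example use (run inside an activated QIIME 2 environment):
#   make_manifest rawreads > manifest.tsv
#   import_manifest manifest.tsv demux-paired-end.qza
```

It is worth opening manifest.tsv before importing to confirm that the sample IDs look right and that every sample has both a forward and a reverse path.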

Best,
Justine

rr220 · March 2, 2022, 12:19am

Thank you, Justine! I will check the importing data tutorial and also try the manifest format. I will update the forum if it works, or follow up with a query if it does not.

Thanks a ton!
Best,
R

rr220 · March 4, 2022, 3:12am

Thank you. The manifest format worked. After that, I ran the command pasted below:
qiime dada2 denoise-paired --i-demultiplexed-seqs demux-paired-end.qza --o-table table.qza --o-representative-sequences reps-seqs.qza --p-trunc-len-f 270 --p-trunc-len-r 220 --o-denoising-stats denoising-stats.qza

I am not getting any error message with this, but it's taking forever to run in the terminal (Mac). Is there something wrong with my code? I have a total of 40 sequences; is it expected to run past 4 hours?
Please guide if you can.

Best,
R

jwdebelius · March 4, 2022, 2:22pm

Hi @rr220,

I'm glad it worked. In the future, I'd recommend moving an issue like this to a new topic. That's a little weird that it's taking so long, although you might be having issues training on 40 sequences. If that's the case, I would suggest using deblur. If "sequences" was a typo and you mean 40 samples, then maybe that run time is expected. I typically start my Illumina runs before bed, keep my MacBook on, and let them run all night. (It's only bad when something errors late in the process.) Keep in mind that DADA2 trains its model on your existing data, which happens serially, and then performs denoising. It's going to take a while.
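
One thing that may shorten a long run, offered as a sketch rather than something confirmed in this thread: denoise-paired accepts --p-n-threads, where 0 asks it to use all available cores, and --verbose prints progress so the terminal does not sit silent for hours. The wrapper name denoise_paired is hypothetical; the file names and truncation lengths are the ones from the command above.

```shell
# Sketch: the same denoising call, with multithreading and progress output.
# denoise_paired is a hypothetical wrapper, not a QIIME 2 command.
denoise_paired() {
  local threads="${1:-0}"   # 0 = use all available cores
  qiime dada2 denoise-paired \
    --i-demultiplexed-seqs demux-paired-end.qza \
    --p-trunc-len-f 270 \
    --p-trunc-len-r 220 \
    --p-n-threads "$threads" \
    --o-table table.qza \
    --o-representative-sequences reps-seqs.qza \
    --o-denoising-stats denoising-stats.qza \
    --verbose
}

# Example use (run inside an activated QIIME 2 environment):
#   denoise_paired      # all cores
#   denoise_paired 4    # four threads
```

How much this helps depends on the machine and the data, so an overnight run may still be the practical choice.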

Best,
Justine

rr220 · March 4, 2022, 2:40pm

Hi, thank you for your response. They are 40 samples, so 80 FASTQ files. It took over 7 hours to get output from the command I shared. In the future I will also start this work at night and leave it running until morning. I appreciate your help.

Additionally, I did make this a new topic and posted. Just in case someone has a similar question, they can easily search for it.

Thank you!

jwdebelius · March 4, 2022, 3:12pm

Hi @rr220,

Then I'm going to encourage you to follow up in that new topic, so you're not getting answers from two of our volunteer mods. I know it's your first time, so maybe double-check the CoC about this issue. (Most of us are full-time researchers, teachers, or developers.) And Mehrbod is one of the best people on the forum for DADA2, IMO.

Best,
Justine

rr220 · March 4, 2022, 4:09pm

Thank you for your help and suggestions. I am new to this and will make sure to keep these instructions in mind.
