Plugin error from itsxpress: Missing sequence for record beginning on line 5

Praveenth · July 9, 2023, 9:49am

Dear Sir/ Madam,

Please see the following error report:
Plugin error from itsxpress: /tmp/q2-CasavaOneEightSingleLanePerSampleDirFmt-zkswhuwu/ITSTW31_S267_L001_R2_001.fastq.gz is not a(n) FastqGzFormat file: Missing sequence for record beginning on line 5 Debug info has been saved to /tmp/qiime2-q2cli-err-nn4micb2.log

I went through all the threads about similar issues, but nothing is working.

My itsxpress version
itsxpress 1.8.0 pyhdfd78af_2 bioconda
q2-itsxpress 1.8.0 pypi_0 pypi

I finally settled on qiime dada2 denoise-single using the forward reads, but I am not sure whether I used the correct trimming lengths.
[Screenshot from 2023-07-09 07-43-17]

Is it possible to identify all of the records with missing sequences? I could then remove them and run q2-itsxpress again, because I have used q2-itsxpress for analyzing all my other data. Please help me.

Thank you and Kind regards

szymanski · July 10, 2023, 3:06pm

Could you share the debug info that was saved to /tmp/qiime2-q2cli-err-nn4micb2.log? That might help others who are trying to help with this.

Also, could you share the earlier steps in your read processing? I remember having issues with itsxpress that produced similar messages. If you used cutadapt or anything else that could have filtered reads, that could be causing this error downstream.
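One way to check whether an earlier trimming or filtering step left empty reads behind is to scan the gzipped FASTQ file directly. The sketch below is only an approximation of the four-line FASTQ layout and is not QIIME 2's own validator; the function name `find_bad_records` is hypothetical. The error text suggests that the record starting on line 5 has an empty sequence line, which is the first thing this reports.

```python
import gzip
from typing import Iterator, Tuple


def find_bad_records(path: str) -> Iterator[Tuple[int, str]]:
    """Yield (starting_line, problem) for each malformed 4-line FASTQ record.

    Hypothetical helper: it assumes one record per four lines and reports
    only the first problem found in each record.
    """
    with gzip.open(path, "rt") as handle:
        line_no = 1
        while True:
            header = handle.readline()
            if not header:
                break  # end of file
            seq = handle.readline()
            sep = handle.readline()
            qual = handle.readline()
            start = line_no
            line_no += 4
            if not header.startswith("@"):
                yield start, "header does not start with '@'"
            elif seq.strip() == "":
                yield start, "missing sequence"
            elif not sep.startswith("+"):
                yield start, "separator does not start with '+'"
            elif len(qual.strip()) != len(seq.strip()):
                yield start, "quality length differs from sequence length"
```

Calling `list(find_bad_records("ITSTW31_S267_L001_R2_001.fastq.gz"))` on the file named in the error would list every affected record. For paired-end data, any record dropped from the R2 file would also need to be dropped from the matching R1 file so the two stay in the same order.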

This last part isn't tech support, but I think it is worth saying: in my experience, and from talking to other people who have worked extensively with ITS amplicon data, using just the forward reads is often perfectly fine. The reverse reads typically have worse quality. If you amplified the entire ITS region (both ITS1 and ITS2, which happens with primer pairs like ITS1f/ITS4), you will most likely lose a massive number of reads if you try to use dada2 denoise-paired. Many reads won't be able to merge, both because quality is poor in the region that would overlap and because the ITS1-5.8S-ITS2 stretch can be well over 600 bp.

This is mostly to say that you will likely be okay if you proceed with denoising just the forward reads!
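The merging argument comes down to simple arithmetic, sketched below. The read lengths are hypothetical, since the thread does not state the truncation lengths used, and the 12-base minimum is an assumption based on the default `minOverlap` of DADA2's `mergePairs` as I understand it.

```python
def expected_overlap(fwd_len: int, rev_len: int, amplicon_len: int) -> int:
    """Number of bases shared by the two reads; negative means a gap."""
    return fwd_len + rev_len - amplicon_len


def can_merge(fwd_len: int, rev_len: int, amplicon_len: int,
              min_overlap: int = 12) -> bool:
    """True if the reads overlap by at least min_overlap bases.

    min_overlap=12 is an assumed default, not a value from this thread.
    """
    return expected_overlap(fwd_len, rev_len, amplicon_len) >= min_overlap
```

With reads truncated to 250 bp each, a 600 bp amplicon gives `expected_overlap(250, 250, 600) == -100`, so the pair cannot merge however good the quality is, while a 450 bp amplicon would leave 50 bases of overlap.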

Praveenth · July 11, 2023, 1:24am

Hi @szymanski

Thank you for your response.

Please see the following:

I used FWD ITS1 and REV ITS2 primers

| Primer | Sequence |
|---|---|
| Fwd | CTTGGTCATTTAGAGGAAGTAA |
| Rev | GCTGCGTTCTTCATCGATGC |

qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path casava-18-paired-end-demultiplexed \
  --input-format CasavaOneEightSingleLanePerSampleDirFmt \
  --output-path demux-paired-end.qza

(qiime2-2022.2) qiime2@qiime2core2022-2:~/Whang_ITS_Files/Dada2_paired$ qiime itsxpress trim-pair-output-unmerged \
  --i-per-sample-sequences demux-paired-end.qza \
  --p-region ITS1 \
  --p-taxa F \
  --o-trimmed trimmed.qza

Plugin error from itsxpress:

/tmp/q2-CasavaOneEightSingleLanePerSampleDirFmt-3_uezber/ITSTW31_S267_L001_R2_001.fastq.gz is not a(n) FastqGzFormat file:

Missing sequence for record beginning on line 5

Debug info has been saved to /tmp/qiime2-q2cli-err-iohomybv.log

(qiime2-2022.2) qiime2@qiime2core2022-2:~/Whang_ITS_Files/Dada2_paired$ head /tmp/qiime2-q2cli-err-iohomybv.log
vsearch v2.7.0_linux_x86_64, 14.3GB RAM, 2 cores

Reading file /tmp/itsxpress_cqgchnnh/seq.fq.gz 100%
3896378 nt in 10875 seqs, min 36, max 545, avg 358
Masking 100%
Sorting by abundance 100%
Counting k-mers 100%
Clustering 100%
Sorting clusters 100%

Thank you for the information and suggestions. I will proceed with denoising using only the forward reads.
Kind regards
Prav
