"Missing sequence for record"

Hello, I seem to be hitting the same ITSxpress error, using v2.1.4. The job ran to completion, and based on the log ITSxpress appeared to be working: the samples were processed and trimmed. However, the expected output file, itsx-trimmed-seqs.qza, was never generated.

From the log, the FASTQ file BB_50_91_L001_R1_001.fastq.gz was read and processed successfully:

51794601 nt in 188235 seqs, min 33, max 300, avg 275
minseqlength 32: 41 sequences discarded.
Masking 100%
Sorting by abundance 100%
Counting k-mers 100%
Clustering 100%
Sorting clusters 100%
Writing clusters 100%
Clusters: 28736 Size min 1, max 55755, avg 6.6
Singletons: 21902, 11.6% of seqs, 76.2% of clusters

Despite this, QIIME2 still flagged the file as not in FastqGzFormat with the following error:

Traceback (most recent call last):
File "/sw/eb/sw/QIIME2/2024.10-Amplicon-itsxpress/lib/python3.10/site-packages/qiime2/plugin/model/file_format.py", line 26, in validate
self.validate(level)
File "/sw/eb/sw/QIIME2/2024.10-Amplicon-itsxpress/lib/python3.10/site-packages/q2_types/_util.py", line 97, in validate
self._check_n_records(record_count_map[level])
File "/sw/eb/sw/QIIME2/2024.10-Amplicon-itsxpress/lib/python3.10/site-packages/q2_types/_util.py", line 66, in _check_n_records
raise ValidationError('Missing sequence for record '
qiime2.core.exceptions.ValidationError: Missing sequence for record beginning on line 13

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/sw/eb/sw/QIIME2/2024.10-Amplicon-itsxpress/lib/python3.10/site-packages/q2cli/commands.py", line 530, in call
results = self._execute_action(
File "/sw/eb/sw/QIIME2/2024.10-Amplicon-itsxpress/lib/python3.10/site-packages/q2cli/commands.py", line 602, in _execute_action
results = action(**arguments)
File "", line 2, in trim_single
File "/sw/eb/sw/QIIME2/2024.10-Amplicon-itsxpress/lib/python3.10/site-packages/qiime2/sdk/action.py", line 299, in bound_callable
outputs = self.callable_executor(
File "/sw/eb/sw/QIIME2/2024.10-Amplicon-itsxpress/lib/python3.10/site-packages/qiime2/sdk/action.py", line 587, in callable_executor
self.signature.coerce_given_outputs(output_views, output_types,
File "/sw/eb/sw/QIIME2/2024.10-Amplicon-itsxpress/lib/python3.10/site-packages/qiime2/core/type/signature.py", line 498, in coerce_given_outputs
output = self._create_output_artifact(
File "/sw/eb/sw/QIIME2/2024.10-Amplicon-itsxpress/lib/python3.10/site-packages/qiime2/core/type/signature.py", line 520, in _create_output_artifact
artifact = qiime2.sdk.Artifact._from_view(
File "/sw/eb/sw/QIIME2/2024.10-Amplicon-itsxpress/lib/python3.10/site-packages/qiime2/sdk/result.py", line 374, in _from_view
result = transformation(view, validate_level)
File "/sw/eb/sw/QIIME2/2024.10-Amplicon-itsxpress/lib/python3.10/site-packages/qiime2/core/transform.py", line 68, in transformation
self.validate(view, level=validate_level)
File "/sw/eb/sw/QIIME2/2024.10-Amplicon-itsxpress/lib/python3.10/site-packages/qiime2/core/transform.py", line 143, in validate
view.validate(level)
File "/sw/eb/sw/QIIME2/2024.10-Amplicon-itsxpress/lib/python3.10/site-packages/qiime2/plugin/model/directory_format.py", line 177, in validate
getattr(self, field)._validate_members(collected_paths, level)
File "/sw/eb/sw/QIIME2/2024.10-Amplicon-itsxpress/lib/python3.10/site-packages/qiime2/plugin/model/directory_format.py", line 107, in _validate_members
self.format(path, mode='r').validate(level)
File "/sw/eb/sw/QIIME2/2024.10-Amplicon-itsxpress/lib/python3.10/site-packages/qiime2/plugin/model/file_format.py", line 28, in validate
raise ValidationError(
qiime2.core.exceptions.ValidationError: /tmp/job.665345/qiime2/u.tb252725/processes/234998-1739298337.56@u.tb252725/tmp/q2-OutPath-xgkwpal0/BB_50_91_L001_R1_001.fastq.gz is not a(n) FastqGzFormat file:

Missing sequence for record beginning on line 13

Thanks for reaching out @Tawny_Bolinas, and sorry for the slow reply! We must have lost track of this.

@Adam_Rivers - do you have any ideas on what might be going on here? Based on this error, it sounds like QIIME 2 might not like something about a fastq.gz file that is being created by ITSxpress 2. I'd be happy to help debug this with you if that's what's happening.

Hi @Tawny_Bolinas, are you able to share your input data? If so, I can try to reproduce the error. Could you also let me know the results of running conda list? In the meantime I'll see if I can replicate it with other datasets.

In the past this has happened when someone's dataset has a sample where, after merging, there are no reads left. If that's the case, we can potentially add some better error handling for that edge case.
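A rough way to check for samples with no reads is to count the records in each file. This is only a sketch, not part of ITSxpress or QIIME 2: the function names are made up for illustration, and it assumes you have a directory of plain 4-line .fastq.gz files on disk (for example after unpacking the artifact with qiime tools export). It looks at the files you point it at, so it cannot see ITSxpress's intermediate state after merging.

```python
import gzip
from pathlib import Path


def count_fastq_records(path):
    """Return the number of 4-line records in a gzipped FASTQ file."""
    with gzip.open(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def find_empty_samples(directory):
    """List the .fastq.gz files in `directory` that contain no reads."""
    return sorted(
        p.name
        for p in Path(directory).glob("*.fastq.gz")
        if count_fastq_records(p) == 0
    )


if __name__ == "__main__":
    # "exported-seqs" is a hypothetical directory name.
    for name in find_empty_samples("exported-seqs"):
        print("no reads in", name)
```

Any file it reports would be worth dropping or investigating before rerunning.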

Hello, @Adam_Rivers, here's the error log when running itsxpress

qiime2-q2cli-err-2el0f_ke.txt (54.0 KB)

I have the input data in .qza format. How can I best upload it?

If you want to put it somewhere like Google Drive and then direct message me the link, I can download it.

I've sent the link. Please let me know if there's anything else needed. Thank you!

Hi @Tawny_Bolinas, I tracked down the issue. Sometimes, when doing single-end analysis with --p-cluster-id set to less than 1, a read in a sample ends up with a length of 0, which causes the validation error. As a quick workaround, setting --p-cluster-id to 1 solved this with your test data. I'm in the middle of fixing this, but I decided to clean up some other things for the next release, so it may take me a bit longer to push a fix.
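For anyone who wants to confirm they are seeing the same zero-length-read case, here is a minimal sketch that scans a .fastq.gz for records with an empty sequence line. It is not the validator QIIME 2 uses; the function name is hypothetical, it assumes plain 4-line records, and the choice to report the line where each record begins is only meant to resemble the "beginning on line N" wording in the error.

```python
import gzip


def find_empty_records(path):
    """Return the 1-based starting line of every record with an empty sequence."""
    empty = []
    with gzip.open(path, "rt") as handle:
        for line_no, line in enumerate(handle, start=1):
            # In a 4-line FASTQ record the sequence is the second line.
            if line_no % 4 == 2 and line.strip() == "":
                empty.append(line_no - 1)  # header line of this record
    return empty


if __name__ == "__main__":
    # Hypothetical file name; point this at any .fastq.gz you can get hold of.
    print(find_empty_records("trimmed_sample.fastq.gz"))
```

An empty list means no zero-length reads were found in that file.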
