ITSxpress error: Missing sequence for record beginning on line ...

Okay, this happened because many of your reverse mate-pair reads are very short. Illumina now offers trimming options during data acquisition, so if your sequence provider has enabled in-run trimming, you no longer get 300 bases for every read on a 2x300 run.

In some cases, if the input read is very short, ITSxpress will trim away the 5.8S region of your reverse read and nothing will be left. The command-line version of ITSxpress just writes an empty reverse read to the FASTQ file without complaining, but QIIME has a validation step that raises the error you encountered if any read has length 0.
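If you want to confirm that this is what is happening in your own files, a quick check like the one below can count the zero-length records. This is only a rough sketch: `count_empty_reads` is a name I made up for illustration (it is not part of ITSxpress or QIIME), and it assumes plain four-line FASTQ records, optionally gzipped.

```python
import gzip
from itertools import islice


def count_empty_reads(path):
    """Return (total_records, zero_length_records) for a four-line FASTQ file.

    Hypothetical helper for illustration only; not part of ITSxpress or QIIME.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    total = 0
    empty = 0
    with opener(path, "rt") as handle:
        while True:
            # Read one record: header, sequence, separator, quality.
            record = list(islice(handle, 4))
            # Stop at end of file or at trailing blank lines.
            if not record or not record[0].strip():
                break
            total += 1
            # A record whose sequence line is blank (or missing) has length 0.
            if len(record) < 2 or record[1].strip() == "":
                empty += 1
    return total, empty
```

Running it on the trimmed reverse-read file should tell you how many records would trip the validation step.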

For your data, my advice is to use only your forward reads and run ITSxpress and DADA2 in single-read mode.

I will release a new version of ITSxpress shortly that will not write a trimmed mate-pair if either read is length 0 and will raise a warning informing the user of how many 0-length reads are encountered.
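Roughly, the behaviour I have in mind looks like the sketch below. This is not the actual ITSxpress code: `drop_empty_pairs` and the `(header, sequence, quality)` tuple layout are assumptions made up for the example, and the real release may differ.

```python
import warnings


def drop_empty_pairs(forward, reverse):
    """Keep only read pairs in which both mates have a non-empty sequence.

    forward and reverse are equal-length lists of (header, sequence, quality)
    tuples in the same order. Hypothetical illustration, not ITSxpress code.
    """
    kept = []
    dropped = 0
    for fwd, rev in zip(forward, reverse):
        # Skip the whole pair if either trimmed mate has length 0.
        if len(fwd[1]) == 0 or len(rev[1]) == 0:
            dropped += 1
            continue
        kept.append((fwd, rev))
    if dropped:
        # Tell the user how many pairs were left out instead of failing later.
        warnings.warn(
            f"{dropped} read pair(s) not written because a mate had length 0"
        )
    return kept, dropped
```

Dropping the pair rather than just the empty mate keeps the forward and reverse files in sync, which paired-end tools downstream expect.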

Thanks to @seinarsson for working with me to figure this out.
