Difference between min and max levels of validation

Hello there,

I was trying to validate my imported data in QIIME 2. When I use the min level of validation (qiime tools validate felipe-imported.qza --level min) my data seem to be fine, but when I use the max level of validation (qiime tools validate felipe-imported.qza) I get this error message:

Traceback (most recent call last):
File "/home/qiime2/miniconda/envs/qiime2-2019.4/lib/python3.6/site-packages/q2cli/builtin/tools.py", line 404, in validate
result.validate(level)
File "/home/qiime2/miniconda/envs/qiime2-2019.4/lib/python3.6/site-packages/qiime2/sdk/result.py", line 319, in validate
self.format.validate(self.view(self.format), level)
File "/home/qiime2/miniconda/envs/qiime2-2019.4/lib/python3.6/site-packages/qiime2/plugin/model/directory_format.py", line 171, in validate
getattr(self, field)._validate_members(collected_paths, level)
File "/home/qiime2/miniconda/envs/qiime2-2019.4/lib/python3.6/site-packages/qiime2/plugin/model/directory_format.py", line 101, in _validate_members
self.format(path, mode='r').validate(level)
File "/home/qiime2/miniconda/envs/qiime2-2019.4/lib/python3.6/site-packages/qiime2/plugin/model/file_format.py", line 24, in validate
self.validate(level)
File "/home/qiime2/miniconda/envs/qiime2-2019.4/lib/python3.6/site-packages/q2_types/per_sample_sequences/_format.py", line 279, in validate
self._check_n_records(record_count_map[level])
File "/home/qiime2/miniconda/envs/qiime2-2019.4/lib/python3.6/site-packages/q2_types/per_sample_sequences/_format.py", line 239, in _check_n_records
for i, record in file_:
File "/home/qiime2/miniconda/envs/qiime2-2019.4/lib/python3.6/gzip.py", line 289, in read1
return self._buffer.read1(size)
File "/home/qiime2/miniconda/envs/qiime2-2019.4/lib/python3.6/_compression.py", line 68, in readinto
data = self.read(len(byte_view))
File "/home/qiime2/miniconda/envs/qiime2-2019.4/lib/python3.6/gzip.py", line 482, in read
raise EOFError("Compressed file ended before the "
EOFError: Compressed file ended before the end-of-stream marker was reached

An unexpected error has occurred while attempting to validate result felipe-imported.qza:

Compressed file ended before the end-of-stream marker was reached

See above for debug info.

What are the differences between these levels of validation? Is it ok to continue my analysis knowing that at max level validation there was an error?

Thanks for the help,
Felipe.

Great questions!

This depends on the artifact's semantic type, but generally min checks only the first few lines of each file, while max checks the entire file. min is used as a "quick check" when importing most types, but it is often good to run a max validation, as you have done, to confirm the integrity of your files.
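To make the difference concrete, here is a rough sketch of why a quick check can pass on a file that a full check rejects. This is not QIIME 2's actual validation code; the helper names (`count_fastq_records`, `write_truncated_fastq_gz`, `demo`) and the synthetic reads are made up for illustration. It writes a deliberately truncated fastq.gz file, then reads it once with a small record limit and once to the end.

```python
import gzip
import itertools
import os
import random
import tempfile


def count_fastq_records(path, max_records=None):
    """Count FASTQ records (4 lines each) in a gzip-compressed file.

    max_records=None reads to the end of the file (a "max"-style pass);
    a small number stops early (a "min"-style quick check).
    """
    count = 0
    with gzip.open(path, "rt") as handle:
        while max_records is None or count < max_records:
            record = list(itertools.islice(handle, 4))
            if not record:
                break
            count += 1
    return count


def write_truncated_fastq_gz(path, n_records=5000, seed=0):
    """Write synthetic reads, then keep only the first half of the gzip bytes.

    This imitates an interrupted download; the data are invented.
    """
    rng = random.Random(seed)
    records = []
    for i in range(n_records):
        seq = "".join(rng.choices("ACGT", k=100))
        records.append(f"@read{i}\n{seq}\n+\n{'I' * 100}\n")
    data = gzip.compress("".join(records).encode("ascii"))
    with open(path, "wb") as out:
        out.write(data[: len(data) // 2])


def demo():
    """Return (records read by the quick check, whether the full pass succeeded)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "truncated.fastq.gz")
        write_truncated_fastq_gz(path)
        quick = count_fastq_records(path, max_records=5)
        try:
            count_fastq_records(path)
            full_ok = True
        except EOFError:
            # Same failure as in the traceback: the gzip stream ends early.
            full_ok = False
        return quick, full_ok
```

On the truncated file, the quick check reads its five records without complaint, while the full pass hits the same EOFError shown in the traceback. That is why a clean min validation does not tell you the whole file is intact.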

No, it is not OK to continue. In your case, it looks like the file was corrupted somehow, perhaps by an interrupted download. You should re-download and re-import, then re-validate before proceeding.


Hi, Nicholas!

That is kind of you :grin:

Alright, I get the difference between them, thanks.
I have done every step again as you suggested and now my max validation is valid. Yaay, thank you so much for the insight :smiley:


min (typically) validates only the first few records of the file, while max (typically) validates the entire file. The details depend on the format of the data being validated, but that is the general strategy.

NO.

:stop_sign:

This error message is telling you that there are incomplete fastq.gz files in your qza file, which means your data is incomplete. Please re-acquire the files from your sequencing center and re-import them. I suggest you use something like the md5sum of each file to confirm that you have fully acquired it prior to import.
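If you would rather script the checksum comparison than run md5sum by hand, something like the following should work. It is only a sketch: it assumes your sequencing center gave you an MD5 hex string for each file, and the function names (`md5_of_file`, `find_mismatches`) and the `expected` mapping are hypothetical, not part of QIIME 2.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(expected):
    """Return the paths whose MD5 differs from the expected value.

    `expected` maps a local file path to the MD5 hex string published
    for that file (hypothetical input supplied by you).
    """
    mismatches = []
    for path, want in expected.items():
        if md5_of_file(path) != want.strip().lower():
            mismatches.append(path)
    return mismatches
```

An empty list from `find_mismatches` means every file matches its published checksum. Any path that is returned should be downloaded again before you import.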

Keep us posted! :t_rex: :qiime2:
