QIIME1DemuxFormat is not compatible with FASTQ files

With the addition of QIIME1DemuxFormat as a source format, QIIME 2 can now import QIIME 1 demultiplexed FASTA files (i.e. the seqs.fna produced by split_libraries_fastq.py). However, this does not seem to work for the equivalent FASTQ files (i.e. seqs.fastq).

To illustrate, this works when the input is a FASTA file (no quality scores),

qiime tools import \
    --type SampleData[Sequences] \
    --input-path seqs.fna \
    --output-path demux.qza \
    --source-format QIIME1DemuxFormat

but not when the input is a FASTQ file (with quality scores),

qiime tools import \
    --type SampleData[SequencesWithQuality] \
    --input-path seqs.fastq \
    --output-path demux.qza \
    --source-format QIIME1DemuxFormat

for which it yields the following error:

An unexpected error has occured:

  No transformation from <class
  'q2_types.per_sample_sequences._format.QIIME1DemuxFormat'> to <class '
  q2_types.per_sample_sequences._format.SingleLanePerSampleSingleEndFast
  qDirFmt'>

See above for debug info.

Is my import command correct? If yes, is the error due to the lack of a transformation that enables sourcing a QIIME1DemuxFormat into SampleData[SequencesWithQuality]? Would it be possible to code such a transformation in upcoming releases?

There are two related discussions regarding this issue.

  1. One user suggested the manifest file approach for importing data. However, this assumes that the library of sequences is demultiplexed into multiple files, one for each sample. This is not always the case for files previously demultiplexed in Qiime1 where usually a single file contains all demultiplexed samples.
  2. Another user has created a clever way to circumvent this problem but this somewhat unnecessarily requires splitting the files via Qiime1 and creating a manifest file.

Thanks again for your work on QIIME! It has been really nice to witness your progress in the newest release.


@firasmidani, the method I had proposed as a workaround was to use QIIME 1.9.1’s split_sequence_file_on_sample_ids to split the result of split_libraries_fastq.py into per-sample files. I think that approach should still work as a duct-tape solution, but I have not verified that it works on the latest release of QIIME 2.
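If QIIME 1 isn't handy, the same split-then-manifest idea can be sketched in plain Python. This is only a rough illustration, not the QIIME 1 script itself. It assumes every record in seqs.fastq spans exactly four lines and that each header looks like `@SampleID_N ...`, so the sample ID is whatever precedes the last underscore of the first token. The function names are made up for this example, and the manifest columns (sample-id, absolute-filepath, direction) follow the single-end manifest layout as I understand it, so please check them against the importing docs for your release.

```python
import csv
import os


def sample_id_from_header(header):
    """Return the sample ID from a header such as '@SampleA_12 M001:... orig_bc=...'.

    Assumption: the first whitespace-delimited token is '<sample-id>_<counter>'.
    """
    label = header[1:].split(None, 1)[0]
    return label.rsplit("_", 1)[0]


def split_fastq_by_sample(fastq_path, out_dir):
    """Write one FASTQ file per sample and return {sample_id: absolute_path}.

    One output handle per sample stays open until the end, so a run with very
    many samples could hit the operating system's open-file limit.
    """
    os.makedirs(out_dir, exist_ok=True)
    handles = {}
    try:
        with open(fastq_path) as src:
            while True:
                record = [src.readline() for _ in range(4)]
                if not record[0].strip():
                    break  # end of file (or a trailing blank line)
                if not record[0].startswith("@") or not record[2].startswith("+"):
                    raise ValueError("not a 4-line FASTQ record: %r" % record[0])
                sample_id = sample_id_from_header(record[0])
                if sample_id not in handles:
                    path = os.path.abspath(
                        os.path.join(out_dir, sample_id + ".fastq"))
                    handles[sample_id] = (path, open(path, "w"))
                handles[sample_id][1].writelines(record)
    finally:
        for _, handle in handles.values():
            handle.close()
    return {sample_id: path for sample_id, (path, _) in handles.items()}


def write_manifest(sample_paths, manifest_path):
    """Write a single-end manifest listing one forward-read file per sample."""
    with open(manifest_path, "w", newline="") as fh:
        writer = csv.writer(fh, lineterminator="\n")
        writer.writerow(["sample-id", "absolute-filepath", "direction"])
        for sample_id in sorted(sample_paths):
            writer.writerow([sample_id, sample_paths[sample_id], "forward"])
```

Calling `write_manifest(split_fastq_by_sample("seqs.fastq", "per_sample"), "manifest.csv")` would leave a per_sample directory plus a manifest that can then be tried with the manifest-based import.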

Best,
Daniel

Hi @firasmidani!

We don't currently support importing QIIME 1 demultiplexed FASTQ files. We have an open issue tracking this feature and will follow up here when it's available in a release!

Thanks for the workaround @wasade!

Another option is to use a tool to convert your FASTQ file into FASTA, then you should be able to import it using the QIIME1DemuxFormat command you posted above. QIIME 1's convert_fastaqual_fastq.py script can be used to do that conversion (ignore the script name, it can do bidirectional conversions), and there's plenty of other tools available that can do the conversion as well if QIIME 1 isn't an option for you.
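For anyone without QIIME 1 installed, the conversion itself is small enough to sketch in Python. This is a minimal example and not what convert_fastaqual_fastq.py does internally: it assumes each record spans exactly four lines (no wrapped sequences), keeps the header text unchanged so the `SampleID_N` labels survive, and simply drops the quality line. The function name `fastq_to_fasta` is invented for this example.

```python
def fastq_to_fasta(fastq_path, fasta_path):
    """Convert a 4-line-per-record FASTQ file to FASTA; return the record count."""
    count = 0
    with open(fastq_path) as src, open(fasta_path, "w") as dst:
        while True:
            header = src.readline()
            if not header.strip():
                break  # end of file (or a trailing blank line)
            sequence = src.readline()
            separator = src.readline()
            src.readline()  # quality line, discarded
            if not header.startswith("@") or not separator.startswith("+"):
                raise ValueError("not a 4-line FASTQ record: %r" % header)
            # Swap the leading '@' for '>' and keep the rest of the header.
            dst.write(">" + header[1:].rstrip("\n") + "\n")
            dst.write(sequence.rstrip("\n") + "\n")
            count += 1
    return count
```

Running `fastq_to_fasta("seqs.fastq", "seqs.fna")` should give a file in the shape the QIIME1DemuxFormat import expects, with the caveat that the quality scores are gone.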

Let us know how it goes!

@firasmidani, the method I had proposed as a workaround was to use QIIME 1.9.1’s split_sequence_file_on_sample_ids to split the result of split_libraries_fastq.py into per-sample files. I think that approach should still work as a duct-tape solution, but I have not verified that it works on the latest release of QIIME 2.

Thanks @wasade. Over the past couple of weeks, I have been primarily using your workaround approach including your make_importable.sh script and it has been working great with the newest Qiime release (2017.9).

We don’t currently support importing QIIME 1 demultiplexed FASTQ files. We have an open issue tracking this feature and will follow up here when it’s available in a release!

@jairideout, thanks for pointing out the open issue; I did not notice it earlier.

Another option is to use a tool to convert your FASTQ file into FASTA, then you should be able to import it using the QIIME1DemuxFormat command you posted above. QIIME 1’s convert_fastaqual_fastq.py script can be used to do that conversion (ignore the script name, it can do bidirectional conversions), and there’s plenty of other tools available that can do the conversion as well if QIIME 1 isn’t an option for you.

FASTQ to FASTA conversion followed by QIIME1DemuxFormat importing works as expected, but I would like to re-run older or publicly available datasets through QIIME 2 with DADA2 denoising, which requires quality scores.

In the meantime, I will keep using @wasade's approach and will re-visit this issue at the next release. Thanks!

