Mock Community Data Import Issue


I have used QIIME 2 to analyze paired-end, multiplexed Phred33 sequences, and everything went well after importing the raw data into QIIME 2 artifacts.

Now, however, I am trying to analyze the mock-6 data (from the mockrobiota GitHub repository), which includes barcodes.fastq.gz and forward reads in FASTQ format and is Phred64-encoded.
I was able to import this multiplexed data as "EMPSingleEndSequences" type,

qiime tools import \
    --type EMPSingleEndSequences \
    --input-path mock_6_data \
    --output-path mock_6_data/mock.qza     

and after I demultiplexed the data, the quality plot looks like this,
which obviously means I didn't import the data correctly. So I am stuck at the very beginning.

Any suggestions?

Hi @Guan_Haibin,
Thank you for reporting. This is not a user error: it looks like the EMP format is not properly equipped for Phred64 data, so the quality scores are being interpreted as Phred33. I have raised an issue to get this fixed.

You could try converting to Phred33, or else @thermokarst is going to follow up with a different workaround.
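
For reference, here is a minimal sketch of what such a conversion does, in case a dedicated tool is not at hand. It assumes Illumina-style Phred+64 quality characters and strict four-line FASTQ records; the function name `phred64_to_phred33` and the file names in the usage comment are hypothetical, not part of QIIME 2. Each quality character is shifted down by 31 (64 minus 33), and all other lines pass through unchanged.

```shell
# Sketch: convert FASTQ quality lines from Phred64 to Phred33.
# Assumption: every record is exactly 4 lines (no wrapped sequence/quality lines).
phred64_to_phred33() {
  awk '
    BEGIN {
      # Lookup table: ASCII 64..126 maps to ASCII 33..95 (offset shift of 31).
      for (i = 64; i <= 126; i++) {
        map[sprintf("%c", i)] = sprintf("%c", i - 31)
      }
    }
    NR % 4 == 0 {
      # Every 4th line is a quality string: remap it character by character.
      out = ""
      n = length($0)
      for (j = 1; j <= n; j++) {
        c = substr($0, j, 1)
        out = out ((c in map) ? map[c] : c)
      }
      print out
      next
    }
    { print }  # header, sequence and "+" lines are left untouched
  '
}

# Usage (hypothetical file names):
#   gzip -dc forward.fastq.gz | phred64_to_phred33 | gzip > forward.phred33.fastq.gz
```

Characters below ASCII 64 are passed through as-is, so it is worth checking the quality plot again after re-importing the converted reads.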

Also, just a suggestion: mock-6 is a really old and not particularly good dataset in mockrobiota. If you want 16S data, I would recommend just about any others. mock-12 and mock-18+ are particularly good (and probably phred33 already).

Good luck!

qiime tools export mock.qza --output-dir exported-mock
cat exported-mock/MANIFEST | awk -F ',' 'BEGIN { OFS = "," } {print $1, "$PWD/exported-mock/" $2, $3}' | sed "s|\$PWD/exported-mock/filename|absolute-filepath|g" > phred-64-manifest.csv
qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path phred-64-manifest.csv \
  --output-path paired-end-demux.qza \
  --source-format PairedEndFastqManifestPhred64

The second line, with the cat and awk business, transforms the existing MANIFEST file into a fastq manifest; you could also use Excel or Google Sheets, if that makes more sense for you. The idea here is to export the data you imported already, take advantage of a file (MANIFEST) that has been generated for you, and use that to try to re-import as Phred64.
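
If the one-liner is hard to read, the same transformation can be sketched as a small function. This assumes the exported MANIFEST is a three-column CSV whose header is `sample-id,filename,direction` with no comment lines; the function name `manifest_to_fastq_manifest` is hypothetical. It renames the second header column to `absolute-filepath` and prefixes each filename with the directory passed as the first argument.

```shell
# Sketch: turn an exported MANIFEST (read on stdin) into a fastq manifest.
# $1 is the directory prefix placed in front of each filename.
# Assumption: 3-column CSV with a single header line and no comment lines.
manifest_to_fastq_manifest() {
  awk -F ',' -v dir="$1" '
    BEGIN { OFS = "," }
    NR == 1 { print $1, "absolute-filepath", $3; next }  # rename the header column
    { print $1, dir "/" $2, $3 }                         # prefix each filename
  '
}

# Usage (paths as in the commands above):
#   manifest_to_fastq_manifest "$PWD/exported-mock" < exported-mock/MANIFEST > phred-64-manifest.csv
```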

Hope that helps!


Hi @thermokarst,

Thanks for generating the manifest file.
In this case, the mock-6 data has not been demultiplexed yet, so I don’t think using the MANIFEST file would work.

BTW, I tried the “fastq_phred_convert” tool that @Nicholas_Bokulich suggested, and it worked for me.

Oops, I made a typo when writing up my guide - you would export after you demultiplexed, then you would have the MANIFEST file.

I tried the following commands and they worked great:

qiime tools export mock_6_data/demux.qza --output-dir mock_6_data/exported-demux
cat mock_6_data/exported-demux/MANIFEST | awk -F ',' 'BEGIN { OFS = "," } {print $1, "$PWD/mock_6_data/exported-demux/" $2, $3}' | sed "s|\$PWD/mock_6_data/exported-demux/filename|absolute-filepath|g" > mock_6_data/phred-64-manifest.csv
qiime tools import \
  --type 'SampleData[SequencesWithQuality]' \
  --input-path mock_6_data/phred-64-manifest.csv \
  --source-format SingleEndFastqManifestPhred64 \
  --output-path mock_6_data/single-end-demux.qza
qiime demux summarize \
  --i-data mock_6_data/single-end-demux.qza \
  --o-visualization mock_6_data/single-end-demux.qzv

