ECAM data qiime tools import for qiime-2019-4

nasiegel · March 31, 2020, 5:02pm

I am trying to make a workflow as reproducible as possible by implementing it in snakemake, and I am using the ECAM data as practice while building the workflow.

In short, the workflow works with demultiplexed paired-end reads that don’t require mapping. Can someone please tell me what the input type and format parameters were for the command qiime tools import in Bokulich et al., 2016? I believe this is the source of the error in my pipeline, as snakemake fails during the demultiplexing and mapping step, saying that it expects an artifact of type MultiplexedPairedEndBarcodeInSequence. When I change the type from SampleData[SequencesWithQuality] to MultiplexedPairedEndBarcodeInSequence, I get an error stating that qiime expected a path and not a .txt file as input.

Nicholas_Bokulich · March 31, 2020, 6:35pm

Welcome to the forum @nasiegel!

It sounds like you are importing these reads just fine, but maybe applying an inappropriate step in your workflow:

That's fine; depending on where you are grabbing the ECAM data from (e.g., off of QIITA), the reads may already be demultiplexed, and it sounds like you were able to import these as SampleData[SequencesWithQuality] just fine.

Now you've confused me — if the reads are already demultiplexed, why are you demultiplexing them? It sounds like this error is telling you that you are inputting the wrong type... because the reads are a SampleData[SequencesWithQuality] type, i.e., already demuxed.

The ECAM data are not of type MultiplexedPairedEndBarcodeInSequence in their rawest form, either, since the EMP protocol was used. So your snakemake workflow will not work with these data if you are only using q2-cutadapt for demux.
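To make the distinction concrete, here is a minimal Python sketch of how a workflow could pick the import arguments and the demux step from the state of the raw data. The layout labels (`demux_single_manifest`, `emp_single`, `mux_paired_barcode_in_seq`) and the helper names are hypothetical, made up for illustration, and the manifest format name is an assumption that should be checked against `qiime tools import --show-importable-formats` for your release. Which layout a given download actually has is something to verify by looking at the files.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical layout labels -> (semantic type, explicit input format or None
# for the type's default format, kind of path that --input-path should be).
IMPORT_SPECS: Dict[str, Tuple[str, Optional[str], str]] = {
    # Already-demultiplexed single-end reads listed in a manifest file.
    # The format name is an assumption; confirm it for your QIIME 2 release.
    "demux_single_manifest": (
        "SampleData[SequencesWithQuality]",
        "SingleEndFastqManifestPhred33V2",
        "file",
    ),
    # EMP protocol: a directory holding the reads plus a separate barcode file.
    "emp_single": ("EMPSingleEndSequences", None, "directory"),
    # Barcodes still inside the reads: a directory with forward/reverse FASTQ.
    "mux_paired_barcode_in_seq": (
        "MultiplexedPairedEndBarcodeInSequence",
        None,
        "directory",
    ),
}

# Demux action that matches each layout; None means "skip demultiplexing".
DEMUX_ACTIONS: Dict[str, Optional[List[str]]] = {
    "demux_single_manifest": None,
    "emp_single": ["qiime", "demux", "emp-single"],
    "mux_paired_barcode_in_seq": ["qiime", "cutadapt", "demux-paired"],
}


def build_import_command(layout: str, input_path: str, output_path: str) -> List[str]:
    """Return the argv list for `qiime tools import` for a given layout."""
    semantic_type, input_format, _ = IMPORT_SPECS[layout]
    cmd = [
        "qiime", "tools", "import",
        "--type", semantic_type,
        "--input-path", input_path,
        "--output-path", output_path,
    ]
    if input_format is not None:
        cmd += ["--input-format", input_format]
    return cmd


def expected_input_kind(layout: str) -> str:
    """Say whether --input-path should be a file or a directory."""
    return IMPORT_SPECS[layout][2]


def demux_action(layout: str) -> Optional[List[str]]:
    """Return the demux action prefix for a layout, or None if already demuxed."""
    return DEMUX_ACTIONS[layout]


if __name__ == "__main__":
    for name in IMPORT_SPECS:
        print(name, expected_input_kind(name), demux_action(name))
        print("  " + " ".join(build_import_command(name, "INPUT", "out.qza")))
```

Read this way, a manifest .txt file only makes sense for the already-demultiplexed layout, while the multiplexed types expect a directory, and reads that are already SampleData[SequencesWithQuality] should skip the demux rule entirely.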

I hope that helps!

nasiegel · March 31, 2020, 7:36pm

Thank you for your quick response. Yes, I am getting the data off of QIITA. The demultiplexed reads I spoke of were from a different dataset that the snakemake pipeline was originally made for. I am trying to alter the pipeline to accommodate data from a longitudinal study since I am working with a similar study design, albeit with nonhuman primates.

I am going to try demultiplexing first rather than after I have removed primers.