question about importing SRA study as artifacts

Hi there,
I am familiar with QIIME1 but relatively new to QIIME2. In the past I received my raw files from a facility in the CASAVA paired-end demultiplexed format, and I had no problem analyzing those files with both versions of QIIME.
I am now investigating public data available in the SRA archive, and I am having a hard time importing the raw data as QIIME2 artifacts.
I have downloaded different projects (using the prefetch and fastq-dump commands in the SRA Toolkit) and I don’t think they are demultiplexed (?), so I guess I have two questions:

  1. How do I decide what to specify in the --type option of the import script? Specifically, how do I tell what type of data I have?
  2. Would I be better off just demultiplexing the data first and then renaming everything to the CASAVA format? Is that even an option? I looked around the forum, and maybe the manifest format would be the way to go?

I am running qiime2-2020.6


First, please review the "Importing data" page of the QIIME 2 2020.6.0 documentation. If you still have questions, you can ask us here on the forum. We usually need to see a few lines of the file(s) to guide you.
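If it helps with sharing those few lines, here is a minimal Python sketch for pulling the first records out of a FASTQ file. The function name `peek_fastq` is just an illustrative helper, not part of QIIME 2 or the SRA Toolkit, and it assumes plain four-line FASTQ records (optionally gzipped).

```python
import gzip
from itertools import islice


def peek_fastq(path, n_records=2):
    """Return the first n_records as (header, sequence, quality) tuples.

    Assumes standard four-line FASTQ records; handles .gz files too.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    records = []
    with opener(path, "rt") as handle:
        while len(records) < n_records:
            # Read one record (four lines) at a time.
            lines = [line.rstrip("\n") for line in islice(handle, 4)]
            if len(lines) < 4:
                break  # end of file or truncated record
            header, seq, _plus, qual = lines
            records.append((header, seq, qual))
    return records
```

Pasting the output for one forward and one reverse file is usually enough for us to look at the header layout and the read lengths.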

Working out what type of data you have is tricky. Usually the best people to ask are the sequencing center that prepared the data, although I recognize that in this case you don't have that info. Does the study metadata say anything about how it was produced?

Demultiplexing first is an option, but there is no need to rename the files to CASAVA filenames; you can just use a manifest format. Unfortunately, you will still need to know which primers, adapters, barcodes, etc. are still in the reads (and which sequencing platform produced them) in order to make informed decisions on things like DADA2 vs. Deblur vs. OTU clustering.
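As a rough sketch of the manifest route, the Python below writes a paired-end manifest from a folder of downloaded reads. It assumes each run is already one sample and that the files are named `<run>_1.fastq` and `<run>_2.fastq` (what `fastq-dump --split-files` typically writes); `build_manifest` is a hypothetical helper, and the column names follow the paired-end V2 manifest layout described in the importing tutorial.

```python
import csv
from pathlib import Path


def build_manifest(fastq_dir, manifest_path):
    """Write a tab-separated paired-end manifest and return its rows.

    Assumption: files are named <run>_1.fastq / <run>_2.fastq
    (optionally with a .gz suffix), one run per sample.
    """
    fastq_dir = Path(fastq_dir).resolve()  # the manifest needs absolute paths
    rows = []
    for fwd in sorted(fastq_dir.glob("*_1.fastq*")):
        sample_id = fwd.name.split("_1.fastq")[0]
        rev = fwd.with_name(fwd.name.replace("_1.fastq", "_2.fastq"))
        if not rev.exists():
            continue  # skip runs without a reverse-read file
        rows.append((sample_id, str(fwd), str(rev)))

    with open(manifest_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(
            ["sample-id", "forward-absolute-filepath", "reverse-absolute-filepath"]
        )
        writer.writerows(rows)
    return rows
```

The resulting file could then be passed to `qiime tools import` with `--type 'SampleData[PairedEndSequencesWithQuality]'` and `--input-format PairedEndFastqManifestPhred33V2`, assuming the reads really are demultiplexed and use Phred+33 quality scores.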

Keep us posted! :qiime2:
