Using QIIME 1 files directly in QIIME 2

Hello, I have some old MiSeq 16S rRNA paired-end (PE) sequencing data from an old project (at that time, we only had QIIME 1). The data are in fasta format and come from three batches of sequencing.

Here are the details of how I generated the data.

After I received the raw data (fastq files) from the sequencing center, I did some basic QC on the fastq files. Then I ran join_paired_ends.py first and then split_libraries_fastq.py. In the end, I got a fasta file. I combined the three demultiplexed fasta files and the three mapping files, so I now have one combined fasta file plus one combined mapping file. From here, I can easily build an OTU table in QIIME 1 using any method, such as “pick_closed_reference_otus.py”.

I am now switching to QIIME 2 and would like to see how the taxonomy looks using the QIIME 2 methods.
I read the QIIME 2 tutorials. Almost all of them start with raw fastq data from the sequencing center. However, it has been a long time, and the center did not keep the raw fastq files. I only have a fasta file and a mapping file.
Is there any way I can use these two files in QIIME 2, for example through a simple conversion?


:smile: :point_right: https://docs.qiime2.org/2019.10/tutorials/importing/#sequences-without-quality-information-i-e-fasta

Then you can perform taxonomy classification as shown in the tutorials.
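
Before importing, it may be worth checking that the combined fasta file matches what the linked importing page describes for QIIME 1 `seqs.fna` files: exactly two lines per record (header, then the whole sequence on one line), with each header ID in the form `<sample-id>_<seq-id>`. The following is only a minimal Python sketch of such a check. The function name `check_seqs_fna` and the example records are hypothetical, and splitting the ID at the last underscore is an assumption about how the sample ID is separated from the sequence ID.

```python
import io
from collections import Counter


def check_seqs_fna(handle):
    """Count reads per sample and collect format problems in a seqs.fna-style file."""
    counts = Counter()
    problems = []
    previous = None  # type of the last non-blank line: "header" or "sequence"
    for lineno, raw in enumerate(handle, start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if previous == "header":
                problems.append(f"line {lineno}: header follows a header with no sequence")
            fields = line[1:].split()
            record_id = fields[0] if fields else ""
            # Assumption: the sample ID is everything before the last underscore.
            sample_id, sep, seq_id = record_id.rpartition("_")
            if sep and sample_id and seq_id:
                counts[sample_id] += 1
            else:
                problems.append(f"line {lineno}: ID {record_id!r} is not <sample-id>_<seq-id>")
            previous = "header"
        else:
            if previous != "header":
                problems.append(f"line {lineno}: sequence line not directly preceded by a header")
            previous = "sequence"
    if previous == "header":
        problems.append("file ends with a header that has no sequence")
    return counts, problems


# Made-up records in the style written by split_libraries_fastq.py.
example = io.StringIO(
    ">S1_0 read_a orig_bc=ACGT new_bc=ACGT bc_diffs=0\n"
    "ACGTACGT\n"
    ">S1_1 read_b orig_bc=ACGT new_bc=ACGT bc_diffs=0\n"
    "ACGTAC\n"
    ">S2.gut_2 read_c orig_bc=TGCA new_bc=TGCA bc_diffs=0\n"
    "GGGTTT\n"
)
counts, problems = check_seqs_fna(example)
print(dict(counts))
print(problems)
```

For a real file, pass an open file handle instead of the `io.StringIO` example. The per-sample counts can also be compared against the sample IDs in the combined mapping file to catch IDs that were altered when the three batches were merged.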

Let us know if you have any questions, and welcome to the QIIME 2 forums! :qiime2:



Hello Colin,

Thank you very much.

1> I am using the command qiime tools import to convert my unaligned fasta file into a qza file. It seems to be working, but it is very slow, even though I am running it on an HPC.

Does it support multiple cores by default?

2> For downstream analysis, I think I will need a sample metadata file like this (https://docs.google.com/spreadsheets/d/16oomVnULW-uesehNZc_mKIDTnuRoTiun0CpzSFZagvo/edit#gid=0) at some stage.

There is no way to do this with a script, right? So I have to do it manually in Excel? Is there any more information about how to create the metadata file? I saw the example online, and I guess I need to have the first two columns. I am not sure whether the other columns are mandatory or whether the data types (numeric/categorical) are required. My current mapping file is more complicated than the example.

I would like to know more about how to design the metadata file.
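
On the scripting question, a QIIME 1 mapping file is already a tab-separated table with the sample IDs in the first column, which is also the basic shape of a QIIME 2 metadata file. To my knowledge only the ID column is required in QIIME 2, and column types are inferred as numeric or categorical unless declared explicitly. The following is a minimal Python sketch of a conversion, not an official tool. The function name `mapping_to_metadata` and the example rows are hypothetical, and renaming `#SampleID` to `sample-id` and dropping the other `#` comment lines are assumptions about what a clean metadata file should look like.

```python
import csv
import io


def mapping_to_metadata(mapping_handle, metadata_handle):
    """Copy a QIIME 1 mapping file to a QIIME 2 style metadata TSV; return the sample count."""
    reader = csv.reader(mapping_handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    writer = csv.writer(metadata_handle, delimiter="\t", lineterminator="\n")
    header = None
    seen_ids = set()
    for row in reader:
        if not any(cell.strip() for cell in row):
            continue  # skip blank lines
        first = row[0].strip()
        if header is None:
            if first != "#SampleID":
                raise ValueError("expected the first line to start with #SampleID")
            # Assumption: use 'sample-id' as the ID column name.
            header = ["sample-id"] + [name.strip() for name in row[1:]]
            writer.writerow(header)
            continue
        if first.startswith("#"):
            continue  # drop QIIME 1 comment lines
        if first in seen_ids:
            raise ValueError(f"duplicate sample ID: {first}")
        if len(row) > len(header):
            raise ValueError(f"row for {first} has more cells than the header")
        seen_ids.add(first)
        cells = [cell.strip() for cell in row]
        cells += [""] * (len(header) - len(cells))  # pad short rows with empty cells
        writer.writerow(cells)
    return len(seen_ids)


# Made-up mapping file content for illustration.
mapping = io.StringIO(
    "#SampleID\tBarcodeSequence\tLinkerPrimerSequence\tTreatment\tDescription\n"
    "#Example comment line\n"
    "S1\tACGT\tGTGCCAGC\tcontrol\tbatch1\n"
    "S2\tTGCA\tGTGCCAGC\ttreated\tbatch2\n"
)
metadata = io.StringIO()
n_samples = mapping_to_metadata(mapping, metadata)
print(n_samples)
print(metadata.getvalue())
```

With real files, open the combined mapping file for reading and a new `.tsv` file for writing in place of the two `io.StringIO` objects. The duplicate-ID check is there because combining three mapping files is exactly the situation where the same sample ID can appear twice.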

Hi Colin,

I just followed the instructions here: https://docs.qiime2.org/2019.10/tutorials/importing/#sequences-without-quality-information-i-e-fasta

Just to confirm: the instructions say, "An example of importing and dereplicating this kind of data can be found in the OTU Clustering tutorial."
Does this mean I can only use OTU clustering methods, which are not much different from QIIME 1?

I would like to use the “Moving Pictures” tutorial workflow to create a sequence variant table. I tried several commands, but none of them worked.

The problem is that you only have fasta data, and dada2 requires fastq because it uses the quality scores to build its error model. If you can’t obtain the quality scores, then dada2 is not an option, either in QIIME 2 or in R for that matter.

deblur can be run outside of QIIME 2 without quality scores, and it comes installed along with QIIME 2. Running deblur directly is one way to get an ASV table; you can then import that table (and the output sequences) into QIIME 2 for further analysis.

But if you can’t get fastq data, and don’t want to use deblur, then OTU clustering is your only resort.

Good luck!