Importing fasta+qual instead of fastq

I have fasta+qual files. QIIME 1 has convert_fastaqual_fastq.py. What’s the equivalent in QIIME 2? Or, even better, how can I create a QIIME 2 artifact directly from the fasta+qual+mapping files? I looked here, which seems to be the appropriate location:
https://docs.qiime2.org/2.0.6/tutorials/import-sequence-data/
Thanks!

Hi @git-ingham,
We don’t currently have that functionality in QIIME 2. Can you describe the source of your data (i.e., what sequence platform it was generated on)? If you could also provide the first ~10 lines or so from the fasta and qual files in your reply, that will help me understand exactly what the files contain and I’ll better be able to advise on whether/when we’ll be able to support this.

Thanks!

Sorry for the delay, I have been away for a bit.

The data come from Mr. DNA, and I believe they are 454 data. Here are the first couple of records from the fasta file:

>FS150416.22::M02542:90:000000000-AG8NP:1:1110:6076:20726 1:N:0:4
ATTGAACGCTGGCGGCAGGCCTAACACATGCAAGTCGAGCGGTAGCACAGGGAGCTTGCTCCTGGGTGACGAGCGGCGGACGGGTGAGTAATGTCTGGGAAACTGCCCGATGGAGGGGGATAACTACTGGAAACGGTAGCTAATACCGCATAACGTCTTCGGACCAAAGTGGGGGACCTTCGGGCCTCACACCATCGGATGTGCCCAGATGGGATTAGCTAGTAGGTGAGGTAATGGCTCACCTAGGCGACGATCCCTAGCTGGTCTGAGAGGATGACCAGCCACTCTGGAACTGAGACACGGTCCAGACTCCGACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGCGCAAGCCTGATGCAGCCATGCCGCGTGTGTGAAGAAGGCCTTCGGGTTGTAAAGCACTTTCAGCGAGGAGGAAGGGTTCGGTGTTAATAGCACCGTTCATTGACGTTACTCGCAGAAGAAGCACCGGCTAACTCCGTGCCAGCCGCCGC
>FS150416.14::M02542:90:000000000-AG8NP:1:1101:25628:22047 1:N:0:4
ATTGAACGCTGGCGGCGGGCCTAACACATGCAAGTCGAGCGGTAGCACAAGAGAGCTTGCTCTCTGGGTGACGAGCGGCGGACGGGTGAGTAATGTCTGGGAAACTGCCTGATGGAGGGGGATAACTACTGGAAACGGTAGCTAATACCGCATGACGTCTTCGGACCAAAGTGGGGGACCTTCGGGCCTCAAGCCATCAGATGTGCCCAGCTGGGATTAGCTAGTAGGTGGGGTAATGGCTCACCTAGGCGACGATCGCTAGCTGGACTGAGAGGATGACCGGCCACACTGGAAGCGAGACACGGTCCAAACTCCTACGGGAGGCAGCAGTGGGGAAGATGGCACAATGGGCGCAAGCCTGATGCACCCATGCCGCGTGTATGAAGAAGGCCTTCGGGTTGTAAAGTACTTTCAGCGAGGAGGAAGGCATTAAGATTAATCACCTTAGTGATTGACGTAACTCGCAGTTGAAGCACCGGTTACCTCCTTGCCAGCCGCCGGAGTAAAAC

and here is the start of the qual file:

>FS150416.22::M02542:90:000000000-AG8NP:1:1110:6076:20726 1:N:0:4
38 38 38 38 38 38 38 38 35 38 38 38 38 38 38 38 38 38 38 38 35 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 36 38 38 38 38 38 38 38 37 38 38 38 38 38 38 38 22 34 37 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 35 38 38 38 37 38 37 35 37 37 38 38 38 37 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 33 29 37 38 37 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 31 37 37 38 38 38 38 38 38 38 38 38 37 38 38 38 38 37 38 38 38 37 38 37 38 34 37 38 38 38 38 38 38 38 34 25 34 34 37 37 36 38 38 36 36 37 36 21 34 36 38 38 38 35 38 37 38 38 38 38 38 37 37 38 20 41 41 41 41 41 41 27 35 41 41 28 29 41 41 26 23 41 35 41 41 41 41 36 41 41 41 41 41 41 41 22 28 7 18 41 41 41 41 41 41 41 41 38 28 31 29 41 41 41 41 41 41 28 4 41 22 39 41 41 41 41 41 41 41 41 41 41 39 22 41 28 40 41 41 41 14 28 33 24 25 8 23 25 34 27 18 9 10 10 10 34 10 34 18 15 9 18 10 15 9 32 28 33 18 9 10 34 34 17 30 9 23 9 34 25 16 26 25 36 38 37 34 38 37 27 26 36 34 37 38 38 38 38 37 34 34 24 25 38 37 37 24 26 31 26 11 38 37 35 36 38 37 34 34 25 37 37 37 37 38 38 37 25 31 37 27 22 22 37 31 22 36 37 25 34 25 34 31 37 35 37 11 37 38 37 38 37 38 37 38 37 31 35 38 35 35 38 37 37 24 35 38 37 38 38 38 38 38 38 37 35 38 38 38 38 38 38 38 38 32 38 37 38 38 38 38 38 35 38 36 38 38 38 38 38 38 38 38 38 37 36 37 37 34 38 37 37 37 38 38 37 34 34 37 27 38 37 24 37 38 37 34 38 38 38 37 37 37 38 37 37 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 38 34 34 34 34 34 
>FS150416.14::M02542:90:000000000-AG8NP:1:1101:25628:22047 1:N:0:4
37 31 27 37 38 36 38 37 38 38 38 38 36 31 37 35 34 36 38 36 33 34 34 36 36 37 37 38 38 38 35 38 38 37 34 34 28 25 29 34 38 36 37 38 37 31 34 36 28 36 36 37 38 38 38 38 38 38 38 38 38 37 38 38 37 37 38 37 36 28 37 32 37 31 37 38 38 38 38 35 36 38 38 38 38 38 37 38 27 37 38 38 34 38 38 38 38 38 37 38 38 38 38 37 38 38 38 38 38 36 37 37 37 38 37 37 38 38 37 36 36 38 38 10 25 27 37 38 38 37 34 37 35 37 38 34 27 37 33 37 23 34 34 38 37 38 38 37 38 38 38 38 38 38 38 38 20 26 9 26 37 38 37 36 38 38 36 37 23 37 37 36 37 10 23 30 34 38 38 36 38 22 37 38 38 34 20 16 23 23 10 32 10 10 16 17 27 18 15 24 34 37 30 32 37 38 38 35 24 41 3 22 41 8 41 41 33 41 41 41 41 41 41 38 41 41 41 15 11 41 41 41 41 41 41 41 36 41 41 41 41 20 41 20 31 41 41 41 17 31 20 33 34 14 30 27 30 8 41 41 41 41 41 28 41 41 20 41 22 41 41 41 29 34 25 15 9 29 16 8 17 8 25 37 37 37 29 19 10 22 27 22 17 9 9 10 22 34 34 17 9 25 23 9 30 29 9 27 29 9 37 31 21 36 34 23 32 25 14 34 27 11 17 24 23 35 25 19 11 20 34 33 30 29 11 11 17 11 30 17 11 25 37 37 34 36 25 31 9 23 9 18 10 23 37 21 29 23 25 23 37 26 22 34 33 31 10 18 31 23 10 28 10 18 28 31 11 27 36 20 11 11 24 37 34 11 20 11 11 11 25 25 31 10 31 10 10 38 36 24 37 35 27 36 36 24 11 37 27 11 28 37 31 25 35 37 36 37 37 37 34 25 11 11 11 27 11 11 24 27 11 11 27 21 11 27 34 11 37 34 11 34 21 11 11 24 37 23 36 37 34 37 37 21 36 33 24 27 11 10 22 21 23 11 11 21 11 11 11 34 21 11 25 31 21 21 31 11 26 11 37 33 12 27 38 37 34 11 11 11 11 10 31 21 10 21 10 34 11 11 11 37 32 37 34 34 12 32 12 

I know that others have been able to use QIIME 1.9 to read these files.

Kenneth
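
For readers who only need a FASTQ file from a fasta+qual pair like the one above, here is a minimal sketch in plain Python (standard library only). It is not a QIIME 2 feature and not the QIIME 1 script. It assumes that the two files list the same records in the same order, that the qual values are Phred scores, and that the usual Phred+33 FASTQ encoding is wanted. The function names and the file names in the usage comment are made up for illustration.

```python
from itertools import zip_longest


def read_records(lines):
    """Yield (header, body_lines) for each '>' record in FASTA-style text."""
    header, body = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, body
            header, body = line[1:], []
        elif header is not None:
            body.append(line)
    if header is not None:
        yield header, body


def fasta_qual_to_fastq(fasta_lines, qual_lines, offset=33):
    """Yield four-line FASTQ records built from paired FASTA and QUAL records.

    Assumption: offset=33 (Phred+33); change it only if another encoding is needed.
    """
    pairs = zip_longest(read_records(fasta_lines), read_records(qual_lines))
    for fasta_rec, qual_rec in pairs:
        if fasta_rec is None or qual_rec is None:
            raise ValueError("FASTA and QUAL inputs have different record counts")
        (seq_id, seq_parts), (qual_id, qual_parts) = fasta_rec, qual_rec
        if seq_id != qual_id:
            raise ValueError(f"header mismatch: {seq_id!r} vs {qual_id!r}")
        seq = "".join(seq_parts)
        scores = [int(s) for s in " ".join(qual_parts).split()]
        if len(seq) != len(scores):
            raise ValueError(f"sequence/quality length mismatch for {seq_id!r}")
        quality = "".join(chr(score + offset) for score in scores)
        yield f"@{seq_id}\n{seq}\n+\n{quality}\n"


# Example usage (hypothetical file names):
# with open("seqs.fna") as f, open("seqs.qual") as q, open("seqs.fastq", "w") as out:
#     out.writelines(fasta_qual_to_fastq(f, q))
```

The sketch raises an error on any header or length mismatch rather than guessing, since a silently misaligned quality string would be hard to spot later.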

Hi @git-ingham,
Thanks for the examples. At this point we don’t have plugins for processing 454 data in QIIME 2. The quality control functionality that we have in q2-dada2 and q2-quality-filter is designed for use with Illumina data and isn’t directly applicable to 454 data. Developing plugins for analyzing 454 data isn’t high on our priority list at the moment as it seems that most of our users are working with Illumina sequencing data, and it’s possible to analyze 454 data with QIIME 1.

Are you certain that these are 454 data? If so, I would recommend using QIIME 1 for the analysis up to the feature table (or “OTU table” in QIIME 1 terminology) and the representative sequences. Once you have those files, you can either import them into QIIME 2, or continue with QIIME 1.

If you choose to import your feature table and representative sequences into QIIME 2, you can see the instructions for importing a BIOM 2.1.0 table here. You would import your representative sequences as follows, assuming the name of that file is rep_set.fna:

qiime tools import --input-path rep_set.fna --output-path temp/rep-seqs.qza --type 'FeatureData[Sequence]'

I just verified that the data is 454.

However, I will also be working with Illumina data later. It is not in EMP format, though; it is in the same format as the 454 data. How do I import it into QIIME 2?

Hi @git-ingham,
Sorry for the slow reply, this conversation dropped off my radar. This document will give you information on how to import both demultiplexed and multiplexed data (including data that is not in the EMP format).

Hope this helps!