Uchime_uniq.fasta import into qiime

Dear all,

I have clean FASTA files, meaning the chimeras have already been filtered out. I want to use QIIME 2 for the next steps, but I don’t know how to import the data into QIIME 2.

For example, I have 100 samples, with files named like this:
16S15697A_RA_GUT_001_AGAACA_33157_last_precluster_uchime_uniq.fasta
16S15697A_RA_GUT_001_AGAACA_33157_last_precluster_uchime.fasta
16S15697A_RA_GUT_002_GGTGTG_28581_last_precluster_uchime_uniq.fasta
16S15697A_RA_GUT_002_AGAACA_33157_last_precluster_uchime.fasta

Thanks for your help!

Best,
Hees

Hi @13479776!

Do you by chance still have the raw sequence data? You’ll be able to do a bit more with it, but it’s also fine if you don’t have it, or don’t wish to re-analyze.

It looks like you have SampleData[Sequences] (since there are no quality scores). Do these reads have direction? Also, would you be able to post a snippet of the start of one of your FASTA files?

I kind of think we don’t have a format for this defined yet, so we might have to work a bit harder to get this particular structure into QIIME 2.

Hi ebolyen,

Here is the start of one of the FASTA files.

16S15697A_RA_GUT_001_AGAACA_33157_last_precluster_uchime_uniq.fasta

M04044^16^000000000-AHYET^1^1112^23707^24274^4580
TGAGGAATATTGGTCAATGGACGAGAGTCTGAACCAGCCAAGTAGCGTGAAGGATGACTGCCCTATGGGTTGTAAACTTCTTTTATACGGGAATAAAGTGAGCCACGTGTGGCTTTTTGTATGTACCGTATGAATAAGGATCGGCTAACTCCGTGCCAGCAGCCGCGGTAATACGGAGGATCCGAGCGTTATCCGGATTTATTGGGTTTAAAGGGAGCGTAGGCGGGTTGTTAAGTCAGTTGTGAAAGTTTGCGGCTCAACCGTAAAATTGCAGTTGATACTGGCGACCTTGAGTGCAACAGAGGTAGGCGGAATTCGTGGTGTAGCGGTGAAATGCTTAGATATCACGAAGAACTCCGATTGCGAAGGCAGCTTACTGGATTGTAACTGACGCTGATGCTCGAAAGTGTGGGTATCAAACAG

16S15697A_RA_GUT_001_AGAACA_33157_last_precluster_uchime.fasta

M04044^16^000000000-AHYET^1^1112^23707^24274
TGAGGAATATTGGTCAATGGACGAGAGTCTGAACCAGCCAAGTAGCGTGAAGGATGACTGCCCTATGGGTTGTAAACTTCTTTTATACGGGAATAAAGTGAGCCACGTGTGGCTTTTTGTATGTACCGTATGAATAAGGATCGGCTAACTCCGTGCCAGCAGCCGCGGTAATACGGAGGATCCGAGCGTTATCCGGATTTATTGGGTTTAAAGGGAGCGTAGGCGGGTTGTTAAGTCAGTTGTGAAAGTTTGCGGCTCAACCGTAAAATTGCAGTTGATACTGGCGACCTTGAGTGCAACAGAGGTAGGCGGAATTCGTGGTGTAGCGGTGAAATGCTTAGATATCACGAAGAACTCCGATTGCGAAGGCAGCTTACTGGATTGTAACTGACGCTGATGCTCGAAAGTGTGGGTATCAAACAG

Could this be used for SampleData[Sequences] import? Thanks!

Best,
Hees

Hi @13479776,

Yes, those data are eligible for SampleData[Sequences] import. You can do this with this command:

qiime tools import \
  --type SampleData[Sequences] \
  --input-path seqs.fasta \
  --output-path seqs.qza

I hope that helps!

Hi Nicholas,

The command returned the following error.

There was a problem importing 16S15697A_RA_GUT_060_TGGACG_22568_last_precluster_uchime.fasta:

16S15697A_RA_GUT_060_TGGACG_22568_last_precluster_uchime.fasta is not a(n) QIIME1DemuxFormat file

Best wishes,
Hees

Try this:

qiime tools import \
  --type SampleData[Sequences] \
  --input-path seqs.fasta \
  --output-path seqs.qza \
  --source-format DNAFASTAFormat

You can see importable formats with this command:

qiime tools import --show-importable-formats

Hi Nicholas,

Sorry, I have a new error. I used the command above, and the unexpected error it produced is shown below. Could you kindly check the example data? I am not sure whether this FASTA format is suitable for this import command. Thank you!

The .fasta data:

M04044^16^000000000-AHYET^1^1112^23707^24274^4580
TGAGGAATATTGGTCAATGGACGAGAGTCTGAACCAGCCAAGTAGCGTGAAGGATGACTGCCCTATGGGTTGTAAACTTCTTTTATACGGGAATAAAGTGAGCCACGTGTGGCTTTTTGTATGTACCGTATGAATAAGGATCGGCTAACTCCGTGCCAGCAGCCGCGGTAATACGGAGGATCCGAGCGTTATCCGGATTTATTGGGTTTAAAGGGAGCGTAGGCGGGTTGTTAAGTCAGTTGTGAAAGTTTGCGGCTCAACCGTAAAATTGCAGTTGATACTGGCGACCTTGAGTGCAACAGAGGTAGGCGGAATTCGTGGTGTAGCGGTGAAATGCTTAGATATCACGAAGAACTCCGATTGCGAAGGCAGCTTACTGGATTGTAACTGACGCTGATGCTCGAAAGTGTGGGTATCAAACAG
M04044^16^000000000-AHYET^1^2118^28098^16154^4301
TGAGGAATATTGGTCAATGGGCGAGAGCCTGAACCAGCCAAGTAGCGTGAAGGATGACTGCCCTATGGGTTGTAAACTTCTTTTATAAAGGAATAAAGTCGGGTATGTATACCCGTTTGCATGTACTTTATGAATAAGGATCGGCTAACTCCGTGCCAGCAGCCGCGGTAATACGGAGGATCCGAGCGTTATCCGGATTTATTGGGTTTAAAGGGAGCGTAGATGGATGTTTAAGTCAGTTGTGAAAGTTTGCGGCTCAACCGTAAAATTGCAGTTGATACTGGATATCTTGAGTGCAGTTGAGGCAGGCGGAATTCGTGGTGTAGCGGTGAAATGCTTAGATATCACGAAGAACTCCGATTGCGAAGGCAGCCTGCTAAGCTGCAACTGACATTGAGGCTCGAAAGTGTGGGTATCAAACAG

The error info.

(qiime2-2018.4) mprobes-MacBook-Air-2:2_filter_chimeras mprobe$ qiime tools import --type SampleData[Sequences] --input-path seq_test.fasta --output-path seq_test.qza --source-format DNAFASTAFormat
 Traceback (most recent call last):
 File "/Users/mprobe/miniconda3/envs/qiime2-2018.4/lib/python3.5/site-packages/q2cli/tools.py", line 116, in import_data
view_type=source_format)
 File "/Users/mprobe/miniconda3/envs/qiime2-2018.4/lib/python3.5/site-packages/qiime2/sdk/result.py", line 218, in import_data
return cls._from_view(type_, view, view_type, provenance_capture)
  File "/Users/mprobe/miniconda3/envs/qiime2-2018.4/lib/python3.5/site-packages/qiime2/sdk/result.py", line 242, in _from_view
recorder=recorder)
  File "/Users/mprobe/miniconda3/envs/qiime2-2018.4/lib/python3.5/site-packages/qiime2/core/transform.py", line 59, in make_transformation
(self._view_type, other._view_type))
Exception: No transformation from <class 'q2_types.feature_data._format.DNAFASTAFormat'> to <class 'qiime2.plugin.model.directory_format.QIIME1DemuxDirFmt'>

An unexpected error has occurred:

No transformation from <class 'q2_types.feature_data._format.DNAFASTAFormat'> to <class 'qiime2.plugin.model.directory_format.QIIME1DemuxDirFmt'>

See above for debug info.

I think you are correct. It looks like SampleData[Sequences] specifically expects QIIME1DemuxDirFmt. Those files should follow the format described here. My advice would be one of the following:

  1. Try reformatting your files to match that format.
  2. Perform denoising/OTU picking outside of QIIME2 (e.g., with dada2 in R) and then import the resulting BIOM table into QIIME2 for downstream analysis. Maybe you have already done this: if you have UCHIME unique sequences, I’m assuming you may have already used USEARCH for OTU picking? (If so, import these sequences as type FeatureData[Sequence] instead.)
  3. Follow @ebolyen’s advice from above:

We have chimera filters etc in QIIME2, so working with raw data would be a lot easier.

Sorry we don’t have an easier answer! I think @ebolyen was right from the start — these data are not in a format that QIIME2 currently supports.
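
For option 1, here is a minimal Python sketch of the reformatting step, not an official QIIME 2 tool. It assumes that the QIIME 1 demultiplexed format wants a single FASTA file whose record IDs look like <sample-id>_<seq-number>, that your real files have standard header lines starting with ">", and that the first four underscore-separated fields of each file name (e.g. 16S15697A_RA_GUT_001) identify the sample. The function names, and the choice to join those fields with periods so the sample ID itself contains no underscores, are my own and purely illustrative.

```python
from pathlib import Path


def sample_id_from_filename(name):
    """Derive a sample ID without underscores from a file name.

    Assumption: the first four underscore-separated fields
    (e.g. 16S15697A_RA_GUT_001) identify the sample.
    """
    fields = Path(name).name.split("_")
    return ".".join(fields[:4])


def to_qiime1_demux(fasta_paths, out_path):
    """Merge per-sample FASTA files into one seqs.fna-style file.

    Each header is rewritten to ">{sample_id}_{n} {original_header}",
    where n counts the reads seen so far for that sample.
    Returns a dict mapping sample ID to number of reads written.
    """
    counts = {}
    with open(out_path, "w") as out:
        for path in fasta_paths:
            sample_id = sample_id_from_filename(path)
            with open(path) as handle:
                for line in handle:
                    line = line.rstrip("\n")
                    if not line:
                        continue  # skip blank lines
                    if line.startswith(">"):
                        counts[sample_id] = counts.get(sample_id, 0) + 1
                        out.write(">{}_{} {}\n".format(
                            sample_id, counts[sample_id], line[1:]))
                    else:
                        out.write(line + "\n")  # sequence line, unchanged
    return counts
```

You would call it with the list of your per-sample files, e.g. to_qiime1_demux(sorted(Path(".").glob("*_uchime.fasta")), "seqs.fna"), and then retry the first import command from this thread with seqs.fna as the input path. If the import still fails, compare a few headers in seqs.fna against the format description linked above.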
