Hi, greetings to everyone.
I'm currently working with a CO1 database from NCBI, downloaded with [RESCRIPt], but it comes as separate taxonomy and sequence files, and I need them combined before I can go further with BLAST in QIIME. So for now I'm trying to merge the separate sequences and taxonomy using the suggested commands below:
Import the Data: Import your sequence and taxonomy artifacts into QIIME 2 as separate data files. You can use the qiime tools import command. The exact commands may vary depending on your data formats. Below are example commands:
Merge the Data: You can merge the sequence and taxonomy artifacts using the qiime rescript merge-taxa command. Here's an example command:
qiime rescript merge-taxa \
  --i-data sequences.qza \
  --i-taxonomy taxonomy.qza \
  --o-merged-sequences taxonomy_and_sequences.qza
Export Merged Data: Once the data is merged, you can export it to a format of your choice. In your case, you want a FASTA file with taxonomy annotations. You can export the merged artifact to a BIOM format, and then use biom convert to convert it to FASTA.
However, I'm already stuck at step 1, importing the data, with this command: qiime tools import --type 'FeatureData[Sequence]' --input-path dna-sequences.fasta --output-path sequence.new.qza. The error that came out was:
An unexpected error has occurred:
BLAST6 is not a variant of SampleData.field['type']
See above for debug info.
If anyone could help me solve this error, it would be very helpful.
Thank you.
It looks like you are trying to import a file that is not in .fasta format into an artifact that expects such a format. What does head dna-sequences.fasta output?
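For anyone who prefers to run that check from a script rather than by eye, here is a minimal Python sketch. It is only an illustration and not part of QIIME 2: the function name looks_like_fasta is made up for this example, and it assumes a plain-text, uncompressed file. It checks roughly what the head output would show, namely that the first non-blank line starts with ">" and that a sequence line follows it.

```python
def looks_like_fasta(path, max_lines=20):
    """Return True if the first few lines look like FASTA records.

    Hypothetical helper for illustration only; not a QIIME 2 function.
    """
    saw_header = False
    with open(path, "r", errors="replace") as handle:
        for i, line in enumerate(handle):
            if i >= max_lines:
                break
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            if line.startswith(">"):
                saw_header = True
            elif not saw_header:
                # Content before any '>' header: not FASTA
                # (for example, a tab-separated BLAST6 table).
                return False
            else:
                # A non-header line after a header counts as sequence data.
                return True
    return False
```

A file that fails this check (for instance, a tab-separated table saved with a .fasta extension) would also be expected to fail the import as FeatureData[Sequence].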
So I had to use a different version of Ubuntu (Ubuntu 20.04.6 LTS) to import the file from FASTA format to .qza.
The error occurred when using Ubuntu 22.04.2 LTS.
Yes, the Ubuntu 22.04.2 LTS installation also has the RESCRIPt package installed, whereas the Ubuntu 20.04.6 LTS installation does not. Would that affect any of this?
On Ubuntu 22.04.2 LTS, I did try the command with the latest version, qiime2-2023.7, but got another error:
Traceback (most recent call last):
  File "/home/fatihahnajihah/miniconda3/envs/qiime2-2023.7/bin/qiime", line 11, in <module>
    sys.exit(qiime())
  File "/home/fatihahnajihah/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/fatihahnajihah/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/fatihahnajihah/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/fatihahnajihah/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/fatihahnajihah/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/fatihahnajihah/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/fatihahnajihah/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/q2cli/builtin/tools.py", line 49, in export_data
    result = qiime2.sdk.Result.load(input_path)
  File "/home/fatihahnajihah/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/qiime2/sdk/result.py", line 75, in load
    peek = cls.peek(filepath)
  File "/home/fatihahnajihah/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/qiime2/sdk/result.py", line 59, in peek
    return ResultMetadata(*archive.Archiver.peek(filepath))
  File "/home/fatihahnajihah/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/qiime2/core/archive/archiver.py", line 336, in peek
    archive = cls.get_archive(filepath)
  File "/home/fatihahnajihah/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/qiime2/core/archive/archiver.py", line 322, in get_archive
    raise ValueError("%s is not a QIIME archive." % filepath)
ValueError: dna-sequences.fasta is not a QIIME archive.
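A side note on this traceback: it passes through export_data in q2cli/builtin/tools.py, which suggests the command that ran here was qiime tools export rather than qiime tools import. Export expects a QIIME 2 archive, not a raw FASTA file. QIIME 2 artifacts (.qza/.qzv) are zip archives, so a rough pre-check can be sketched in Python as below. The function names are hypothetical, and passing the check only shows that the file is a zip file, not that it is a valid artifact.

```python
import zipfile


def is_probably_qiime_archive(path):
    """Rough check based on the fact that .qza/.qzv files are zip archives.

    Hypothetical helper for illustration; True does not prove validity.
    """
    return zipfile.is_zipfile(path)


def suggest_tool(path):
    """Suggest which 'qiime tools' subcommand matches the file on disk."""
    if is_probably_qiime_archive(path):
        return "qiime tools export"  # archives get exported to raw files
    return "qiime tools import"      # raw files get imported to archives
```

With a plain dna-sequences.fasta, suggest_tool returns "qiime tools import", which matches the direction needed in this thread.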
Thank you for all your suggestions. I finally managed to import the files as sequences.qza and taxonomy.qza after installing the new version of QIIME 2 (qiime2-2023.7).
However, I still cannot merge the two files, which sit in separate sequence and taxonomy folders as shown below, into a single file containing both the sequences and the taxonomic information, so that I can run BLAST on Ubuntu.
Folder of sequences only:
(I had to show this one in FASTA format, but I have since converted it to .qza.)
However, qiime rescript merge-taxa has no --i-taxonomy parameter.
The merge-taxa command only involves --i-data and --o-merged-sequences.
Therefore, may I know of any other way to merge the sequences and taxonomy provided by the NCBI database, so that I can run BLAST using vsearch?
Thank you.
When you say merge the sequence and taxonomy artifacts, what you want to create is a fasta file wherein each sequence has its taxonomic assignment as its header, is that correct?
Yes, correct. vsearch needs a --db file to run the search command, but the database I got from NCBI using RESCRIPt came out as separate sequence and taxonomy artifacts.
So I have to merge the files into one FASTA file to act as the database, such as the example below, in order to run the command vsearch --usearch_global FILENAME --db FILENAME --id 0.97 --alnout FILENAME:
What is the link between your FeatureData[Sequence] and FeatureData[Taxonomy]? It seems like you still need to classify your sequences in some way, otherwise how do you know which sequences get which taxonomy headers, right? Let me know if I'm missing something.
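If the two artifacts do share feature IDs (that is, the same ID appears in the FASTA headers and in the first column of the taxonomy table), the merge described above can be sketched in Python as follows. This is only a sketch under that assumption, not a RESCRIPt or QIIME 2 feature. It assumes both artifacts were first exported with qiime tools export, giving a dna-sequences.fasta file and a tab-separated taxonomy.tsv whose header row starts with "Feature ID". The function names and the output file name are hypothetical.

```python
import csv


def read_taxonomy(tsv_path):
    """Map feature ID -> taxonomy string from a tab-separated table."""
    taxonomy = {}
    with open(tsv_path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) < 2 or row[0] == "Feature ID":
                continue  # skip the header row and malformed lines
            taxonomy[row[0]] = row[1]
    return taxonomy


def merge_fasta_with_taxonomy(fasta_path, tsv_path, out_path, sep=" "):
    """Write a FASTA whose headers read '>ID<sep>taxonomy'.

    Returns the list of sequence IDs that had no taxonomy entry;
    their headers are copied through unchanged.
    """
    taxonomy = read_taxonomy(tsv_path)
    missing = []
    with open(fasta_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith(">"):
                fields = line[1:].split()
                seq_id = fields[0] if fields else ""
                if seq_id in taxonomy:
                    dst.write(f">{seq_id}{sep}{taxonomy[seq_id]}\n")
                else:
                    missing.append(seq_id)
                    dst.write(line)
            else:
                dst.write(line)  # sequence lines are copied as-is
    return missing
```

Calling merge_fasta_with_taxonomy("dna-sequences.fasta", "taxonomy.tsv", "merged.fasta") would write the combined file and return any IDs that could not be matched. A non-empty return value is a sign that the two artifacts are not linked by ID the way this sketch assumes.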