Thank you so much, I already solved it. I have another question regarding taxonomy. I used Greengenes, and my problem is that approximately 43.271% of the reads are not assigned to family or genus. Is there a command to export the specific OTUs, with their sequences, for this taxon so that I can BLAST them? I attached a screenshot of the taxonomy. Thank you.
I am not 100% sure, but I think what you see there is the pooled set of hashes (feature IDs) that were all assigned to this taxonomy. You can open the taxonomy.tsv file (from taxonomy.qza) and find all the hashes assigned to this bacterium, then look up the corresponding representative sequences in dna-sequences.fasta (from rep-seqs.qza) by those hashes. After that you can BLAST them.
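The lookup described above can be sketched in plain Python on the exported files. This is only a minimal sketch: it assumes taxonomy.tsv has the tab-separated header columns `Feature ID` and `Taxon`, that the FASTA headers are the same hashes, and the function names (`ids_for_taxon`, `read_fasta`, `write_taxon_fasta`) are made up for illustration.

```python
import csv


def ids_for_taxon(taxonomy_tsv, taxon):
    """Return the feature IDs (hashes) whose Taxon string equals `taxon`.

    Assumes a tab-separated file with 'Feature ID' and 'Taxon' columns.
    """
    with open(taxonomy_tsv, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return {row["Feature ID"] for row in reader
                if row["Taxon"].strip() == taxon}


def read_fasta(fasta_path):
    """Yield (id, sequence) pairs; the id is the first word of the header."""
    seq_id, chunks = None, []
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if seq_id is not None:
                    yield seq_id, "".join(chunks)
                seq_id, chunks = line[1:].split()[0], []
            elif line:
                chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def write_taxon_fasta(taxonomy_tsv, fasta_path, taxon, out_path):
    """Write the sequences assigned to `taxon` to `out_path`; return the count."""
    wanted = ids_for_taxon(taxonomy_tsv, taxon)
    written = 0
    with open(out_path, "w") as out:
        for seq_id, seq in read_fasta(fasta_path):
            if seq_id in wanted:
                out.write(f">{seq_id}\n{seq}\n")
                written += 1
    return written
```

The `taxon` argument has to match the full taxonomy string exactly as it appears in taxonomy.tsv, including any trailing empty ranks, so it is safest to copy it straight from the file.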
Unfortunately, that does not work. Is there no other way to solve this problem?
Note: you are not getting family-level classification either because there is no perfect match in the database or because there are too many matches with different family-level taxonomies. Using BLAST to find the top hit can give you some more clarity on what this might be, but I would discourage you from relying on that BLAST result, especially for publication/reporting.
This is how I export the ASV sequences for a specific taxon using the Python API:
```python
import pandas as pd
from qiime2 import Artifact


def taxon2fasta(taxonomy, sequences, taxon, path):
    '''
    taxonomy is an artifact of type FeatureData[Taxonomy]
    sequences is an artifact of type FeatureData[Sequence]
    taxon is the annotated taxon we are interested in (input string)
    path is where to export the fasta files (input string)
    '''
    # convert FeatureData[Taxonomy] to a pandas DataFrame
    df_taxon = taxonomy.view(pd.DataFrame)
    # keep only the ASVs that were annotated as 'taxon'
    df_taxon = df_taxon.loc[df_taxon['Taxon'] == taxon]
    # convert FeatureData[Sequence] to a pandas Series
    ser = sequences.view(pd.Series)
    # keep only the sequences that were annotated as 'taxon'
    ser_taxon = ser[df_taxon.index]
    # convert the filtered sequences back to an artifact
    taxon_seq = Artifact.import_data('FeatureData[Sequence]', ser_taxon)
    # export the fasta files to the given path
    taxon_seq.export_data(path)
```
You can load your .qza files with `Artifact` like this:

```python
seqs = Artifact.load('path_of_your_qza_file.qza')
```
Hope this helps.
By the way, @Nicholas_Bokulich, can you explain more about why you discourage relying on the BLAST result, or is there a reference for it?
Most microbiome sequencing methods rely on rather short DNA fragments (e.g., the V4 domain of the 16S rRNA gene), which contain limited taxonomic information on their own. This is very frequently insufficient to truly classify to species level, and using NCBI BLAST can provide misleading results if you BLAST, take the top hit, and move on without ensuring that there are not other equally (or nearly) good hits as well. BLAST is fine if you carefully consider the other hits: LCA methods, like the classify-consensus-* methods in QIIME 2’s q2-feature-classifier, use BLAST or other aligners for database searching, but then consider what taxonomic consensus there is among the top hits to determine, e.g., whether multiple species are hit and whether that sequence can truly be classified to species level.
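To make the consensus idea concrete, here is a simplified illustration of it in Python. It is not the actual q2-feature-classifier implementation: the function name `consensus_taxonomy`, the `min_consensus` threshold, and the input format (one list of rank labels per hit) are assumptions made for this sketch.

```python
from collections import Counter


def consensus_taxonomy(hit_lineages, min_consensus=0.51):
    """Return the deepest lineage shared by enough of the top hits.

    hit_lineages: list of lineages, each a list of rank labels ordered
        from the highest rank down (illustrative format).
    min_consensus: fraction of all hits that must agree at a rank
        (hypothetical parameter for this sketch).
    """
    consensus = []
    if not hit_lineages:
        return consensus
    total = len(hit_lineages)
    depth = max(len(lineage) for lineage in hit_lineages)
    for rank in range(depth):
        # count the full lineage prefix down to this rank, so that equal
        # labels under different parents are not merged
        prefixes = Counter(tuple(lineage[:rank + 1])
                           for lineage in hit_lineages
                           if len(lineage) > rank)
        prefix, count = prefixes.most_common(1)[0]
        if count / total < min_consensus:
            break  # the hits disagree too much; stop at the previous rank
        consensus = list(prefix)
    return consensus


hits = [
    ["k__Bacteria", "f__Lachnospiraceae", "g__Blautia"],
    ["k__Bacteria", "f__Lachnospiraceae", "g__Roseburia"],
    ["k__Bacteria", "f__Lachnospiraceae", "g__Blautia"],
    ["k__Bacteria", "f__Ruminococcaceae", "g__Ruminococcus"],
]
print(consensus_taxonomy(hits))
```

With these four made-up hits the family is shared by three of four hits, but no genus reaches the threshold, so the result stops at family level. Taking only the top hit would have reported g__Blautia with no indication that the other hits disagree.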
Thanks for your prompt and detailed explanation!!