Hi!
I am working on a dataset for which I received processed tables (after DADA2 and taxonomic annotation) from a collaborator: an ASV feature table and a taxa table from phyloseq. Both were imported into QIIME 2 fairly easily:
import pandas as pd
import qiime2 as q2
asv_table_q2 = q2.Artifact.import_data(type="FeatureTable[Frequency]", view=asv_table.T)
taxa_master = taxa_master.fillna("")
taxa_master["Taxon"] = taxa_master.apply(
    lambda x: f"k__{x['Kingdom']};p__{x['Phylum']};c__{x['Class']};o__{x['Order']};f__{x['Family']};g__{x['Genus']};s__{x['Species']}",
    axis=1,
)
taxa_master = taxa_master["Taxon"]
taxa_master = taxa_master.rename_axis("Feature ID")
taxa_master_q2 = q2.Artifact.import_data("FeatureData[Taxonomy]", taxa_master)
In addition, I want to create a FeatureData[Sequence] artifact derived from taxa_master so that I can build a phylogenetic tree. The index of taxa_master is the 16S sequence from DADA2, and its columns hold taxonomic information (not needed for the tree).
I tried creating a table whose index and single column both hold the sequence, as follows:
sequence_table = pd.DataFrame(taxa_master).copy()
sequence_table["Sequence"] = sequence_table.index
sequence_table = sequence_table.drop("Taxon", axis=1)
I then tried to create a FeatureData[Sequence] artifact from either the DataFrame or a TSV file written from it, without success.
This is the error from using the dataframe:
No transformation from <class 'pandas.core.frame.DataFrame'> to <class 'q2_types.feature_data._format.DNASequencesDirectoryFormat'>
And this is the error from using the TSV file:
First line of file is not a valid description. Descriptions must start with '>'
What would be a workaround here to create the desired artifact?
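One direction I am considering, based on the second error message: the importer seems to expect FASTA (header lines starting with '>') rather than TSV. Below is a minimal sketch of writing the sequences out as FASTA, assuming each sequence can also serve as its own feature ID so that the IDs keep matching the feature table and taxonomy. The helper name write_fasta and the file name asv_seqs.fasta are hypothetical, and I have not confirmed that this is the recommended route.

```python
def write_fasta(sequences, path):
    """Write each sequence as a FASTA record whose ID is the sequence itself.

    Hypothetical helper: using the sequence as its own ID is an assumption,
    chosen so the IDs match the index of the existing feature/taxonomy tables.
    Returns the list of IDs written, in order.
    """
    ids = []
    with open(path, "w") as fh:
        for seq in sequences:
            seq = str(seq)
            # FASTA record: a '>' description line followed by the sequence line
            fh.write(f">{seq}\n{seq}\n")
            ids.append(seq)
    return ids


# Intended usage (not run here; assumes taxa_master and q2 from the code above):
# write_fasta(taxa_master.index, "asv_seqs.fasta")
# seqs_q2 = q2.Artifact.import_data("FeatureData[Sequence]", "asv_seqs.fasta")
```

If long sequence IDs turn out to be a problem downstream, the IDs would presumably have to be replaced consistently in all three artifacts, which this sketch does not attempt.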
Thanks!!