Question about error: no transformation from DataFrame to directory format


(John Chase) #1

I will try to be concise here, so please let me know if I inadvertently miss anything important.

I am developing a plugin for QIIME2 and have been playing around with developing a new semantic type to store some of the data generated. When I run the plugin with the new type I get this error:

Exception: No transformation from <class 'pandas.core.frame.DataFrame'> to <class 'qiime2.plugin.model.directory_format.FactorsDirFmt'>

I would be quite impressed if there were a transformation class for a pandas DataFrame, so the error itself is not surprising. However, I don’t understand at what point, or why, QIIME is trying to convert a DataFrame into the FactorsDirFmt.

I have an artifact definition:

Factors = SemanticType('Factors', variant_of=FeatureData.field['type'])


class FactorsFormat(model.TextFileFormat):
    # Update with required formatting
    def validate(*args):
        pass


FactorsDirFmt = model.SingleFileDirectoryFormat(
    'FactorsDirFmt', 'factors.tsv', FactorsFormat)

And the transformers:

@plugin.register_transformer
def _1(ff: FactorsFormat) -> pd.DataFrame:
    return pd.read_csv(str(ff), sep='\t')

@plugin.register_transformer
def _2(df: pd.DataFrame) -> FactorsFormat:
    ff = FactorsFormat()
    df.to_csv(str(ff), sep='\t', header=True, index=True)
    return ff

In plugin_setup.py I register the formats:

plugin.register_formats(FactorsFormat, FactorsDirFmt)
plugin.register_semantic_types(Factors)
plugin.register_semantic_type_to_format(FeatureData[Factors], FactorsDirFmt)

And in the function registration I have the return type set as FeatureData[Factors]:

outputs=[
    ('featuretable', FeatureTable[Frequency]),
    ('tree', Phylogeny[Unrooted]),
    ('factors', FeatureData[Factors])
]

The function itself is annotated to return a pandas DataFrame:

def phylofactor(table: biom.Table,
                phylogeny: NewickFormat,
                metadata: Metadata,
                family: str,
                formula: str = 'Data ~ X',
                choice: str = 'F',
                nfactors: int = 10,
                ncores: int = 1
                ) -> (biom.Table, skbio.tree.TreeNode, pd.DataFrame):

My understanding was that the DataFrame returned from phylofactor would be converted to a FeatureData[Factors] based on the plugin registration, which defines the output, but this does not seem to be happening for the pandas DataFrame. I am stuck finding where exactly QIIME is attempting to convert the DataFrame to a FactorsDirFmt. Thank you for your help!


(Evan Bolyen) #2

Hey @John_Chase,

The issue is that the module containing your transformers is never imported by the module your entry point loads (plugin_setup.py). So the @plugin.register_transformer decorators never actually get executed (and thus your transformers don’t really exist in the plugin).

Adding:

importlib.import_module('q2_phylofactor._transform')

to the bottom of plugin_setup.py should fix it!
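
To illustrate why the import matters, here is a minimal sketch that does not use QIIME 2 at all. It assumes a toy registry and two hypothetical modules, toy_plugin_setup and toy_transform (both names made up for this demo, written to a temporary directory), standing in for plugin_setup.py and the transformer file. A decorator only registers its function when the module containing it is actually executed, which happens on import:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for plugin_setup.py: owns a toy registry.
# This is NOT the QIIME 2 API, just the same decorator pattern.
SETUP_SRC = '''
REGISTRY = {}

def register_transformer(func):
    """Store func under its (input type, output type) annotations."""
    hints = dict(func.__annotations__)
    target = hints.pop("return")
    (source,) = hints.values()
    REGISTRY[(source, target)] = func
    return func
'''

# Hypothetical stand-in for the transformer module.
TRANSFORM_SRC = '''
from toy_plugin_setup import register_transformer

@register_transformer
def _1(value: int) -> str:
    return str(value)
'''

# Write both modules to a temporary directory and make it importable.
tmp = tempfile.mkdtemp()
Path(tmp, "toy_plugin_setup.py").write_text(SETUP_SRC)
Path(tmp, "toy_transform.py").write_text(TRANSFORM_SRC)
sys.path.insert(0, tmp)
importlib.invalidate_caches()

# Loading only the setup module leaves the registry empty,
# because the decorated function has never been executed.
setup = importlib.import_module("toy_plugin_setup")
before = dict(setup.REGISTRY)

# Importing the transformer module runs the decorator and fills the registry.
importlib.import_module("toy_transform")
after = dict(setup.REGISTRY)

print("before import:", before)
print("after import: ", sorted(k[0].__name__ + "->" + k[1].__name__ for k in after))
```

In the toy version, a lookup for (int, str) before the second import would find nothing, which is roughly the situation behind the "No transformation from ..." error.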


(John Chase) #3

Oh my gosh. I had actually run into this same problem a while back and fixed it with your solution, and then apparently forgot all of that until now…

Should it just be importlib.import_module('q2_phylofactor._transform') without the .py?

Thanks for your response!


(Evan Bolyen) #4

No problem!

And yes, it should. Edited above, thanks!
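
For reference, importlib.import_module takes a dotted module name rather than a filename, so a trailing .py gets parsed as a submodule called "py". A small sketch using the standard library's json.decoder module (chosen only as a convenient example) shows the difference:

```python
import importlib

# A dotted module name works as expected.
decoder = importlib.import_module("json.decoder")

# With ".py" appended, Python looks for a submodule named "py"
# inside json.decoder, which is not a package, so the import fails.
try:
    importlib.import_module("json.decoder.py")
    failed = False
except ModuleNotFoundError:
    failed = True

print(decoder.__name__, failed)
```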