Artifact API Import Error for Series of Sequences

Hi all,

I recently upgraded to qiime2-2020.6. I have some old code from qiime2-2020.2 where I was importing some test data:

import pandas as pd
from skbio import DNA
from qiime2 import Artifact

region1_db_seqs = Artifact.import_data('FeatureData[Sequence]', pd.Series({
    'seq1|seq2': DNA('GCGAAGCGGCTCAGG', metadata={'id': 'seq1|seq2'}),
    'seq3@0001': DNA('ATCCGCGTTGGAGTT',  metadata={'id': 'seq3@0001'}),
    'seq3@0002': DNA('TTCCGCGTTGGAGTT', metadata={'id': 'seq3@0002'}),
    'seq5': DNA('CGTTTATGTATGCCC', metadata={'id': 'seq5'}),
    'seq6': DNA('CGTTTATGTATGCCT', metadata={'id': 'seq6'}), 
    }))

In the qiime2-2020.2 environment, it works fine.

I created a new environment, and there I get a TypeError:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-24-cb84a5d4a004> in <module>
      4     'seq3@0002': DNA('TTCCGCGTTGGAGTT', metadata={'id': 'seq3@0002'}),
      5     'seq5': DNA('CGTTTATGTATGCCC', metadata={'id': 'seq5'}),
----> 6     'seq6': DNA('CGTTTATGTATGCCT', metadata={'id': 'seq6'}),
      7     }))

~/miniconda3/envs/test/lib/python3.6/site-packages/qiime2/sdk/result.py in import_data(cls, type, view, view_type)
    239         provenance_capture = archive.ImportProvenanceCapture(format_, md5sums)
    240         return cls._from_view(type_, view, view_type, provenance_capture,
--> 241                               validate_level='max')
    242 
    243     @classmethod

~/miniconda3/envs/test/lib/python3.6/site-packages/qiime2/sdk/result.py in _from_view(cls, type, view, view_type, provenance_capture, validate_level)
    252                 % type)
    253 
--> 254         pm = qiime2.sdk.PluginManager()
    255         output_dir_fmt = pm.get_directory_format(type)
    256 

~/miniconda3/envs/test/lib/python3.6/site-packages/qiime2/sdk/plugin_manager.py in __new__(cls, add_plugins)
     52         if cls.__instance is None:
     53             self = super().__new__(cls)
---> 54             self._init(add_plugins=add_plugins)
     55             cls.__instance = self
     56         else:

~/miniconda3/envs/test/lib/python3.6/site-packages/qiime2/sdk/plugin_manager.py in _init(self, add_plugins)
     81                 plugin = entry_point.load()
     82 
---> 83                 self.add_plugin(plugin, package, project_name)
     84 
     85     def add_plugin(self, plugin, package=None, project_name=None):

~/miniconda3/envs/test/lib/python3.6/site-packages/qiime2/sdk/plugin_manager.py in add_plugin(self, plugin, package, project_name)
    101                 'for `project_name` or set `plugin.project_name`.')
    102 
--> 103         self._integrate_plugin(plugin)
    104         plugin.freeze()
    105 

~/miniconda3/envs/test/lib/python3.6/site-packages/qiime2/sdk/plugin_manager.py in _integrate_plugin(self, plugin)
    135             if output in self.transformers[input]:
    136                 raise ValueError("Transformer from %r to %r already exists."
--> 137                                  % transformer_record)
    138             self.transformers[input][output] = transformer_record
    139             self._reverse_transformers[output][input] = transformer_record

TypeError: not all arguments converted during string formatting

I’ve also tried doing

region1_db_seqs = Artifact.import_data('FeatureData[Sequence]', pd.Series({
    'seq1|seq2': 'GCGAAGCGGCTCAGG',
    'seq3@0001': 'ATCCGCGTTGGAGTT',
    'seq3@0002': 'TTCCGCGTTGGAGTT',
    'seq5': 'CGTTTATGTATGCCC',
    'seq6': 'CGTTTATGTATGCCT',
    }))

Which throws the same error in the new version.

So, I’m not sure what’s going on. I need this to work with the new version, and I’m not sure where to go from here.

Thanks!
Justine

Hi @jwdebelius,

I just tried this in my 2020.6 env and I got it to work by adding skbio. in front of DNA.

That is:

In [1]: import pandas as pd
   ...: import skbio 
   ...: from qiime2 import Artifact 
   ...:                                                                                                                

In [2]: region1_db_seqs = Artifact.import_data('FeatureData[Sequence]', pd.Series({ 
   ...:     'seq1|seq2': DNA('GCGAAGCGGCTCAGG', metadata={'id': 'seq1|seq2'}), 
   ...:     'seq3@0001': DNA('ATCCGCGTTGGAGTT',  metadata={'id': 'seq3@0001'}), 
   ...:     'seq3@0002': DNA('TTCCGCGTTGGAGTT', metadata={'id': 'seq3@0002'}), 
   ...:     'seq5': DNA('CGTTTATGTATGCCC', metadata={'id': 'seq5'}), 
   ...:     'seq6': DNA('CGTTTATGTATGCCT', metadata={'id': 'seq6'}),  
   ...:     }))                                                                                                        
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-2-cb84a5d4a004> in <module>
      1 region1_db_seqs = Artifact.import_data('FeatureData[Sequence]', pd.Series({
----> 2     'seq1|seq2': DNA('GCGAAGCGGCTCAGG', metadata={'id': 'seq1|seq2'}),
      3     'seq3@0001': DNA('ATCCGCGTTGGAGTT',  metadata={'id': 'seq3@0001'}),
      4     'seq3@0002': DNA('TTCCGCGTTGGAGTT', metadata={'id': 'seq3@0002'}),
      5     'seq5': DNA('CGTTTATGTATGCCC', metadata={'id': 'seq5'}),

NameError: name 'DNA' is not defined

In [3]: region1_db_seqs = Artifact.import_data('FeatureData[Sequence]', pd.Series({ 
   ...:     'seq1|seq2': skbio.DNA('GCGAAGCGGCTCAGG', metadata={'id': 'seq1|seq2'}), 
   ...:     'seq3@0001': skbio.DNA('ATCCGCGTTGGAGTT',  metadata={'id': 'seq3@0001'}), 
   ...:     'seq3@0002': skbio.DNA('TTCCGCGTTGGAGTT', metadata={'id': 'seq3@0002'}), 
   ...:     'seq5': skbio.DNA('CGTTTATGTATGCCC', metadata={'id': 'seq5'}), 
   ...:     'seq6': skbio.DNA('CGTTTATGTATGCCT', metadata={'id': 'seq6'}),  
   ...:     }))                                                                                                        

In [4]: region1_db_seqs                                                                                                
Out[4]: <artifact: FeatureData[Sequence] uuid: 5975c241-ebdb-42b9-9e78-aa5c5efa4510>

In [5]: region1_db_seqs.view(pd.Series)                                                                               
Out[5]: 
seq1|seq2    (((G)), ((C)), ((G)), ((A)), ((A)), ((G)), ((C...
seq3@0001    (((A)), ((T)), ((C)), ((C)), ((G)), ((C)), ((G...
seq3@0002    (((T)), ((T)), ((C)), ((C)), ((G)), ((C)), ((G...
seq5         (((C)), ((G)), ((T)), ((T)), ((T)), ((A)), ((T...
seq6         (((C)), ((G)), ((T)), ((T)), ((T)), ((A)), ((T...
dtype: object

Hi @SoilRotifer,

Thanks! It’s a different error from the one I’m getting. (Sorry, I mis-copied the imports; I’ll fix those above!)

Yeah, I forgot to comment on that bit. I am wondering if there is something else odd in your environment?

I have sparse and dask installed, and I moved numba back to 0.48. I can refactor to do without sparse, but I need dask.

So, maybe the refactor is the answer. Again. :roll_eyes: :slightly_smiling_face: :computer:

Hey @jwdebelius!

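This part of the first traceback jumped out at me: the raise ValueError("Transformer from %r to %r already exists." ...) call in _integrate_plugin.

That bit of the framework checks for duplicated transformers, and according to the traceback it found one. What plugins do you have in this env?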

@Oddant1 can you look into this unrelated part of the error message - looks like the "transformer already exists" string interpolation might be broken (I suspect related to last fall's plugin manager project).
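
For anyone wondering why a TypeError shows up instead of the intended ValueError, here is a minimal sketch of what may be happening. It assumes the transformer record is a namedtuple with more than two fields; the field names and values below are hypothetical stand-ins, not taken from the framework source. Because a namedtuple is a tuple, the % operator treats each field as a separate format argument, so a two-placeholder message fails before the ValueError can be raised.

```python
from collections import namedtuple

# Hypothetical stand-in for the framework's transformer record.
# The field names are an assumption made for illustration only.
TransformerRecord = namedtuple(
    "TransformerRecord", ["transformer", "plugin", "citations"])

record = TransformerRecord(transformer="t", plugin="my-plugin", citations=())

message = "Transformer from %r to %r already exists."

try:
    # Three tuple fields are offered to two %r placeholders, so the
    # formatting itself raises TypeError and hides the intended message.
    message % record
except TypeError as err:
    masked_error = str(err)


def format_duplicate_message(input_type, output_type):
    """Format the message with exactly the two values it expects."""
    return message % (input_type, output_type)


print(masked_error)
print(format_duplicate_message(str, int))
```

Passing an explicit two-element tuple, as in format_duplicate_message, is one way the message could be made to render; whether the framework fixes it that way is not something this thread confirms.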

It’s a standard QIIME 2 install with the plugin I’m developing installed. I’ve added several new transformers for new semantic types I’m developing (mostly back and forth between pandas and metadata) and then a transformer to go from aligned sequences to a series.

The code isn’t up on GitHub because I hit the error while trying to make sure my Travis testing would work before I made the commit, and I only have about 3 Travis builds left.

Best,
Justine

Cool, yeah, I would double-check the transformers you’ve registered - transformers are global, maybe you accidentally re-registered one from q2-types.
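
To illustrate what "transformers are global" means in practice, here is a rough sketch of a registry keyed by (input type, output type). The TransformerRegistry class, the plugin names, and the type names (plain strings here) are all hypothetical; this is not the QIIME 2 API, only a toy model of the duplicate check seen in the traceback.

```python
from collections import defaultdict


class TransformerRegistry:
    """Toy global registry; a hypothetical stand-in, not the QIIME 2 API."""

    def __init__(self):
        # Maps input type -> {output type -> (function, plugin name)}.
        self._transformers = defaultdict(dict)

    def register(self, input_type, output_type, func, plugin_name):
        existing = self._transformers[input_type].get(output_type)
        if existing is not None:
            # Only one transformer per (input, output) pair is allowed,
            # regardless of which plugin tries to register it.
            raise ValueError(
                "Transformer from %r to %r already exists (registered by %r)."
                % (input_type, output_type, existing[1]))
        self._transformers[input_type][output_type] = (func, plugin_name)


registry = TransformerRegistry()

# Placeholder type names, written as strings to keep the sketch
# free of third-party imports.
registry.register("AlignedSeqsFormat", "pd.Series", lambda x: x, "core-types")

try:
    # A second plugin registering the same pair collides with the first.
    registry.register("AlignedSeqsFormat", "pd.Series", lambda x: x, "my-plugin")
except ValueError as err:
    duplicate_error = str(err)

print(duplicate_error)
```

In this toy model the fix is simply to drop the second registration and rely on the one that already exists, which is the kind of thing to look for when a newer release adds a transformer your plugin also defines.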

You guys added a way to view aligned sequences as a series without me updating! Cool.
