Load supplemental material #50

Open
Gitiauxx opened this issue Oct 10, 2018 · 3 comments

Comments

@Gitiauxx

Gitiauxx commented Oct 10, 2018

When building objects from a pickled file (using the from_dict method), we need to make sure the method knows which attribute the reconstructed object corresponds to.

For example, if we reconstruct a RandomForestRegressor() model, we need a line that says

obj.model = unpickled_content

To automate the process, we add a key 'object_name' for each element of the supplemental_objects dictionary and then at construction time (i.e. in from_dict) use the following code:

for supplement in d['supplemental_objects']:
    name = supplement['object_name']
    content = supplement['content']
    setattr(obj, name, content)

This does require some changes to the modelmanager code (load_supplemental_object and save_supplemental_object).
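A minimal, self-contained sketch of the idea above, not the actual modelmanager code. `Template` is a hypothetical stand-in for a step template class, and the dictionary layout follows the proposal, with `object_name` naming the attribute to set. The pickle round trip happens in memory only to keep the example runnable, and a plain dict stands in for a fitted model.

```python
import pickle


class Template:
    """Hypothetical stand-in for a step template class."""

    def __init__(self):
        self.model = None

    @classmethod
    def from_dict(cls, d):
        obj = cls()
        # Attach each supplemental object to the attribute it names.
        for supplement in d.get('supplemental_objects', []):
            name = supplement['object_name']
            content = supplement['content']
            setattr(obj, name, content)
        return obj


# Stand-in for a fitted model; any picklable object works here.
fitted = {'n_estimators': 10}
unpickled_content = pickle.loads(pickle.dumps(fitted))

d = {'supplemental_objects': [{
    'object_name': 'model',
    'content_type': 'pickle',
    'content': unpickled_content,
    'required': True}]}

obj = Template.from_dict(d)
```

With this layout the template never needs an explicit `obj.model = ...` line, at the cost of storing an attribute name alongside the saved object.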

@Gitiauxx
Author

In the xavier_testing branch, I went ahead and implemented the above idea. It seems to work.

@smmaurer
Member

It sounds like this is the essence of the difference. Am I understanding it right?

Version A (current):

'supplemental_objects': [{
    'name': 'model-object',  # semantic name, for identification + use in filename
    'content_type': 'pickle',
    'content': obj,
    'required': True }, ... ]

Version B (proposed):

'supplemental_objects': [{
    'name': 'model-object',  # drop this?
    'object_name': 'model',  # name corresponding to template property
    'content_type': 'pickle',
    'content': obj,
    'required': True }, ... ]

The advantage of B is that we can automatically load the supplemental object without specifying what to do with it in the template code.

But I kind of like how version A lets us keep the yaml file (and supplemental object filename) distinct from implementation details of the template class. It also leaves the door open to templates doing something with supplemental objects other than saving them directly as a top-level class property, which seems valuable.

So I'd lean toward keeping version A. But am I missing other tradeoffs?
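One way the flexibility of Version A might look in template code, as a rough sketch only. The `Template` class, the `feature_names` attribute, and the `'column-list'` entry are hypothetical names made up for illustration; only `'model-object'` and the dictionary keys come from the listing above.

```python
class Template:
    """Hypothetical template that decides itself how to use each object."""

    def __init__(self):
        self.model = None
        self.feature_names = None

    @classmethod
    def from_dict(cls, d):
        obj = cls()
        # Index the supplemental objects by their semantic name.
        by_name = {s['name']: s['content'] for s in d['supplemental_objects']}
        obj.model = by_name['model-object']
        # An object does not have to become a top-level property as-is;
        # here a hypothetical 'column-list' entry is converted to a tuple.
        if 'column-list' in by_name:
            obj.feature_names = tuple(by_name['column-list'])
        return obj


d = {'supplemental_objects': [
    {'name': 'model-object', 'content_type': 'pickle',
     'content': {'n_estimators': 10}, 'required': True},
    {'name': 'column-list', 'content_type': 'pickle',
     'content': ['age', 'income'], 'required': False}]}

obj = Template.from_dict(d)
```

Here the saved dictionary carries only semantic names, and the mapping from names to attributes lives entirely in the template.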

@Gitiauxx
Author

I see your point about keeping implementation details out of the yaml file. That makes sense.

One potential downside of A is that, with multiple items in the supplemental objects list, it becomes harder to know what to do with each item (which property does it correspond to, or what else is it supposed to represent?).

Right now, going with A, I just write
obj.model = content
in the step template, where content comes from the only file I unpickle. But that would not work well with multiple supplemental objects.

An alternative, while keeping A, might be to add a tag to the content before pickling it, so that the tag tells us what the content is supposed to be.
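A rough sketch of that tagging alternative, under the assumption that the tag is stored in a small wrapper dict pickled together with the content. `dump_tagged` and `load_tagged` are hypothetical helpers, not part of modelmanager, and the tag names are placeholders.

```python
import pickle


def dump_tagged(tag, content):
    """Pickle content together with a tag describing what it is."""
    return pickle.dumps({'tag': tag, 'content': content})


def load_tagged(raw):
    """Return (tag, content) from bytes produced by dump_tagged."""
    wrapper = pickle.loads(raw)
    return wrapper['tag'], wrapper['content']


# Two pickled payloads, standing in for two supplemental files.
raw_files = [dump_tagged('model', {'n_estimators': 10}),
             dump_tagged('scaler', [0.0, 1.0])]

# After unpickling, the tag tells the template what each object is.
loaded = dict(load_tagged(raw) for raw in raw_files)
```

The template can then look up `loaded['model']` regardless of the order or filenames of the supplemental objects, while the yaml side stays as in Version A.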
