Load supplemental material #50

Gitiauxx · 2018-10-10T17:19:04Z

When building objects from a pickled file (using the from_dict method), we need to make sure the method knows which attribute the reconstructed object corresponds to.

For example, if we reconstruct a RandomForestRegressor() model, we need a line that says

obj.model = unpickled_content

To automate the process, we add a key 'object_name' for each element of the supplemental_objects dictionary and then at construction time (i.e. in from_dict) use the following code:

for supplement in d['supplemental_objects']:
    name = supplement['object_name']
    content = supplement['content']
    setattr(obj, name, content)

It does require some tweaking in the modelmanager code (load_supplemental_object and save_supplemental_object).
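To make the idea concrete, here is a minimal sketch of the round trip. `DemoStep` is a hypothetical stand-in for a template class, and the dictionary layout only mirrors the keys discussed in this issue; none of this is the actual modelmanager code.

```python
class DemoStep:
    """Hypothetical template class, used only to illustrate the proposal."""

    def __init__(self):
        self.model = None

    def to_dict(self):
        # 'object_name' records which attribute the content belongs to.
        return {
            'supplemental_objects': [{
                'name': 'model-object',
                'object_name': 'model',
                'content_type': 'pickle',
                'content': self.model,
                'required': True,
            }]
        }

    @classmethod
    def from_dict(cls, d):
        obj = cls()
        # Restore each supplemental object onto the attribute it names.
        for supplement in d.get('supplemental_objects', []):
            setattr(obj, supplement['object_name'], supplement['content'])
        return obj
```

With this in place, from_dict needs no per-attribute assignment such as obj.model = content; the mapping travels with the saved dictionary.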


Gitiauxx · 2018-10-10T17:49:28Z

In the xavier_testing branch, I went ahead and implemented the above idea. It seems to work.

smmaurer · 2018-10-18T19:29:10Z

It sounds like this is the essence of the difference. Am I understanding it right?

Version A (current):

'supplemental_objects': [{
    'name': 'model-object',  # semantic name, for identification + use in filename
    'content_type': 'pickle',
    'content': obj,
    'required': True }, ... ]

Version B (proposed):

'supplemental_objects': [{
    'name': 'model-object',  # drop this?
    'object_name': 'model',  # name corresponding to template property
    'content_type': 'pickle',
    'content': obj,
    'required': True }, ... ]

The advantage of B is that we can automatically load the supplemental object without specifying what to do with it in the template code.

But I kind of like how version A lets us keep the yaml file (and supplemental object filename) distinct from implementation details of the template class. It also leaves the door open to templates doing something with supplemental objects other than saving them directly as a top-level class property, which seems valuable.

So I'd lean toward keeping version A. But am I missing other tradeoffs?
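As a rough sketch of what version A leaves open: the template can look items up by their semantic name and decide for itself what to do with each one. The `VersionAStep` class, the `supplements_by_name` helper, and the 'feature-list' item are all hypothetical and exist only for illustration.

```python
def supplements_by_name(d):
    """Index the supplemental objects list by its semantic 'name' key."""
    return {s['name']: s['content'] for s in d.get('supplemental_objects', [])}


class VersionAStep:
    """Hypothetical template class; the name-to-attribute mapping lives here."""

    def __init__(self):
        self.model = None
        self.feature_names = []

    @classmethod
    def from_dict(cls, d):
        obj = cls()
        contents = supplements_by_name(d)
        # Stored directly as a top-level property.
        obj.model = contents['model-object']
        # Transformed before storing, which a plain setattr would not allow.
        obj.feature_names = sorted(contents.get('feature-list', []))
        return obj
```

The cost is that every template has to spell out this mapping by hand, which is the automation version B would provide.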

Gitiauxx · 2018-10-19T12:12:12Z

I see your point about keeping implementation details out of the yaml file. That makes sense.

One potential trade-off is that if we have multiple items in the supplemental objects list, it may be harder to know what to do with each item (which property does it correspond to, or what else is it supposed to represent?).

Right now, going with A, I just write
obj.model = content
in the step template, where content comes from the only file I unpickle. But that would not work well if we have multiple supplemental objects.

An alternative, while keeping A, might be to add a tag in content before pickling it that will tell us what that content is supposed to be.
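One way the tagging alternative might look, as a sketch only: wrap the content in a small dictionary before pickling, so the unpickled result says what it is. The 'role' and 'payload' keys and the `pack` / `unpack` helpers are hypothetical names, not anything that exists in modelmanager.

```python
import pickle


def pack(role, payload):
    """Pickle the payload together with a tag describing what it is."""
    return pickle.dumps({'role': role, 'payload': payload})


def unpack(blob):
    """Unpickle a tagged blob and return (role, payload)."""
    tagged = pickle.loads(blob)
    return tagged['role'], tagged['payload']


# The template can then dispatch on the tag instead of on list position.
role, payload = unpack(pack('model', [1, 2, 3]))
```

The yaml entry keeps its version A shape, while the information about what the content represents moves inside the pickled file itself.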

smmaurer mentioned this issue Oct 18, 2018

Template for random forest models #43

Open

smmaurer mentioned this issue Oct 18, 2018

Add Random Forest Step #46

Draft
