Added SER recipe creation #192
This pull request has been linked to Shortcut Story #74814: Add the ability to create a Standalone Evaluation Recipe using the API.
LGTM apart from the typo in pydoc
Nice. Let's go a bit further on doc.
dataikuapi/dss/recipe.py
Outdated
# Add the newly created json payload to the recipe settings
# Note that with this method, all the settings that were not explicitly set are instead set to their default value.
# e.g. there is an empty cost matrix setting for this recipe because 'cost matrix' is not defined in the payload
So, I would give an example in this doc of setting the costMatrix to the default values.
I would also fix the ER recipe documentation at line 1374 of dataikuapi/dss/recipe.py:
builder = project.new_managed_dataset("output_scored")
Should be:
builder = project.new_managed_dataset("output_metrics")
ser_payload['metricParams'] = dict(costMatrixWeights=dict(tpGain=0.4, fpGain=-1.0, tnGain=0.2, fnGain=-0.5))

# Add the newly created json payload to the recipe settings and save the recipe
# Note that with this method, all the settings that were not explicitly set are instead set to their default value.
I don't have this behaviour for the cost matrix: the values are simply deleted in the SER if I remove the metricParams key from ser_payload.
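Since an omitted metricParams key is apparently not filled in with defaults, the cost matrix weights would have to be written explicitly. Here is a minimal sketch with plain dictionaries that needs no DSS instance. `set_cost_matrix` is a hypothetical helper, not part of dataikuapi, the starting payload is an assumption, and the weights are the ones from the diff hunk above:

```python
def set_cost_matrix(payload, tp_gain, fp_gain, tn_gain, fn_gain):
    """Return a copy of `payload` with metricParams.costMatrixWeights set.

    Hypothetical helper for illustration only; the key names follow the
    diff hunk above (metricParams / costMatrixWeights).
    """
    result = dict(payload)
    # Preserve any other metric parameters that are already present.
    metric_params = dict(result.get("metricParams") or {})
    metric_params["costMatrixWeights"] = {
        "tpGain": tp_gain,
        "fpGain": fp_gain,
        "tnGain": tn_gain,
        "fnGain": fn_gain,
    }
    result["metricParams"] = metric_params
    return result


# Assumed starting payload with no metricParams key at all.
payload = {"predictionType": "BINARY_CLASSIFICATION"}
with_weights = set_cost_matrix(payload, 0.4, -1.0, 0.2, -0.5)
print(with_weights["metricParams"]["costMatrixWeights"])
```

The helper returns a copy rather than mutating its argument, so the original payload can still be compared against the result before anything is saved.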
ser_settings = new_recipe.get_settings()

ser_settings.set_json_payload(ser_payload)
ser_settings.save()
Sorry, one more thing I've just noticed. When you update the settings this way in the notebook and then open the recipe in the GUI, it is in a dirty state and needs to be saved.
After digging into that item, we found the same behaviour with other recipes (evaluation, prepare). For those, however, it was due to the labels, which is a separate subject.
For the SER, the issue comes from completely replacing the settings dict with a new one that does not have the autoOptimizeThreshold key. In the GUI, this key is added to the recipe desc, and therefore the recipe is considered changed.
To avoid that, and to be consistent with other examples, we suggest not putting the dict replacement in the documentation, and replacing it with an update of the existing dict:
ser_settings = new_recipe.get_settings()
ser_json_payload = ser_settings.get_json_payload()
ser_json_payload['predictionType'] = "BINARY_CLASSIFICATION"
ser_json_payload['targetVariable'] = "Survived"
ser_json_payload['predictionVariable'] = "prediction"
ser_json_payload['isProbaAware'] = True
ser_json_payload['dontComputePerformance'] = False
ser_settings.set_json_payload(ser_json_payload)
ser_settings.save()
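The difference between the two approaches can be illustrated with plain dictionaries, without a DSS instance. This is only a sketch: `existing_payload` is a hypothetical stand-in for what `get_json_payload()` might return, and the initial values shown (including the one for autoOptimizeThreshold) are assumptions made for illustration.

```python
import copy

# Hypothetical stand-in for ser_settings.get_json_payload().
# The values here are assumptions, not actual DSS defaults.
existing_payload = {
    "autoOptimizeThreshold": True,
    "dontComputePerformance": True,
}

# The settings the documentation example wants to apply.
desired = {
    "predictionType": "BINARY_CLASSIFICATION",
    "targetVariable": "Survived",
    "predictionVariable": "prediction",
    "isProbaAware": True,
    "dontComputePerformance": False,
}

# Approach 1: replace the whole payload with a fresh dict.
replaced = dict(desired)

# Approach 2: update the existing payload, keeping keys we did not touch.
updated = copy.deepcopy(existing_payload)
updated.update(desired)

# Keys that were in the existing payload but are gone after each approach.
lost_by_replace = sorted(set(existing_payload) - set(replaced))
lost_by_update = sorted(set(existing_payload) - set(updated))
print("lost by replace:", lost_by_replace)
print("lost by update:", lost_by_update)
```

With replacement, any key the caller did not list disappears from the payload, which matches the dirty-state behaviour described for autoOptimizeThreshold. Updating keeps such keys and overrides only the listed ones.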
LGTM. You can ask @instanceofme or @cstenac for a review.
Nope, we shall ask them to review the base branch after merge.
Companion test PR https://github.com/dataiku/dip/pull/14421