Bucketing


Bucketing setup

Before running a bucketing, you may want to complete some preliminary setup.

The Labelling panel lets you set the display names for the various fields in the bucketing. These names can be fixed for a single model or for all models within a project.

The Aggregate Columns panel lets you define the column types and the aggregation methods to apply to the data under each header. You can also define a custom column header if desired. The Aggregate Columns tab is the same for binary and multiclass models, as shown below.

The majority and minority fields are removed for regression and multiclass models:

Regression models have RMSE and RMSE/Range instead of Default and Proportion.
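As a rough illustration of the two regression metrics, the sketch below computes RMSE and RMSE/Range for a handful of values. It assumes that "Range" means the spread (maximum minus minimum) of the actual target values; the article does not define it, and the function names are hypothetical rather than part of the product.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def rmse_over_range(actual, predicted):
    """RMSE divided by the spread of the actual values.

    Assumption: 'Range' is max(actual) - min(actual).
    """
    spread = max(actual) - min(actual)
    if spread == 0:
        raise ValueError("range of actual values is zero")
    return rmse(actual, predicted) / spread

actual = [0.0, 10.0]
predicted = [2.0, 8.0]
error = rmse(actual, predicted)                   # both errors are 2, so RMSE is 2
normalised = rmse_over_range(actual, predicted)   # 2 divided by a range of 10
```

Dividing by the range makes the error comparable across targets measured on different scales.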

There are five aggregation types that you can apply onto a column:

The instances can be filtered by the class (either majority or minority), or kept as default (Filter: None) which retains all instances in the dataset.

Creating a new bucketing

To create a bucketing, go to the model sidebar, click Bucketing under the Analysis tab, and then select the New Bucketing button.

Select the dataset you want to perform a bucketing analysis on (or upload it). Select the feature to plot in the graph, and finally, give it a unique name.

If you would like to filter the bucketing through another model, you can do so using the Filter Model option.

Once the bucketing job has completed, it is displayed on the main screen alongside the other bucketings performed for that model.

Main view

Clicking a completed bucketing displays a chart with the scores of all instances binned into six equal-width buckets between 0 and 1. You can click a bucket name to rename it, and change the boundaries of a bucket underneath. Hovering over a bucket gives you further information about it.

If you want to change the number of buckets the data is separated into, click the Change Bucket Count button in the top right corner and enter the desired number of buckets.

After you confirm, you are returned to the main bucketing screen.
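The default binning can be pictured with a minimal sketch that places scores into six equal-width buckets between 0 and 1. The function names are hypothetical, and the boundary handling (a score on a boundary goes to the upper bucket, with 1.0 kept in the last bucket) is an assumption rather than documented behaviour.

```python
from collections import Counter

def equal_width_edges(n_buckets=6, low=0.0, high=1.0):
    """Return the n_buckets + 1 edges of equally wide buckets."""
    step = (high - low) / n_buckets
    return [low + i * step for i in range(n_buckets + 1)]

def bucket_index(score, n_buckets=6, low=0.0, high=1.0):
    """Return the zero-based bucket a score falls into."""
    if not low <= score <= high:
        raise ValueError("score outside the bucketing range")
    index = int((score - low) / (high - low) * n_buckets)
    # Assumption: the top edge belongs to the last bucket.
    return min(index, n_buckets - 1)

scores = [0.05, 0.20, 0.49, 0.50, 0.91, 1.0]
counts = Counter(bucket_index(s) for s in scores)  # instances per bucket
```

Changing the bucket count corresponds to calling the same functions with a different `n_buckets` value.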

Hovering the mouse over a bucket will display the previously chosen statistics.

Goal seek

The Goal Seek option lets you create custom buckets based on criteria such as:

  • Number of instances to have in each bucket
  • Number of minority class instances to have in each bucket
  • Percentage of minority class instances in each bucket

All buckets can be set to have an equal number of instances using the Equalise widths button.

After you confirm, you are returned to the main bucketing screen. In this case, the bucket boundaries have been resized so that each bucket contains the same number of instances.
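One way to think about equal-count buckets is as quantile cut points over the sorted scores. The sketch below is an illustration under that assumption, not the product's actual algorithm; the function name is hypothetical, and tied scores at a cut point can make the counts slightly uneven.

```python
from bisect import bisect_right
from collections import Counter

def equal_count_boundaries(scores, n_buckets):
    """Return n_buckets - 1 interior cut points so that each bucket
    holds (as nearly as possible) the same number of instances."""
    if n_buckets < 1 or len(scores) < n_buckets:
        raise ValueError("need at least one instance per bucket")
    ordered = sorted(scores)
    n = len(ordered)
    cuts = []
    for k in range(1, n_buckets):
        first = k * n // n_buckets  # index of the first instance in bucket k
        # Place the cut halfway between neighbouring instances.
        cuts.append((ordered[first - 1] + ordered[first]) / 2)
    return cuts

scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
cuts = equal_count_boundaries(scores, 3)
per_bucket = Counter(bisect_right(cuts, s) for s in scores)
```

With six scores and three buckets, each bucket ends up holding two instances.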

An example of goal seek based on an equal number of minority class instances per bucket is shown below:
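The same idea can be sketched for the minority-class criterion: walk through the instances in score order and place a cut each time another equal share of minority instances has been seen. This is a hypothetical illustration of the concept, assuming each instance carries a boolean minority flag; it is not taken from the product.

```python
def equal_minority_boundaries(scores, is_minority, n_buckets):
    """Return n_buckets - 1 cut points so that each bucket holds roughly
    the same number of minority class instances."""
    pairs = sorted(zip(scores, is_minority))
    total = sum(1 for _, minority in pairs if minority)
    if n_buckets < 1 or total < n_buckets:
        raise ValueError("need at least one minority instance per bucket")
    cuts = []
    seen = 0
    share = 1  # which share of the minority instances we are completing
    for index, (score, minority) in enumerate(pairs):
        if not minority:
            continue
        seen += 1
        if share < n_buckets and seen >= share * total / n_buckets:
            # Cut halfway to the next instance; one always exists here
            # because at least one minority instance is still unseen.
            cuts.append((score + pairs[index + 1][0]) / 2)
            share += 1
    return cuts

scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
flags = [False, True, False, False, True, True, False, True]
cuts = equal_minority_boundaries(scores, flags, 2)
```

In this example the single cut falls between 0.5 and 0.6, leaving two minority instances on each side even though the buckets contain different total numbers of instances.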

Filter models

If you want to filter the scores for a bucketing through another model, you can select one from the drop-down under the Filter Model label.

Note: filtering is not available for regression models.

If you want to look further into the model metrics before selecting a filter model, click the Choose Model button. A popup similar to the models main page appears, showing the existing models and their performance metrics to assist your decision. Click the model you want to use.

Note: All deployed models need to have a default bucketing defined, as the resulting bucket for a specific inference is included in the API response. To change the default bucketing, the model must be undeployed first. Please refer to the model lifecycle documentation for more information.
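Conceptually, resolving the bucket for a single inference amounts to looking up its score against the default bucketing's boundaries. The sketch below illustrates that lookup only; the function name, bucket names, and boundary values are hypothetical, and the actual API response format is not described in this article.

```python
from bisect import bisect_right

def bucket_for_score(score, cuts, names):
    """Return the name of the bucket a score falls into.

    cuts  -- sorted interior boundaries (one fewer than the bucket names)
    names -- bucket names ordered from lowest to highest scores
    Assumption: a score equal to a boundary belongs to the upper bucket.
    """
    if len(names) != len(cuts) + 1:
        raise ValueError("expected exactly one more name than boundaries")
    return names[bisect_right(cuts, score)]

# Hypothetical default bucketing with four buckets.
cuts = [0.25, 0.5, 0.75]
names = ["Low", "Medium", "High", "Very high"]
label = bucket_for_score(0.62, cuts, names)
```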
