Feature Engineering
Binning
A continuous feature can be converted into a categorical one by grouping its values into bins (buckets). To do this, open the feature's view and create a binned feature from the Info tab. All feature engineering must be performed through the All Features set, so open the original feature from that set.
After the binned feature has been created, its view shows statistics and metrics for the feature, now computed over the bins as categories.
By default, 4 bins are created, but you can change this number.
You can also change the default bin ranges by editing the boundary values of the bins. After the data has been binned, the chart shows the metrics as it does for any other categorical feature.
The grid view is the default view; you can switch to the chart view for a graphical representation.
Figure 1: Graphical representation of the distribution of a feature in a binary classification model.
Figure 2: Graphical representation of the distribution of a feature in a multiclass classification model.
Figure 3: Graphical representation of the distribution of a feature in a regression model.
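For readers who want to reproduce the idea outside the product, the binning described above can be approximated with pandas. This is only a sketch under stated assumptions: the column names, the sample data, and the use of equal-width bins are illustrative and are not taken from the product, which may compute its default bin ranges differently.

```python
import pandas as pd

# Hypothetical continuous feature and binary target (illustrative data only).
df = pd.DataFrame({
    "age": [18, 22, 25, 31, 38, 44, 52, 60, 67, 75],
    "target": [0, 0, 1, 0, 1, 1, 0, 1, 1, 1],
})

# Four equal-width bins, mirroring the default bin count of 4.
# Assumption: equal-width binning; the product's default ranges may differ.
df["age_binned"] = pd.cut(df["age"], bins=4)

# Custom boundaries, analogous to editing the bin boundary values.
custom_edges = [0, 25, 40, 60, 100]
df["age_custom"] = pd.cut(df["age"], bins=custom_edges)

# Per-bin row count and target rate, treating each bin as a category.
summary = (
    df.groupby("age_custom", observed=False)["target"]
      .agg(["count", "mean"])
)
print(summary)
```

The original column is left in place and the binned versions are added alongside it, so both remain available for comparison.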
If the feature is mixed or continuous, you can select 2 or more categories from the Data tab and merge them by clicking the Combine Categories button at the top-right corner of the Data tab. You can create 1 or more merged categories for a single feature.
If fewer than 2 categories are selected, clicking the Combine Categories button produces an error.
Clicking the Combine Categories button creates a new merged category, as shown below:
To retain this merged category, click the Create New Encoding button at the top-right corner. A new feature containing the merged category is added to the feature set alongside the original feature.
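The merge step can be sketched in pandas as well. The function name `combine_categories`, the column names, and the sample values below are hypothetical and chosen only for illustration; the check for at least 2 selected categories mirrors the error described above, and the result is stored as a new column so the original feature is kept.

```python
import pandas as pd

def combine_categories(series, to_merge, merged_label):
    """Return a copy of `series` in which every value listed in
    `to_merge` is replaced by `merged_label`. (Hypothetical helper.)"""
    selected = set(to_merge)
    if len(selected) < 2:
        # Mirrors the rule that at least 2 categories must be selected.
        raise ValueError("Select at least 2 categories to combine.")
    # Keep values that are not selected; replace the selected ones.
    return series.where(~series.isin(selected), merged_label)

# Illustrative categorical feature.
colors = pd.Series(["red", "blue", "green", "blue", "teal", "red"], name="color")
df = colors.to_frame()

# Add the merged encoding as a new feature next to the original one.
df["color_merged"] = combine_categories(colors, ["blue", "teal"], "blue_teal")
print(df)
```

Calling the helper again with a different selection yields a further merged category, which corresponds to creating more than one merged category for the same feature.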