Feature Selection

This page lets you run a feature selection process to choose the best features for modelling. It lists all feature selections that have already been run, along with basic information about each.

Feature selection for binary classification models

The New Feature Selection button starts a new feature selection. From this screen, you can choose any existing feature set as your starting point. You can also manually add or remove features from that reference feature set.

In the Features tab, select the features to run the feature selection on.

If you have a list of inputs from an external source, the features can be edited as a text list for convenience using the Edit as text button. The Reset button returns the selection to the entire base feature set.

You can use an existing feature set as a starting point for the process via the Select a feature set to use option. This automatically updates the ticks and crosses displayed, which you can then alter further.

Moving to the settings pane, you can adjust many parameters that control how the feature selection process runs.

The key ones are in Reduction Settings:

  • Target number of features: the number of features that will remain once the feature selection is complete. After the target number of features is reached, the process halts. The default is 50 features if the original dataset has 100 or more columns, and 20 features if it has fewer.
  • Feature removal rate: the fraction of features removed at every iteration. The default is 0.2, which removes the lowest-ranked 20% of features at each step (see the sketch after this list).
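
As a rough illustration of how these two settings interact, here is a minimal Python sketch (not the platform's code; the function name and defaults are illustrative) of the feature counts produced at each step:

```python
# Illustrative sketch: how the target number of features and the
# removal rate define the elimination schedule.
import math

def elimination_schedule(n_features, target=None, removal_rate=0.2):
    """Return the feature count at each step of the reduction.

    Defaults mirror the documented behaviour: target is 50 when the
    dataset has 100+ columns, 20 otherwise; 20% of the lowest-ranked
    features are dropped per step.
    """
    if target is None:
        target = 50 if n_features >= 100 else 20
    counts = [n_features]
    while counts[-1] > target:
        removed = max(1, math.floor(counts[-1] * removal_rate))
        counts.append(max(target, counts[-1] - removed))
    return counts

print(elimination_schedule(200))  # [200, 160, 128, 103, 83, 67, 54, 50]
```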

For detailed information on the rest of the parameters, please refer to the manual: Feature-Selection-manual.pdf

Wrapper Feature Selection

The wrapper method is only available for binary classification. Correlation-based feature selection is an option for all intelligence tasks.

The wrapper approach to feature selection works by recursive feature elimination: models are built on a subset of features, the worst (least predictive) features are removed, and a new set of models is built from the reduced feature set. This process continues until the target number of features is reached.
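
A minimal sketch of the recursive elimination loop, assuming a scikit-learn-style classifier and a pandas DataFrame of features; the platform's actual models and ranking criterion are not documented here, so the estimator and importance measure below are placeholders:

```python
# Sketch of recursive feature elimination with an illustrative estimator.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def recursive_feature_elimination(X, y, feature_names, target=20, removal_rate=0.2):
    """X: pandas DataFrame of features; y: target labels."""
    features = list(feature_names)
    while len(features) > target:
        model = RandomForestClassifier(n_estimators=100, random_state=0)
        model.fit(X[features], y)
        # Rank features by importance and drop the lowest-ranked fraction,
        # without overshooting the target.
        order = np.argsort(model.feature_importances_)
        n_remove = max(1, int(len(features) * removal_rate))
        n_remove = min(n_remove, len(features) - target)
        worst = {features[i] for i in order[:n_remove]}
        features = [f for f in features if f not in worst]
    return features
```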

After the feature selection process completes, you are taken to a chart that shows the results of the algorithm at each step of the process.

Each step will show: 

  • The estimated expected Training recall for the models built with that set of features. 
  • The estimated expected Validation recall for the models built with that set of features.
  • The number of features before and after the feature elimination step.

Important note: training and validation recalls should be read as the expected performance when that subset of features is used in the model-building process.

In a standard feature selection, the scores tend to improve over the first few steps, as uninformative features that contribute noise to the model are weeded out. After the score peaks, predictive power starts to decrease as good features begin to be removed.

In binary classification projects, a Feature Selection will create a Feature Set at every step of the process, as depicted in the following figure:

Correlation Feature Selection

For Regression and Multiclass classification models, we employ a feature selection technique that scales well to much larger numbers of features: Correlation-based Feature Selection (CFS). CFS evaluates subsets of features under the principle that "good feature subsets contain features highly correlated with the target output, yet uncorrelated with each other". We therefore look for the subset of features that maximizes a score based on this principle: the higher the score, the more correlated the feature set is with the target and the less the features are correlated with each other.
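
The standard CFS merit score (Hall, 1999) captures this trade-off; the sketch below illustrates the principle, though the platform's exact scoring is not documented here:

```python
# The standard CFS merit of a subset of k features (illustrative).
import math

def cfs_merit(k, avg_feature_target_corr, avg_feature_feature_corr):
    """avg_feature_target_corr:  mean |correlation| between features and target.
    avg_feature_feature_corr: mean |correlation| between pairs of features."""
    return (k * avg_feature_target_corr) / math.sqrt(
        k + k * (k - 1) * avg_feature_feature_corr
    )

# A subset whose features correlate strongly with the target but weakly
# with each other scores higher:
print(cfs_merit(10, 0.5, 0.1))  # ~1.147
print(cfs_merit(10, 0.5, 0.6))  # 0.625
```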

Here, feature selection is a constructive method: it ranks all features iteratively, starting from an empty set and adding one feature at every step, until all features have been ranked.
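
This constructive ranking can be sketched as a greedy forward selection over the CFS merit. The following is an illustration under simple assumptions (pandas inputs, Pearson correlation), not the platform's actual implementation:

```python
# Greedy forward ranking by CFS merit (illustrative).
import numpy as np

def merit(X, y, subset):
    """CFS merit of a list of column names from DataFrame X against Series y."""
    rcf = np.mean([abs(X[f].corr(y)) for f in subset])
    if len(subset) == 1:
        rff = 0.0
    else:
        pairs = [abs(X[a].corr(X[b]))
                 for i, a in enumerate(subset) for b in subset[i + 1:]]
        rff = float(np.mean(pairs))
    k = len(subset)
    return k * rcf / np.sqrt(k + k * (k - 1) * rff)

def cfs_forward_ranking(X, y):
    remaining, ranked = list(X.columns), []
    while remaining:
        # Add the feature that yields the highest merit at this step.
        best = max(remaining, key=lambda f: merit(X, y, ranked + [f]))
        ranked.append(best)
        remaining.remove(best)
    return ranked  # features in the order they were added
```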

However, in practice, the subset of features that best satisfies the CFS criterion can be suboptimal for machine learning methods, because it tends to select few features. To address this, the platform indicates which feature set has the maximum validation score based on CFS, and also suggests the maximum feature set that should be considered. This maximum feature set is the largest one whose score has decayed no more than 20% from the best-scoring feature set (see the sketch below).
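
Reading "a 20% decay" as the score staying within 80% of the best validation score, the suggested range can be sketched as follows (names and data are illustrative):

```python
# Illustrative 20%-decay rule for the suggested range of feature sets.
def suggested_feature_sets(scores):
    """scores: dict mapping feature-set size -> CFS validation score."""
    best_k = max(scores, key=scores.get)
    threshold = 0.8 * scores[best_k]
    max_k = max(k for k, s in scores.items() if k >= best_k and s >= threshold)
    return best_k, max_k

scores = {5: 0.62, 10: 0.71, 15: 0.70, 20: 0.64, 25: 0.55}
print(suggested_feature_sets(scores))  # (10, 20): try sets from 10 up to 20 features
```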

To run a new feature selection, click the Run Feature Selection button from the Feature selection option.

Figure 1 and Figure 2 show how the feature selection component looks initially.

Figure 1: Regression Feature Selection – No Selection

Figure 2: Multiclass Feature Selection – No Selection

To reveal the suggested range of feature sets, click the Suggest Feature Sets button. This will highlight the feature sets to choose from. By default, the feature set with the highest validation score is selected and emphasized with a deeper blue. Figure 3 and Figure 4 show this state. 

Figure 3: Regression Feature Selection – Best Score Selection

Figure 4: Multiclass Feature Selection – Best Score Selection

For better-performing models, alternative feature sets should be explored beyond the set with the best CFS validation score. It is recommended to try feature sets up to and including the one where the validation score has decayed no more than 20%. This is illustrated in Figure 5 and Figure 6.

Figure 5: Regression Feature Selection – Maximum Feature Selection

Figure 6: Multiclass Feature Selection – Maximum Feature Selection

In this case, building models from this screen using the "Build model with these X features" button will automatically create a feature set just for that step, containing the corresponding features.

Still need help? Contact Us