Tutorials on Feature Generation and FAQ

Why Is OpenFE Ineffective For My Dataset?

There are two possible reasons. The first is that OpenFE fails to recall effective candidate features because of improper parameters; please refer to our guidance on parameter tuning. The second is that there are no effective candidate features to find, since feature generation does not benefit every dataset. In our experience with numerous real-world datasets, feature generation is beneficial for 50%–70% of datasets.

How Many New Features Should I Include In The Dataset?

This is a feature selection question. Users can try including 10, 20, 30, … new features and keep the number that gives the best results. For better performance, we recommend trying more refined feature selection methods, such as forward feature selection.
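Both strategies can be sketched in a few lines. The sketch below is library-agnostic and is not part of the OpenFE API: `sweep_top_k`, `forward_select`, and `toy_evaluate` are hypothetical names, and `toy_evaluate` is a stand-in for whatever routine trains a model on the chosen features and returns a validation score (higher is better). The sweep also assumes the candidate features are already ranked from most to least promising.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical signature: list of feature names -> validation score.
Evaluate = Callable[[List[str]], float]


def sweep_top_k(ranked: Sequence[str], counts: Sequence[int],
                evaluate: Evaluate) -> Tuple[int, float]:
    """Try the first k ranked features for each k and keep the best count."""
    best_k, best_score = 0, evaluate([])
    for k in counts:
        score = evaluate(list(ranked[:k]))
        if score > best_score:
            best_k, best_score = k, score
    return best_k, best_score


def forward_select(candidates: Sequence[str],
                   evaluate: Evaluate) -> Tuple[List[str], float]:
    """Greedy forward selection: add the single best feature until nothing helps."""
    selected: List[str] = []
    best_score = evaluate(selected)
    remaining = list(candidates)
    while remaining:
        score, winner = max((evaluate(selected + [c]), c) for c in remaining)
        if score <= best_score:
            break  # no remaining candidate improves the score
        selected.append(winner)
        remaining.remove(winner)
        best_score = score
    return selected, best_score


# Toy stand-in for "train a model and return the validation score".
TOY_GAINS = {"f1*f2": 0.05, "f1/f3": 0.03, "f2-f3": -0.01}


def toy_evaluate(features: List[str]) -> float:
    return 0.80 + sum(TOY_GAINS[name] for name in features)


ranked = ["f1*f2", "f1/f3", "f2-f3"]
best_k, sweep_score = sweep_top_k(ranked, [1, 2, 3], toy_evaluate)
selected, forward_score = forward_select(ranked, toy_evaluate)
```

In practice the evaluation function would use a held-out validation set or cross-validation, so that the chosen number of features is not tuned on the test data.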

What is a high-order feature?

A first-order feature is obtained by applying one operator to the base (zero-order) features, such as \(f_1 + f_2\). A second-order feature is obtained by applying an operator to first-order features, such as \((f_1 + f_2)\times f_3\), and so on. High-order features are features of order \(\geq 2\).
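As an illustration of this definition, the sketch below represents a feature as a nested tuple and computes its order as one more than the highest order among its operands. The tuple representation and the names `feature_order` and `evaluate` are assumptions made for this example only; they are not how OpenFE stores features internally.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}


def feature_order(expr):
    """Return 0 for a base feature name, else 1 + the highest operand order."""
    if isinstance(expr, str):
        return 0
    _, left, right = expr
    return 1 + max(feature_order(left), feature_order(right))


def evaluate(expr, columns):
    """Compute the feature values row by row from the base columns."""
    if isinstance(expr, str):
        return columns[expr]
    op, left, right = expr
    left_values = evaluate(left, columns)
    right_values = evaluate(right, columns)
    return [OPS[op](a, b) for a, b in zip(left_values, right_values)]


columns = {"f1": [1.0, 2.0], "f2": [3.0, 4.0], "f3": [2.0, 0.5]}

first = ("+", "f1", "f2")    # f1 + f2, first order
second = ("*", first, "f3")  # (f1 + f2) * f3, second order

first_values = evaluate(first, columns)
second_values = evaluate(second, columns)
```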

Are high-order features beneficial for model performance?

According to our previous experience with feature generation, searching for high-order features may not yield a noticeable improvement on the majority of datasets. First, we did not find generating high-order features to be useful on the benchmark datasets in our paper. Second, in two Kaggle competitions (IEEE-CIS Fraud Detection and BNP Paribas Cardif Claims Management), the high-order features generated by the top winning teams also hardly improved the test scores. Readers can find more details in our paper.

Some previous papers argue that high-order features are useful on the basis of directly searching for high-order features. However, the effectiveness of a high-order feature transformation should be evaluated in light of all its low-order components. For example, a second-order feature transformation \(f_1\times f_2\times f_3\) is effective only if it adds value beyond all of its first-order components \(f_1\times f_2\), \(f_1\times f_3\), and \(f_2\times f_3\).
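The difference between the two ways of evaluating a high-order feature can be illustrated on synthetic data. In the sketch below, the target depends only on \(f_1\times f_2\), yet \(f_1\times f_2\times f_3\) looks strong when scored on its own. Its incremental value is then measured as the gain in \(R^2\) of a linear fit once the first-order components are already included. The linear model, the synthetic data, and the helper names `pearson` and `r_squared` are all assumptions chosen to keep the example self-contained; this is not the evaluation procedure used by OpenFE.

```python
import random


def pearson(a, b):
    """Pearson correlation between two equally long lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def r_squared(columns, y):
    """In-sample R^2 of a least-squares fit of y on the columns plus an intercept."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Normal equations (X^T X) beta = X^T y as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gaussian elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, p):
            factor = A[r][col] / A[col][col]
            for c in range(col, p + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution.
    beta = [0.0] * p
    for r in reversed(range(p)):
        tail = sum(A[r][c] * beta[c] for c in range(r + 1, p))
        beta[r] = (A[r][p] - tail) / A[r][r]
    mean_y = sum(y) / n
    ss_res = sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
                 for i in range(n))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot


rng = random.Random(0)
n = 500
f1 = [rng.uniform(-1, 1) for _ in range(n)]
f2 = [rng.uniform(-1, 1) for _ in range(n)]
f3 = [rng.uniform(0.5, 1.5) for _ in range(n)]
# The target depends on f1 * f2 only; f3 carries no signal.
y = [a * b + rng.gauss(0, 0.1) for a, b in zip(f1, f2)]

f1f2 = [a * b for a, b in zip(f1, f2)]
f1f3 = [a * c for a, c in zip(f1, f3)]
f2f3 = [b * c for b, c in zip(f2, f3)]
f1f2f3 = [a * b * c for a, b, c in zip(f1, f2, f3)]

marginal = pearson(f1f2f3, y)  # how good the feature looks in isolation
r2_low = r_squared([f1f2, f1f3, f2f3], y)
r2_high = r_squared([f1f2, f1f3, f2f3, f1f2f3], y)
gain = r2_high - r2_low        # what the feature adds beyond its components
```

Here `marginal` is large because \(f_1\times f_2\times f_3\) inherits the signal of \(f_1\times f_2\), while `gain` stays close to zero because nothing new is added once the first-order components are present.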

How to generate high-order features?

Users can first generate effective first-order features, then add these first-order features to the base features and run feature generation again to obtain second-order features, and so on.
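The iterative procedure can be sketched without any library. In the sketch below, `generate` and `select_top` are hypothetical stand-ins for a feature generation pass and for keeping the effective features; they are not OpenFE functions. Candidates are built with only `+` and `*`, and effectiveness is approximated by absolute Pearson correlation with the target, which is a simplification made for this example.

```python
import random
from itertools import combinations


def pearson(a, b):
    """Pearson correlation; returns 0.0 when either column is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / (var_a * var_b) ** 0.5


def generate(pool, required=None):
    """Combine column pairs with + and *.

    If `required` is given, keep only pairs that use at least one of those
    columns, so each new feature is one order higher than the previous round.
    """
    out = {}
    for a, b in combinations(pool, 2):
        if required is not None and a not in required and b not in required:
            continue
        out[f"({a} + {b})"] = [x + y for x, y in zip(pool[a], pool[b])]
        out[f"({a} * {b})"] = [x * y for x, y in zip(pool[a], pool[b])]
    return out


def select_top(candidates, target, k):
    """Keep the k candidates with the highest absolute correlation to the target."""
    ranked = sorted(candidates,
                    key=lambda name: abs(pearson(candidates[name], target)),
                    reverse=True)
    return {name: candidates[name] for name in ranked[:k]}


rng = random.Random(0)
n = 200
base = {name: [rng.uniform(-1, 1) for _ in range(n)]
        for name in ("f1", "f2", "f3")}
target = [(a + b) * c for a, b, c in zip(base["f1"], base["f2"], base["f3"])]

# Round 1: first-order features from the base features.
first_order = select_top(generate(base), target, k=2)

# Round 2: add the first-order features to the base features and generate again.
pool = {**base, **first_order}
second_order = select_top(generate(pool, required=set(first_order)), target, k=2)
```

Each further round repeats the same step: the features kept in the previous round are added to the pool, and generation is run again on the enlarged pool.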