py_predpurchase.function_model_cross_val

Module Contents

Functions

model_cross_validation(preprocessed_training_data, ...)

Calculates cross-validation results for four common off-the-shelf models (Dummy, KNN, SVM and RandomForest).

py_predpurchase.function_model_cross_val.model_cross_validation(preprocessed_training_data, preprocessed_testing_data, target, k, gamma)

Calculates cross-validation results for four common off-the-shelf models (Dummy, KNN, SVM and RandomForest) using preprocessed and cleaned training and testing datasets. The Dummy and RandomForest hyperparameters are fixed for simplicity.

Parameters:

preprocessed_training_data : DataFrame

Cleaned and preprocessed training data.

preprocessed_testing_data : DataFrame

Cleaned and preprocessed testing data.

target : str

Target column name in the dataset.

k : int

Hyperparameter ‘k’ value for KNearestNeighbours.

gamma : float

Hyperparameter ‘gamma’ value for SVM.

Returns:

dict

Contains cross-validation results (mean and std of scores) for each specified model.
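
As a rough illustration of how a dictionary of this kind could be assembled, the sketch below runs scikit-learn's `cross_validate` over the four model families and summarises each score array by its mean and standard deviation. It is not the package's actual implementation: the helper name `sketch_cross_val`, the `cv=5` fold count, the specific Dummy and RandomForest settings, the nested key layout and the synthetic data are all assumptions made for the example, and the sketch takes only a training frame.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC


def sketch_cross_val(train_df, target, k, gamma, cv=5):
    """Hypothetical stand-in, not the package's actual implementation."""
    X = train_df.drop(columns=[target])
    y = train_df[target]
    # Dummy and random forest settings are fixed (assumed values);
    # k and gamma are supplied by the caller.
    models = {
        "dummy": DummyClassifier(strategy="most_frequent"),
        "knn": KNeighborsClassifier(n_neighbors=k),
        "svm": SVC(gamma=gamma),
        "random_forest": RandomForestClassifier(n_estimators=50, random_state=123),
    }
    results = {}
    for name, model in models.items():
        scores = cross_validate(model, X, y, cv=cv, return_train_score=True)
        # cross_validate returns arrays for fit_time, score_time,
        # test_score and train_score; reduce each to mean and std.
        results[name] = {
            key: {"mean": float(values.mean()), "std": float(values.std())}
            for key, values in scores.items()
        }
    return results


# Synthetic, already-numeric data standing in for a preprocessed training set.
X, y = make_classification(n_samples=200, n_features=6, random_state=0)
train_df = pd.DataFrame(X, columns=[f"f{i}" for i in range(6)])
train_df["target"] = y

results = sketch_cross_val(train_df, "target", k=5, gamma=0.1)
```

Including a Dummy baseline in the same loop gives a floor against which the other three models' scores can be read.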

Examples:

Assuming the dataset is already preprocessed and split into training and testing sets, with ‘target’ as the target column:

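>>> import pandas as pd
>>> from py_predpurchase.function_model_cross_val import model_cross_validation
>>> results = model_cross_validation(preprocessed_training_data, preprocessed_testing_data, 'target', k=5, gamma=0.1)
>>> pd.DataFrame(results)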

This displays the cross-validation results for each model as a table, showing the mean and standard deviation of the scores, including the train scores.
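
To show what wrapping such a dictionary in `pd.DataFrame` does, here is a small sketch with a hand-written nested dictionary. The key names, the string format and every number are made-up placeholders for illustration only; the dictionary actually returned by the function may be laid out differently.

```python
import pandas as pd

# Placeholder results: keys, format and values are invented for illustration.
results = {
    "dummy": {"test_score": "0.50 (+/- 0.01)", "train_score": "0.50 (+/- 0.00)"},
    "knn": {"test_score": "0.80 (+/- 0.03)", "train_score": "0.85 (+/- 0.01)"},
}

# Outer keys become columns (one per model); inner keys become the row index.
table = pd.DataFrame(results)
print(table)

# Transposing gives one row per model instead, which is often easier to scan.
per_model = table.T
```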

Notes:

The function assumes that the input data is already scaled and encoded.