py_predpurchase.function_preprocessing

Module Contents

Functions

numerical_categorical_preprocess(X_train, X_test, ...)

Applies preprocessing transformations to the data, including scaling, encoding, and passing through features as specified.

py_predpurchase.function_preprocessing.numerical_categorical_preprocess(X_train, X_test, y_train, y_test, numeric_features, categorical_features)

Applies preprocessing transformations to the data, including scaling, encoding, and passing through features as specified. This function requires target data to be provided and includes it in the output DataFrames.

Parameters:

X_train : DataFrame

Training feature data.

X_test : DataFrame

Testing feature data.

y_train : DataFrame or Series

Training target data.

y_test : DataFrame or Series

Testing target data.

numeric_features : list

Names of numeric features to scale.

categorical_features : list

Names of categorical features to encode.

Returns:

Tuple

A three-element tuple: the preprocessed training DataFrame and the preprocessed testing DataFrame, each including its target data, followed by the transformed column names.

Examples:

Assume you want to transform the following features and your data set has already been split into training and testing sets:

>>> numeric_features = ['feature1', 'feature2']
>>> categorical_features = ['feature3', 'feature4']
>>> train_transformed, test_transformed, transformed_columns = numerical_categorical_preprocess(
...     X_train, X_test, y_train, y_test, numeric_features, categorical_features)

The function scales the numeric features (feature1 and feature2) and one-hot encodes the categorical features (feature3 and feature4). The preprocessed data, including the targets, is stored in 'train_transformed' and 'test_transformed', and the transformed column names are stored in 'transformed_columns'.
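To make the scaling and one-hot encoding steps concrete, here is a minimal pandas-only sketch of this kind of preprocessing. It is not the package's implementation, which is not shown on this page and may well differ (for example, by using scikit-learn transformers). The function name `sketch_preprocess`, the `target` column name, the toy data, and the choice to leave the target out of the returned column names are all assumptions made for illustration.

```python
import pandas as pd


def sketch_preprocess(X_train, X_test, y_train, y_test,
                      numeric_features, categorical_features):
    """Hypothetical stand-in for illustration; not the package's code."""
    # Scaling statistics are learned from the training split only.
    mean = X_train[numeric_features].mean()
    std = X_train[numeric_features].std(ddof=0)

    # The one-hot columns are also fixed by the training split.
    cat_columns = pd.get_dummies(
        X_train[categorical_features],
        columns=categorical_features, dtype=float).columns

    def transform(X, y):
        scaled = (X[numeric_features] - mean) / std
        encoded = pd.get_dummies(
            X[categorical_features],
            columns=categorical_features, dtype=float)
        # Drop categories unseen in training; fill missing ones with 0.
        encoded = encoded.reindex(columns=cat_columns, fill_value=0.0)
        out = pd.concat([scaled, encoded], axis=1)
        # Assumed column name for the target; accepts Series or DataFrame.
        out["target"] = pd.DataFrame(y).iloc[:, 0].to_numpy()
        return out

    train_transformed = transform(X_train, y_train)
    test_transformed = transform(X_test, y_test)
    # Assumption: the returned names cover feature columns only.
    transformed_columns = list(numeric_features) + list(cat_columns)
    return train_transformed, test_transformed, transformed_columns


# Toy data (hypothetical): one numeric and one categorical feature.
X_train = pd.DataFrame({"feature1": [1.0, 2.0, 3.0],
                        "feature3": ["a", "b", "a"]})
X_test = pd.DataFrame({"feature1": [2.0], "feature3": ["c"]})
y_train = pd.Series([0, 1, 0])
y_test = pd.Series([1])

train_t, test_t, cols = sketch_preprocess(
    X_train, X_test, y_train, y_test, ["feature1"], ["feature3"])
```

In this sketch the test split reuses the mean, standard deviation, and category columns learned from the training split, so a category that appears only in the test data (here "c") is encoded as all zeros rather than creating a new column.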