validation_utils

predict_backend.utils.validation_utils.convert_values(X, col_to_conversion_dict)

Replaces values in a dataframe, column by column, according to a conversion dictionary.

Parameters:
  • X (DataFrame) – A verbose or ordinally encoded dataframe.

  • col_to_conversion_dict (Dict[str, Dict[Union[int, float, complex, number, str, object], Union[int, float, complex, number, str, object]]]) – A dictionary mapping each column name to a second dictionary, which maps the values currently in that column to the values that should replace them. This should be one of the conversion dictionaries returned by verbose_encodings().

Return type:

DataFrame

Returns:

The encoded dataframe with replaced values according to the conversion dictionary.
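As a rough illustration of the per-column replacement described above, the sketch below uses a plain dict of column lists as a stand-in for a DataFrame. `convert_values_sketch` is a hypothetical helper, not the library's implementation, and it assumes that values without an entry in a column's conversion dictionary are left unchanged, which this page does not state.

```python
from typing import Any, Dict, List

# Stand-in for a DataFrame: column name -> list of values.
Columns = Dict[str, List[Any]]


def convert_values_sketch(
    X: Columns, col_to_conversion_dict: Dict[str, Dict[Any, Any]]
) -> Columns:
    """Return a copy of X with values replaced per column (hypothetical helper)."""
    converted: Columns = {}
    for col, values in X.items():
        mapping = col_to_conversion_dict.get(col, {})
        # Assumption: values with no entry in the mapping pass through unchanged.
        converted[col] = [mapping.get(v, v) for v in values]
    return converted


X = {"color": [0, 2, 1], "age": [31, 45, 27]}
ordinal_to_verbose = {"color": {0: "red", 1: "green", 2: "blue"}}
result = convert_values_sketch(X, ordinal_to_verbose)
print(result)
```

With a real pandas DataFrame, a nested dictionary passed to `DataFrame.replace` expresses the same kind of per-column replacement.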

predict_backend.utils.validation_utils.one_hot_encodings(one_hot_dict)

Given a mapping from column names to their corresponding one hot columns, returns a pair of dictionaries that support conversion between one hot and ordinal encodings via the functions one_hot_to_ordinal() and ordinal_to_one_hot(). These functions are used predominantly within the Dataset class and should be used in the context of that class whenever possible.

Parameters:

one_hot_dict (Dict[str, List[str]]) – A mapping from original column names to a list of one hot column names.

Return type:

Tuple[Dict[str, int], Dict[str, List[int]]]

Returns:

A 2-tuple of dictionaries used for other conversion functions. The first dictionary should be used as the one_hot_column_to_ordinal_encoding parameter for the ordinal_to_one_hot() function. The second dictionary should be used as the feat_to_ordinal_encodings parameter for the one_hot_to_ordinal() function.
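To make the shape of the two returned dictionaries concrete, here is a minimal sketch that matches the documented return type. `one_hot_encodings_sketch` is a hypothetical helper, and it assumes that each one hot column's ordinal value is its position in the list; the actual numbering used by the library is not stated here.

```python
from typing import Dict, List, Tuple


def one_hot_encodings_sketch(
    one_hot_dict: Dict[str, List[str]]
) -> Tuple[Dict[str, int], Dict[str, List[int]]]:
    """Build the two lookup dictionaries (hypothetical helper)."""
    one_hot_column_to_ordinal_encoding: Dict[str, int] = {}
    feat_to_ordinal_encodings: Dict[str, List[int]] = {}
    for feature, one_hot_columns in one_hot_dict.items():
        # Assumption: the ordinal value is the column's position in the list.
        ordinals = list(range(len(one_hot_columns)))
        feat_to_ordinal_encodings[feature] = ordinals
        for column, ordinal in zip(one_hot_columns, ordinals):
            one_hot_column_to_ordinal_encoding[column] = ordinal
    return one_hot_column_to_ordinal_encoding, feat_to_ordinal_encodings


one_hot_dict = {
    "color": ["color_red", "color_green", "color_blue"],
    "size": ["size_s", "size_l"],
}
col_to_ordinal, feat_to_ordinals = one_hot_encodings_sketch(one_hot_dict)
print(col_to_ordinal)
print(feat_to_ordinals)
```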

predict_backend.utils.validation_utils.one_hot_to_ordinal(X, one_hot_dict, feat_to_ordinal_encodings)

Converts a one hot encoded dataframe into an ordinally encoded dataframe provided additional metadata.

Parameters:
  • X (DataFrame) – A one hot encoded dataframe.

  • one_hot_dict (Dict[str, List[str]]) – A mapping from original column names to a list of one hot column names.

  • feat_to_ordinal_encodings (Dict[str, List[int]]) – A mapping from original column names to a list of ordinal values. Should be passed through from one_hot_encodings().

Return type:

DataFrame

Returns:

The original dataframe converted to an ordinal encoding.
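The conversion can be pictured with the following sketch, which uses a list of row dicts as a stand-in for a DataFrame. `one_hot_to_ordinal_sketch` is a hypothetical helper rather than the library's code; it assumes exactly one column is set to 1 in each one hot group and that columns outside `one_hot_dict` are carried over unchanged.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]  # stand-in for one DataFrame row


def one_hot_to_ordinal_sketch(
    rows: List[Row],
    one_hot_dict: Dict[str, List[str]],
    feat_to_ordinal_encodings: Dict[str, List[int]],
) -> List[Row]:
    """Collapse each group of one hot columns into one ordinal column."""
    all_one_hot = {c for cols in one_hot_dict.values() for c in cols}
    converted: List[Row] = []
    for row in rows:
        # Assumption: columns that are not one hot columns pass through unchanged.
        new_row = {k: v for k, v in row.items() if k not in all_one_hot}
        for feature, cols in one_hot_dict.items():
            flags = [row[c] for c in cols]
            # Assumption: exactly one flag is 1; index() raises ValueError if none is.
            active = flags.index(1)
            new_row[feature] = feat_to_ordinal_encodings[feature][active]
        converted.append(new_row)
    return converted


one_hot_dict = {"color": ["color_red", "color_green", "color_blue"]}
feat_to_ordinal_encodings = {"color": [0, 1, 2]}
rows = [
    {"age": 31, "color_red": 0, "color_green": 1, "color_blue": 0},
    {"age": 45, "color_red": 0, "color_green": 0, "color_blue": 1},
]
ordinal_rows = one_hot_to_ordinal_sketch(rows, one_hot_dict, feat_to_ordinal_encodings)
print(ordinal_rows)
```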

predict_backend.utils.validation_utils.ordinal_to_one_hot(X, one_hot_dict, one_hot_column_to_ordinal_encoding)

Converts an ordinally encoded dataframe into a one hot encoded dataframe provided additional metadata.

Parameters:
  • X (DataFrame) – An ordinally encoded dataframe.

  • one_hot_dict (Dict[str, List[str]]) – A mapping from original column names to a list of one hot column names.

  • one_hot_column_to_ordinal_encoding (Dict[str, int]) – A mapping from one hot column names to the corresponding integer encoding.

Return type:

DataFrame

Returns:

The original dataframe converted to a one hot encoding.
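The reverse direction can be sketched the same way, again with a list of row dicts standing in for a DataFrame. `ordinal_to_one_hot_sketch` is a hypothetical helper; it assumes that each one hot column is set to 1 when the row's ordinal value equals that column's integer encoding, and that other columns are carried over unchanged.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]  # stand-in for one DataFrame row


def ordinal_to_one_hot_sketch(
    rows: List[Row],
    one_hot_dict: Dict[str, List[str]],
    one_hot_column_to_ordinal_encoding: Dict[str, int],
) -> List[Row]:
    """Expand each ordinal column into its group of one hot columns."""
    converted: List[Row] = []
    for row in rows:
        # Assumption: columns that are not listed in one_hot_dict pass through unchanged.
        new_row = {k: v for k, v in row.items() if k not in one_hot_dict}
        for feature, cols in one_hot_dict.items():
            for col in cols:
                # 1 if this column's encoding matches the row's ordinal value, else 0.
                new_row[col] = int(row[feature] == one_hot_column_to_ordinal_encoding[col])
        converted.append(new_row)
    return converted


one_hot_dict = {"color": ["color_red", "color_green", "color_blue"]}
one_hot_column_to_ordinal_encoding = {"color_red": 0, "color_green": 1, "color_blue": 2}
rows = [{"age": 31, "color": 1}, {"age": 45, "color": 2}]
one_hot_rows = ordinal_to_one_hot_sketch(rows, one_hot_dict, one_hot_column_to_ordinal_encoding)
print(one_hot_rows)
```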

predict_backend.utils.validation_utils.verbose_encodings(cat_to_vals)

Given a mapping from column names to lists of verbose values, returns a pair of dictionaries that support conversion between verbose and ordinal encodings via the function convert_values(). These functions are used predominantly within the Dataset class and should be used in the context of that class whenever possible.

Parameters:

cat_to_vals (Dict[str, List[Union[int, float, complex, number, str, object]]]) – A mapping from original column names to a list of corresponding verbose values.

Return type:

Tuple[Dict[str, Dict[Union[int, float, complex, number, str, object], int]], Dict[str, Dict[int, Union[int, float, complex, number, str, object]]]]

Returns:

A 2-tuple of dictionaries, either of which can be passed to convert_values(). Use the first dictionary as the conversion dictionary when converting from a verbose encoding to an ordinal encoding, and the second when converting from an ordinal encoding to a verbose encoding.
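A minimal sketch of what the two dictionaries might look like, following the documented return type. `verbose_encodings_sketch` is a hypothetical helper, and it assumes each verbose value's ordinal code is its position in the list, which this page does not confirm.

```python
from typing import Any, Dict, List, Tuple


def verbose_encodings_sketch(
    cat_to_vals: Dict[str, List[Any]]
) -> Tuple[Dict[str, Dict[Any, int]], Dict[str, Dict[int, Any]]]:
    """Build verbose->ordinal and ordinal->verbose lookups (hypothetical helper)."""
    verbose_to_ordinal: Dict[str, Dict[Any, int]] = {}
    ordinal_to_verbose: Dict[str, Dict[int, Any]] = {}
    for col, vals in cat_to_vals.items():
        # Assumption: the ordinal code is the value's position in the list.
        verbose_to_ordinal[col] = {v: i for i, v in enumerate(vals)}
        ordinal_to_verbose[col] = {i: v for i, v in enumerate(vals)}
    return verbose_to_ordinal, ordinal_to_verbose


cat_to_vals = {"color": ["red", "green", "blue"]}
to_ordinal, to_verbose = verbose_encodings_sketch(cat_to_vals)
print(to_ordinal)
print(to_verbose)
```

Under these assumptions the two dictionaries are inverses of each other, so converting with the first and then with the second recovers the original verbose values.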