njab.pandas package

njab.pandas.col_isin_df(cols: list | str, df: DataFrame) → list

Remove items (columns) from the passed list that are not in the DataFrame. A warning is issued for each missing item.

cols can be a comma-separated string of column names.
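The filtering described above can be sketched with plain pandas. This is an illustration only, not njab's implementation: the helper name `keep_existing_columns` is hypothetical, and reporting missing items through `warnings.warn` is an assumption.

```python
import warnings

import pandas as pd


def keep_existing_columns(cols, df):
    """Hypothetical stand-in for the behaviour described above."""
    if isinstance(cols, str):
        # Accept a comma-separated string of column names.
        cols = cols.split(",")
    kept = []
    for col in cols:
        if col in df.columns:
            kept.append(col)
        else:
            # Assumption: missing items are reported via the warnings module.
            warnings.warn(f"Column not in DataFrame: {col}")
    return kept


df = pd.DataFrame({"age": [31, 45], "sex": ["f", "m"]})
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    kept = keep_existing_columns("age,height,sex", df)
print(kept)
```

The order of the passed list is preserved, so the result can be used directly for column selection, e.g. `df[kept]`.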

njab.pandas.combine_value_counts(X: DataFrame, dropna=True) → DataFrame

Pass a selection of columns to combine their value counts.

This performs no checks. Make sure the scale of the variables you pass is comparable.

Parameters:

X (pandas.DataFrame) – A DataFrame of several columns with values in a similar range.
dropna (bool, optional) – Exclude NA values from counting, by default True

Returns:

DataFrame of combined value counts.

Return type:

pandas.DataFrame
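As a rough sketch of what combined value counts look like, the following builds one count column per input column with plain pandas. The helper name `combine_counts_sketch` is hypothetical and the approach (concatenating per-column `value_counts`) is an assumption about the result's shape, not njab's actual code.

```python
import pandas as pd


def combine_counts_sketch(X, dropna=True):
    """Hypothetical sketch: one value_counts column per input column."""
    counts = {col: X[col].value_counts(dropna=dropna) for col in X.columns}
    # Outer-align on the observed values; values absent from a column become NaN.
    return pd.concat(counts, axis=1).sort_index()


X = pd.DataFrame({"q1": [1, 1, 2], "q2": [2, 3, 3]})
combined = combine_counts_sketch(X)
print(combined)
```

Because the rows are the union of all observed values, the table is only readable when the columns share a similar value range, which is why no checks are needed but comparable scales are.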

njab.pandas.get_colums_accessor(df: DataFrame, all_lower_case=False) → OmegaConf

Get a dictionary augmented with attribute access. Each key is a column name with whitespace replaced, and each value is the original column name.
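The idea of an attribute-style column lookup can be sketched without OmegaConf. The example below uses `types.SimpleNamespace` instead of the `OmegaConf` return type, the helper name `columns_accessor_sketch` is hypothetical, and replacing whitespace with underscores is an assumption.

```python
from types import SimpleNamespace

import pandas as pd


def columns_accessor_sketch(df, all_lower_case=False):
    """Hypothetical sketch of an attribute-style column lookup."""
    # Assumption: whitespace is replaced by underscores to form valid attribute names.
    mapping = {"_".join(str(col).split()): col for col in df.columns}
    if all_lower_case:
        mapping = {key.lower(): col for key, col in mapping.items()}
    return SimpleNamespace(**mapping)


df = pd.DataFrame({"Body Mass Index": [22.5], "Age": [40]})
cols = columns_accessor_sketch(df, all_lower_case=True)
print(df[cols.body_mass_index])
```

Such an accessor lets an editor auto-complete long column names while the DataFrame keeps its original labels.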

njab.pandas.get_overlapping_columns(df: DataFrame, cols_expected: list) → list

Get overlapping columns between DataFrame and list of expected columns.

njab.pandas.replace_with(string_key: str, replace: str = '()/', replace_with: str = '') → str

Replace characters in a string with a replacement.
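A minimal sketch of character-wise replacement, assuming that `replace` is read as a set of individual characters rather than as one substring. The helper name `strip_characters` is hypothetical; it only mirrors the documented signature.

```python
def strip_characters(string_key, replace="()/", replace_with=""):
    """Hypothetical sketch: swap every character listed in `replace`."""
    # Build a translation table mapping each listed character to the replacement.
    table = str.maketrans({char: replace_with for char in replace})
    return string_key.translate(table)


cleaned = strip_characters("weight (kg)/height")
print(cleaned)
```

With the defaults the listed characters are simply dropped; passing `replace_with="_"` would substitute an underscore for each of them instead.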

njab.pandas.set_pandas_number_formatting(float_format='{:,.3f}') → None

njab.pandas.set_pandas_options(max_columns: int = 100, max_row: int = 30, min_row: int = 20, float_format='{:,.3f}') → None

Update default pandas options for better display.
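The effect of these defaults can be previewed with pandas' own option machinery. The mapping of the keyword arguments onto `display.max_columns`, `display.max_rows`, `display.min_rows` and `display.float_format` is an assumption; `pd.option_context` is used here so the settings apply only inside the block rather than globally.

```python
import pandas as pd

# Assumption: the keyword arguments map onto these pandas display options.
with pd.option_context(
    "display.max_columns", 100,
    "display.max_rows", 30,
    "display.min_rows", 20,
    "display.float_format", "{:,.3f}".format,
):
    # Floats are rendered with a thousands separator and three decimals.
    rendered = str(pd.Series([1234.56789]))

print(rendered)
```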

njab.pandas.value_counts_with_margins(y: Series) → DataFrame

Value counts of Series with proportion as margins.
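A table of counts alongside proportions can be sketched with two `value_counts` calls. The helper name `value_counts_with_proportions` and the column labels `counts` and `prop` are assumptions made for illustration, not necessarily what njab returns.

```python
import pandas as pd


def value_counts_with_proportions(y):
    """Hypothetical sketch: absolute counts next to relative frequencies."""
    table = y.value_counts().to_frame("counts")
    # normalize=True yields proportions that sum to 1; assignment aligns on the index.
    table["prop"] = y.value_counts(normalize=True)
    return table


y = pd.Series([1, 1, 0, 1], name="outcome")
summary = value_counts_with_proportions(y)
print(summary)
```

This kind of summary is handy for checking class balance of a binary target before modelling.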
