reVReports.characterizations.recast_categories
- recast_categories(df, col, lkup, cell_size_sq_km=None)
Recast the JSON string found in df[col].
This function recasts the JSON string in df[col] to new columns in the dataframe. Each element in the embedded JSON strings will become a new column following the casting specified by lkup and cell_size_sq_km.
- Parameters:
df (pandas.DataFrame) – Input pandas dataframe.
col (str) – Name of the column in df containing embedded JSON values (e.g., "{'0': 44.3, '1': 3.7}").
lkup (dict) – Dictionary used to map keys in the JSON strings to new, more meaningful names. Following the example above, this might be {"0": "Grassland", "1": "Water"}. This follows the same format one could use for pandas.rename(columns=lkup).
cell_size_sq_km (int, optional) – Optional value indicating the cell size of the characterization data being recast.
If specified, it has two effects. First, it will be used to convert values of the JSON to values of area in units of square kilometers during the recast process. Second, all recast column names specified in lkup will have the suffix _area_sq_km added to them. Continuing from the examples above, if cell_size_sq_km=0.0081, the value 44.3 above would be multiplied by 0.0081, producing a new value of 0.35883. This value would be stored in a new column named "Grassland_area_sq_km".
If not specified (or None), no conversion to area will be applied, values from the JSON will be passed through (or filled with 0 if missing), and column names specified in lkup will be used verbatim in the output dataframe.
By default, None.
- Returns:
pandas.DataFrame – New pandas dataframe with additional recast columns appended to the input dataframe.
- Raises:
TypeError – Raised if one or more values in df[col] are not of type str.
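The recasting rules described above can be illustrated for a single JSON string with a minimal plain-Python sketch. This is not the reVReports implementation: recast_row is a hypothetical helper that uses only the standard library. It assumes the embedded strings are valid JSON with double-quoted keys, that only keys listed in lkup produce output, and that keys missing from the JSON are filled with 0 in both modes.

```python
import json


def recast_row(json_str, lkup, cell_size_sq_km=None):
    """Hypothetical single-row helper mirroring the documented behavior."""
    if not isinstance(json_str, str):
        raise TypeError(f"expected a str, got {type(json_str).__name__}")
    # Assumption: the string is valid JSON (double-quoted keys).
    values = json.loads(json_str)
    recast = {}
    for key, name in lkup.items():
        # Assumption: keys absent from the JSON are filled with 0.
        value = values.get(key, 0)
        if cell_size_sq_km is None:
            # No conversion: pass the value through under the lkup name.
            recast[name] = value
        else:
            # Convert to area and add the _area_sq_km suffix.
            recast[f"{name}_area_sq_km"] = value * cell_size_sq_km
    return recast


lkup = {"0": "Grassland", "1": "Water"}
row = '{"0": 44.3, "1": 3.7}'

plain = recast_row(row, lkup)
areas = recast_row(row, lkup, cell_size_sq_km=0.0081)
print(plain)
# Rounded for display only, to hide floating-point noise.
print({name: round(area, 5) for name, area in areas.items()})
```

With cell_size_sq_km=0.0081, the Grassland value 44.3 becomes approximately 0.35883 under the name Grassland_area_sq_km, matching the arithmetic in the parameter description. The actual function applies the same idea across every row of df[col] and appends the results as new dataframe columns.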