serhii.net

In the middle of the desert you can say anything you want

15 May 2023

Pandas categorical types weirdness

Spent hours trying to understand what’s happening.

TL;DR: when you group by a categorical column, the result shows ALL declared categories, even those with no rows left in the actual data.

# Shows all categories including OTHER
df_item[df_item['item.item_category']!="OTHER"].groupby(['item.item_category']).sum()

# Workaround: cast the categorical column to plain strings
df_item['item.item_category'] = df_item['item.item_category'].astype(str)

# Shows only the three categories actually present
df_item[df_item['item.item_category']!="OTHER"].groupby(['item.item_category']).sum()
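
A self-contained sketch of the same effect, for anyone who wants to reproduce it. The DataFrame here is a made-up stand-in for df_item (the category values and the "count" column are assumptions, not the real data). It also shows two alternatives to the astype(str) cast that keep the categorical dtype: the `observed` argument of groupby and `cat.remove_unused_categories()`. `observed=False` is passed explicitly because that was the default when this was written, and the default may differ in newer pandas versions.

```python
import pandas as pd

# Hypothetical stand-in for df_item; names and values are invented for illustration.
df_item = pd.DataFrame(
    {
        "item.item_category": pd.Categorical(
            ["A", "B", "C", "OTHER"], categories=["A", "B", "C", "OTHER"]
        ),
        "count": [1, 2, 3, 4],
    }
)

# Filtering rows does not remove "OTHER" from the column's declared categories.
filtered = df_item[df_item["item.item_category"] != "OTHER"]

# observed=False: every declared category appears, so OTHER shows up with sum 0.
all_cats = filtered.groupby("item.item_category", observed=False)["count"].sum()

# observed=True: only categories that actually occur in the filtered data.
seen_cats = filtered.groupby("item.item_category", observed=True)["count"].sum()

# Alternative: drop unused categories but keep the categorical dtype.
trimmed = filtered.assign(
    **{
        "item.item_category": filtered[
            "item.item_category"
        ].cat.remove_unused_categories()
    }
)
trimmed_cats = trimmed.groupby("item.item_category", observed=False)["count"].sum()

print(all_cats)
print(seen_cats)
print(trimmed_cats)
```

Which one to use probably depends on whether the empty categories are ever wanted downstream: `observed=True` is a per-call choice, while removing unused categories changes the column itself.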

Related thread: groupby with categorical type returns all combinations · Issue #17594 · pandas-dev/pandas
