1 Create a column Usage_Per_Year from Miles_Driven_Per_Year by discretizing the values into three equally sized categories. The names of the categories should be Low, Medium, and High.
2 Group by Usage_Per_Year and print the group sizes as well as the ranges of each.
3 Do the same as in #1, but instead of equally sized categories, create categories that have the same number of points per category.
4 Group by Usage_Per_Year and print the group sizes as well as the ranges of each.
My code is below:
df["Usage_Per_Year"], bins = pd.cut(df["Miles_Driven_Per_Year"], 3, precision=2, retbins=True)
group_label = pd.Series(["Low", "Medium", "High"])
#3.3.2
group_size = df.groupby("Usage_Per_Year").size()
#print(group_size)
print(group_size.reset_index().set_index(group_label))
#3.3.3
Year2 = pd.cut(df["Miles_Driven_Per_Year"], 3, precision=2)
group_label = pd.Series(["Low", "Medium", "High"])
#3.3.4
group_size = df.groupby("Usage_Per_Year").size()
#print(group_size)
print(group_size.reset_index().set_index(group_label))
The output is below:
                   Usage_Per_Year     0
Low       (-1925.883, 663476.235]  6018
Medium  (663476.235, 1326888.118]     0
High     (1326888.118, 1990300.0]     1
                   Usage_Per_Year     0
Low       (-1925.883, 663476.235]  6018
Medium  (663476.235, 1326888.118]     0
High     (1326888.118, 1990300.0]     1
But the lower bound of -1925 is wrong...
The correct answer should look like this.
How can I do this...
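One possible approach, as a minimal sketch rather than a definitive answer. When `pd.cut` is given an integer number of bins, it pushes the lowest edge slightly below the minimum (by 0.1% of the data range) so that the minimum value lands inside the first interval. That is where the negative bound comes from, so the interval labels are not the actual ranges of the data. Passing `labels=` gives the Low/Medium/High names directly, `pd.qcut` gives categories with roughly the same number of points, and aggregating `min` and `max` per group reports the observed ranges. The small DataFrame and the helper `summarize` below are hypothetical stand-ins, since the real data is not shown in the question.

```python
import pandas as pd

# Hypothetical sample data; the real df from the question is not available.
df = pd.DataFrame(
    {"Miles_Driven_Per_Year": [100, 5000, 8000, 12000, 15000, 20000, 30000, 45000, 60000]}
)

labels = ["Low", "Medium", "High"]


def summarize(frame, column):
    """Return the group sizes and the observed min/max miles per group."""
    grouped = frame.groupby(column, observed=False)["Miles_Driven_Per_Year"]
    return grouped.agg(["count", "min", "max"])


# Tasks 1 and 2: three equal-width bins, named via labels=
df["Usage_Per_Year"] = pd.cut(df["Miles_Driven_Per_Year"], bins=3, labels=labels)
equal_width = summarize(df, "Usage_Per_Year")
print(equal_width)

# Tasks 3 and 4: three bins with (roughly) the same number of points each
df["Usage_Per_Year"] = pd.qcut(df["Miles_Driven_Per_Year"], q=3, labels=labels)
equal_freq = summarize(df, "Usage_Per_Year")
print(equal_freq)
```

With data like that in the question, where a single very large value stretches the range, the equal-width version will still put almost every row into Low. That is expected behavior for equal-width bins, and it is the case `pd.qcut` is meant to handle.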