fastbook 08: How to create a crosstab from a DataFrame
from fastai.collab import *
from fastai.tabular.all import *
path = untar_data(URLs.ML_100k)
df = pd.read_csv(path/'u.data', delimiter='\t', header=None, names=['user', 'movie', 'rating', 'timestamp'])
df = df.drop(columns='timestamp')
df.head()
Generate a new df containing only the 10 users who wrote the most reviews
user_ids = df.user.value_counts().index.tolist()[:10]
df_users = df[df.user.isin(user_ids)]
df_users.head()
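The top-N selection above relies on `value_counts()` returning counts sorted in descending order, so the first N index labels are the most frequent values. A minimal sketch on a made-up table (the `toy` frame and its values are hypothetical, not the MovieLens data):

```python
import pandas as pd

# Hypothetical toy ratings table, used only for illustration
toy = pd.DataFrame({
    'user':   [1, 1, 1, 2, 2, 3],
    'movie':  [10, 20, 30, 10, 20, 10],
    'rating': [5, 3, 4, 2, 5, 1],
})

# value_counts() sorts by count (descending), so slicing the index
# keeps the users with the most rows
top_users = toy.user.value_counts().index.tolist()[:2]

# Boolean filtering keeps the original row index of the selected rows
toy_users = toy[toy.user.isin(top_users)]
print(top_users)
print(toy_users)
```

Note that the filtered frame keeps the row labels of the original frame, which matters for the crosstab step later on.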
Generate a new df containing only the 20 most frequently reviewed movies
movie_ids = df.movie.value_counts().index.tolist()[:20]
df_movies = df[df.movie.isin(movie_ids)]
df_movies.head()
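Combine both dfs into a crosstab of mean ratings. Both dfs keep the row index of the original df, so the user and movie columns are aligned on that shared index, i.e. on '(index, user)' and '(index, movie)'. User/movie pairs without a rating are shown as '-'.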
pd.crosstab(df_users.user, df_movies.movie, values=df.rating, aggfunc='mean').fillna('-')
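The index alignment in the crosstab above can be checked on a small table. The sketch below (again with a hypothetical `toy` frame) compares the index-aligned call with a more explicit variant that filters the rows with a single boolean mask first; as far as I can tell both give the same table, because only rows present in both filtered frames take part in the crosstab.

```python
import pandas as pd

# Hypothetical toy ratings table, used only for illustration
toy = pd.DataFrame({
    'user':   [1, 1, 1, 2, 2, 3],
    'movie':  [10, 20, 30, 10, 20, 10],
    'rating': [5, 3, 4, 2, 5, 1],
})

toy_users = toy[toy.user.isin([1, 2])]       # keeps rows 0, 1, 2, 3, 4
toy_movies = toy[toy.movie.isin([10, 20])]   # keeps rows 0, 1, 3, 4, 5

# Same pattern as the cell above: the Series are matched by row label
aligned = pd.crosstab(toy_users.user, toy_movies.movie,
                      values=toy.rating, aggfunc='mean')

# Explicit variant: select the rows that satisfy both conditions first
both = toy[toy.user.isin([1, 2]) & toy.movie.isin([10, 20])]
explicit = pd.crosstab(both.user, both.movie,
                       values=both.rating, aggfunc='mean')

print(aligned)
print(explicit)
```

The explicit variant may be easier to read since it does not depend on implicit alignment. Keep in mind that `fillna('-')` mixes strings into a numeric table, so it is best applied only for display.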