Project - Detection of Bias in Movie Rating Systems through Statistical Analysis

This project analyzes three datasets:
a. Walt Hickey’s ‘fandango_score_comparison.csv’ with 146 entries.
b. Alex Olteanu’s ‘movie_ratings_16_17.csv’ with 214 entries.
c. Walt Hickey’s original dataset ‘fandango_scrape.csv’ with 510 entries.

It reaches the following conclusions:
a. Hickey’s abridged dataset with 146 entries is itself biased towards exaggerating the alleged bias in Fandango ratings, as it only includes movies with high ratings and many votes.
b. Olteanu’s dataset follows a similar pattern, though it is primarily aimed at identifying normal distributions in the rating patterns of various aggregators (using a rather small dataset)…
c. However, Hickey’s 2015 critique most likely prompted a downward revision in Fandango ratings, shifting the mean from slightly above 4.0 to slightly below 4.0. This was achieved by increasing the frequency of 4.0 ratings and decreasing the frequency of the higher 4.5 and 5.0 ratings.
d. Fandango has recently switched to Rotten Tomatoes’ “TOMATOMETER” rating system.
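
For readers who want to reproduce the kind of comparison behind conclusion (c), here is a minimal sketch of the idea: compute the mean and the percentage share of each rating value for two samples, then compare them side by side. The helper name `rating_summary` and both sample lists are hypothetical and made up purely for illustration; they are not taken from the datasets or from the notebook.

```python
from collections import Counter
from statistics import mean


def rating_summary(ratings):
    """Return (mean, shares), where shares maps each rating value
    to its percentage share of the sample."""
    counts = Counter(ratings)
    total = len(ratings)
    shares = {
        value: round(100 * n / total, 1)
        for value, n in sorted(counts.items())
    }
    return mean(ratings), shares


# Made-up illustrative samples, NOT values from the datasets above.
sample_before = [3.5, 4.0, 4.0, 4.5, 4.5, 4.5, 5.0, 5.0]
sample_after = [3.0, 3.5, 4.0, 4.0, 4.0, 4.0, 4.5, 4.5]

mean_before, shares_before = rating_summary(sample_before)
mean_after, shares_after = rating_summary(sample_after)

print("before:", mean_before, shares_before)
print("after: ", mean_after, shares_after)
```

With real data, the two lists would be replaced by the Fandango rating columns of the earlier and later datasets. Comparing percentage shares rather than raw counts matters here because the datasets differ in size (146 versus 214 entries).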

Feedback on the methodology and conclusions would be appreciated.

project_bias_detection_movies_rating_systems.ipynb (243.8 KB)
