Hi Everyone,

# Project Title: eBay Car Sales Data

#### —} Part E: Data Analysis

``````---> 5. Analyse the data to:
5.1. Calculate the distribution based on the column: 'reg_year'.
5.2. Calculate the distribution based on the columns: 'date_crawled', 'ad_created' and 'last_seen'.
5.3. Select the top brands and aggregate the mean price for each.
5.4. Calculate the mean mileage and mean price for each of the top brands.
5.5. Find the most common brand/model combinations.
5.6. Find out whether average prices follow any pattern based on mileage.
5.7. Find out how much cheaper damaged cars are than their non-damaged counterparts.
``````
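
A minimal sketch of how steps 5.1, 5.3, 5.4 and 5.7 could look in pandas. The toy data, the DataFrame name `autos`, and the column names `price` and `unrepaired_damage` are assumptions made for illustration; they stand in for the cleaned dataset and are not taken from the project itself.

```python
import pandas as pd

# Toy stand-in for the cleaned dataset; all values are invented for illustration.
autos = pd.DataFrame({
    "brand": ["bmw", "bmw", "audi", "audi", "ford", "ford"],
    "reg_year": [2005, 2005, 2010, 2010, 2005, 1999],
    "price": [8000, 6000, 9000, 7000, 3000, 1000],  # assumed name for the price column
    "odometer_km": [100000, 150000, 90000, 150000, 125000, 150000],
    "unrepaired_damage": ["no", "yes", "no", "yes", "no", "yes"],  # assumed column
})

# 5.1: share of listings per registration year, ordered by year.
reg_year_dist = autos["reg_year"].value_counts(normalize=True).sort_index()

# 5.3 / 5.4: mean price and mean mileage per brand, most expensive first.
brand_stats = (
    autos.groupby("brand")[["price", "odometer_km"]]
    .mean()
    .sort_values("price", ascending=False)
)

# 5.7: mean price by damage status, and the relative discount for damaged cars.
price_by_damage = autos.groupby("unrepaired_damage")["price"].mean()
discount = 1 - price_by_damage["yes"] / price_by_damage["no"]

print(reg_year_dist)
print(brand_stats)
print(f"Damaged cars are about {discount:.0%} cheaper on average")
```

On the real data, the brand table would normally be limited to the top brands first, for example those above some share of all listings; that threshold is a judgment call.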

#### —} Part A: Original Dataset

``````---> 1. How the original dataset is organised
1.1. Observation
``````

#### —} Part B: Reorganising Dataset

``````---> 2. Rows & Columns
2.1. Review of unique values returned as NaN.
2.2. Review of columns with only 2 unique values.
2.3. Convert the data type of the 'price' and 'odometer' columns from object to integer.
2.4. Translate non-English words to English words.
2.5. Change the column names from camelCase to snake_case and reorganise the columns.
``````
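
A rough sketch of steps 2.3 to 2.5 in pandas. The sample values, the `gearbox` column, the original name `yearOfRegistration`, and the helper `camel_to_snake` are assumptions for illustration only. A mechanical conversion like this gives `year_of_registration`; shorter names such as 'reg_year' would still need a manual rename.

```python
import re
import pandas as pd

# Invented sample rows that mimic text-formatted numeric columns.
raw = pd.DataFrame({
    "price": ["$5,000", "$8,500", "$990"],
    "odometer": ["150,000km", "70,000km", "5,000km"],
    "gearbox": ["manuell", "automatik", "manuell"],
    "yearOfRegistration": [2004, 2011, 1997],
})

def camel_to_snake(name):
    """Hypothetical helper: put an underscore before each interior capital, then lowercase."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()

# 2.5: camelCase -> snake_case for every column name.
raw.columns = [camel_to_snake(col) for col in raw.columns]

# 2.3: strip the non-numeric characters, then convert object -> integer.
raw["price"] = raw["price"].str.replace(r"[$,]", "", regex=True).astype(int)
raw["odometer"] = raw["odometer"].str.replace(r"[,km]", "", regex=True).astype(int)
raw = raw.rename(columns={"odometer": "odometer_km"})  # keep the unit in the name

# 2.4: translate non-English values with an explicit mapping.
raw["gearbox"] = raw["gearbox"].replace({"manuell": "manual", "automatik": "automatic"})

print(raw.dtypes)
print(raw)
```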

#### —} Part C: Organised Dataset

``````---> 3. Quick Review of the organised dataset.
``````

#### —} Part D: Cleaning Data Entries

``````---> 4. Data entries
4.1. Remove data for antique vehicles from the column: 'reg_year'.
4.2. Remove inaccurate entries in the column: 'reg_year'.
4.3. Review data entries in the column: 'reg_month'.
4.4. Check for outliers in the column: 'odometer_km'.
4.5. Check for outliers in the column: 'price_\$'.
``````
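
One way the filtering in steps 4.1, 4.2 and 4.5 might be sketched with boolean masks. The cut-off values (1960, 2016, 1 and 350000), the sample rows, and the column name `price` are assumptions picked for the example, not the thresholds used in the project.

```python
import pandas as pd

# Invented rows covering antique, impossible, and plausible entries.
autos = pd.DataFrame({
    "reg_year": [1000, 1950, 2004, 2012, 2016, 9999],
    "price": [500, 12000, 3500, 0, 7000, 99999999],  # assumed name for the price column
})

# 4.1 / 4.2: keep only plausible registration years (bounds are assumed, inclusive).
valid_year = autos["reg_year"].between(1960, 2016)

# 4.5: drop free listings and extreme prices (bounds are assumed, inclusive).
valid_price = autos["price"].between(1, 350000)

cleaned = autos[valid_year & valid_price]

# A quick look at the summary statistics helps to choose or confirm the bounds.
print(autos["price"].describe())
print(cleaned)
```

`Series.between` includes both endpoints by default, so a car registered in 2016 is kept here.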

Thank you.