Guided Project: My second guided project, My own data analysis before my guided project

Shell_Fuel_Prices_UK.ipynb (125.4 KB)

Shout out to: @wanzulfikri for the help.

I wanted to try and build on the skills I have learnt. It's been a great experience and I have enjoyed it immensely.

I know I have a lot to improve on, but comparing this with the first project I definitely see progress. I think the data is pretty useful in a real-world situation.

Thanks,
Frankie



Hi @Frankie,

I’m glad to see that your hard work has paid off. I would also argue that your project is not exactly a Dataquest guided project but mostly a self-directed one, which is great because it demonstrates successful transfer of learning and intellectual independence.

Anyhow, well done and thank you for sharing it.

Some thoughts and suggestions on improving the project:

1. Rerun all the code cells before sharing a notebook

This is good for one last bug-check before sharing and also improves the aesthetic a bit i.e. code cell numbers go from 1, 2, 3, … and so forth.

2. Some minor spelling mistakes and also inconsistent capitalization e.g. UK and uk.

Not that big a deal, but well, pedantry haha.

3. Add more details to the Excel data cleaning

The data needed a bit of cleaning in Excel. I converted the price from a string into a float and removed empty cells. I also removed any columns that only had one type of fuel price listed. So I do expect a fractional variance.

Telling the readers that you’ve done some data cleaning beforehand in Excel is a good call.

Ideally though, you want to do the whole data cleaning within the notebook without relying on Excel or similar software. Some readers might want to reproduce your results on their own computer, and some of them might not even have Excel; even if they do, how certain are we that they can modify the sheet exactly as you’ve done? They might be able to do it for simpler projects but probably not for more complex ones.

It’s very good that you’ve provided the steps you took to modify the data with Excel. But you might want to give a bit more detail. For example, “I also removed any columns that only had one type of fuel price listed.” → list the exact columns you’ve removed so readers can verify that they’ve removed the same columns as you did.

Another option is you can provide readers with the file you’ve modified. This way they don’t need to do any Excel cleaning; they can just load your file, run the notebook, and immediately get the same results you had.
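To illustrate the in-notebook option, here is a minimal sketch of the cleaning steps described above done in plain Python instead of Excel. The sample rows, station names, and the `removed_stations` list are hypothetical, and it assumes that "only had one type of fuel price listed" means a row with a missing diesel or unleaded price; the actual spreadsheet may be laid out differently.

```python
# Hypothetical raw rows standing in for the uncleaned spreadsheet export.
# Assumed column order: station, region, diesel price, unleaded price.
raw_rows = [
    ["Station", "Region", "Diesel", "Unleaded"],
    ["Station A", "Shropshire", "181.9", "173.9"],
    ["Station B", "Shropshire", "", "169.9"],   # missing diesel price
    ["Station C", "Kent", "179.9", "165.9"],
    ["Station D", "Kent", "178.9", ""],         # missing unleaded price
]

header, body = raw_rows[0], raw_rows[1:]
fuel_data = [header]
removed_stations = []  # kept so readers can verify exactly what was dropped

for station, region, diesel, petrol in body:
    # Skip rows where either price is empty, and record which ones were skipped
    if diesel.strip() == "" or petrol.strip() == "":
        removed_stations.append(station)
        continue
    # Convert the price strings to floats
    fuel_data.append([station, region, float(diesel), float(petrol)])

print(fuel_data)
print("Removed:", removed_stations)
```

Printing the removed rows inside the notebook serves the same purpose as listing them by hand: readers can check that their cleaned data matches yours.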

Only in Jupyter: Last line in a code cell will automatically be printed

In a Jupyter notebook, the last line in a code cell (specifically, if it’s an expression) will automatically be displayed by default. For example, print(fuel_data) can be replaced with fuel_data and it’ll still be shown.

This is only for Jupyter from what I can tell.

Handling the flat list conversion

In the following, the list came out flat because you’re appending each row’s elements one by one rather than the whole row.

which_region = []
# Loop over the rows (skipping the header) and append to a list named which_region
for row in fuel_data[1:]:
    region = row[1]
    station = row[0]
    diesel = row[2]
    petrol = row[3]

    # Use a conditional to check if the station is located in the region I need
    if region == 'Shropshire':
        # Each value is appended separately, which flattens the list
        which_region.append(region)
        which_region.append(station)
        which_region.append(diesel)
        which_region.append(petrol)

print(which_region)

The fix is to just append the whole row once you’ve verified that the region is Shropshire.

which_region = []
# Loop over the rows (skipping the header) and append to a list named which_region
for row in fuel_data[1:]:
    # We only need the region for checking
    region = row[1]

    # Use a conditional to check if the station is located in the region I need
    if region == 'Shropshire':
        # Append the whole row if the region is Shropshire
        which_region.append(row)

print(which_region)

That way you’ll maintain the row by row list.
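To make the difference concrete, here is a small sketch that builds both versions from the same data. The sample rows are hypothetical and only mimic the column order used in the loops above (station, region, diesel, petrol); `flat` and `nested` are illustrative names.

```python
# Hypothetical sample data; the real fuel_data comes from the notebook's file.
fuel_data = [
    ["Station", "Region", "Diesel", "Unleaded"],
    ["Station A", "Shropshire", 181.9, 173.9],
    ["Station B", "Kent", 179.9, 165.9],
    ["Station C", "Shropshire", 180.9, 169.9],
]

flat = []
nested = []
for row in fuel_data[1:]:
    if row[1] == "Shropshire":
        # Adds each element individually, like the four separate append calls
        flat.extend(row)
        # Adds the row as a single item, so the row structure is kept
        nested.append(row)

print(len(flat))     # number of individual values
print(len(nested))   # number of matching rows
print(nested[1][0])  # station name of the second matching row
```

With the nested version you can still index by row and then by column, which is not possible once everything has been flattened into one list.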

You can also mix and match. Say you want to leave out the diesel price:

which_region = []
# Loop over the rows (skipping the header) and append to a list named which_region
for row in fuel_data[1:]:
    region = row[1]
    station = row[0]
    petrol = row[3]

    # Use a conditional to check if the station is located in the region I need
    if region == 'Shropshire':
        # Append a new three-item row without the diesel price
        which_region.append([region, station, petrol])

print(which_region)

I’m trying to understand the yearly savings calculation

yearly_savings = .17 * 50 * 4 * 12 → is there a 17% price reduction when switching from Millom to Highbury East?

The per-litre saving divided by Millom’s price (17 / 173.9) is around 0.1, which is 10%. So the yearly savings should be around £240 (0.1 * 50 * 4 * 12): lower than £408 but still substantial.
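The arithmetic above can be checked with a short sketch. The figures (17p gap, 173.9p Millom price, £50 weekly spend, 4 * 12 weeks) are taken from this thread; the variable names are just for illustration.

```python
price_gap_pence = 17         # per-litre gap between the two stations
millom_price_pence = 173.9   # Millom's unleaded price per litre
weekly_spend_pounds = 50     # weekly fuel spend used in the notebook's formula
weeks_per_year = 4 * 12      # the notebook's approximation of a year

# The notebook's formula treats 17p as a 17% saving on the money spent
original_estimate = 0.17 * weekly_spend_pounds * weeks_per_year

# The saving as a fraction of what the driver currently pays per litre
saving_fraction = price_gap_pence / millom_price_pence
revised_estimate = saving_fraction * weekly_spend_pounds * weeks_per_year

print(round(original_estimate, 2))
print(round(saving_fraction, 3))
print(round(revised_estimate, 2))
```

Without rounding the fraction up to 0.1, the revised figure comes out slightly under the "around £240" estimate, but the conclusion is the same.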

The conclusion might need some additional details

Since petrol price is dynamic, you might want to emphasize the time frame for the data set. Your findings will fit that time frame but the findings will be different when data is collected at a different time.

“If a driver switched stations they could save £408” → you might need to highlight that this is only achievable when a driver who uses the petrol station with the highest unleaded price switches to the petrol station with the lowest unleaded price. I don’t know how far Millom is from Highbury East, but if those two are quite far apart, it’s worthwhile to touch on the practicality of switching. Also note that the potential £408 saving is only for unleaded fuel and not diesel.

That yearly savings figure is also based on the assumption that the price gap between highest and lowest priced petrol stations will be the same throughout the year.


That’s it from me. Hope that helps.

Keep up the hard work. Cheers.


Hi Wan.

Thank you very much for the feedback. I have revisited some of the points you mentioned and made some adjustments. All very valuable feedback!

The math may have been off. I was assuming a customer spends £50 a week and saves 17p per litre, which is how I arrived at that figure. But the figure is flawed anyway because fuel prices change.

Also, I looked at the distance between the two and they are 5 hours apart, so it doesn’t make sense that they are both listed as being located in “Shropshire”.

I have changed the conclusion to reflect this.

I am making another project in Jupyter and keeping these points in mind. I’m using a little HTML styling to add color this time, and it’s looking nicer and more structured, which is helping.

I think this project has taught me I’m still lacking the basics. I have decided to go back and learn the basics before I keep progressing. It’s important I have a solid grasp of them before I move on.

Thank you kindly for all your feedback and guidance :grinning:


No worries, @Frankie.

That reminds me of this excerpt I read from Ben Hogan’s 5 Lessons. I’m not a golfer. But I thought I wanted to try golfing, read that book, and surprisingly found some lessons that are applicable to many things and not just golf.

Anyhow, here’s the excerpt (emphasis mine):

In 1946 my attitude suddenly changed. I honestly began to feel that I could count on playing fairly well each time I went out, that there was no practical reason for me to feel I might suddenly “lose it all.” I would guess that what lay behind my new confidence was this: I had stopped trying to do a great many difficult things perfectly because it had become clear in my mind that this ambitious over-thoroughness was neither possible nor advisable, or even necessary. All you needed to groove were the fundamental movements — and there weren’t so many of them. Moreover, they were movements that were basically controllable and so could be executed fairly well whether you happened to be sharp or not so sharp that morning. I don’t know what came first, the chicken or the egg, but at about the same time I began to feel that I had the stuff to play creditable golf even when I was not at my best, my shot making started to take on a new and more stable consistency. THE BASIS FOR THIS PROGRESS, LET ME REPEAT, WAS MY GENUINE CONVICTION THAT ALL THAT IS REALLY REQUIRED TO PLAY GOOD GOLF IS TO EXECUTE PROPERLY A RELATIVELY SMALL NUMBER OF TRUE FUNDAMENTAL MOVEMENTS.
