Pandas data cleaning

Screen Link: Learn data science with Python and R projects

My Code:

import pandas as pd
laptops = pd.read_csv('laptops.csv', encoding='Latin-1')

def clean_col(col):
    col = col.strip()
    col = col.replace("Operating System","os")
    col = col.replace("","_")
    col = col.replace("(","")
    col = col.replace(")","")
    col = col.lower()
    return col

new_columns = []
for row in laptops.columns:
    clean_c = clean_col(row)
    new_columns.append(clean_c)
laptops.columns = new_columns

What I expected to happen:
Nice work

What actually happened:
Value of laptops is not what we expected

In “laptops.columns = new_columns”, laptops stays highlighted in red even though I can’t see anything wrong with it, so I don’t know why the result is not what was expected.

What is the final output of print(laptops.columns)? Providing the output of your code in your posts will help others to help you. When I run your code, I get this:

Index(['_m_a_n_u_f_a_c_t_u_r_e_r_', '_m_o_d_e_l_ _n_a_m_e_',
       '_c_a_t_e_g_o_r_y_', '_s_c_r_e_e_n_ _s_i_z_e_', '_s_c_r_e_e_n_',
       '_c_p_u_', '_r_a_m_', '_s_t_o_r_a_g_e_', '_g_p_u_', '_o_s_',
       '_o_s_ _v_e_r_s_i_o_n_', '_w_e_i_g_h_t_', '_p_r_i_c_e_ __e_u_r_o_s__'],
      dtype='object')

This is definitely not the result we are looking for! So what’s the problem? Why are we getting all these extra _ between letters? Take a look at your definition of clean_col to see if you can figure out where these underscores are coming from, then try tweaking your code to do what you want it to do.

Let me know if this isn’t clear or if you would like some more instructions.


Hi @ipngasi,

You need to put a space inside the quote marks so that spaces (rather than empty strings) are replaced with underscores:

col = col.replace(" ","_")

And then assign back the result to the DataFrame.columns attribute:

laptops.columns = new_columns
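
To see why the empty string causes trouble, here is a minimal sketch in plain Python. The sample column names are assumptions guessed from the output above, not read from laptops.csv, and the list stands in for laptops.columns. An empty search string matches at every position in the string, so replace("", "_") inserts an underscore before each character and at the end.

```python
# An empty search string matches between every character.
print("ram".replace("", "_"))          # _r_a_m_
# A single space only matches actual spaces.
print("Model Name".replace(" ", "_"))  # Model_Name

def clean_col(col):
    col = col.strip()
    col = col.replace("Operating System", "os")
    col = col.replace(" ", "_")   # note the space between the quotes
    col = col.replace("(", "")
    col = col.replace(")", "")
    return col.lower()

# Hypothetical stand-in for laptops.columns (names are assumed).
sample_columns = ["Manufacturer", "Model Name", " Storage",
                  "Operating System Version", "Price (Euros)"]

new_columns = [clean_col(c) for c in sample_columns]
print(new_columns)
```

With the real DataFrame, the same new_columns list would then be assigned to laptops.columns as shown above.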

@WilfriedF FYI – I was going for a “teachable moment” here and didn’t want to just provide the answer…but I do agree with you…that is the solution!

Oh @mathmike314, I suspected as much, sorry! Good to know; next time I will wait a while before posting. It’s like a game for me: I like to search for the solution and find it. But I understand your point of view: it’s more valuable for the student to be taught than to get the solution immediately without having to rack their brain over it. :)

@WilfriedF No worries, I do that too! It’s just that for some situations, I find there is more value to the learner in being guided rather than just providing the answer. It allows them that wonderful feeling of “Oh! I see what’s happening here!” and then solve it themselves. Particularly in situations like this where the solution only requires inserting a space character in the right place. Allowing the learner to find this themselves can be very gratifying and motivating. Simply providing the answer can have the opposite effect (sometimes.)


Hi all,

Thank you both for your advice. I tried “” and " " and understand the difference now. I always miss small details, which then cause errors. It’s good to learn by finding the error myself or by being guided to the reason for it.

Big thank you!
