How to add weights based on a condition on an explanatory variable in regression (Python)?


I have a problem at hand to identify siblings in a social networking site. I have considered a few variables such as last name, age_gap, tags, gender etc.

age_gap = the age gap between the person under consideration and the other person in their network.

Questions I need help with:

  1. Since two people with the same last name and the same gender could also be father and son, I want to give more weight to a lower age_gap. How do I do that?
  2. Is it possible to add weights for multiple conditions on the same independent variable?
  3. Is it possible to add weights for different independent variables?

  1. Have you tried creating a dummy column set to 1 for rows where age_gap is below some threshold (a number to experiment with)?
  2. Same approach: one dummy column per condition.
  3. Same approach again, applied to the other variables.
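
| Name | age_gap | is_it_big | is_it_medium | is_it_small |
|------|---------|-----------|--------------|-------------|
| Adam | 5       | 0         | 0            | 1           |
| Mark | 58      | 1         | 0            | 0           |
| Kar  | 20      | 0         | 1            | 0           |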
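
For what it's worth, here is a minimal sketch of how those dummy columns could be built with pandas. The cut-offs (10 and 30 years) are made-up numbers chosen only so that the output matches the table above; they are meant to be experimented with.

```python
import pandas as pd

# Toy data copied from the table above
df = pd.DataFrame({"Name": ["Adam", "Mark", "Kar"], "age_gap": [5, 58, 20]})

# Hypothetical thresholds (assumptions, tune them on your own data)
SMALL_MAX = 10   # age_gap <= 10        -> "small"
MEDIUM_MAX = 30  # 10 < age_gap <= 30   -> "medium", anything above -> "big"

# Each comparison gives a boolean Series; astype(int) turns it into 0/1
df["is_it_big"] = (df["age_gap"] > MEDIUM_MAX).astype(int)
df["is_it_medium"] = (
    (df["age_gap"] > SMALL_MAX) & (df["age_gap"] <= MEDIUM_MAX)
).astype(int)
df["is_it_small"] = (df["age_gap"] <= SMALL_MAX).astype(int)

print(df)
```

If the model has an intercept, it is usually safer to feed it only two of the three dummies, since the three columns always sum to 1.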

An alternative approach I'd experiment with is to create a column df['hundred-age_gap'] that subtracts the age gap from 100 and then raises the result to some power (experiment with the exponent). That produces a higher number, and so more weight, for rows with a small age_gap.

df['hundred-age_gap']  = (100 - df['age_gap'])**(some_number)

The same idea can be achieved with a square root, but it can be harder to explain :slight_smile:
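
A rough sketch of that transformed column on the same toy data, in case it helps. The exponent `some_number = 2` is just a placeholder to experiment with, and the `clip` step is an extra assumption: it keeps gaps above 100 from turning into large positive values when the exponent is even. I am also assuming the square-root variant means using an exponent of 0.5 on the same quantity.

```python
import pandas as pd

# Toy data copied from the table above
df = pd.DataFrame({"Name": ["Adam", "Mark", "Kar"], "age_gap": [5, 58, 20]})

some_number = 2  # placeholder exponent, an assumption to experiment with

# "Closeness" in age: large when the gap is small.
# clip(lower=0) is an added safeguard for gaps above 100.
closeness = (100 - df["age_gap"]).clip(lower=0)

# Raising to a power above 1 stretches the differences between rows
df["hundred-age_gap"] = closeness ** some_number

# Assumed square-root variant: an exponent of 0.5 compresses them instead
df["hundred-age_gap_sqrt"] = closeness ** 0.5

print(df)
```

The new column can then be used as a feature in place of (or next to) the raw age_gap.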


Thanks @adam.kubalica. I know about the dummy column approach, but the second one you suggested is what I was looking for.
