Hello everyone. This might be simple and I may just be confused, but I would appreciate help with the following. (Apologies if anything is poorly explained or structured; this is my first time posting a question.)
I have a few columns in my dataframe.
Note: ranges are inclusive of the limits, and columns I to K are calculated after extracting the dataset.
Column A - discrete integer values - range [3,20]
Column B - numeric - range [0,10000]
Column C - numeric - range [0,10000]
Column D - numeric - range [0,10000]
Column E - 0 or 1
Column F - 0 or 1
Column G - 0 or 1
Column H - 0 or 1
Column I = Column C / Column B (will generate NaN/inf values, so care is needed)
Column J = Column D / Column C (will generate NaN/inf values, so care is needed)
Column K = Column F / Column E (might contain NaN/inf values, so care is needed)
Column L - category - a or b
Column M - category - c or d
Column N - category - e or f
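For reference, this is roughly how the calculated columns could be built so that division by zero does not leave inf values behind. It is only a minimal sketch: the toy data and the helper name `safe_ratio` are made up for illustration and are not my real dataset or code.

```python
import numpy as np
import pandas as pd

# Toy data standing in for the real dataset (values are invented).
df = pd.DataFrame({
    "Column B": [0, 10, 20, 0],
    "Column C": [0, 5, 0, 8],
    "Column D": [4, 10, 3, 2],
    "Column E": [0, 1, 1, 0],
    "Column F": [0, 1, 0, 1],
})


def safe_ratio(num, den):
    """Element-wise num / den, with +/-inf (from x / 0) turned into NaN."""
    # 0 / 0 is already NaN in pandas; x / 0 gives +/-inf, which is replaced here.
    return (num / den).replace([np.inf, -np.inf], np.nan)


df["Column I"] = safe_ratio(df["Column C"], df["Column B"])
df["Column J"] = safe_ratio(df["Column D"], df["Column C"])
df["Column K"] = safe_ratio(df["Column F"], df["Column E"])
```

With only NaN left as the "missing" marker, the rows can later be dropped or ignored in one consistent way.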
I am currently trying to plot Column A against Column I, J, F or G using a seaborn line plot:
sns.lineplot(data=temp_df1, x='Column A', y='Column I', hue='Column H')  # y is one of Column I, J, F, G
- I am able to plot Column A vs Column G or F with hue as Column H. For each discrete value of Column A, I see the mean of the y column with a 95% confidence interval band, one line per value of Column H. However, I want to understand how seaborn computes the y values for the calculated columns I and J. Is it:
y value for calculated column I or J = MEAN(numerator column) / MEAN(denominator column)
or is it simply MEAN(Column I or J)?
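To make the two candidate definitions concrete, here is a small sketch on made-up numbers. My understanding is that sns.lineplot's default estimator='mean' is applied to the y values exactly as they appear in the column, which would correspond to `mean_of_ratios` below, but I would appreciate confirmation. The data and variable names are hypothetical.

```python
import pandas as pd

# Invented numbers, chosen so the two definitions disagree for Column A == 3.
df = pd.DataFrame({
    "Column A": [3, 3, 3, 4, 4],
    "Column B": [10, 100, 50, 20, 20],
    "Column C": [5, 10, 50, 10, 5],
})
df["Column I"] = df["Column C"] / df["Column B"]

grouped = df.groupby("Column A")

# Definition 1: average the per-row ratios.
mean_of_ratios = grouped["Column I"].mean()

# Definition 2: divide the average numerator by the average denominator.
ratio_of_means = grouped["Column C"].mean() / grouped["Column B"].mean()

print(pd.DataFrame({"mean_of_ratios": mean_of_ratios,
                    "ratio_of_means": ratio_of_means}))
```

In this toy example the two quantities differ for Column A == 3 and coincide for Column A == 4, where every row has the same denominator.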
Further, I want to plot Column A vs sum(Column C)/sum(Column B) and sum(Column D)/sum(Column E), with hue as Column H.
Any advice on how I could achieve this using sns.lineplot? Which operations do I need to perform on the dataframe first (manipulating it, resetting the index, a pivot table, etc.)? Any help or explanation would help me learn. Thank you.
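What I have in mind is something like the sketch below: aggregate the sums per (Column A, Column H) group first, then take the ratio of the sums. The data and the column name `sum_C_over_sum_B` are invented for illustration, and I am not sure this is the recommended approach.

```python
import numpy as np
import pandas as pd

# Invented data; one group (A=3, H=1) has a zero denominator on purpose.
df = pd.DataFrame({
    "Column A": [3, 3, 3, 3, 4, 4],
    "Column H": [0, 0, 1, 1, 0, 1],
    "Column B": [10, 30, 0, 0, 20, 5],
    "Column C": [5, 15, 4, 6, 10, 10],
})

# One row per (Column A, Column H) holding the summed numerator and denominator.
sums = df.groupby(["Column A", "Column H"], as_index=False)[["Column B", "Column C"]].sum()

# Ratio of sums; a zero denominator gives inf, which is turned into NaN.
sums["sum_C_over_sum_B"] = (
    (sums["Column C"] / sums["Column B"]).replace([np.inf, -np.inf], np.nan)
)
print(sums)
```

The result could then be passed to `sns.lineplot(data=sums, x="Column A", y="sum_C_over_sum_B", hue="Column H")`. Since there is only one row per x value and hue level, I would not expect a confidence band in that plot. The sum(Column D)/sum(Column E) ratio would follow the same pattern.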
- Further, I am trying to use sns.relplot or sns.FacetGrid to create multiple plots without for loops, in the following way:
dfm1 = temp_df1.melt(id_vars=['Column A', 'Column H', 'Column L'], value_vars=['Column G', 'Column I', 'Column J', 'Column F'])
sns.relplot(data=dfm1, x='Column A', y='value', col='Column H', hue='Column L', row='variable', kind='line')
Although I get the plots, the ones for columns I and J are incomplete (I suspect this is due to the NaN or inf values). I receive the following warnings:
invalid value encountered in reduce
invalid value encountered in add
invalid value encountered in reduce
invalid value encountered in multiply
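These messages look like NumPy RuntimeWarnings rather than exceptions, and my guess is that they come from inf values entering the mean and confidence interval calculations. A minimal sketch of the cleanup I am considering before plotting is below; the toy values are invented, and only the name `temp_df1` matches my real code.

```python
import numpy as np
import pandas as pd

# Invented stand-in for temp_df1, with inf and NaN in the calculated columns.
temp_df1 = pd.DataFrame({
    "Column A": [3, 4, 5],
    "Column H": [0, 1, 0],
    "Column L": ["a", "b", "a"],
    "Column G": [0, 1, 1],
    "Column I": [0.5, np.inf, np.nan],
    "Column J": [2.0, 1.0, -np.inf],
    "Column F": [1, 0, 1],
})

value_cols = ["Column G", "Column I", "Column J", "Column F"]

# Turn +/-inf into NaN so there is a single kind of missing value.
clean = temp_df1.copy()
clean[value_cols] = clean[value_cols].replace([np.inf, -np.inf], np.nan)

# Melt to long form, then drop rows whose value is missing.
dfm1 = (
    clean.melt(id_vars=["Column A", "Column H", "Column L"], value_vars=value_cols)
         .dropna(subset=["value"])
)
```

Dropping after the melt removes only the affected (row, variable) pairs, so a row with a bad Column I still contributes its Column G and Column F values.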
Can anyone please help me resolve this? It would also be helpful if you could suggest an efficient, more compact way to plot the following:
x = Column A vs y = each of columns (G, I, J, F), for each value (0 or 1) of Column H, with hue = each of columns (L, M, N).
So 4x2x3 would give a total of 24 plots.
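One loop-free idea I have been sketching is to melt twice: once over the y columns and once over the hue columns, so that a single sns.relplot call can facet over everything. The helper names `to_long` and `plot_grid`, the `panel` column, and the toy data are all hypothetical, and I have not verified this on my real data.

```python
import pandas as pd


def to_long(df, y_cols, hue_cols, id_cols=("Column A", "Column H")):
    """Melt the y columns and the hue columns into long form."""
    id_cols, y_cols, hue_cols = list(id_cols), list(y_cols), list(hue_cols)
    # First melt: one row per (original row, y column).
    long_df = df.melt(id_vars=id_cols + hue_cols, value_vars=y_cols,
                      var_name="variable", value_name="value")
    # Second melt: one row per (original row, y column, hue column).
    long_df = long_df.melt(id_vars=id_cols + ["variable", "value"],
                           value_vars=hue_cols,
                           var_name="hue_column", value_name="hue_value")
    # Label combining the Column H value and the column used for hue.
    long_df["panel"] = ("H=" + long_df["Column H"].astype(str)
                        + " | hue=" + long_df["hue_column"])
    return long_df


def plot_grid(long_df):
    """Draw the grid: rows are y columns, columns are (H value, hue column)."""
    import seaborn as sns  # imported here so the reshaping runs without seaborn
    return sns.relplot(data=long_df, x="Column A", y="value",
                       row="variable", col="panel", hue="hue_value",
                       kind="line", facet_kws={"sharey": "row"})


# Invented data with both Column H values present.
df = pd.DataFrame({
    "Column A": [3, 3, 4, 4], "Column H": [0, 1, 0, 1],
    "Column G": [0, 1, 1, 0], "Column F": [1, 1, 0, 0],
    "Column I": [0.5, 0.2, 0.1, 0.9], "Column J": [2.0, 1.5, 0.3, 0.7],
    "Column L": ["a", "b", "a", "b"], "Column M": ["c", "d", "d", "c"],
    "Column N": ["e", "e", "f", "f"],
})
long_df = to_long(df, ["Column G", "Column I", "Column J", "Column F"],
                  ["Column L", "Column M", "Column N"])
```

Calling `plot_grid(long_df)` should then give 4 rows by 6 columns, i.e. the 24 panels. One drawback of this layout is that the legend would list all six category values (a to f) even though each panel only uses two of them.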
Any advice, help or references would be extremely helpful. I apologise that I cannot share the data. Thanks.