I’m working with a large dataset that has many observations (food purchases) for a number of different households, and I want to reshape it so that all of a household’s purchases are contained in a single row for that household. To do this, I’ve been using the pivot_wider function. However, pivoting wider with the code below has produced a dataframe in which each cell is a list of many values. This wasn’t my original intention but, now that I’ve done it, it actually works better than spreading the many, many observations across columns until the dataframe is extremely wide.

The problem is that having a long list in each cell makes it hard to sum() the values in each of those lists. For each row/household, I want to calculate the sum of the values in that household’s hoolohan_100_adjusted list; I do not want to sum the lists of every household in the hoolohan_100_adjusted column together. Does that make sense? Here is the code I’ve tried thus far:
sample_3_wider.df <- pivot_wider(sample_3.df, names_from = NULL,
values_from = c(BARCODE, BARCODEDESC,
DEPARTMENT, PC, SIZE, UNIT,
BRAND, DATE, SHOPDESC, QTY,
PRICE, EPF, ONPROMO, WEIGHT,
# Problem: the value placed in every cell of the sum_GHGEs_100years column is the sum of the values in all of the households’ lists. I want the sum of each household’s own list of values in the $hoolohan_100_adjusted column.
I’ve tried creating iterative loops to calculate the sum for each row/household, but it keeps giving me just the sum of all the rows/households. Can anyone offer assistance with this matter? Please let me know if there is anything I can do to clarify this question for you.
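To make the goal concrete, here is a minimal sketch of the difference between the grand total and the per-household sum. It uses a small made-up dataframe rather than the real data: the household_id column and the numbers are hypothetical, and only the hoolohan_100_adjusted and sum_GHGEs_100years names come from the question. It assumes the list-column holds numeric vectors, and it uses base R's vapply() as one possible way to apply sum() to each list separately.

```r
# Hypothetical toy data: one row per household, with a list-column of values.
households <- data.frame(household_id = c("A", "B", "C"))
households$hoolohan_100_adjusted <- list(c(1.5, 2.5), c(10, 20, 30), c(4))

# Flattening the whole column and summing gives a single grand total,
# which is the unwanted behaviour described above.
grand_total <- sum(unlist(households$hoolohan_100_adjusted))
# grand_total is 68

# Applying sum() to each list element separately gives one total per household.
households$sum_GHGEs_100years <- vapply(
  households$hoolohan_100_adjusted,
  sum,
  numeric(1),    # each result must be a single number
  na.rm = TRUE   # assumption: missing values should be ignored
)
# households$sum_GHGEs_100years is 4, 60, 4
```

With the tidyverse loaded, purrr::map_dbl(hoolohan_100_adjusted, sum) inside mutate() should give the same per-row result; a plain sum() inside mutate() without rowwise() or a map function would operate on the whole column at once.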
Thank you in advance for any assistance you can offer!