Standard practice for higher dimensional arrays in Pandas

Let me provide a motivating example to contextualize my question:
Suppose we survey a group of 20 people with 10 questions, each answered in one of 5 categories that range from “Strongly Disagree” to “Strongly Agree.” Then we could structure a Pandas dataframe with 10 rows (one for each question) and 5 columns (with entries that count the number of such responses), e.g.,

|     | Strongly Disagree | Disagree | Neutral | ... |
|-----|-------------------|----------|---------|-----|
| Q1  | 5                 | 2        | 0       |     |
| Q2  | 1                 | 3        | 4       |     |
| ... |                   |          |         |     |

Now suppose we perform this survey with a bunch of other groups.

__Question:__ What is the standard, i.e., best-practice, approach to structuring such data?

MultiIndexing seems to provide one solution. Is this commonly used?

Is it more common to simply add the rows, a column that signifies which group it came from, and then leverage groupby techniques?

Would it be preferable to restructure the data so that the groups are row entries and the columns are both the questions and the responses?

All of these approaches feel rather coercive. Is there another approach that mimics the 3d-tensor structure more directly?

Links to resources are more than welcome. Thanks in advance!

All these approaches sound workable to me. You can also look at the Panel data structure of pandas, though I have never used it and have never seen anyone use it (it was deprecated and has since been removed from pandas).
For groupby, it doesn’t matter whether the information you want to group on is in the index or in a column, so multi-indexing and groupby fit together fine.
Is there a particular reason you want it to be 3d?
My personal choice for this case is multi-indexing over adding a column, because it makes sense for information that identifies observations to go in the index. It also lets you use the index-alignment features of pandas and set operations on the index.
That said, you can also go with whatever operations most easily achieve your goal. Some methods work on columns, others on the index. Just use reset_index or set_index to move information between columns and index to make use of pandas methods.
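To make the multi-index suggestion concrete, here is a minimal sketch. The group labels (`"A"`, `"B"`), the level names (`group`, `question`), and all the counts are made up for illustration; only the five response categories come from the question.

```python
import pandas as pd

categories = ["Strongly Disagree", "Disagree", "Neutral", "Agree", "Strongly Agree"]

# Hypothetical count tables for two groups (made-up numbers, two questions each).
group_a = pd.DataFrame([[5, 2, 0, 1, 2], [1, 3, 4, 1, 1]],
                       index=["Q1", "Q2"], columns=categories)
group_b = pd.DataFrame([[0, 1, 3, 4, 2], [2, 2, 2, 2, 2]],
                       index=["Q1", "Q2"], columns=categories)

# Stack the per-group tables; the dict keys become the outer index level.
stacked = pd.concat({"A": group_a, "B": group_b}, names=["group", "question"])

# Pull out one group's table by selecting on the outer level.
table_a = stacked.loc["A"]

# groupby works on an index level ...
per_question = stacked.groupby(level="question").sum()

# ... and equally on a column once the index has been moved into columns.
flat = stacked.reset_index()
per_question_from_column = flat.groupby("question")[categories].sum()

# Index alignment: arithmetic matches rows by label, not by position,
# so reversing the row order of one operand does not change the result.
total = group_a + group_b.iloc[::-1]
```

Whether the group lives in the index or in a column is mostly a matter of which operations you need next; `set_index(["group", "question"])` on `flat` takes you back to the multi-indexed form.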

It could be as simple as this:

| Respondent_ID | Question | Answer |
|---------------|----------|--------|
| ID1           |    Q1    |     -5 |
| ID1           |    Q2    |      3 |
| ID1           |    Q3    |      1 |
| ID2           |    Q1    |      3 |
| ...           |          |        |

Here the answer value is coded on a scale from -5 for “Strongly Disagree” to 5 for “Strongly Agree.”

Or like this:

|     | Q1 | Q2 | Q3 | ... |
|-----|----|----|----|-----|
| ID1 | -5 |  2 |  0 |     |
| ID2 |  1 |  3 |  4 |     |
| ... |    |    |    |     |

Then you can process this data as you like, grouping and summing it.
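As a rough sketch of how the two layouts above relate, the long table can be pivoted into the wide one, and the long form can be grouped or tabulated directly. The column names follow the tables above; the answer values are made up for illustration.

```python
import pandas as pd

# Hypothetical long-format responses: one row per (respondent, question).
long = pd.DataFrame({
    "Respondent_ID": ["ID1", "ID1", "ID1", "ID2", "ID2", "ID2"],
    "Question": ["Q1", "Q2", "Q3", "Q1", "Q2", "Q3"],
    "Answer": [-5, 3, 1, 3, 3, 4],
})

# Long -> wide: one row per respondent, one column per question.
wide = long.pivot(index="Respondent_ID", columns="Question", values="Answer")

# Grouping and summing on the long form.
sum_by_question = long.groupby("Question")["Answer"].sum()

# Count table: questions as rows, answer codes as columns,
# each cell holding the number of respondents who gave that answer.
counts = pd.crosstab(long["Question"], long["Answer"])
```

If several groups are surveyed, one option is to add a group column to the long table and include it in the `groupby` or `crosstab` keys.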

Thank you for the feedback.

Multi-index seems straightforward. As a self-learner it can be a bit difficult to discern between what is possible and what is actually used by others. So thank you for clarifying.

> Is there a particular reason you want it to be 3d?

Mostly curiosity. Because the dataframe for each individual group would have the exact same row and column headers, it seemed tempting to find a solution that reflected this repetition. That said, I’m content to have found a solution that works 🙂