The Data Cleaning Walkthrough lesson called 6. Exploring the Remaining Data asks us to do the following:
Loop through each key in data. For each key:
- Display the first five rows of the dataframe associated with the key.
This is a simple enough task, but the output is pretty hard to read because you have to print the first five rows of six dataframes that are all pretty wide. The columns don’t align well in printed format, which makes the output hard to read and draw conclusions from. Even when there’s only one row like this, the misalignment can be confusing:
dbn bn schoolname d75 studentssurveyed highschool schooltype rr_s
"01M015" "M015" "P.S. 015 Roberto Clemente" 0 "No" 0 "Elementary School" 88
We can’t leverage the nice clean display that Jupyter Notebook has when trying to display multiple dataframes unless we do them one by one, which can be a little tedious. We’d have to loop through and print.
Is there a better way to do this than printing?
I have tried a few things for this, and none of them work.
This seems to be more of a limitation of their Terminal, as there is no horizontal scrolling enabled for it. The text just gets wrapped.
The most you can do is -
- Print out one row, as you did, and adjust the width of the terminal. You can use the divider for that.
Source: Trouble with command line interface
The above might not help much, especially if there are too many columns. The other option is to work in your own Jupyter Notebook on your local system instead, so that you can print things out how you want to. Or you can open up a Notebook on the Dataquest Platform and work in it, but you will have to upload the data there as well.
You can also suggest this as feedback to them to improve the terminal, using the Contact Us button in the top-right of this page.
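If you are stuck with a text-only terminal that wraps long lines, one thing that might help is printing a row transposed, so each column gets its own line. This is only a rough sketch: the one-row dataframe below is a made-up stand-in built from the sample row above, and I have not checked how the Dataquest terminal renders it.

```python
import pandas as pd

# Hypothetical one-row dataframe built from the sample output in the question
df = pd.DataFrame({
    "dbn": ["01M015"],
    "bn": ["M015"],
    "schoolname": ["P.S. 015 Roberto Clemente"],
    "d75": [0],
    "studentssurveyed": ["No"],
    "highschool": [0],
    "schooltype": ["Elementary School"],
    "rr_s": [88],
})

# Transposing puts each column name and its value on a separate line,
# so nothing has to wrap horizontally
first_row = df.head(1).T
print(first_row)

# Alternatively, temporarily change pandas' print options for one block
with pd.option_context("display.max_columns", None, "display.width", 80):
    print(df.head())
```

The transposed view works best for one or two rows. With five rows per dataframe it gets wide again, so it is more of a workaround than a fix.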
Because their terminal isn’t the best, I work in my own Jupyter Notebook to test the code and then just copy and paste it into their terminal. Even then it’s still kind of a mess:
I don’t know if there’s any way around it. Thanks anyway for the feedback.
For Jupyter Notebook, you can look at the suggestions here - https://stackoverflow.com/questions/51288869/print-visually-pleasing-dataframes-in-for-loop-in-jupyter-notebook-pandas
I use display() for this as well, and it’s quite helpful.
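For anyone finding this later, here is roughly what the display() approach can look like in a loop. It is a minimal sketch, not the lesson's solution: the data dictionary below is a small made-up stand-in for the lesson's data, and show_heads is a hypothetical helper name.

```python
import pandas as pd

try:
    # In Jupyter, display() renders each dataframe as a formatted table
    from IPython.display import display
except ImportError:
    # Outside Jupyter/IPython, fall back to plain printing
    display = print

# Hypothetical stand-in for the lesson's `data` dictionary of dataframes
data = {
    "survey": pd.DataFrame({"dbn": [f"01M{i:03d}" for i in range(7)],
                            "rr_s": range(80, 87)}),
    "class_size": pd.DataFrame({"dbn": ["01M015", "01M019"],
                                "size": [20, 25]}),
}

def show_heads(frames, n=5):
    """Display the first n rows of every dataframe in the dictionary."""
    heads = {}
    for key, df in frames.items():
        print(f"\n{key}")          # label each table with its key
        heads[key] = df.head(n)
        display(heads[key])        # rendered as a table inside a notebook
    return heads

heads = show_heads(data)
```

Inside a notebook each call to display() produces the same formatted table you get from leaving a dataframe as the last line of a cell, so you can show all of them from a single cell.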
That worked great, thanks!