Need some advice here

Hello people,

My name is Alex, and lately I’ve been checking the forum to see how other students study the content in Dataquest, but unfortunately I couldn’t find any questions about that. So here is mine. I’d like to ask your opinion of the way I study and whether I could improve anything.

Right now I am starting Step 2, Mission 4 (Data Cleaning and Analysis). So far, and especially since the content moved on to pandas and NumPy, I’ve been doing all my studying in Jupyter Notebook so that I work with a realistic interface. This has let me understand how to load files, use them, and see for myself the results of the code I write. During the exercises, when I get stuck I check the hints, and if I still can’t solve the problem, I check the answer.

All of this has let me build up some notes that I go back to when I get stuck trying to remember specific things like… “df.iloc”, “df [ “” ] or df [ [ “” ] ]?” etc. These are questions about how the code is written, the kind of thing we sometimes forget.

In coding, is it normal to forget these kinds of things? Or do you try to memorize all the libraries, classes, attributes, parameters, etc.?

Any advice on this would help me check whether I need to change something or can keep going this way.

Thank you guys :slight_smile:

2 Likes

Hi @a.marimon.fernandez, welcome to the community! We’re glad you’re here!

Everybody learns at their own pace and in their own way. For me, I had trouble remembering certain things as well at the beginning (including, yes, things like df.iloc vs df.loc!). At one point I went back through the lessons and took some handwritten notes, where instead of focusing on completing the mission, I focused on making sure I understood the logic of why we coded a certain way. I’m planning to go back again through the missions and do everything in Jupyter notebook, especially the areas that still give me trouble. For me, it’s more important to understand the logic behind what I’m doing because I’m more likely to remember how to do it in the future (or remember where to find what I’m looking for!) It also helps me to think about how I could explain what I’m doing to someone else (which is how I got started participating in the forums here!).

I think it’s pretty normal to forget code until you’re used to using it regularly. I find myself going over the documentation for pandas and matplotlib several times, and searching through Stack Overflow to pick up some new tricks. I played around on codewars.com just for fun when I started getting frustrated with Python or SQL, and it helped me push past it.

My goal for next month is to start focusing on doing some projects and working with other datasets to practice data cleaning and creating visualizations!

Good luck on your journey! :smile:

5 Likes

I found this course to be very helpful in learning coding.

Learning How to Learn - Barbara Oakley

1 Like

Create good notes.

Example:
NumPy characteristics

  • An array can only contain one datatype
  • When a cell displays nan after reading a file, it usually means the original value was non-numeric data (for example, header text)

import numpy as np – Importing Numpy

ARRAY = np.array([LIST]) – Convert a list into a n-dimensional array

Process of importing numpy file

ARRAY.shape – Output the number of rows and columns in an array, as (rows, columns)

  • Use ARRAY.shape[0] to display number of rows
  • Use ARRAY.shape[1] to display number of columns
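
A quick way to check which index of shape is which is to build a small array whose row and column counts differ. This is only a sketch; the array below is made up for illustration:

```python
import numpy as np

# Hypothetical array with 2 rows and 3 columns
arr = np.array([[1, 2, 3],
                [4, 5, 6]])

# shape is a tuple ordered as (rows, columns) for a 2D array
n_rows, n_cols = arr.shape

print(arr.shape)     # (2, 3)
print(arr.shape[0])  # 2 -> number of rows
print(arr.shape[1])  # 3 -> number of columns
```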

ARRAY[ROWINDEX, COLUMNINDEX] – Select an object based on positional indexing

ARRAY[ROWINDEX] – Select an object based on row indexing

ARRAY[:, COLUMNINDEX] – Select an object based on column indexing

ARRAY[[ROW1,ROW2],[COLUMN1,COLUMN2]] – Select the objects at the paired positions (ROW1, COLUMN1) and (ROW2, COLUMN2); to select several whole columns, use ARRAY[:, [COLUMN1,COLUMN2]]

Example : columns_1_4_7 = taxi[:,[1, 4, 7]]
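
The indexing patterns above can be tried on a small made-up array (not the taxi data). Note that passing two lists pairs them up element by element; np.ix_ is one way to get every combination of the listed rows and columns instead:

```python
import numpy as np

# Hypothetical 3 x 4 array holding the values 0..11
arr = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

single = arr[1, 2]           # one value: row 1, column 2
row = arr[1]                 # the whole of row 1
col = arr[:, 2]              # the whole of column 2
cols = arr[:, [0, 3]]        # columns 0 and 3, all rows
pairs = arr[[0, 2], [1, 3]]  # values at (0, 1) and (2, 3) only
block = arr[np.ix_([0, 2], [1, 3])]  # rows 0 and 2 crossed with columns 1 and 3

print(single)  # 6
print(pairs)   # [ 1 11]
print(block)   # [[ 1  3]
               #  [ 9 11]]
```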

VECTORCALCULATION = ARRAY[INDEX] + ARRAY[INDEX] – Using maths operations on multiple arrays; the operation is applied element by element

The arrays used in the calculation should have the same shape (same number of rows and columns), or shapes that NumPy can broadcast together
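
As a minimal sketch of element-by-element arithmetic, here are two made-up arrays of the same shape (the names fares and tips are hypothetical), followed by what happens when the shapes do not match:

```python
import numpy as np

# Hypothetical data: one fare and one tip per trip
fares = np.array([10.0, 20.0, 30.0])
tips = np.array([1.0, 2.0, 3.0])

# Same shape, so the addition happens position by position
totals = fares + tips
print(totals)  # [11. 22. 33.]

# Shapes (3,) and (2,) cannot be combined, so NumPy raises ValueError
shape_error = False
try:
    fares + np.array([1.0, 2.0])
except ValueError:
    shape_error = True
print(shape_error)  # True
```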

ARRAY.METHOD() – To use a function on a n-dimensional array

ARRAY.max(axis = 1) – To output the max value of every row

Use axis = 1 for one result per row and axis = 0 for one result per column
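
The axis argument is easy to mix up, so a small made-up array helps when checking it:

```python
import numpy as np

# Hypothetical 2 x 3 array
arr = np.array([[1, 9, 3],
                [7, 2, 8]])

row_max = arr.max(axis=1)  # one value per row
col_max = arr.max(axis=0)  # one value per column
overall = arr.max()        # no axis: max of the whole array

print(row_max)  # [9 8]
print(col_max)  # [7 9 8]
print(overall)  # 9
```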

ARRAY = np.genfromtxt("FILE", delimiter = ",") – Read a delimited file into a numpy array directly

Example : taxi = np.genfromtxt('nyc_taxis.csv', delimiter = ',')
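
To try this without the real file, the sketch below reads a tiny made-up CSV from an in-memory string (the column names and values are invented, not taken from nyc_taxis.csv). It also shows where nan comes from: the header text cannot be converted to a number.

```python
import io
import numpy as np

# Hypothetical CSV content standing in for a real file
csv_text = "pickup_year,trip_distance\n2016,21.0\n2016,16.29\n"

# The header row is text, so it is read as nan
taxi = np.genfromtxt(io.StringIO(csv_text), delimiter=",")
print(taxi.shape)  # (3, 2)
print(taxi[0])     # [nan nan]

# Option 1: slice the header row off afterwards
taxi_no_header = taxi[1:]

# Option 2: ask genfromtxt to skip it while reading
taxi_skipped = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)
print(taxi_skipped.shape)  # (2, 2)
```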

ARRAY.dtype – Shows the datatype used in the array
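
Because an array holds a single datatype, NumPy converts mixed input to one common type. A short sketch with made-up values (the exact integer type printed can depend on the platform):

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = np.array([1, 2.5, 3])    # ints and a float -> all floats
with_text = np.array([1, "a"])   # a number and a string -> all strings

print(ints.dtype)            # an integer type, e.g. int64
print(mixed.dtype)           # float64
print(with_text.dtype.kind)  # U (unicode string)
```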

Using maths symbols on an array – The function of the math symbol will apply on all objects in an array

  • Example : np.array([2,4,6,8]) + 10 >>> [12 14 16 18]
  • Example : np.array([2,4,6,8]) < 5 >>> [True True False False]

Boolean Indexing – Using a Boolean array to select only the objects that meet a specified condition

Modifying objects in an array (Shortcut Method) – ARRAY[CONDITION] = VALUE
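
Both ideas can be sketched on small made-up arrays: first build the Boolean array, then use it either to select values or to assign to them.

```python
import numpy as np

arr = np.array([2, 4, 6, 8])

# The comparison gives one True/False per element
mask = arr < 5
selected = arr[mask]  # keeps only the elements where mask is True
print(selected)       # [2 4]

# Shortcut assignment: ARRAY[CONDITION] = VALUE
arr[arr > 5] = 0
print(arr)            # [2 4 0 0]

# In 2D, a condition on one column can pick the rows to change
data = np.array([[1, 10],
                 [2, 60],
                 [3, 70]])
data[data[:, 1] > 50, 1] = 50  # cap column 1 at 50
print(data[:, 1])     # [10 50 50]
```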

2 Likes

Hi @a.marimon.fernandez,

I am using Anki to get used to “n” number of data science concepts. Check this link: Anki. Good luck.