" note that the minimum and maximum values only informed our choice — they don’t necessarily have to be the starting and ending point of the intervals "
I understand min & max commands but I don’t understand this note.
This note appears in the Dictionaries and Frequency Tables mission, here: https://app.dataquest.io/m/314/dictionaries-and-frequency-tables/13/filtering-for-the-intervals
It means that the endpoints of our chosen ranges (0, 10,000,000, 50,000,000, 100,000,000, etc.) don't necessarily appear in our data_sizes list. They are just nice round values that we chose to delimit the categories, not necessarily actual values from the list.
But they are based on the list values that we have, and we are estimating to make things easier. Am I right?
Yes, you are right, they are based on the list values, but only in the following sense:
We computed the minimum and maximum values of the whole list (589,824 and 4,025,969,664). Then we decided to categorize the values of the list, i.e. to divide them into several intervals. Judging by the min and max, we thought that appropriate ranges could be 0-10 million, 10-50 million, 50-100 million, 100-500 million, and 500+ million. Our min value falls (together with some other values) into the first range, and our max of about 4 billion falls (together with some other values) into the last one. Here we assume that there will be few values greater than 500 million.
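To make this concrete, here is a minimal sketch of counting values per interval with a dictionary. The numbers in data_sizes are made-up sample values (only the min and max match the ones quoted above), the dictionary keys are my own labels, and treating the upper endpoint of each range as inclusive is an assumption, not necessarily what the mission does.

```python
# Hypothetical sample values; the real data_sizes list comes from the dataset.
data_sizes = [589_824, 7_500_000, 42_000_000, 99_999_999,
              100_000_000, 320_000_000, 4_025_969_664]

# min and max only tell us roughly where the intervals should start and end.
print(min(data_sizes), max(data_sizes))

# The interval endpoints are round numbers we picked, not values from the list.
size_freq = {'0 - 10 mln': 0, '10 - 50 mln': 0, '50 - 100 mln': 0,
             '100 - 500 mln': 0, '500 mln +': 0}

for size in data_sizes:
    # Assumption: each upper endpoint belongs to the lower interval.
    if size <= 10_000_000:
        size_freq['0 - 10 mln'] += 1
    elif size <= 50_000_000:
        size_freq['10 - 50 mln'] += 1
    elif size <= 100_000_000:
        size_freq['50 - 100 mln'] += 1
    elif size <= 500_000_000:
        size_freq['100 - 500 mln'] += 1
    else:
        size_freq['500 mln +'] += 1

print(size_freq)
```

Notice that none of 0, 10,000,000, 50,000,000 or 500,000,000 is in the sample list, yet every value still ends up in exactly one interval.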
So our ranges above are just a rough estimate. We categorize the values of the list according to these ranges, and afterwards, if we are not happy with the results (for example, too many values landed in one interval and too few in all the others), we can reconsider the ranges and refine them.
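Here is a rough sketch of what that refinement could look like. The helper count_by_intervals, the sample values, and both sets of boundaries are hypothetical and only for illustration; bisect_left from the standard library finds which interval a value belongs to, with each upper bound treated as inclusive.

```python
from bisect import bisect_left

def count_by_intervals(values, upper_bounds):
    """Return one count per interval; upper_bounds must be sorted ascending.

    The last count covers everything above the final bound.
    """
    counts = [0] * (len(upper_bounds) + 1)
    for value in values:
        # Index of the first bound >= value, so a value equal to a bound
        # stays in the lower interval.
        counts[bisect_left(upper_bounds, value)] += 1
    return counts

# Hypothetical sample values, skewed toward small sizes.
data_sizes = [589_824, 2_000_000, 7_500_000, 9_000_000,
              42_000_000, 320_000_000, 4_025_969_664]

# First attempt: the rough round-number boundaries.
rough_bounds = [10_000_000, 50_000_000, 100_000_000, 500_000_000]
rough_counts = count_by_intervals(data_sizes, rough_bounds)
print(rough_counts)

# Most values fell into the first interval, so try splitting it further.
refined_bounds = [1_000_000, 5_000_000, 10_000_000, 100_000_000]
refined_counts = count_by_intervals(data_sizes, refined_bounds)
print(refined_counts)
```

With the rough boundaries, four of the seven sample values land in the first interval. The refined boundaries spread them more evenly, which is the kind of adjustment described above.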