How do we generate our own test data in this manner?

The project states we should generate test data like so:

  1. Generate many entries (about 50,000 should be enough).
  2. Generate many query intervals. It’s better to generate them in a way that increases the number of query results from 1 to some given value. This way, we can see the impact of the number of results on the performance gained.

How do we go about generating such data without using the provided CSV files?

Also, regarding item 2, why is this way of testing more appropriate than generating query intervals with random endpoints?
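
For what it's worth, here is a rough sketch of one possible reading of the two steps. It is only a guess at what is intended, and it rests on assumptions that are not in the project text: entries are distinct integer keys, a query interval `[lo, hi]` returns every key with `lo <= key <= hi`, and the names `generate_entries`, `generate_queries`, `count_results`, and `key_range` are made up for illustration.

```python
import random
from bisect import bisect_left, bisect_right


def generate_entries(n=50_000, key_range=10_000_000, seed=0):
    """Return n distinct integer keys in sorted order (assumed entry format)."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so all keys are distinct.
    return sorted(rng.sample(range(key_range), n))


def generate_queries(entries, max_results, seed=1):
    """Return one (lo, hi) interval per result count k = 1..max_results."""
    rng = random.Random(seed)
    queries = []
    for k in range(1, max_results + 1):
        # Pick k consecutive keys in the sorted list; the interval that spans
        # them contains exactly k entries because the keys are distinct.
        start = rng.randrange(len(entries) - k + 1)
        queries.append((entries[start], entries[start + k - 1]))
    return queries


def count_results(entries, lo, hi):
    """Count the entries in [lo, hi] by binary search, as a sanity check."""
    return bisect_right(entries, hi) - bisect_left(entries, lo)


entries = generate_entries()
queries = generate_queries(entries, max_results=1000)
```

With data built this way, the k-th query returns exactly k results, so query time can be plotted against result count. With random endpoints, the result count of each query is not controlled, which would presumably make that relationship harder to see.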