GP: Analyzing CIA Factbook Data using SQL - Where does `factbook.db` come from?

Screen Link: Guided Project: Analyzing CIA Factbook Data Using SQL

In order to do this guided project locally, we are told to download factbook.db from this link: https://dsserver-prod-resources-1.s3.amazonaws.com/257/factbook.db

First question: Who is the owner/creator of this file …Dataquest? …the CIA? …Amazon?

I have looked around the CIA’s World Factbook website to learn more about what data is included in this database and what the columns represent, but the site does not appear to be directly related to the above database file.

Second question: Why can’t I find *.db files on the CIA website to download myself?

Any clarification on this is greatly appreciated!

That’s specified at the top -

In this project, we’ll work with data from the CIA World Factbook,

The data, at least, is owned by the CIA unless specified otherwise. You can read more here: The World Factbook (Wikipedia)

It is possible that Dataquest created a SQL database from whatever is available on that website, but no clarification has been provided on that.

They seem to have updated the CIA website, and it’s difficult to find out where the files are. Unfortunately, there doesn’t seem to be any way to get a SQL database from the website itself.

But you can go to https://www.cia.gov/the-world-factbook → About → Explore Our Archives → The World Factbook Archives → https://www.cia.gov/the-world-factbook/about/archives/

That link has all the archives and you can download the ones you want. They are zipped folders containing all the data, but not as a SQL database. There are some raw text files in the fields folder which could be used to create a database, but I don’t know what the process for that would be right now. (In my opinion it could be a pretty good exercise, though, and a valuable thing to mention in anyone’s portfolio.)
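If anyone wants to try that exercise, here is a rough sketch of the last step only: loading values into a SQLite file once they have been parsed out of the archive. I haven't worked out the parsing itself, so this assumes you already have (country, field, value) tuples. The table name `field_values`, its columns, and the sample rows are made up for illustration and are not the schema of Dataquest's `factbook.db`.

```python
import sqlite3


def build_database(conn, records):
    """Load (country, field, value) tuples into a hypothetical field_values table.

    The table layout is an assumption for this sketch, not the layout
    used in the factbook.db file shared by Dataquest.
    """
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS field_values (
            country TEXT NOT NULL,
            field   TEXT NOT NULL,
            value   TEXT,
            PRIMARY KEY (country, field)
        )
        """
    )
    # "with conn" commits on success and rolls back if an insert fails.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO field_values (country, field, value) "
            "VALUES (?, ?, ?)",
            records,
        )


if __name__ == "__main__":
    # Placeholder rows standing in for values parsed from the archive files.
    sample = [
        ("Exampleland", "population", "1000"),
        ("Exampleland", "area", "50"),
        ("Samplestan", "population", "2000"),
    ]
    conn = sqlite3.connect(":memory:")  # swap in a file path to keep the result
    build_database(conn, sample)
    for row in conn.execute(
        "SELECT country, value FROM field_values WHERE field = 'population' "
        "ORDER BY country"
    ):
        print(row)
    conn.close()
```

Storing every value as text keeps the sketch simple. A real version would probably convert numeric fields and use one column per field so the data is easier to query.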

If you want to work locally you will have to work with the db file that DQ shares for now, by the looks of it.
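Since the CIA site doesn't document that file, one way to see what tables and columns it contains is to ask SQLite directly. This is a minimal sketch using Python's built-in `sqlite3` module. It assumes the downloaded file is saved as `factbook.db` in the current folder, and `describe_schema` is just a helper name I made up.

```python
import os
import sqlite3


def describe_schema(conn):
    """Return {table_name: [(column_name, declared_type), ...]} for a SQLite connection."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    schema = {}
    for table in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [(col[1], col[2]) for col in columns]
    return schema


if __name__ == "__main__":
    path = "factbook.db"  # assumed location of the downloaded file
    # Check first: sqlite3.connect would silently create an empty file otherwise.
    if os.path.exists(path):
        conn = sqlite3.connect(path)
        for table, columns in describe_schema(conn).items():
            print(table)
            for name, declared_type in columns:
                print(f"  {name} ({declared_type})")
        conn.close()
    else:
        print(f"{path} not found; download it first.")
```

This only tells you the column names and declared types, not what each column means, but it is a starting point for matching them against the field descriptions on the Factbook site.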

That’s the “clarification” I was looking for in question #1: no clarification has been provided :stuck_out_tongue_winking_eye: After all the reading I did on the CIA website as well as Wikipedia, I knew the data belonged to the CIA, but I was more interested in knowing where the SQL database file itself came from and how it came to be. Since many of the links in the mission are outdated, I wasn’t sure if it had been moved on the CIA website or if Dataquest simply created the file themselves. I guess it shall remain a mystery!

I was hoping to play around with more recent and complete data locally but, as you said, that’s an exercise in and of itself. Perhaps I will come back to this endeavor after learning some web scraping skills? Even when files are downloaded directly from the archives, the data is in HTML format. :unamused:

As always, your prompt and thorough responses are greatly appreciated!
