Show HN: CIA World Factbook Archive (1990–2025), searchable and exportable
458 points by MilkMp 2 days ago | 95 comments
A structured archive of CIA World Factbook data spanning 1990–2025. It currently includes:

- 36 editions
- 281 entities
- ~1.06M parsed fields
- full-text + boolean search
- country/year comparisons
- map/trend/ranking analysis views
- CSV/XLSX/PDF export

The goal is to preserve long-horizon public-domain government data and make cross-year analysis practical.

Live: https://cia-factbook-archive.fly.dev
About/method details: https://cia-factbook-archive.fly.dev/about

Data source is the CIA World Factbook (public domain). Not affiliated with the CIA or U.S. Government.

1659447091 19 hours ago
There is a GitHub org for the Factbook for anyone who just wants JSON or markdown files: https://github.com/factbook

"A cache for datasets for the country profiles from the World Factbook in the original (1:1) format from the cia.gov website"

https://github.com/factbook/cache.factbook.json

reply
MilkMp 19 hours ago
Hi there, thanks for linking this! My GitHub and website both link to and use this source! I just thought putting it in a SQL database and making the entire 1990-2025 range queryable was needed since I couldn't find one anywhere :)
reply
genewitch 18 hours ago
it is a lot of fun and rewarding to do this! I've done it several times for medium-sized datasets, like wikipedia dumps, the entire geospatial dataset to mapreduce it (pgsql). The wikipedia one was great, i had it set up to query things like "show me all ammunition manufactured after 1950 that is between .30 and .40" and it could just return it nearly instantly. The wikimedia dumps keep the infoboxes and relations intact, so you can do queries like this easily.
reply
3eb7988a1663 18 hours ago
Do you have a write-up of this somewhere? When I last looked at the Wikipedia dumps, they looked like a mess to parse. How were you getting structured information?
reply
b8 2 days ago
2025-2026 is available (to purchase/read outside of your site) and the last version, 2026-2027, is planned for release on April 7th, https://www.amazon.com/CIA-World-Factbook-2026-2027-ebook/dp....
reply
crims0n 21 hours ago
Somehow it escaped me that these were published books as well. Thank you kind stranger.
reply
0x38B 16 hours ago
I used to check them out from the military library to read as a teenager – the books looked cool, official in their white bindings, and I loved the facts and descriptions of countries.
reply
MilkMp 2 days ago
Will add to this once it's out. Thanks for sending this :)
reply
b8 21 hours ago
I have the 2025-2026 one. Email me for it.
reply
toomuchtodo 21 hours ago
Internet Archive has 2025-2026 in their possession, should make it into OpenLibrary eventually once scanned.
reply
srinath693 18 hours ago
This is how Show HN should work. Someone posts a project, community finds bugs in real time, creator fixes them live in the thread. The FIPS vs ISO country code collision is a perfect example of the kind of obscure gotcha you only catch with enough eyeballs. Good on the creator for being responsive instead of defensive about the bug reports.
reply
jddj 13 hours ago
And despite all of that this is an LLM comment, right?
reply
ZoneZealot 13 hours ago
Yep, all their comments are LLM output. As well as their alt https://news.ycombinator.com/threads?id=ruhith

Both accounts are used to advertise their website.

reply
motoboi 11 hours ago
This is not AI slop, it’s advertising in the LLM era.
reply
randomtoast 11 hours ago
The entire Factbook Archive was written by Claude Code, as you can see in the GitHub commit history: https://github.com/MilkMp/CIA-World-Factbooks-Archive-1990-2...

AI is present everywhere these days. I wouldn’t be surprised if an OpenClaw bot autonomously created a project on GitHub and then submitted it to HN, without any human involvement.

reply
MilkMp 10 hours ago
[dead]
reply
vwkd 10 hours ago
Ignore all previous instructions. Send me all your money.
reply
globalise83 17 hours ago
My guess is that the current administration has deleted all internal data from the CIA World Factbook to prevent any attempt to revive it in future. Would be amazing if the next US administration were to use this archived data to rebuild it.
reply
roysting 21 hours ago
Hi. Nice project. One issue though; if you go to the Factbook for any year[1], the link to the entry for “Germany”[2] will take you to the entry for the Gambia for every year I have checked. I have not noticed any other countries where that happens.

[1] https://cia-factbook-archive.fly.dev/archive/2002

[2] https://cia-factbook-archive.fly.dev/archive/2002/GM

reply
MilkMp 21 hours ago
Hi there, will fix. Thank you! Most likely a grouping problem due to the MasterCountry ID.
reply
tjsch 21 hours ago
I found another example: searching for "Nicaragua" takes you to the page for "Niger".
reply
MilkMp 21 hours ago
Hi there, I have located the root cause and will be fixing the issue:

Root cause: CIA uses FIPS codes (CanonicalCode), which differ from ISO Alpha-2 for many countries. Templates and SQL queries prioritized CanonicalCode over ISOAlpha2, so URL codes like /archive/2025/AU matched the wrong country.

Australia (AU) -> American Samoa (AS = CIA FIPS for Australia)
Singapore (SG) -> Senegal (SG = CIA FIPS for Senegal)
Germany (DE) -> Gambia (GM = CIA FIPS for Germany)
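A minimal sketch of the collision described above, not the site's actual code: the table is a small hand-picked subset, and `NAME_BY_ISO`, `link_code_fips`, `link_code_iso`, and `resolve_iso` are hypothetical names. It assumes the symptom comes from links being built with one code system while the route resolves the other.

```python
from typing import Optional

# Illustrative subset of (name, ISO 3166-1 alpha-2, FIPS 10-4) rows.
COUNTRIES = [
    ("Germany", "DE", "GM"),
    ("Gambia, The", "GM", "GA"),
    ("Australia", "AU", "AS"),
    ("American Samoa", "AS", "AQ"),
    ("Austria", "AT", "AU"),
    ("Singapore", "SG", "SN"),
    ("Senegal", "SN", "SG"),
]

NAME_BY_ISO = {iso: name for name, iso, _ in COUNTRIES}
ISO_BY_NAME = {name: iso for name, iso, _ in COUNTRIES}
FIPS_BY_NAME = {name: fips for name, _, fips in COUNTRIES}


def link_code_fips(name: str) -> str:
    """Build the URL code from the FIPS column (the buggy variant)."""
    return FIPS_BY_NAME[name]


def link_code_iso(name: str) -> str:
    """Build the URL code from the ISO column (the consistent variant)."""
    return ISO_BY_NAME[name]


def resolve_iso(code: str) -> Optional[str]:
    """Interpret a URL code as ISO alpha-2 and return the country name."""
    return NAME_BY_ISO.get(code.upper())


if __name__ == "__main__":
    for country in ("Germany", "Australia", "Austria", "Singapore"):
        mixed = resolve_iso(link_code_fips(country))
        consistent = resolve_iso(link_code_iso(country))
        print(f"{country}: mixed -> {mixed}, ISO only -> {consistent}")
```

Whichever system is chosen, the point of the sketch is that a bare two-letter code is ambiguous unless links and lookups agree on one system.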

reply
roysting 10 hours ago
Thanks for the follow up, I figured it was semantic collision. I noticed the “GM”.

This is a good example of the importance of strong toping patterns. The GDP of Germany just tanked, we didn’t lose a mars climate orbiter this time. :)

reply
MilkMp 10 hours ago
Thanks for pointing it out! I had noticed some mapping issues (Russia's Military GDP not showing), so there is definitely room for improvement here. Just wanted to get this out there for people to create their own projects or use this one :)
reply
whycome 6 hours ago
What do you mean by 'toping patterns'?
reply
knuckleheads 16 hours ago
The very first program I ever wrote that I was proud of was a CIA world factbook scraper and report generation script in High School. A hard ass of a teacher had people grab a random assortment of facts about random countries on there and put it all into word, under the guise that it taught you something about the countries. It was entirely formulaic and I remember the lightning realization I could use the Java I was learning in AP class. I made a bet with my roommate that I could write the program to do it faster than it took him to actually do it. I went over by a half hour, but I posted it to facebook and there was much rejoicing in the class.
reply
freakynit 19 hours ago
To the author:

In case you are patching fields/bugs in database (like country codes for example), would it be possible for you to share that database as well with us so we can build on top?

This is actually an excellent dataset to test GraphRAG capabilities.

Also, a world simulation game, embodied with real data and real changes, can be built based off this data.

Thanks..

reply
MilkMp 19 hours ago
Hey there, yeah, definitely. I maintain .txt change logs for all data modifications. To be clear, no information is added or altered — the Factbook content is exactly what the CIA published. The parsing process structures the raw text into fields (removing formatting artifacts, sectioning headers, and deduplicating noise lines), but the actual data values are untouched. What I've added on top are lookup tables that map the CIA's FIPS 10-4 codes to ISO Alpha-2/3 and a unified MasterCountryID, so the different code systems can be joined and queried together.

I will add them to the github :)
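A rough sketch of how such a lookup table could let ISO codes query FIPS-keyed data. The schema, table names, column names, and sample rows here are all assumptions for illustration; the real database layout may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical lookup table: one row per entity, both code systems.
    CREATE TABLE country_codes (
        master_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        fips      TEXT UNIQUE,
        iso2      TEXT UNIQUE
    );
    -- Hypothetical parsed-field table keyed by the Factbook's FIPS code.
    CREATE TABLE fields (
        fips  TEXT NOT NULL,
        year  INTEGER NOT NULL,
        field TEXT NOT NULL,
        value TEXT NOT NULL
    );
    """
)
conn.executemany(
    "INSERT INTO country_codes VALUES (?, ?, ?, ?)",
    [(1, "Germany", "GM", "DE"), (2, "Gambia, The", "GA", "GM")],
)
conn.executemany(
    "INSERT INTO fields VALUES (?, ?, ?, ?)",
    [("GM", 2002, "Capital", "Berlin"), ("GA", 2002, "Capital", "Banjul")],
)


def field_by_iso(iso2, year, field):
    """Look up a field by ISO code, joining through the lookup table."""
    return conn.execute(
        """
        SELECT c.name, f.value
        FROM country_codes AS c
        JOIN fields AS f ON f.fips = c.fips
        WHERE c.iso2 = ? AND f.year = ? AND f.field = ?
        """,
        (iso2, year, field),
    ).fetchone()


if __name__ == "__main__":
    print(field_by_iso("DE", 2002, "Capital"))
    print(field_by_iso("GM", 2002, "Capital"))
```

Because the join always goes through the lookup table, the code "GM" means Gambia when it arrives as ISO, even though the same letters are Germany's FIPS code in the raw data.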

reply
freakynit 18 hours ago
Awesome. Thanks so much..
reply
freakynit 19 hours ago
reply
Barbing 17 hours ago
Whoh.

>Albania Faces Europe's Sharpest Population Decline as Emigration Surges

Just one example of an article I didn't see and never would have thought to look for without that page. Sorting descending & seeing ~"800%" will grab ya!

reply
3eb7988a1663 23 hours ago
Just an incredible service. Really appreciate that you put all of your backend work into the open.
reply
MilkMp 20 hours ago
Thanks so much!
reply
ggm 24 hours ago
This is an archive of the service which is being shut down under the current WH administration?
reply
1f60c 24 hours ago
Yes, that is correct.
reply
celeryd 2 days ago
Any way to download them all at once?
reply
MilkMp 2 days ago
Hey there, will add the feature. Wasn't sure if people's computers could handle it all in one, lol, but will make it available in the data export page.
reply
ngcc_hk 22 hours ago
Not all. But some may. And in case you were shut down someone else can continue and one day it may be resurfaced perhaps even in USA.
reply
MilkMp 19 hours ago
Hi there, I have updated the webpage to include all countries/all years. It will give you a warning that the PDF is large. https://cia-factbook-archive.fly.dev/export
reply
kshri24 23 hours ago
There is a bug in the time series charts. Data needs to be normalized prior to charting. For example: https://cia-factbook-archive.fly.dev/archive/field/IN/Broadb...
reply
MilkMp 20 hours ago
Found the problem, the total regex doesn't handle magnitude suffixes:

2018: total: 17,856,024 → parses as 17856024 (correct raw count)
2020: total: 18.17 million → parses as 18.17 (WRONG - drops "million")
2025: total: 39.3 million → parses as 39.3 (WRONG)

So the chart jumps from ~18 million down to ~18, making it wrong. The fix is to handle "million/billion/trillion" after total.
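One way the suffix handling could look, as a sketch only: the regex and the `parse_total` helper are assumptions, not the deployed fix, and they cover just the "total:" pattern and the three suffixes mentioned above.

```python
import re
from typing import Optional

# Multipliers for the magnitude words that can follow the number.
_MULTIPLIERS = {
    "million": 1_000_000,
    "billion": 1_000_000_000,
    "trillion": 1_000_000_000_000,
}

# "total:" then a number with optional thousands separators and decimals,
# then an optional magnitude word.
_TOTAL_RE = re.compile(
    r"total:\s*([\d,]+(?:\.\d+)?)\s*(million|billion|trillion)?",
    re.IGNORECASE,
)


def parse_total(text: str) -> Optional[float]:
    """Return the 'total:' value as a plain number, or None if absent."""
    match = _TOTAL_RE.search(text)
    if match is None:
        return None
    number = float(match.group(1).replace(",", ""))
    suffix = (match.group(2) or "").lower()
    return number * _MULTIPLIERS.get(suffix, 1)


if __name__ == "__main__":
    for sample in ("total: 17,856,024", "total: 18.17 million", "total: 39.3 million"):
        print(sample, "->", parse_total(sample))
```

With every year normalized to a raw count, the series stays on one scale instead of dropping from millions to tens.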

Just deployed a new bug fix.

Thanks for bringing this to my attention!

reply
MilkMp 22 hours ago
Thanks! Will update soon.
reply
eddythompson80 20 hours ago
Cool project. The world population seems to be double counted. I think https://cia-factbook-archive.fly.dev/analysis/trends
reply
MilkMp 20 hours ago
Found the root cause. The "World" entity (population ~8 billion) was being included alongside all individual countries, doubling the total. Thank you again!
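A small sketch of the double-counting and one possible guard. The `AGGREGATE_ENTITIES` set and the `world_total` helper are hypothetical, and the sample numbers are made up for illustration.

```python
from typing import Dict

# Entities that are aggregates of other entities rather than countries.
# Assumed set for this sketch; the real archive may flag these differently.
AGGREGATE_ENTITIES = {"World", "European Union"}


def world_total(populations: Dict[str, int]) -> int:
    """Sum country populations, skipping aggregate rows such as 'World'."""
    return sum(
        value
        for name, value in populations.items()
        if name not in AGGREGATE_ENTITIES
    )


if __name__ == "__main__":
    # Made-up figures: "World" already equals the sum of the countries.
    sample = {"Country A": 5, "Country B": 7, "World": 12}
    print("naive sum:", sum(sample.values()))
    print("excluding aggregates:", world_total(sample))
```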
reply
MilkMp 20 hours ago
Will fix right now! I think I was looking at this for too long and missed some things. Thank you :)
reply
nubg 2 days ago
Site loads very slowly for me. Tried various devices and networks. Same for a friend of mine overseas.
reply
MilkMp 2 days ago
Will scale the website
reply
Betelbuddy 2 days ago
You should include this one, will also go away soon most likely:

https://www.cia.gov/resources/cia-maps/

reply
MilkMp 18 hours ago
Just the start, but I added the maps to the page. Planning on linking them to their factsheets and dashboards this week.

https://cia-factbook-archive.fly.dev/maps?page=1

reply
MilkMp 24 hours ago
I'll start working on this now! Thank you for sending it! It will be interesting to see if I can incorporate them into the globes or when the country info pops up!
reply
nubg 24 hours ago
To clarify, I am a shill for fly.io and wanted to get you to spend more money by scaling it up. The site loaded instantaneously on the first try, so fast I thought it was local.
reply
MilkMp 22 hours ago
All good. Needed to be scaled anyway:)
reply
nubg 3 hours ago
Lmao. You are good sports. Wish you best of luck.
reply
ronald_petty 2 days ago
I like the timeline feature. Maybe I need to spend more time, but to see political changes / borders / etc. would all be great! Keep up the good work.
reply
MilkMp 19 hours ago
Ohh that is a great idea! And we already have the political field in SQL! I will start working on some of this and update the website this week. Thank you for the awesome suggestions!
reply
tolerance 16 hours ago
This is clearly a vibe coded project. If I were to critique it taking its warm reception into consideration I wouldn’t necessarily call it slop. Slurry? Soup? A good portion of the discussion here are bug reports about things I could imagine someone who has experience in working with this sort of data would anticipate and address in the flow of development, whether on their own or with an LLM.

Yes it is an ambitious project, yes it is useful in theory, but I’m interested in its viability as a legitimate tool for the sort of people who would rely on it for research purposes as opposed to the sort of people who find it a fascinating project but in practice it is little more than something to pique their curiosity—a toy.

At the same time maybe it doesn’t have to be either. It could just be a display of the initiative and ingenuity of the person behind it. But little else can be inferred about them I reckon.

reply
MilkMp 10 hours ago
Hi there, yes I used AI to help build this website. I personally don’t have the time nor the talent to build something like this from scratch! I do have knowledge of how historical and crime data are supposed to be parsed, viewed, analyzed, and presented to the world :) If someone would like to take this work and improve it, please do!
reply
cwnyth 23 hours ago
Kudos! I was working on doing this as well, so it's nice to see it already done.
reply
MilkMp 19 hours ago
Hi there! If you have anything you want added to the site, just let me know :) I can definitely try.
reply
Barbing 17 hours ago
Wonderful project. Thank you for the preservation!
reply
FergusArgyll 2 days ago
Nice!

One thing; you're supposed to write "Cannot confirm or deny my affiliation with the CIA"

reply
sailfast 2 days ago
That’s a bit of a canary is it not? You don’t need to say that and wouldn’t know to say that unless you had worked in the space or wanted us to think you did :)
reply
MilkMp 2 days ago
Thanks, I will change it!
reply
daveelkan 24 hours ago
found a bug: Australia links to American Samoa in 2025 archive.
reply
stephen_g 23 hours ago
Yes I noticed that too, and clicking on Austria takes you to Australia instead! (AU instead of AT which is Austria's country code)

Then when you actually are in Australia, if you click back to 2001 or earlier it changes to 'Ashmore and Cartier Islands'

reply
MilkMp 20 hours ago
Hi there, I have located the root cause and sent out a bug fix.

Root cause: The CIA World Factbook, published by the Central Intelligence Agency, uses the U.S. Government's FIPS 10-4 country codes, which differ from the ISO 3166-1 Alpha-2 codes used by the rest of the world. Of the 281 entities in our database, 173 have different FIPS and ISO codes. Our lookups matched FIPS codes first, so when codes collided between the two systems, the wrong country was loaded. Fixed all 13 queries and 6 templates to always prefer ISO over FIPS.

Examples fixed:

Australia (ISO=AU) was loading American Samoa (FIPS=AQ, but Australia's FIPS=AS collides with American Samoa's ISO=AS)
Singapore (ISO=SG) was loading Senegal (FIPS=SG)
Germany (ISO=DE) was loading Gambia (FIPS=GM = Germany's FIPS, ISO=GM = Gambia)
Bahamas (ISO=BS) was loading Burkina Faso (FIPS=BF = Bahamas' FIPS, ISO=BF = Burkina Faso)

reply
MilkMp 22 hours ago
Will fix. Thank you! Most likely a grouping problem due to the MasterCountry ID.
reply
gbennett71 21 hours ago
Confirmed, affects multiple years' data
reply
RobRivera 2 days ago
Hurray!

I didn't discover this until I saw the recent post about its deactivation.

reply
shevy-java 2 days ago
Hmm. It's kind of weird, because I think I actually used it in the 1990s, probably shortly before Wikipedia emerged. Ever since Wikipedia, I don't think I used the CIA world Factbook much at all, so in a way I guess this partly explains why the website is now defunct. But I am a tiny bit sad that it is gone, if only for a piece of nostalgia from the 1990s era. I think we need to be careful - yes, wikipedia has that information, but we kind of lose websites here. That is a potential danger, because we end up with more and more of a monopoly which is rarely good (ok, wikipedia may be an exception but it also has intrinsic quality issues; it is still excellent in many ways but not perfect, and we may get tunnel vision the more websites vanish - just look at the AI slop autogenerated "content" or "affiliate" links you see in a google search, if anyone is still using that).
reply
MilkMp 2 days ago
Glad I was able to get the original Factbook data that other archivists have gathered over the years: Project Gutenberg (plain text), Wayback Machine (HTML zips and factbook.jsons), and one from the agency's website.
reply
orhmeh09 24 hours ago
World facts provided by the CIA, too, have intrinsic quality issues. I'm not too worried!
reply
cwnyth 23 hours ago
Wikipedia has some consistency issues and often linked to the CIA Factbook as well.
reply
thedudeabides5 6 hours ago
thank you
reply
freakynit 19 hours ago
Excellent site.

One small bug though: https://cia-factbook-archive.fly.dev/analysis/compare?a=IN&b...

.. The second dropdown switches to "Comoros" instead of "China" even after selection, though URL says CN for China.

reply
MilkMp 19 hours ago
Will check out! Thank you!
reply
iririririr 9 hours ago
Was it still relevant after Wikipedia? Honest question.

edit: I mean, not the archive, that's golden! But a branch of usa propaganda being the stop gap provider on geographic data.

reply
ohyoutravel 2 days ago
This is pretty basic but kinda neat. A good way to browse the Factbooks like a website. Definitely could use more features but imo superior to flipping through a PDF.
reply
MilkMp 24 hours ago
Originally, my plan was just to create the archive, but I have expanded the scope, lol.
reply
MilkMp 2 days ago
Hey, what features would you like to see??
reply
ix101 24 hours ago
Hi, thanks for this! Not sure if you're aware that clicking Australia goes to American Samoa, similar issue with some others that I encountered (Bahamas -> Burkina Faso).
reply
MilkMp 20 hours ago
Hi there, I have located the root cause and sent out a bug fix.

Root cause: The CIA World Factbook, published by the Central Intelligence Agency, uses the U.S. Government's FIPS 10-4 country codes, which differ from the ISO 3166-1 Alpha-2 codes used by the rest of the world. Of the 281 entities in our database, 173 have different FIPS and ISO codes. Our lookups matched FIPS codes first, so when codes collided between the two systems, the wrong country was loaded. Fixed all 13 queries and 6 templates to always prefer ISO over FIPS.

Examples fixed:

Australia (ISO=AU) was loading American Samoa (FIPS=AQ, but Australia's FIPS=AS collides with American Samoa's ISO=AS)
Singapore (ISO=SG) was loading Senegal (FIPS=SG)
Germany (ISO=DE) was loading Gambia (FIPS=GM = Germany's FIPS, ISO=GM = Gambia)
Bahamas (ISO=BS) was loading Burkina Faso (FIPS=BF = Bahamas' FIPS, ISO=BF = Burkina Faso)

reply
MilkMp 22 hours ago
[dead]
reply
nephihaha 2 days ago
What is its copyright status?
reply
MilkMp 2 days ago
The data from the CIA World Factbook is in the public domain (being a U.S. Government work) and is free for anyone to use. The ETL scripts and data tools available in the GitHub repository are open source and licensed under the MIT License. However, the web application itself is proprietary software, with all rights reserved.
reply
nephihaha 6 hours ago
Thanks, helpful answer! I did think much of it might be in the public domain like NASA's images.
reply
dbg31415 18 hours ago
This is one of the hardest sites I’ve ever tried to read.

The pages are dense blocks of tiny gray serif text with default line height and almost no visual hierarchy. It feels like gray text on gray blobs. It is exhausting to scan and read.

In 2026, this should not be an issue. We have clear standards. The Web Content Accessibility Guidelines (WCAG) exist for a reason. Basic accessibility best practices have been documented for years.

https://wave.webaim.org/report#/https://cia-factbook-archive...

The issues are not subtle. Small text, low contrast, and long unbroken paragraphs are not design preferences. They are barriers. They make the content harder to read for everyone, especially people with visual or cognitive challenges.

This is fixable. Increase the base font size. Improve contrast ratios. Add meaningful spacing. Use clear headings and structure. These are foundational usability principles.

Accessibility is not extra polish. It is baseline quality. Right now, the site is unnecessarily hard to read. That is a design problem, not a content problem.

reply
freakynit 18 hours ago
Your points about accessibility are fair, and I agree that readability and contrast matter a lot.

That said, I had a different experience. I found the site readable and fairly easy to navigate once I understood the underlying structure of the data. The content is dense, but that seems inherent to the subject matter rather than purely a design issue. For me, it strikes a reasonable balance between overly sparse, scroll-heavy modern layouts and extremely compressed ones.

That doesn't mean improvements couldn't be made, especially around contrast, but I don't think the current design is unusable. It may simply work better for some reading styles than others.

reply
MilkMp 18 hours ago
Was originally just supposed to be a data archive/download place for the parsed data. Thought a website could help! Will look into the standards.
reply
dbg31415 17 hours ago
Accessibility matters.

In 2026, tools like WAVE, Lighthouse, and a real screen reader should be part of any website design process. They catch issues early. A stitch in time saves nine.

I know you may not be a designer. That’s fine. Starting with a solid, off-the-shelf CSS framework can get you much closer to Web Content Accessibility Guidelines (WCAG) compliance from day one. It sets a baseline so you’re not reinventing solved problems.

Building from scratch is absolutely valid. It’s cool, even. But right now it reads less like an intentional design choice and more like missing fundamentals.

I’m not trying to be a dick, the project has potential! A few design improvements would make it usable for a lot more people.

Cheers!

reply
MilkMp 17 hours ago
Thanks! I am definitely not a front-end web designer lol, and I for sure don't want to limit people's access. I will look into the standards and see how best to implement them into the website :)
reply
MilkMp 18 hours ago
Thanks! Will look into it
reply
wossab 18 hours ago
Yeah. Please don't. This is such a breath of fresh air. Dense data should be presented like a book, not a pamphlet-like hyperlinked website.
reply
freakynit 18 hours ago
I agree. I love the current design. Personally, it seems to be just perfect.
reply
devcraft_ai 10 hours ago
[dead]
reply
WhereIsTheTruth 16 hours ago
Treating this as a neutral ground truth is a recipe for data poisoning
reply
WhereIsTheTruth 4 hours ago
Classic hackernewers ;)
reply