Project Gutenberg – keeps getting better
231 points by JSeiko 2 hours ago | 72 comments

JSeiko 2 hours ago
Hi! I'm one of the programmers at Gutenberg. We've been improving the site a lot over the past few months (and more is coming!). If you haven't visited the page recently, it's worth checking out again: https://www.gutenberg.org/
reply
Falimonda 2 hours ago
The book list elements on front page render as both horizontally and vertically scrollable divs on mobile - seems like an opportunity for improvement.

Keep up the good work!

reply
JSeiko 2 hours ago
good feedback thanks! Doing an iteration on the homepage design is actually pretty high on the priority list. will keep your feedback in mind!
reply
xrd 2 hours ago
Thank you for your work. This site is an international treasure.
reply
excitednumber 59 minutes ago
Thank you for being one of the best places on the internet
reply
TimorousBestie 3 minutes ago
Wanna let you know you’re doing great work and you have my dream job, thanks to the team for everything!
reply
ExtremisAndy 2 hours ago
Oh, my! This does look nice. Thank you for your hard work!
reply
JSeiko 58 minutes ago
Thanks! We're currently working on a design update of the page of any specific book. Should be online soon (next 1-2 weeks or so)
reply
smallnix 54 minutes ago
There's a minor bug with Chrome on Android where the menu will not close when you tap outside the menu or on the menu link/button
reply
JSeiko 28 minutes ago
I've messaged the guy who's best suited to fixing this. He'll be on it this weekend
reply
JSeiko 47 minutes ago
will open an "Issue" for it
reply
shuvrojit 55 minutes ago
Great Work. Thank you. I'm also a programmer. If you are ever short on help, let me know. I would love to contribute.
reply
JSeiko 19 minutes ago
https://github.com/gutenbergtools

autocat3 and gutenbergsite are repos responsible for generating gutenberg.org

reply
BiraIgnacio 35 minutes ago
Thanks so much for the work you and your team do!
reply
samcollins 2 hours ago
Very cool! Do you have a recommended way for an agent to see an index of the books and epub links?

(I can’t quite tell if that’s an egregious abuse of the site or you’re perfectly fine to share without human eye balls hitting your www?)

reply
jzs 2 hours ago
Now, I'm not associated with Gutenberg in any form, but they do have a page for offline consumption:

https://www.gutenberg.org/ebooks/offline_catalogs.html

Perhaps you can find the information you are looking for there.

However, if you plan on scraping or otherwise hitting them with a ton of traffic, consider at least donating a good amount for the traffic you cause them. It ain't free after all.

reply
JSeiko 2 hours ago
Donations are always appreciated ;)
reply
kay_o 2 hours ago
Check out https://www.gutenberg.org/ebooks/offline_catalogs.html

Don't hit the site with an agent. The section at the very bottom of that page is machine-readable.

reply
samcollins 48 minutes ago
Thanks for the answers! Found it:

> All Project Gutenberg metadata are available digitally in the XML/RDF format. This is updated daily (other than the legacy format mentioned below). Please use one of these files as input to a database or other tools you may be developing, instead of crawling or roboting the website.

And strongly consider a donation! (My addition)

https://www.gutenberg.org/ebooks/offline_catalogs.html#the-p...
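
A minimal sketch of the "use a catalog file as input instead of crawling" idea quoted above, assuming you have already downloaded a CSV export of the catalog once and read it into a string. The column names (`Text#`, `Title`, `Language`), the EPUB URL pattern, and the sample rows are all assumptions for illustration, not something confirmed in this thread; check the real file's header and the download links on a book page before relying on them.

```python
import csv
import io

# Assumed column names for a catalog CSV export; verify against the real header.
ID_COLUMN = "Text#"
TITLE_COLUMN = "Title"
LANGUAGE_COLUMN = "Language"

# Assumed URL pattern for an EPUB download; verify before relying on it.
EPUB_URL_TEMPLATE = "https://www.gutenberg.org/ebooks/{book_id}.epub3.images"


def index_catalog(csv_text, language=None):
    """Build a list of (book_id, title, epub_url) from catalog CSV text.

    If `language` is given, only rows whose language column matches are kept.
    No network requests are made; the URLs are only constructed as strings.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    entries = []
    for row in reader:
        if language is not None and row[LANGUAGE_COLUMN] != language:
            continue
        book_id = int(row[ID_COLUMN])
        epub_url = EPUB_URL_TEMPLATE.format(book_id=book_id)
        entries.append((book_id, row[TITLE_COLUMN], epub_url))
    return entries


# Made-up sample rows standing in for a locally downloaded catalog file.
SAMPLE = (
    "Text#,Title,Language\n"
    '1,"Example Book, Volume One",en\n'
    "2,Beispielbuch,de\n"
)

if __name__ == "__main__":
    for book_id, title, url in index_catalog(SAMPLE, language="en"):
        print(book_id, title, url)
```

The point of the design is that the index is built from one local file, so an agent only touches the site when it actually downloads a specific book.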

reply
JSeiko 2 hours ago
not yet, but that's not a bad idea imo. Dealing with AI crawler traffic is definitely a challenge if that's what you were referring to.
reply
ancientcatz 2 hours ago
OPDS?
reply
gluejar 25 minutes ago
OPDS 2.0 coming RSN. email us if you want to test. OPDS 0.x is currently available (not recommended) by adding .opds to the end of a url
reply
e0d075b569cd 2 hours ago
brother ... are we really THAT stupid now?
reply
throw0101c 2 hours ago
While PG has probably gotten a lot of use and growth with the growth/mainstreaming of the Internet since the 1990s, (TIL) it started back in 1971:

> Michael S. Hart began Project Gutenberg in 1971 with the digitization of the United States Declaration of Independence.[5] Hart, a student at the University of Illinois, obtained access to a Xerox Sigma V mainframe computer in the university's Materials Research Lab. […] This computer was one of the 15 nodes on ARPANET, the computer network that would become the Internet. Hart believed one day the general public would be able to access computers and decided to make works of literature available in electronic form for free. […]

* https://en.wikipedia.org/wiki/Project_Gutenberg

reply
smilespray 5 minutes ago
I remember printing out project Gutenberg books in the mid-90s, four regular pages to an A4 page, double-sided on my inkjet. I had a background in typography, so I made it work.

Now, in my early fifties and with declining eyesight, that's out of reach.

reply
gluejar 16 minutes ago
Nice to see so much appreciation for what we do. (I'm the new-ish executive director.) Any wikipedians reading this, the article about PG is... aging. Last I looked, it said we offered Plucker files. @JSeiko has done some nice work.
reply
Someone1234 2 hours ago
I'm surprised no eBook Reader vendor has a Project Gutenberg "Store." Where you can just browse Gutenberg, find a book, and just grab it down to the reader. Instead, they either are actively hostile (Kindle), or require the use of Calibre (which itself is good, it is just the friction).
reply
horsawlarway 60 minutes ago
I've used https://standardebooks.org/ to pull nicely formatted Project Gutenberg books on any e-reader that supports a browser (in my case, Boox).

Technically, I can also just directly pull the epub from Project Gutenberg, but sometimes the formatting leaves a lot to be desired.

Once you get an e-reader that runs a semi-capable OS (ex - stock android, even an older version), it's hard to go back to something like a kindle.

reply
JSeiko 54 minutes ago
standardebooks.org is great!
reply
WillAdams 13 minutes ago
Used to be one could sort of get that with the Project Librivox:

https://librivox.org/

e-book app Gutebooks (in addition to their audio app), but it seems to have been deprecated. I'm no longer able to connect to the server on my copy (which I only got 'cause there was an in-app purchase to fund Project Librivox).

FWIW, Barnes & Noble has been plundering the public domain using a book composition/keying house in the Philippines to make their public domain books which they make available in their stores --- Amazon apparently has a similar setup for the Kindle Store:

https://www.amazon.com/Public-Domain-Books-Kindle-Store/s?k=...

Rather a shame that PG didn't monetize by putting their books up there pre-emptively.

reply
GaryBluto 2 hours ago
Most of them offer their own paid storefronts and have a perverse incentive not to offer a large area full of free books.
reply
JSeiko 55 minutes ago
probably true. Maybe a true open-source eReader should exist.
reply
JSeiko 2 hours ago
I've heard that the newest Kobo e-readers have a browser that you could use to go to gutenberg.org and directly download files.

but yes, generally I agree with your point. Library of 75k books seems pretty valuable to have direct access to.

reply
daveoc64 11 minutes ago
You can download books directly from the Project Gutenberg website using the web browser on most eBook readers - even the Kindle supports it.
reply
cstever 42 minutes ago
No money for them.
reply
ndr42 33 minutes ago
The project was geo-blocked in Germany for a long time: https://news.ycombinator.com/item?id=29024039
reply
kreyenborgi 12 minutes ago
Gutenberg is awesome. There is also

https://www.fadedpage.com/ from Canada I think

https://runeberg.org/ from Sweden

reply
JKCalhoun 55 minutes ago
Project Gutenberg had (has?) a tendency toward plaintext that always put me off. (And it has been over a decade I'm sure since I explored the site—so I am no doubt now misinformed.)

I like a styled formatted book—would prefer PDFs. (I know, not a popular format apparently.)

I like the idea of Project Gutenberg, but I guess I found the book scans on archive.org more to my preference.

My go-to example is Lewis Carroll's "Through the Looking Glass" with the fantastic art of John Tenniel and Carroll's sometimes creative formatting of the prose…

I see they (Project Gutenberg) have ePub now, which can be good if well done.

(If not well done it can be a kind of mess. Re-flowable "HTML", paginated… Anyone ever try to print a long web page and did you enjoy the result? Perhaps that is as much on the ePub reader though.)

reply
JSeiko 50 minutes ago
We're supporting EPUB3 for the vast majority of books! At the same time we also have a "Plain Text" version for each as in a sense it's the most robust. PDFs are in the works!
reply
RattlesnakeJake 53 minutes ago
Check out Standard eBooks. They take the text from Gutenberg and add a level of polish to the ePubs.
reply
JLO64 52 minutes ago
As others here have mentioned, https://standardebooks.org/ is excellent and my understanding is that they use Gutenberg books as a source for theirs but done up much nicer.
reply
everybodyknows 20 minutes ago
You can contribute to Standard Ebooks by finding OCR errors, then pushing your fixes to https://github.com/standardebooks
reply
dempedempe 38 minutes ago
Source can be anything with the original text, but, more often than not, ends up being PG.
reply
skrtskrt 21 minutes ago
The common issue with PDFs is that e-readers generally have terrible support for them.
reply
jiffygist 50 minutes ago
I on the other hand prefer epubs for fiction. I mostly read on the phone.
reply
gluejar 23 minutes ago
PDF coming this year.
reply
graemep 53 minutes ago
I have got quite a few books over the years from Gutenberg, and the epubs have been fine, even for illustrated ones.
reply
the_af 38 minutes ago
I like plain text. You can always post process it into any other format you prefer.
reply
bryankaplan 10 minutes ago
I find it interesting that the context of this comments page apparently overrides the normal definition of “PG” on HN.
reply
JSeiko 8 minutes ago
:D
reply
JSeiko 8 minutes ago
personally I'm a fan of the other "PG" as well.
reply
RattlesnakeJake 54 minutes ago
As a Kindle user, I still miss the old version of the site. The new one looks great on normal desktop, but the old one was simple enough to load and directly download books on the device's built-in browser.
reply
JSeiko 53 minutes ago
That's interesting. What about the new design prevents you from doing it? Genuinely asking here. We may fix it if it's actionable
reply
RattlesnakeJake 34 minutes ago
And now it's time to put my foot in my mouth. I haven't used it in a while because it was frustrating, but you guys seem to have already fixed it :)

The previous version of the site had two major flaws:

1. The search bar had been removed from the top of the page, and hidden behind a "Click here to search" (or similar) link partway down the page

2. Once you opened that page, the coloring of the site was so washed out on e-ink that the text input was hard to find.

Thanks for fixing it!

reply
JSeiko 30 minutes ago
"you guys seem to have already fixed it" - that's what we like to hear :)
reply
graemep 52 minutes ago
Is that a Kindle issue?

You can download books in most browsers. I know Amazon have done things to make life difficult for other stores in the past.

reply
oidar 16 minutes ago
I'm slightly curious how PG handles heavily illustrated books. I've downloaded some years ago, and the quality of the illustrations was always pretty poor. Has it been improved lately? What's the QA like for illustrations?
reply
gluejar 7 minutes ago
Nowadays we depend on scans from Internet Archive, Hathitrust, and other sources. Some scans are better than others. Bear in mind that our illustrations need to be in the public domain and usually from the same edition as the text. https://www.gutenberg.org/help/errata.html
reply
AndrewStephens 17 minutes ago
PG remains one of the best things on the internet. The amount of fascinating material almost beggars belief.
reply
JSeiko 10 minutes ago
the amount of weird/interesting stuff that one would find nowhere else is possibly the coolest aspect of PG imo
reply
seizethecheese 2 hours ago
A big pet peeve of mine with Project Gutenberg was the lack of mobile styling. Looks like it’s been fixed! Awesome.
reply
JSeiko 2 hours ago
good to hear - that was a lot of work!
reply
mowmiatlas 2 hours ago
Made an app that allows reading PG books as audiobooks on iPhone https://loudreader.io/
reply
JSeiko 2 hours ago
that's cool!
reply
aronhegedus 2 hours ago
Recently downloaded Moby Dick from here:) very easy to use
reply
JSeiko 22 minutes ago
Moby Dick is consistently one of the Top Downloads
reply
carlosjobim 34 minutes ago
Their feeds of new books are a goldmine:

https://www.gutenberg.org/ebooks/feeds.html

Every day you'll get much more than you're bargaining for, right into your feed or inbox. Easily download the books you're interested in and put them on your Kindle.

reply
WillAdams 10 minutes ago
I used to use the Online Books Page new books listing similarly:

https://onlinebooks.library.upenn.edu/new.html

reply
taubek 2 hours ago
Thank you for reminding me about this project. Haven’t visited it in a long time.
reply
solarity_studio 2 hours ago
Awesome
reply
brcmthrowaway 2 hours ago
I can't read anymore due to fear of not being productive with AI
reply
JSeiko 59 minutes ago
maybe there's a way to read more productively using AI: https://x.com/karpathy/status/1990577951671509438

could be a trick to ease that fear :D

reply