Hacker News

229 points by roywashere 14 hours ago | 168 comments

nateb2022 28 minutes ago

I'll plug Pyreqwest here: https://github.com/MarkusSintonen/pyreqwest

It's been a pleasure to use, has an httpx compatibility layer for gradually migrating to its API, and it's a lot more performant (right now, I think it's the most performant Python http client out there: https://github.com/MarkusSintonen/pyreqwest/blob/main/docs/b...)

zahlman 4 hours ago

> The fix was ignored and there was never any release since November 2024. Me, and others, asked repeatedly for a release containing my fix. I sent email to the author personally. I got response when I added that I was considering forking. The author replied “1.0 development is on course”.... I do understand about maintainer burnout, and preferring to work on ‘next’, and that there is life outside of Python, but I think not doing anything for maintenance and also not letting other people help out in maintaining, for such a high profile module, is problematic.

I feel like it's counterproductive in situations like this to mention forking. It will come across like a threat, when there isn't really anything intrinsically aggressive about it. So just do it; and when you have a decent amount of separate development, you can decide whether to make PRs back, advertise your fork, etc.

renegat0x0 24 minutes ago

There are many nice http clients:

- httpx

- curl cffi

- httpmorph

- httpcloak

- stealth crawler

I wrote a framework, link below, which uses them all. You can compare each to verify crawling speed. Some sites can only be cleanly crawled with one particular framework.

Having read the article, I am in pain. I do break things during development. I rewrite stuff. Maybe some day I will find a way to develop things "stable". One thing I try to keep in good shape is the 'docker' image. I update it once everything seems to be quite stable.

https://github.com/rumca-js/crawler-buddy

mesahm 13 hours ago

the http landscape is rather scary lately in Python. instead of forking join forces... See Niquests https://github.com/jawah/niquests

I have been trying to resolve what you've seen, through years of hard work.

samset7 10 hours ago

We have switched to niquests in my company and yes I can confirm that it's 10x better than httpx :)

_boffin_ 6 hours ago

What issues do / did you have with HTTPx?

samset7 4 hours ago

The main pain points for us were: thread-safety issues (httpx claims to be thread-safe but we hit race conditions in production), no HTTP/3 support, and the redirect behavior requiring explicit opt-in everywhere. Also the multiplexing story in httpx is quite limited compared to what niquests offers out of the box. On top of that, httpx maintenance has been slow to acknowledge valid bug reports, the thread-safety issue took over a year to even be acknowledged...

PyWoody 7 hours ago

Did you have any warts when switching? httpx has been "fine" for me but this thread has me seriously considering changing to niquests.

samset7 4 hours ago

The switch was surprisingly smooth. I think there's an official migration guide in the doc. Honestly the API is closer to the classic requests library so nobody will be lost.

mesahm 9 hours ago

nice to hear :)

u_sama 13 hours ago

It is indeed a shame that niquests isn't used more. I think making the (c'est Français) argument in French would bring you the many initial users needed for momentum.

mesahm 13 hours ago

haha, "en effet" ("indeed")! I'll remember that.

more seriously, all that is needed is our collective effort. I've done my part by sacrificing a lot of personal time for it.

u_sama 11 hours ago

I saw there are almost no bugs or things to contribute, are there other ways to help ?

mesahm 9 hours ago

yes, plenty! testing it extensively, finding edge-case bugs, (...) and of course: spreading the word in other projects to help increase adoption.

mananaysiempre 10 hours ago

No Trio support yet, right? That’s the main reason to use httpx for me at least, and has been since I first typed “import httpx” some years ago.

(Also the sponsorship subscription thing in the readme gives me vague rugpull vibes. Maybe I’ve just been burned too much—I don’t mean to discourage selling support in general.)

mesahm 9 hours ago

help with getting it working is appreciated, we have it in mind. duly noted about the sponsorship, we accept constructive criticism, and alternatives can be considered.

duskdozer 13 hours ago

Is it knee-quests or nigh-quests?

I've started seeing these emoji-prefixed commits lately now too, peculiar

mesahm 13 hours ago

it's the gitmoji thing, I really don't like it, it was a mistake. I'm thinking of stopping it soon. I was inspired by fastapi in the early days. I prefer conventionalcommits.org

croemer 12 hours ago

Please don't be too much inspired by FastAPI - at least regarding maintainer bus factor and documentation (FastAPI docs are essentially tutorial only), and requiring dozens of hoops to jump through to even open an issue.

mesahm 12 hours ago

agreed. as I said, it was a mistake from my end. and clearly looking to better myself.

u_sama 13 hours ago

There is a series of extensions for Vscode that add this functionality like https://github.com/ugi-dev/better-commits

duskdozer 12 hours ago

ah ok, I am familiar with and not exactly against (non-emoji) commit message prefixes

rob 7 hours ago

I better start seeing some caterpillar emojis in your next commits or we're gonna have a real problem!

mesahm 13 hours ago

nee-quests, I am French native.

Biganon 9 hours ago

niquests as in "niquer" ?

I like it, I like it

tomjakubowski 2 hours ago

what a delightful pun, bravo

duskdozer 12 hours ago

I guess kind of obvious now noticing the rhyme

greatgib 13 hours ago

The basis of httpx is not very good at all.

I think that it owes its success to being the first "port" of python requests to support async, which was a strong need.

But otherwise it is bad: the API is not that great, performance is not that great, tweaking is not that great, and the maintainer mindset is not that great either. On that last point, a few examples were referenced in the article, but it can easily cause your production project to suddenly break in a bad way for no valid reason.

It is not perfect, but I would advise everyone to switch to Aiohttp.

sgt 9 hours ago

I literally the other week had the choice between using requests and httpx. I chose httpx after deliberating a bit. I don't need async capabilities right now but I figured it'll be more consistent if that changes later.

nyrikki 7 hours ago

I started using the ports and adapters pattern and protocol for any packages that have replacements or concerns.

Basically treating HTTP requests as an orthogonal, or cross-cutting concern.

It is sometimes hard to tell if these upstream packages are stable or abandoned.

I should probably document my methodology so it can help others or at least have the chance to find out what mistakes or limitations they might have.
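For illustration, a minimal sketch of what a port and adapter for HTTP might look like in Python, using only the standard library. `HttpPort`, `UrllibAdapter`, `FakeAdapter`, `latest_version` and the example.invalid URL are hypothetical names made up for this sketch, not taken from any real project:

```python
import json
import urllib.request
from typing import Mapping, Optional, Protocol


class HttpPort(Protocol):
    """The 'port': the only HTTP surface the application code depends on."""

    def get_json(self, url: str, headers: Optional[Mapping[str, str]] = None) -> dict: ...


class UrllibAdapter:
    """Adapter backed by the standard library (hypothetical example)."""

    def __init__(self, timeout_seconds: float = 10.0) -> None:
        self._timeout = timeout_seconds

    def get_json(self, url: str, headers: Optional[Mapping[str, str]] = None) -> dict:
        request = urllib.request.Request(url, headers=dict(headers or {}))
        with urllib.request.urlopen(request, timeout=self._timeout) as response:
            return json.load(response)


class FakeAdapter:
    """In-memory adapter for tests; it never touches the network."""

    def __init__(self, canned: Mapping[str, dict]) -> None:
        self._canned = dict(canned)

    def get_json(self, url: str, headers: Optional[Mapping[str, str]] = None) -> dict:
        return self._canned[url]


def latest_version(http: HttpPort, package: str) -> str:
    """Application code: knows about HttpPort, not about any client library."""
    data = http.get_json(f"https://example.invalid/packages/{package}")
    return data["version"]


# Swapping the upstream package means writing one new adapter class.
fake = FakeAdapter({"https://example.invalid/packages/demo": {"version": "1.2.3"}})
version = latest_version(fake, "demo")
```

The design choice is that only the adapter imports the client library, so an abandoned upstream is replaced in one place rather than at every call site.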

mesahm 13 hours ago

aiohttp is an excellent library. very stable. I concur, but! it's too heavily tied to HTTP/1, and well, I am not a fan of opening thousands of TCP conn just to keep up with HTTP/2 onward. niquests easily beats aiohttp using just 10 conn and crushes httpx, see https://gist.github.com/Ousret/9e99b07e66eec48ccea5811775ec1...

fwiw, HTTP/2 is twelve years old, just saying.

sammy2255 13 hours ago

aiohttp is for asynchronous contexts only

Orelus 13 hours ago

Can confirm, more features, a breeze to switch.

roywashere 11 hours ago

Thanks, I'll link to your project

mesahm 11 hours ago

Thank you. Appreciated, you're welcome here anytime.

hrmtst93837 11 hours ago

Half-melded side projects just pollute PyPI more, you get less grief long-term by biting the bullet and shipping a fork that owns its tradeoffs.

ayhanfuat 13 hours ago

More related drama: The Slow Collapse of MkDocs (https://fpgmaas.com/blog/collapse-of-mkdocs/)

duskdozer 12 hours ago

>thread to call out Read the Docs for profiting from MkDocs without contributing back.

>They also point out that not opening up the source code goes against the principles of Open Source software development

I will never stop being amused when people have feelings like this and also choose licenses like BSD (this project). If you wanted a culture that discouraged those behaviors, why would you choose a license that explicitly allows them? Whether you can enforce it or not, the license is basically a type of CoC that states the type of community you want to have.

vocx2tx 10 hours ago

The reason is simple: they'd like to reap all the benefits of a permissive licence (many people and companies won't or can't touch GPL code), without any of the downsides; but these downsides are the very reason behind the rules in more 'restrictive' licenses like the GPL.

This usually doesn't work, and in the end all they can do is complain about behaviours that their license choice explicitly allowed.

72deluxe 10 hours ago

Yes I agree completely. I am baffled why they choose that license in the first place. It just seems to engender drama when people actually follow the license they've chosen! Perhaps open source is actually powered by drama, where developers have more meaning from the drama they create than the actual things they create?

znpy 13 hours ago

Oh i recognised one of the involved people immediately, drama person.

I still think that hijacking the mkdocs package was the wrong way to go though.

The foss landscape has become way too much fork-phobic.

Just fork mkdocs and go over your merry way.

globular-toast 12 hours ago

Right, my suspicion was correct. When I interacted with them a few years ago they seemed perfectly nice and friendly, but seem to have gone off the rails more recently. It's an uncomfortable situation and I've a feeling people are afraid to discuss this kind of thing but we really need to. People are a risk factor in software projects and we need to be resilient to changes they face. Forking is the right way, but places like GitHub have sold people on centralisation. We need to get back to decentralised dev.

znpy 9 hours ago

> but places like GitHub have sold people on centralisation. We need to get back to decentralised dev.

I don’t think that’s the case. It’s more of a marketing/market incentive. It’s great pr to be associated with the most famous project, way less so to be associated with a fork, at least until the fork becomes widespread and well recognised.

GitHub does make it fairly easy to fork a project, I wouldn’t blame the situation on github.

kurtis_reed 7 hours ago

Who are they?

rglullis 13 hours ago

Drama around Starlette. Drama around httpx. Drama around MkDocs. I just hope that DRF is not next, I still have some projects that depend on it.

mananaysiempre 10 hours ago

Per TFA, there’s similarly-shaped low-key drama around DRF too[1] although issues and discussions have been reënabled since then.

[1] https://github.com/orgs/encode/discussions/11#discussioncomm...

forkerenok 12 hours ago

What's the drama around starlette? (Can't find anything)

mananaysiempre 10 hours ago

https://github.com/Kludex/starlette/issues/3180 and before that https://github.com/Kludex/starlette/issues/3042

noirscape 10 hours ago

I think that may be the first time I've seen licensing drama over something as minor as adding another author to the copyright list.

Pretty sure those are completely standard for major changes in maintainers/hostile forks/acknowledging major contributors. I've seen a lot of abandoned MIT/BSD projects add a new line for forks/maintainers being active again in order to acknowledge that the project is currently being headed by someone else.

From my "I am not a lawyer" view, Kludex is basically correct, although I suppose to do it "properly", he might need to just duplicate the license text in order to make it clear both contributors licensed under BSD 3-clause. Probably unnecessary though, given it's not a license switch (you see that style more for ie. switching from MIT to BSD or from MIT/BSD to GPL, since that's a more substantial change); the intent of the license remains the same regardless and it's hard to imagine anyone would get confused.

I suspect (given the hammering on it in responses), that Kludex asking ChatGPT if it was correct is what actually pissed off the original developer, rather than the addition of Kludex to the list in and of itself.

mananaysiempre 10 hours ago

(Not a lawyer either but—)

The original author said they were “the license holder”, specifically with a “the”, in discussions around both Starlette and MkDocs, which yes, just isn’t true even after rounding the phrase to the nearest meaningful, “the copyright holder”. This appears to be an honest misconception of theirs, so, not the end of the world, except they seem to be failing at communication hard enough to not realize they might be wrong to begin with.

Note though that with respect to Starlette this ended up being essentially a (successful and by all appearances not intentionally hostile?) project takeover, so the emotional weight of the drama should be measured with respect to that, not just an additional copyright line.

kwsp 12 hours ago

[dead]

0x073 10 hours ago

If this would be a tv show I probably would view it, but wow what a drama.

WesolyKubeczek 9 hours ago

On one hand, that account of the attempted project takeover smelled to me like Jia Tan.

On the other hand, the comments the MkDocs author is making about perceived gender grievances feel so unhinged that I wouldn't be touching anything made by them with a barge pole.

bojan 8 hours ago

> On one hand, that account of the attempted project takeover smelled to me like Jia Tan.

Oleh was basically the sole maintainer for many years, and the development basically stopped when he left.

WesolyKubeczek 7 hours ago

Yes, I know you can be legit, but when you first contribute a few useful things, then jump to maintainership and want keys to the kingdom, the pattern looks similar (sans the last step which is embedding some backdoor). At least in how the article described it.

Kwpolska 13 hours ago

What is it about Python that makes developers love fragmentation so much? Sending HTTP requests is a basic capability in the modern world, the standard library should include a friendly, fully-featured, battle-tested, async-ready client. But not in Python, stdlib only has the ugly urllib.request, and everyone is using third party stuff like requests or httpx, which aren't always well maintained. (See also: packaging)

dirkc 12 hours ago

You would think that sending HTTP requests is a basic capability, but I've had fun in many languages doing so. Long ago (2020, or not so long ago, depending on how you look at it) I was surprised that doing an HTTP request on node using no dependencies was a little awkward:

  const response = await new Promise( (resolve, reject) => {
    const req = https.request(url, {
    }, res => {
      let body = "";
      res.on("data", data => {
        body += data;
      });
      res.on('end', () => {
        resolve(body);
      });
    });
    req.end();
  });

wging 11 hours ago

These days node supports the fetch API, which is much simpler. (It wasn't there in 2020, it seems to have been added around 2022-2023.)

dirkc 11 hours ago

Yes, thankfully! It's amusing to read what they say about fetch on nodejs.org [1]:

> Undici is an HTTP client library that powers the fetch API in Node.js. It was written from scratch and does not rely on the built-in HTTP client in Node.js. It includes a number of features that make it a good choice for high-performance applications.

[1] - https://nodejs.org/en/learn/getting-started/fetch

Pay08 7 hours ago

Why is it amusing?

dirkc 4 hours ago

I say amusing because it points out that something I (and many other people) assume to be basic clearly has a lot more nuance to it.

b450 7 hours ago

Note that node-fetch will silently ignore any overrides to "forbidden" request headers like Host, since it's designed for parity with fetch behavior in the browser. This caused a minor debugging headache for me once.

rzmmm 10 hours ago

Web standards have rich support for incremental/chunked payloads, the original node APIs are designed around it. From this lens the Node APIs make sense.
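The same incremental style can be sketched in Python, as a rough analogue of the `data`/`end` events in the Node snippet above. `iter_chunks` and `read_all` are hypothetical helper names, and `io.BytesIO` stands in for a real response body here (`http.client.HTTPResponse` exposes the same `read(n)` method):

```python
import io
from typing import BinaryIO, Iterator


def iter_chunks(body: BinaryIO, chunk_size: int = 8192) -> Iterator[bytes]:
    """Yield the body piece by piece instead of loading it all at once."""
    while True:
        chunk = body.read(chunk_size)
        if not chunk:  # empty bytes means end of stream
            break
        yield chunk


def read_all(body: BinaryIO, chunk_size: int = 8192) -> bytes:
    """The 'convenient' version: accumulate every chunk into one bytes object."""
    return b"".join(iter_chunks(body, chunk_size))


# Stand-in for a response body; no network is involved in this sketch.
fake_body = io.BytesIO(b"hello, streaming world")
chunks = list(iter_chunks(fake_body, chunk_size=8))
```

The convenience function is a few lines on top of the streaming primitive, whereas the reverse (streaming on top of a read-everything API) is not possible.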

simlevesque 8 hours ago

And you don't handle errors at all...

ivanjermakov 12 hours ago

HTTP client is at the intersection of "necessary software building block" and "RFC 2616 intricacies that are hard to implement". Has nothing to do with Python really.

maccard 12 hours ago

> Then I found out it was broken. I contributed a fix. The fix was ignored and there was never any release since November 2024.

This seems like a pretty good reason to fork to me.

> Sending HTTP requests is a basic capability in the modern world, the standard library should include a friendly, fully-featured, battle-tested, async-ready client. But not in Python,

Or Javascript (well node), or golang (net/http is _worse_ than urllib IMO), Rust, Java (UrlRequest is the same as python's), even dotnet's HttpClient is... fine.

Honestly the thing that consistently surprises me is that requests hasn't been standardised and brought into the standard library

francislavoie 12 hours ago

What, Go's net/http is fantastic. I don't understand that take. Many servers are built on it because it's so fully featured out of the box.

maccard 8 hours ago

The server side is great. Sending an HTTP request is… not

lenkite 12 hours ago

Your java knowledge is outdated. Java's JDK has a nice, modern HTTP Client https://docs.oracle.com/en/java/javase/11/docs/api/java.net....

ffsm8 10 hours ago

Ahh, java. You never change, even if you're modern

    HttpClient client = HttpClient.newBuilder()
        .version(Version.HTTP_1_1)
        .followRedirects(Redirect.NORMAL)
        .connectTimeout(Duration.ofSeconds(20))
        .proxy(ProxySelector.of(
            new InetSocketAddress("proxy.example.com", 80)
        ))
        .authenticator(Authenticator.getDefault())
        .build();

    HttpResponse<String> response = client.send(request, BodyHandlers.ofString());

    System.out.println(response.statusCode());
    System.out.println(response.body());

For the record, you're most likely not even interacting with that API directly if you're using any current framework, because most just provide automagically generated clients and you only define the interface with some annotations

awkwardpotato 10 hours ago

What's the matter with this? It's a clean builder pattern, the response is returned directly from send. I've certainly seen uglier Java

freedomben 8 hours ago

Just my opinion of course, but:

> What's the matter with this?

To me what makes this very "Java" is the arguments being passed, and all the OOP stuff that isn't providing any benefit and isn't really modeling real-world-ish objects (which IMHO is where OOP shines). .version(Version.HTTP_1_1) and .followRedirects(Redirect.NORMAL) I can sort of accept, but it requires knowing what class and value to pass, which is lookups/documentation reference. These are spread out over a bunch of classes. But we start getting so "Java" with the next ones. .connectTimeout(Duration.ofSeconds(20)) (why can't I just pass 20 or 20_000 or something? Do we really need another class and method here?) .proxy(ProxySelector.of(new InetSocketAddress("proxy.example.com", 80))), geez that's complex. .authenticator(Authenticator.getDefault()), why not just pass bearer token or something? Now I have to look up this Authenticator class, initialize it, figure out where it's getting the credentials, how it's inserting them, how I put the credentials in the right place, etc. The important details are hidden/obscured behind needless abstraction layers IMHO.

I think Java is a good language, but most modern Java patterns can get ludicrous with the abstractions. When I was writing lots of Java, I was constantly setting up an ncat listener to hit so I could see what it's actually writing, and then have to hunt down where a certain thing is being done and figuring out the right way to get it to behave correctly. Contrast with a typical Typescript HTTP request and you can mostly tell just from reading the snippet what the actual HTTP request is going to look like.

looperhacks 7 hours ago

> but it requires knowing what class and value to pass

Unless you use a text editor without any coding capabilities, your IDE should show you which values you can pass. The alternative is to have more methods, I guess?

> why can't I just pass 20 or 20_000 or something

20 what? Milliseconds? Seconds? Minutes? While I wouldn't write the full Duration.ofSeconds(20) (you can save the "Duration."), I don't understand how one could prefer a version that makes you guess the unit.

> proxy(ProxySelector.of(new InetSocketAddress("proxy.example.com", 80))), geez that's complex

Yes it is, can't add anything here. There's a tradeoff between "do the simple thing" and "make all things possible", and Java chooses the second here.

> .authenticator(Authenticator.getDefault()), why not just pass bearer token or something?

Because this Authenticator is meant for prompting a user interactively. I concur that this is very confusing, but if you want a Bearer token, just set the header.

freedomben 6 hours ago

Fair points.

> Unless you use a text editor without any coding capabilities, your IDE should show you which values you can pass. The alternative is to have more methods, I guess?

Fair enough, as much as I don't like it, in Java world it's safe to assume everyone is using an IDE. And when your language is (essentially) dependent on an IDE, this becomes a non-issue (actually I might argue it's even a nice feature since it's very type safe).

> 20 what? Milliseconds? Seconds? Minutes? While I wouldn't write the full Duration.ofSeconds(20) (you can save the "Duration."), I don't understand how one could prefer a version that makes you guess the unit.

I would assume milliseconds and would probably have it in the method name, like timeoutMs(...) or something. I will say it's very readable, but if I was writing it I'd find it annoying. But optimizing for readability is a reasonable decision, especially since 80% of coding is reading rather than writing (on average).

Pay08 7 hours ago

> why can't I just pass 20 or 20_000 or something? Do we really need another class and method here?

If you've ever dealt with time, you'll be grateful it's a duration and not some random int.
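A rough Python illustration of the same idea, using `datetime.timedelta` so the unit travels with the value. `connect_timeout_seconds` is a hypothetical helper name, not part of any library:

```python
from datetime import timedelta


def connect_timeout_seconds(timeout: timedelta) -> float:
    """Convert an explicit duration into the float seconds most Python APIs expect."""
    if not isinstance(timeout, timedelta):
        # A bare 20 could mean seconds or milliseconds; refuse to guess.
        raise TypeError("pass a timedelta so the unit is explicit, not a bare number")
    seconds = timeout.total_seconds()
    if seconds <= 0:
        raise ValueError("timeout must be positive")
    return seconds


# Both spellings describe the same duration, and neither needs a unit in the name.
from_seconds = connect_timeout_seconds(timedelta(seconds=20))
from_millis = connect_timeout_seconds(timedelta(milliseconds=20_000))
```

With a plain int, the two call sites above would have to agree on a unit by convention or by method name.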

zahlman 4 hours ago

> What's the matter with this? It's a clean builder pattern

I feel like you answered yourself. Java makes you do this by not supporting proper keyword arguments.
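As a sketch of what that looks like with keyword arguments in Python: `ClientConfig` below is a hypothetical class whose fields loosely mirror the builder calls in the Java snippet, and its default values are arbitrary assumptions for illustration, not the Java client's real defaults.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class ClientConfig:
    """Hypothetical client configuration; every field has a default."""

    follow_redirects: bool = False
    connect_timeout: timedelta = timedelta(seconds=30)
    proxy: Optional[str] = None
    http_version: str = "HTTP/1.1"


# Only the options that differ from the defaults are spelled out,
# and each one is named at the call site.
config = ClientConfig(follow_redirects=True, connect_timeout=timedelta(seconds=20))
```

The builder and the keyword-argument constructor carry the same information; the difference is mostly how much ceremony the language needs to express optional, named parameters.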

colejohnson66 8 hours ago

The boilerplate of not having sane defaults. .NET is much simpler:

    using HttpClient client = new();
    HttpResponseMessage response = await client.GetAsync("https://...");
    if (response.StatusCode is HttpStatusCode.OK)
    {
        string s = await response.Content.ReadAsStringAsync();
        // ...
    }

pjmlp 7 hours ago

Yeah, so much simpler,

"Common IHttpClientFactory usage issues"

https://learn.microsoft.com/en-us/dotnet/core/extensions/htt...

"Guidelines for using HttpClient"

https://learn.microsoft.com/en-us/dotnet/fundamentals/networ...

And this doesn't account for all the gotchas per .NET version, which only us old timers remember to cross check.

colejohnson66 6 hours ago

I didn't mention IHttpClientFactory - just HttpClient. I will concede that ASP manages to be confusing quite often. As for the latter, guidelines are not requirements any more than "RTFM" is; you can use HttpClient without reading the guidelines and be just fine.

pjmlp 6 hours ago

For various outcomes of fine, depending on .NET version, given that not everyone is on very latest.

lmz 8 hours ago

That's just an example. It does have defaults: https://docs.oracle.com/en/java/javase/11/docs/api/java.net.... (search for "If this method is not invoked")

PxldLtd 9 hours ago

Yeah this is all over Rust codebases too for good reason. The argument is that default params obfuscate behaviour and passing in a struct (in Rust) with defaults kneecaps your ability to validate parameters at compile time.

Pay08 7 hours ago

It does have defaults, the above example manually sets everything to show people reading the docs what that looks like.

lenkite 9 hours ago

Your http client setup is over-complicated. You certainly don't need `.proxy` if you are not using a proxy or if you are using the system default proxy, nor do you need `.authenticator` if you are not doing HTTP authentication. Nor do you need `version` since there is already a fallback to HTTP/1.1.

  HttpClient client = HttpClient.newBuilder()
    .followRedirects(Redirect.NORMAL)
    .connectTimeout(Duration.ofSeconds(20))
    .build();

ffsm8 9 hours ago

It was literally just copy pasted from the linked source (the official Oracle docs)

Tostino 8 hours ago

And those docs were likely trying to show you how to use multiple features, not the most basic implementation of it

ffsm8 5 hours ago

I mean, don't get me wrong, I work with Java basically 8 hours per day. I also get _why_ the API is as it is. It essentially boils down to the massive Inversion of Control fetish the Java ecosystem has.

It does enable code that "hides" implementation very well: the quoted example's authentication API lets you authenticate in any way you can imagine, as in literally any way imaginable.

It's incredibly flexible. Want to only be able to send the request out after you've touched a file, sent off a message through a message broker, and then maybe flex by waiting for the response of that async communication and using that as a custom attribute in the payload, in addition to a dynamically negotiated header set according to the response of a DNS query? Yeah, we can do that! And the caller doesn't have to know any of that... at least as long as it works as intended.

Same with the Proxy layer: the client is _entirely_ extensible, which is what Inversion of Control enables.

It just comes with the unfortunate side effect of forcing the dev to be extremely fluent in enterprisey patterns. I don't mind it anymore, myself. The other day I even implemented a custom "dependency injection" inspired system for data in a very dynamic application at my day job. I did that so the caller won't even need to know what data he needs! It just gets automatically resolved through the abstraction. But I strongly suspect that if a junior developer who hasn't gotten used to the Java ecosystem comes across it, he'll be completely out of his depth as to how the grander system works, even though a dev that's used to it will likely understand the system within a couple of moments.

Like everything in software, it has advantages and disadvantages. And Java has just historically always tried to "hide complexity", which in practice, however, paradoxically multiplies complexity _if you're not already used to the pattern used_.

Tostino 4 hours ago

Thanks for the thoughtful response, I appreciate it.

Yeah, I remember the first time I encountered a spring project (well before boot was out) and just about lost my shit with how much magic was happening.

It is productive once you know a whole lot about it though, and I already had to make that investment so might as well reap the rewards.

umvi 11 hours ago

What's wrong with Go's? I've never had any issues with it. Go has some of the best http batteries included of any language

jerf 5 hours ago

Go's net/http Client is built for functionality and complete support of the protocol, including even such corner cases as support for trailer headers: https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/... Which for a lot of people reading this message is probably the first time they've heard of this.

It is not built for convenience. It has no methods for simply posting JSON, or marshaling a JSON response from a body automatically, no "fluent" interface, no automatic method for dealing with querystring parameters in a URL, no direct integration with any particular authentication/authorization scheme (other than Basic Authentication, which is part of the protocol). It only accepts streams for request bodies and only yields streams for response bodies, and while this is absolutely correct for a low-level library and any "request" library that mandates strings with no ability to stream in either direction is objectively wrong, it is a rather nice feature to have available when you know the request or response is going to be small. And so on and so on.

There's a lot of libraries you can grab that will fix this, if you care, everything from clones of the request library, to libraries designed explicitly to handle scraping cases, and so on. And that is in some sense also exactly why the net/http client is designed the way it is. It's designed to be in the standard library, where it can be indefinitely supported because it just reflects the protocol as directly as possible, and whatever whims of fate or fashion roll through the developer community as to the best way to make web requests may be now or in the future, those things can build on the solid foundation of net/http's Request and Response values.

Python is in fact a pretty good demonstration of the risks of trying to go too "high level" in such a client in the standard library.

Orygin 10 hours ago

I guess he never used Fiber's APIs lol

The stdlib may not be the best, but the fact all HTTP libs that matter are compatible with net/http is great for DX and the ecosystem at large.

maccard 8 hours ago

The comment I replied to was talking about sending HTTP requests. Go’s server side net/http is excellent, the client side is clunky, verbose, and suffers from many of the problems that Python’s urllib does.

localuser13 12 hours ago

>Honestly the thing that consistently surprises me is that requests hasn't been standardised and brought into the standard library

Instead, official documentation seems comfortable with recommending a third party package: https://docs.python.org/3/library/urllib.request.html#module...

>The Requests package is recommended for a higher-level HTTP client interface.

Which was fine when requests was the de facto standard and the only player in town, but at some point modern problems (async, http2) required modern solutions (httpx) and thus ecosystem fragmentation began.

Spivak 12 hours ago

Well, the reason for all the fragmentation is because the Python stdlib doesn't have the core building blocks for an async http or http2 client in the way requests could build on urllib.

The h11, h2, httpcore stack is probably the closest thing to what the Python stdlib should look like to end the fragmentation but it would be a huge undertaking for the core devs.

zahlman 4 hours ago

> but it would be a huge undertaking for the core devs.

More importantly, it would be massively breaking to remove the existing functionality (and everyone would ignore a deprecation), and confusing not to (much like it was when 2.x had both "urllib" and "urllib2").

It'd be nice to have something high level in the standard library based on urllib primitives. Offering competition to those, not so much.
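To make that concrete, here is a minimal sketch of a higher-level helper layered on the existing `urllib.request` primitives. `build_json_request` and `send_json` are hypothetical names, and the example.invalid URL is a placeholder; this is an illustration under those assumptions, not a proposal for an actual stdlib API.

```python
import json
import urllib.request
from typing import Any, Mapping, Optional


def build_json_request(
    url: str,
    payload: Any,
    *,
    method: str = "POST",
    headers: Optional[Mapping[str, str]] = None,
) -> urllib.request.Request:
    """Build a urllib Request carrying a JSON-encoded body."""
    merged = {"Content-Type": "application/json", "Accept": "application/json"}
    merged.update(headers or {})
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=merged, method=method)


def send_json(url: str, payload: Any, *, timeout: float = 10.0, **kwargs: Any) -> Any:
    """Send the request and decode a JSON response.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    request = build_json_request(url, payload, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Building the request does not touch the network; only send_json does.
request = build_json_request("https://example.invalid/api", {"name": "demo"})
```

Everything here delegates to `Request` and `urlopen`, so it adds convenience without competing with the existing primitives.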

Kwpolska 12 hours ago

Node now supports the Fetch API.

pjc50 11 hours ago

> dotnet's HttpClient is... fine.

Yes, and it's in the standard library (System namespace). Being Microsoft they've if anything over-featured it.

xnorswap 10 hours ago

It's fine but it's sharp-edged, in that it's recommended to use IHttpClientFactory to avoid the dual problem of socket exhaustion ( if creating/destroying lots of HttpClients ) versus DNS caching outliving DNS ( if using a very long-lived singleton HttpClient ).

And while this article [1] says "It's been around for a while", it was only added in .NET Framework 4.5, which shows it took a while for the API to stabilise. There were other ways to make web requests before that of course, and also part of the standard library, and it's never been "difficult" to do so, but there is a history prior to HttpClient of changing ways to do requests.

For modern dotnet however it's all pretty much a solved problem, and there's only ever been HttpClient and a fairly consistent story of how to use it.

[1] https://learn.microsoft.com/en-us/dotnet/core/extensions/htt...

pixl97 7 hours ago

>"It's been around for a while"

is 14 years not a while?

xnorswap 6 hours ago

It is, but it's also a decade after the language was first released.

Kwpolska 4 hours ago

Python’s urllib2 (now urllib.request) started out in the year 2000 [0].

.NET’s WebRequest was available in .NET Framework 1.1 in 2003 [1].

But since then, Microsoft noticed the issues with WebRequest and came up with HttpClient in 2012. It has some issues and footguns, like those related to HttpClient lifetime, but it’s a solid library. On the other hand, the requests library for Python started in 2011 [2], but the stdlib library hasn’t seen many improvements.

[0] https://github.com/python/cpython/blob/6d7e47b8ea1b8cf82927d...

[1] https://learn.microsoft.com/en-us/dotnet/api/system.net.webr...

[2] https://github.com/psf/requests/blob/main/HISTORY.md#001-201...

gjvc 11 hours ago

requests is some janky layer on top of other janky layers. last thing you want in the stdlib.

it's called the STD lib for a reason...

thedanbob 8 hours ago

> Sending HTTP requests is a basic capability in the modern world, the standard library should include a friendly, fully-featured, battle-tested, async-ready client.

I've noticed that many languages struggle with HTTP in the standard library, even if the rest of the stdlib is great. I think it's just difficult to strike the right balance between "easy to use" and "covers every use case", with most erring (justifiably) toward the latter.

tclancy 10 hours ago

Don't think it's Python-specific, it's humanity-specific and Python happens to be popular so it happens more often/ more publicly in Python packages.

woodruffw 7 hours ago

AFAICT, lacking a (good) standard HTTP library is kind of the norm in popular languages. Python, Ruby, Rust, etc. all either have a lackluster standard one or are missing one. I think it sits between too many decision pressures for most languages: there are a _lot_ of different RFCs both required and implied, lots of different idioms you could pick for making requests, lots of different places to draw the line on what to support, etc.

The notable exception is Go, which has a fantastic one. But Go is pretty notable for having an incredible standard library in general.

Kwpolska 4 hours ago

I thought Rust’s got a very small standard library, only focusing on things that must be in a standard library, mainly primitives or things which require co-operation with the underlying OS (e.g. thread and process management)? That’s completely opposite of Python’s “batteries included” approach.

woodruffw 57 minutes ago

Sure, I'm not making a categorical argument about big vs. small stdlibs. I'm just noting that "a good default HTTP library" is in fact kind of unusual, whether or not the language is batteries-included.

(As an outsider I had the impression that Go's net/http was good, but a lot of people in this thread are complaining about it as well. So it may be 0-4 instead of 1-3).

Pay08 6 hours ago

Is Rust popular? It's popular among HN users, and among certain other bubbles, but can it be called generally popular? Ruby sure can't be.

woodruffw 6 hours ago

It's popular enough to be worth using as a datapoint. What's the point of the question?

Pay08 22 minutes ago

I don't think it is worth using as a datapoint. Webdev is simply not what Rust was made for. It'd be somewhat like PHP having inline assembly.

functionmouse 9 hours ago

Bram's Law: https://files.catbox.moe/qi5ha9.png

Python makes everything so easy.

fsckboy 7 hours ago

converted to text:

I realized this the other day, and dub it Bram's Law -- Bram

Bram's Law

The easier a piece of software is to write, the worse it's implemented in practice. Why? Easy software projects can be done by almost any random person, so they are. It's possible to try to nudge your way into being the standard for an easy thing based on technical merit, but that's rather like trying to become a hollywood star based on talent and hard work. You're much better off trading it all in for a good dose of luck.

This is why HTTP is a mess while transaction engines are rock solid. Almost any programmer can do a mediocre but workable job of extending HTTP, (and boy, have they,) but most people can't write a transaction engine which even functions. The result is that very few transaction engines are written, almost all of them by very good programmers, and the few which aren't up to par tend to be really bad and hardly get used. HTTP, on the other hand, has all kinds of random people hacking on it, as a result of which Python has a 'fully http 1.1 compliant http library' which raises assertion failures during normal operation.

Remember this next time you're cursing some ubiquitous but awful third party library and thinking of writing a replacement. With enough coal, even a large diamond is unlikely to be the first thing picked up. Save your efforts for more difficult problems where you can make a difference. The simple problems will continue to be dealt with incompetently. It sucks, but we'll waste a lot less time if we learn to accept this fact.

matheusmoreira 10 hours ago

Everybody's got a different idea of what it means for a library to be "friendly" and "fully-featured" though. It's probably better to keep the standard library as minimal as possible in order to avoid enshrining bad software. Programming languages could have curated "standard distributions" instead that include all the commonly used "best practice" libraries at the time.

duskdozer 9 hours ago

https://xkcd.com/927/

zahlman 4 hours ago

That isn't really what was proposed, and is an unnecessarily snarky way to respond.

matheusmoreira 8 hours ago

That situation should be avoided. People should have to create their own libraries until everyone empirically converges into a de facto standard that can then be made official.

WhyNotHugo 7 hours ago

httpx has async support (much like aiohttp), whereas urllib is blocking-only. If you need to make N concurrent requests, urllib requires N threads or processes.
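For reference, the thread-based route with urllib looks roughly like the sketch below. `fetch`, `fetch_all` and `demo` are illustrative names, and the local echo server is an assumption added so the example runs offline; each in-flight request occupies one worker thread for its whole duration.

```python
import threading
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def fetch(url, timeout=10.0):
    """Blocking GET with urllib; the calling thread waits for the full body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status, response.read()


def fetch_all(urls, max_workers=8):
    """Run blocking fetches in a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))


class _EchoHandler(BaseHTTPRequestHandler):
    """Throwaway local server that echoes the request path as the body."""

    def do_GET(self):
        body = self.path.encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def demo(n=5):
    """Fetch n URLs concurrently, one worker thread per request."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), _EchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        urls = [f"http://127.0.0.1:{port}/item/{i}" for i in range(n)]
        return fetch_all(urls, max_workers=n)
    finally:
        server.shutdown()
        server.server_close()
```

With `max_workers` below the number of URLs the extra requests simply queue, so the thread count is a ceiling on concurrency rather than a hard requirement of one thread per URL.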

BigTTYGothGF 7 hours ago

I think the python maintainers are still feeling burnt by the consequences of the "batteries included" approach from the old times.

yoyohello13 7 hours ago

Most Python developers these days weren't even programming when the 2 -> 3 split happened. Unless you're referencing something else.

zahlman 4 hours ago

There are quite a few old hands among Python core devs. Certainly the culture of that burnout is in place, if you look at the responses that proposals for new standard library additions get these days. There also seems to be a lot of trauma from the loud complaints about backward compatibility breaks.

I still hear people complain about how such and such removal between "minor versions" of Python 3 (you really should be thinking of them as major versions nowadays — "Python 3 is the brand", the saying goes now), where they were warned like two years in advance about individual functions, supposedly caused a huge problem for them. It's hard for me to reconcile with the rhetoric I've heard in internal discussions; they're so worried in general about possible theoretical compatibility breaks that it seems impossible to change anything.

denimnerd42 7 hours ago

the batteries included approach is the stdlib that can do everything. turns out it’s hard to maintain and make good.

yoyohello13 6 hours ago

Yeah that's true. Go seems to be handling the 'fat stdlib' approach pretty well though. I really don't want Python to go the path of Rust where nothing is included.

denimnerd42 4 hours ago

I feel like Java does it the best. Golang didn't start with generics so it's a bit odd IMO.

LtWorf 11 hours ago

It's easy to implement the basic features of the HTTP protocol, but hard to implement a full version that is also efficient.

I've often ended up reimplementing what I need because the APIs of the famous libraries aren't efficient. In general I'd love to send a million requests all in the same packet and get the replies. No need to wait for the first reply to send the 2nd request and so on. They can all be in the same TCP packet but I have never met a library that lets me do that.

So for example while http3 should be more efficient and faster, since no library I've tried lets me do this, I ended up using HTTP1.1 as usual and being faster as a result.
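Sending several requests back to back without waiting for replies reads like HTTP/1.1 pipelining. A minimal sketch with raw sockets follows; `pipelined_get`, `split_responses` and the local test server are assumptions made for illustration. Whether the bytes actually travel in a single TCP packet is decided by the network stack, not by this code, and many real servers and proxies handle pipelining poorly.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class _PipelineHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # keep the connection open between requests

    def do_GET(self):
        body = self.path.encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def pipelined_get(host, port, paths):
    """Write every GET with one sendall() call, then read until the server closes."""
    if not paths:
        return b""
    parts = []
    for i, path in enumerate(paths):
        # Only the last request asks the server to close the connection.
        connection = "close" if i == len(paths) - 1 else "keep-alive"
        parts.append(
            f"GET {path} HTTP/1.1\r\nHost: {host}:{port}\r\nConnection: {connection}\r\n\r\n"
        )
    with socket.create_connection((host, port), timeout=10.0) as sock:
        sock.sendall("".join(parts).encode("ascii"))
        chunks = []
        while True:
            chunk = sock.recv(65536)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


def split_responses(raw):
    """Split Content-Length framed responses into (status, body) pairs."""
    responses = []
    while raw:
        head, _, rest = raw.partition(b"\r\n\r\n")
        lines = head.split(b"\r\n")
        status = int(lines[0].split()[1])
        length = 0
        for line in lines[1:]:
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        responses.append((status, rest[:length]))
        raw = rest[length:]
    return responses


def demo(paths=("/a", "/b", "/c")):
    """Pipeline the given paths against a throwaway local server."""
    server = HTTPServer(("127.0.0.1", 0), _PipelineHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return split_responses(pipelined_get("127.0.0.1", port, list(paths)))
    finally:
        server.shutdown()
        server.server_close()
```

Responses come back strictly in request order, so one slow response holds up everything behind it; HTTP/2 multiplexing was designed to remove that limitation.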

mesahm 8 hours ago

I spent 3 years developing Niquests, and believe me, HTTP is far from easy. Being a client means you have to speak to everyone, and no one has to speak to you (RFCs are nice, but in practice never applied as-is). Once you go deep into the implementation, you'll find a thousand edge cases(...). And yes, the myth that http/1 is "best" for developers only means that the underlying scheduler is weak. Today, via a dead simple script, you'll see http/2+ beat the established giants in the http/1 client landscape. See https://gist.github.com/Ousret/9e99b07e66eec48ccea5811775ec1... if you are curious.

LtWorf 6 hours ago

I never said I was using asyncio

paulddraper 4 hours ago

Web browsers -- LIKE THE THINGS THAT LIVE AND DIE ON HTTP -- didn't have an ergonomic HTTP API until 2017.

Node.js got its production version in 2023.

Rust doesn't include an HTTP client at all.

Even among stdlibs that have a client, virtually none support HTTP/3, which is used for 30% of web traffic. [1]

HTTP (particularly 2+) is a complex protocol, with no single correct answers for high-level and low-level needs.

[1] https://radar.cloudflare.com/adoption-and-usage

kurtis_reed 7 hours ago

Python doesn't have a big company behind it

swiftcoder 13 hours ago

Somehow I confused httpx with htmlx

g947o 12 hours ago

I guess you mean htmx. Same here. I read the article for a while, and was confused by "HTTPX is a very popular HTTP client for Python." and wondering "why is OpenAI using htmx", until I eventually realized what's going on.

eknkc 13 hours ago

And also htmlx with htmx I guess?

jordiburgos 12 hours ago

I've been reading the whole article wrong too.

croemer 13 hours ago

Same! Only just realized it thanks to your comment.

Tade0 10 hours ago

I thought your comment was starting with "Samuel". Plenty of people on sick leave as of late - must be difficult for many to focus their sight.

joouha 9 hours ago

This sounds like an ideal use case for modshim [0]

One of its intended use cases is bridging contribution gaps: while contributing upstream is ideal, maintainers may be slow to merge contributions for various reasons. Forking in response creates a permanent schism and a significant maintenance burden for what might be a small change. Modshim would allow you to create a new Python package containing only the fixes for your bugbears, while automatically inheriting the rest from upstream httpx.

[0] https://github.com/joouha/modshim

robmccoll 9 hours ago

Since modshim isn't monkey patching and appears to only be wrapping the external API of a package, if the change is deep enough inside the package, wouldn't you end up reimplementing most of the package from the outside?

joouha 7 hours ago

Modshim does more than just wrap the external API of a package - it allows you to tweak something internal to the module while leaving its interface alone, without having to re-implement most of the package in order to re-bind new versions of objects.

There are a couple of examples of this in the readme: (1) modifying the TextWrapper object but then using it through the textwrap library's wrap() function, and (2) modifying the requests Session object, but then just using the standard requests.get(). Without modshim (using standard monkey-patching) you would have to re-implement the wrap and get methods in order to bind the new TextWrapper / Session classes.

nathell 13 hours ago

Congratulations on forking!

Always remember that open-source is an author’s gift to the world, and the author doesn’t owe anything to anyone. Thus, if you need a feature that for whatever reason can’t or won’t go upstream, forking is just about the only viable option. Fingers crossed!

cachius 12 hours ago

This is not merely open-source, but taking part in a huge package ecosystem in a foundational role in an XKCD 2347 type of way for HTTP requests.

Put your side project on your personal homepage and walk away - fine.

Make it central infrastructure - respond to participants or extend or cede maintainership.

troad 11 hours ago

If "taking part in a huge ecosystem in a foundational role" means 'other people choosing to use your FOSS software', and I can't think of what else it would mean, then no, you have no obligation to do any of that.

FOSS means the right to use and fork. That's all it means. That's all it ever meant. Any social expectations beyond that live entirely in your imagination.

nathell 8 hours ago

No. Even if it’s a central piece of infrastructure, any and all maintainership effort is still a token of good will of the maintainer – and needs to be appreciated, rather than expected.

If you need stronger guarantees, pay someone to deliver them.

Yokohiii 11 hours ago

I guess frustration speaks here?

There is simply no responsibility an OSS maintainer has. They can choose to be responsible, but no one can force them. Eventually OSS licensing is THE solution at heart to solve this problem. Maintainers go rogue? Fork and move on. But surprise, who is going to fork AND maintain? Filling in all the demands from the community, for potentially no benefit?

No one can force him to take the responsibility, just like no one can force anyone else to.

cachius 9 hours ago

Right, frustration about the no strings attached sentiment for OSS devs. Of course you've no obligations for support or maintenance, but with increasing exposure responsibility grows as de facto ever more projects, people, softwares depend on you.

This doesn't come overnight, and this is a spectrum and a choice. From a purely personal side project, through an exotic Debian package, to friggin httpx with 15k GitHub stars and 100 million downloads a week, the 46th most downloaded PyPI package!

If this is to work reasonably in any way, you have to step up. Take money (as they do, https://github.com/sponsors/encode), look for fellow maintainers, or cede involvement - even if only temporarily.

An example of a recent, successful transition is UniGetUI https://github.com/Devolutions/UniGetUI/discussions/4444

I feel there should be support from the ecosystem to help with that. OpenJS Foundation seems to be doing great: https://openjsf.org/projects. The Python Software Foundation could not only host PyPI but offer assistance for the most important packages.

troad 9 hours ago

>> Of course you've no obligations for support or maintenance, but with increasing exposure responsibility grows as de facto ever more projects, people, softwares depend on you.

This is an oxymoron. Either you have obligations, or you don't. There's no such thing as having "no obligations" but also "growing responsibility".

I don't understand how you can possibly conclude that just because you've chosen to become dependent on some FOSS library, they owe you anything. You don't get to somehow impose obligations on other people by your choices. They get none of your profits, but they're somehow responsible to you for your business risks? Nonsense.

It is a condition of your use of the code that you've accepted its license, and FOSS licenses are CRYSTAL CLEAR (ALL CAPS) on what obligations or responsibilities the authors have towards you - none whatsoever. Your use of the software is contingent on your acceptance of that license.

If that lack of warranty poses an unacceptable business risk to you, go buy support. Pay a dev to fix the issues you're having, rather than inventing some fictitious responsibility they have to you to do it for free.

Yokohiii 53 minutes ago

Yeah. The previous poster points out ways a maintainer could get resources (money, support, etc). Maintainers may be exhausted or overwhelmed by the (imposed) responsibility / work. Actively acquiring those resources would just push that over the edge.

There is also the possibility that a maintainer simply doesn't care about what the community wants, it's his baby and he can do what he wants.

Forking a project is built-in by licensing. A lot of complaints, but those complainers don't fork. Why is that? Yeah right.

Side Note: Transferring projects to foundations etc with funding may be a solution for projects that are highly depended on and require active, reliable maintenance. They won't work well for innovation or experimentation. Just saying they are only a part of the equation and not the sole solution.

duskdozer 11 hours ago

A foundational role in a huge open-source package ecosystem? I wonder what such an esteemed position pays.

Yokohiii 11 hours ago

A (hypothetical) professional proprietary project at the same scale would probably feed a handful of people, with much less stress. The FOSS version is zero cash and exaggerated community demands. Dream job.

sdovan1 13 hours ago

I guess the Discussion on Hacker News href should be "https://news.ycombinator.com/item?id=47514603" instead of "news.ycombinator.com/item?id=47514603"

glaucon 13 hours ago

Good line from the blog post ...

"So what is the plan now?" - "Move a little faster and not break things"

mettamage 13 hours ago

> Visitor 4209 since we started counting

Loved that little detail, reminds me of the old interwebs :)

croemer 12 hours ago

It's gone from 45 when I looked at it an hour ago to 261 just now.

zeeshana07x 12 hours ago

The lack of a well-maintained async HTTP client in Python's stdlib has been a pain point for a while. Makes sense someone eventually took it into their own hands

WhyNotHugo 7 hours ago

An async HTTP client in the stdlib would also be great for tools like pip, which could really benefit from doing more async work. One of the reasons that uv is much faster is precisely this.

notatallshaw 5 hours ago

As a pip maintainer I don't think that's really true. The resolvers in both pip and uv are fundamentally sequential and single threaded, you can't really queue up or split out jobs.

What uv does is parallelize the final download of packages after resolution, and batch pre-fetch metadata during resolution. I don't think these benefit from async; due to their batch nature, classic multi-threaded download pools are probably the better solution, but I could be wrong!

Experiments have been done on the former in pip and didn't find much/any improvement in CPython, this may change in free threaded CPython. For the latter we currently don't have the information from the resolver to extract a range of possible metadata versions we could pre-range, I am working on this but it requires new APIs in packaging (the Python library) and changes to the resolver, and again we will need to benchmark to see if adding pre-fetching actually improves things.

localuser13 12 hours ago

I'm not a lawyer, but are there any potential trademark issues? AFAIK in general you HAVE to change the name to something clearly different. I consider it morally OK, and it's probably fine, but HTTPXYZ is cutting it close. It's too late for a rebrand, but IMO open-source people often ignore this topic a bit too much.

CorrectHorseBat 12 hours ago

Don't you need to register and actively defend your trademark for it to apply?

sushibowl 7 hours ago

There are unregistered trademarks as well as registered ones. Usually the "TM" symbol is applied to unregistered trademarks, and the ® symbol for registered ones. Both enjoy protection, although it's generally an easier time in court when your trademark is registered.

Whether actively defending your trademark is actually required is a bit of a nuanced topic. Generally, trademarks can be lost through genericide (the mark becomes a generic term for the type of product) or abandonment. Abandonment happens when either the mark owner stops using the mark itself, or takes an action that weakens the mark. The question, then, is whether failing to defend infringing use constitutes a weakening action. Courts differ on this, and there is a large gray area between "we didn't immediately sue a local mom-and-pop shop" and "we allowed a rival company to use the mark erroneously across several states for years without taking action."

nwellnhof 7 hours ago

In this case, the name is already so generic that you might even be denied a trademark in the first place.

ahoka 12 hours ago

I don't think HTTPX is a registered trademark.

Gander5739 12 hours ago

Is httpx trademarked? I couldn't find anything indicating it was.

IshKebab 12 hours ago

He would probably win in a legal case, but is he actually going to take it to court? I doubt it. Also I wouldn't be too offended about the name if I were him and for users it's better because it makes the link clearer.

I think if he had named it HTTPX2 or HTTPY, that would be much worse because it asserts superiority without earning it. But he didn't.

cachius 12 hours ago

Another abandoned project hurting users: https://github.com/benweet/stackedit

Spivak 12 hours ago

Do you see yourself taking over httpcore as well as it's likely to have the same maintainership problem? It would certainly instill more confidence that this is a serious fork.

This certainly wouldn't be the first time an author of a popular library got a little too distracted on the sequel to their library that the current users are left to languish a bit.

cies 13 hours ago

Hi Michiel!

Just a small heads-up: clicking on the Leiden Python link in your About Me page does not give the expected results.

And a small nitpick: it's "Michiel's" in English (where it's "Michiels" in Dutch).

Thanks for devoting time to opensource... <3

roywashere 10 hours ago

thanks, I hope I fixed the https://pythonleiden.nl website now

globular-toast 13 hours ago

It's a shame, httpx has so much potential to be the default Python http library. It's crazy that there isn't one really. I contributed some patches to the project some years ago now and it was a nice and friendly process. I was expecting a v1 release imminently. It looks like the author is having some issues which seem to afflict so many in this field for some reason. I notice they've changed their name since I last interacted with the project...

WesolyKubeczek 9 hours ago

You try to touch low level HTTP with Python, and once you dive into both RFC2616 and Python deep enough, your brain is cooked, basically. Look at what happened to the author of requests, a textbook example.

Or maybe it is that your brain is cooked already, or is on the brink, and your condition attracts you to HTTP and Python, after which it basically has you.

The only way to not go bonkers is to design a library by committee, so that the disease spreads evenly and doesn't hit any one individual with full force. The result will be ugly, but hopefully free of drama.

eats_indigo 12 hours ago

smells like supply chain attack

souvlakius 8 hours ago

Yeah, it's a shame because otherwise the library is really nice and could have become the default HTTP library, but it feels like someone will manage to inject some weird behaviour soon and half the planet will be compromised