also:
Parse, Don’t Validate – Some C Safety Tips - https://news.ycombinator.com/item?id=44507405 - July 2025 (73 comments)
Parse, Don't Validate (2019) - https://news.ycombinator.com/item?id=41031585 - July 2024 (102 comments)
Parse, don't validate (2019) - https://news.ycombinator.com/item?id=35053118 - March 2023 (219 comments)
Parse, Don't Validate (2019) - https://news.ycombinator.com/item?id=27639890 - June 2021 (270 comments)
Parsix: Parse Don't Validate - https://news.ycombinator.com/item?id=27166162 - May 2021 (107 comments)
Parse, Don’t Validate - https://news.ycombinator.com/item?id=21476261 - Nov 2019 (230 comments)
Parse, Don't Validate - https://news.ycombinator.com/item?id=21471753 - Nov 2019 (4 comments)
(p.s. these links are just to satisfy extra-curious readers - no criticism is intended! I add this because people sometimes assume otherwise)
It is a handy way to prevent divide by zero as in the article, or to have fun with lambda calculus by asking the type system if 3 + 4 == 8. You can reason about the full range of inputs. Same for file format parsing - making as many failure modes as possible fail as early as possible!
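For concreteness, here is a minimal sketch of that divide-by-zero case, in Haskell (the language of Alexis King's original essay); NonZero, mkNonZero and divideBy are illustrative names, not from the article under discussion:
```
-- The zero check happens exactly once, in the smart constructor;
-- the division itself is then total.
newtype NonZero = NonZero Double

mkNonZero :: Double -> Maybe NonZero
mkNonZero 0 = Nothing
mkNonZero x = Just (NonZero x)

divideBy :: Double -> NonZero -> Double
divideBy x (NonZero d) = x / d   -- d can never be zero here
```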
But be VERY wary of using them to represent business logic or state machines that allow only the transitions you believe can exist at this point in time. You just don’t know what wild things people will want to do in business logic, and if your software can’t handle those scenarios, people will just work around it and your stuff no longer matches reality.
get_elem_at_index(array, index)
cannot ever be called with an index outside the bounds of the array, and this is checked statically at compile time - and this is the key - without knowing a priori what the length of the array is.
"In Idris, a length-indexed vector is Vect n a (the length n is in the type), and a valid index into length n is Fin n ('a natural number strictly less than n')."
Similar tricks work for division that might result in inf/-inf, preventing such expressions from typechecking, and there are more subtle applications in e.g. higher-order types and functions.
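A rough sketch of those two types, using Haskell GADTs as a stand-in for Idris (the names mirror the Idris ones; this is illustrative, not code from the thread):
```
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- Length-indexed vectors and bounded indices.
data Nat = Z | S Nat

data Vect (n :: Nat) a where
  VNil  :: Vect 'Z a
  VCons :: a -> Vect n a -> Vect ('S n) a

-- Fin n: a natural number strictly less than n.
data Fin (n :: Nat) where
  FZ :: Fin ('S n)
  FS :: Fin n -> Fin ('S n)

-- The lookup is total: an out-of-bounds index cannot even be written.
index :: Fin n -> Vect n a -> a
index FZ     (VCons x _)  = x
index (FS k) (VCons _ xs) = index k xs
```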
Which refers to https://docs.rs/anodized/latest/anodized/
Dependent typing requires:
- Generic types that can take runtime values as parameters, e.g. [u8; user_input]
- Functions where the type of one parameter depends on the runtime value of another parameter, e.g. fn f(len: usize, data: [u8; len])
- Structs/tuples where the type of one field depends on the runtime value of another field, e.g. struct Vec { len: usize, data: [u8; self.len] }
Type-Driven Development with Idris[1] is a great introduction for dependently typed languages and covers methods such as these if you're interested (and Edwin Brady is a great teacher).
[1] https://www.manning.com/books/type-driven-development-with-i...
Maybe Int
So your program splits into two branches:
1. Nothing branch: you failed to obtain an Int.
There is no integer to use as an index, so you can’t even attempt a safe lookup into something like Vect n a.
2. Just i branch: you do have an Int called i.
But an Int is not automatically a valid index for Vect n a, because vectors are indexed by Fin n (a proof carrying “bounded natural”).
So inside the Just i branch, you refine further:
3. Try to turn the runtime integer i into a value of type Fin n.
There are two typical shapes of this step:
* Checked conversion returning Maybe (Fin n)
If the integer is in range, you get Just (fin : Fin n). Otherwise Nothing.
* Checked conversion returning evidence (proof) that it’s in range
For example: produce k : Nat plus a proof like k < n (or LTE (S k) n), and then you can construct Fin n from that evidence.
(But it’s basically the same: you end up with a “Maybe LTE…”.)
Now if you also have a vector: xs : Vect n a
… the n in Fin n and the n in Vect n a are the same n (that’s what “unifies” means here: the types line up), so you can do: index fin xs : a
And crucially:
there is no branch in which you can call index without having constructed the Fin n first, so out-of-bounds access is unrepresentable (it’s not “checked later”, it’s “cannot be expressed”).
And within _that_ branch of the program, you have a proof of Fin n.
Said differently: you don’t get “compile-time knowledge of i”; you get a compile-time guarantee that whatever value you ended up with satisfies a predicate.
Concretely: you run a runtime check i < n. _ONCE_
If it fails, you’re in a branch where you do not have Fin n.
If it succeeds, you construct fin : Fin n at runtime (it’s a value, you can’t get around that), but its type encodes the invariant “in bounds”/check was done somewhere in this branch.
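Assuming the Vect, Fin and index definitions from the sketch above, the branching just described might look like this (toFin and safeLookup are made-up helper names):
```
-- (uses the Vect, Fin and index definitions sketched earlier)

-- The single runtime check: refine an Int into a Fin n, or fail.
toFin :: Int -> Vect n a -> Maybe (Fin n)
toFin _ VNil = Nothing
toFin i (VCons _ rest)
  | i < 0     = Nothing
  | i == 0    = Just FZ
  | otherwise = FS <$> toFin (i - 1) rest

safeLookup :: Maybe Int -> Vect n a -> Maybe a
safeLookup mi xs = do
  i   <- mi            -- Nothing branch: no integer at all
  fin <- toFin i xs    -- Just branch: in range, or Nothing
  pure (index fin xs)  -- only reachable with a Fin n in hand
```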
Like how Clojure basically uses maps everywhere and the whole standard library allows you to manipulate them in various ways.
The main problem with the many-types approach is that you end up with several slightly different but similar types, all incompatible with each other.
The way I've thought about it, though, is that it's possible to design a program well either by encoding your important invariants in your types or in your functions (especially simple functions). In dynamically typed languages like Clojure, my experience is that there's a set of design practices that have a lot of the same effects as "Parse, Don't Validate" without statically enforced types. And, ultimately, it's a question of mindset which style you prefer.
The real world often changes though, and more often than not the code has to adapt, regardless of how elegantly our systems are designed.
Start with a more dynamic type, do stuff that doesn't care about the shape, parse into a more precise type, do stuff that relies on the additional invariants, drop back into the more dynamic type again.
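A minimal Haskell sketch of that flow, with a plain list of strings as the "dynamic" shape and NonEmpty as the precise one (the pipeline itself is invented for illustration):
```
import           Data.List.NonEmpty (nonEmpty)
import qualified Data.List.NonEmpty as NE
import           Text.Read          (readMaybe)

pipeline :: [String] -> Maybe [String]
pipeline raw = do
  nums <- traverse readMaybe raw :: Maybe [Int]  -- shape-agnostic step
  ne   <- nonEmpty nums                          -- parse into the precise type
  let top = maximum ne                           -- total: NonEmpty is never empty
  pure (show top : map show (NE.toList ne))      -- drop back to the dynamic shape
```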
https://www.stroustrup.com/Concept-based-GP.pdf
{
Number<unsigned int> ii = 0;
Number<char> cc = '0';
ii = 2; // OK
ii = -2; // throws
cc = i; // OK if i is within cc’s range
cc = -17; // OK if char is signed; otherwise throws
cc = 1234; // throws if a char is 8 bits
}
If we take their "parse" approach, then the types of the arguments a, b, and c have to somehow encode the constraint `b^2 - 4ac >= 0`. This would be a total mess--I can't think of any clean way to do this in Rust. It makes _much_ more sense to simply return an Option and do the validation within the function.
In general, I think validation is often the best way to solve the problem. The only counterexample, which the author fixates on in the post, is when one particular value is constrained in a clean, statically verifiable way. Most of the time, validation is used to check (possibly complex) interactions between multiple values, and "parsing" isn't at all convenient.
Sure, we can follow the advice of creating types that represent only valid states, but then we end up with `fn(a: A, b: B, c: C)` transformed into `fn(abc: ValidABC)`.
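A hedged sketch of what that `ValidABC` shape could look like in Haskell (ValidABC, mkValidABC and realRoots are hypothetical names; the smart constructor does the discriminant check once, and also rules out a = 0 so the division below is safe):
```
data ValidABC = ValidABC { va :: Double, vb :: Double, vc :: Double }

mkValidABC :: Double -> Double -> Double -> Maybe ValidABC
mkValidABC a b c
  | a /= 0 && b * b - 4 * a * c >= 0 = Just (ValidABC a b c)
  | otherwise                        = Nothing

-- Can assume the invariant: real roots exist and 2a is nonzero.
realRoots :: ValidABC -> (Double, Double)
realRoots (ValidABC a b c) =
  let d = sqrt (b * b - 4 * a * c)
  in ((-b + d) / (2 * a), (-b - d) / (2 * a))
```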
Instead of validating visual outputs after the fact (like linting CSS or manual design reviews), you parse the constraints upfront. If a layout component is strictly typed to only accept discrete grid multiples, an arbitrary 13px margin becomes a compile error, not a subjective design debate. It forces determinism.
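For illustration, a tiny Haskell sketch of that idea (Margin and card are made-up names): if margins are an enumerated type of grid multiples, a stray 13px value simply has no representation:
```
-- The only margins that exist in the design system.
data Margin = M0 | M4 | M8 | M16 | M32

px :: Margin -> Int
px M0  = 0
px M4  = 4
px M8  = 8
px M16 = 16
px M32 = 32

card :: Margin -> String
card m = "<div style=\"margin:" ++ show (px m) ++ "px\">...</div>"
-- card 13 does not typecheck: there is no Margin for 13px.
```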
```
impl Add for NonZeroF32 { ... }
impl Add<f32> for NonZeroF32 { ... }
impl Add<NonZeroF32> for f32 { ... }
```
What type would it return though?
Generally yes. `NonZeroU32::saturating_add(self, other: u32)` is able to return `NonZeroU32` though! ( https://doc.rust-lang.org/std/num/type.NonZeroU32.html#metho... )
> I cannot think of any way to enforce "non-zero-ness" of the result without making it return an optional Result<NonZeroF32>, and at that point we are basically back to square one...
`NonZeroU32::checked_add(self, other: u32)` basically does this, although I'll note it returns an `Option` instead of a `Result` ( https://doc.rust-lang.org/std/num/type.NonZeroU32.html#metho... ), leaving you to `.ok_or(...)` it into a `Result` or otherwise handle the edge case to your heart's content. Niche, but occasionally what you want.
I was confused at first how that could work, but then I realized that of course, with _unsigned_ integers this works fine because you cannot add a negative number...
I think the article would have been better with NonZeroPositiveF32 as the example type, since then addition would be safe.
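A sketch of that suggestion in Haskell (PositiveDouble, mkPositive and addPos are hypothetical names): a strictly positive wrapper is closed under addition, so no Option/Maybe is needed for the sum:
```
newtype PositiveDouble = PositiveDouble Double

mkPositive :: Double -> Maybe PositiveDouble
mkPositive x
  | x > 0     = Just (PositiveDouble x)   -- the one runtime check
  | otherwise = Nothing

addPos :: PositiveDouble -> PositiveDouble -> PositiveDouble
addPos (PositiveDouble a) (PositiveDouble b) = PositiveDouble (a + b)
-- positive + positive stays positive (it may reach +Infinity, but never 0)
```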
To me, invalid values are best expressed with optional error returns along with the value that are part of the function signature. Types are best used to only encode information about the hierarchy of structures composed of primitive types. They help define and navigate the representation of composite things as opposed to just having dynamic nested maps of arbitrary strings.
What would you say to someone who thinks that nested maps of arbitrary strings have maximum compatibility, and using types forces others to make cumbersome type conversions?
edit: To put it differently: To possibly be compatible with the nested "Circle" map, you need to know it is supposed to have a "Radius" key that is supposed to be a float. Type definitions just make this explicit. But just because your "Radius" can't be 0, you shouldn't make it incompatible with everything else operating on floats in general.
I would just (as a default; the situation varies)... validate prior to the division and handle as appropriate.
The analogous situation I encounter frequently is indexing, e.g. checking whether the index is out of bounds. Similar idea: check; print or display an error, then fail that computation without crashing the program. Usually it's an indication of some bug, which can be tracked down. Or, if it's an array that's frequently indexed, use a `get` method (canonical in Rust's core) on whatever struct owns the array. It returns an Option.
I do think either the article's approach, or validating is better than runtime crashes! There are many patterns in programming. Using Types in this way is something I see a lot of in OSS rust, but it is not my cup of tea. Not heinous in this case, but I think not worth it.
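For comparison, a small Haskell sketch of this validate-and-handle style (safeDiv, safeIndex and process are illustrative names): check, report, and fail just that computation without crashing:
```
safeDiv :: Double -> Double -> Maybe Double
safeDiv _ 0 = Nothing
safeDiv x y = Just (x / y)

safeIndex :: [a] -> Int -> Maybe a
safeIndex xs i
  | i >= 0 && i < length xs = Just (xs !! i)
  | otherwise               = Nothing

-- Check, report the problem, and carry on.
process :: [Double] -> Int -> IO ()
process xs i =
  case safeIndex xs i >>= safeDiv 100 of
    Just r  -> print r
    Nothing -> putStrLn ("bad index or zero divisor at position " ++ show i)
```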
This is the key to this article's philosophy, near the bottom:
> I love creating more types. Five million types for everyone please.
(This isn’t to say it’s always wrong, but that having it be an error state by default seems very reasonable to me.)
Ah, that pretentious little blog post that was taken up by many as gospel, for some mysterious reason...
This tells me that any idea, even a stupid (or, in this case, obvious) one, can go mainstream if you rhyme it right.
That blog post is like a "modern art" painting by some famous artist that looks like a child flung some paint at a wall, but every person who looks at it tries so hard to find some meaning that everyone ends up finding "something"..
In fact, the top comment right now expresses an interpretation of the blog post that differs from the author's...
Amusing!
Alexis King, the author of the original "Parse, Don't Validate" article, also published a follow-up, "Names are not type safety" [0], clarifying that the "newtype" pattern (such as hiding a nonzero integer in a wrapper type) provides weaker guarantees than correctness by construction. Her original "Parse, Don't Validate" article also includes the following caveat:
> Use abstract datatypes to make validators “look like” parsers. Sometimes, making an illegal state truly unrepresentable is just plain impractical given the tools Haskell provides, such as ensuring an integer is in a particular range. In that case, use an abstract newtype with a smart constructor to “fake” a parser from a validator.
So, an abstract data type that protects its inner data is really a "validator" that tries to resemble a "parser" in cases where the type system itself cannot encode the invariant.
The article's second example, the non-empty vec, is a better example, because it encodes within the type system the invariant that one element must exist. The crux of Alexis King's article is that programs should be structured so that functions return data types designed to be correct by construction, akin to a parser transforming less-structured data into more-structured data.
[0] https://lexi-lambda.github.io/blog/2020/11/01/names-are-not-...
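A small Haskell sketch of that caveat (a hypothetical Percentage module): the newtype constructor stays private, so the range check in the smart constructor cannot be bypassed:
```
module Percentage (Percentage, mkPercentage, getPercentage) where

-- Abstract newtype: the constructor is not exported.
newtype Percentage = Percentage Int

-- The "validator dressed up as a parser": the range check happens once, here.
mkPercentage :: Int -> Maybe Percentage
mkPercentage n
  | n >= 0 && n <= 100 = Just (Percentage n)
  | otherwise          = Nothing

getPercentage :: Percentage -> Int
getPercentage (Percentage n) = n
```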
For full-on parse-don't-validate, you essentially need a dependent type system. As a more light-weight partial solution, Rust has been prototyping pattern types, which are types constrained by patterns. For instance a range-restricted integer type could be simply spelled `i8 is 0..100`, or a nonempty slice as `[T] is [_, ..]`. Such a feature would certainly make correctness-by-construction easier in many cases.
The non-empty list implemented as a (T, Vec<T>) is, btw, a nice example of the clash between practicality and theoretical purity. It can't offer you a slice (consecutive view) of its elements without storing the first element twice (which requires immutability and that T: Clone, unlike normal Vec<T>), which makes it fairly useless as a vector. It's okay if you consider it just an abstract list with a more restricted interface.
Scala 3 is much more mainstream and has path-dependent types. I've only used Scala 2, and there the boilerplate for dependent types was frustrating imo, but I've heard it's better in 3.
Scala 3 is indeed more mainstream, but it also seems to be on the way out. At least here in the corporate world it is being replaced by Kotlin and Java 21+ for a large part.
[0] https://geeklaunch.io/blog/make-invalid-states-unrepresentab...
[1] https://www.youtube.com/watch?v=2JB1_e5wZmU
EDIT: Parent comment was edited to amend the "impossible/unrepresentable" wording
The invariant is less obvious, but you could still do this if you really wanted. The observation is that addition of two fixed-size nonnegative integers cannot overflow unless at least one is greater than half the range. So your `non_overflowing_addition` function would need take two inputs of type `NonNegativeLessThanHalfMaxValue`, where the constructor for the latter type enforces the invariant. Multiplication is similar, but with the square root of the range (and I suppose `NonNegativeLessThanSqrtMaxValue` could be a subtype of `NonNegativeLessThanHalfMaxValue` if you want to be fancy).
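A hedged Haskell sketch of that construction, using Word16 just to keep the numbers small (HalfRange, mkHalfRange and nonOverflowingAdd are made-up names):
```
import Data.Word (Word16)

-- Invariant: the wrapped value is at most maxBound `div` 2 (32767 for Word16).
newtype HalfRange = HalfRange Word16

mkHalfRange :: Word16 -> Maybe HalfRange
mkHalfRange x
  | x <= maxBound `div` 2 = Just (HalfRange x)
  | otherwise             = Nothing

-- Total: 32767 + 32767 = 65534 <= 65535, so wrap-around is impossible.
nonOverflowingAdd :: HalfRange -> HalfRange -> Word16
nonOverflowingAdd (HalfRange a) (HalfRange b) = a + b
```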
Your intention of having the restricted domain "NonNegativeLessThanHalfMaxValue" is probably so that both addends have the same domain. If you go down that route, perhaps you'll also want the closure property, meaning that the range should also belong to the same set. However, then you have to deal with the overflow problem all over again...
The real point is that when adding two N-bit integers, the range must be N+1 bits, because the "carry bit" is also part of the output. I think this is a scenario where "Parse, Don't Validate" can't easily help, because the validity of the addition is intrinsically a function of both inputs together.
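A short Haskell sketch of that observation: widening the result type to N+1 (or more) bits removes the failure mode entirely, with no parsing or validating required:
```
import Data.Word (Word32, Word64)

-- The largest possible sum is 2 * (2^32 - 1) = 2^33 - 2, which always fits in 64 bits.
wideAdd :: Word32 -> Word32 -> Word64
wideAdd a b = fromIntegral a + fromIntegral b
```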