And that's why I think the future of the software industry is data-driven, and we will end up with another GNU-like movement around free and open data models/schemas. I think we already have a good starting point: Linked Data[1] and schema.org[2]
[1]: https://www.w3.org/wiki/LinkedData
[2]: https://schema.org/
Moreover, the science folks are not a picky bunch; they tend to use whatever works well, whether it be CSV or XML. As long as there's tooling and documentation, everything is acceptable, which is something I like.
I've used this schema to merge AWS, GCP, and Azure into one unified cloud bill, which unlocks a ton of insight into where the money is going inside those bills.
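For a sense of what that looks like, here is a minimal sketch of the kind of unified table I mean; the table and column names are hypothetical, not the actual schema:

    -- Hypothetical unified billing table: each provider's export is
    -- mapped into these common columns before loading.
    CREATE TABLE cloud_cost (
        provider    text NOT NULL,        -- 'aws' | 'gcp' | 'azure'
        account_id  text NOT NULL,        -- provider-native account/project id
        service     text NOT NULL,        -- normalized service name
        usage_start timestamptz NOT NULL,
        cost_usd    numeric NOT NULL
    );

    -- One query now answers "where is the money going?" across all three:
    SELECT provider, service, sum(cost_usd) AS total
    FROM cloud_cost
    GROUP BY provider, service
    ORDER BY total DESC;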
A primitive lets you create a shared language and ritual ("tweet"), compound advantages with every feature built on top, and lock in switching costs without ever saying the quiet part out loud.
The article is right that nearly every breakout startup manages to land a new one.
> These aspects of domain-driven design aim to foster a common language shared by domain experts, users, and developers—the ubiquitous language. The ubiquitous language is used in the domain model and for describing system requirements.
0 - https://en.wikipedia.org/wiki/Domain-driven_design#Overview

The benefit of having the logic and the data combined like this is difficult to overstate. It makes working in complex environments much easier. For example, instead of 10 different web apps each implementing their own session validation logic in some kind of SSO arrangement, we could have them call a procedure on the SQL box. Everyone would then be using the same centralized business logic for session validation. Any bugs in the implementation can be fixed in real time without rebuilding any of the downstream consumers.
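A minimal sketch of what that centralized check could look like in Postgres; the table and function names are invented for illustration:

    -- Hypothetical shared session table plus a single validation routine.
    CREATE TABLE sessions (
        token      uuid PRIMARY KEY,
        user_id    bigint NOT NULL,
        expires_at timestamptz NOT NULL
    );

    CREATE FUNCTION validate_session(p_token uuid)
    RETURNS bigint                 -- the user id, or NULL if invalid/expired
    LANGUAGE sql STABLE AS $$
        SELECT user_id
        FROM sessions
        WHERE token = p_token
          AND expires_at > now();
    $$;

Each of the 10 apps would just run SELECT validate_session($1), and fixing a bug in the logic is one CREATE OR REPLACE FUNCTION, with no redeploys downstream.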
I have seen business rules as stored procedures lock a business into their current model across a dozen teams, effectively making system improvements impossible. And because they needed some OLAP atop OLTP in some cases, their very beefy Postgres solution crawled down to a max of 2k queries per second. I worked with them for over a year trying to pull apart domain boundaries and unlock teams from one another. Shared stored procedures were a major pole in the tent of things making scaling the technical org incredibly hard.
Repeat after me: uncoupled teams are faster.
I'm building Typegres to give you both. It lets you (a) write complex business logic in TypeScript using a type-safe mapping of Postgres's 3000+ functions and (b) compiles it all down to a single SQL query.
Easier to show than tell: https://typegres.com/play
> Your product's core abstractions determine whether new features compound into a moat or just add to a feature list.
Which is captured by the Domain Model[0]. How it is managed in a persistent store is where a data model comes into play. See also Domain Driven Design[1].
> It is basically trying to innovate a new way to think about in an existing domain ...
Note that a domain is not the same as a domain model.
> "Your data model is your destiny."
This is why I consider the article a "near miss." If the quote from the post were instead "your domain model is your destiny", the subsequent quoted statements would need no alteration, and they would substantiate the topic at hand: domain modeling and the organizational value found therein.
0 - https://www.merriam-webster.com/dictionary/term%20of%20art
Innovating, evolving, creating, and capturing new domain concepts to create Blue Ocean solutions inside and outside the Enterprise. Iterating on core concepts via discussions and designs led by (or involving) subject matter experts, and using new concepts to better articulate the domain. Managing that change over time, and accounting for ontological and taxonomical overlap versus Enterprise System development needs.
That's the foundation from which you can actively copy insights, and it doesn't rely on Immaculate Specification or premature data modelling. No need to start over, thanks to clearly separated concerns.
Note: copying an insight is a far cry from having the wherewithal to make that insight in the first place; there are numerous downstream benefits to articulating your business domain clearly and early.
Where it blurs things: data model != UX strategy != business model, and success isn't only about a novel model; execution and distribution still matter greatly.
My takeaway: read "data model" here as "core conceptual model", and ask whether your product has a clear center that lets new features inherit context instead of becoming one-offs.
This underlying choice of data model actually does define your destiny. What I think the author was thinking of is domain modelling and correct entity identification, which is also important. It's a layered approach - and if you ignore the foundations (the actual data model), you hit limitations higher up.
For example, in real-time AI systems, you might want users to provide a single value (like an order number) to retrieve precomputed features for a model. With Snowflake Schema data models, it works. But for Star Schema data models, you have to provide entity IDs for all tables containing precomputed features - which leads to big problems (the need for a mapping table, a new pipeline, and higher latency).
Reference: https://www.hopsworks.ai/post/the-journey-from-star-schema-t...
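To make the difference concrete, here is a hedged sketch with invented tables: in the snowflake layout the normalized dimensions reference each other, so the single order number reaches every precomputed feature through joins:

    -- Snowflake: dimensions chain together, so one id is enough.
    CREATE TABLE regions   (region_id bigint PRIMARY KEY,
                            avg_delivery_days numeric);     -- precomputed feature
    CREATE TABLE customers (customer_id bigint PRIMARY KEY,
                            region_id bigint REFERENCES regions,
                            credit_score int);              -- precomputed feature
    CREATE TABLE orders    (order_id bigint PRIMARY KEY,
                            customer_id bigint REFERENCES customers);

    SELECT o.order_id, c.credit_score, r.avg_delivery_days
    FROM orders    o
    JOIN customers c ON c.customer_id = o.customer_id
    JOIN regions   r ON r.region_id   = c.region_id
    WHERE o.order_id = $1;

In a star layout, each feature table hangs directly off the fact table and is keyed by its own entity id, so the caller must supply customer_id and region_id as well, or you add the mapping table and extra pipeline described above.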
Notion's idea of having blocks that can be shared between documents is very cool, but that has an obvious access control problem. What access do you need to see a document with information from multiple other documents? It was easy to find confused users: https://www.reddit.com/r/Notion/comments/12iqsc2/synced_bloc...
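A toy version of the model shows where it bites; all names here are made up:

    -- A block can be synced into many pages.
    CREATE TABLE blocks (
        block_id bigint PRIMARY KEY,
        content  text NOT NULL
    );
    CREATE TABLE page_blocks (
        page_id  bigint NOT NULL,
        block_id bigint NOT NULL REFERENCES blocks,
        PRIMARY KEY (page_id, block_id)
    );
    -- If block 7 is synced into pages A and B, whose permissions apply?
    -- A reader of A sees content written under B's ACL, and an edit made
    -- from A silently shows up in B.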
In more sophisticated configs, adding or removing IPs or TLS certs requires restarting servers and reconfiguring applications. This gets out of hand quickly. Like, what if your server has its primary IP removed because the IP space is recycled?

At CF all these things were just rows in a database, and systems were created to project them down to HTTP server config, network card settings, BGP configuration, etc. All of this was fully automated.

So an action like "adding an IP block" is super simple. This is unique. AFAIK everyone else in the industry, back in 2012, was treating IPs and TLS certs more like hardware. Like a disk: you install it once and it stays there for the lifetime of the server.
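In other words, something roughly like this; a sketch, not Cloudflare's actual schema:

    -- An IP assignment is just data, not hardware.
    CREATE TABLE ip_assignments (
        address   inet   PRIMARY KEY,
        server_id bigint NOT NULL,
        role      text   NOT NULL      -- e.g. 'primary', 'anycast'
    );

    -- "Adding an IP block" becomes a handful of INSERTs; automated jobs
    -- then project the rows down into HTTP server config, NIC settings,
    -- and BGP announcements.
    INSERT INTO ip_assignments VALUES ('198.51.100.7', 42, 'anycast');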
https://www.notion.com/blog/data-model-behind-notion
https://git-scm.com/book/en/v2/Git-Internals-Git-Objects
Having worked with real business databases, I truly believe and feel that foundational information modeling (database schema, data types) makes or breaks the utility and capabilities of that data.
I was proud after getting it working, but when I had to run dozens of files through it, it was horribly slow. I don't tend to write a lot of hot code, so I was excited by the fact I had to profile it and make it efficient.
I decided that I should rewrite the system, as my mental model had improved and the code was doing a lot more than it should be doing for no reason. I've spent the last few days redesigning it according to Data-Oriented Design. I'm trying to get the wall-clock time down by more than an order of magnitude. I'm not sure how it's going to turn out, wish me luck :)
Since I mentioned DoD, these three links will probably come up in conversation:
Mike Acton's famous performance talk: https://www.youtube.com/watch?v=rX0ItVEVjHc
DoD in the Zig compiler: https://www.youtube.com/watch?v=IroPQ150F6c
The DoD book: https://dataorienteddesign.com/dodbook.pdf
Interestingly, AI agents are all about disrupting the hard bounds of existing data and interaction models, and it turns out the lowest common denominator is often the winner. E.g.: file system > database, grep > embeddings, markdown > PDF, generative UI > portals, computer use > APIs, etc.
There simply is no need for all that abstraction/interface/infrastructure to, say, answer questions about documents or keep track of todo lists, workflows, or message sending, when you have glue that can translate between the data models.
I did initially look at RBAC frameworks, but since they were too complex for a small greenfield project, I went with one or more accounts linked to a user's profile, with an RBAC junction table linking account and profile IDs in a relational database.

The junction table was the secret sauce: it lets you stuff the RBAC permissions into its rows.

I could get very far with this model. For example, it can express who can pay for features (guardian, not minor), allow multiple people to manage a minor, and validate permissions for a logged-in account.
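Roughly like this, with all names hypothetical:

    CREATE TABLE accounts (id bigint PRIMARY KEY);
    CREATE TABLE profiles (id bigint PRIMARY KEY);

    -- One profile (e.g. a minor) can be linked to several accounts, and
    -- the junction row carries the permissions themselves.
    CREATE TABLE account_profile (
        account_id bigint  NOT NULL REFERENCES accounts,
        profile_id bigint  NOT NULL REFERENCES profiles,
        role       text    NOT NULL,        -- 'guardian', 'minor', ...
        can_pay    boolean NOT NULL DEFAULT false,
        PRIMARY KEY (account_id, profile_id)
    );

    -- "May this logged-in account pay for features on this profile?"
    SELECT can_pay
    FROM account_profile
    WHERE account_id = $1 AND profile_id = $2;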
Whatever happened to people being charitable enough to readers to define their acronyms and abbreviations? This page is full of "insider" talk.
- Incident ticket gets created
- It used to go to a department wide alias
- Head of Dept used to open the email, hit forward and then have to To/CC the owner of the system affected
- JIRA (which we used) already had the idea of a Component, and you could tie an owner to each Component
- Update the notifications so that the "To" field held the Component owner and whoever opened the ticket, and the "Bcc" field held the department-wide alias
- Now, the Dept Head could just hit reply and get to the right people. The BCC meant that everyone knew something had occurred.
Worked out OK in the end, but took substantial effort to fix.
From Linus Torvalds:
"git actually has a simple design, with stable and reasonably well-documented data structures. In fact, I'm a huge proponent of designing your code around the data, rather than the other way around, and I think it's one of the reasons git has been fairly successful […] I will, in fact, claim that the difference between a bad programmer and a good one is whether he considers his code or his data structures more important.
...
Bad programmers worry about the code. Good programmers worry about data structures and their relationships."
The article describes domain modeling; what you describe is computational modeling. The former lives at a higher abstraction, closer to the user. The latter is about data processing.

A lot of people have mentioned DDD (or similar) in this thread, but I think that is an example of mixing up computational modeling and domain modeling. I think this is what object orientation and its descendants, like microservices, have generally been doing wrong: applying domain structure at a level where it makes no sense anymore. This mismatch can add a lot of friction, repetition, and overhead.
Looks like less than 2k LOC. Source control wasn't a new concept. How is this "orders of magnitude" impressive, though?
« Always define your variables » is the first thing I learned during my engineering studies, in both math and physics class. The professors insisted on it a lot. Ten years later, I still consider it the most important thing I ever learned.
Aren't both right and left sides of most examples fundamentally different views of the same exact underlying data/relationships?
If you're locked in to one view, then your code sucks. I'm not an expert in this stuff, but I think this is just a natural consequence of thinking in terms of tables and SQL.
If you think in terms of triple stores and graph databases, then you can derive either a left-side or right-side view as needed, and you can operate on either abstraction.
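Even in plain SQL you can see the idea: store the facts once as triples, and each "side" of an example becomes just a view. The schema here is invented for illustration:

    -- One generic store of facts.
    CREATE TABLE triples (
        subject   text NOT NULL,
        predicate text NOT NULL,
        object    text NOT NULL
    );

    -- One "view" of a chat domain...
    CREATE VIEW channel_members AS
        SELECT subject AS channel, object AS member
        FROM triples
        WHERE predicate = 'has_member';

    -- ...and a different view over the exact same rows; neither is privileged.
    CREATE VIEW room_topics AS
        SELECT subject AS room, object AS topic
        FROM triples
        WHERE predicate = 'has_topic';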
Organizations as a whole need to talk the same language. "We have the data somewhere, somehow - so figure it out" doesn't work when communicating.
A good proxy measurement is how long it takes two random employees to start talking about the same thing. It's the effective knowledge-transfer speed after accounting for the data density/compression.

The infinitely many views of a graph don't compress.
Or more practically, any graph will continuously evolve a set of 'common views' that an organization understands and uses.
HR may want to look at employees from one view and your accountant might want to look from an entirely different one. They communicate with people in their own field using their own compression. You don't need to settle on one abstraction for everyone.
The data-model designers need to communicate with each other, sure, but it's their job to think about the data model in a non-compressed way.
> By the time the architecture solidifies around these implicit choices, it’s nearly impossible to change.
> Incumbents couldn’t match this without rebuilding from scratch.
My point is that if your actual code-level data model is flexible, then you often can architecturally pivot to a different view (the Rippling example). My guess as to why HipChat couldn't change to a Slack model is that they "coded it wrong". It's not somehow inherently orthogonal. The essay presents the models as a fait accompli when in fact, if you step back, they're for the most part just views on the same fundamental data/relationships, and if modeled correctly I don't see any reason why you wouldn't be able to change between views.

The central thesis that "Your data model is your destiny" does not have to be true.

That said, there are some examples that probably can't pivot due to some inherent design limitation (e.g., the Adobe case).
It's everywhere in my current company, where top management have somehow agreed that data is the future and like to talk about data products all the time, but whose actual understanding of what it means, requires, and entails is close to nil. It's very lucrative for our suppliers, however.

It feels to me like a repackaging of what the semantic web was promising, with very little in terms of actual novelty and no real solution to the problems encountered at the time. "Domain-driven design" is everywhere in this discussion. I saw a post this week about knowledge graphs.
Where is the push coming from? I'm missing some fundamental innovation somewhere which makes this more practical?
In my experience, it always ends the same: a slow death by governance. Business object owners always end up lagging what's needed in the field, gateways and caches start popping up everywhere as your model doesn't fit what the software you're buying requires, data quality becomes uneven across the IT system, and costs creep up through duplication, until someone higher up (probably promoted for putting the mess in place) decides it's time to decouple and simplify, and gets their next promotion by ending what they created.
Isn't the Slack data model presented here totally possible with HipChat's actual data model?
What is my destiny?
As an engineer who's full-stack and has frequently ended up doing product management, I think the main value I provide organizations is the ability to think holistically, from a product's core abstractions (the literal database schema), to how those are surfaced and interacted with by users, to how those are talked about by sales or marketing.
Clear and consistent thinking across these dimensions is what makes some products "mysteriously" outperform others in the long run.
If you get this part right, then everything else becomes an implementation effort. You're no longer fighting the system; you flow with it. Ideas become easier to brainstorm, and the cost of changes is immediately visible.
DDD suggests continuous two-way integration between domain experts <-> engineers, to create a model that makes sense for both groups. Terminology enters the language from both groups so that everyone can speak to each other with more precision, leading to the benefits you stated.
I like doing this FS journey myself, but I'm stuck "leading teams" of FS/BE/FE mixes, trying to get them to build things that I clearly understand and could build myself given enough time. All I have is a team of FE or BE people, or even FS people, who can't just do the above. You need to be very involved with these people to get them to do this, and it just doesn't scale.
I've recently tried AI (Claude specifically) and I feel like I can build things with Claude much quicker than with the FE/BE/FS people I have. Even including all the frustrations that Claude brings when I have to tell it that it's bullshitting me.
Is that bad? How do you deal with that? Advice?
I am now off that previous work and will devote time to trying AI, because I concluded it can't be worse than that.

The only way to cope was to let things go and pick my battles.
I always think of the joke where a sailor goes down to the dock and asks the dockworkers if they speak French, English, or German; the dockworkers only shake their heads no. Later the dockworkers chat, and one says to the other that he could learn languages, so he'd be able to talk with the sailor. The other replies that the sailor knew three, and it didn't help him.
It’s such a great feeling when you can make someone’s work better, for the life of me I can’t understand why others wouldn’t jump at the opportunity!
Sadly at current $dayjob, the devs are held at arm's length from the customer. On purpose!
PM - Product Manager
FS - Fullstack developer
FE - Frontend developer
BE - Backend developer
Frankly if the people you have aren't good enough then you need to get good at training, get better in your hiring (does your hiring process test the skills you want? Without being so long that anyone decent is going to get a better offer before they reach the end of it?), or maybe your company is just not offering enough to attract decent talent. There are plenty of better-than-AI programmers out there, but even in this job market they still have decent options.
Everything is too recent; nobody can give you sure advice on how to deal with your situation. From my view as a fullstack engineer who has worked with LLMs for the past 3 years: your generated product is probably crap, and your only way to assess it is by asking Claude or ChatGPT if it's good, and it'll probably say yes just to make you feel good.
Now go ahead and publish it. If your app brings in revenue, then you did build something quicker. Until then, a Claude-generated prototype is as much a product as some PowerPoint slides.
And yet it's so easy to get wrong.
We ended up with something like five microservices - that, in principle, could have been used by anyone else in the company to operate on the Domains they were supposed to represent and encapsulate. This one holds Users and User data! This one holds Products, and Product interaction data!
Nobody touched any of those except us, the engineers working on this one very specific product. We could have - should have - just put it all on one service, which would have also allowed us to trivially run database joins instead of having to have services constantly calling each other for data, stitching it together in code.
Sigh.
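For contrast, the lookup we kept doing with cross-service calls would have been one query had it all lived in one database; table names are hypothetical:

    CREATE TABLE users        (user_id bigint PRIMARY KEY, email text);
    CREATE TABLE products     (product_id bigint PRIMARY KEY, name text);
    CREATE TABLE interactions (user_id    bigint REFERENCES users,
                               product_id bigint REFERENCES products,
                               viewed_at  timestamptz);

    -- A user, their product interactions, and the products, in one join.
    SELECT u.user_id, u.email, p.name, i.viewed_at
    FROM users        u
    JOIN interactions i ON i.user_id    = u.user_id
    JOIN products     p ON p.product_id = i.product_id
    WHERE u.user_id = $1;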
How you implement it, however, is an engineering question. Microservices are not the only abstraction tool that exists; they're kinda the worst. You have procedures/classes, files/modules, packages/libraries, and processes with IPC. A network call is for when you have no other choice.
Also operations and customer support. They are your interface to real, not hypothetical, customers.
I've worked on learning all I can, and I have a much easier time participating in discussions now; however, we still feel a bit siloed off.