And that's why I think the future of the software industry is data-driven, and we will end up with another GNU-like movement around free and open data models/schemas. I think we already have a good starting point: Linked Data[1] and schema.org[2]
[1]: https://www.w3.org/wiki/LinkedData
[2]: https://schema.org/
Moreover, the science folks are not a picky bunch; they tend to use whatever works well, whether it be CSV or XML. As long as there's tooling and documentation, everything is acceptable, which is something I like.
I've used this schema to merge AWS, GCP, and Azure into one unified cloud bill, which unlocks a ton of insight into where the money is going inside those bills.
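For a sense of what that looks like, here is a minimal sketch of the kind of unified table I mean; the table and column names are hypothetical, not the actual schema:

    -- Hypothetical unified billing table: each provider's export is
    -- mapped into these common columns before loading.
    CREATE TABLE cloud_cost (
        provider    text NOT NULL,        -- 'aws' | 'gcp' | 'azure'
        account_id  text NOT NULL,        -- provider-native account/project id
        service     text NOT NULL,        -- normalized service name
        usage_start timestamptz NOT NULL,
        cost_usd    numeric NOT NULL
    );

    -- One query now answers "where is the money going?" across all three:
    SELECT provider, service, sum(cost_usd) AS total
    FROM cloud_cost
    GROUP BY provider, service
    ORDER BY total DESC;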
A primitive lets you create a shared language and ritual ("tweet"), compound advantages with every feature built on top, and lock in switching costs without ever saying the quiet part out loud.
The article is right that nearly every breakout startup manages to land a new one.
> These aspects of domain-driven design aim to foster a common language shared by domain experts, users, and developers—the ubiquitous language. The ubiquitous language is used in the domain model and for describing system requirements.
0 - https://en.wikipedia.org/wiki/Domain-driven_design#Overview

The benefit of having the logic and the data combined like this is difficult to overstate. It makes working in complex environments much easier. For example, instead of 10 different web apps each implementing their own session validation logic in some kind of SSO arrangement, we could have them call a procedure on the SQL box. Everyone would then be using the same centralized business logic for session validation. Any bugs in the implementation can be fixed in real time without rebuilding any of the downstream consumers.
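A minimal sketch of what that centralized check could look like in Postgres; the table and function names are invented for illustration:

    -- Hypothetical shared session table plus a single validation routine.
    CREATE TABLE sessions (
        token      uuid PRIMARY KEY,
        user_id    bigint NOT NULL,
        expires_at timestamptz NOT NULL
    );

    CREATE FUNCTION validate_session(p_token uuid)
    RETURNS bigint                 -- the user id, or NULL if invalid/expired
    LANGUAGE sql STABLE AS $$
        SELECT user_id
        FROM sessions
        WHERE token = p_token
          AND expires_at > now();
    $$;

Each of the 10 apps would just run SELECT validate_session($1), and fixing a bug in the logic is one CREATE OR REPLACE FUNCTION, with no redeploys downstream.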
I have seen business rules as stored procedures lock a business into their current model across a dozen teams, effectively making system improvements impossible. And because they needed some OLAP atop OLTP in some cases, their very beefy Postgres solution crawled down to a max of 2k queries per second. I worked with them for over a year trying to pull apart domain boundaries and unlock teams from one another. Shared stored procedures were a major pole in the tent of things making scaling the technical org incredibly hard.
Repeat after me: uncoupled teams are faster.
I'm building Typegres to give you both. It lets you (a) write complex business logic in TypeScript using a type-safe mapping of Postgres's 3000+ functions and (b) compiles it all down to a single SQL query.
Easier to show than tell: https://typegres.com/play
> Your product's core abstractions determine whether new features compound into a moat or just add to a feature list.
Which is captured by the Domain Model[0]. How it is managed in a persistent store is where a data model comes into play. See also Domain Driven Design[1].
> It is basically trying to innovate a new way to think about in an existing domain ...
Note that a domain is not the same as a domain model.
> "Your data model is your destiny."
This is why I consider the article a "near miss." If the quote from the post were instead "your domain model is your destiny", the subsequent quoted statements would need no alteration, and they would substantiate the topic at hand: domain modeling and the organizational value found therein.
0 - https://www.merriam-webster.com/dictionary/term%20of%20art
Innovating, evolving, creating, and capturing new domain concepts to create Blue Ocean solutions inside and outside the Enterprise. Iterating on core concepts via discussions and designs led by (or involving) subject matter experts, and using new concepts to better articulate the domain. Managing that change over time, and accounting for ontological and taxonomical overlap versus Enterprise System development needs.
That's the foundation from which you can actively copy insights, and it doesn't rely on Immaculate Specification or premature data modelling. No need to start over, thanks to clearly separated concerns.
Note: copying an insight is a far cry from having the wherewithal to make that insight in the first place; there are numerous downstream benefits to articulating your business domain clearly and early.
Where it blurs things: data model != UX strategy != business model, and success isn't only about a novel model; execution and distribution still matter greatly.
My takeaway: read "data model" here as "core conceptual model", and ask whether your product has a clear center that lets new features inherit context instead of becoming one-offs.
This underlying choice of data model actually does define your destiny. What I think the author was thinking of is domain modelling and correct entity identification, which is also important. It's a layered approach - and if you ignore the foundations (the actual data model), you hit limitations higher up.
For example, in real-time AI systems, you might want users to provide a single value (like an order number) to retrieve precomputed features for a model. With Snowflake Schema data models, it works. But for Star Schema data models, you have to provide entity IDs for all tables containing precomputed features - which leads to big problems (the need for a mapping table, a new pipeline, and higher latency).
Reference: https://www.hopsworks.ai/post/the-journey-from-star-schema-t...
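To make the difference concrete, here is a hedged sketch with invented tables: in the snowflake layout the normalized dimensions reference each other, so the single order number reaches every precomputed feature through joins:

    -- Snowflake: dimensions chain together, so one id is enough.
    CREATE TABLE regions   (region_id bigint PRIMARY KEY,
                            avg_delivery_days numeric);     -- precomputed feature
    CREATE TABLE customers (customer_id bigint PRIMARY KEY,
                            region_id bigint REFERENCES regions,
                            credit_score int);              -- precomputed feature
    CREATE TABLE orders    (order_id bigint PRIMARY KEY,
                            customer_id bigint REFERENCES customers);

    SELECT o.order_id, c.credit_score, r.avg_delivery_days
    FROM orders    o
    JOIN customers c ON c.customer_id = o.customer_id
    JOIN regions   r ON r.region_id   = c.region_id
    WHERE o.order_id = $1;

In a star layout, each feature table hangs directly off the fact table and is keyed by its own entity id, so the caller must supply customer_id and region_id as well, or you add the mapping table and extra pipeline described above.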
Notion's idea of having blocks that can be shared between documents is very cool, but that has an obvious access control problem. What access do you need to see a document with information from multiple other documents? It was easy to find confused users: https://www.reddit.com/r/Notion/comments/12iqsc2/synced_bloc...
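A toy version of the model shows where it bites; all names here are made up:

    -- A block can be synced into many pages.
    CREATE TABLE blocks (
        block_id bigint PRIMARY KEY,
        content  text NOT NULL
    );
    CREATE TABLE page_blocks (
        page_id  bigint NOT NULL,
        block_id bigint NOT NULL REFERENCES blocks,
        PRIMARY KEY (page_id, block_id)
    );
    -- If block 7 is synced into pages A and B, whose permissions apply?
    -- A reader of A sees content written under B's ACL, and an edit made
    -- from A silently shows up in B.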
In more sophisticated configs, adding or removing IPs or TLS certs requires restarting servers and reconfiguring applications. This gets out of hand quickly. Like, what if your server has its primary IP removed because the IP space is recycled?

At CF all these things were just rows in a database, and systems were created to project them down to HTTP server config, network card settings, BGP configuration, etc. All of this was fully automated.

So an action like "adding an IP block" is super simple. This is unique. AFAIK everyone else in the industry, back in 2012, was treating IPs and TLS certs more like hardware. Like a disk: you install it once and it stays there for the lifetime of the server.
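In other words, something roughly like this; a sketch, not Cloudflare's actual schema:

    -- An IP assignment is just data, not hardware.
    CREATE TABLE ip_assignments (
        address   inet   PRIMARY KEY,
        server_id bigint NOT NULL,
        role      text   NOT NULL      -- e.g. 'primary', 'anycast'
    );

    -- "Adding an IP block" becomes a handful of INSERTs; automated jobs
    -- then project the rows down into HTTP server config, NIC settings,
    -- and BGP announcements.
    INSERT INTO ip_assignments VALUES ('198.51.100.7', 42, 'anycast');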
https://www.notion.com/blog/data-model-behind-notion
https://git-scm.com/book/en/v2/Git-Internals-Git-Objects
Having worked with real business databases, I truly believe and feel that foundational information modeling (database schema, data types) makes or breaks the utility and capabilities of that data.
I was proud after getting it working, but when I had to run dozens of files through it, it was horribly slow. I don't tend to write a lot of hot code, so I was excited by the fact I had to profile it and make it efficient.
I decided that I should rewrite the system, as my mental model had improved and the code was doing a lot more than it should be doing for no reason. I've spent the last few days redesigning it according to Data-Oriented Design. I'm trying to get the wall-clock time down by more than an order of magnitude. I'm not sure how it's going to turn out, wish me luck :)
Since I mentioned DoD, these three links will probably come up in conversation:
Mike Acton's famous performance talk: https://www.youtube.com/watch?v=rX0ItVEVjHc
DoD in the Zig compiler: https://www.youtube.com/watch?v=IroPQ150F6c
The DoD book: https://dataorienteddesign.com/dodbook.pdf
Interestingly, AI agents are all about disrupting the hard bounds of existing data and interaction models, and it turns out the lowest common denominator is often the winner. E.g.: file system > database, grep > embeddings, markdown > PDF, generative UI > portals, computer use > APIs, etc.
There simply is no need for all that abstraction/interface/infrastructure to, say, answer questions about documents or keep track of todo lists, workflows, or message sending, when you have glue that can translate between the data models.
I did initially look at RBAC frameworks, but since they were too complex for a small greenfield project, I went with one or more accounts linked to a user's profile, with an RBAC junction table linking account and profile IDs in a relational database.

The junction table was the secret sauce: it lets you stuff the RBAC permissions into its rows.

I could get very far with this model. For example, it can express who can pay for features (guardian, not minor), allow multiple people to manage a minor, and validate permissions for a logged-in account.
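Roughly like this, with all names hypothetical:

    CREATE TABLE accounts (id bigint PRIMARY KEY);
    CREATE TABLE profiles (id bigint PRIMARY KEY);

    -- One profile (e.g. a minor) can be linked to several accounts, and
    -- the junction row carries the permissions themselves.
    CREATE TABLE account_profile (
        account_id bigint  NOT NULL REFERENCES accounts,
        profile_id bigint  NOT NULL REFERENCES profiles,
        role       text    NOT NULL,        -- 'guardian', 'minor', ...
        can_pay    boolean NOT NULL DEFAULT false,
        PRIMARY KEY (account_id, profile_id)
    );

    -- "May this logged-in account pay for features on this profile?"
    SELECT can_pay
    FROM account_profile
    WHERE account_id = $1 AND profile_id = $2;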
Whatever happened to people being charitable enough to readers to define their acronyms and abbreviations? This page is full of "insider" talk.
- Incident ticket gets created
- It used to go to a department wide alias
- Head of Dept used to open the email, hit forward and then have to To/CC the owner of the system affected
- JIRA (which we used) already had the idea of a Component, and you could tie an owner to each Component
- Update the notifications so that the "To" field held the Component owner and whoever opened the ticket, and the "Bcc" field held the department-wide alias
- Now, the Dept Head could just hit reply and get to the right people. The BCC meant that everyone knew something had occurred.
Worked out OK in the end, but took substantial effort to fix.
From Linus Torvalds:
"git actually has a simple design, with stable and reasonably well-documented data structures. In fact, I'm a huge proponent of designing your code around the data, rather than the other way around, and I think it's one of the reasons git has been fairly successful […] I will, in fact, claim that the difference between a bad programmer and a good one is whether he considers his code or his data structures more important.
...
Bad programmers worry about the code. Good programmers worry about data structures and their relationships."
The article describes domain modeling; what you describe is computational modeling. The former lives at a higher abstraction, closer to the user. The latter is about data processing.

A lot of people have mentioned DDD (or similar) in this thread, but I think that is an example of mixing up computational modeling and domain modeling. I think this is what object orientation and its descendants, like microservices, have generally been doing wrong: applying domain structure at a level where it makes no sense anymore. This mismatch can add a lot of friction, repetition, and overhead.
Looks like less than 2k LOC. Source control wasn't a new concept. How is this "orders of magnitude" impressive, though?
« Always define your variables » is the first thing I learned during my engineering studies, in both math and physics class. The professors insisted on it a lot. Ten years later, I still consider it the most important thing I ever learned.
Aren't both right and left sides of most examples fundamentally different views of the same exact underlying data/relationships?
If you're locked in to one view, then your code sucks. I'm not an expert in this stuff, but I think this is just a natural consequence of thinking in terms of tables and SQL.
If you think in terms of triple stores and graph databases, then you can derive either a left-side or right-side view as needed, and you can operate on either abstraction.
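Even in plain SQL you can see the idea: store the facts once as triples, and each "side" of an example becomes just a view. The schema here is invented for illustration:

    -- One generic store of facts.
    CREATE TABLE triples (
        subject   text NOT NULL,
        predicate text NOT NULL,
        object    text NOT NULL
    );

    -- One "view" of a chat domain...
    CREATE VIEW channel_members AS
        SELECT subject AS channel, object AS member
        FROM triples
        WHERE predicate = 'has_member';

    -- ...and a different view over the exact same rows; neither is privileged.
    CREATE VIEW room_topics AS
        SELECT subject AS room, object AS topic
        FROM triples
        WHERE predicate = 'has_topic';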
Organizations as a whole need to talk the same language. "We have the data somewhere, somehow - so figure it out" doesn't work when communicating.
A good proxy measurement is how long it takes two random employees to start talking about the same thing. It's the effective knowledge-transfer speed after accounting for the data density/compression.

The infinitely many views of a graph don't compress.
Or more practically, any graph will continuously evolve a set of 'common views' that an organization understands and uses.
HR may want to look at employees from one view and your accountant might want to look from an entirely different one. They communicate with people in their own field using their own compression. You don't need to settle on one abstraction for everyone.
The data-model designers need to communicate with each other, sure, but it's their job to think about the data model in a non-compressed way.
> By the time the architecture solidifies around these implicit choices, it’s nearly impossible to change.
> Incumbents couldn’t match this without rebuilding from scratch.
My point is that if your actual code-level data model is flexible, then you often can architecturally pivot to a different view (the Rippling example). My guess as to why HipChat couldn't change to a Slack model is that they "coded it wrong". It's not somehow inherently orthogonal. The essay presents the models as a fait accompli when in fact, if you step back, they're for the most part just views on the same fundamental data/relationships, and if modeled correctly I don't see any reason why you wouldn't be able to change between views.

The central thesis that "Your data model is your destiny" does not have to be true.

That said, there are some examples that probably can't pivot due to some inherent design limitation (e.g., the Adobe case).
It's everywhere in my current company, where top management have somehow agreed that data is the future and like to talk about data products all the time, but whose actual understanding of what it means, requires, and entails is close to nil. It's very lucrative for our suppliers, however.

It feels to me like a repackaging of what the semantic web was promising, with very little in terms of actual novelty and no real solution to the problems encountered at the time. "Domain-driven design" is everywhere in this discussion. I saw a post this week about knowledge graphs.
Where is the push coming from? I'm missing some fundamental innovation somewhere which makes this more practical?
In my experience, it always ends the same: a slow death by governance. Business object owners always end up lagging what's needed in the field, gateways and caches start popping up everywhere as your model doesn't fit what the software you're buying requires, data quality becomes uneven across the IT system, and costs creep up through duplication, until someone higher up (probably promoted for putting the mess in place) decides it's time to decouple and simplify, and gets their next promotion by ending what they created.
Isn't the Slack data model presented here totally possible with HipChat's actual data model?
What is my destiny?
As an engineer who's full-stack and has frequently ended up doing product management, I think the main value I provide organizations is the ability to think holistically, from a product's core abstractions (the literal database schema), to how those are surfaced and interacted with by users, to how those are talked about by sales or marketing.
Clear and consistent thinking across these dimensions is what makes some products "mysteriously" outperform others in the long run.
If you get this part right, then everything else becomes an implementation effort. You're no longer fighting the system; you flow with it. Ideas become easier to brainstorm, and the cost of changes is immediately visible.
DDD suggests continuous two-way integration between domain experts <-> engineers, to create a model that makes sense for both groups. Terminology enters the language from both groups so that everyone can speak to each other with more precision, leading to the benefits you stated.
I like doing this FS journey myself, but I'm stuck "leading teams" of FS/BE/FE mixes, trying to get them to build things that I clearly understand and could build myself given enough time. All I have is a team of FE or BE people, or even FS people, who can't just do the above. You need to be very involved with these people to get them to do this, and it just doesn't scale.
I've recently tried AI (Claude specifically) and I feel like I can build things with Claude much quicker than with the FE/BE/FS people I have. Even including all the frustrations that Claude brings when I have to tell it that it's bullshitting me.
Is that bad? How do you deal with that? Advice?
I am now off that previous work and will devote time to trying AI, because I concluded it can't be worse than that.

The only way to cope was to let things go and pick my battles.
I always think of the joke where a sailor goes down to the dock and asks the dockworkers if they speak French, English, or German; the dockworkers only shake their heads no. Later the dockworkers chat, and one says to the other that he could learn languages, so he'd be able to talk with the sailor. The other replies that the sailor knew three, and it didn't help him.
It’s such a great feeling when you can make someone’s work better, for the life of me I can’t understand why others wouldn’t jump at the opportunity!
Sadly at current $dayjob, the devs are held at arm's length from the customer. On purpose!
PM - Product Manager
FS - Fullstack developer
FE - Frontend developer
BE - Backend developer
Frankly if the people you have aren't good enough then you need to get good at training, get better in your hiring (does your hiring process test the skills you want? Without being so long that anyone decent is going to get a better offer before they reach the end of it?), or maybe your company is just not offering enough to attract decent talent. There are plenty of better-than-AI programmers out there, but even in this job market they still have decent options.
Everything is too recent; nobody can give you sure advice on how to deal with your situation. From my view as a fullstack engineer who has worked with LLMs for the past 3 years: your generated product is probably crap, and your only way to assess it is by asking Claude or ChatGPT if it's good, and it'll probably say yes just to make you feel good.
Now go ahead and publish it. If your app brings in revenue, then you did build something quicker. Until then, a Claude-generated prototype is as much a product as some PowerPoint slides.
And yet it's so easy to get wrong.
We ended up with something like five microservices - that, in principle, could have been used by anyone else in the company to operate on the Domains they were supposed to represent and encapsulate. This one holds Users and User data! This one holds Products, and Product interaction data!
Nobody touched any of those except us, the engineers working on this one very specific product. We could have - should have - just put it all on one service, which would have also allowed us to trivially run database joins instead of having to have services constantly calling each other for data, stitching it together in code.
Sigh.
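For contrast, the lookup we kept doing with cross-service calls would have been one query had it all lived in one database; table names are hypothetical:

    CREATE TABLE users        (user_id bigint PRIMARY KEY, email text);
    CREATE TABLE products     (product_id bigint PRIMARY KEY, name text);
    CREATE TABLE interactions (user_id    bigint REFERENCES users,
                               product_id bigint REFERENCES products,
                               viewed_at  timestamptz);

    -- A user, their product interactions, and the products, in one join.
    SELECT u.user_id, u.email, p.name, i.viewed_at
    FROM users        u
    JOIN interactions i ON i.user_id    = u.user_id
    JOIN products     p ON p.product_id = i.product_id
    WHERE u.user_id = $1;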
How you implement it, however, is an engineering question. Microservices are not the only abstraction tool that exists; they're kinda the worst. You have procedures/classes, files/modules, packages/libraries, and processes with IPC. A network call is for when you have no other choice.
Also operations and customer support. They are your interface to real, not hypothetical, customers.
I've worked on learning all I can, and I have a much easier time participating in discussions now; however, we still feel a bit siloed off.