A DuckDB-based metabase alternative
159 points by wowi42 18 hours ago | 40 comments
piterrro 15 hours ago
To what extent is this a Metabase alternative? I'm a heavy Metabase user and there's really nothing to compare in this product.
mritchie712 9 hours ago
We've (https://www.definite.app/) replaced quite a few metabase accounts now and we have a built-in lakehouse using duckdb + ducklake, so I feel comfortable calling us a "duckdb-based metabase alternative".
When I see the title here, I think "BI with an embedded database", which is what we're building at Definite. A lot of people want dashboards / AI analysis without buying Snowflake, Fivetran, and a BI tool and stitching them all together.
jorin 13 hours ago
hi, dev building Shaper here. Both Shaper and Metabase can be used to build dashboards for business intelligence and embedded analytics. But the use cases are different: Metabase is feature-rich and has lots of self-serve functionality that lets non-technical users easily build their own dashboards and drill down as they please. With Shaper you define everything as code, in SQL. It's much more minimal in terms of what you can configure, but if you like the SQL-based approach it can be pretty productive to treat dashboards as code.
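For a sense of the workflow: a dashboard tile ultimately boils down to a DuckDB query. A minimal sketch (illustrative only, not Shaper's actual configuration syntax; table and column names are made up):

```sql
-- Illustrative DuckDB query that a dashboard-as-code tile might run.
-- Table and column names are hypothetical.
SELECT
  date_trunc('week', created_at) AS week,
  count(*)                       AS signups
FROM signups
GROUP BY week
ORDER BY week;
```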
rorylaitila 7 hours ago
Nice work! I met Jorin a couple years ago at a tech meetup and this was just an idea at the time. So cool to see the consistent progress and updates and to see this come across HN.
thanhnguyen2187 8 hours ago
Thanks for the cool tool! I think it's worth mentioning SQLPage, another tool in a similar vein that generates UI from SQL. From my POV:
- SQLPage: more focused on UI building; doesn't use DuckDB
- Shaper: more focused on analytics/dashboards, with PDF generation and such; uses DuckDB
cjonas 5 hours ago
Is there any way to run the query -> report generation standalone, in process? Like maybe just outputting the HTML (or using the React components in a project).
I was looking to add similar report generation to a VS Code extension I've been building[0]
frafra 15 hours ago
Metabase works great with DuckDB as well, thanks to the metabase_duckdb_driver by MotherDuck.
3abiton 13 hours ago
As someone who has used DuckDB but not Shaper, what is Shaper used for? The README is scarce on details.
jorin 13 hours ago
Hi, dev building Shaper here. Shaper lets you visualize data and build dashboards just by writing SQL. The SQL runs in DuckDB, so you can use all of DuckDB's features. It's for when you're looking for a minimal tool that lets you work entirely in code. You can use Shaper to build dashboards that you share internally, or customer-facing dashboards you want to embed into another application.
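Since it's plain DuckDB underneath, a dashboard query can, for example, read files in place with no load step. An illustrative sketch (the path and column names are made up):

```sql
-- Illustrative: DuckDB queries Parquet files directly.
-- Path and column names are hypothetical.
SELECT
  date_trunc('day', event_time) AS day,
  count(*)                      AS events
FROM read_parquet('events/*.parquet')
GROUP BY day
ORDER BY day;
```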
antman 11 hours ago
Will it expose a visual query builder like Metabase does?
ldnbln 10 hours ago
My company integrated Taleshape Shaper as our customer-facing Metabase dashboard alternative. Absolutely love its simplicity!
pdyc 16 hours ago
Interesting. I'm trying to build one too, but rejected DuckDB because of its large size; I guess I'll have to give in and use it at some point.
andrewstuart 16 hours ago
I wanted to love DuckDB but it was so crashy I had to give up.
jastr 2 hours ago
I had this too until I lowered its memory limit: in ~/.duckdbrc, `set max_memory='1GB';` or even less.
robowo 16 hours ago
I use it daily and it has never crashed. How long ago was this?
I'm a big fan of DuckDB. It plows through hundreds of GB of logs on a 5-year-old Linux laptop, no problem.
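As an illustration of the kind of thing that just works (file pattern and column name are made up):

```sql
-- Illustrative: scan gzipped CSV logs directly with a glob pattern.
-- DuckDB decompresses and parallelizes the scan automatically.
SELECT status, count(*) AS n
FROM read_csv('logs/*.csv.gz')
GROUP BY status
ORDER BY n DESC;
```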
pletnes 15 hours ago
Same here. I have, however, seen a few out-of-memory cases in the past when given large input files.
jastr 2 hours ago
By default, it tries to take 80% of your memory. I've found you need to set it to something much smaller in ~/.duckdbrc: `set max_memory='1GB';`
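A minimal ~/.duckdbrc sketch (the CLI runs these statements at startup; the values are just examples):

```sql
-- ~/.duckdbrc: executed by the DuckDB CLI on startup.
SET max_memory = '1GB';  -- cap RAM usage (the default is 80% of system memory)
SET threads = 4;         -- optionally limit parallelism as well
```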
skeeter2020 8 hours ago
It's not the focus, and not very performant, but you can have it spill to disk if you run out of memory. I wouldn't suggest building a solution around this approach, though; the sweet spot is data that fits in memory.
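Roughly, that looks like this (illustrative; the path is made up, and for in-memory databases you have to point DuckDB at a temp directory yourself):

```sql
-- Illustrative: let DuckDB spill intermediate data to disk
-- instead of failing when the memory limit is hit.
SET max_memory = '2GB';                    -- cap in-memory usage
SET temp_directory = '/tmp/duckdb_spill';  -- hypothetical spill location
```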
I feel very moronic making a dashboard for any product now. Enterprise customers prefer you integrate into their ERPs anyway.
I think we lost the plot as an industry. I've always advocated for making a read-only database connection available to your customers so they can build their own visualisations. This should have been the standard 10 years ago, and its case is only stronger in this age of LLMs.
We get so involved with our products that we forget our customers are humans too. Nobody wants another account to manage or remember. Analytics and alerts should be push-based: configurable reports should be auto-generated and sent to your inbox, alerts should be pushed via notifications or email, and customers should have the option to build their own dashboards with something like this.
Sane defaults make sense but location matters just as much.
Roughly three decades ago, that *was* the norm. One of the more popular tools for achieving that was Crystal Reports[1].
In the late 90s, it was almost routine for software vendors to bundle Crystal Reports with their software (very similar to how the MSSQL installer gets invoked by products), then configure an ODBC data source which connected to the appropriate database.
In my opinion, the primary stumbling block of this approach was the lack of a shared SQL query repository. So if you weren't intimately familiar with the data model you wanted to work with, you'd lose hours trying to figure it out on your own, or rely on colleagues sharing queries via sneakernet or email.
Crystal Reports has since been acquired by SAP, and I haven’t touched it since the early ‘00s so I don’t know what it looks or functions like today.
1: https://en.wikipedia.org/wiki/Crystal_Reports
I was a developer, albeit not professionally, and my boss gave me the opportunity to develop the integration between Agresso and Crystal Reports, my first professional development project, for which I am still grateful. It was a DLL written in C++, and I imagine they shipped it for quite a while after I left for greener pastures.
I was already a free software and Linux enthusiast, so I made a vain skunkworks attempt at getting Agresso to run on MySQL, which failed. But my Linux server in the office came in handy when I needed some extra software in the field: I asked a colleague to put a CD in the server so I could download it to the client site some 500 km away and deliver on the migration.
I was part of this and "saw the light". We had such great visibility into all the processes, it was unreal. It tremendously sped up cross-org initiatives.
Today, I guess, only agents get that privilege.
Customers need it to build custom reports, archive data into a warehouse, drive downstream systems (notifications, audits, compliance), and answer edge-case questions you didn’t anticipate.
Because of that, I generally prefer these patterns over a half-baked built-in analytics UI or an opinionated REST API:
Provide a read replica or CDC stream. Let sophisticated customers handle authz, modelling, and queries themselves (a minimal sketch of the read-only side is below). This gets harder with multi-tenant DBs.
Optionally offer a hosted Data API, using something like PostgREST, Hasura, or Microsoft DAB. You handle permissions and safety, but stay largely unopinionated about access patterns.
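For the read replica option, the grant setup in Postgres might look roughly like this (illustrative; the schema, role, and password are placeholders):

```sql
-- Illustrative: a read-only role a customer (or a PostgREST/Hasura/DAB
-- deployment) could connect with on the replica. All names are hypothetical.
CREATE ROLE customer_readonly LOGIN PASSWORD 'change-me';
GRANT USAGE ON SCHEMA analytics TO customer_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO customer_readonly;
-- Cover tables created later, too:
ALTER DEFAULT PRIVILEGES IN SCHEMA analytics
  GRANT SELECT ON TABLES TO customer_readonly;
```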
Any built-in metrics or analytics layer will always miss edge cases.
With AI agents becoming first-class consumers of enterprise data, direct read access is going to be non-negotiable.
Also, I predict the days of charging customers to access their own goddamn data behind rate-limited, metered REST APIs are behind us.
The CDC stream option you flagged is more viable in my (admittedly biased) opinion. At my company (Prequel), our entire pitch is basically "you should give your customers a live replica of their data in whatever data platform they want it in" (and let us handle the cross-platform compatibility and multi-tenant DB challenges).
I think this problem could also be a killer use case for Open Table Formats, where the read-replica architecture can be mirrored but the cost of reader compute can be assumed by the data consumer.
To your point, this is only going to be more important with what will likely be a dramatic increase in AI agent data consumption.
I get your point, but generally with most enterprise-scale apps you really don't want your transactional DB doubling as your data warehouse. The "push-based" operation should be limited to moving data from your transactional environment to your analytical one.
Of course, if the “analytics” are limited to simple static reports, then a data warehouse is overkill.
A layer on top of the database to account for auth etc. would be necessary anyway. It could be achieved to some degree with views, but I'd prefer an approach where you choose the publicly exposed data explicitly.
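For instance, an explicit allowlist via a dedicated schema of views (illustrative Postgres sketch; all names are made up, and `customer_readonly` is the hypothetical role from the earlier sketch):

```sql
-- Illustrative: expose only explicitly chosen columns through a view
-- in a dedicated schema. All names are hypothetical.
CREATE SCHEMA customer_api;
CREATE VIEW customer_api.orders AS
  SELECT id, status, total_cents, created_at  -- no internal or PII columns
  FROM internal.orders;
GRANT USAGE ON SCHEMA customer_api TO customer_readonly;
GRANT SELECT ON customer_api.orders TO customer_readonly;
```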
GraphQL almost delivered on that dream. Something more opinionated would've been much better, though.