Beautiful machine, and fun to see the Illumos heart still beating inside!
Not so sure about this one. HCI (Hyperconverged) rack units (where storage and compute live in the same racked systems) and "blade servers" have been a thing for a really long time now; compute sleds aren't what's novel here.
Rack-level DC conversion is also not particularly novel, although underutilized IMO. It was pretty popular in HPC style density applications for a while (see HP/SGI Altix 4000 for a good old example).
What's unique about Oxide is that they went all the way down to the firmware and then back up, rather than doing commodity hardware integration or reselling - for example, you can get something like a Supermicro EVO:Rail, but it will be running VMware, not a fully integrated platform.
Somehow everyone wrote to me about blades. These are not the same, though. Blade servers were mounted into units of 4U, 8U, etc; they occupied a portion of the overall cabinet and still had to do "plumbing" for power and networking behind the chassis to the rest of the cabinet or to the rest of the datacenter. A full-cabinet blade rig would have multiple 8U blade units and some off the shelf units for networking, storage, etc. Yes, you could mix and match different components based on your needs, but that also meant that there were extra wires, cables, mounting rails, and more importantly - all these different components ran a mix of software that had to integrate using common denominator protocols and speeds.
Steve rightly mentioned the integration below, and I didn't put it in my message because I kinda assumed that we include software in this discussion too.
HP in 2005 had an army of programmers writing all sorts of firmware and software and another army of hardware engineers, too. They could have made an Oxide computer back then, and it would have sold really well. But they didn't, and none of their competitors did despite this being an obvious product (in hindsight), and THIS is what I find interesting.
This sounds like less of a problem for DCs with bare concrete flooring, but blades did fall out of fashion, so I guess the fraction of DCs with multiple levels or free-access floors was higher than anticipated.
(Also, maybe I'm just being an amateur, but I'd be scared of tolerance stacking with a "grape bunch" design like this. Individually enclosed chassis with cables and cage nuts are a lot more robust against dimensional issues)
That's what's so cool about Oxide's boxes to me-- the legacy garbage is gone and the strange undefined behavior part and parcel with overlapping edge cases will be minimized (and managed, as opposed to used as an excuse by a vendor). Dealing with incompatibilities and strange firmware interactions has made me come to see PC-based servers as a weird opposite of the "Swiss cheese" model. The various layers of interacting hardware, firmware, drivers, and OS act as a kind of "filter" for correct operation. When you swap or add one of these components you get one or more exciting new layers in the stack that, hopefully, have "holes" aligning with the existing ones.
The storage market used to be dominated by Oxide-style vertical integration and bespoke engineering and almost every vendor has transitioned to modularity over the last 20 years. Pure Storage seems to be doing OK with custom hardware though, so maybe the rest of the industry just has a lack of courage.
There are systems which have similar overall hardware designs, but they are usually integrating a large amount of hardware and software from multiple vendors. Oxide is much closer to "everything is produced by Oxide."
I wrote this back in 2022, and it's still fundamentally relevant today: https://news.ycombinator.com/item?id=30678324
Dell and HP both have "blades" that plugged into a blade-chassis. The chassis had all the lights out mgmt as well as power/networking integrated so the blade was basically a metal box with compute/memory/storage and it just slid in to the dock.
I am sure that Supermicro had something like this as well.
Blades have the basic issue of "how often do you want an unpopulated chassis?" - answer, never.
So really they're solving for replacing a failed piece of hardware.
But how often do you need to do that, what's it worth to you? If it makes sense then the statistical window where it does is tiny.
And if you own more than 1, like an entire rack, then do you even care? Because above some scale you're just going to wheel the rack out rather than go and pull individual units.
Basically the scaling is against you: for a highly manageable bladey rack unit, you've got to be small enough that one server matters, large enough you need the swap out to be low labor, but not so large you could just wait for the rack to go down. And this has to be worth enough to justify the price premium and vendor lock in (because at rack scale you just buy a rack of the cheapest whatever from any vendor and make them compete on price - at one job bringing our computer management in house triggered an immediate 10% price drop because we threatened HP with using another supplier at all).
Loved the idea of blade servers, but they were targeted to people who needed very high compute in small footprints, and we both didn't have high compute requirements and were power/footprint constrained (we could get more power but cost/watt would go up because of cooling density).
The Twin^2 was nice because it amortized the cost of redundant power supplies over 4 machines, but didn't have the cost overhead of big backplanes or fancy layouts to get a lot of CPU+RAM in a small physical space.
Once populated a 4 node chassis was around $750/node including CPU and RAM and 2x SATA drives, it was within $100 of the price of a similar 1U server. We had around 10 cabinets in a data center when I left the company. It was, IMHO, a pretty good deal to get a dedicated box with 24x7 monitoring and sysadmin services including updates and backups at $150/mo.
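For anyone who wants the arithmetic spelled out, here is a rough sketch using only the two figures quoted above ($750 per populated node, $150/mo per dedicated box). It assumes, unrealistically, that the whole monthly fee goes toward hardware; power, cabinet space, bandwidth and sysadmin time are not in the comment and so are left out. The function name is made up for illustration.

```python
def hardware_payback_months(node_cost: float, monthly_price: float) -> float:
    """Months of revenue needed to cover one node's hardware cost.

    Assumption: ignores power, colo, bandwidth and labor, none of which
    are given in the comment above.
    """
    return node_cost / monthly_price


NODE_COST = 750      # USD per populated node, from the comment
MONTHLY_PRICE = 150  # USD per month per dedicated box, from the comment
NODES_PER_CHASSIS = 4

months = hardware_payback_months(NODE_COST, MONTHLY_PRICE)
chassis_cost = NODE_COST * NODES_PER_CHASSIS

print(f"hardware paid back after ~{months:.0f} months")  # ~5 months
print(f"fully populated chassis: ${chassis_cost}")       # $3000
```

So on hardware alone a node covers itself in about five months; the real margin depends on the operating costs that are not stated here.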
It was implemented with User Mode Linux, a Linux kernel ported to run under Linux instead of ported to a bare machine. A crazy idea, but it worked REALLY well. I remember finishing up the sign-up and billing software on the plane on the way to US PyCon where we announced the service, though I don't remember the year.
Yep! That perfectly describes the few remaining people I know of that operate the things... and they're (slowly) seeing the light.
Oxide does get a bit of a pass on the vendor lock-in, though. I think you're buying from them _because_ they are the only vendor that has the security model and level of integration.
They all did. HP had Super Dome and blades and Synergy. Dell had similar.
I guess the world of atoms is still hard enough that you can publish an interactive spec of your product and not have to worry about it being immediately copied.
We need to get it cross-linked from the main site still.
NVIDIA has one primary weakness here: GPUs are NOT optimal hardware for training or inference. No competitor has the market reach to challenge them, though. An open standard supported by a high-growth, breakout startup would change that.
I don't want to subtract from the demo too much, b/c I do love oxide, but I do see this as a trend that more people will use to garner attention until it's too overdone - at which point, 3D will revert to being used for more practical use-cases
EDIT: typos
It's part of the reason I'd waited so long before making this, I knew it was going to be a lot of work. There's parts that Claude was especially useful for, like perf testing, debugging and animation. But the first half of the project was done almost entirely by hand.
Human effort as a proxy for quality... that ship has sailed. And that makes me feel frustrated.
If I recall when comparing to competition, it was premium priced, for sure, but it's more that it's so dense that you had to compare 1 Oxide rack to like 4 commodity racks. Spec for spec I recall that the premium for the verticality wasn't that high.
Not to mention that working at Oxide sounds like a modern Sun Microsystems with the ideology that team has. Highly recommend their podcast "Oxide and Friends", and their original "On The Metal" show.
I've attempted to apply to their company multiple times over the years, only to be stun locked by the application process. Not because it's a bad process, but because I feel I'm not up to par as an engineer. Maybe one day I'll go through with it.
Greatest hope: their approach catches on outside just Oxide, and I get to work somewhere with a similar vibe/ethos one day.
Greatest fear: the way they work only makes sense for the most elite / well capitalised of companies.
I will apply again at some point when an interesting job comes up, and I have a stronger skillset.
I don't need to work there (nor do I feel like I'm smart or talented enough to)-- I just wish I could work with the Oxide gear in Customer engagement, too. I don't work with businesses big enough to need it, sadly. It looks so sweet.
This is what I think of when I think of utility-scale compute-- not racks of Supermicro / Dell / HP boxes with tiny ISA buses hiding on traces on their motherboards for "baseboard management controllers" to plug into to pretend to be PC AT keyboards.
Their interview process was shady. There was a post here about 1-2 years ago that was a link to their interview process and how open and transparent they were. The post itself was from an employee and a fellow commenter who was gaslighting folks was also an employee. Several folks complained about the tremendous amount of homework they had to do after the initial screen, and once submitted, were ghosted. One of the employees repeatedly rebutted that claim in the comments, and they did this for quite a few commenters. Was not a good look. I doubt much has improved since then, as the comments below confirm the same mess.
Don't spend time being amazed by folks who won't treat you right. It just ain't worth it.
To state clearly what I feel we have said many times: Yes, it's hard to get a job at Oxide. Yes, we get a lot of applicants. Yes, we ask a lot of applicants upfront. But the payoff (and the reason it's worth the risk and the work for the right person!) is an extraordinary and uplifting team -- one that I daresay each of us counts as being of unparalleled breadth and depth in our careers.
> That we don't provide specific feedback on individual applicants (even though we explicitly state that/why we don't)?
Your response is not a response to the OP's claim. The OP didn't claim you didn't provide specific feedback, it was that they were entirely ghosted mid-process. And that others said the same.
But even beyond that, your response doesn't align with your own careers page's "Hiring Process":
> If candidates aren’t advanced into interviews by the process outlined in [rfd147], an explicit rejection should be sent. The level of oversubscription for Oxide roles means that this rejection will likely be non-specific — which is naturally frustrating for applicants that have put a lot of energy into their materials. Candidates may well respond to a rejection by asking for more specific feedback; to the degree that feedback can be constructive, it should be provided.
Which would be in alignment:
> Decency
> We treat others with dignity, be they colleague, customer, community or competitor.
Here you just come off quite defensive, and argue that you at Oxide are "very clear about" things that you say quite the opposite about in the very directions you tell candidates to read.
If what you say is true - and I can absolutely believe it is - fine, update the docs and the site. But don't come here and gaslight people into "I don't understand the problem. We're very clear, we've been very clear, people should not be complaining about this."
Source: https://rfd.shared.oxide.computer/rfd/0003
Eh, if even a small percentage of those emails end up in a spam folder then there are going to be people who think that they were ghosted. They didn’t ghost me. Alas, they didn’t hire me either.
Investing 6 hours into applying for a position should warrant a response beyond 'we are going to pass'
But generally, the more demands you put on a first round, the less likely I am to apply. I've seen companies asking for 8-10 long-form answers of multiple paragraphs each just to get to a hiring screen. For one recent application, this was one of the eight questions: "Describe a time when you had to make a tradeoff in roadmap items. Describe each option and their merits, and the decision-making criteria you used. Describe what stakeholders you spoke with and how their input influenced you. Describe how you communicated this with the team, and customers. Be specific about all points and clear on the exact role you played in this process."
People can say "well, it's a good screen because if you won't put effort into that, will you put effort into your work", but if your argument is that you need to do such things because you get 500-1,000+ applicants per position, you're going to have a hard job convincing me that a human reads every one of those, and not just the subset that are not automatically routed to the trash by your ATS and/or AI.
So my end retort to that is "well, it's a good marker of the level of respect I can expect to be treated with as an employee".
Terrible process. You need to give feedback early if you're not interested in someone, not leave them hanging for nearly half a year.
And since transparency is a core value and principle, will you commit to sharing your cap table publicly?
In terms of the cap table: that's a bit of an odd request? On the one hand, there are no real secrets hanging out on our cap table -- but on the other, based on your tone, it doesn't feel like the request is terribly earnest? (And, I hasten to add, transparency is a value -- not a principle.[0])
[0] https://rfd.shared.oxide.computer/rfd/0002
You’re not getting across a reasonable point here. Maybe take a step back and think about what you really want to say.
Clearly something is landing wrong, but exactly what is not being well communicated.
How would you handle a few thousand applicants for a single role?
I think no matter what you do it will feel inhumane. We can argue that a few hours of work for a take home test is inhumane too, and being ghosted after doing one definitely wouldn’t pass my personal bar of acceptability, but if it’s the first stage and the task would take a properly qualified applicant less than 30 minutes then I can’t fault it.
How would you do things? Remember that it has to scale and you can’t leave any gaps based on human fallibility (HR/Hiring Managers are humans and will forget if there are too many things going on at once).
I've done this for hiring before, for people who reached the "put substantial effort in" stage (in my case basically 2nd or 3rd round work sample stuff), and it was a great way to make sure we got good signal and they felt respected.
DDG hires like this, actually, and if I recall correctly I would be paid a flat fee, it would take a week, and the work I did would be part of something genuine in DDG, maybe a bug or something.
Now, that probably sounds good to you, but taking a week out of my current employment is not going to happen- there’s an incentive to go “over the hours” inherent to the ask, even if you’re paying me a flat rate, I might lose to someone equally qualified who puts in 1.01n into the task, so I should put 1.02n (etc; ad infinitum).
Which is part of the issue with all take home assignments. I have given out take home assignments (given to HR to be administered) which should take a qualified candidate 20 minutes to finish beginning to end (as in, including syncing the project, setting up their editor, exploring the problem, googling around about things, trying it out and then following up with the email to HR). I don’t doubt for even a moment that someone has spent several hours on this problem- because they’re not qualified.
Passing the HR barrier in that case will not help them unfortunately, because they’ll get to talk to me, and I will disqualify them in all likelihood, and candidates are told that it should take not more than a half hour, but en masse: people don’t listen.
The trouble is, there’s thousands of applicants, a handful of HR, and one me.
Not to be on some kind of pedestal (I’m not), but the problem doesn’t scale, you need only apply the tiniest amount of systems thinking to see it.
And I would make it very clear that putting in more than 30 minutes of work, timed, is a disqualifier, and I would sleep well at night clearing all those people out of the queue.
You will bias heavily along some kind of axis, preferred previous employers or location, age, etc.
You add a lot of bias into the system by trying to further scrutinise otherwise meaningfully qualified people on paper.
Hint: you don't even need to evaluate most candidates at all. Random sampling is sufficient and provably bias free.
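To make the random sampling idea concrete, here is a minimal sketch in Python. It assumes the pool has already been filtered to people who are qualified on paper, and that each one can be referred to by an ID; the function name, the ID format and the numbers are all invented for illustration. A uniform draw without replacement ignores every attribute of the applicant, which is the sense in which the selection step itself cannot favor an employer, location or age. It says nothing about bias in the review that happens afterwards.

```python
import random
from typing import Optional, Sequence


def sample_applicants(applicant_ids: Sequence[str], k: int,
                      seed: Optional[int] = None) -> list:
    """Pick up to k applicants uniformly at random, without replacement.

    Every applicant has the same chance of being drawn, regardless of
    anything on their resume. A fixed seed makes the draw repeatable
    (handy for auditing).
    """
    rng = random.Random(seed)
    k = min(k, len(applicant_ids))  # never ask for more than the pool holds
    return rng.sample(list(applicant_ids), k)


# Hypothetical pool of 2,000 on-paper-qualified applicants.
applicants = [f"applicant-{i:04d}" for i in range(2000)]

# Review only as many as the team can actually give a real answer to.
shortlist = sample_applicants(applicants, k=50, seed=42)
print(len(shortlist), "applicants selected for human review")
```

The sample size is the knob: set it to whatever number of people can realistically get a human read and a real reply.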
> Whenever I get a stack of resumes, I throw half of them in the trash
> I sure don't want unlucky people on my team.
What do you send them as a response? "Sorry, we're going ahead with other applicants" - "you have not been selected this time" -- and what happens if you start needing to dig through that pool of now rejected candidates?
Peak humanity.
I acknowledge that I am reaching back out, and they may not be available.
Like a human does.
> Reminds me of something I heard once.
>> Whenever I get a stack of resumes, I throw half of them in the trash
>> I sure don't want unlucky people on my team.
I was actually about to make the same joke.
So about six minutes for the problem itself, then?
It was for an investment bank though and they have essentially unlimited money. I can't imagine any of the other companies I've worked for would be remotely generous enough to do the same.
You shouldn’t be giving take homes unless they’re either short, or the applicant passed a screen and you’re investing time. Otherwise how are you “scaling” the review? Claude? Hidden test suite (not bad)? Some sort of leaderboard (bad, rewards people with time), something else?
I like programming problems, spending a day at Google was fun, they put me up in a fancy hotel, and the interviewers were nice. Like it was clear a lot of time and money had gone into the process (6-8 hours of dev time is not cheap), not a zoom and ghost like most companies.
You can see from this thread that Oxide is a company with an online fan base. If our own experience at Fly.io is anything to go by, they are getting an avalanche of applications for every role they have open. It is extraordinarily difficult to service those kinds of candidate flows. That doesn't excuse ghosting (something we did a bunch even when trying hard to avoid it) or other unfriendly/unfair practices --- which are rife across the industry, most especially at companies that don't have the reputation Oxide is trying to cultivate --- but it does give some context to it.
Long story short: you can't really predict how a company treats its team from the first-contact inbound candidate experience. It's a signal, but it's a small signal among a great many others.
Does anyone have an actual estimated time we can discuss?
It's a fair bit of writing to ask for, but for a mostly remote and prose-driven company, you do a lot of long-form writing in the day to day work. The public RFDs and github issues/comments/commits give a good flavor for this.
As others have said, lots of my work is open source, and I have public writings and talks, so finding those were much easier for me than it might be for someone with only closed source works.
I don't remember how much time I put into mine when I applied.
It is worth keeping in mind that we write a _lot_. If you don't enjoy the process of writing, you might not like working here.
But I still applaud the intent. I self-selected out by giving in to scope creep.
It sounds from the outside like Oxide has an interview process that requires some low level engineering work to be delivered? Maybe I got that wrong.
As usual, I'm assuming the assignment is evaluated based on a reasonable time-commitment. From what the recruiting experts tell me, it's a good strategy to spend as much time as possible, the deliverable is better, and the optics aren't bad either, it signals investment into the application instead of signalling spray and pray application broadcasting.