Finding a CPU Design Bug in the Xbox 360 (2018)
137 points by mariuz 5 days ago | 41 comments
dev-ns8 58 minutes ago
Wow... A speculative branch prediction path actually gets preemptively executed regardless of the branch outcome? Even if the execution has side effects??? That's quite amazing. Are modern CPUs doing speculative execution like this and just putting extra safeguards around the effects, or do they just prefetch / decode instructions nowadays?
brucedawson 27 minutes ago
Author here: This is not a common problem. I think I was told that Alpha had basically the same bug but it is a bug, for sure. Speculative execution causing problematic side effects is a deal killer.
Speculative execution, however, can cause less problematic side effects. For instance, a speculatively executed load or prefetch will usually actually prefetch, which will pollute the cache, TLB, etc., and reveal side-band information, but that is a performance problem and perhaps a subtle security flaw, not a correctness bug like this was.
TazeTSchnitzel 6 hours ago
This article is from 2018.
Previously:
NooneAtAll3 5 hours ago
unrelated, but recently XBox One was hacked for the first time
brcmthrowaway 5 hours ago
How does XBox get hacked when it uses Secure Boot?
Tuna-Fish 4 hours ago
Voltage glitching. An outside attacker who has direct, extremely fine-grained control over the power supply to the chip can cause it to brown out for one instruction cycle, preventing a result of an instruction from being written.
With enough sophistication, physical access is more powerful than root access, no exceptions.
The high failure rates of the Xbox 360 did not help.
https://en.wikipedia.org/wiki/Xbox_360_technical_problems
"Microsoft did not reveal the cause of the issues publicly until 2021, when a 6-part documentary on the history of Xbox was released. The Red Ring issue was caused by the cracking of solder joints inside the GPU flip chip package, connecting the GPU to the substrate interposer, as a result of thermal stress from heating up and cooling back down when the system is power cycled."
It seems like there was a period when solder just wasn’t done well.
Microsoft spent over a billion dollars replacing and repairing consoles to maintain the good brand name of Xbox.
https://en.wikipedia.org/wiki/Xbox_360_technical_problems
However, I wonder how many people got "burned" by it and swore off Xbox consoles going forward.
I know that in that era we got a lot more use out of the original Xbox and the Wii.
I've heard that flash memory can also be revived with heat, either long duration or high intensity.
https://www.extremetech.com/science/142096-self-healing-self...
IBM's Power was the only logical option at the time.
These consoles were being designed around 2000. Intel and AMD weren't partnering on bespoke CPUs at that time. I don't even think AMD would have been considered a viable partner. Neither had viable 64 bit options and part of console marketing at the time was the ever increasing bit depths.
Prior console generations had used MIPS, which wasn't keeping up with ever increasing performance expectations, and players like Toshiba and Sony were looking for a higher performance CPU architecture. IBM's Power architecture was really the only option. Sony, Toshiba, and IBM partnered to develop a new 64 bit microarchitecture called Cell.
Microsoft's first console was basically a PC and that's how everyone saw it. The 360 was an opportunity for Microsoft to show that it could compete with the big boys. It was also an opportunity to keep a toe dipped in RISC, because it had dropped support for RISC CPUs with Windows 2000.
What wasn't viable?
If you double the size of numbers, sure, it takes up twice the space. If the total size is still less than one page, it isn't likely to make a big difference anyway. What really makes a difference is trying to do 64-bit mathematics with 32-bit hardware. This implies some degree of emulation with a series of instructions, whereas a 64-bit CPU could execute that in 1 instruction. That 1 instruction very likely executes in fewer cycles than a series of other instructions. Otherwise no one would have bothered with it.
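To make the "series of instructions" point concrete, here is a rough sketch of how a 64-bit add might be emulated with only 32-bit operations: add the low halves, then add the high halves plus the carry. The function names (`add64_on_32bit`, `split64`, `join64`) are made up for illustration, and Python integers stand in for 32-bit registers by masking.

```python
MASK32 = 0xFFFFFFFF  # keeps a value within one 32-bit "register"

def split64(x):
    """Split a 64-bit value into (low, high) 32-bit halves."""
    return x & MASK32, (x >> 32) & MASK32

def join64(lo, hi):
    """Recombine (low, high) 32-bit halves into one 64-bit value."""
    return (hi << 32) | lo

def add64_on_32bit(a_lo, a_hi, b_lo, b_hi):
    """Add two 64-bit values using only 32-bit-wide steps."""
    lo_sum = a_lo + b_lo
    lo = lo_sum & MASK32
    carry = lo_sum >> 32                  # 1 if the low halves overflowed
    hi = (a_hi + b_hi + carry) & MASK32   # second add, carry folded in
    return lo, hi

a = 0x00000001_FFFFFFFF
b = 0x00000000_00000001
lo, hi = add64_on_32bit(*split64(a), *split64(b))
print(hex(join64(lo, hi)))  # 0x200000000
```

On real 32-bit hardware this usually becomes an add followed by an add-with-carry, so two dependent instructions where a 64-bit CPU needs one.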
It's possible but rare for systems to have 64-bit GPRs but a 32-bit address space. Examples I can think of include the Nintendo 64 (MIPS; apparently commercial games rarely actually used the 64-bit instructions, so the console's name was pretty much a misnomer), some Apple Watch models (standard 64-bit ARM but with a compiler ABI that made pointers 32 bits to save memory), and the ill-fated x32 ABI on Linux (same thing but on x86-64).
That said, even "32-bit" CPUs usually have some kind of support for 64-bit floats (except for tiny embedded CPUs).
Now you could build a weird CPU that has "more memory" than it has addressable width (the 8086 is kind of like this with segmentation and 8/16 bit) but if your CPU is 64 bit you're likely not to use anything less than 64 bit math in general (though you can get some tricks with multiple adds of 32 bit numbers packed).
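For the 8086 case mentioned above, a minimal sketch of real-mode segmentation, assuming the usual description (16-bit segment shifted left 4 bits, plus a 16-bit offset, wrapping at 20 address bits). `physical_address` is just an illustrative name.

```python
def physical_address(segment, offset):
    """8086 real-mode address: (segment << 4) + offset, wrapped to 20 bits."""
    seg = segment & 0xFFFF   # 16-bit segment register
    off = offset & 0xFFFF    # 16-bit offset
    return ((seg << 4) + off) & 0xFFFFF  # the 8086 has 20 address lines

# Two different segment:offset pairs can name the same byte.
print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
print(hex(physical_address(0x1235, 0x0000)))  # 0x12350
```

So 16-bit registers reach a 1 MB address space, at the cost of juggling two registers per pointer.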
But a 32 bit CPU can do all sorts of things with larger numbers, it's just that moving them around may be more time-consuming. After all, that's basically what MMX and friends are.
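As a rough illustration of the MMX-style idea, here is a sketch of a packed add: one 64-bit value is treated as two independent 32-bit lanes, and each lane wraps on its own, with no carry crossing between them. `packed_add_2x32` is a hypothetical helper, not an actual intrinsic.

```python
LANE_MASK = 0xFFFFFFFF  # one 32-bit lane

def packed_add_2x32(x, y):
    """Add two 64-bit values as two separate 32-bit lanes (wraparound)."""
    lo = ((x & LANE_MASK) + (y & LANE_MASK)) & LANE_MASK
    hi = (((x >> 32) & LANE_MASK) + ((y >> 32) & LANE_MASK)) & LANE_MASK
    return (hi << 32) | lo

x = 0x00000001_FFFFFFFF
y = 0x00000002_00000001
print(hex(packed_add_2x32(x, y)))  # 0x300000000 (low lane wrapped to 0)
print(hex(x + y))                  # 0x400000000 (plain 64-bit add carries over)
```

The difference between the two results is the carry out of the low lane, which the packed version deliberately drops.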
It would also process binary-coded decimal integers, as well as floating point.
"The two came up with a revolutionary design with 64 bits of mantissa and 16 bits of exponent for the longest-format real number, with a stack architecture CPU and eight 80-bit stack registers, with a computationally rich instruction set."
https://en.wikipedia.org/wiki/Intel_8087
That allowed both a CPU and an advanced GPU to be on the same die.
They also wisely sold GlobalFoundries, and were able to scale with TSMC.
At that time AMD wasn't in the custom CPU business, AMD64 was a new unproven ISA, and x86 based CPUs of that time were notoriously hot for a console. These were also some of the reasons why Microsoft moved away from the Pentium III it had used in the original Xbox.
The PS3 was launched in 2006 but the hardware design was decided years earlier to provide a reference platform for the software.