The cube solver that refuses to be confidently wrong

You point your Mac’s camera at a scrambled Rubik’s cube. cubeconjure reads it and tells you how to solve it. That’s the whole pitch, and the second half is the easy half: turning a known cube state into a short, near-optimal sequence of moves is a solved problem, decades old, milliseconds fast.

The hard half is the part that sounds trivial: reading the stickers. We went in assuming the only real puzzle would be telling red from orange. That gap is genuinely nasty — but it was the least of it. Under some lights, blue reads as white. Six colors, nine of each, and a camera that sees them differently every time the lighting shifts. “Just read the colors” turned out to be the whole engineering story, and the story has a spine: never be confidently wrong.

Don’t judge a sticker alone

The naïve scanner asks one question fifty-four times: is this sticker red, or orange, or blue, or…? Each sticker tried in isolation, against some fixed idea of what “red” looks like. That fixed idea is exactly what real lighting breaks.

The insight that turned the corner was to stop classifying stickers one at a time. A cube hands you a free, perfect color reference: its six center stickers. Those six centers are the six colors — under this exact photo, this exact light, this exact camera. So instead of “what color is this sticker in the abstract,” the question becomes “which of these six centers, sitting right here in the same frame, does it most resemble?”

Then comes the move that makes it robust. We don’t decide each sticker independently and hope it adds up. We solve one balanced assignment for the whole cube at once: place all fifty-four stickers onto six faces, with exactly nine on each, in the way that minimizes total perceptual color distance. A real cube has nine of every color — so we make the solver obey that arithmetic.

The payoff is quiet but large. An over-eager red can’t grab a tenth sticker, because there is no tenth red slot — the cube’s own counting shoves the borderline sticker back where it belongs. Ambiguity gets resolved by the cube’s structure, not by a hand-tuned threshold we’d have to re-tune for every kitchen, office, and overcast afternoon on earth.
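
To make the counting argument concrete, here is a minimal sketch in Python, not cubeconjure's actual code. It assumes plain Euclidean RGB distance as a stand-in for a perceptual metric, a textbook Hungarian solver for the assignment, and a toy "cube" with two colors and two stickers each. Every function name here is hypothetical.

```python
import math

def distance(a, b):
    # Stand-in for a perceptual metric: plain Euclidean distance in RGB.
    return math.dist(a, b)

def min_cost_assignment(cost):
    """Hungarian algorithm on a square cost matrix.
    Returns, for each row, the column it is matched to."""
    n = len(cost)
    inf = float("inf")
    u, v = [0.0] * (n + 1), [0.0] * (n + 1)
    p, way = [0] * (n + 1), [0] * (n + 1)   # p[j] = row matched to column j
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [inf] * (n + 1), [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:                            # walk the augmenting path back
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    rows = [0] * n
    for j in range(1, n + 1):
        rows[p[j] - 1] = j - 1
    return rows

def greedy_assign(stickers, centers):
    # The naive scanner: each sticker independently takes its nearest center.
    return [min(range(len(centers)), key=lambda f: distance(s, centers[f]))
            for s in stickers]

def balanced_assign(stickers, centers, per_face):
    # Each center gets exactly `per_face` slots, so no color can overfill.
    slots = [f for f in range(len(centers)) for _ in range(per_face)]
    cost = [[distance(s, centers[f]) for f in slots] for s in stickers]
    return [slots[col] for col in min_cost_assignment(cost)]

centers = [(255, 0, 0), (255, 120, 0)]     # red, orange
stickers = [(250, 5, 0), (255, 10, 0), (255, 50, 0), (255, 115, 0)]
print(greedy_assign(stickers, centers))       # [0, 0, 0, 1]
print(balanced_assign(stickers, centers, 2))  # [0, 0, 1, 1]
```

The third sticker sits slightly closer to red, so the greedy pass hands red three stickers. The balanced pass has only two red slots to give, and the cheapest way to fill them pushes the borderline sticker to orange. On a real cube the same call would take fifty-four stickers, six centers, and nine slots per center.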

The lighting saga

Then we tried it at different times of day, and the floor moved.

Daytime scans drifted. Nighttime scans fell apart entirely. The culprit is something everyone’s eyes correct for automatically and cameras don’t: colored light. A warm bulb pushes red, orange, and yellow toward each other until they’re nearly the same hue. Cool light does the same cruel trick to blue and white. Different casts squash different pairs together — and a scanner with a fixed notion of “blue” walks straight into it.

The fix wasn’t to out-engineer physics. It was to stop fighting it. We let the camera’s automatic white balance neutralize the color cast — that’s the job white balance exists to do — while locking exposure so brightness stays steady across the scan. And crucially, we never judge a sticker against an absolute idea of a color. We judge it against the cube’s own centers, photographed in the same light, in the same frame. If the warm bulb reddens everything, it reddens the reference too, and the comparison stays honest. Relative beats absolute.
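
A toy model shows why the relative comparison survives a color cast. This Python sketch assumes a cast can be approximated as a per-channel gain, which is a simplification. The palette values, the gains, and the function names are all illustrative, and it covers only the comparison step, not white balance or exposure control.

```python
import math

# A fixed, "absolute" idea of each color (illustrative RGB values).
PALETTE = {
    "white": (255, 255, 255),
    "yellow": (255, 213, 0),
    "orange": (255, 88, 0),
    "red": (183, 18, 52),
}

def apply_cast(rgb, gains):
    # Simplified lighting model: scale each channel by a gain.
    return tuple(c * g for c, g in zip(rgb, gains))

def nearest(rgb, reference):
    # Name of the reference color closest to rgb.
    return min(reference, key=lambda name: math.dist(rgb, reference[name]))

warm = (1.0, 0.6, 0.4)  # hypothetical warm bulb: green and blue suppressed

# The centers are photographed under the same light as the sticker.
centers_in_frame = {name: apply_cast(rgb, warm) for name, rgb in PALETTE.items()}
sticker = apply_cast((250, 220, 10), warm)  # a slightly off yellow sticker

print(nearest(sticker, PALETTE))           # orange
print(nearest(sticker, centers_in_frame))  # yellow
```

Against the fixed palette the warmed-up yellow lands nearer to orange. Against the centers in the same frame, the cast has shifted both sides of the comparison and the sticker is read correctly.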

A sticker is a belief, not a color

During a scan the camera doesn’t see each sticker once. It sees it dozens of times — through glare, through a half-blurred frame, through a moment where your hand shadows one corner. The obvious thing is to average all those looks into a single color and move on.

The obvious thing throws away the most useful information you have: disagreement.

So we don’t average the colors. We keep each sticker as a distribution — a running tally of which face it looked like, accumulated across every frame. A sticker that read solidly blue forty times in a row is a confident blue. A sticker that flickered between blue and white stays honestly uncertain, instead of being silently rounded to whichever side happened to win by a hair.

We later put this head to head with color-averaging, and the difference is exactly what you’d hope. Average the colors, and one bad frame — a flash of glare reading the wrong hue — can drag the mean across the line and flip the answer. Tally the looks instead, and that same bad frame is one outvoted ballot in a crowd. A sticker is a belief, and beliefs should hold their uncertainty, not launder it.
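
The difference between the two approaches fits in a few lines. This is a minimal Python sketch with made-up pixel values: nine frames of a washed-out blue plus one frame of pure glare, compared against two illustrative centers. The names are hypothetical and Euclidean RGB distance again stands in for a perceptual metric.

```python
import math
from collections import Counter

CENTERS = {"blue": (0, 70, 173), "white": (255, 255, 255)}  # illustrative

def nearest(rgb):
    return min(CENTERS, key=lambda name: math.dist(rgb, CENTERS[name]))

def classify_by_mean(frames):
    # Average the colors first, then classify the single averaged color.
    mean = tuple(sum(channel) / len(frames) for channel in zip(*frames))
    return nearest(mean)

def tally(frames):
    # Classify every frame, then keep the whole vote count as the belief.
    return Counter(nearest(rgb) for rgb in frames)

frames = [(110, 160, 215)] * 9 + [(255, 255, 255)]  # last frame is glare

print(classify_by_mean(frames))  # white
print(tally(frames))             # Counter({'blue': 9, 'white': 1})
```

One blown-out frame is enough to drag the mean past the midpoint between the two centers, while the tally records it as a single dissenting vote and still carries the information that a dissent happened.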

The core principle: never be confidently wrong

Here is the line the whole product is built on.

Sometimes a read is genuinely ambiguous — two arrangements are both legal cubes, and the pixels honestly don’t settle it. A confident scanner picks one anyway, hands you a solution, and watches you scramble your cube further. We refuse to do that. When the evidence is truly split, the engine surfaces the handful of legal interpretations as equal options, and lets the one person who actually knows — you, holding the real cube — point at the right one.
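
One way to picture "surface the legal interpretations" is the sketch below. It is an assumption-heavy illustration in Python, not the engine's logic: it treats a sticker as truly split only when its top vote counts tie exactly, and it uses the equal-count rule as a stand-in for full cube legality, which in reality involves more than counting. All names are hypothetical.

```python
from collections import Counter
from itertools import product

def candidates(belief):
    # Faces tied for the most votes; a single entry means the read is settled.
    top = max(belief.values())
    return sorted(face for face, votes in belief.items() if votes == top)

def interpretations(beliefs, per_face):
    # Every combination of candidates that still gives each face its full count.
    faces = sorted({face for belief in beliefs for face in belief})
    options = []
    for combo in product(*(candidates(b) for b in beliefs)):
        counts = Counter(combo)
        if all(counts[face] == per_face for face in faces):
            options.append(combo)
    return options

# Toy cube: two faces, two stickers each; the last two stickers are split.
beliefs = [
    Counter({"A": 10}),
    Counter({"B": 10}),
    Counter({"A": 5, "B": 5}),
    Counter({"A": 5, "B": 5}),
]
options = interpretations(beliefs, per_face=2)
print(options)  # [('A', 'B', 'A', 'B'), ('A', 'B', 'B', 'A')]
if len(options) > 1:
    print("ask the person holding the cube")
```

The counting rule removes the two impossible readings on its own. What remains is a short list with no ranking attached, which is the point: the sketch never invents a score to break the tie.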

We also refuse to dress this up in a fake number. No “94% confident” badge: a confidence score we couldn’t stand behind would just be a more polished way of being wrong. When the engine is sure, it’s sure. When it isn’t, it says so and asks. And if you’d rather fix a sticker by hand, you paint from a palette that shows a live “nine of each” count, so any imbalance is visible the instant you create it — the cube’s arithmetic, made into a tool you can see.
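
The live count behind such a palette can be very small. A hedged Python sketch, with hypothetical names and single-letter face labels chosen only for illustration:

```python
from collections import Counter

FACES = ["W", "Y", "R", "O", "G", "B"]  # illustrative labels

def imbalance(stickers, per_face=9):
    # How far each face is from its expected count; empty when balanced.
    counts = Counter(stickers)
    return {f: counts[f] - per_face for f in FACES if counts[f] != per_face}

stickers = [f for f in FACES for _ in range(9)]
print(imbalance(stickers))  # {}

# Repaint one red sticker as orange by hand.
stickers[stickers.index("R")] = "O"
print(imbalance(stickers))  # {'R': -1, 'O': 1}
```

Recomputing this after every repaint is cheap at fifty-four stickers, so the palette can show the surplus and the shortfall the moment they appear.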

The humbling middle

I want to tell the unflattering part, because it’s where the real lesson lives.

Scans kept assembling into wrong cubes — legal-looking, but not the cube on the desk. The confident diagnosis came fast: the sampling grid must be bleeding, reading pixels from a neighboring face instead of the sticker we meant. Obvious. Plausible. Wrong.

What saved us was refusing to act on the confident guess. Instead we built an instrument — a debug view that paints the exact points being sampled directly onto the live camera frame, so we could see where the engine was looking. The truth was the opposite of the theory: the sampling was dead-on, every point landing square in the center of its sticker. The grid was innocent.
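
The idea of the instrument can be sketched without a camera. This Python example assumes a face occupies an axis-aligned square in the frame and that each sticker is sampled at the center of its cell. It draws the sample points onto a character grid as a stand-in for the live overlay. The function names and the geometry are assumptions, not the app's code.

```python
def sample_points(left, top, size):
    # Center of each cell in a 3x3 grid laid over a square face region.
    cell = size / 3
    return [(left + (col + 0.5) * cell, top + (row + 0.5) * cell)
            for row in range(3) for col in range(3)]

def overlay(points, width, height):
    # Paint each sample point onto a blank "frame" so a human can look at it.
    canvas = [["." for _ in range(width)] for _ in range(height)]
    for x, y in points:
        canvas[int(y)][int(x)] = "+"
    return ["".join(row) for row in canvas]

points = sample_points(left=0, top=0, size=9)
for line in overlay(points, width=9, height=9):
    print(line)
```

Seeing the marks sit in the middle of each cell is a different kind of evidence from reasoning about where they ought to be, and it is what cleared the grid of suspicion.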

The actual culprit was orientation. The test cube happened to be a near-symmetric pattern, and each face was photographed on its own — so which way a face was rotated relative to its neighbors was simply unknown, and the assembler was guessing. Hand it a normally-scrambled cube and everything worked; the symmetric pattern had been quietly exposing a hole the random cases hid.

That bleed theory wasn’t the only confident guess we had to put down. We worried the belief-distribution might be amplifying noise rather than averaging it out. We were sure a fancier color metric would cleanly separate red from orange where the simpler one couldn’t. Three confident hypotheses, each one reasonable, each one tested against a small dependency-free harness — and each one refuted by the data before it could ship as a fix. Instrument for the human. Let the data decide. Verify by doing. And distrust your own most confident guess, because “obviously the grid is bleeding” felt exactly as true as the theories that turned out right.

The mirror

Somewhere in the middle of all that, the two halves of the project turned out to be the same half.

The thing we were building for the user — refuse to output a confidently-wrong answer; when you genuinely can’t tell, ask the human — is the exact discipline that made the engineering honest: refuse to ship a confidently-wrong fix; when you genuinely can’t tell, let the evidence decide. The product’s core value and the way we had to work to build it are the same principle pointed in two directions. That mirror is the part I keep coming back to.

Hands on the cube

None of the good fixes came from staring at code. They came from doing the thing and watching closely. Working a real cube in front of the camera is what surfaced that the two-face capture step was firing before the cube had even been turned. It’s what made it obvious that the genuinely hard color pairs deserve to be a choice offered to the person, not a coin the machine flips in private. And it’s what reframed the last open problem entirely: capturing two faces at once was never really about re-confirming colors — it’s about pinning down how the faces connect.

That last chapter is the one in progress now. Two adjacent faces share an edge, and that shared edge is a fact — it tells you exactly how the two faces are oriented relative to each other. Read it, and the assembler stops guessing orientation and starts knowing it. Which is, fittingly, the same move as everywhere else in this project: replace a confident guess with something we can actually stand behind.
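
Since this part is still in progress, the following is only a sketch of the reasoning in Python, under loud assumptions: a face is a 3x3 grid read in isolation with unknown rotation, the two-face capture yields the three stickers along the shared edge, and that edge should appear as the top row once the face is rotated correctly. The names and the representation are hypothetical.

```python
def rotate_cw(face):
    # Rotate a 3x3 grid a quarter turn clockwise.
    return [list(row) for row in zip(*face[::-1])]

def matching_rotations(face, observed_edge):
    # Quarter-turn counts for which the face's top row equals the observed edge.
    matches = []
    for quarter_turns in range(4):
        if face[0] == list(observed_edge):
            matches.append(quarter_turns)
        face = rotate_cw(face)
    return matches

scrambled = [["R", "G", "B"],
             ["W", "Y", "O"],
             ["G", "R", "W"]]
print(matching_rotations(scrambled, ["G", "W", "R"]))  # [1]

symmetric = [["Y"] * 3 for _ in range(3)]
print(matching_rotations(symmetric, ["Y", "Y", "Y"]))  # [0, 1, 2, 3]
```

On a scrambled face the shared edge singles out one rotation. On a symmetric face every rotation matches, which is the same ambiguity the near-symmetric test cube exposed, and a case where the honest output is still a list of options.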

Where it’s headed

cubeconjure is a cube solver built to be honest about what it sees. It uses the cube’s own structure to resolve the ambiguity that wrecks naïve scanners, it holds its uncertainty instead of laundering it into a fake number, and when it truly can’t tell, it hands you the choice rather than a confident mistake.

The hard part was never the solving. It was learning to read six colors without lying about how sure we were — and discovering that the same honesty is what it takes to build the thing in the first place. When the app ships, that’s the story behind it.

