I also wonder how many hours were wasted by people who had to use inferior technology because Disney kept it secret. Cutting out animals and objects from the background one frame at a time seems so mind-numbingly boring.
The splitter would have to be behind the lens, so it would require a custom camera setup (probably a longer lens-to-sensor distance than most lenses are designed for too), but I can't think of any other issues.
The Disney process had the filter essentially built into the beam splitter, but afaik, nobody knows how to make that happen again (or nobody who knows how, knows it's a desirable thing). Seems like the optics might be cumbersome, but the results seem worthwhile.
Also, you still need careful lighting: you don't want your foreground illuminated by sodium vapor. But I wonder if you could light the background screen from behind (like a rear-projection setup) to reduce the amount of sodium vapor light that reflects from the foreground to the camera.
https://accucoatinc.com/technical-notes/beamsplitter-coating...
I have no idea about that specific company; I just picked it after a search for "beam splitter".
After I saw that video on the sodium vapor illumination process, I was curious whether you could instead use near-IR light as the mask illumination. In theory you would have a perfect mask (as in the Disney process) and no color interference. I found that frequency-gated beam splitters are a fairly common scientific instrument.
As for the IR idea, I wonder if there's something like a crowdfunding/crowdsourcing site for ideas where the person who had the idea doesn't really want to do it, but leaves it open to others to try. You said you "don't really have the budget to try it out", but let's say even if you had the money, it wouldn't be a priority for you, as you're not an expert or you have better things to do or whatever. Is there a place to just shout ideas into and see if any market-oriented entity would take it upon themselves to try doing it? Besides forums full of ideas like "tinder but for X" and such crap? Because, imagine if your idea really is a great one. A couple hours from now it would be buried in HN.
I suspect the real reason is that digital green screen in the hands of experienced people is "good enough" vs the complication of needing a double camera and beam splitting prism rig and such.
I could also imagine using polarized light as the backdrop as well.
It's a transformer with a CNN refiner after it. Specifically, a ViT using the Hiera architecture (https://github.com/facebookresearch/hiera)
The Hiera ViT has dual decoder heads, one for the alpha and one for the RGB foreground, plus a small CNN refiner network to fix some artifacting in the output from the Hiera model.
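Roughly, that dual-head layout might look like the sketch below in PyTorch. To be clear, this is a hypothetical illustration, not Corridor's actual code: the layer shapes and names are invented, and it assumes the backbone returns full-resolution feature maps.

    import torch
    import torch.nn as nn

    class DualHeadMatting(nn.Module):
        # Hypothetical sketch: shared backbone (stand-in for a Hiera ViT),
        # one head for the alpha matte, one for the RGB foreground, and a
        # small CNN refiner to clean residual artifacts.
        def __init__(self, backbone: nn.Module, feat_ch: int = 256):
            super().__init__()
            self.backbone = backbone                    # assumed to output (B, feat_ch, H, W)
            self.alpha_head = nn.Conv2d(feat_ch, 1, 1)  # 1-channel alpha matte
            self.fg_head = nn.Conv2d(feat_ch, 3, 1)     # 3-channel RGB foreground
            self.refiner = nn.Sequential(               # small CNN cleanup pass
                nn.Conv2d(4 + 3, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 4, 3, padding=1),
            )

        def forward(self, x):
            feats = self.backbone(x)
            alpha = torch.sigmoid(self.alpha_head(feats))  # alpha in [0, 1]
            fg = self.fg_head(feats)
            coarse = torch.cat([fg, alpha], dim=1)         # (B, 4, H, W): RGBA
            refined = self.refiner(torch.cat([coarse, x], dim=1))
            return coarse + refined                        # residual refinement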
I'd be very interested to see a long form tech talk of Niko explaining his process of learning ML ropes and building this model.
It saddens me that we're wasting so much of that potential on those stupid stochastic parrots that solve all those non-problems that no one has ever had. It saddens me even more that so many people are absolutely sure that LLMs are "smart", or that they can "think", or even that they're somewhat conscious. And that even if they're not quite that, one more order of magnitude of scale will definitely give us an AGI. Oh that didn't help? Then one more, that will definitely be it.
One real problem that LLMs have solved is that they made natural language processing as a discipline obsolete. They also usually don't suck at summarizing long texts, except when they sometimes do. But that's it, really.
An alternative approach (such as the sodium vapor lighting used on Mary Poppins) is to create two images per frame -- the core image and a mask. The mask is a black and white image where the white pixels are the pixels to keep and the black pixels the ones to discard. Shades of gray indicate blended pixels.
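In code, applying that mask is just the standard "over" blend. A minimal numpy sketch (the array names are mine), assuming float images in [0, 1]:

    import numpy as np

    def composite(fg, bg, alpha):
        # White (1.0) keeps the foreground pixel, black (0.0) keeps the
        # background, and gray blends the two proportionally.
        alpha = alpha[..., None]  # (H, W) -> (H, W, 1) so it broadcasts over RGB
        return alpha * fg + (1.0 - alpha) * bg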
For the mask approach you are filming a perfect alpha channel to apply to the footage that doesn't have the issues of greenscreen. The problem is that this requires specialist, licensed equipment and perfect filming conditions.
The new approach is to take advantage of image/video models to train a model that can produce the alpha channel mask for a given frame (and thus an entire recording) when just given greenscreen footage.
The use of CGI in the training data allows the input image and mask to be perfect without having to spend hundreds of hours creating that data. It's also easier to modify and create variations to test different cases such as reflective or soft edges.
Thus, you have the greenscreen input footage, the expected processed output and alpha channel mask. You can then apply traditional neural net training techniques on the data using the expected image/alpha channel as the target. For example, you can compute the difference on each of the alpha channel output neurons from the expected result, then apply backpropagation to compute the differences through the neural network, and then nudge the neuron weights in the computed gradient direction. Repeat that process across a distribution of the test images over multiple passes until the network no longer changes significantly between passes.
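A minimal sketch of that loop in PyTorch -- the model, the L1 loss, and the optimizer choice here are illustrative assumptions, not what was actually used:

    import torch
    import torch.nn.functional as F

    def train(model, loader, epochs=10, lr=1e-4):
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for _ in range(epochs):                     # multiple passes over the data
            for frame, alpha_gt in loader:          # greenscreen input + CGI-rendered alpha
                alpha_pred = model(frame)
                loss = F.l1_loss(alpha_pred, alpha_gt)  # per-pixel difference from target
                opt.zero_grad()
                loss.backward()                     # backpropagate the differences
                opt.step()                          # nudge weights along the gradient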
But now that the problem is solved, a director will come along and say... I want a scene with a big glass of water, and the camera will zoom in on it and see the monster refracted through the glass.
That distortion to the new background would have to be added in by the artist.
They train their model in a pretty straightforward way, and it could also be used to capture the distortion: just use a non-monochrome (possibly moving) background optimized for this. It's a matter of effort and attention to detail during training (uneven green screen lighting, reflections, etc.), not fundamental impossibility.
Since they added it a year or so ago, it has been game-changing. I'm cutting out portraits every day, and having a magical tool that cuts out the subject, with perfect hair edges, in a single click is sci-fi.
Here's a demo of Photoshop's tool:
https://www.youtube.com/watch?v=SNVJN6PKeGQ
(the other magical Photoshop tool is the one that removes reflections from windows, which is even more insane when you reverse it and tell it you only want the reflection and not what's on the other side of the glass)
Still Python unfortunately.
https://en.wikipedia.org/wiki/ZCam (Demo: https://www.youtube.com/watch?v=s7Kcmx29RCE )
The problem is that the vast majority of people on set have no clue what is going on in post. To the point that, when the budget is big enough, a post supervisor is present on production days to give input so that "fixing it in post" is minimized. When there is no budget, you'll see situations just like in the first 30 seconds of TFA's video: a single lamp lighting the background, so you can easily see the light falling off and the shadows from wrinkles where the screen was just pulled out of the bag 10 minutes before shooting. People just don't realize how much light a green screen takes. They also fail to have enough space to pull the talent far enough off the wall to avoid the green reflecting back onto the talent's skin.
TL;DR They solved something to make post less expensive because they cut corners during production.
FWIW, having watched the entire thing, they never blamed bad production staff or unavoidable constraints. Those are things that anyone working with others experiences when making anything, whether it's YouTube videos or enterprise software products. My TLDR is: "Chroma keying is a fragile and imperfect art at best, and can become a clusterf#@k for any number of reasons. CorridorKey can automatically create world-class chroma keys even for some of the most traditionally challenging scenarios."
From their own 'LLM handover' doc: https://github.com/nikopueringer/CorridorKey/blob/main/docs/...
> Be Proactive: The user is highly technical (a VFX professional/coder). Skip basic tutorials and dive straight into advanced implementation, but be sure to document math thoroughly.
You don't hear architects get hounded for saying they "built" some building, even though it was definitely the guys swinging hammers who built it. Yet somehow, because he didn't artisanally hand-craft the code, he needs to caveat that he didn't actually build it?
theory: make the mask out of non-visible light
Illuminate the backing screen with near-infrared light. (After a bit of thought I chose near-IR, as opposed to near-UV, for hopefully obvious reasons.)
Point two cameras at a splitting prism with a near-IR pass filter (I have confirmed that such a thing exists and is commercially available).
Leave the 90-degree (unaltered path) camera untouched; this is the visible camera.
Remove the IR filter from the 180-degree (filter path) camera; this is the mask camera.
Now you get a perfect, non-color-shifting mask (in theory). The splitting prism would hurt light intake, though. It might be worth trying to put the cameras really close together, pointed in the same direction with no prism, and seeing if that is close enough.
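If that setup worked, turning the IR camera's frame into a matte would just be a normalization step. A rough numpy sketch, assuming an aligned 8-bit IR frame where the backing screen reads bright and the subject dark (the thresholds are made up):

    import numpy as np

    def ir_to_matte(ir_frame, lo=40, hi=200):
        # Bright IR = backing screen = transparent (0); dark IR = subject =
        # opaque (1). The soft ramp between lo and hi keeps partially covered
        # pixels (hair, motion blur) as gray blends instead of hard edges.
        ir = ir_frame.astype(np.float32)
        screen = np.clip((ir - lo) / (hi - lo), 0.0, 1.0)
        return 1.0 - screen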
The sodium vapor light process was the best tech available in the 1950s; sodium vapor lights were used because they deliver a very pure single-wavelength light. But we can do better now. LEDs natively emit at a single wavelength (we have to put a lot of engineering into making them not do this), and we have cameras that can see frequencies the eye cannot. Put this together and, in theory, you can do the single-frequency-illuminated backing sheet mask (green screen) with a frequency that is not visible to the human eye and therefore does not interfere with any of the colors in the final shot.
Camera sensors can pick up a little near-IR, so they have a filter to block it. If that filter were removed and a filter blocking visible light put in its place, you would have a camera that can only see non-visible light. Poorly, since the camera was not engineered to operate in this band, but it might be good enough for a mask. A mask that does not interfere with any visible colors.
At least for cheap sensors in phones and security cameras that engineering consists of installing an IR filter. They pick it up just fine but we often don't want them to.
Keep in mind that sensors are inherently monochrome. They use multiple input pixels per output pixel with various filters in order to determine information about color.
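For example, with an RGGB Bayer mosaic, a crude demosaic just collapses each 2x2 cell into one RGB pixel. A toy numpy sketch (real demosaicing interpolates instead of decimating):

    import numpy as np

    def naive_demosaic_rggb(raw):
        # raw: (2H, 2W) monochrome sensor values under an RGGB color filter.
        # Each 2x2 cell holds one R, two G, and one B sample.
        raw = raw.astype(np.float32)
        r = raw[0::2, 0::2]
        g = (raw[0::2, 1::2] + raw[1::2, 0::2]) / 2.0  # average the two greens
        b = raw[1::2, 1::2]
        return np.stack([r, g, b], axis=-1)            # (H, W, 3) RGB image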
The sensitivity to red light decreases quickly at wavelengths greater than 650 nm, but light can still be perceived if it is strong enough, up to around 780 nm.
Many so-called near-IR LEDs may actually be somewhere around 750 nm, so they are still visible on a dark background, even if they are perceived as extremely dim.
On the other hand, there are many near infrared LEDs around 900 nm and those are really invisible. Near-infrared LEDs around 1300 nm or around 1550 nm are also completely invisible.
An invisible near-infrared laser beam could become visible due to two-photon absorption, but if a beam intense enough to cause two-photon absorption hits your retina, there are more serious things to worry about.
Shoot the scene in 48 or 96 fps. Sync the set lighting to odd frames. Every odd frame, the set lights are on. Every even frame, set lights are off.
For the backing screen, do the reverse. Even frames, the backing screen is on. Odd frames, backing screen is off.
There you go: mask / normal shot / mask / normal shot / mask... you get the idea.
Of course, motion will cause the normal image and mask to go out of sync, but I bet that can be remedied by interpolating a new frame between every mask frame. Plus, when you mix it down to 24fps you can introduce as much motion blur and shutter angle "emulation" as you want.
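Demuxing such a capture would be trivial. A sketch with OpenCV, assuming the very first frame is a lit "normal" frame and the two streams strictly alternate:

    import cv2

    def demux(path):
        # Split a 48 fps alternating capture into a 24 fps image stream
        # and a 24 fps mask stream.
        cap = cv2.VideoCapture(path)
        images, masks = [], []
        i = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            (images if i % 2 == 0 else masks).append(frame)
            i += 1
        cap.release()
        return images, masks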
- It'll bleed on fast motion. Hair in the wind would just not work.
- Incandescent lights are out.
You could solve both by having the two ghost frames shot very close to the real frame (no need to evenly space the frames, after all) and strobing a high-powered laser.
You'd need a very fast sensor, or a second one in optically the same position.
In any case, if you actually have a scene bright for 1/24th of a second and then dark for 1/24th of a second, repeating, you're well within photosensitive epilepsy range. Don't do that to your actors unless you've discussed it with them and with your insurance company first.
( https://www.nukepedia.com/tools/gizmos/time/vectorframeblend... )
Incandescent lights flicker at twice your AC power frequency -- to a decent approximation their power is proportional to V^2, and sin^2 oscillates at twice the base frequency. But this is input power -- the cooling of the filament is slowish and the modulation depth is low. Most people aren't bothered by this.
Fluorescent lights with old or very crappy "magnetic" ballasts flicker at twice the mains frequency, with deep modulation. The effect on people varies from moderate to extremely unpleasant, and it's extra bad if anything is moving quickly (gyms, etc). There are even studies showing that office workers perform worse under such lighting even if they don't experience personally perceptible symptoms. The effect is so severe that people invented the "electronic ballast", which flickers at much, much higher frequency and avoids low-frequency components. Phew. (The light might still be a nasty color, but the temporal output is okay.)
"Driverless LEDs" are deeply modulated at twice the mains frequency. These are very nasty.
If you actually have a light that flickers at the AC power frequency (certain LED sources in a two-brightness diode-dimmed kitchen appliance fixture will do this, as will driverless LEDs with certain types of failures), then it's extra nasty.
There are plenty of people around who find (depending on the actual waveform) 60Hz flicker intolerable and 120Hz flicker extremely unpleasant. And there are plenty of people who can often perceive flicker under appropriate circumstances up to at least several hundred Hz and even into the low kHz with certain shapes of light sources. You can read up on IEEE 1789 to find a standard based on actual research on what lighting waveforms should look like.
The effect of 120 Hz flicker is bad enough that energy codes in some places (e.g. California) have started to require that LED sources minimize this flicker, but, sadly, it's poorly enforced.
IIRC, the end that's negative looks orange, because the electrons emitted from the filament haven't gotten up to speed yet and can't excite the mercury atoms at that end to the highest states.
If you didn't do this, you'd see 60 Hz strobing when you looked at one end.
Anyway, an old HN submission I still use when buying light bulbs: https://news.ycombinator.com/item?id=14023196
Artifacts?
> I bet that can be remedied by interpolating a new frame between every mask frame. Plus, when you mix it down to 24fps you can introduce as much motion blur and shutter angle "emulation" as you want.
Motion blur can also be very forgiving. You are more likely to notice artifacts in still or slow-moving scenes, and that's exactly where interpolation is easy, so the problem goes away.
There were a large number of lights around it and each one was blinked on for an instant while the camera shot at an insanely high frame rate - something like 288 frames per second with twelve lights.
This meant that after the fact you could pick any one of the twelve frames for that 1/24th of a second, to choose the angle the light was hitting at.
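Since 288 = 24 x 12, each 1/24 s window contains exactly one sub-frame per light, so choosing a lighting angle after the fact is just a strided slice. A toy sketch (the names are mine):

    def pick_lit_frames(high_speed_frames, light_index, lights=12):
        # high_speed_frames: frames captured at 24 * lights fps, where sub-frame
        # k of each group of `lights` was lit by light k. Returns the 24 fps
        # sequence lit from the chosen angle.
        return high_speed_frames[light_index::lights]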