Hacker News

New accessibility features powered by Apple Intelligence

251 points by interpol_p 4 hours ago | 127 comments

mohsen1 3 hours ago

Fun fact: This video was made accessible to sighted people because no blind person would ever listen to voice at that speed. Honestly if you ever observe a blind person using computers you'd be impressed by how they can listen to audio at unimaginable speeds.

asimovDev 3 hours ago

https://youtu.be/wKISPePFrIs?si=ahGfFp0U7-pTU9w6&t=43

My go-to example of this is this talk about Visual Studio by Saqib Shaikh (a blind software engineer at Microsoft). The link is timestamped.

isityettime 3 hours ago

I think it takes quite a lot of practice to reach this speed. It's not rare among blind developers, but I think it still takes a lot of work to get there. Pretty impressive!

I wish more people would watch videos like this just because having a realistic idea of how blind people do certain tasks can help you move from pity or even compassion to a more productive kind of understanding. I think sometimes when you haven't seen it, you can't really even imagine how it can be done.

Aboutplants 2 hours ago

I listen to a lot of podcasts and listen at 1.5-2.0 speed and it’s to the point that I literally cannot stand listening to 1.0 speed anymore as they go too slowly (depending on the content of course).

runjake 8 minutes ago

I am jealous. I can't listen and retain most podcasts at more than 1.0x. I even disable the podcast player functionality that eliminates pauses and silent sections.

simondotau 49 minutes ago

Same. Returning to 1x speed makes people sound (to my 2x-abused ears) drunk and slurring their words. If I want to listen to something slowly and carefully, I will just about tolerate 1.25x.

What really frustrates me is watching/listening to discussion of music, because I am forced to listen to the talking at 1x because the music sounds wrong (and is wrong) at anything other than 1x.

kevin_thibedeau 27 minutes ago

The funny thing is that slow talkers sound normal at 2x speed. It's jarring when you hear their actual speech.

BurningFrog 31 minutes ago

Playing music at 1x should be a pretty simple feature to add to those apps.

Ideally it should be done while encoding.

ebiester 47 minutes ago

I'm so glad YouTube and other podcast players have moved to support 3.0 speed. As I get comfortable with one, I move it up some. For things like sports and "did you know" content, I can go 2.5 if I'm not multitasking. For technical content, sometimes I'm stuck at 1.0.

kevin_thibedeau 25 minutes ago

You can get browser extensions to do it for all media controls on any site. YouTube's "Premium" for 3x is laughable when it's an internal browser function.

thrownthatway 4 minutes ago

That’s an amusing observation.

Likewise, YouTube’s “premium” feature of not displaying ads is laughable when displaying content is literally an internal browser function.

I pay anyway, because I was going to pay for an on-demand streaming music service anyway.

michaelbuckbee 45 minutes ago

Something that the Overcast podcast player does (and probably others) is silence removal, which in some ways is even better than the raw speedup.
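A minimal sketch of what silence removal means, assuming the audio is a plain list of float samples in [-1, 1] and that "silence" is any run of samples below an amplitude threshold. The function name, threshold, and run length are illustrative assumptions; this is not a description of how Overcast implements it.

```python
def shorten_silences(samples, threshold=0.02, max_quiet_run=3):
    """Cap every run of quiet samples at max_quiet_run samples.

    samples: list of floats in [-1.0, 1.0] (hypothetical mono audio).
    threshold: absolute amplitude below which a sample counts as quiet.
    """
    out = []
    quiet_run = 0
    for s in samples:
        if abs(s) < threshold:
            quiet_run += 1
            if quiet_run > max_quiet_run:
                continue  # drop the excess part of a long pause
        else:
            quiet_run = 0  # speech resumed, reset the counter
        out.append(s)
    return out


# A five-sample pause is trimmed to three samples; speech is untouched.
trimmed = shorten_silences([0.5, 0, 0, 0, 0, 0, 0.4])
```

Unlike a raw speedup, the spoken samples keep their original pace; only the pauses get shorter.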

thrownthatway 8 minutes ago

Except Marc Andreessen, I can’t decode his speech at 2x

Maybe it’s just a matter of practice.

miki123211 49 minutes ago

> It's not rare among blind developers

It's not rare among the blind in general.

Unless you're completely technologically illiterate, the kind of person who has no idea how to install an app or sign up for an online account, you're probably doing something of the sort.

gostsamo 2 hours ago

If you are dedicated, a few weeks to a few months of usage with regular ramp-up. You should be careful when adjusting which symbols are read, though, and sometimes the programming language matters, because different symbols have different significance for understanding the code.

dijit 3 hours ago

Ho-ly cow. That is very impressive.

I'm not even sure what to say, but discoveries like this are why I use hackernews, I'd never have known this otherwise.

miki123211 47 minutes ago

To be fair, the acoustics of the room that talk was given in are... not too great, to put it mildly.

I can easily understand Eloquence (the speech synthesizer he's using) at that speed, but I struggled a bit with this one.

thrownthatway 10 minutes ago

Twenty years ago I took a level 1 tech support call from a visually impaired guy, and it took about 3.2 seconds to realise his condition was no impediment to using a computer because of the screen reader tech he was using.

throwatdem12311 2 hours ago

I did IT for a community center way back in the day and the director was blind. I was blown away by how fast his screen reader read things out to him - completely incomprehensible to me - and his efficiency with keyboard shortcuts would put even vim/emacs elitists to shame.

miki123211 36 minutes ago

The way (Windows) screen readers handle web navigation is basically Vim in disguise.

You have two modes: "focus mode", where you can edit text in text fields and keys are passed straight to the browser, and "browse mode", where keys move a virtual cursor around the page.

In browse mode, navigating with just arrow keys all the time would be just as slow as you might imagine, so you use single-key keyboard shortcuts to move by role, e.g. to the next heading, button, table or unvisited link.

The keyboard layout is optimized for memorizability rather than efficiency (you use the actual arrow keys instead of hjkl, for example), but the concepts are eerily similar.

There are a couple of other approaches to solving this problem (macOS's VoiceOver is much more Emacs-like, for example), and each approach has its own pros and cons, but that's definitely one way to do it.
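The two-mode idea can be sketched as a tiny key dispatcher. This is a toy model under stated assumptions: the class name, the key bindings in `QUICK_NAV`, and the "enter on a textbox switches to focus mode" rule are all hypothetical and are not taken from any real screen reader's code.

```python
from dataclasses import dataclass, field

# Hypothetical single-key bindings: key -> role to jump to.
QUICK_NAV = {"h": "heading", "b": "button", "t": "table"}


@dataclass
class VirtualCursor:
    elements: list            # (role, text) tuples in page order
    position: int = 0
    mode: str = "browse"      # "browse" or "focus"
    typed: list = field(default_factory=list)

    def press(self, key):
        """Handle one key press and return what would be spoken, if anything."""
        if key == "escape":
            self.mode = "browse"
            return "browse mode"
        if self.mode == "focus":
            # Focus mode: keys go straight to the text field.
            self.typed.append(key)
            return None
        if key == "enter" and self.elements[self.position][0] == "textbox":
            self.mode = "focus"
            return "focus mode"
        role = QUICK_NAV.get(key)
        if role is None:
            return None
        # Browse mode: jump to the next element with the requested role.
        for i in range(self.position + 1, len(self.elements)):
            if self.elements[i][0] == role:
                self.position = i
                return f"{role}: {self.elements[i][1]}"
        return f"no next {role}"


page = [("heading", "Title"), ("textbox", "Search"),
        ("button", "Go"), ("heading", "Results")]
cursor = VirtualCursor(page)
spoken = [cursor.press("b"), cursor.press("h"), cursor.press("h")]
```

The same `h` key either jumps to a heading or types a letter depending on the mode, which is the Vim-like part.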

UltraSane 3 minutes ago

I briefly worked at a boiler call center and I would hear supervisors listening to recorded calls at warp speed.

isityettime 3 hours ago

Probably because it's an advertisement, and super fast robot voices can feel extremely harsh and annoying. Even blind people who rely on them find them overstimulating sometimes, lol.

satvikpendem 13 minutes ago

I listen to a lot of podcasts and YouTube videos at 3x or 4x speed now, having slowly built up the skill over a few years. It's pretty nice now and saves time, and it's remarkable how well the human brain can adapt to such input.

a012 7 minutes ago

I’m the opposite, I can’t stand the fast speaking videos. But I also speed up 1.2x to 1.5x if the videos were too slow.

brador 12 minutes ago

You recall nothing and you know it. You're just wasting time you could use for something useful or meaningful in your life. Kids call it "Anxiety cope" but I don't agree.

RobMurray 3 hours ago

I know plenty of blind people who have their voice speed unbearably slow and barely scratch the surface of what technology can do for them. I think an interface where you can tell your phone what to do in natural language will really help a lot of less technical people.

I'm not getting my hopes up though, given Apple's history with Siri, which is truly awful.

chipotle_coyote 2 hours ago

Apple's history with accessibility is, on the whole, pretty good. I strongly suspect that the "coming soon" part of this means "after we integrate Google Gemini models into the system," so I don't think you should use the current state of Siri as a yardstick. (I actually have decent luck with the current Siri, but I don't push it very much and have sort of adapted myself to its limitations; on the flip side, I have a lot of skepticism around LLMs, but they're really a quantum leap in natural language processing capability over what came before, and the use cases they're showing here seem to be right in the LLM wheelhouse -- with the asterisk of "you're still always going to have to check its work.")

miki123211 32 minutes ago

Coming soon very likely means iOS 27.

This has been the typical pattern for Apple for the last few years. The flashy features are announced at WWDC, accessibility has a dedicated, earlier press release. Before this practice, accessibility announcements would usually be tucked in some WWDC slide that most people wouldn't even notice.

Barbing 18 minutes ago

The thing that disappointed me about this amazing announcement was “coming later this year”. They should probably give us dates for a little while at least until we get the (<)$95 checks.

I just would not wanna promise anything. Except “available for download this Friday” once the gold master is passing tests.

isityettime 2 hours ago

Whenever my sister (blind) and I (visually impaired) visit my mom (blind) we secretly turn up the reading speed on her TV just a little because we can't stand how unbearably slow she keeps it, but if we turn it up quickly, she'll freak out.

After a few more years of Thanksgivings and Christmases and Mothers' Days, we'll finally train her up to a reasonable speed lmao.

kridsdale1 38 minutes ago

This is heartwarming. The audio equivalent to the practice of sighted people fixing the bad default settings on boomers’ televisions each Thanksgiving.

freedomben 2 hours ago

Indeed, and not just fast, but often heavily robotic (which many sighted people struggle to understand even at 1.5x). I remember reading about a blind person who learned how to do echo-location using sound, and it seemed like such a cool superpower, that one of these days I'm going to take the plunge and unplug my monitor and start learning how to really use the tools. I worked with a blind person a few years back who got almost double the battery life from his laptop as the rest of us by having the screen off all the time, so that alone would be a nice feature. I may never get to the epic level of echo-location, but if I get even half-way there it would be awesome. With a bonus of being able to actually QA a11y changes.

Barbing 15 minutes ago

> blind person who learned how to do echo-location using sound

RIP kid https://youtu.be/fnH7AIwhpik

embedding-shape 3 hours ago

> Honestly if you ever observe a blind person using computers you'd impressed how they can listen to audio at unimaginable speeds.

Even better, fire up Orca (or whatever screenreader application your OS comes with) yourself and try to use your computer with your eyes shut. It's kind of eye-opening (no pun intended) to find out what kind of experience these users typically get. And also, you quickly start to understand why they set the speech rate for their voice synthesizer to be so fast, it's almost unbearable navigating applications (and particularly lists) otherwise.

jchw 3 hours ago

When I was at Google, I'd periodically test our (internal-only) app with Chromevox with the display off. It's not that it sounded like it would be easy, but it really is a challenge, and I can only imagine the muscle memory built up over time of trying to work around accessibility bugs and strange behaviors.

Unfortunately it seems impossible to get all that much funding for accessibility work :/ I wonder what ever happened to the Newton accessibility bus intended to supplement Wayland...

embedding-shape 12 minutes ago

> I wonder what ever happened to the Newton accessibility bus intended to supplement Wayland...

Hm, never heard about it, but now I'm wondering too. I just finished implementing proper accessibility support for my native app toolkit for Linux, macOS and Windows, but I've only done it for X11 so far; I was just about to get started with Wayland. What is the accessibility story on Wayland? Couldn't people rely on the same protocols as with X11? That was my impression, but I haven't really dug into it yet.

miki123211 29 minutes ago

The muscle memory build-up is definitely real.

There are apps I use semi-regularly that less-experienced screen reader users thought were inaccessible, and I couldn't even explain what they were doing wrong from memory. The ways of working around accessibility issues are just so ingrained in me that all I can usually remember is "yeah I did this somehow, but it was six months ago and I have absolutely no idea which specific tricks I needed for this one."

kridsdale1 36 minutes ago

I’ve worked at Apple, Facebook and Google. Apple was the only one that made a11y bug review and a face-to-face consultation with a blind developer (to show you how your app sucked) mandatory before you could launch.

seviu 3 hours ago

That time my Mac display broke and I had to log in taught me a lot about how important learning accessibility is, even for non-blind people.

isityettime 3 hours ago

> you quickly start to understand why they set the speech rate for their voice synthesizer to be so fast, it's almost unbearable navigating applications (and particularly lists) otherwise.

I imagine that for coding it also helps deal with the fundamental problem of an ephemeral stream rather than a persistent document that you can navigate visually in multiple dimensions. Working memory is limited, and getting more text in in a short period of time probably helps you work within that better. I also imagine that working with text via audio all the time gradually stretches and improves memory.

miki123211 21 minutes ago

It's not the ephemeral stream that's the problem, it's the limited bandwidth.

You can show a lot more info on a screen than you can transmit through speech in a short period of time. That doesn't mean you read faster than you listen, just that sighted people essentially use their eyeballs as an "input device" to decide what information to look at.

If there's an object on the screen that you want to examine but that you don't need to click, you can just "navigate to it" with your eyeballs, without ever touching a mouse or keyboard. We don't have that luxury.

This means we need a much more efficient system for navigating what's on the screen, but that only gets you so far. Eventually, the easiest way to deal with this problem is just to increase the bandwidth of your channel, and you do that by increasing the speech rate.
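A back-of-the-envelope version of the bandwidth argument, assuming a baseline speech rate of 150 words per minute at 1x. That baseline and the function name are illustrative assumptions, not measurements.

```python
def seconds_to_hear(word_count, speed_multiplier, base_wpm=150):
    """Time needed to listen to word_count words at a given speed multiplier.

    base_wpm is an assumed 1x speech rate, chosen only for illustration.
    """
    words_per_second = base_wpm * speed_multiplier / 60
    return word_count / words_per_second


# A 300-word screenful of text, heard at 1x versus 4x.
at_1x = seconds_to_hear(300, 1)
at_4x = seconds_to_hear(300, 4)
```

Under these assumptions the same text takes two minutes at 1x and half a minute at 4x, which is why raising the speech rate is the most direct way to widen the channel.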

ShinyLeftPad 3 hours ago

Blind people can't change video speed? The control is available right there.

kochb 2 hours ago

Yes, the audio speed can be adjusted.

Whether that control you see visually is actually accessible to a blind user is a different matter entirely. Further, it maxes out at 2x, but a blind person would typically screen read at the equivalent of 3-6x.

ShinyLeftPad 2 hours ago

Huh, 2x is low even sometimes for sighted people.

Related, it seems like YouTube recently paywalled speed increase beyond 2x. Another way in which it's not cheap to lose sight, I guess.

entrope 47 minutes ago

> Another way in which it's not cheap to lose sight, I guess.

Seems like it would be a win-win to have a user setting to opt out of video in exchange for ungating that feature.

the_other 2 hours ago

> Another way in which it's not cheap to lose sight, I guess.

True.

We can frame it even more strongly: "default societal practices actively discriminate against people with disabilities; they intentionally, consciously choose to make life harder for people who're disadvantaged".

jofzar 3 hours ago

No, they are saying that the TTS audio a blind person would use plays at something like 2.4x the speed of what's in the commercial.

ShinyLeftPad 3 hours ago

I don't get it. The speed of TTS can be adjusted, right?

Pretty sure there's enough blind people who don't listen to voice at insane speeds, because they listen in their non-native second language or for whatever other reason. What's wrong with using the lowest common denominator that's 100% accessible to those people as well as people who want faster speeds? Unlike "too fast", "too slow" doesn't become entirely inaccessible, it's just boring.

Such a random thing to criticize.

superchink 3 hours ago

I don’t think it’s meant to be criticism. It’s an interesting piece of information that gives a peek into how those with vision impairment consume content. There’s nothing wrong with it; but it was enlightening to consider the experience for those of us who have not been forced to.

ShinyLeftPad 2 hours ago

Seems like I brought my own negativity into this...

hombre_fatal 53 minutes ago

I don't think you did.

Some blind people listen to things at superhuman speeds, but not all blind people. Using a normal reading speed is a sensible choice for an ad trying to appeal to blind people since you don't want to intimidate those who don't use superhuman speeds.

Going from that to "heh a sighted person made this because it's normal speed" is simply incorrect.

It was the sort of statement an HNer might make to showcase some trivia they have about some other group, but they oversold it.

isityettime 3 hours ago

> Pretty sure there's enough blind people who don't listen to voice at insane speeds, because they listen in their non-native second language or for whatever other reason.

Yes, for lots of reasons. It takes practice to get up to a high speed with a given TTS. People who go blind later in life are just beginning, and it can take a long time for them to get up to really high speeds. You may also need to reset somewhat when you change from one TTS to another. And blind people's ears are subject to problems just like anyone else's; if your hearing isn't great you may need slower speeds or higher volumes or both. That's why even though most people use screenreaders at much higher speeds, the defaults when you turn on a new device are painfully slow. You have to set a conservative default so people with less experience/worse ears/whatever can get by.

Anyway I don't think it's a criticism. It's just noting that it doesn't depict how most people will end up using it, and if you're curious about what typical usage sounds like, you should look for another example.

stavros 2 hours ago

No. It's not criticism. What they're saying is that the video was shot with a default that a sighted person could understand, because any blind person would naturally have their speed set to much higher than that.

It's like how in videos that teach people a foreign language, everyone speaks slowly and uses simple words, even though native speakers don't talk like that at all. The GP is simply saying that an actual blind person would be way more efficient at it, but they made the video with inefficient settings so sighted people could understand what was going on.

bitwize 3 hours ago

I've heard textual description tracks on television programs before. They come fast, but not screen-reader fast. To the untrained ear a blind person's screen reader sounds like when you somehow get the TI-99/4A's speech synthesizer to read from invalid memory.

isityettime 2 hours ago

The audio description tracks are a different genre from what screenreaders perform. They're acting, by actors, carefully written and performed to fit into the gaps in the dialogue while preserving the mood and flow of the show. I think speeding them up or making them robotic would ruin them, while both of those traits are actually desirable for screenreaders.

Sweepi 3 hours ago

Don't you worry, as a sighted person I am also infuriated by Apple's slooow reading speed, e.g. for "Announce Notifications".

hightrix 54 minutes ago

Also as a sighted person, this is why I hate the modern trend of using the video format to show 3-4 bullet points. Just give me the text.

brightbeige 2 hours ago

A while ago I signed up as a sighted person on Be My Eyes. I didn't get as many calls as I had hoped, but I was glad to help out on the few that I could. One call was to read envelopes of incoming mail, another was to read pill bottles, and then there were the two funny guys in big cozy chairs with shopping bags of cereal boxes who wanted to know what was what. I remember one guy really didn't like one type. The app had a unique feature for the sighted person to turn on the camera of the vision impaired person.

https://www.bemyeyes.com

postalcoder 2 hours ago

One thing Apple really needs to get right is speech to text transcription. They've nailed accessibility in so many ways and yet it feels like they're a decade behind on properly transcribing voices. At least half a decade.

Input on the iPhone is so dreadful nowadays. Their palm rejection is definitely worse than before, so mistyping is more frequent. Their text-correction algorithm for typing is worse than before, and it frequently makes incorrect corrections to words that I don't notice, because they change words a few words back from where I typed. And STT hasn't improved. On top of that, my fingers are tired of the phone form factor. Please make the iPhone not a chore to use, Apple.

terabytest 2 hours ago

Wispr Flow is a masterclass in STT. Apple's solution feels like it's from the last century in comparison. Same applies with Apple's TTS when you have ElevenLabs and OpenAI running laps around it. All I need is for my iPhone to do those things natively at the same quality level (because in Apple's walled garden that's the only way to get them usable everywhere).

jjice 44 minutes ago

But Apple's uses so few system resources and runs fully on device on newer iPhone models (16+ I believe). It's so efficient. I really enjoy using Handy with Parakeet as the model, but the system resource usage is a monster compared to Apple's (although very good).

Looks like Wispr Flow uses a cloud model [0]:

> Cloud based speech processing infrastructure for 1B users

It gets to be a messy comparison because my iPhone can do STT with no latency pretty well fully on device, but Wispr Flow requires a cloud model, but to be fair, older Apple devices do as well. It's not an apples and oranges comparison, but I think those technical details make this a non direct comparison in a few ways.

For on-device with low system resource usage, Apple's is pretty damn good.

[0] https://wisprflow.ai/post/technical-challenges

adamcharnock 2 hours ago

FWIW - I also really like Wispr Flow, but I moved to running the 'Whisper Large' model locally using Handy (https://github.com/cjpais/Handy), which has been essentially as good, while also having lower latency.

throw03172019 2 hours ago

I use Aqua Voice because Apple STT is so frustrating.

nechuchelo 3 hours ago

This looks like a genuinely useful application of LLMs.

I wish more companies focused on how they can help humans instead of replacing us or squeezing us as hard as possible in the name of productivity.

c0wb0yc0d3r 3 hours ago

I think we should reserve judgment until this lands in the hands of the people it helps.

My experience is limited to my elderly parents who have trouble seeing. With the text size Apple allows them to set it to, their phones are unreadable. Text runs off the screen in every app, 1st and 3rd party.

In their bill example, the user is told to confirm with the provider. Why not offer to call the number on the bill? Instead of telling them to use text detection, do it for them? Presumably Apple Intelligence would already have that capability. I’m afraid this will be a gimmick at best.

EDIT: Forgot to mention, the grip is good to see. Hopefully they don’t charge the Apple tax on it.

kps 4 minutes ago

Yeah, I used to use iOS with text one step above the default size, and text was often cut off.

I have a problem with astigmatic halation that makes ‘dark mode’ difficult to read. Since iOS 26, multiple aspects of the system have been made dark only, contrary to the system setting. Writing text correctly should be the lowest of low-hanging fruit.

I suspect this is more of a flashy ‘AI’ promotion rather than reflective of any real commitment.

tiffanyh 56 minutes ago

This is what Apple does best.

They treat new industry advancements as technology, not as products in themselves.

AI will be a feature to improve the customer experience, not the product itself.

lern_too_spel 38 minutes ago

These features have existed on Android devices for years. What Apple does best is marketing.

https://blog.google/products-and-platforms/platforms/android...

https://android-developers.googleblog.com/2024/09/talkback-u...

bsanders343 3 hours ago

I agree. There seems to be a lot of potential in this space (from my outsider view). I really hope that this issue from an earlier article (https://news.ycombinator.com/item?id=48178378) doesn't become common enough to make useful functionality like this a danger. Seems unlikely in the short term but as use cases grow, so might the bad actors.

koolala 3 hours ago

It's with their servers, right? Do they trust an iPhone with their life? Or are they trusting their data center?

nechuchelo 3 hours ago

Looks like some of the features might use on-device models. They mention subtitle generation works on-device.

micromacrofoot 2 hours ago

"looks like" there are a lot of automated accessibility systems that fall woefully short in practical use

this sort of thing really needs input from someone that uses it before we can judge it

bilbo0s 3 hours ago

Let's be honest, compare the amount of money a corporation can make helping visually impaired people to the amount of money they can make replacing software developers and financial analysts.

Don't get me wrong, Apple using these technologies to help humans who are in need of help is laudable. But let's not pretend we don't know why most corporations don't look into this kind of thing. I think if we're being honest, we all very much know why they leave this sort of thing to the always nebulous "others".

JimDabell 3 hours ago

Tim Cook has been pretty clear where he stands:

> “When we work on making our devices accessible by the blind,” he said, “I don’t consider the bloody ROI.” It was the same thing for environmental issues, worker safety, and other areas that don’t have an immediate profit. The company does “a lot of things for reasons besides profit motive. We want to leave the world better than we found it.”

— https://www.forbes.com/sites/stevedenning/2014/03/07/why-tim...

bilbo0s 2 hours ago

Again, it's absolutely great that Apple does these things!

I was just answering the question of why other corporations don't.

Money.

There's relatively little money in helping the visually impaired. You have to do it because you want to do it. Not because you're going to get rich.

lern_too_spel 36 minutes ago

Apple's competitors have had these features for years (Android for 7, Windows for 1), so it's really an indictment of Apple. They give lip service to helping the visually impaired, and this press release is good marketing for the non-visually impaired people who don't know this.

lotsofpulp 3 hours ago

>But let's not pretend we don't know why most corporations don't look into this kind of thing.

I assume almost everyone looks into spending less money than more money for equivalent goods and services.

jeffbee 2 hours ago

Aren't the LLM-based features of this announcement catch-up features? Describing the contents of the screen is something Gemini has been doing on Pixel phones for a while. It's a fairly obvious use case for a multimodal AI.

My one hope is that this eventually becomes widespread enough to stop alt text scolds.

happyPersonR 19 minutes ago

A lot of us forget it, but things like text to speech, subtitles etc are there for the differently abled

Without that, there wouldn’t really be great vlm and conversational models.

The AI companies might have paid for the dictation of some videos on their own but voice assistants etc wouldn’t have existed and our ability to have AI that eventually understands the world would be much much harder.

nonethewiser 2 minutes ago

So we're blaming disabled people now.

runeks 3 hours ago

> The total amount due on the bill is $83.89. Please verify this amount with your utility provider or by using Text Detection before making a payment.

1. Use AI to determine how much a bill is for

2. Call up the people who billed you and ask them how much they billed you

3. Pay billed amount

dewey 23 minutes ago

Once you've paid the same bill for a few months, you'll know roughly how much your phone bill will be and you won't have to verify it. They obviously have to put that line in there, just like ChatGPT saying "Please verify everything we tell you" in the footer.

tramc 3 hours ago

It’s still useful to get the information instantly and verify it later. Arguably asking someone you trust to read the number for you might be a better idea than calling the company. Not everybody has that option though.

Someone 3 hours ago

And not everybody wants to use that option all the time. Asking a human makes you feel dependent more than using a tool does.

kotaKat 2 hours ago

Aaaand the logistics of making that call to the company to confirm the amount on the bill can get awkward. IVR and hold-time hell just to get a human to have to explain your predicament as to why you're asking for such a mundane piece of information that was in fifty other touchpoints that you couldn't access as quickly or easily.

(I'm also picturing the poor CSR at the other end of the phone wading through hundreds and hundreds of call logs over the years for simple requests and managers up above screaming 'why is this guy calling us all the damn time costing us money'...)

Darwins_Toffees 3 hours ago

"Vehicle Motion Cues come to visionOS, which can help reduce motion sickness for people who use Apple Vision Pro as a passenger in a moving vehicle. Vision Pro will also support face gestures for performing taps and system actions, plus a new way to select elements with one’s eyes while using Dwell Control."

Maybe just don't wear them in a car?

dmix 3 hours ago

Wearing a headset in the back of an Uber doesn't sound that crazy,

I use those motion cues on my iPhone even though I don't struggle with motion sickness https://www.youtube.com/shorts/OxbjggMcKrk

nozzlegear 2 hours ago

I use them as well. I'm usually the driver so I don't typically look at my phone while the car is moving, but I recently rode along with a family member to an event. They handed me their iPhone to look at something and I felt totally disoriented trying to look at a moving screen in a moving car. I had to resist the urge to turn on the motion cues.

caiusdurling 56 minutes ago

It's really useful for having a decent screen up in front of me when I'm a passenger trying to do something on the laptop. Saves staring down at my lap, and removes any motion on my screen from the peripheral view of the driver.

Still somewhat odd when a bus drives out from behind your Terminal, mind.

jclardy 53 minutes ago

Planes? Trains? If you haven't used these motion dots, they actually do work wonders. My wife gets motion sickness and could barely ever look at her phone when riding as a passenger in the car, even just to type in directions. With the motion dots she does just fine.

brookst 3 hours ago

Trains are a thing.

kridsdale1 33 minutes ago

Trains. Airplanes. TFA said vehicle, not car.

yreg 2 hours ago

>Maybe just don't wear them in a car?

Why not?

throwaway132448 56 minutes ago

Because the more we reject our shared reality and substitute it with each our own, the less humane we become.

jkman 52 minutes ago

God forbid a person rejects the shared reality of a boring 12 hour flight and substitutes it with their own. Some real deep thoughts here

throwaway132448 50 minutes ago

I’ve met some very interesting people on flights. I’ve done some great work. I’ve had some great ideas.

Don’t be so scared of variety. You just keep subjecting yourself to more of the same. The unending familiarity makes you dull.

yreg 2 hours ago

It's a shame Apple removed the screen reader announcements ("the Apple logo") from the YouTube version of the commercial.

https://www.youtube.com/watch?v=B3SmsSCvoss

Those made the ad stand out in my opinion.

randusername 3 hours ago

Accessibility features are such a great way to keep technology focused on real-world problems and real-world experiences.

I think the trap in creating anything is doing it for a crowd. Art, software, anything... it turns out better when it is made with a specific, named individual in mind.

Accessibility features are almost always championed and field-tested with one specific loved one in mind and I think that's what keeps the technical solutions personable and grounded.

Almondsetat 3 hours ago

I have difficulty trusting this. There are plenty of videos online of LLMs making up stuff like "I just ate a hot dog, is there mustard around my mouth?" "No, everything is clean" while there is a big yellow stain on the guy's face

WarmWash 2 hours ago

The problem is using a language model to assess images.

Probably 80% of "LLMs are below expectation" complaints (from the general population) involve some form of image analysis.

Image tokenization is hard because unlike language tokenization, where every token is extremely dense with meaning, image tokens tend to be meaningless or irrelevant but are processed all the same.

Give an SOTA LLM a picture of toothpicks and ask it to move one to make a square, and it will probably struggle and fumble it. But give a mid-size LLM from 2 years ago the same problem in verbal form, and it will nail it almost every time.

The takeaway is, do everything you can to avoid having the LLM need to rely on images for the answer.
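To put a rough number on the image-versus-text gap, here is a minimal sketch. It assumes a ViT-style tokenizer that cuts the image into fixed 16x16 pixel patches, one token per patch; real multimodal models may resize, tile, or pool differently, so treat the counts as illustrative only. The function name `patch_token_count`, the 1024x768 image size, and the puzzle wording are all made up for the example, and the word count is only a crude stand-in for a text token count.

```python
import math


def patch_token_count(width: int, height: int, patch: int = 16) -> int:
    """Count fixed-size patches needed to cover an image.

    Assumption: one token per patch, with partial patches at the edges
    rounded up. Actual vision encoders may differ.
    """
    return math.ceil(width / patch) * math.ceil(height / patch)


# Hypothetical photo of the toothpick puzzle.
image_tokens = patch_token_count(1024, 768)

# The same puzzle stated verbally (hypothetical wording).
puzzle_text = (
    "Four toothpicks form three sides of a square; one lies apart. "
    "Move one to close the square."
)
# Whitespace-separated words, used here as a rough proxy for text tokens.
word_count = len(puzzle_text.split())

print(f"image patches: {image_tokens}, words in text version: {word_count}")
```

Under these assumptions the photo costs thousands of patch tokens, most of them covering background, while the verbal version fits in a couple of dozen words.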

gruez 34 minutes ago

I thought all the recent models are "multimodal"? Is the image part just sticking an image recognizer in front of the text model?

postalrat 3 hours ago

Like coding, creating images or text, maybe the alternative of doing it yourself is too easy or enjoyable for you. Don't expect that will be true for everyone.

Almondsetat 3 minutes ago

Did you reply to the wrong person? What are you even trying to say here?

gobdovan 2 hours ago

I'm not blind but sometimes I can't process where things are, even if they're in front of me. Would be cool to just point to a messy table and see where the keys are. If they offer this as some Vision/Core ML feature, I'd implement the messy table app as soon as these features land. Probably already possible, but simpler if they release this.

abhikul0 3 hours ago

On-device video subtitle generation is exciting, should help with watching videos on mute. This seems like low-hanging fruit that should've already been grabbed by an app but I can't find any.

mistersquid 2 hours ago

> A new power wheelchair control feature leverages the precision eye-tracking system on Apple Vision Pro to offer a responsive input method for compatible alternative drive systems. [0]

The above caption for Apple Vision Pro is for a video that to me, as an Apple Vision Pro user, is discomforting.

More questions are raised than are answered by the short video: Is the user able to fit the Apple Vision Pro by him/herself? What happens when dwelling on a directional control misregisters? Can the user recalibrate the "Eyes and Hands" setting? Dwelling on a control displaces focus and there may be impeding objects in the path of the power wheelchair. Is this really a good idea?

To my sensibility, the video is unsettling (at best), especially given how cumbersome Apple Vision Pro is.

[0] https://www.apple.com/newsroom/2026/05/apple-unveils-new-acc...

jkman 44 minutes ago

Your concerns are completely nonsensical. It's clearly being marketed as a healthcare tool for people with debilitating injuries that preclude the use of hand-powered wheelchair controls, severe situations where there's no neck-down control and users would be limited to controls like head-tilt or mouth actuated systems. These people obviously require daily care to simply get them out of bed and into the chair and back again every single day - their nurse could just put on their Vision Pro for them! This seems like an incredible leap forward for people in this situation, if they iterate on this and it gets better then this could be a very viable wheelchair control system in the future.

zersiax 2 hours ago

Honestly as a blind person and blind developer myself, most of these features get a shrug at best. For one, there's already a bunch of third-party apps that do most if not all of this (Seeing AI, Envision AI, BeMyEyes, Aira, etc.). So at best, this does what all those apps are doing but faster and on-device, which may or may not mean it is also more inaccurate, we'll have to see. In the meantime, macOS's screen reader, VoiceOver, has been left to essentially exist in maintenance mode for years, where users have had to build, arguably impressive, third-party solutions to add features to the thing that comparable screen readers on Windows have had for a really long time.

Through that lens, this all looks a bit performative to me, but again, maybe I'll be pleasantly surprised.

The one thing I'm mildly excited to see is the improvement to Voice Control, as guessing what the programmatic name of a button is or having to constantly use a numbers grid to target elements doesn't sound fun.

To respond to what I see in some of the comments:

- On speech rate: It does take quite a bit of practice to crank up the speech rate and there's a degree of retraining you need to do when you switch voices. A lot of more "human" sounding voices are harder to follow at super high speeds which is why a lot of people prefer more robotic but consistent speech and generally aren't convinced by AI-powered TTS yet; they often fall apart if you raise the speech rate past a certain point.

- Re: actually waiting for the target audience's verdict: This is so important. I see more and more companies, individuals etc. talk about accessibility, build accessibility solutions and evangelize AI for accessibility without EVER talking to the people they claim to help. This will almost certainly mean mistakes will be made, up to and including doing more harm than good. If you want to do accessibility right, that includes AI products of any kind, hire people with lived experience or you'll get the equivalent of machine-translated text, hackerproof security in one click or an AI-powered coffee bar that orders thousands of rubber gloves.

Coincidental note: I have time for new projects right now :P

dgllghr 2 hours ago

Putting aside the fact that no company should have direct access to anyone's brain, how cool would it be to be building toward VISOR (from TNG) instead of this. If we could translate sensor signals to the neural circuitry of the brain directly, we wouldn't even need an LLM in the mix. But to have it as an overlay, as supplementary data! With the ability to turn it off of course. (Would a person even be able to turn it off? In the same sense as whether someone can "turn off" social media?) If only we had meaningful human rights and institutions that really protected them... I still can't fully give up the techno-optimism that made me love tech in the first place (and TNG for that matter).

dagmx 43 minutes ago

Brain Control Interface support was already announced last year and afaik is part of iOS already.

https://developer.apple.com/documentation/accessibility/brai...

exitb 3 hours ago

As Apple shifts towards services and fancy software features, I wonder how they expect to stay competitive by only releasing them for a subset of languages.

layer8 3 minutes ago

They roughly know how many of their users use a particular language.

seeeeebt 2 hours ago

Surely a blind person relies a lot on audio input?

isityettime 2 hours ago

Maybe on a smartphone, but usually not on a computer. Keyboards are pretty good.

The other thing is that if you're around others, voice input means you have no privacy. Even if you're not doing anything particularly private, it's a bit awkward and potentially embarrassing. If you use touch input in conjunction with a screen reader, you can be more like a "normal" user in that what you're doing is just between you and your phone.

devinprater 3 hours ago

There's my dopamine hit for the year.

jrm-veris 2 hours ago

this is such a great use case for the technology

nikhilpareek13 2 hours ago

Most apps have terrible accessibility labels because developers don't bother, which breaks every screen reader pipeline downstream. The Voice Control "say what you see" feature routes around that by letting users describe a button in plain language. That's a real fix for a problem caused by humans being lazy about a11y.

jansan 3 hours ago

Since Apple uses Gemini to power its AI, are those features actually powered by Google Gemini?

jjice 3 hours ago

They don't yet, but they will be using Gemini-derived models with iOS 27. For now it's all their own models.

k4rnaj1k 3 hours ago

[dead]

tekacs 3 hours ago

I'm super glad that they're doing this, but once again unexcited for another decade of Apple self-privileging on this stuff so they're the only ones allowed to touch or improve any of this surface, or UX outside an app's tiny box.

People talk a lot about how macOS has gone downhill but I feel like it would have been a good start if developers could continue to patch over Apple's shortcomings like they used to be able to.

I imagine that we would be a few years into a spectrum of tools like this if they didn't lock it down like they do.

Totally aware that plenty of HN commenters are very glad that Apple keeps this locked down. I'm just the other opinion, that's all.

MagicMoonlight 2 hours ago

And this is why androidlets will never win. They’re too busy selling your data to ever think of disabled people or usability.

iOS is just painfully good. I can pause a video, put my finger on text inside the video, and copy it. Until they added it, I didn’t even know how much I needed that.

f33d5173 2 hours ago

Until they added it, you didn't need it, then suddenly a phone was unusable without it.

NicuCalcea 2 hours ago

I can do that on my Pixel 6.

baxuz 3 hours ago

Now we know why the new AirPods will have cameras!

testfrequency 3 hours ago

I don’t want to discredit more advancements in accessibility, but this feels like accessibility porn.

I have fond memories of an old coworker 10 years ago who is blind. He would use his phone no problem, texting, going about his day, he was even on Tinder (credit to Tinder for making their app so accessible long ago). He would commute on his own, walk to the train station, even transfer to another train during peak rush hour. I’m not saying it was all easy for him, but nothing in this video really stood out to me more than what shirt was on the bed. I know other services/apps have long existed to be the “eyes” for people who need support, but this video feels….uneventful?

I may be cynical about this though, as I often hate how Apple’s marketing makes these emotional bids about how life-critical they are to society - which is fair to a degree..but it just feels cheap to be glamorising “look! we saved this person from pending doom, cool right??”

lwkl 2 hours ago

I mean even if it is marketing for them they still did the work and developed these features. I had some vision issues recently and was glad there were options to make text more legible to me.

Additionally I don't believe this is just marketing. This is adaptation to a changing market. Apple's customer base is aging and having these kinds of features will allow them to keep using Apple products for longer.

testfrequency 2 hours ago

They have done the work, but I don’t see much work that’s beyond what was previously possible without Apple Intelligence. The marketing of Apple Intelligence is weak here, not the foundational abilities.