I indexed 669 GB of my GoPro videos using my M1 Max computer and local ML models
105 points by iliashad 4 hours ago | 18 comments
TLDR: I had 2,207 GoPro videos and needed to rewatch them to find interesting moments from my cycling journey. I built a project to index them locally on my M1 Max using open-source ML models, search for those moments, and send the best clips straight to my DaVinci Resolve timeline. I indexed 628 videos (668.68 GB, 15h 13m 18s of footage); more details are in the metrics table in the last section of the article.

Full article: https://iliashaddad.com/blog/i-indexed-669-gb-of-my-gopro-videos-using-my-m1-max-computer


WarOnPrivacy 42 minutes ago
I was surprised to learn that the

    M1 Max CPU is an ARM/SoC, comparable to an 11th gen Intel i9
Do I have that right? Would Windows ARM performance be similar for those CPUs?

ref: https://www.cpubenchmark.net/compare/4585vs4245/Apple-M1-Max...

pachouli-please 7 minutes ago
It's also a bit apples (heh) to oranges for a handful of reasons, but the most impactful are:

- "unified" RAM makes all the system RAM available as VRAM
- a dedicated AI co-accelerator thingy

Both of these let the Apple silicon chips crush conventional CPUs in this kind of AI model workload.

No idea what the Windows ARM stuff is capable of. I know they use Qualcomm Snapdragon chips, though.

Beijinger 3 hours ago
Does it work for porn collections too?
pduggishetti 3 hours ago
You'll need a LoRA for this; porn content rejection is heavy. Or you'll need an abliterated model, though I'm not sure if vision also works.

You might want to add something like a YOLO fine-tune to detect scenes, plus face recognition too.

vorticalbox 2 minutes ago
Vision still works perfectly fine in abliterated models.
sarjann 2 hours ago
Asking the important questions
fhdkweig 46 minutes ago
lifestyleguru 2 hours ago
Last time I tried Whisper, it hallucinated an elaborate conversation from sounds of slapping and moaning, and it took minutes to spit out every single line of it.
3eb7988a1663 23 minutes ago
Parakeet has been trained to detect non-voice sounds and exclude that from identification, so you might have better luck with that family.
supertroop 2 hours ago
Not sure if you’re being sarcastic, but I think this is an interesting question. Would DeepSeek be useful here since it is local?
fl0id 38 minutes ago
It is possible to use the Apple GPU with containers, either with Podman + runkit + a recent Mesa, or with the recent vllm-metal from Docker: https://www.docker.com/blog/docker-model-runner-vllm-metal-m...
iliashad 54 minutes ago
I would love your feedback and suggestions for improvements or features you'd like to have, whether in the source-available version, the desktop app, or the blog post itself.
m3kw9 9 minutes ago
Grab frames, lower the resolution, classify, combine the metadata. Write to SQL.
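
A minimal sketch of those steps (grab frames, downscale, classify, combine metadata, write to SQL), using only the Python standard library. This is not the article's implementation. Every name here (`sample_frames`, `downscale`, `classify_brightness`, `index_video`, `search`, the `frame_labels` table) is hypothetical, frames are assumed to be already decoded into plain lists of grayscale pixel values, and the brightness check is only a stand-in for a real vision model.

```python
import sqlite3
from typing import Callable, Iterable, Iterator

# Assumption: a frame is a grayscale image stored as rows of 0-255 values.
Frame = list[list[int]]


def sample_frames(frames: Iterable[Frame], every_n: int) -> Iterator[tuple[int, Frame]]:
    """Grab every n-th frame, yielding (frame_index, frame)."""
    for index, frame in enumerate(frames):
        if index % every_n == 0:
            yield index, frame


def downscale(frame: Frame, factor: int) -> Frame:
    """Lower the resolution by keeping every factor-th pixel in each direction."""
    return [row[::factor] for row in frame[::factor]]


def classify_brightness(frame: Frame) -> str:
    """Placeholder classifier (hypothetical): label a frame by mean brightness.

    A real pipeline would call a vision model here instead.
    """
    pixels = [p for row in frame for p in row]
    return "bright" if sum(pixels) / len(pixels) >= 128 else "dark"


def index_video(
    conn: sqlite3.Connection,
    video_path: str,
    frames: Iterable[Frame],
    fps: float,
    classify: Callable[[Frame], str] = classify_brightness,
    every_n: int = 30,
    factor: int = 4,
) -> int:
    """Sample, downscale, classify, attach metadata, and store rows in SQLite."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS frame_labels ("
        "video_path TEXT, frame_index INTEGER, timestamp_s REAL, label TEXT)"
    )
    rows = [
        # Metadata combined per sampled frame: source file, index, timestamp.
        (video_path, index, index / fps, classify(downscale(frame, factor)))
        for index, frame in sample_frames(frames, every_n)
    ]
    conn.executemany("INSERT INTO frame_labels VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def search(conn: sqlite3.Connection, label: str) -> list[tuple[str, float]]:
    """Return (video_path, timestamp_s) for every stored frame with this label."""
    query = (
        "SELECT video_path, timestamp_s FROM frame_labels "
        "WHERE label = ? ORDER BY video_path, timestamp_s"
    )
    return conn.execute(query, (label,)).fetchall()
```

In practice the frames would come from a decoder such as ffmpeg, and `classify` would be swapped for a model call; the SQLite table then gives a searchable index of timestamps per label.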
rho138 3 hours ago
This would fit best as a “Show HN:” post :)
iliashad 2 hours ago
I tried to edit it and add Show HN, but it doesn't show the edited version. Thank you!
culi 2 hours ago
The title should link to the "full article". I wonder if OP's domain name is banned or something and they're doing this to get around it
knightops_dev 8 minutes ago
[flagged]