I Stored a Website in a Favicon
119 points by theanonymousone 4 hours ago | 38 comments

Tepix 4 hours ago
Instead of going via pixels, why not use an SVG favicon, store the markup directly inside it, and extract it from there?

Use this favicon.svg:

    <svg xmlns="http://www.w3.org/2000/svg">
    <circle cx="50%" cy="50%" r="50%" fill="orange"/>
    <p>hello HN!</p>
    </svg>
Use this in your <head> to load the SVG favicon:

    <link id="favicon" rel="icon" href="favicon.svg" type="image/svg+xml">
Finally, use this in your <body> to extract the markup and append it to the document body:

    <script>
    fetch(favicon.href).then(r => r.text()).then(t => document.body.innerHTML += t.match(/<p[\s\S]*p>/)[0]);
    </script>
chrismorgan 10 minutes ago
Regular expressions? Ugh. Encode it properly as XML in the correct namespace, load it so, and take it from that.

Or just serve the SVG file and use <foreignObject> to embed the HTML, and include <link rel="icon" href=""> inside it. In theory you should be able to define a <view id="icon"> and use <link rel="icon" href="#icon">, but in practice neither Firefox nor Chromium seems to be handling that properly in a favicon, which is disappointing.

weetii 3 hours ago
Hey, yeah, I wrote the article. This (of course) would be more practical. Thanks for pointing it out. I wanted the payload to "live" in actual pixel data rather than hidden text inside an XML file. That’s why I went this way :)
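A rough sketch of what "living in pixel data" can look like, for illustration only. The layout here (3 payload bytes per pixel, a 2-byte length prefix, alpha pinned to 255) and the names `bytesToPixels` and `pixelsToBytes` are assumptions, not the article's actual format:

```javascript
// Pack bytes into RGBA pixel data for a size x size icon.
// Assumed layout: 2-byte big-endian length, then payload, zero-padded.
function bytesToPixels(bytes, size) {
  const capacity = size * size * 3 - 2; // 2 bytes are reserved for the length
  if (bytes.length > capacity || bytes.length > 0xffff) {
    throw new RangeError(`payload of ${bytes.length} bytes does not fit`);
  }
  const framed = new Uint8Array(size * size * 3);
  framed[0] = bytes.length >> 8;
  framed[1] = bytes.length & 0xff;
  framed.set(bytes, 2);

  const rgba = new Uint8ClampedArray(size * size * 4);
  for (let p = 0; p < size * size; p++) {
    rgba[p * 4] = framed[p * 3];
    rgba[p * 4 + 1] = framed[p * 3 + 1];
    rgba[p * 4 + 2] = framed[p * 3 + 2];
    rgba[p * 4 + 3] = 255; // keep every pixel opaque
  }
  return rgba;
}

// Reverse of bytesToPixels: drop alpha, read the length, slice the payload.
function pixelsToBytes(rgba) {
  const pixelCount = rgba.length / 4;
  const framed = new Uint8Array(pixelCount * 3);
  for (let p = 0; p < pixelCount; p++) {
    framed[p * 3] = rgba[p * 4];
    framed[p * 3 + 1] = rgba[p * 4 + 1];
    framed[p * 3 + 2] = rgba[p * 4 + 2];
  }
  const length = (framed[0] << 8) | framed[1];
  return framed.slice(2, 2 + length);
}
```

In a browser, the `rgba` input to `pixelsToBytes` would typically come from drawing the icon onto a canvas and reading `getImageData(...).data`. The icon would need a lossless format such as PNG, and the alpha channel is kept at 255 because canvas read-back can alter the colour values of semi-transparent pixels.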
peter-m80 3 hours ago
The ICO file format allows multiple icons at different resolutions in one file, so it can hold a lot of data.
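To make the container layout concrete, here is a minimal sketch that writes and reads an ICO directory (a 6-byte header followed by one 16-byte entry per image, all little-endian). The function names `buildIco` and `readIcoDirectory` are made up for this example, and the image payloads are treated as opaque bytes; a browser would only render entries that hold real PNG or BMP data:

```javascript
// Build an ICO file from [{ width, height, data: Uint8Array }, ...].
function buildIco(images) {
  const headerSize = 6 + 16 * images.length;
  const total = images.reduce((n, img) => n + img.data.length, headerSize);
  const out = new Uint8Array(total);
  const view = new DataView(out.buffer);
  view.setUint16(0, 0, true); // reserved, must be 0
  view.setUint16(2, 1, true); // type 1 = icon
  view.setUint16(4, images.length, true); // number of images
  let offset = headerSize;
  images.forEach((img, i) => {
    const base = 6 + i * 16;
    view.setUint8(base, img.width % 256); // 0 encodes 256
    view.setUint8(base + 1, img.height % 256);
    view.setUint16(base + 4, 1, true); // colour planes
    view.setUint16(base + 6, 32, true); // bits per pixel (assumed 32 here)
    view.setUint32(base + 8, img.data.length, true); // image size in bytes
    view.setUint32(base + 12, offset, true); // image offset in the file
    out.set(img.data, offset);
    offset += img.data.length;
  });
  return out;
}

// Read the directory back: one { width, height, byteLength, offset } per image.
function readIcoDirectory(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  if (view.getUint16(0, true) !== 0 || view.getUint16(2, true) !== 1) {
    throw new Error("not an ICO file");
  }
  const count = view.getUint16(4, true);
  const entries = [];
  for (let i = 0; i < count; i++) {
    const base = 6 + i * 16;
    entries.push({
      width: view.getUint8(base) || 256,
      height: view.getUint8(base + 1) || 256,
      byteLength: view.getUint32(base + 8, true),
      offset: view.getUint32(base + 12, true),
    });
  }
  return entries;
}
```

Each directory entry carries its own offset and length, so every extra resolution is another independent slot for data.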
weetii 3 hours ago
Good point, I might add a section to the article listing alternative approaches. Thanks!
berkes 60 minutes ago
An SVG can embed raster images: base64 encoded bytes.

So you could layer this experiment: the favicon is an SVG that contains an encoded raster image, whose bytes are in turn encoded HTML.

At the very least it would make a mind-boggling CTF step.

Walf 3 hours ago
PNG has comment chunks tEXt, zTXt, and iTXt. You can have a completely normal image whose file is stuffed with as much content as you want. That is less fun, I suppose.
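As a sketch of how such a chunk is put together: a PNG chunk is a 4-byte big-endian data length, a 4-byte type, the data, and a CRC-32 over type plus data. For tEXt the data is a Latin-1 keyword, a zero byte, and Latin-1 text. The helper names `crc32` and `makeTextChunk` are made up for this example:

```javascript
// CRC-32 as used by PNG (reflected polynomial 0xEDB88320), computed bit by bit.
function crc32(bytes) {
  let crc = 0xffffffff;
  for (const byte of bytes) {
    crc ^= byte;
    for (let k = 0; k < 8; k++) {
      crc = crc & 1 ? (crc >>> 1) ^ 0xedb88320 : crc >>> 1;
    }
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// Convert a string to Latin-1 bytes, rejecting anything outside that range.
function latin1(str) {
  return Uint8Array.from(str, (ch) => {
    const code = ch.charCodeAt(0);
    if (ch.length !== 1 || code > 0xff) throw new RangeError("not Latin-1: " + ch);
    return code;
  });
}

// Build one complete tEXt chunk: length, type, keyword, NUL, text, CRC.
function makeTextChunk(keyword, text) {
  const body = new Uint8Array([
    ...latin1("tEXt"),
    ...latin1(keyword),
    0, // separator between keyword and text
    ...latin1(text),
  ]);
  const chunk = new Uint8Array(4 + body.length + 4);
  const view = new DataView(chunk.buffer);
  view.setUint32(0, body.length - 4); // length counts the data only, not the type
  chunk.set(body, 4);
  view.setUint32(4 + body.length, crc32(body)); // CRC covers type + data
  return chunk;
}
```

The resulting bytes can be spliced into an existing PNG between the IHDR and IEND chunks without changing how the image renders.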
weetii 3 hours ago
Yes, that would also work. Thanks for pointing it out.
sheept 4 hours ago
You can use the favicon cache as storage too, by redirecting users across domains. It's been proposed as a potential fingerprinting risk[0], and if a browser naively reuses the cache for incognito mode, it could be used to track users across browser profiles.

[0]: https://www.schneier.com/blog/archives/2021/02/browser-track...

ai_fry_ur_brain 8 minutes ago
My thoughts instinctively went to "this has to be being used for fingerprinting" when I read OP's blog. Do anti-fingerprinting measures take into account the use of the Canvas API with favicons?

The link to the supercookie site is dead, unfortunately.

koolala 3 hours ago
Wasn't this fixed or mostly fixed?
franciscop 4 hours ago
Is this timing a coincidence? About 1h ago (30 mins before this), I submitted a website I just made about storing your stock portfolio in a URL + favicon!

https://news.ycombinator.com/item?id=48606396

esquivalience 3 hours ago
I found the aggressively staccato, clearly LLM-generated content extremely difficult to read.
bstsb 2 hours ago
for the first time in a while on HN, i disagree with the characterisation as AI-generated. at most it was drafted with an LLM, but the final output is pretty human to me.

they used the wrong it’s/its, made But. its own one-word sentence, didn’t capitalise HTML, and used “okayy” in parenthesis. all of this isn’t to criticise the writer - i enjoyed it more seeing these little imperfections that make up a blog post

estetlinus 3 hours ago
It’s the new internet. So, so annoying.
noduerme 3 hours ago
Yeah, but it's kinda weird. The typical LLM headers and bullet points are there, but it's like someone took an axe to the rest of the spew. I too would rather read someone's original bad writing than their bad editing of AI writing, but it's kinda interesting how this all shakes out.
netsharc 10 minutes ago
It doesn't seem to be LLM-written, but it reads like it. The author is German; maybe it's a language expertise thing, or maybe he just likes the LLM style (unrelated to his nationality).

But yeah, sentences that only have 3-4 words each feel like 3rd grade writing; I couldn't read it.

bartvk 2 hours ago
I wish people would include their prompts.
scottmcdot 3 hours ago
Which bit? The short sentences?
jorisw 46 minutes ago
Fun Fact: You can use any inline SVG for a favicon and keep it right in the HTML document.

This also allows you to use an emoji directly as a favicon, like so:

  <link
    rel="icon"
    type="image/svg+xml"
    href="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 100 100'><text y='.9em' font-size='90'>(your emoji here)</text></svg>"
  />
(HN isn't showing the emoji)
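If you want to set this from script, a small helper can build the same data URI. This is only a sketch; `emojiFaviconHref` is a made-up name, and percent-encoding the SVG is my addition so that characters like `#` or `"` in the markup can't break the URL:

```javascript
// Build a data: URI for an SVG favicon that renders a single emoji.
function emojiFaviconHref(emoji) {
  const svg =
    "<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 100 100'>" +
    `<text y='.9em' font-size='90'>${emoji}</text></svg>`;
  return `data:image/svg+xml,${encodeURIComponent(svg)}`;
}
```

In a page you would assign the result to the `href` of the `<link rel="icon">` element.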
berkes 53 minutes ago
I'd imagine the (aggressive) caching of the favicon by browsers makes it a challenge, but you could generate the favicon dynamically, then have JS extract the blocks sequentially. Basically streaming arbitrarily large content to a webpage via favicons, in blocks of 239 bytes.

It may be a fun, novel way to proxy webpages that are otherwise blocked. Though, I guess, the service rendering the favicons can just as easily be blocked then.
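The splitting and reassembly part could look roughly like this. It is only a sketch: the 239-byte block size is the figure from this comment, the function names are made up, and how each block is actually served and fetched (for example via a cache-busting query parameter on the favicon URL) is left out:

```javascript
const BLOCK_SIZE = 239; // bytes per favicon, as quoted in the comment

// Server side: cut the payload into favicon-sized blocks.
function splitIntoBlocks(bytes, blockSize = BLOCK_SIZE) {
  const blocks = [];
  for (let i = 0; i < bytes.length; i += blockSize) {
    blocks.push(bytes.subarray(i, i + blockSize));
  }
  return blocks;
}

// Client side: concatenate the decoded blocks back in order.
function joinBlocks(blocks) {
  const total = blocks.reduce((n, block) => n + block.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const block of blocks) {
    out.set(block, offset);
    offset += block.length;
  }
  return out;
}
```

The client would also need to know when to stop, for instance from a block count in the first favicon or from a final block that is shorter than `BLOCK_SIZE`.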

tetrisgm 47 minutes ago
Love it. Did you see the old effort to store the page in the url? https://github.com/jstrieb/urlpages
purple-leafy 9 minutes ago
That’s awesome. A few years ago I took this a bit further and made a URL-only notepad quine that creates itself as you add data to it, and that can be saved as a bookmarklet. You have to watch the GIF to understand it.

https://github.com/con-dog/serverless-architecture

soanvig 3 hours ago
Honestly, it didn't interest me, but I do remember, from back in the day, full websites rendered by a browser from... empty files. https://mathiasbynens.be/notes/css-without-html
beardyw 3 hours ago
I would have used a minimal service worker to unpack the web data and present it as if it were just a normal page being loaded.
superjose 4 hours ago
Pretty cool tbh!!! Would have loved seeing the decoder code!!!

It's also pretty interesting to think about how an attacker could exploit images to their advantage. Never thought that would be a way!!!

Thanks!

schobi 4 hours ago
I guess the decoder is more than the 208 bytes that this page uses..

But maybe you can misuse this and store a session ID / cookie in a favicon (give everyone a unique one) and survive some cookie cleanup and evade privacy restrictions?

Maybe you could still make the favicon look somewhat like a real image, so it doesn't raise suspicion?

Favicons seem to be cached across private browsing sessions. Oh no

bozdemir 4 hours ago
Very cool. I wonder whether it's possible to make a simple game by also leveraging WebAssembly?
weetii 3 hours ago
Yes, probably. I guess you’d need a bigger favicon, since a minimal Rust WASM binary is around 20 KB+ (?)
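Back-of-the-envelope sizing, as a sketch: assuming 3 payload bytes per pixel (one each in R, G and B) and a square icon, the smallest side length for a given payload is the ceiling of the square root of the pixel count. Both the 3-bytes-per-pixel figure and the name `minSquareSide` are assumptions for this example:

```javascript
// Smallest square icon side (in pixels) that can hold payloadBytes.
function minSquareSide(payloadBytes, bytesPerPixel = 3) {
  const pixels = Math.ceil(payloadBytes / bytesPerPixel);
  return Math.ceil(Math.sqrt(pixels));
}

console.log(minSquareSide(20 * 1024)); // 83, so roughly an 83x83 icon for 20 KB
```

By the same arithmetic, a 16x16 icon tops out at 768 bytes.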
alex_suzuki 2 hours ago
You might find my tinkering useful: https://strich.io/blog/posts/embedding-webassembly-in-qrcode... A QR code isn’t much different from a favicon I guess. :)
laladrik 2 hours ago
The link is 404
neon_me 2 hours ago
Is it cake? Game for devs.
ab_wahab01 3 hours ago
Fascinating concept! Thanks for sharing this!
scoot 3 hours ago
Would have been more fun if the blogpost was rendered from the favicon.
fitsumbelay 3 hours ago
Very cool and interesting. After reading just the title, I wrongly assumed this would be about SVG.
jibal 3 hours ago
Surprised that a minimal "website" only requires a small image = few pixels = few bytes to store it? Um, ok.
shaharamir 3 hours ago
Amazing!