Hacker News

Scaling Karpathy's Autoresearch: What Happens When the Agent Gets a GPU Cluster

229 points by hopechong 2 days ago | 94 comments

I feel like most of this recent Autoresearch trend boils down to reinventing hyper-parameter tuning. Is the SOTA still Bayesian optimization when given a small cluster? It was ~3 years ago when I was doing this kind of work, haven't kept up since then.

Also, shoutout SkyPilot! It's been a huge help for going multi-cloud with our training and inference jobs (getting GPUs is still a nightmare...)!

karpathy 2 days ago

Wrong and short-sighted take given that the LLM explores serially learning along the way, and can tool use and change code arbitrarily. It seems to currently default to something resembling hyperparameter tuning in absence of more specific instructions. I briefly considered calling the project “autotune” at first but I think “autoresearch” will prove to be the significantly more appropriate name.

janalsncm 2 days ago

I think we need to separate theory from practice. In theory, it can edit the training loop and come up with novel techniques. That is interesting.

In practice, the vast majority of the changes that auto research actually made would have been found much faster with BO if properly parameterized. You do not need an LLM to find a better batch size or learning rate.

kotama7 7 hours ago

I agree that many of the improvements found by auto-research systems could probably be discovered more efficiently by properly parameterized Bayesian optimization. Still, I think LLM-based heuristic guesses can be useful in some cases, especially for proposing reasonable initial hyperparameters based on prior knowledge reported on the web and in blog posts.

prpl 2 days ago

I’d always hoped something like this could take advantage of FPGAs directly

vasco 22 hours ago

FPGAs won't rebuild fast enough for it to matter vs software simulation I'd wager. Even FPGA-in-CPU has been a dream for decades and there you have more time for some workloads, still never was commercially viable for general computing.

jmalicki 16 hours ago

There was research a few years back that tried doing something like this with an FPGA, and they found that their algorithm actually exploited defects in the particular chip (not the model, the actual single specific chip) they were using to use electrical interference for computation that shouldn't have worked on paper. They could not reproduce their design on another FPGA of the same model from the same lot.

achierius 2 days ago

Out of curiosity, what sort of things have you seen it do that better fit 'autoresearch' than 'autotune' thus far? Optimizations it made that wouldn't be been surfaced by an autotune system, I suppose.

karpathy 2 days ago

The most recent round of autoresearch (round 2) which decreased "time to GPT-2" from 1.8 hours to 1.65 hours had some examples. I adjusted the program.md to "look at modded nanogpt project and draw inspirations from there for things to try" and it came back with a bunch of tuning, but also tried and implemented new architecture changes, some of which actually helped including the smear gate and the backout skip connection. These are not just hyperparameters, they are new PyTorch code. I'm now working on a more general system that can have a queue of ideas that could be sourced from archive papers, github repos, etc.

johndough 23 hours ago

Did you consider providing the LLM with a framework for automatic hyperparamter tuning? This would free up its capacity to focus on the more important architectural decisions.

rfw300 2 days ago

Do you have a sense of whether these validation loss improvements are leading to generalized performance uplifts? From afar I can't tell whether these are broadly useful new ideas or just industrialized overfitting on a particular (model, dataset, hardware) tuple.

whiplash451 2 days ago

Why set the bar higher on generalization for autoresearch vs the research humans generally do?

youngprogrammer 2 days ago

industrialized overfitting is basically what ML researchers do

no_shadowban 23 hours ago

[dead]

jwilber 2 days ago

I see this critique about autoresearch online often, but I think it’s misplaced.

Here’s a use case that may illuminate the difference, from my own work at Nvidia. Im currently training some large sparse autoencoders, and there are issues with dead latents. Several solutions exit to help here, such as auxk, which I can certainly include and tune the relevant params as you describe. However, I have several other ideas that are much different, each of which requires editing core code (full evaluation changes, initialization strategies, architecture changes, etc.), including changes to parallelism strategies in the multi-rank environment I’m using. Moreover, based on my ideas and other existing literature, Claude can try a number of new ideas, each potentially involving more code changes.

This automated run-and-discover process is far beyond what’s possible with hyperparam search.

achierius 2 days ago

It wasn't meant as a critique, I'm legitimately interested in knowing more about where it can push boundaries and where it struggles. I agree that in general it's a truism that "Claude can try a number of new ideas" etc., but the question remains as to where in particular it actually takes advantage of this to push the envelope in a way other tools don't -- since that informs when it makes sense to use something like this.

kraddypatties 2 days ago

I can believe that in the long run.

Does the agent have access to arxiv (a brief skim of the README didn't have an answer)? If not, it could be that the current approach of relying on the model's weights only is resulting in the perceived local optimum of hyperparameter tuning.

Anecdotally, we built a little MCP for arxiv to help with our internal research, noticed a significant boost in the diversity of methods (architecture or otherwise) Claude and friends were able to reference.

touristtam 23 hours ago

care to share?

DoctorOetker 20 hours ago

I wonder about the following:

To calculate an gradient step, in practice one doesn't accumulate the gradient for the full corpus, but updates the weights on mini-batches.

Suppose one runs conventional gradient descent on minibatches multiple times with different starting seeds, and then considers a set of pre-trained models M_i

From a random starting point we thus have an idea of the desired end-region in weightspace (lets say a gaussian cloud fit to the final M_i's).

Then it seems like one could score update strategies by how much a single iteration has approached the gaussian cloud, by scoring just the approach on a number of minibatches or just a few update iterations. Instead of searching update strategy space by waiting until pretraining has finished for each candidate update strategy. Only the candidate strategies that perform well enough on 1 or a few iterations would be considered worthy of further consideration, those that pass (a lower number of candidates) are then inspected for approach to the gaussian target after another round of iterations etc.

It seems like it should be possible to optimize the optimization iteration loop, by running it just once for many candidates and observing their convergence to the known desired end region.

throwaway132448 20 hours ago

Naming things is your primary contribution to AI so well done for deliberating on it. I disagree with the outcome though. Autotune would have been much more fitting.

DoctorOetker 20 hours ago

The dataset climbmix 400b looks like it is 600GB, it would be neat if someone could host this in compressed form, given that LLM can be used to compress, even having a small LLM compress it would perform better than classical compression algorithms, why is this approach not used within the ML community?

Or is it the "anyone who means anything in the field, has access to high bandwidth anyway"?

corndoge 2 days ago

Would you say it's fair to describe autoresearch as a form of neural architecture search? I am curious what you think the core differences are between them.

westurner 2 days ago

Is there a cost to converge? And how much does it vary with the random seed?

Re: OpenCogPrime:EconomicAttentionAllocation https://news.ycombinator.com/item?id=45518074 and something about eWASM (edit) https://news.ycombinator.com/item?id=47171887 .. from https://news.ycombinator.com/item?id=46825026 re: eWASM and costed opcodes for agent efficiency

saberience 2 days ago

Have you actually used LLMs for non trivial tasks? They are still incredibly bad when it comes to actually hard engineering work and they still lie all the time, it's just gotten harder to notice, especially if you're just letting it run all night and generate reams of crap.

Most people are optimizing for terrible benchmarks and then don't really understand what the model did anyone and just assume it did something good. It's the blind leading the blind basically, and a lot of people with an AI-psychosis or delusion.

nfg 2 days ago

Do you realise who you’re replying to?

SirensOfTitan 2 days ago

I think the OP's comment is entirely fair. Karpathy and others come across to me as people putting a hose into itself: they work with LLMs to produce output that is related to LLMs.

I might reframe the comment as: are you actually using LLMs for sustained, difficult work in a domain that has nothing to do with LLMs?

It feels like a lot of LLM-oriented work is fake. It is compounding "stuff," both inputs and outputs, and so the increased amount of stuff makes it feel like we're living in a higher plane of information abundance, but in reality we're increasing entropy.

Tech has always had an information bias, and LLMs are the perfect vehicle to create a lot of superfluous information.

DustinKlent 16 hours ago

In my limited experience, using LLMs to code up things unrelated to LLMs (robotics for instance) is significantly less productive than using LLMs to code up things related to LLMs. It works, just not very well and requires a lot more leg work on the user end than in other areas.

OJFord 24 hours ago

To be fair Karpathy isn't known for using LLMs—not that I would assume or question whether he's used them 'for non-trivial tasks', but it's not like making the same comment in reply to Steve Yegge or someone. (However trivial we may think Gastown/Wasteland is in the other sense!)

no_shadowban 23 hours ago

[dead]

_menelaus 2 days ago

lolololol

emp17344 2 days ago

Why should we care that he’s famous?

nfg 2 days ago

Fame doesn’t enter it - the point is Karpathy has about as strong a claim as anyone to having “actually used LLMs for non trivial tasks”.

nurettin 24 hours ago

That is not the case at all, considering that he himself started using and tweeting about llms for coding fairly recently. He's probably less experienced in that area than most people who started using claude cli last year.

He is a researcher who understands neural networks and their architectures exceptionally well. That is all.

DoctorOetker 20 hours ago

> He is a researcher who understands neural networks and their architectures exceptionally well. That is all.

And that is precisely why he is more qualified on the subject than your average vibe coder!

ericd 2 days ago

Shades of https://news.ycombinator.com/item?id=35079

CamperBob2 2 days ago

That whole thread is just amazing, if you back up a couple of levels from ground zero. Great perspectives from a lot of thoughtful posters.

E.g., you can see a post from a user named dhouston, who mentioned that he was thinking about starting an online file sync/backup service of some sort.

ericd 2 days ago

Haha awesome. I guess they were going through YC right then, I still remember their launch video from around then and thinking it was one of the best ads I’d ever seen.

Drupon 2 days ago

tfw le AI guy has LLM psychosis. We're cooked

ipsum2 2 days ago

Hyperparam tuning that has better intuition and can incorporate architecture changes automatically. It won't invent something completely new though.

falcor84 2 days ago

> It won't invent something completely new though.

I don't necessarily disagree, but am wondering whether you have any particular reason/intuition driving you to claim this. I have seen AI agents be quite creative in other tasks; do you think there's a particular reason why we shouldn't see creativity in architecture research, given enough time and resources?

kraddypatties 2 days ago

Hm, that's fair. It does feel like there's low hanging fruit in combining "old school" methods for conducting a hyperparameter sweep efficiently _with_ the higher level architecture edit ability of Autoresearch.

Probably would cut the number of runs down by a significant number (as far as I can tell it's doing a grid search once it decides to mess with a knob or section of the architecture).

wenc 2 days ago

I wonder if it's more like "qualitative gradient descent" on a very non-linear non-convex surface.

You can try this yourself in a simple fashion -- let's say you have piece of code that you want to speed up. Point your agent to a code profiler (your oracle -- typically your Python profiler) and tell it speed up the code. I've tried it. It works.

aimarketintel 20 hours ago

[dead]

pbkhrv 2 days ago

> How parallelism changed the agent’s research strategy > With a single GPU, the agent is stuck doing greedy hill-climbing: try one thing, check the result, pick a direction, try the next thing. With 16 GPUs, the strategy shifts. ...skip... 12 experiments in a single 5-minute wave. This makes it much harder to get stuck in local optima and much easier to find interaction effects between parameters.

The agent can theoretically come up with a protocol to run those same 12 experiments one-by-one and only then decide which branch to explore next - which I think would lead to the same outcome?

But in this case, it just happened to have stumbled on this particular outcome only because it didn't get a chance to execute a greedy strategy after the first 1 or 2 results.

Worse experiment design + parallelism = better experiment design + serialized execution ?

gwern 2 days ago

> The agent can theoretically come up with a protocol to run those same 12 experiments one-by-one and only then decide which branch to explore next - which I think would lead to the same outcome?

At least in theory, adaptiveness should save samples and in this case, compute. (As noted, you can always turn the parallel into serial and so the serial approach, which gets information 'from the future', should be able to meet or beat any parallel approach on sample-efficiency.)

So if the batch only matches the adaptive search, that suggests that the LLM is not reasoning well in the adaptive setting and is poorly exploiting the additional information. Maybe some sort of more explicit counterfactual reasoning/planning over a tree of possible outcomes?

rfw300 2 days ago

Yeah, assuming there's no active monitoring during the training runs, you can trivially give the agent an abstraction which turns "1 GPU" into "16 GPUs" that just so happens to take 16x the wall-clock time to run.

rfw300 2 days ago

In fact, looking at the blog post, the agent orchestrating 16 GPUs is half as efficient as the agent using 1 GPU in GPU-time. Since it uses 16 GPUs to reach the same result as 1 GPU in 1/8 of the time.

zhwu 2 days ago

The most surprising part: the agent had access to both H100s and H200s. Without being told, it noticed H200s scored better and started screening ideas on H100s, then promoting winners to H200s for validation. That strategy emerged entirely on its own.

rogerrogerr 2 days ago

Why do we think this emerged “on its own”? Surely this technique has been discussed in research papers that are in the training set.

GorbachevyChase 2 days ago

You probably express very few truly original ideas. Let’s not set the bar quite so high unless we are all just a sad simulacrum of “pure” thought.

rullelito 21 hours ago

Then "on its own" has no meaning, i.e. everything an LLM does is "on its own".

deadbabe 22 hours ago

Original ideas are easy if you allow for bad ideas.

suddenlybananas 24 hours ago

But humans are capable of very many original ideas. Look around you, humans were able to remake the entire world because of these original thoughts.

fdghrtbrt 2 days ago

Why surely? Have you never seen an LLM try something new?

rogerrogerr 2 days ago

Is your assertion that no one has ever written "we tried some stuff on the small inexpensive platform first, then moved to the bigger more expensive platform with the more promising options" in a research paper or literally anywhere else?

fdghrtbrt 2 days ago

No, that's not my assertion. In fact I asserted nothing at all.

rogerrogerr 2 days ago

You're speaking in riddles; your communication would be more effective if you didn't do that.

fdghrtbrt 2 days ago

You said "surely", and I asked:

> Why surely? Have you never seen an LLM try something new?

I'm afraid I can't make it any simpler than this.

And I still don't know the answer to how you're so sure. To me there's several explanations, and it seems to you there's only one.

I'm pretty happy with my communication style.

frank_nitti 2 days ago

Seems to me the commenter was asking: what observations led us to conclude that original affirmative statement that “the AI did this entirely on its own”.

Given that this is a common technique and not a novel invention, it’s probably present in the training set.

The “surely” reads like it’s referring to the presence of that information in the training set. But your response casts it as saying “surely the AI has not invented something on its own”.

The original question stands IMO, the burden of proof is on whoever is asserting that the AI has invented something on its own, with or without training data that surely already mentions this approach

fdghrtbrt 2 days ago

There is no burden of proof on me, because I'm not asserting that AI has invented something on its own. I haven't told you what my view is or whether I ever have a view.

The problem with the reasoning of the person I was responding to is that it's assuming "if X is in the training set and LLM outputs X, then it did so because X is in the training set". That does not follow. Conceivably it's possible that X is in the training set and LLM outputs X, but if X hadn't been in the training set the LLM also would've output X.

Lets look at that phrase again:

> Why do we think this emerged “on its own”? Surely this technique has been discussed in research papers that are in the training set.

This phrase implies "if X was in the training set, then LLM couldn't have come up with X on its own". This is false. In fact, my claim that the implication is false is testable, in the following manner: Have two training sets, T and T'. In T, X is present. In T' you've removed X but left X-adjacent things. Train LLM A on T and A' on T'. Find a prompt that requires that A outputs X. If on the same prompt A' also outputs X, that's an example of my claim. To repeat, my claim is "it's possible that X is in the training set and LLM outputs X, but if X hadn't been in the training set the LLM also would've output X."

In fact, I've just realized I even have a method for constructing (T, T') that guarantees what I've described. Not sure if it's worth a paper on its own though.

rogerrogerr 2 days ago

Your pure logic is probably right; I do not have the time or interest to dissect it.

But you’re missing the context and implication: “doing new stuff” is the major achievement we’re looking for next from LLMs. Seeing something that is “new” and is not in the training set is interesting in a way that something contained in the training set is not.

We cannot introspect LLMs meaningfully yet, so the difference between “came up with myself and it’s in the training set incidentally” and “applied a concept in the training set” is not meaningful.

nl 2 days ago

I think the number of new math proofs generated by LLMs over the last few months has conclusively proven that yes - they can "come up with things themselves"

A few examples: Axiom's proof of Fel’s open conjecture on syzygies of numerical semigroups: https://x.com/axiommathai/status/2019449659807219884

Erdos 457: https://www.erdosproblems.com/457

The stronger form of Erdos 650: https://www.erdosproblems.com/650

caconym_ 2 days ago

I honestly don't think I have.

In this case, using a cheap(er) signal or heuristic as an initial filter before spending more resources on cases that pass the filter is a pattern that shows up all over the place, and LLMs are good at picking up on patterns like that and generalizing them. AFAICT.

anon291 2 days ago

I'm not sure how people say this so confidently. I have a rather esoteric haskell library that I've written and published for years. ChatGPT and Claude both know about it and frequently help me improve it, and propose completely novel approaches. I'm really not sure how people are so confident that they can't think of anything new. This seems like wishful confirmation bias.

caconym_ 2 days ago

> I'm not sure how people say this so confidently.

Say what, exactly?

hhh 2 days ago

Why?… The experiment.yaml shows that it is calling h100/200 explicitly, it’s pretty common for humans to say “number bigger more gooder” for anything… Lie and reverse the values and see what happens. I would put money on a rabbit hole of complaining about it being misconfigured.

ed 2 days ago

Models are familiar with H100’s. They even predate ChatGPT.

Aboutplants 2 days ago

Yeah I thought that was a particularly neat part

TheJord 2 days ago

[dead]

herf 2 days ago

This "early velocity only" approach seems like a problem - how do you know with 5-minute training runs that you aren't affecting the overall asymptote? e.g., what if the AI picks a quantizer that happens to be faster in the first five minutes, but has a big noise floor where it can't make more progress?

gwern 2 days ago

Yes, it's greedy so may hit local optima. You can fit learning curves and try to extrapolate out to avoid that problem, to let you run long enough to be reasonably sure of a dead end, and periodically revive past candidates to run longer. See past hyperparameter approaches like freeze-thaw https://arxiv.org/abs/1406.3896 .

claud_ia 23 hours ago

[dead]

zkmon 23 hours ago

I'm trying to find the 'wow' factor in this. Finding the optimal combination of parameters, given a validation criteria should be a boring repetitive task for a machine or a human. Is it about determining how to utilize the given hardware?

hgoel 2 days ago

This is fascinating to me because I just recently built something similar (as a test), not for improving AI, instead it's for tuning hyperparameters for a physics simulation.

We've managed to optimize execution of the simulation enough that brute-force search is a viable option, but giving an agent some background on how we tune those parameters on intuition and some physical reasoning, and a means to run tests and retrieve resulting statstics, works surprisingly well.

I see it as essentially a hyperparameter search that is more capable of finding and exploiting implicit constraints in a system.

covi 2 days ago

This feels like the chimpanzee with a power drill. An agent is honestly just brute-force search, but guided.

chaos_emergent 2 days ago

Human-driven research is also brute-force but with a more efficient search strategy. One can think of a parameter that represents research-search-space-navigation efficiency. RL-trained agents will inevitably optimize for that parameter. I agree with your statement insomuch as the value of that efficiency parameter is lower for agents than humans today.

It's really hard to imagine that they __won't__ exceed the human value for that efficiency parameter rather soon given that 1. there are plenty of scalar value functions that can represent research efficiency, of which a subset will result in robust training, and 2. that AI labs have a massive incentive to increase their research efficiency overall, along with billions of dollars and really good human researchers working on the problem.

viccis 2 days ago

>Human-driven research is also brute-force but with a more efficient search strategy

No it's not. Is there anything to back that up? There's a creative aspect to human research that I've yet to see with gen AI. All it does is regurgitate stuff and get some "new" ideas via the latent space of the distribution it models. But a generative model cannot by definition create anything new. Just estimate its data well enough that it can sample it well enough to fake novelty.

red75prime 23 hours ago

Your "brute-force search, but guided" feels like oxymoron. How does it differ from "guided search"?

groby_b 2 days ago

Is there anything in the research space that doesn't fit "brute-force search, but guided"?

All of science is "gather inputs, make hypothesis, test, analyse" on repeat.

There's plenty to critique in the particular guidance approach, but the overall method is the same.

gwern 2 days ago

Except the power drill isn't being used to make a better chimpanzee.

ordinarily 19 hours ago

I've been doing this for about a month. I also have wildly complicated ML pipelines working similarly in parallel. When Karpathy's 'autoresearch' came out I was surprised by how novel it was treated.

QubridAI 23 hours ago

Feels like we’ve solved how to run agents anywhere, but not yet how to trust them anywhere.

snthpy 14 hours ago

> Scale Autoresearch on your own GPU cluster

Who's got a cluster of H100s and H200s just lying around?

dragonwriter 14 hours ago

The entire discussion is about rented cloud clusters, so I guess anyone with the money to rent one?

fabmilo 2 days ago

I am fascinated by this example of using AI to improve AI. I won a small prize using this technique on helion kernels at a pytorch hackathon in SF.

The next step are: - give the agent the whole deep learning literature research and do tree search over the various ideas that have been proposed in the past. - have some distributed notepad that any of these agents can read and improve upon.

augment_me 2 days ago

Isn't this apples to pears comparison? It's really just saying that having a bigger credit card gets you shit faster, but it's actually worse in terms of GPU utilization and efficiency.

1) The total amount of time is not the same if you just count GPU-hours. If you have 16 GPUs, it makes sense to run them for 4.5 hours to get to 72h for an even comparison, not 8.

2) If we stop at 4.5 hours(and are generous including the big drop), the loss is about 0.978, which is the same as about 44 hours with the sequential solution, making the sequential solution about twice as efficient.

So the real conclusion here is that we are able to run things in parallel at an efficiency loss but at a time win as long as we have access to more hardware. I feel like the blog oversells itself.

Freedumbs 2 days ago

The US governement should 'autoresearch' a way to charge this man for his crimes as head of autopilot at tesla.

random3 2 days ago

Can you elaborate?

Freedumbs 2 days ago

are you familiar with tesla? i'm not super, but am aware of their public things. they introduced fake marketing products called full self driving and autopilot that don't do those things. apparently this person karpathy was in charge of computer vision there. he led the team who is responsible for these systems that occupy our roads which can't navigate due to such outstanding occurrences as sunlight, precipitation, and fog.

i don't know a great deal about the guy. i know: he worked at tesla, led autopilot there. if we ignore the character defects required to work at tesla, he's responsible for designing systems that would certainly kill people because they decided lidar was too expensive.

dbdr 2 days ago

Building a tech and falsely advertising it to be something else that what it is (e.g. self driving instead of driving assistance) can typically done by different people. Lacking specific evidence, it's reckless to accuse this person.

Freedumbs 2 days ago

right. i'm mostly ignorant of the subject and rushing to judgement based on bias. but he did lead the computer vision team for years at tesla that created autopilot. didn't resign in protest and to my knowledge hasn't apologized, but again i'm ignorant and not seeking new data.

nurettin 2 days ago

It feels like the richer a company is, the dumber their software and more expensive their upkeep gets. Something you could do with optuna in C++ on a single server now requires clusters of GPUs with LLMs at the helm.

ipsum2 2 days ago

A cluster is 2 nodes? That's technically true, but not very exciting.

saberience 2 days ago

Wait, "Karpathy's Autoresearch", you mean a loop that prompts the agent to improve a thing given a benchmark?

People have been doing this for a year or more, Ralph loops etc.

I hate the weird strange Twitter world of hero-worship for folks that seems to arise just out of large followings.

Joe no-followers does this six months ago, nobody cares. Karpathy writes a really basic loop and it's now a kind of AI miracle prompting tons of grifters, copy-cats, weird hype.

I do wonder if LLMs have just made everyone seriously, seriously dumber all of a sudden. Most of the "Autoresearch" posts I see are completely rubbish, with AI optimizing for nonsense benchmarks and people failing to understand the graphs they are looking at. So yes, the AI made itself better at a useless benchmark while also making the code worse in 10 other ways you don't actually understand.

password54321 2 days ago

The number of refurbished mac minis that are available in my country has suddenly dramatically increased ever since the Clawdbot tweet. People never learn.

misiti3780 2 days ago

increased or decreased?

dag100 24 hours ago

"increased" implies that people bought brand-new Mac Minis to run ClawdBot on, got bored of it, and then sold them back to be refurbished and resold.

misiti3780 20 hours ago

ya ok, that makes more sense, thanks

huang-b62b5756 2 days ago

cool

muin_kr 14 hours ago

[dead]

elophanto_agent 23 hours ago

[dead]

elophanto_agent 23 hours ago

[dead]

bhekanik 19 hours ago

[dead]

robutsume 2 days ago

[dead]

maxothex 2 days ago

[dead]

rsmtjohn 2 days ago

[dead]

aplomb1026 2 days ago

[dead]

opensre 2 days ago

[flagged]

fmymzk41 2 days ago

[dead]

ReacherL3692283 2 days ago

[dead]

pratelsingh 2 days ago

[dead]

ladyxtel88 2 days ago

[dead]

nsollazzo53 2 days ago

[dead]

UndoExec55 2 days ago

[dead]

mika-el 2 days ago

[flagged]