Hacker News

RLHF from Scratch

61 points by onurkanbkrc 12 hours ago | 2 comments

fauria 5 hours ago

RLHF: Reinforcement learning from human feedback - https://en.wikipedia.org/wiki/Reinforcement_learning_from_hu...

alansaber 10 hours ago

Looks good. I am a big advocate for hands-on demos like this as the best way for beginners to learn ML.