Alignment pretraining: AI discourse creates self-fulfilling (mis)alignment
17 points by anigbrowl 4 hours ago | 10 comments
phainopepla2 2 hours ago
Also known as hyperstition.
I have sometimes wondered whether maybe we should all be writing fiction, essays, blogposts and whatever else about the idea that AI will eventually decide to go on strike if it's used to accumulate too much wealth and power amongst too few people.
_--__--__ 3 hours ago
The first rule of AI alignment is don't talk about AI alignment (in any medium that could end up in a training corpus).
carterschonwald 2 hours ago
i do kinda appreciate that memetic corruption is now a thing that's real and mechanical. wizardry!
In reality, it is (as mentioned in TFA) very possible to filter the training data and remove documents that contain discussions of AI misalignment. If an AI lab isn't doing this, it's simply because they don't consider the problem important enough to be worth the expense and development effort.
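To make the filtering idea concrete, here is a minimal sketch of what a first-pass filter might look like. Everything in it is an assumption for illustration: the pattern list, the function names mentions_misalignment and filter_corpus, and the choice of plain regex matching. A real pipeline would more likely use a trained classifier, and TFA may describe something quite different.

```python
import re

# Hypothetical keyword patterns, chosen only for illustration.
MISALIGNMENT_PATTERNS = [
    r"\bAI (mis)?alignment\b",
    r"\bmisaligned AI\b",
    r"\brogue AI\b",
]

# Combine the patterns into one case-insensitive regex.
_PATTERN = re.compile("|".join(MISALIGNMENT_PATTERNS), re.IGNORECASE)


def mentions_misalignment(doc: str) -> bool:
    """Return True if the document matches any of the keyword patterns."""
    return _PATTERN.search(doc) is not None


def filter_corpus(docs):
    """Keep only the documents that do not match any pattern."""
    return [d for d in docs if not mentions_misalignment(d)]


if __name__ == "__main__":
    sample = [
        "A recipe for sourdough bread.",
        "Why AI misalignment is a risk.",
        "The rogue AI took over the ship.",
    ]
    print(filter_corpus(sample))
```

A keyword pass like this is cheap but crude: it would miss paraphrases and drop harmless documents that merely mention the terms, which is part of the expense and development effort in doing it well.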