avatar

say "AI risk mitigation" not "alignment"

the common thread i see between the work of people who describe themselves as working on alignment seems to be AI risk mitigation.

this is the case because "alignment" does not necessarily cover eg pivotal acts; in addition, X-risks are not the whole story (see also: alignment near miss and separation from hyperexistential risk).

while it is true that AI risks are largely caused by us not having alignment, it is not necessarily the case that the immediate solution is to have alignment.

to encompass the spirit of the work i do (when i am being truthful about it), i tend to say that i think about AI risk mitigation — whatever form that takes.


RSS feed available here; new posts are also linked on my twitter.
CC_ -1 License Unless otherwise specified on individual pages, all posts on this website are licensed under the CC_-1 license.
This site lives at https://carado.moe and /ipns/k51qzi5uqu5di8qtoflxvwoza3hm88f5osoogsv4ulmhurge2etp9d37gb6qe9.