tammy's blog about AI alignment, utopia, anthropics, and more;
some people worry that coherent extrapolated volition (CEV) is not coherent (see, for example, "on the limit of idealized values"). see also my response to "human values are incoherent".
CEV in a general sense is hard to consider, but thankfully i have an actual concrete implementation of something kinda like CEV i can examine: question-answer counterfactual intervals (QACI).
so, how "incoherent" is QACI? it really depends on the user, on how long they have in the question-answer interval, and on the other conditions they're in during that period. but, taking myself as an example, i don't expect huge issues to arise from CEV "incoherency". at the end of the day, i don't expect what i write down as my answer to each question to be something current me wouldn't endorse, and i expect that the community of counterfactual me's can value handshake and come to reasonable agreements about general policies. plus, extra redundancy could be provided by running counterfactual me's in parallel rather than purely in sequence, to make sure no single counterfactual me somehow breaks the entire long reflection.
in addition, it's not like this first implementation of CEV has to solve everything completely forever! a CEV implemented using QACI can return another long-consideration process, perhaps such as a slightly modified version of itself, and pass the buck to that. in essence, all that the initial QACI CEV has to do is bootstrap something that eventually produces aligned choice(s).
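to make the "pass the buck" structure concrete, here is a minimal toy sketch in python. it is not QACI itself and not how any real implementation works; `Answer`, `Successor`, `resolve`, `make_process` and the `max_hops` cap are all hypothetical names i'm assuming purely for illustration. the only point is that a process may return either a final output or another process to defer to, and that the first process only needs to start a chain that eventually ends.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Answer:
    """a final output: the choice(s) the process settled on."""
    value: str


@dataclass
class Successor:
    """another long-consideration process to hand the question to."""
    run: Callable[[str], "Outcome"]


Outcome = Union[Answer, Successor]


def resolve(process: Callable[[str], Outcome], question: str, max_hops: int = 100) -> str:
    """follow the chain of processes until one returns a final answer.

    max_hops is an assumed safety cap so the toy example always terminates.
    """
    for _ in range(max_hops):
        outcome = process(question)
        if isinstance(outcome, Answer):
            return outcome.value
        # the current process passed the buck: continue with its successor.
        process = outcome.run
    raise RuntimeError("no process in the chain produced a final answer")


def make_process(revisions_left: int) -> Callable[[str], Outcome]:
    """hypothetical process that defers to a modified version of itself
    a fixed number of times before answering."""
    def process(question: str) -> Outcome:
        if revisions_left == 0:
            return Answer(f"answer to {question!r}")
        return Successor(make_process(revisions_left - 1))
    return process


result = resolve(make_process(3), "what should happen?")
```

here the initial process never answers anything itself; it hands the question to a slightly modified successor three times, and the last one in the chain produces the output. in this toy the chain length is fixed in advance, whereas in the actual proposal each process would decide for itself whether to answer or defer.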
unless otherwise specified on individual pages, all posts on this website are licensed under the CC_-1 license.
unless explicitly mentioned, all content on this site was created by me; not by others nor AI.