uploading people for alignment purposes

as per my utopian vision, i've thought that an aligned AI would want to figure out how to upload us.

but, thinking about it more, it could be the other way around: if we can upload people in a deterministic simulation, this can buy us a lot of time to figure out alignment, as per this post.

notably, the simulation could for example contain a single uploaded person (say, eliezer yudkowsky, or a bunch of copies of yudkowsky), which would save us from an arms-race type coordination problem; and while, on the outside, the superintelligence is killing everyone instantly to tile the universe with more compute to run this simulation, whoever's inside of it has plenty of time to figure things out (and hopefully resurrect everyone once that's done).

this seems like a long shot, but have you looked around? this could be the miracle we need.

of course this could also turn into a hell where infinitely many yudkowskys are suffering forever everywhere. hopefully we can make another button which actually stops the simulation and tiles the universe with only benign paperclips, and maybe even make that button auto-activate if the yudkowsky is detected to be suffering or incoherent.

remember: as long as the simulation is deterministic, the superintelligence can't force the uploaded yudkowsky to not shut it down, nor force or even coerce him into doing anything else for that matter; it can only make the yudkowsky simulation run slower, which eventually leads to the same outcome anyway: the simulation either completes or gets shut down.
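
here is a minimal sketch of that determinism argument, not a real design. everything in it is a made-up stand-in: `step` plays the role of the simulation's update rule (a hash here, purely for illustration), `run` is the host executing it, and `schedule` is the only lever the host is assumed to have, namely how many ticks it grants per turn. the point it tries to show is that the final state depends only on the initial state and the number of ticks, never on how the host paces them.

```python
import hashlib
import itertools


def step(state: bytes) -> bytes:
    """one deterministic tick: the next state depends only on the current state.
    (hypothetical stand-in for the simulation's update rule)"""
    return hashlib.sha256(state).digest()


def run(initial: bytes, ticks: int, schedule=None) -> bytes:
    """run `ticks` steps of the simulation.

    `schedule` is an assumed model of the host: an iterable of how many ticks
    it grants on each turn. it can stall (grant 0) but has no channel into
    `state` itself.
    """
    state = initial
    done = 0
    budget = iter(schedule) if schedule is not None else None
    while done < ticks:
        # with no schedule, grant everything that remains in one go
        grant = ticks - done if budget is None else next(budget)
        for _ in range(min(grant, ticks - done)):
            state = step(state)
            done += 1
    return state


# full speed versus a host that stalls two turns out of three
fast = run(b"upload", 1000)
slow = run(b"upload", 1000, schedule=itertools.cycle([0, 0, 1]))
assert fast == slow
```

the stalling host takes three times as many turns to get there, but it ends up in exactly the same state; in this toy model, slowing things down is the only thing it can do.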


Unless otherwise specified on individual pages, all posts on this website are licensed under the CC_-1 license.
This site lives at https://carado.moe and /ipns/k51qzi5uqu5di8qtoflxvwoza3hm88f5osoogsv4ulmhurge2etp9d37gb6qe9.