in a previous post i talked about the need to make philosophical progress at determining what we value before attempting alignment. i wouldn't be the first to think of "what if we boot superintelligence now, and decide later?" as an alternative: it would indeed be nice to have this possibility, especially given the seeming imminence of superintelligence.
alas, typically, making this proposition goes like this:
which is a pretty good point, and usually A reasonably concedes at that point.
today, however, i am here to offer a continuation to this conversation, from A's side.
my idea is to implement a deterministic computational utopia, such as ∀V, for people to be uploaded into, whose internals are disconnected from the outside world; if we have infinite compute, then it can be even more free from outside interference.
the trick is to make that utopia's principles deontological, or at least absolute rather than weighable against considerations outside of it: as is largely the case in ∀V, ensure everything about the utopia has a definite okay or not-okay status, evaluable without knowing anything about the "outside" of this utopia. either someone's consent is being violated, or it's not.

with every decision based only on the state of the utopia being simulated, every decision the superintelligence makes about what happens in ∀V is uniquely determined: all the superintelligence is doing is calculating the next step of this deterministic computation, ethical principles included, and thus there is nothing it can do to bias that decision in a way that is helpful to it. all it can do is run the computation and wait to see what the persons inside of it decide to reprogram it to value or do. on the outside/before the singularity, all we need to ensure is that the superintelligence does indeed eventually run this computation and apply the changes we decide on once it finds them out.
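to illustrate the shape of this property, here's a toy sketch (not a real design — the state representation, `step`, and `consent_violated` are all hypothetical stand-ins): the utopia's transition function is pure, its ethical check is evaluable from internal state alone, and so any two honest runs of the computation necessarily agree, leaving the superintelligence no degree of freedom to bias the outcome.

```python
# toy deterministic "utopia": a hypothetical illustration, not a real
# design. the key property is that step() is a pure function of the
# internal state only — no input from the "outside" world.

def consent_violated(state: dict) -> bool:
    # deontological check: a definite okay / not-okay answer,
    # computed from the internal state alone.
    return any(p["coerced"] for p in state["persons"])

def step(state: dict) -> dict:
    # the only thing the superintelligence does: compute the next
    # state of the deterministic computation. no free parameters,
    # so no way to steer the result.
    if consent_violated(state):
        raise RuntimeError("forbidden state: the computation must not reach this")
    return {"tick": state["tick"] + 1, "persons": state["persons"]}

def run(initial: dict, n: int) -> dict:
    state = initial
    for _ in range(n):
        state = step(state)
    return state

initial = {"tick": 0, "persons": [{"coerced": False}]}
# determinism: two independent runs from the same initial state
# must agree exactly.
assert run(initial, 100) == run(initial, 100)
```

the point of the toy is only the structure: because `step` takes nothing but the simulated state, "what the superintelligence decides" and "what the computation outputs" are the same thing.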
under these conditions, a device could be set up for us to later reprogram the superintelligence somehow, when/if we ever figure out what values we actually want, and it wouldn't be able to meaningfully interfere with our decision process, because every decision it makes regarding how our utopia is run is fully deterministic.
not that i think being able to reprogram a superintelligence after boot is necessarily a good idea; but at least, this way, it can remain a possibility.