QACI blob location: an issue with firstness
for QACI i need to figure out the problem of blob location (see thoughts 1, 2).
in this post i bring up a particular issue: in many-worlds, which is the likely correct interpretation of quantum mechanics, selecting the "first" (in time) instance of the blob might be just wrong.
here are two failure modes that illustrate why:
- at all times, including nearer to the big bang, there are enough (exponentially many) decohered branches of the universe that some happen to contain the question blob, and some even contain large macro-phenomena encoding it. this is a naive failure mode.
- in timelines where unaligned superintelligence was launched in the past (whether by us or aliens), some of those superintelligences are gonna guess that we're gonna do QACI, and they're gonna do enough quantum coinflips to generate exponentially many timelines, enough to include the ones with our question blob. by having those be earlier in time than our question blob, they'll get to hijack the question-answer interval. this is an adversarial failure mode.
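as a toy illustration of the bruteforce behind both failure modes, here's a minimal sketch. it rests on assumptions that aren't part of QACI itself: the blob is a short bit string, each quantum coinflip splits a timeline into two branches, and the names `branches` and `count_branches_containing` are made up for this example. the point is just that k coinflips yield 2^k branches, exactly one of which spells out any given k-bit blob, and that more flips only give more branches containing it somewhere.

```python
from itertools import product


def branches(n_flips):
    """Return an iterator over every outcome of n_flips binary coinflips.

    Each outcome tuple stands in for one decohered branch (toy assumption).
    """
    return product((0, 1), repeat=n_flips)


def count_branches_containing(blob, n_flips):
    """Count branches whose flip record contains `blob` as a contiguous run."""
    k = len(blob)
    count = 0
    for outcome in branches(n_flips):
        # slide a window of length k over this branch's flip record
        if any(outcome[i:i + k] == blob for i in range(n_flips - k + 1)):
            count += 1
    return count


blob = (1, 0, 1, 1)  # hypothetical 4-bit stand-in for the question blob

# with exactly len(blob) flips, one branch out of 2**4 matches the blob
exact = count_branches_containing(blob, len(blob))

# with more flips, many more branches contain the blob somewhere
longer = count_branches_containing(blob, 8)

print(exact, "of", 2 ** len(blob), "branches match exactly")
print(longer, "of", 2 ** 8, "branches contain the blob")
```

a real blob would be far too long to enumerate classically like this; the worry is precisely that many-worlds does the enumeration for free.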
furthermore, we can't just trace a "causality" or "continuity" between the question and the AI being launched, because in the adversarial failure mode, the adversarial superintelligence can simply run a simulation inside of which there is such a continuity or causality.
my thoughts about possible solutions are thus:
- maybe an exception to {adversarial superintelligence being able to fake being causally ahead} would be if we define causality such that us doing QACI is causally upstream of an adversarial superintelligence hijacking it. after all, there's a sense in which it's hijacking QACI "because" we are or might be doing QACI. but this seems like a possibly difficult true name to find.
- maybe rule out adversarial instances using a notion of agency akin to the one in PreDCA? maybe this also helps detect us and thus avoid naive failures too?
- use a clever scheme which lets something like a question-blob or question-process be manifested in a way that can't be bruteforced by generating exponentially many quantum timelines? what are some computational complexity classes greater than EXPTIME, or greater than whichever computational complexity class describes the set of possible timelines one gets to instantiate in many-worlds?
- in some other way, figure out a true name for "naive world, not engineered by an adversarial superintelligence"?