goal-program bricks

this is the follow-up to the Insulated Goal-Program idea, in which i suggest doing alignment by giving an AI a program to run as its ultimate goal, the running of which would hopefully realize our values. in this post, i talk about what pieces of software could be used to put together an appropriate goal-program, as well as some examples of plans built out of them.

here are some naive examples of outlines for goal-programs which seem like they could be okay:

these feel like we could be getting somewhere in terms of figuring out actual goal-programs that could lead to valuable outcomes; at the very least, it seems like a valuable avenue of investigation. in addition, unlike AGI, many pieces of the goal-program can be individually tested, iterated on, etc. in the usual engineering fashion.

