suppose you have multiple values ("i want to be healthy but also i want to eat a lot of fries") or a value applying to multiple individuals ("i want both alice and bob to be happy"), and sometimes there are tradeoffs between them. how do you resolve such situations?
a simple weighted sum might suffice in many cases, but i feel like there are situations where it isn't enough.
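to be concrete, here's roughly what i mean by a weighted sum, as a minimal python sketch; the values and weights are made up for illustration:

```python
# made-up example: two values i care about, with made-up weights
utilities = {"health": 0.9, "fries": 0.3}  # how well each value is satisfied
weights = {"health": 0.7, "fries": 0.3}    # how much i care about each value

# the weighted sum scales each value's utility by its weight, then adds them up
total = sum(weights[v] * utilities[v] for v in utilities)
print(total)  # 0.72
```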
for example, consider a population of 5 persons, whom you care about equally, and a simple scalar value you have for each of them, such as happiness.
now, consider three options for how their happiness could be distributed: a fair situation, a bully situation, and a scapegoat situation.
if we use a simple sum, all three of these situations come out to 2.5 total utility; yet i feel like something ought to be done to favor the fair situation over the other two (and then probably to favor the bully situation over the scapegoat situation?).
what i propose to address this is to apply a square root (or some other exponent less than one) to each person's utility before summing, which has the effect of favoring more equal distributions. applied to the three situations above, this does seem to produce the desired effect: it matches how i feel about things, fair > bully > scapegoat.
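here's a minimal python sketch of the comparison; the exact utilities are made up (each distribution is just chosen to sum to 2.5 over the 5 persons), but they show the plain sum failing to distinguish the situations while the square-root sum ranks them fair > bully > scapegoat:

```python
from math import sqrt

# made-up utility distributions over 5 persons, each summing to 2.5
situations = {
    "fair":      [0.5, 0.5, 0.5, 0.5, 0.5],       # everyone equally well off
    "bully":     [0.6, 0.6, 0.6, 0.6, 0.1],       # one person somewhat worse off
    "scapegoat": [0.62, 0.62, 0.62, 0.62, 0.02],  # one person much worse off
}

for name, utilities in situations.items():
    plain = sum(utilities)                     # simple sum: identical for all three
    concave = sum(sqrt(u) for u in utilities)  # square-root sum: rewards equality
    print(f"{name:9}  sum={plain:.2f}  sqrt-sum={concave:.3f}")

# expected output (approximately):
# fair       sum=2.50  sqrt-sum=3.536
# bully      sum=2.50  sqrt-sum=3.415
# scapegoat  sum=2.50  sqrt-sum=3.291
```

any strictly concave function would behave similarly here; the square root is just an easy one to eyeball.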