(edit: i mean exfohazard, not infohazard)
(edit: i've added something like this to my blog, see locked posts)
to me, turning my thoughts into posts that i then publish on my blog and sometimes lesswrong serves the following purposes:
however, i've increasingly come to want to write and publish posts which i've determined — either on my own or with the advice of trusted peers — to be potentially infohazardous, notably with regards to potentially helping AI capabilities progress.
on one hand, there is no post of mine i wouldn't trust, say, yudkowsky reading; on the other hand, i can't just, like, DM him and everyone else i trust a link to an unlisted post every time i make one.
it would be nice to have a platform — or maybe a lesswrong feature — which lets me choose which persons or groups can read a post, with maybe a little ⚠ sign next to its title.
note that such a platform/feature would need something more complex than just a binary "trusted" flag: just because i can make a post that the Important People can read, doesn't mean i should be trusted to read everything else that they can read; and there might be people whom i trust to read some posts of mine but not others.
maybe trusted recipients could be grouped by orgs — such as "i trust MIRI" or "i trust The Standard List Of Trusted Persons". maybe something like the ability to post on the alignment forum is a reasonable proxy for "trustable person"?
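to make the shape of this more concrete, here's a minimal sketch in python of the kind of access model i have in mind — per-post sets of allowed readers and allowed groups, rather than a single binary "trusted" flag. all the names (groups, readers, fields) are made up for illustration, not a proposal for an actual implementation.

```python
from dataclasses import dataclass, field

# hypothetical named groups of trusted readers, e.g. an org or a shared list
GROUPS: dict[str, set[str]] = {
    "MIRI": {"alice", "bob"},
    "standard-trusted-list": {"alice", "carol"},
}

@dataclass
class Post:
    title: str
    allowed_people: set[str] = field(default_factory=set)
    allowed_groups: set[str] = field(default_factory=set)

    def can_read(self, reader: str) -> bool:
        # a reader sees the post if they're listed directly for it,
        # or if they belong to any group allowed on this particular post
        if reader in self.allowed_people:
            return True
        return any(reader in GROUPS.get(g, set()) for g in self.allowed_groups)

post = Post("capabilities-adjacent idea ⚠",
            allowed_people={"dave"},
            allowed_groups={"MIRI"})
print(post.can_read("alice"))  # True, via the MIRI group
print(post.can_read("carol"))  # False, not listed for this post
```

the point of the per-post sets is exactly the asymmetry mentioned above: being allowed to show a post to the Important People doesn't grant me access to everything they can read, and different posts of mine can have different reader sets.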
i am aware that this seems hard to figure out, let alone implement. perhaps there is a much easier alternative i'm not thinking about; for the moment, i'll just stick to making unlisted posts and sending them to the very small intersection of people i trust with infohazards and people for whom it's socially acceptable for me to DM links to new posts of mine.
unless otherwise specified on individual pages, all posts on this website are licensed under the CC0 1.0 license.
unless explicitly mentioned, all content on this site was created by me, not by others nor by AI.