Host of the Technical AI Safety Podcast https://technical-ai-safety.libsyn.com
Co-organizer of EA Philly
Streams Linear Algebra Done Right in Coq on Twitch https://twitch.tv/quinndougherty92
Having read Gwern and this Dylan Matthews piece, I’m fairly convinced the revolution did not lead to the best outcomes for slaves or for indigenous people. I think there are two cruxes for believing that it would have been possible to make this determination in real time:
1. As Matthews points out, follow the preferences of slaves.
2. Notice that one complaint in the Declaration of Independence was that the British wanted to make citizens of indigenous people.
One of my core assumptions, which is up for debate, is that EAs ought to focus on outcomes for slaves and indigenous people more than the general case of outcomes.
I’m puzzled by the lack of push to convert Patrick Collison. Paul Graham once tweeted that Stripe would be the next Google, so if Patrick Collison doesn’t qualify as a billionaire yet, it might be a good bet that he will someday (I’m not strictly basing that on PG’s authority; I’m also basing it on my personal opinion that Stripe seems like world-domination material). He co-wrote the piece “We need a science of progress”, and from what I heard in this interview, signs point to a very EA-sympathetic person.
My first guess, based on the knowledge I have, is that the abolitionist faction was good, and that supporting them would be necessary for an EA in that time (but maybe not sufficient). Additionally, my guess is that I’d be able to determine this in real time.
Maybe! I’m only going after a steady stream of 2-3 chapters per week. Get in touch if you’re interested: I’m re-reading the first quarter of PLF, since a new version was published after I finished that part.
Thanks for the comment. I wasn’t aware of your and Rohin’s discussion on Arden’s post. Did you flesh out the inductive alignment idea on LW or the Alignment Forum? It seems really promising to me.
Today I want to jot down notes more substantive than “wait until I post ‘Going Long on FV’ in a few months”.
As Rohin’s comment suggests, both aiming proofs about properties of models toward today’s type theories and aiming tomorrow’s type theories toward ML face two classes of obstacles: 1. is it possible? 2. can it be made competitive?
I’ve gathered that there’s a lot of pessimism about 1, in spite of MIRI’s investment in type theory and in spite of the word “provably” in CHAI’s charter. My personal expected path to impact as it concerns 1 is “wait until theorists smarter than me figure it out”, and I want to position myself to worry about 2.
I think there’s a distinction between theories and products, and I think programmers need to be prepared to commercialize results. There’s a fundamental question: should we expect that a theory’s competitiveness can be improved by one or more orders of magnitude through engineering effort, or will engineering effort only provide improvements of less than an order of magnitude? A lot depends on how you answer this.
Asya:
While I agree that proof assistants right now are much slower than doing math proofs yourself, verification is a pretty immature field. I can imagine them becoming a lot better such that they do actually become better to use than doing math proofs yourself, and don’t think this would be the worst thing to invest in.
Asya may not have been speaking about AI safety here, but my basic thinking is that if less primitive proof assistants end up drastically more competitive, and at the same time there are opportunities to convert results in verified ML into tooling, expertise in this area could gain a lot of leverage.
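As a toy illustration of the kind of result that could eventually feed into tooling (the names here are my own, not from any existing verified-ML library), here is a minimal Coq sketch proving two properties of an integer ReLU, the sort of model component one might want machine-checked guarantees about:

```coq
Require Import ZArith Lia.
Open Scope Z_scope.

(* A toy "model component": ReLU on integers. *)
Definition relu (x : Z) : Z := Z.max 0 x.

(* Verified property 1: relu is monotone. *)
Lemma relu_monotone : forall x y, x <= y -> relu x <= relu y.
Proof. intros x y H. unfold relu. lia. Qed.

(* Verified property 2: relu never outputs a negative value. *)
Lemma relu_nonneg : forall x, 0 <= relu x.
Proof. intro x. unfold relu. lia. Qed.
```

Real verified-ML results are of course vastly harder than this, but the gap between a lemma like the above and a shippable artifact is exactly the theory-versus-product distinction: the engineering work of packaging, automation, and performance.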
Rohin:
it is plausibly still worthwhile becoming an expert on formal verification because of the potential applications to cybersecurity. (Though it seems like in that case you should just become an expert on cybersecurity.)
It’s not clear to me that grinding FV directly is as wise as, say, pursuing CompTIA certifications. From the expectation that FV pays dividends in advanced cybersecurity, we cannot conclude that FV is relevant to the early stages of a cybersecurity career path.
Related: Information security careers for GCR reduction. I think the software safety standards in a wide variety of fields have a lot of leverage over outcomes.
I’ve been increasingly hearing advice to the effect that “stories” are an effective way for an AI x-safety researcher to figure out what to work on: that drawing scenarios about how things could go well or poorly, then doing backward induction to derive a research question, is better than traditional methods of finding one. Do you agree with this? The uncertainty in such scenarios seems so massive that one couldn’t make a dent in it, but do you think it’s valuable for AI x-safety researchers to make significant investments (i.e. more than 30% of their time) in both 1. doing this directly, by telling stories and attempting backward induction, and 2. training so that their stories will be better, i.e. more reflective of reality (by studying forecasting, for instance)?