technicalities

Karma: 5,839

Max Chiswick (1985–2025)

technicalities Jan 13, 2025, 10:37 AM

263 points

3 comments · 1 min read · EA link

technicalities Jun 6, 2024, 1:19 PM
2 points
0 ∶ 0
in reply to: DC’s comment on: Case for emergency response teams
Nah we were only 5 people plus a list of contacts to the end. Main blocker was trying to solve executive search and funding at the same time when these are coupled problems. And the cause of that is maybe me not having enough pull.

technicalities Feb 2, 2024, 12:21 PM
2 points
0 ∶ 0
in reply to: JoshuaBlake’s comment on: Who’s hiring? (Feb-May 2024)
No

technicalities Jan 31, 2024, 10:03 PM
2 points
0 ∶ 0
in reply to: Ofer’s comment on: EA and the United Nations
Appreciate this.
The second metric is aid per employee I think, so salaries don’t come into it(?) Distributing food is labour intensive, but so is UNICEF’s work and parts of WHO.
The rest of my evidence is informal (various development economists I’ve spoken to with horror stories) and I’d be pleased to be wrong.

technicalities Jan 31, 2024, 11:31 AM
22 points
0 ∶ 0
on: Who’s hiring? (Feb-May 2024)
Arb is a research consultancy led by Misha Yagudin and Gavin Leech. Here’s our review of our first and second years. We worked on forecasting, vaccine strategy, AI risk, economic policy, grantmaking, large-scale data collection, a little software engineering, explaining highly technical concepts, and intellectual history. Lately we’ve been on a biotech jaunt and also events.

We’re looking for researchers with some background in ML, forecasting, technical writing, blogging, or some other hard thing. Current staff include a philosophy PhD, two college dropouts, a superforecaster, a machine learning PhD, etc. We pay US wages.

Fully remote with optional long retreats. We spent a full half of 2022 colocated.

We only take work we think is important.

hi@arbresearch.com

technicalities Jan 25, 2024, 11:56 AM
7 points
1 ∶ 0
in reply to: Jan_Kulveit’s comment on: Impact Assessment of AI Safety Camp (Arb Research)
When producing the main estimates, Sam already uses just the virtual camps, for this reason. Could emphasise more that this probably doesn’t generalise.

technicalities Jan 25, 2024, 11:06 AM
7 points
1 ∶ 1
in reply to: Jan_Kulveit’s comment on: Impact Assessment of AI Safety Camp (Arb Research)
The key thing about AISC for me was probably the “hero licence” (social encouragement, uncertainty reduction) the camp gave me. I imagine this specific impact works 20x better in person. I don’t know how many attendees need any such thing (in my cohort, maybe 25%) or what impact adjustment to give this type of attendee (probably a discount, since independence and conviction are so valuable in a lot of research).
Another wrinkle is the huge difference in acceptance rates between programmes. IIRC the admission rate for AISC 2018 was 80% (only possible because of the era’s heavy self-selection for serious people, as Sam notes). IIRC, 2023 MATS is down around 3%. Rejections have some cost for applicants, mostly borne by the highly uncertain ones who feel they need licensing. So this is another way AISC and MATS aren’t doing the same thing, and so I wouldn’t directly compare them (without noting this). Someone should be there to catch ~80% of seriously interested people. So, despite appearances, AGISF is a better comparison for AISC on this axis.

Long list of AI questions

NunoSempere Dec 6, 2023, 11:12 AM

124 points

15 comments · 86 min read · EA link

technicalities Dec 1, 2023, 9:03 AM
2 points
0 ∶ 0
in reply to: Jobst Heitzig (vodle.it)’s comment on: Shallow review of live agendas in alignment & safety
https://www.lesswrong.com/posts/pHJtLHcWvfGbsW7LR/roadmap-for-a-collaborative-prototype-of-an-open-agency
I put it in “galaxy-brained end-to-end solutions” for its ambition but there are various places it could go.

technicalities Nov 30, 2023, 10:12 AM
2 points
0 ∶ 0
in reply to: Jobst Heitzig (vodle.it)’s comment on: Shallow review of live agendas in alignment & safety
Well, there are a lot of different ways to design an NN.
That sounds related to OAA (minus the vast verifier they also want to build), so depending on the ambition it could be “End to end solution” or “getting it to learn what we want” or “task decomp”. See also this cool paper from authors including Stuart Russell.

technicalities Nov 29, 2023, 7:27 PM
2 points
0 ∶ 0
in reply to: Jobst Heitzig (vodle.it)’s comment on: Shallow review of live agendas in alignment & safety
It’s not a separate approach: the non-theory agendas and even some of the theory agendas have their own answers to these questions. I can tell you that almost everyone besides CoEms and OAA is targeting NNs though.

technicalities Nov 28, 2023, 10:07 AM
2 points
0 ∶ 0
in reply to: RyanCarey’s comment on: Shallow review of live agendas in alignment & safety
excellent, thanks, will edit

Shallow review of live agendas in alignment & safety

technicalities Nov 27, 2023, 11:33 AM

76 points

8 comments · 29 min read · EA link

technicalities Sep 16, 2023, 8:24 AM
3 points
0 ∶ 0
in reply to: sphor’s comment on: Closing Notes on Nonlinear Investigation
Oh great, thanks. I would guess that these discrete cases form a minority of their work, but hopefully someone with actual knowledge can confirm.

technicalities Sep 16, 2023, 7:36 AM
142 points
39 ∶ 11
on: Closing Notes on Nonlinear Investigation
The closing remarks about CH seem off to me.
1. Justice is incredibly hard; doing justice while also being part of a community, while trying to filter false accusations and thereby not let the community turn on itself, is one of the hardest tasks I can think of.
  So I don’t expect disbanding CH to improve justice, particularly since you yourself have shown the job to be exhausting and ambiguous at best.
  You have, though, rightly received gratitude and praise—which they don’t often, maybe just because we don’t often praise people for doing their jobs. I hope the net effect of your work is to inspire people to speak up.
2. The data on their performance is profoundly censored. You simply will not hear about all the times CH satisfied a complainant, judged risk correctly, detected a confabulator, or pre-empted a scandal through warnings or bans. What denominator are you using? What standard should we hold them to? You seem to have chosen “being above suspicion” and “catching all bullies”.
3. It makes sense for people who have been hurt to be distrustful of nearby authorities, and obviously a CH team which isn’t trusted can’t do its job. But just to generate some further common knowledge and meliorate a distrust cascade: I trust CH quite a lot. Every time I’ve reported something to them they’ve surprised me with the skill and the hours they put into each case. (EDIT: Clarified that I’ve seen them work actual cases.)

technicalities Sep 16, 2023, 7:21 AM
−2 points
1 ∶ 3
on: Closing Notes on Nonlinear Investigation
Thanks for all your work Ben.
But a glum aphorism comes to mind: the frame control you can expose is not the true frame control.

technicalities Sep 16, 2023, 7:03 AM
4 points
0 ∶ 0
in reply to: Jaime Sevilla’s comment on: Jsevillamol’s Shortform
What about factor increase per year, reported alongside a second number to show how the increases compose (e.g. the factor increase per decade)? So “compute has been increasing by 1.4x per year, or 28x per decade” or something.
The main problem with OOMs is fractional OOMs, like your recent headline of “0.1 OOMs”. Very few people are going to interpret this right, where they’d do much better with “2 OOMs”.
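To make the arithmetic behind this suggestion concrete, here is a minimal sketch of the conversions involved. The function names are hypothetical and only for illustration. It shows that 1.4x per year compounds to about 28.9x per decade (which the quote above rounds to 28x), and that a fractional figure like 0.1 OOMs is only about a 1.26x factor.

```python
import math


def compound(annual_factor: float, years: int) -> float:
    """Total multiplicative growth after `years` at a constant annual factor."""
    return annual_factor ** years


def ooms_to_factor(ooms: float) -> float:
    """Convert orders of magnitude (base 10) into a plain multiplicative factor."""
    return 10.0 ** ooms


def factor_to_ooms(factor: float) -> float:
    """Convert a multiplicative factor into orders of magnitude (base 10)."""
    return math.log10(factor)


if __name__ == "__main__":
    # 1.4x per year, composed over a decade
    print(f"per decade: {compound(1.4, 10):.1f}x")
    # The same annual growth expressed in OOMs per year
    print(f"per year:   {factor_to_ooms(1.4):.2f} OOMs")
    # A fractional OOM, as in the "0.1 OOMs" headline
    print(f"0.1 OOMs:   {ooms_to_factor(0.1):.2f}x")
```

Reporting the pair (annual factor, decade factor) keeps both numbers in units most readers already parse, whereas the OOM form requires them to exponentiate in their heads.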

technicalities Sep 8, 2023, 3:39 PM
2 points
0 ∶ 0
in reply to: technicalities’s comment on: Strongest real-world examples supporting AI risk claims?
Buckman’s examples are not central to what you want but worth reading: https://jacobbuckman.com/2022-09-07-recursively-self-improving-ai-is-already-here/

technicalities Sep 8, 2023, 1:53 PM
15 points
0 ∶ 0
in reply to: technicalities’s comment on: Who’s hiring? (May-September 2022)
Despite my best efforts (and an amazing director candidate, and a great list of volunteers), this project suffered from the FTX explosion and an understandable lack of buy-in for an org with maximally broad responsibilities, unpredictable time-to-payoff, and a largeish discretionary fund. As a result, we shuttered without spending any money. Two successor orgs, one using our funding and focussed on bio, are in the pipeline though.
I’ll be in touch if either of the new orgs want to contact you as a volunteer.

technicalities Sep 7, 2023, 6:47 PM
10 points
0 ∶ 0
on: Strongest real-world examples supporting AI risk claims?
Break self-improvement into four:
1. ML optimizing ML inputs: reduced data centre energy cost, reduced cost of acquiring training data, supposedly improved semiconductor designs.
2. ML aiding ML researchers, e.g. >3% of new Google code is now auto-suggested without amendment.
3. ML replacing parts of ML research. Nothing too splashy but steady progress: automatic data cleaning and feature engineering, autodiff (and symbolic differentiation!), meta-learning network components (activation functions, optimizers, …), neural architecture search.
4. Classic direct recursion. Self-play (AlphaGo) is the most striking example but it doesn’t generalise, so far. Purported examples with unclear practical significance: Algorithm Distillation and models finetuned on their own output.^[1]
See also this list
Treachery:
https://arxiv.org/abs/2102.07716
https://lukemuehlhauser.com/treacherous-turns-in-the-wild/
^[1] The proliferation of crappy bootleg LLaMA finetunes using GPT as training data (and collapsing when out of distribution) makes me a bit cooler about these results in hindsight.
