https://www.moreisdifferent.com/ https://moreisdifferent.substack.com/ https://twitter.com/moreisdifferent
Dan Elton
I’m not a purely hedonic utilitarian, but I think understanding and eliminating the sources of the greatest pain and suffering is important. Humans aren’t good at reasoning about pain severity. The brain plays a trick (especially when depressed) of making the pain being experienced feel like “the worst possible.” But it can’t be that everyone’s pain is the worst (although it may truly be the worst that person has experienced themselves). Several times I’ve experienced pain (psychological, physical, or some combination) which felt like the worst possible (at least along some dimension), only to find later that far greater pain is possible.
You know that Black Mirror episode where the doctor uses a BCI device to feel his patient’s pain (S4E6: “Black Museum”)? I think that could be useful here for comparing instantaneous pain states across wildly different situations.
It seems pretty obvious to me that what matters is the integral of instantaneous pain over time, and philosophers arguing otherwise just seem to be spinning yarn. Unfortunately this integral is hard to measure, and retrospective analyses that try to gauge it are very unreliable. Pain experienced is no less a tragedy if it is forgotten shortly afterwards. We all die in the end, the ultimate forgetting.
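To write that claim down explicitly (my own notation, just to make the point precise): if p(t) is the instantaneous pain intensity, the quantity I’m claiming matters is

```latex
S \;=\; \int_{t_0}^{t_1} p(t)\,dt
```

rather than any retrospective summary of the episode reconstructed from memory.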
I don’t understand why this is being downvoted. I’m reading the SSC post now and haven’t read all of it, but I agree this should be explored more. Consider hypnosis: there are anecdotal reports it can be used to get rid of pain. I’m somewhat doubtful about how far hypnosis can go in eliminating pain, but it appears understudied. If it really can eliminate major pain, and if hypnosis (either self-administered or delivered by a practitioner) could be systematized and delivered at scale, it could revolutionize how we treat pain.
At the very least, we already know psychological treatments work for some conditions involving chronic pain, like CFS/ME and fibromyalgia, but they are underutilized. Patient groups resist these treatments because of the stigma around mental illness and confusion about the mind-body connection. CBT combined with graded exercise therapy is the only intervention for CFS/ME with multiple RCTs backing it up. It stands to reason that CBT (or similar interventions like DBT or third-wave CBT) may be helpful for other chronic conditions which are medically unexplained and for which no good treatments exist.
Late to reply, but those are fair points; thanks for pointing that out. I do need to be more careful about attribution and stereotyping. The phenomenon I was trying to point at is that, in the push to find “the most intelligent people,” these groups end up selecting for autistic people, who in turn select more autistic people. There’s also a self-selection thing going on: neurotypicals don’t find working with a team of autistic people very attractive, while autistic people do. Hence the lack of diversity.
I noticed this before seeing this post and actually bought a ticket. I didn’t win. I bought a Massachusetts Powerball ticket, but it seems this counts for the national Powerball as well (?). In any case, now I can say I’ve experienced buying a lottery ticket. It was very underwhelming!
[I didn’t calculate the likelihood of multiple winners, but yeah, if you factor that in along with taxes and time discounting, it’s close to break-even.]
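A rough sketch of the kind of expected-value arithmetic I mean (every number here is a placeholder I’m assuming for illustration, not the actual figures from that drawing):

```python
# Rough expected-value sketch for a lottery ticket (illustrative placeholder numbers only).
from math import exp

ticket_price = 2.00
jackpot = 1.5e9            # advertised jackpot (assumed)
p_win = 1 / 292_000_000    # approximate Powerball jackpot odds
tickets_sold = 200_000_000 # assumed; drives the multiple-winner correction
tax_rate = 0.40            # assumed combined federal + state take
lump_sum_factor = 0.6      # assumed cash-value / time-discounting haircut

# Expected number of other winners, treating tickets as independent random picks (Poisson approximation).
lam = tickets_sold * p_win
# Expected share of the jackpot conditional on winning: E[1/(1+K)] with K ~ Poisson(lam).
expected_share = (1 - exp(-lam)) / lam

ev = p_win * jackpot * lump_sum_factor * (1 - tax_rate) * expected_share
print(f"Expected value per ticket: ${ev:.2f} (vs. ${ticket_price:.2f} price)")
```

Under these made-up inputs the ticket comes out somewhat below its price, which is the “almost break-even” picture I had in mind.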
Yeah, there are more men in EA, but notice this: in a group of 100 people with 54 men and 46 women, if 45 of the men are already partnered with 45 of the women, you’re left with 9 single men and 1 single woman, a 9:1 ratio among singles. Looking only at single people amplifies the gender ratio.
Aside: this is also why there are so many guys in the Bay Area complaining they can’t find a girlfriend vs places like Boston even though the overall gender ratio isn’t that much different.
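A tiny sketch of that arithmetic (the numbers are assumed purely for illustration):

```python
# How the singles ratio amplifies a mild overall gender imbalance (illustrative numbers).
men, women = 54, 46
partnered_pairs = 45  # assumed number of man-woman couples in this group of 100

single_men = men - partnered_pairs
single_women = women - partnered_pairs
print(f"Overall ratio: {men / women:.2f} men per woman")                 # ~1.17
print(f"Singles ratio: {single_men / single_women:.1f} men per woman")   # 9.0
```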
There are, and have been, a lot of startups working on similar things (AI to assist researchers), going back to IBM’s ill-fated Watson. Your demo makes it look very useful and is definitely the most impressive I’ve seen. I’m deeply suspicious of demos, however.
How can you test if your system is actually useful for researchers?
[One (albeit imperfect) way to gauge utility is to see if people are willing to pay money for it and keep paying money for it over time. However, I assume that is not the plan here. I guess another thing would be to track how much people use it over time or see if they fall away from using it. Another of course would be an RCT, although it’s not clear how it would be structured.]
There’s also a retreat being run for ACX and rationality meetup organizers (https://www.rationalitymeetups.org/new-page-4) July 21–24, and a lot of pre-events and after-parties planned for EAG SF. (I can send people GDocs with lists if anyone is interested; I’m not sure if the organizers want them shared publicly.)
There should be other options for 2FA you can set up going forward, like an authenticator app or using the Gmail app on your phone. Some cell providers allow calls/SMS over wifi now, too. There might also be a way to use backup codes.
I use all of these, except for corn bulbs! (Although technically I use QC35s instead of the QC45s, and for fish oil I use a 2:1 or 3:1 ratio in capsule form.)
You didn’t mention the M1 chip in the 2021+ MacBook Pros, which is amazing.
Also worth noting: I found the Bose QC35/45 to be better than the NC700 for a variety of reasons, the two main ones being that the volume can go lower and the switches are more tactile.
Just realized I never posted this comment, which I wrote about a week ago:
What you’re saying is not uncommon. People on the EA Forum are very smart and love to engage in criticism, no doubt. Sometimes it comes off as harsh, too. It is a bit intimidating.
Overall I like getting criticism on my blog posts, although it can be tough seeing that you made a stupid error and now someone is upset at you. If you’re worried about losing standing in the EA community because you accidentally publish something wrong, that’s something I worry about too. I’m not sure how much worry about that is appropriate, though. As far as social consequences go, a lot of people will be impressed just to see you publishing something, and few will be aware you wrote something wrong unless you really screw things up. A retraction or correction can go a long way to remedy things, too.
Hi,
I’m just seeing your comment now for some reason. This is super helpful.
Regarding your first point (pain vs. suffering), that’s pretty interesting and makes sense. I would just note that the degree to which people can detach from painful experiences varies. Regarding suffering from operations and stents, I have heard the same thing about stents, and that is something we would have to factor into a Fermi estimate of the amount of suffering that could be alleviated with early interventions for kidney stones. I wonder if someone could invent a stent that slowly releases a bit of anesthetic around it while it is in (my understanding is that stents are typically only in place temporarily).
Regarding the second point (“small stones being downplayed”): after writing this I looked into it a bit further, because I was interested in whether an AI application assisting in early detection might be high value. The idea that radiologists miss tiny stones is only my personal guess. I have only seen one or two examples of this, when I was running a system I developed for stone detection and it found stones in CT colonography scans that were not mentioned in the report, but those one or two examples only surfaced after running on over 6,000 scans.
Regarding what happens with tiny stones: data on this subject is very scarce, but it seems most tiny stones resolve on their own without major symptoms (?). It’s really not very clear. I found one paper which touches on this question, although it doesn’t directly study it. Looking at CT colonography scans, they found that 7.8% of patients (all middle-aged adults) had asymptomatic stones. They then found that only 10% of patients with asymptomatic stones were later recorded as having symptoms over a variable follow-up interval that extended to a maximum of 10 years. (Multiplying through, that’s roughly 0.8% of the scanned population whose asymptomatic stone went on to cause recorded symptoms within the follow-up window.) So it seems the tiny stones don’t cause symptoms… but maybe it takes longer than 10 years before they start to manifest symptoms. Probably having a tiny stone puts you at massively higher risk for a symptomatic stone event later in life. There’s very little data on this question, or about stone growth dynamics across the lifespan in general; it’s not something that’s easy to study. Scanning people with CT just to monitor their stone size is absurd, so to study this we have to mine historical scans, try to find follow-up scans to see if the same stones are still there, and compare their volume (which is a bit tricky to do accurately when the scan parameters change). This is actually an application for deep-learning-based automated stone segmentation algorithms, to assist in doing such a study. We have a conference paper under review that does exactly this, although I have to say it’s technically and logistically challenging.
Regarding the third point (“radiation stigma”): I agree; I think the way you are thinking about this is pretty much in line with the risk-benefit calculus as far as I understand it. I should have elaborated a bit more in my post. I was not thinking of doing a screening CT only to screen for kidney stones. I’ve been working with Prof. Pickhardt at UW, and one of the things we’ve been researching is the utility of a low-dose “screening CT” in middle age. The screening CT would cover many things, including stones. Prof. Pickhardt is working on assembling data to support this idea, mainly focusing on the value of scanning just the abdomen (not the chest). People currently get a coronary calcium score (“CAC”) CT scan to screen their cardiovascular risk. The abdomen also contains biomarkers (like aortic plaque) that can gauge cardiovascular risk, plus we can look for a lot of other things in the abdomen, including kidney stones.
I agree the preventative approach is probably the most promising (identifying at-risk patients using genetics, blood tests, and maybe other factors, rather than screening CT), especially given how safe and cheap potassium citrate is.
I live near Boston (Somerville, just north of Cambridge).
Perhaps this is obvious, but it’s worth noting that the urban core of Boston is denser than that of a lot of other cities, which makes it easier to get around, whether by walking, bike, or public transit. The public transportation is very good (although only by American standards) and will get better with the Green Line Extension opening (hopefully by Fall 2022). They also seem to be doing a really good job with urban planning and construction here compared to other cities (they are actually allowing lots of new housing to be built to meet rising demand).
I’m excited about the EA co-working (https://forum.effectivealtruism.org/posts/cCrMqacEhFRnoHthF/do-you-want-to-work-in-the-new-boston-ea-office-at-harvard) and biosecurity hub projects here.
The main downside is the cold weather, which is exacerbated by moist air from the ocean. However, if you know how to dress properly it shouldn’t be too much of an issue. Another downside, if you’re young, is high turnover among people in their 20s, since many are just here for school.
Thanks, yeah, I agree overall. Large pre-trained models will be the future, because of their few-shot learning ability if nothing else.
I think the point I was trying to make, though, is that this paper raises a question, at least to me, as to how well these models can share knowledge between tasks. But I want to stress again that I haven’t read it in detail. In theory, we expect multi-task models to do better than single-task models because they can share knowledge between tasks. Of course, the model has to be big enough to handle both tasks. (In medical imaging, a lot of studies don’t show multi-task models to be better, but I suspect this is because they don’t make the multi-task models big enough.) It seemed what they were saying was that it was only in the robotics tasks where they saw clear benefits from making it multi-task, but now that I read it again it seems they found benefits for some of the other tasks too. They do mention later that transfer across Atari games is challenging.
Another thing I want to point out is that, at least right now, training large models and parallelizing the training over many GPUs/TPUs is technically very challenging. They even ran into hardware problems here which limited the context window they were able to use. I expect this to change, though, with better GPU/TPU hardware and software infrastructure.
“The retreat lasted from Friday evening to Sunday afternoon and had 12 participants from UCLA, Harvard, UCI, and UC Berkeley. There was a 1:3 ratio of grad students to undergrads”
So it was 9 undergrads and 3 grads interested in AI safety? This sounds like a biased sample. Not one postdoc, industry researcher, or PI?
To properly evaluate timelines, I think you should include some older, more experienced folks, and not just select AI safety enthusiasts, which biases your sample towards people with shorter timelines.
How many participants have actually developed AI systems for a real-world application? How many have developed an AI system for a non-trivial application? In my experience many people working in AI safety have very little experience with real-world AI development, and many I have seen have none whatsoever. That isn’t good when it comes to gauging timelines, I think. When you get into the weeds and learn “how the sausage is made” in creating AI systems (i.e., true object-level understanding), I think it makes you more pessimistic on timelines, for valid reasons. For one thing, you are exposed to weird, unexplainable failure modes which are not published or publicized.
Note: I haven’t studied any of this in detail!
This review is nice, but to be honest it is a bit too vague to be useful. What new capabilities, that would actually have economic value, are enabled here? It seems this is most relevant to robotics and transfer between robotic tasks. So maybe that?
Looking at Figure 9 in the paper, the “accelerated learning” from training on multiple tasks seems small.
Note that the generalist agent, I believe, has to be trained on all tasks combined at once; it can’t be trained on tasks serially (this would lead to catastrophic forgetting). Note this is very different from how humans learn and is a limitation of ML/DL. When you want the agent to learn a new task, I believe you have to retrain the whole thing from scratch on all tasks, which could be quite expensive.
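As a rough illustration of what I mean by “trained on all tasks combined at once”: below is my own generic sketch of interleaved multi-task training in a PyTorch-style loop, not code from the paper, and the model.loss API is an assumption.

```python
import random

def train_multitask(model, optimizer, task_loaders, steps):
    """Interleave batches from all tasks throughout training, rather than
    training tasks one after another (which tends to cause catastrophic forgetting)."""
    iters = {name: iter(loader) for name, loader in task_loaders.items()}
    for _ in range(steps):
        name = random.choice(list(task_loaders))   # sample a task at every step
        try:
            batch = next(iters[name])
        except StopIteration:                      # restart an exhausted loader
            iters[name] = iter(task_loaders[name])
            batch = next(iters[name])
        loss = model.loss(batch)                   # assumed model API for illustration
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

Adding a new task under this scheme means re-running the whole loop with the new task’s data mixed in, which is why I’d expect it to be expensive.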
It seems the generalist agent is not better than the specialized agents in terms of performance, generally. Interestingly, the generalist agent can’t use text-based tasks to help with image-based tasks. Glancing at Figure 17, it seems training on all tasks hurt performance on the robotics task (if I’m understanding it right). This is different from a human: a human who has read a manual on how to operate a forklift, for instance, would learn faster than a human who hasn’t read the manual. Are transformers like that? I don’t think we know, but my guess is probably not, and the results of this paper support that.
So I can see an argument here that this points towards a future that looks more like comprehensive AI services rather than a future where research is focused on building monolithic “AGIs,” which would lower x-risk concerns, I think. To be clear, I personally think the monolithic AGI future is much more likely, but this paper makes me update slightly away from that, if anything.
So right after posting this I found a report called “Problem area report: pain” published by the Happier Lives Institute. (It seems they have taken the report off their website, but you can find it on archive.org.)
They say this: “We considered investigating other pain-related problems, but decided not to: … Kidney stones and trigeminal neuralgia – although very painful conditions, they are not as painful as cluster headaches or as common as advanced cancer.”
It’s not clear to me that cluster headaches are much more painful than kidney stones. Really what needs to be done is more Fermi estimation. Data on kidney stone pain frequency and duration seem non-existent. Doing the estimate would require some digging through personal reports online and possibly conducting surveys. I have other stuff I want to write so I pushed out this post without going down that rabbit hole.
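For what it’s worth, here is the shape of the Fermi estimate I have in mind; every number below is a placeholder assumption made up for illustration, not data:

```python
# Back-of-the-envelope sketch of the annual kidney-stone pain burden (US).
# All inputs are assumed placeholders, not sourced figures.
population = 330e6
annual_episode_rate = 0.01   # assumed: fraction of people with a painful stone episode per year
hours_severe_pain = 6        # assumed: average hours of severe pain per episode

episodes = population * annual_episode_rate
severe_pain_hours = episodes * hours_severe_pain
print(f"~{episodes:,.0f} painful episodes/year, ~{severe_pain_hours:,.0f} hours of severe pain/year")
```

The analogous inputs for cluster headaches could then be plugged into the same template to compare the two.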
Kidney stone pain as a potential cause area
“It doesn’t naively seem like AI risk is noticeably higher or lower if recursive self-improvement doesn’t happen.” If I understand right, if recursive self-improvement is possible, it greatly increases the take-off speed and gives us much less time to fix things on the fly. Also, when Yudkowsky has talked about doomsday foom, my recollection is that he was generally assuming recursive self-improvement of a quite fast variety. So it is important.
(Implementing the AGI in a Harvard architecture, where source code is not in accessible/addressable memory, would help a bit in preventing recursive self-improvement.)
Unfortunately it’s very hard to reason about how easy or hard this would be, because we have absolutely no idea what future existentially dangerous AGI will look like. An agent might be able to add some “plugins” to its source code (for instance, to access various APIs online or run scientific simulation code), but if AI systems continue trending in the direction they are, a lot of its intelligence will probably be impenetrable deep nets.
An alternative scenario would be that intelligence level is directly related to something like the number of cortical columns, so to get smarter you just scale that up. The cortical columns are just world-modeling units, and something like an RL agent uses them to get reward. In that scenario, improving your world-modeling ability by increasing the number of cortical columns doesn’t really affect alignment much.
All this is just me talking off the top of my head. I am not aware of this being written about more rigorously anywhere.
For talking points, see my blog post containing responses to the common objections to the FDA quickly doing an EUA.
I plan to submit something based off that post. I also hope to be able to secure a spot to give an oral presentation when the committee meets again for the Paxlovid EUA application. (The deadline for oral presentations for the Molnupiravir EUA meeting passed quite a while ago).
Best practice is to always define acronyms before using them!