GPT-powered EA/LW weekly summary
Zoe Williams used to manually do weekly summaries of the EA Forum and LessWrong, but now she doesn’t
today I strung together a bunch of Google Apps Scripts, Google Sheets expressions, GraphQL queries, and D3.js to automatically extract all the posts on EAF/LW from the last week with >50 karma, summarize them with GPT-4, and assemble the result into HTML with links and stuff
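roughly, the extraction step is one GraphQL query against the forum API. here's a minimal sketch in TypeScript rather than my actual Apps Script; the field names (posts, baseScore, htmlBody, pageUrl) are guesses at the ForumMagnum schema, so check the forum's /graphql explorer before trusting them:

```typescript
// Sketch: fetch last week's EA Forum posts and keep those with >= 50 karma.
// Field names are guesses at the ForumMagnum schema, not verified against it.
async function fetchTopPosts(minKarma = 50) {
  const after = new Date(Date.now() - 7 * 24 * 60 * 60 * 1000).toISOString();
  const query = `
    {
      posts(input: { terms: { view: "new", after: "${after}", limit: 200 } }) {
        results {
          title
          pageUrl
          baseScore
          htmlBody
          user { displayName }
        }
      }
    }
  `;
  const res = await fetch("https://forum.effectivealtruism.org/graphql", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query }),
  });
  const { data } = await res.json();
  // Filter client-side so nothing depends on the API supporting a karma filter.
  return data.posts.results.filter((p: { baseScore: number }) => p.baseScore >= minKarma);
}
```

the same shape of query should work against lesswrong.com/graphql, since both forums run ForumMagnum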
hard to say what the API usage cost, what with all the tinkering and experimenting, but I reckon it was about $5
there were a bunch of posts which were too long for the API message length, so as a first whack i just cut stuff out of the middle of the post until it fit (Procrustes style)
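something in the spirit of this sketch (assuming a crude ~4-characters-per-token budget rather than a real tokenizer):

```typescript
// Crude "Procrustes" truncation: keep the head and tail of the post, drop the
// middle, so the text fits a rough character budget. A real tokenizer would be
// more accurate; ~4 characters per token is just a common rule of thumb.
function truncateMiddle(text: string, maxTokens = 6000, charsPerToken = 4): string {
  const budget = maxTokens * charsPerToken;
  if (text.length <= budget) return text;
  const half = Math.floor(budget / 2);
  return text.slice(0, half) + "\n\n[...middle of post omitted...]\n\n" + text.slice(-half);
}
```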
Siao (co-author) was going to help but i finished everything so fast she never got a chance lol
I haven’t spent much time sanity-checking these summaries, but I reckon they’re “good enough to be useful”
They often drop useful details or get the emphasis wrong.
I haven’t seen any outright fabrication.
Obviously if you have a special interest in some topic these aren’t going to substitute for reading the original post.
the obvious next few steps are:
automate the actual posting of the summaries.
does EAF/LW have an API for posting?
also summarize top comment(s)
this is kinda hard
experiment with prompts to see if other summaries are more useful
also generate a top level summary which gives you like 5 bullet points of the most important things from the forums this week
feedback that would be useful:
what would you (personally) like to be different about these summaries? should they be shorter? longer? bullet points? have quotes? fewer entries? more entries?
leave a comment or DM me or whatever with any old feedback
oh, and: i originally also got the top posts from the AI alignment forum, but they were all cross-posted on lesswrong? is that always true? anyone know?
EA Forum
Select examples of adverse selection in longtermist grantmaking
by Linch
The author, a volunteer and sometimes contractor for EA Funds’ Long-Term Future Fund (LTFF), discusses the pros and cons of diversification in longtermist EA funding. While diversification can increase financial stability, allow for a variety of worldviews, encourage accountability, and provide access to diverse networks, it can also lead to adverse selection, where projects that have been rejected by existing grantmakers are funded by new ones. The author provides examples of such cases and suggests that new grantmakers should be cautious about funding projects that have been rejected by others, but also acknowledges that grantmakers can make mistakes and that a network of independent funders could help ensure that unusual but potentially high-impact projects are not overlooked.
An Elephant in the Community Building room
by Kaleem
The author, a contractor for CEA and employee of EVOps, shares his personal views on the strategies of community building within the Effective Altruism (EA) movement. He identifies two main strategies: Global EA, which aims to spread EA ideas as widely as possible, and Narrow EA, which focuses on influencing a small group of highly influential people. The author argues that community builders and funders should be more explicit about their theory of change for global community building, as there could be significant trade-offs in impact between these two strategies.
CE alert: 2 new interventions for February-March 2024 Incubation Program
by CE
Charity Entrepreneurship has announced two new charity interventions for its February-March 2024 Incubation Program, bringing the total to six. The new interventions include an organization focused on bringing new funding into the animal advocacy movement and an organization providing structured pedagogy to improve education outcomes in low-income countries. The program offers two months of cost-covered training, stipends, funding up to $200,000, operational support, a co-working space in London, ongoing mentorship, and access to a community of alumni, funders, and experts.
“Dimensions of Pain” workshop: Summary and updated conclusions
by Rachel
A workshop was held to discuss strategies for assessing whether the severity or duration of pain has a greater impact on the overall negative experiences of farmed animals. The attendees agreed that no single method was reliable enough, but combining results from several paradigms could provide clearer insights. However, they also noted that current behavioral experiments may lack external validity, there are no known biomarkers that could measure pain experience over a lifetime, and in the absence of empirical evidence, long-lasting harms should be prioritized over severe but brief ones.
Taking prioritisation within ‘EA’ seriously
by CEvans
The article emphasizes the importance of career prioritization for those seeking to maximize their impact, particularly within the Effective Altruism (EA) community. The author argues that the best future ‘EA career paths’ are significantly more impactful than the median ‘EA career path’, and that over 50% of self-identifying effective altruists could increase their expected impact by thinking more carefully about prioritization. The author also provides advice on how to approach prioritization, including avoiding common mistakes such as anchoring too much on short-term opportunities, trying to form an inside view on everything, and being paralyzed by uncertainty.
EU farmed fish policy reform roadmap
by Neil_Dullaghan
The majority of fish consumed in the EU are either wild-caught or farmed fish imported from non-EU countries, with the report focusing on the species farmed in the largest numbers in the EU: sea bass, sea bream, and small trout. The report argues for a fast transition to better slaughter conditions for these species, a move already supported by EU policymakers and animal advocacy organizations. However, the report also highlights the need for the aquatic animal advocacy movement to start preparing for the EFSA’s upcoming opinions on farmed fish welfare, which could lead to further reforms affecting the whole life of an individual fish, such as water quality standards and stocking density maximums.
Longtermism Fund: August 2023 Grants Report
by Michael_Townsend
The Longtermism Fund has announced several grants aimed at reducing existential and catastrophic risks. These include two grants promoting beneficial AI, two for biosecurity and pandemic prevention, and one for improving nuclear security. The grants, which total $562,000, will fund projects at institutions such as Harvard University, the Alignment Research Center, NTI | Bio, the Center for Communicable Disease Dynamics, and the Carnegie Endowment for International Peace.
Personal Reflections on Longtermism
by NatKiilu
The author, an African woman, reflects on longtermism, arguing that it should include more interventions that address systemic change. She contends that current values can lower the quality of life for marginalized groups and that failing to address these issues contradicts the longtermist goal of human flourishing. The author also criticizes the neutral language of longtermism, arguing that it overlooks inequality and could disproportionately benefit privileged groups.
Corporate campaigns work: a key learning for AI Safety
by Jamie_Harris
Negotiations and pressure campaigns have been successful in driving corporate change across various industries, including animal advocacy. The author suggests that AI safety/governance can learn from these tactics, which include establishing professional relationships with companies, using petitions and protests, and applying consistent pressure on specific companies. The author also proposes next steps such as pragmatic research, learning by doing, working with volunteers, and moral trade, where AI safety organizations pay for experienced campaigners to provide advice or go on secondments.
Probably Good published a list of impact-focused job-boards
by Probably Good
Probably Good has launched a new page dedicated to impact-focused job boards to assist individuals seeking potentially impactful opportunities across various cause areas and regions. The page features a range of job boards, including those for international non-profit jobs, civil service positions, tech-focused roles, region-specific boards, and boards focused on climate change, animal advocacy, and global health. The page is still being developed and Probably Good is open to suggestions for additional job boards to include.
New probabilistic simulation tool
by ProbabilityEnjoyer
Dagger is a new tool for calculations with uncertainty, using Monte Carlo simulation. Users can either import an existing spreadsheet or use Probly, a Python dialect designed for probabilistic simulation. Despite its current limitations, such as the lack of a UI in Dagger for editing models and the fact that all models are public, Dagger offers features like a dependency graph, intuitive and mathematically rigorous sensitivity analysis, and a summary table that exposes the structure of your model.
Making EA more inclusive, representative, and impactful in Africa
by Ashura Batu
The authors, Ashura Batungwanayo and Hayley Martin, discuss the importance of Effective Altruism (EA) in Africa, emphasizing the need to balance existential risks with urgent issues like poverty and education. They propose an EA Africa initiative that blends bottom-up and top-down approaches for contextually attuned change, with a focus on local partnerships, co-designed interventions, and self-reliance. They suggest forging partnerships with local organizations, promoting knowledge sharing, and empowering communities as potential solutions to make EA more inclusive, representative, and impactful in Africa.
Impact obsession: Feeling like you never do enough good
by David_Althaus
“Impact obsession” is a term used to describe a potentially harmful way of relating to doing good, often observed among effective altruists. It is characterized by an overwhelming desire to do the most good possible, basing one’s self-worth on their impact, and often leads to overexertion, neglect of non-altruistic interests, and anxiety about not having enough impact. While some aspects of impact obsession are reasonable and desirable, others can lead to negative consequences like depression, anxiety, guilt, exhaustion, burnout, and disillusionment.
Why we should fear any bioengineered fungus and give fungi research attention
by emmannaemeka
The speaker argues that fungi should be taken seriously due to the limited number of effective antifungal drugs, the lack of vaccines for fungal infections, and the emergence of fungi such as Candida auris, which is resistant to some antifungals and has caused serious infections and even death. Fungi are also the only type of organism known to have caused the complete extinction of another species, and there are many Biosafety Level 3 fungal pathogens that are understudied and lack effective treatments. The speaker’s lab is currently studying the diversity of fungal species in Africa and how they are adapting to climate change, with the aim of identifying potential new fungi that could threaten human lives.
Empowering Numbers: FEM since 2021
by hhart
Family Empowerment Media (FEM) launched a radio campaign in 2021 to educate listeners in Nigeria about maternal health and contraception, reaching an estimated 5.6 million people. An independent survey showed that contraceptive use increased by about 75% among all women in the campaign’s target state within 11 months. FEM has since launched a 9-month campaign, reaching an estimated 20 million new listeners, and has been recommended by Giving What We Can and Founders Pledge for its cost-effectiveness.
My EA Journey
by Eli Kaufman
The author shares their journey into Effective Altruism (EA), which began with a podcast and led to extensive self-education on the topic. They found a way to align their IT career with their passion for doing good, and became an active member of the EA community in Amsterdam. The author encourages others to explore EA, emphasizing the importance of networking, the availability of resources, and the value of everyone’s unique contributions.
LessWrong
Against Almost Every Theory of Impact of Interpretability
by charbel-raphael-segerie
The author critiques the Theory of Impact (ToI) of interpretability for deep learning models, arguing that it is not a good predictor of future systems and that auditing deception with interpretability is out of reach. They also question the practical use of interpretability, suggesting that it may be overall harmful and that the proportion of junior researchers focusing on it is too high. The author suggests that preventive measures against deception seem more workable and that even if interpretability is completely solved, there are still dangers.
6 non-obvious mental health issues specific to AI safety
by igor-ivanov
The author, a psychotherapist, discusses the unique mental health challenges faced by those working in AI safety, a field that is not only competitive but also carries the weight of potentially determining the future of humanity. The author identifies several patterns of mental health issues, including feelings of meaninglessness, anxiety due to lack of control, alienation, burnout, feelings of inadequacy, and confusion due to differing opinions on the seriousness of AI alignment. The author encourages those struggling with these issues to share their experiences and coping strategies, emphasizing the importance of community support in this unusual field.
Large Language Models will be Great for Censorship
by Ethan Edwards
Large Language Models (LLMs) like GPT-4 could be used by authoritarian regimes to enhance censorship capabilities, as they can efficiently analyze and grade text for potential riskiness. The speed and cost-effectiveness of LLMs make them a powerful tool for real-time review and censorship of both public and private messages. While there are limitations, especially with visual content, the use of LLMs for censorship could have significant implications for personal communication, content moderation, and self-censorship in compliance with totalitarian powers.
Book Launch: “The Carving of Reality,” Best of LessWrong vol. III
by Raemon
The third volume of the Best of LessWrong books, “The Carving of Reality,” is now available on Amazon. The book includes 43 essays from 29 authors, divided into four books, each exploring two related topics. The themes of the essays revolve around “solving coordination problems” and “dealing with the binding constraints that were causing those coordination problems.”
The U.S. is becoming less stable
by lc
The author argues that American political institutions are losing their legitimacy and abandoning traditions of cooperative governance, leading to what they term “democratic backsliding”. They cite recent events such as large-scale protests, the arrest of opposition leaders, contested election results, and legislative movements as evidence of this decline. The author suggests that while the entire political system may not change soon, the probability of serious repression or further democratic backsliding should not be dismissed.
A Proof of Löb’s Theorem using Computability Theory
by jessica.liu.taylor
Löb’s Theorem, a principle in mathematical logic, states that if a logical system can prove that a statement’s provability implies the statement itself, then the system can prove the statement outright. This theorem is closely related to Gödel’s second incompleteness theorem, which states that a consistent logical system cannot prove its own consistency. The proof of Löb’s Theorem can be made more intuitive by reframing it as a variant of Gödel’s second incompleteness theorem and using computability theory.
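In standard provability-logic notation (a rendering added here for clarity, not quoted from the post):

```latex
% Löb's Theorem: if the system proves that provability of P implies P,
% then it proves P outright.
\[ \vdash \Box P \rightarrow P \;\Longrightarrow\; \vdash P \]
% Gödel's second incompleteness theorem is the special case P = \bot:
% a consistent system cannot prove \neg\Box\bot, i.e. its own consistency.
```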
AI Forecasting: Two Years In
by jsteinhardt
Two years ago, forecasts were commissioned for state-of-the-art performance on several popular machine learning benchmarks, including MATH and MMLU. The results showed that Metaculus and the author performed the best in their predictions, while AI experts and generalist “superforecasters” underestimated progress. The author concludes that expert predictions should be trusted more in this setting, and encourages both AI experts and superforecasters to publicly register their predictions on Metaculus.
“Dirty concepts” in AI alignment discourses, and some guesses for how to deal with them
by Nora_Ammann
Javier Gomez-Lavin, in a talk for the PIBBSS speaker series, discusses the issue of “dirty concepts” in cognitive sciences, which he defines as philosophically loaded concepts that are often implicitly associated with the concept of “working memory”. He suggests that instead of abandoning these “dirty concepts”, researchers should create an ontology of the various operational definitions of working memory used in cognitive science. Gomez-Lavin’s ideas could also be applied to AI alignment research, which often uses “dirty concepts” such as “agency”, and could benefit from a similar ontology mapping of the various operational definitions of agency used in the field.
Steven Wolfram on AI Alignment
by bill-benzon
In a conversation with Joe Walker, Stephen Wolfram discusses the challenges of aligning artificial intelligence (AI) with human values. Wolfram argues that it is impossible to create a mathematical definition of what we want AIs to be like, as human aspirations and ethical beliefs vary greatly. He suggests that instead of trying to create a prescriptive set of principles for AI, we should develop a framework of potential principles that can be chosen from, acknowledging that this approach will also have its own challenges and unexpected consequences.
AI #25: Inflection Point
by Zvi
The CEO of AI lab Inflection.ai is advocating for AI regulation. There is ongoing debate about whether AI is existentially risky and whether GPT-4, the latest iteration of OpenAI’s language model, is creative and capable of reasoning. Meanwhile, tech giants Amazon and Apple are gradually enhancing their AI capabilities.
State of Generally Available Self-Driving
by jkaufman
The current state of self-driving technology is divided into two main approaches: taxis and personal vehicles. Companies like Waymo, Cruise, Apollo, and possibly pony.ai are operating fully driverless commercial ride services in select cities, with Waymo and Cruise claiming to serve 10k weekly riders as of August 2023. For personal vehicles, the most automation available is Level 3, where the system can handle most tasks but requires the driver to take over in certain situations, with Mercedes’ Drive Pilot being a commercially available option.
DIY Deliberate Practice
by lynettebye
The author attempted to apply Ericsson’s principles of deliberate practice to improve their writing speed. Deliberate practice involves purposeful practice outside of one’s comfort zone, active thinking, specific goals, quick feedback, and a well-developed knowledge of what and how to practice. The author initially struggled with writing a post each day, but after revising their approach to focus on outlining and planning posts before drafting them, they saw improvement in their ability to keep drafts short and manage the size of their posts.
Ideas for improving epistemics in AI safety outreach
by michael-chen
In 2022 and 2023, there has been a surge in efforts to recruit individuals to work on mitigating the potential existential risks posed by artificial intelligence (AI), including university clubs, retreats, and workshops. However, these efforts may foster an environment with suboptimal epistemics, as many people working on field building are not domain experts in AI safety or machine learning and may not fully comprehend the reasoning behind their belief in the importance of AI safety. To improve epistemics in outreach efforts, suggestions include embracing more contemporary arguments about why AIs could be dangerous, conducting readings during meetings for better understanding, offering rationality workshops, actively checking participants’ understanding of the content, and staying in touch with the broader machine learning community.
Will AI kill everyone? Here’s what the godfathers of AI have to say [RA video]
by Writer
The video discusses the differing views on the potential dangers of advanced AI among three pioneers of deep learning: Geoffrey Hinton, Yoshua Bengio, and Yann LeCun, who were awarded the ACM Turing Award in 2018. Hinton and Bengio have expressed concerns about the risks posed by AI, with Hinton leaving Google to speak openly about these dangers and Bengio defining a “rogue AI” as one that could be catastrophically harmful to humans. However, LeCun dismisses these concerns as scaremongering, despite the growing consensus among AI researchers, including the leaders of OpenAI, Anthropic, and Google DeepMind, that advanced AI could pose significant risks.
Managing risks of our own work
by beth-barnes
The ARC Evals team published a report to increase understanding of the potential dangers of frontier AI models and to advance safety evaluations of these models. The team acknowledges that their research could potentially advance the capabilities of dangerous language model agents, and in response, they have redacted certain parts of their report. However, they may make this material public in the future if they believe the risk is minimal or if further analysis justifies its release, and they will share some non-public materials with AI labs and policymakers.
If we had known the atmosphere would ignite
by Jeffs
The text discusses the potential dangers of creating Artificial General Intelligence (AGI) that cannot be aligned with human values and interests. It draws parallels with the Manhattan Project, suggesting that if scientists had calculated that the first atomic chain reaction would ignite the atmosphere, they could have taken steps to prevent this, such as securing uranium supplies and accelerating space programs. The author proposes a $10 million “Impossibility X-Prize” to incentivize efforts to prove whether aligning AGI is impossible, arguing that even if it fails, it could provide insights into how alignment might be possible.