Announcing Scry: A research tool over arXiv, EA Forum, etc. w/ SQL + Vibe/Vector Algebra
Today I’m releasing ExoPriors Scry, a new research tool over the EA Forum, arXiv, HackerNews, community-archive.org, and more. Try it here.
TLDR
You and your agent can now query this rich dataset with the full expressive power of SQL (Structured Query Language) + embedding vector algebra. You DON’T have to be familiar with these, Claude writes the search queries for you. Just ask Claude whatever questions you have about the tool.
It’s best driven via Claude Code (with Opus 4.5). I highly recommend getting acquainted with it. It can be driven by Claude in the browser—you just have to change one setting and paste in the prompt—but it’s significantly slower and less agentic.
Even with all the fanciness of AI so far and Deep Research agents, this is a novel search/research experience, that allows you to do things you can’t do anywhere else.
Exploration is encouraged! Text embeddings are kind of magical. This an easy way to get more familiar with them, like how to compose “vibes” like @FTX_controversy - (@guilt_tone - @guilt_topic), to find writing that was about the FTX crisis and distinctly without guilty tones, but that can mention “guilt”.
There’s an Alerts functionality. You can be emailed when some new post, comment, arXiv paper, etc., is ingested and meets your very nuanced criteria.
Some example usage:
> find the 4 posts over 100 karma that are most distant from each other in every way (not the average of them). we want to create 4 quadrants.
> what are the most whimsical and uplifting EA Forum posts?
> find me the most serious documents about civilizational refuges, and anyone who could be able to be rapidly mobilized to help with that.
> who flip flopped multiple times on FTX crisis, being against → for sam → against, or for sam → against → for sam, in time
> Which EA Forum posts predicted things about AI that turned out to be wrong, and what were the common failure modes?
> exhaustively explore who has tended to coin the most new terms
> the people who have had very uplifting and whimsical EA posts, how are they doing. does anyone seem like they could use support. followed up with: search for all of lesswrong, twitter, etc., of people who plausibly feel abandoned by EA. do sophisticated search.
^ I got a very meaningful response, but will leave out results for privacy.
Some cool aspects of this project:
hardening up a SQL database enough to let the public run queries. There’s so much collective trauma about SQL injection attacks that most people have forgotten that this is possible.
I’ve built on syntactic sugar for using custom vectors. Agents can embed arbitrary queries and refer to them with
@vector_handlesyntax. This compactness helps agents reason efficiently, and let’s us not have to pass around 8kb vectors.product-as-a-prompt, agent-copilot-targeted UX as a paradigm. It was pretty cool realizing I could e.g. just describe my /feedback API endpoint in the prompt, to open up the easy communication channel with users and help me iterate on the project better.
There’s currently over 23M embeddings and 600 GB of indexes for query performance, and long query time limits. You can learn a lot about the actual shape of the world with exhaustive search, like finding that no single person across Hacker News mentions a combination of particular topics across their comments.
there’s an Alerts feature. The system ingests thousands of papers, posts, articles, comments a day, you can specify an arbitrary SQL query that will run multiple times a day, and get an email when the output changes. If you find a query you like, you can just tell Claude to submit that query as an Alert (and it’ll use our API appropriately).
Bounty Offer
When I’m in the mindset of building this, I tend to struggle at coming up with interesting research questions. I will pay $12/$8/$4 to the three most interesting queries (rough prompt + SQL) provided in the comments, on January 1st.
Happy to take any feedback!
P.S. If anyone wants to sponsor this, I could ingest a large fraction of all academic papers, Bluesky, and dozens of other sources that you point out.
I’m curious about how representative the EA forum is to the EA community, particularly in regards to worldview. One thing you could try is to take the EA community surveys done by RP, and estimate how a random representative of “the forum” would answer, where the random representative might be a random user or a random comment/post weighted by karma or amount of text
I fed your message to Scry! Seems the forum is significantly more longtermist.
99% karma-weighted of tagged posts about AI seems wrong
if you check the top 4 posts of all time, the 1st and 3rd are about FTX, the 2nd about earning to give and the 4th about health, totalling > 2k karma
might want to check for bugs
You’re so right, I should have caught that. The database data is good, but Opus 4.5 did a rare error. I’ve since updated the response.
Im still in awe of current AI capabilities
It’s hard to fathom what’s straightforwardly possible already.
Did a decent job for this academic paper, but I think it’s hampered by only having content from Arxiv and various EA/tech forums. Still, it generated some interesting leads.
https://gistcdn.githack.com/daaronr/b9447c40a7a6b948f399073496f98c37/raw/scanner_elasticity_experts.html
Prompt:
Trying this out for using for various Unjournal.org processes (like prioritizing research, finding potential evaluators, linking research to pivotal questions) and projects (assessing LLM vs human research evaluations). Some initial forays (comming from a conversation with Xyra). I still need to human-check it.
~prompt to Claude code about @Toby_Ord and How Well Does RL Scale?
``
Toby Ord's writing -- what do the clusters look like? What other research/experts come closest to his post ....https://forum.effectivealtruism.org/posts/TysuCdgwDnQjH3LyY/how-well-does-rl-scale``This interactive visual (with some extra prompts)
Key Takeaways
For “How Well Does RL Scale?”, the closest intellectual neighbors are:
ML researchers focused on compute efficiency and scaling laws (e.g. 1a3orn, Pablo Villalobos), and
AI safety researchers analyzing inference scaling implications (e.g. Ryan Kidd, ryan_greenblatt).
Stuart Armstrong appears to be the only researcher in this corpus writing substantively across all three of Toby Ord’s main intellectual themes:
AI scaling
Existential risk / longtermism
Infinite ethics and decision theory
Most Similar Researchers to“How Well Does RL Scale?”
Top Authors (by Semantic Similarity)
@1a3orn @Pablo Villalobos @Matrice Jacobine🔸🏳️⚧️ @Lukas Finnveden @Ryan Greenblatt @Ryan Kidd -- if you have a chance, let me know if this is accurate/relevant.
Most Similar arXiv Paper
Toby Ord’s Writing Clusters
Based on semantic analysis, Toby Ord’s writing appears to fall into three main thematic clusters.
Cluster 1: AI Scaling & Timelines (Recent)
“How Well Does RL Scale?” (Oct 2025)
Core claim: RL training scales poorly; most gains come from inference scaling (longer chains of thought), not training compute.
Key implication: This likely lengthens AI timelines and affects governance and safety strategies.
Cluster 2: Existential Risk & Longtermism
The Precipice and related work
“The Precipice Revisited” (Jul 2024)
“Shaping Humanity’s Longterm Trajectory” (Jul 2023)
“Robust Longterm Comparisons” (May 2024)
Topics include:
Existential risk estimation
Discounting
Anthropic effects / anthropic shadow
Cluster 3: Infinite Ethics & Decision Theory
“Evaluating the Infinite” (Sep 2025) — hyperreal-based approach to divergent sums and integrals
“What’s wrong with the dogmatic solution to recklessness?” (Feb 2023)
“The Ethics of Copying Conscious States” (Aug 2025)
Topics include:
Pascal’s Mugging
Unbounded utility
Decision-making under infinity
This was a linkpost, I didn’t write that paper.
Thanks, helpful to understand the limitations of this tool
Update: The service has been improving a lot. Infra is more robust, higher query limits, more sources, a further optimized Claude Prompt with much improved guidance for meaningful embedding vector manipulation. Someone could write some interesting quicktakes from the research you’re easily able to do with this.