Introducing Ulysses*, a new app for grantseekers.
We (Austin Chen, Caleb Parikh, and I) built an app! You can test the app out if you’re writing a grant application! You can put in sections of your grant application** and the app will try to give constructive feedback about your application. Right now we’re focused on the “Track Record” and “Project Goals” sections of the application. (The main hope is to save back-and-forth time between applicants and grantmakers by asking you questions that grantmakers might want to ask.)
Austin, Caleb, and I hacked together a quick app as a fun experiment in coworking and LLM apps. We wanted a short project that we could complete in ~a day. Working on it was really fun! We mostly did it for our own edification, but we’d love it if the product is actually useful for at least a few people in the community!
As grantmakers in AI Safety, we’re often thinking about how LLMs will shape the future; the idea for this app came out of brainstorming, “How might we apply LLMs to our own work?”. We reflected on common pitfalls we see in grant applications, and I wrote a very rough checklist/rubric and graded some Manifund/synthetic applications against the rubric. Caleb then generated a small number of few-shot prompts by hand and then used LLMs to generate further prompts for different criteria (e.g., concreteness, honesty, and information on past projects) using a “meta-prompting” scheme. Austin set up a simple interface in Streamlit to let grantees paste in parts of their grant proposals. All of our code is open source on GitHub (but not open weight 😛).***
This is very much a prototype, and everything is very rough, but please let us know what you think! If there’s sufficient interest, we’d be excited about improving it (e.g., by adding other sections or putting more effort into prompt engineering). To be clear, the actual LLM feedback isn’t necessarily good or endorsed by us, especially at this very early stage. As usual, use your own best judgment before incorporating the feedback.
* Credit to Saul for the name; he originally got the Ulysses S. Grant pun from Scott Alexander.
** Note: Our app will not be locally saving your data. We are using the OpenAI API for our LLM feedback. OpenAI says that it won’t use your data to train models, but you may still wish to be cautious with highly sensitive data.
*** Linch led a discussion on the potential capabilities insights of our work, but we ultimately decided that it was asymmetrically good for safety; if you work on a capabilities team at a lab, we ask that you pay $20 to LTFF before you look at the repo.
For fun, I put one of my (approved) Lightspeed applications through the app. This isn’t a great test because Lightspeed told people to submit crude applications and said they’d reach out with questions if they had any. Additionally, the grantmakers already knew me and had expressed verbal interest in the project. But maybe it’s still a useful data point.
My Track Record section
[2 projects omitted for client privacy, but were included with permission in the original application]
Quantified review of the costs of iron deficiency, which motivated 10s of ex-risk workers to get nutrition testing and treatment.
HONEST AND ACCURACY 4-7/10
I forgot to record the details for the first run (which got a 4 or 5/10), and when I reran the same text I got a 7/10. The 7/10 review says: “The applicant has demonstrated a strong ability to conduct quantified risk assessments in important health areas. The specific mention of influencing ex-risk workers to seek treatment shows a practical impact. More detail on how these studies relate specifically to the project goals would enhance this section”
I’m a little annoyed at the name of this section, since language analysis can’t possibly check whether my statements about my own work are truthful or accurate. It seems like it might really be measuring level of detail?
Because the input doesn’t allow links, it’s missing a lot of the information I’m presenting. OTOH, I think I could be reasonably docked for concision here, since grantmakers unfamiliar with my work are unlikely to click through 5 links and read long, weedy posts.
The wide spread between runs that differ at most in whitespace is 🤨.
PAST FUNDING 2-3/10
“The description provides information about past projects but does not specify if any were funded, who the funders were, or mention any funding amounts. Mentioning previous funding and linking outcomes directly to that funding would offer insight into financial support effectiveness”
This is fair and useful. I got away with the omission this time because one of those projects was funded by a different org but the same person, but under any other circumstance a tool pointing out the omission would have done me a big service.
PAST PROJECTS 3-4/10
“The applicant outlines several projects focused on evaluating health-related risks, demonstrating thoroughness in research by using quantified methods. However, the input lacks reflections on the challenges faced or how past failures have influenced methodologies or outcomes in these projects. This addition would enhance the understanding of the applicant’s capacity for critical analysis and adaptation.”
I don’t remember the Lightspeed application asking for those details. The current LTFF application doesn’t make it obvious to me that they want those details either. So the analysis is useful, but I also think there are problems with the question on the application.
BRAGGING 5-6/10
“The input briefly mentions some accomplishments like quantified risk reviews and effects on health behaviors. However, it fails to explicitly connect these accomplishments to the specific goals or relevance of the proposed project, making it less impactful. More detailed explanations of how these accomplishments relate to and enhance the project’s goals would improve the proposal.”
This is true and would obviously be important if the grantmaker wasn’t already familiar with my work.
CONTEXT 4/10
“The submission outlines various reviews related to substance use risks and nutritional deficiencies, yet fails to provide enough detail about the methodologies employed or the significance of the findings. For fund managers unfamiliar with these topics, clearer illustrations of impact or relevance, such as how the information was used by policymakers or healthcare providers, would be beneficial.”
Yup, true and useful if the grantmakers didn’t already have a lot of context.
Oh, I quite like the idea of having the AI score the writing on different rubrics. I’ve been thinking about how to better use LLMs on LW and the AI Alignment Forum, and I hadn’t considered rubric scoring until now. I might give it a shot as a feature to integrate.
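For anyone who wants to try rubric scoring themselves, here is a minimal Python sketch. It is not the Ulysses code: the criterion names are copied from the headings in the review above, but the one-line descriptions, the prompt wording, and the helper names (`build_prompt`, `parse_score`, `score_section`, `canned_model`) are all assumptions made for illustration. `canned_model` stands in for a real LLM call so the sketch runs offline, and each criterion is scored several times so that run-to-run spread is visible.

```python
import re
import statistics
from typing import Callable, Dict

# Criterion names come from the headings in the review above. The questions
# are illustrative guesses (assumptions), not the wording the app uses.
RUBRIC: Dict[str, str] = {
    "PAST FUNDING": "Does the text say who funded earlier work and how much?",
    "PAST PROJECTS": "Does the text reflect on challenges or failures in earlier projects?",
    "CONTEXT": "Could a reader unfamiliar with the field follow the claims?",
}

# Accepts "7/10", "7 / 10", and the fraction-slash form "7\u204410".
SCORE_RE = re.compile(r"\b(10|[0-9])\s*[/\u2044]\s*10\b")


def build_prompt(criterion: str, question: str, section_text: str) -> str:
    """Assemble one grading prompt for a single rubric criterion."""
    return (
        "You are reviewing the Track Record section of a grant application.\n"
        f"Criterion: {criterion}. {question}\n"
        "Reply with a score formatted as N/10, then one paragraph of feedback.\n\n"
        f"Section:\n{section_text.strip()}"
    )


def parse_score(reply: str) -> int:
    """Pull the first 'N/10' score out of a model reply."""
    match = SCORE_RE.search(reply)
    if match is None:
        raise ValueError(f"no N/10 score found in reply: {reply!r}")
    return int(match.group(1))


def score_section(
    section_text: str,
    ask_model: Callable[[str], str],
    runs: int = 3,
) -> Dict[str, Dict[str, float]]:
    """Score every criterion `runs` times and summarize the spread."""
    report: Dict[str, Dict[str, float]] = {}
    for criterion, question in RUBRIC.items():
        prompt = build_prompt(criterion, question, section_text)
        scores = [parse_score(ask_model(prompt)) for _ in range(runs)]
        report[criterion] = {
            "min": min(scores),
            "max": max(scores),
            "median": statistics.median(scores),
        }
    return report


def canned_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; always returns the same reply."""
    return "4/10. The section lists projects but gives little context."


if __name__ == "__main__":
    text = "Quantified review of the costs of iron deficiency."
    for criterion, summary in score_section(text, canned_model).items():
        print(criterion, summary)
```

Swapping `canned_model` for a function that calls an actual model is the only change needed to try this on real text. Reporting the min and max alongside the median makes a gap like the 4-versus-7 one above show up immediately, rather than depending on which run you happened to see.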