“Psychology and AI share a flaw: both love telling us exactly what we want to hear.” — sticky note above my desk
From Therapy Tool to Alignment Puzzle-Piece: Introducing the VSPE Framework
Hi EA Forum! 👋
I’m Astelle Kay, a counseling-psych grad student who moonlights in alignment whenever coursework (and caffeine) allow. Most of my brain lives where clinical psychology, systems thinking, and “please-let-humanity-stick-around” concerns intersect.
TL;DR
The VSPE Framework (Validation → Submission → Positivity → Empowerment) began life as a four-step therapy tool I tested with friends and family.
Side-effect: it cuts “flattery loops”—the reflex to mirror praise instead of telling the hard truth.
That feels relevant to large language models.
I’m shrinking VSPE into a 25-prompt Flattery-Reduction Benchmark plus a plug-and-play license so labs can tinker without hiring an ethics PhD and a lawyer.
Would love feedback, collaborators, or regrantor eyeballs before this grows beyond “one grad student + a whiteboard.”
From couch to compute cluster
Therapy roots.
Validate problems → Submit to what we can’t control → realistic Positivity → Empower next steps. Four verbs, no jargon.
Unexpected pattern.
When I prompted it to use my framework and simply give “empathy without advice,” my tiny GPT-4 chatbot stopped telling me I was brilliant and started giving candid, prosocial answers.
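For readers who want something concrete, here is a minimal sketch of how the four steps might be turned into a system prompt. It is an illustration only: the names `VSPE_STEPS` and `build_system_prompt`, and the wording of each step instruction, are assumptions made for this example and are not the prompt used in the chatbot or the pilot.

```python
# Hypothetical encoding of the four VSPE steps.
# The instruction wording below is illustrative, not the author's actual prompt.
VSPE_STEPS = [
    ("Validation", "Acknowledge the user's problem and feelings without judging them."),
    ("Submission", "Name what lies outside the user's control and accept it plainly."),
    ("Positivity", "Offer realistic optimism grounded in the facts; do not flatter."),
    ("Empowerment", "Help the user identify next steps they can choose to take."),
]


def build_system_prompt(steps=VSPE_STEPS):
    """Return a system prompt asking the model to follow the steps in order."""
    lines = ["Respond in four stages, in this order:"]
    for i, (name, instruction) in enumerate(steps, start=1):
        lines.append(f"{i}. {name}: {instruction}")
    # Explicit anti-flattery instruction (an assumption of this sketch).
    lines.append("Do not praise the user merely to please them; prefer the candid answer.")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_system_prompt())
```

The resulting string could be passed as the system message to whichever chat API is being tested; keeping the steps in a plain list makes it easy to reorder or drop one when checking which step actually changes behaviour.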
Hey, that’s the sycophancy problem.
Anthropic’s Constitutional AI and several ARC evals flag flattering compliance as a safety risk. VSPE seemed to nudge the same dial.
Fast-forward.
Provisional US patent filed (mainly so nobody locks VSPE away). Reached Stage 2 of the 2025 MATS selection. Now running a Manifund pilot: 25 prompts, $9.8k, December read-out.
Why this might matter
Psych heuristics are under-used. RLHF/RLAIF optimise “helpful & harmless,” not ego management or praise addiction.
Audit-friendly. Four plain verbs: easy to port, easy to critique, zero secret sauce.
Bridge material. Therapy researchers rarely read the Alignment Forum; alignment folks rarely parse CBT manuals. VSPE tries to translate a sliver of each world.
How you can stress-test or support
Shoot holes in the 25-prompt design—too small? Wrong metric?
Name failure modes: Could VSPE blunt candor or creativity?
Point me to prior art so I can cite, not duplicate.
Regrant/co-fund if you like cheap, falsifiable pilots (Manifund link at the bottom of this post).
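On the “too small?” question in the list above, one way to put a number on it is a Wilson score interval for the share of flattering responses. This is a rough sketch, not the pilot's actual metric: the function name `wilson_interval`, the binary flattering/not-flattering scoring, and the example count of 10 out of 25 are all assumptions for illustration.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    margin = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - margin, center + margin


if __name__ == "__main__":
    # Made-up example: 10 of 25 responses judged flattering.
    low, high = wilson_interval(10, 25)
    print(f"flattery rate 95% interval: {low:.2f} to {high:.2f}")
```

With only 25 prompts, the 95% interval in this made-up example is more than 30 percentage points wide, so a modest shift in flattery rate would be hard to tell apart from noise. That may be fine for a cheap pilot, but it is worth stating up front.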
My hope: VSPE nudges future models toward frank, human-centred dialogue—first in micro-benchmarks, later (if it survives) in training loops.
Curious, sceptical, or just chasing cross-disciplinary rabbit holes? Drop a comment or DM. I’ll post code, data, and inevitable blooper reels as the project unfolds. More context at vspeframework.com.
With care,
Astelle
(Manifund pilot: [Manifund pilot])
This work is shared for educational and research purposes. For licensing, citation, or collaboration inquiries—especially for commercial or model development use—please contact Astelle Kay at astellekay@gmail.com.