Embeddings are about $20 per billion tokens with voyage-4-lite ($0.02/M) and I’ve spent like $500. The model seemed strong on all the properties of a good embedding model at a viable price point, and Voyage 4 embeddings have an interesting property: voyage-4-lite vectors are compatible with those from voyage-4-nano (which I can compute locally), voyage-4, and voyage-4-large, for when I feel like upgrading. My chunking strategy is semantically aware (e.g. it works to split on sentences, paragraphs, and common delimiters), with a target of 164 tokens per chunk and about 20% overlap.
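A minimal sketch of that kind of chunker (my assumptions: greedy packing over paragraph-then-sentence splits, word count standing in for the real tokenizer, and the overlap carried as whole trailing sentences):

```python
import re

def chunk(text, target_tokens=164, overlap=0.2):
    """Greedy semantically-aware chunker: split on paragraph breaks, then
    sentence boundaries, and pack pieces up to ~target_tokens (word count
    as a rough token proxy) with ~20% overlap between consecutive chunks."""
    pieces = []
    for para in re.split(r"\n\s*\n", text):
        sents = re.split(r"(?<=[.!?])\s+", para.strip())
        pieces.extend(s for s in sents if s)

    chunks, cur, cur_len = [], [], 0
    for piece in pieces:
        n = len(piece.split())
        if cur and cur_len + n > target_tokens:
            chunks.append(" ".join(cur))
            # Carry the tail of the finished chunk forward as overlap.
            keep, kept = [], 0
            for s in reversed(cur):
                kept += len(s.split())
                keep.insert(0, s)
                if kept >= target_tokens * overlap:
                    break
            cur, cur_len = keep, kept
        cur.append(piece)
        cur_len += n
    if cur:
        chunks.append(" ".join(cur))
    return chunks
```

The real pipeline counts tokens with the embedding model’s tokenizer rather than whitespace words, but the boundary-preference logic is the same shape.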
Searching across corpora absolutely works, since embeddings are just a function of text/tokens. The compositionality of embeddings works amazingly too (e.g. debias_vector(@guilt_axis, guilt_topic) searches for guilty vibes without overindexing on text that literally mentions “guilt”), although there are absolutely footguns, and intuition that needs to be built for it (which I try to distill into prompts for agents; I also have a prompt designed to teach exploration of embedding space).
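One plausible reading of a debias_vector operation (a sketch, not the actual implementation: remove the query’s component along a lexical axis via orthogonal projection, then renormalize):

```python
import numpy as np

def debias_vector(axis, query):
    """Remove the component of `query` along `axis`, then renormalize.
    The result is orthogonal to the axis, so nearest-neighbor search
    follows the query's meaning minus the axis direction — e.g. 'guilty
    vibes' without over-weighting literal mentions of 'guilt'."""
    axis = axis / np.linalg.norm(axis)
    debiased = query - np.dot(query, axis) * axis
    return debiased / np.linalg.norm(debiased)
```

The main footgun this illustrates: projection can remove a lot of the query’s signal when the query and axis are nearly parallel, which is exactly when intuition about what the search will return gets unreliable.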
Like, this is basically a canonical research substrate: a well-indexed, large corpus of high-leverage data with ML embeddings, queryable with SQL (Datalog would be better, but agents don’t have as much experience with it and its implementations don’t have great support for embeddings). It really would be nice to get funding for this, adopt a more abundance-oriented mindset to improve shipping velocity, and get this substrate in front of more researchers (e.g. Coefficient Giving) to help with triage grantmaking in the singularity.
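To make “embeddings queryable with SQL” concrete, here’s a toy version using SQLite with cosine similarity registered as a SQL function (an illustrative sketch only; the table name, JSON vector storage, and 3-dimensional vectors are all my assumptions, and a real deployment would use a vector-aware database or extension):

```python
import json
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY, text TEXT, emb TEXT)")

def cosine(a_json, b_json):
    """Cosine similarity between two JSON-encoded vectors."""
    a, b = json.loads(a_json), json.loads(b_json)
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

conn.create_function("cosine", 2, cosine)

# Tiny fake corpus with hand-written 3-d "embeddings".
rows = [
    (1, "confession and remorse", [0.9, 0.1, 0.0]),
    (2, "weather report", [0.0, 0.2, 0.9]),
]
conn.executemany(
    "INSERT INTO chunks VALUES (?, ?, ?)",
    [(i, t, json.dumps(e)) for i, t, e in rows],
)

# Rank chunks by similarity to a query vector, in plain SQL.
query = json.dumps([1.0, 0.0, 0.0])
top = conn.execute(
    "SELECT text FROM chunks ORDER BY cosine(emb, ?) DESC LIMIT 1", (query,)
).fetchone()
```

The point is that once vectors live in the same store as everything else, similarity ranking composes with ordinary WHERE clauses, joins, and aggregates, which is what makes this shape of substrate so agent-friendly.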
As for comparing to Elicit: this certainly offers users powers they couldn’t get Elicit to deliver without basically implementing the same thing themselves, but Elicit of course has beautiful UIs that are friendlier to the human eye, and workflows researchers are more familiar with. Elicit should basically provide this functionality to its users, and Scry could afford to offer novel UIs for people, but I tend to be much more comfortable iterating on backend and API functionality than on UIs (which I do have taste for, but they take a lot of time).