Wait, what? This is awesome.
In the comment above, Nathan Young has linked his Polis app.
To explain mechanically what seems to be going on in that page:
The website asks a bunch of questions and gets responses from people.
The answer to these questions creates a set of preferences for each person (their positive or negative reactions to each question).
It takes each person’s preferences, and compares this to other people, using some measure of “distance” (e.g. counting fraction of agreement or disagreement).
It then uses the distances to arrange people’s preferences onto a 2d grid (embeds the preferences onto a low dimensional space)
It then groups the preferences on the grid (“clusters”), like, by drawing circles around them.
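As a rough illustration of the distance and grouping steps above, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the names and votes are made up, `disagreement` and `cluster` are hypothetical helpers, and the grouping rule (link anyone within a distance threshold) is just the simplest thing that works, not what Polis actually does. It skips the 2d embedding step and groups directly on the distances.

```python
from itertools import combinations

# Hypothetical votes: +1 agree, -1 disagree, 0 pass/unseen.
votes = {
    "ann":  [ 1,  1, -1, -1,  1],
    "bob":  [ 1,  1, -1,  0,  1],
    "cara": [-1, -1,  1,  1, -1],
    "dev":  [-1,  0,  1,  1, -1],
    "eli":  [ 1, -1, -1,  1,  1],
}

def disagreement(a, b):
    """Fraction of questions both people answered on which they disagree."""
    shared = [(x, y) for x, y in zip(a, b) if x != 0 and y != 0]
    if not shared:
        return 1.0  # no overlap: treated as maximally distant (an assumption)
    return sum(x != y for x, y in shared) / len(shared)

def cluster(votes, threshold):
    """Link any two people within `threshold`; linked chains form one group."""
    parent = {name: name for name in votes}

    def find(name):
        # Follow parent links up to the representative of this group.
        while parent[name] != name:
            name = parent[name]
        return name

    for p, q in combinations(votes, 2):
        if disagreement(votes[p], votes[q]) <= threshold:
            parent[find(p)] = find(q)

    groups = {}
    for name in votes:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())

loose = cluster(votes, threshold=0.3)
strict = cluster(votes, threshold=0.1)
print(loose)   # two camps
print(strict)  # three camps: the in-between voter splits off
```

Note that just moving the threshold from 0.3 to 0.1 changes the number of camps from two to three, which is a small taste of how much the picture depends on tuning choices.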
This sounds complicated if you’re not a nerd, but it’s not, really! See footnotes[1][2][3]!
There are very important details/subtleties/extensions here:
This relies on “high quality” questions that “interrogate and cover the space of opinions” “correctly”.
Even with perfect questions, small, subtle changes in the embedding algorithm or clustering will get very different results. The envelope of possible results could be dramatically larger still if you can tweak the questions or their presentation.
In short, what you see isn’t necessarily “Truth” or even truth-seeking, and the groupings may not be “real”.
For the underlying methods, this isn’t hard to achieve. To calibrate, I think this is an example of unsupervised learning, and could come out of a CS200 class project. If you can code basic Python or start in “data science”, you could do this! Gates are open, come in!
Here are some notebooks:
https://tonio73.github.io/data-science/cnn/CnnVsDense-Part2-Visualization.html
https://ipython-books.github.io/88-detecting-hidden-structures-in-a-dataset-with-clustering/
https://jakevdp.github.io/PythonDataScienceHandbook/05.11-k-means.html
https://towardsdatascience.com/a-gentle-introduction-to-hdbscan-and-density-based-clustering-5fd79329c1e8?gi=9c53e5cef389
The second and third notebooks are good introductions to clustering and use k-means. But some find k-means a little too basic. Polis seems to be using density methods like HDBSCAN.
The embedding might be UMAP or t-SNE. My guess is that the clustering is probably trivial, but there might be some subtle issues related to the embedding.
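To make the embedding step concrete, here is a small sketch that places people on a 2d grid. UMAP and t-SNE don't fit in a few lines, so this swaps in PCA (a simpler, linear method) computed by power iteration in plain Python. The votes are made up, the helper names are hypothetical, and treating a pass as the value 0 is an assumption. This is not a claim about what Polis runs.

```python
import math

# Hypothetical votes: +1 agree, -1 disagree, 0 pass (treated as neutral).
votes = {
    "ann":  [ 1,  1, -1, -1,  1],
    "bob":  [ 1,  1, -1,  0,  1],
    "cara": [-1, -1,  1,  1, -1],
    "dev":  [-1,  0,  1,  1, -1],
    "eli":  [ 1, -1, -1,  1,  1],
}

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def center(rows):
    """Subtract each question's mean vote from that question's column."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def leading_direction(rows, iterations=200):
    """Approximate the top principal direction by power iteration."""
    dim = len(rows[0])
    start = [float(j + 1) for j in range(dim)]  # arbitrary non-uniform start
    length = math.sqrt(dot(start, start))
    v = [x / length for x in start]
    for _ in range(iterations):
        scores = [dot(row, v) for row in rows]        # X v
        w = [dot(scores, col) for col in zip(*rows)]  # X^T (X v)
        norm = math.sqrt(dot(w, w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v

def embed_2d(rows):
    """Project each row onto the two leading principal directions."""
    rows = center(rows)
    coords = [[] for _ in rows]
    for _ in range(2):
        v = leading_direction(rows)
        scores = [dot(row, v) for row in rows]
        for point, s in zip(coords, scores):
            point.append(s)
        # Deflate: remove the part of each row that lies along v.
        rows = [[x - s * vi for x, vi in zip(row, v)]
                for row, s in zip(rows, scores)]
    return coords

points = dict(zip(votes, embed_2d(list(votes.values()))))
for name, (x, y) in points.items():
    print(f"{name:>4}: {x:+.2f}, {y:+.2f}")
```

On this toy data the first coordinate puts ann and bob on one side and cara and dev on the other. A nonlinear method could arrange the same people quite differently, which is where the subtle issues would come in.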
So why do this?
Here are the actual results.
So for one, you can see groups of people and opinions. So you can now see “different camps” and “count opinions”.
This could be valuable and hard to achieve. In much social media, voting doesn’t take advantage of granular, individual preferences, so it’s hard to unravel deep opinions between groups, say on LW or the EA forum.
But counting camps is just the beginning, and probably isn’t the most valuable thing from this approach.
For one, under certain conditions, this granularity allows you to understand deep differences in opinions. This can improve communication and understanding in ways not available right now.
To see this, look at what actually happened on the page.
There are two groups. For Group A:
The broad overview is that Group A wants to see broader involvement of longtermism and its ideas, in a public process.
However, Group A does not support the Bill, though I guess they have sentiment for something in this direction.
There are also oddball questions that Group A reacted to. You should be skeptical about the signal-to-noise ratio of information from these questions, but the implication might be that Group A has positive views on the role/size/value of government.
For the other group, Group B:
Group B has different preferences and does not want to involve government, at least not in a broad way that has the government highly engaged or powerful in longtermism, or vice versa. One explanation *might* be that it distrusts government entry into or control over these issues (presumably because of dysfunction, misuse, capture, drift, and dilution).
However, much of Group B also has “different preferences about x-risk”. This suggests other interpretations about their views. There’s a bit of mischief here, but this isn’t that important.
Here is the full report: https://pol.is/report/r8ef5zucaxxvtzkiym69b
Also Charles, let’s have a call some time! https://calendly.com/nathanpmyoung/omni (anyone who reads this far is also welcome)
Again, the groups might not be real. It’s unclear what the groups represent. This might be a flaw that is endemic to this approach, or it could be fixed in some way.
There’s more:
Instead of naively embedding preferences and creating the groups, you can “take actions that encourage grouping along an existing structure or ontology”. If done correctly, the consequent content might be more useful and even more truthful.
I think this system, like most approaches, requires high quality interpretation. One implication is that this probably puts pressure on or cuts out a niche for “punditry”. This can be good or bad.
All voting in some sense flattens complex issues into a simple, literally one-dimensional axis. Because it builds on voting, Polis can’t completely escape this issue, but it tries to alleviate it by interrogating the space with more questions. Doing this runs into other issues, e.g. the curse of dimensionality. I think this is why question design and UX choices are deceptively important.
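One facet of the curse of dimensionality can be sketched in a few lines: as the number of questions grows, the distances between people who vote at random bunch up around the same value, so "near" and "far" carry less contrast. The setup below is an assumption for illustration only (random coin-flip votes, a hypothetical `relative_spread` helper, fraction-of-disagreement as the distance); real Polis data is structured and sparse, so this only shows the baseline effect.

```python
import random
from itertools import combinations
from statistics import mean, pstdev

def relative_spread(n_people, n_questions, seed=0):
    """Spread of pairwise distances relative to their mean, for random votes."""
    rng = random.Random(seed)
    votes = [[rng.choice((-1, 1)) for _ in range(n_questions)]
             for _ in range(n_people)]
    # Distance = fraction of questions on which two people disagree.
    dists = [sum(x != y for x, y in zip(a, b)) / n_questions
             for a, b in combinations(votes, 2)]
    return pstdev(dists) / mean(dists)

few = relative_spread(n_people=30, n_questions=5)
many = relative_spread(n_people=30, n_questions=500)
print(f"5 questions:   relative spread {few:.3f}")
print(f"500 questions: relative spread {many:.3f}")
```

With many questions the relative spread shrinks, so any structure has to come from real correlations between answers. That is one reason the choice and design of questions carries so much weight.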
Implementation involves other sensitivities/issues (e.g. flavors of panopticon a la Bentham, Foucault or something). It will be interesting to talk to some informed people about this.
I think they are really useful and would recommend the forum had some kind of Polis functionality.