What about the ability to cast a “pass vote”, i.e. to say a post is neutral? Then you could look at the percentage of upvotes out of all votes. That feels like it would be more accurate.
Or just look at the ratio of karma to views/reads. A high karma-to-view ratio suggests a good post with a boring title which deserves more visibility.
It looks like Hacker News uses the comment-to-score ratio for flame-war detection.
We were recently asked about the posts we found most valuable over the course of 2022. I wonder what a machine learning algorithm tasked with predicting “most valuable” status from a few simple features like karma-to-view ratio or upvote/downvote ratio would find. (Presumably, the majority of posts were not marked as “most valuable”, so you’d need a solution to the class imbalance problem. I suggest increasing the weight of posts marked as “most valuable” in the loss function, to reflect the fact that false negatives are costly. Also, you might want to Bayes-adjust your features, i.e. have a prior that needs to be overcome, to avoid over-updating on the first few data points which come in regarding a new post.)
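To make those two suggestions concrete, here is a rough Python sketch. It is only an illustration under my own assumptions: the function names, the Beta-style prior (`prior_mean`, `prior_strength`), and the `positive_weight` of 10 are all made up for the example and are not taken from any real site's code.

```python
import math


def smoothed_ratio(successes, trials, prior_mean=0.5, prior_strength=10.0):
    """Bayes-adjusted ratio: posterior mean under a Beta prior.

    The prior acts like `prior_strength` imaginary observations whose
    average is `prior_mean` (both values are assumptions), so a new post
    with only a handful of votes or views stays close to the prior.
    """
    return (successes + prior_mean * prior_strength) / (trials + prior_strength)


def weighted_log_loss(labels, probs, positive_weight=10.0):
    """Binary log loss in which "most valuable" posts (label 1) count more.

    `positive_weight` is an assumed value; larger values penalize false
    negatives on the rare positive class more heavily.
    """
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    weight_sum = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        w = positive_weight if y == 1 else 1.0
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
        weight_sum += w
    return total / weight_sum
```

With these defaults, a post with 3 upvotes out of 3 votes gets a smoothed ratio of 8/13 rather than 1.0, while a post with 300 out of 300 ends up close to 1. The same smoothing could be applied to a karma-to-view ratio, with a prior mean chosen to match the site-wide average.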