Jan_Kulveit comments on Request for comments: EA Projects evaluation platform

Jan_Kulveit Mar 21, 2019, 12:07 AM
1 point
0 ∶ 0
As I’ve already explained in the draft, I’m still very confused by what
“An individual reviewer and a board is much less likely to notice problems with a proposal than a broad discussion with many people contributing would …”
should imply for the proposal. Do you suggest that steps 1b. 1d. 1e. are useless or harmful, and having just the forum discussion is superior?
“The time of evaluators is definitely definitely definitely not free, and if you treat them as free then you end up exactly in the kind of situation that everyone is complaining about. Please respect those people’s time.”
Generally, I think this is quite a strange misrepresentation of how I value people’s time and attention. Also, I’m not sure whether you assume the time people spend arguing on fora is basically free, or does not count because it is unstructured.
“From my perspective, having this be in the open makes it a lot easier for me and other funders in the space to evaluate whether the process is going well, whether it is useful, or whether it is actively clogging up the EA funding and evaluation space. Doing this in distinct stages, and with most of the process being opaque, makes it much harder to figure out the costs of this, and the broader impact it has on the EA community, moving the expected value of this into the net-negative.”
Generally, almost all of the process is open, so I don’t see what should be changed. If the complaint is that the process has stages instead of unstructured discussion, and this makes it less understandable for you, I don’t see why.
- Habryka Mar 21, 2019, 12:44 AM
  20 points
  0 ∶ 0
  My overall sense of this is that I can imagine this process working out, but the first round of this should ideally just be run by you and some friends of yours, and should not require 100+ hours of volunteer time. My expectation is that after you spend 10 hours trying to actually follow this process, with just one or two projects, on your own or with some friends, you will find that large parts of it won’t work as you expected and that the process you designed is far too rigid to produce useful evaluations.
- Habryka Mar 21, 2019, 12:33 AM
  13 points
  0 ∶ 0
  “As I’ve already explained in the draft, I’m still very confused by what [...] should imply for the proposal. Do you suggest that steps 1b. 1d. 1e. are useless or harmful, and having just the forum discussion is superior?”
  I am suggesting that they are probably mostly superfluous, but more importantly, I am suggesting that a process that tries to separate the public discussion into a single stage, that is timeboxed at only a week, will prevent most of the value of public discussion, because there will be value from repeated back and forth at multiple stages in this process, and in particular value from integrating the step of finding a team for a project with the process of evaluating a proposal.
  To give you an example, I expect that someone will have an idea for a project that is somewhat complicated, and will write an application trying their best to explain it. I expect that for the majority of projects the evaluators will misunderstand what the project is about (something I repeatedly experienced for project proposals on the LTF-Fund), and will then spend 2-5 hours writing a negative evaluation for a project that nobody thought was a good idea. The original person who proposed the project will then comment during the public discussion stage and try to clarify their idea, but since this process currently assigns most of the time of the evaluators and board members to the evaluation stage, there won’t be any real way in which they can cause the evaluators to reevaluate the proposal, since the whole process is done in batches and the evaluators only have so many hours set aside (and they have already spent 2-5 hours on writing an evaluation of the proposal).
  On the other hand, if the evaluators are expected to instead participate mostly in a back-and-forth discussion over the course of a week, or maybe multiple weeks, then I think most likely the evaluators would comment with some initial negative impressions of the project which would probably be written in 5-10 minutes. The person writing the proposal would respond and clarify, and then the evaluator would ask multiple clarifying questions until they have a good sense of the proposal. Ideally, the person putting in the proposal would also be the person interested in working on it, and so this back-and-forth would also allow the evaluator to determine whether this person is a good fit for the project, and allow other people to volunteer their time to participate and help with the project. The thread itself would serve as the location for other people to find interesting projects to work on, and to get up to speed on who is working on what projects.
  ---
  I also think that assigning two evaluators to each project is a lot worse than assigning evaluators in general and allowing them to chime in when they have pre-existing models for projects. I expect that if they don’t have pre-existing models in the domain that a project is in, an evaluator will find it almost impossible to write anything useful about that project, without spending many hours building basic expertise in that domain. This again suggests a setup where you have an open pool of proposals, and a group of evaluators who freely choose which projects to comment on, instead of being assigned individual projects.
  - Jan_Kulveit Mar 21, 2019, 1:12 AM
    4 points
    0 ∶ 0
    I don’t understand why you assume the proposal is intended as something very rigid, where e.g. if we find the proposed project is hard to understand, nobody would ask for clarification, or why you assume the 2-5h is some dogma. The back-and-forth exchange could also add up to 2-5h.
    With assigning two evaluators to each project you are just assuming the evaluators would have no say in what to work on, which is nowhere in the proposal.
    Sorry, but can you for a moment also imagine some good interpretation of the proposed schema, instead of just weak-manning every other paragraph?
    - Habryka Mar 21, 2019, 1:39 AM
      18 points
      0 ∶ 0
      I am sorry for appearing to be weak-manning you. I think you are trying to solve a bunch of important problems that I also think are really important to work on, which is probably why I care so much about solving them properly and have so many detailed opinions about how to solve them. While I do think we have strong differences in opinion on this specific proposal, we probably both agree on a really large fraction of important issues in this domain, and I don’t want to discourage you from working in this domain, even if I do think this specific proposal is a bad idea.
      Back to the object level: I think as I understand the process, the stages have to necessarily be very rigid because they require the coordination of 5+ volunteers, a board, and a set of researchers in the community, each of which will have a narrow set of responsibilities like writing a single evaluation or having meetings that need to happen at a specific point in time.
      I think coordinating that number of people naturally gives rise to very rigid structures (I think even coordinating a group of 5 full-time staff is hard, and the amount of structure goes up drastically as individuals can spend less time), and your post explicitly says that step 1c is the step in which you expect back and forth with the person who proposed the project, making me think that you do not expect back and forth before that stage. And if you do expect back-and-forth before that stage, then I think it’s important that you figure out a way to make that as easy as possible, and given the difficulty of coordinating large numbers of people, I think if you don’t explicitly plan for making it easy, it won’t happen and won’t be easy.
      - Jan_Kulveit Mar 21, 2019, 2:07 AM
        3 points
        0 ∶ 0
        I don’t see why continuous coordination of a team of about 6 people on Slack would be very rigid, or why people would have very narrow responsibilities.
        For the panel, having some defined meeting and evaluating several projects at once seems time and energy conserving, especially when compared to the same set of people watching the forum often, being manipulated by karma, being in a way forced to reply to many bad comments, etc.
- Habryka Mar 21, 2019, 12:38 AM
  12 points
  0 ∶ 0
  “Generally, almost all of the process is open, so I don’t see what should be changed. If the complaint is that the process has stages instead of unstructured discussion, and this makes it less understandable for you, I don’t see why.”
  One part of the process that is not open is the way the evaluators are writing their evaluations, which, as I understand it, is where the majority of person-time is being spent. It also seems that all the evaluations are going to be published in one big batch, so that feedback on the evaluation process could not be acted on until the next complete grant round, which is presumably multiple months into the future.
  The other part of the process that is not open is these two stages:
  “1d. A panel will rate the proposal, utilizing the information gathered in phases b. and c., highlighting which part of the analysis they consider particularly important. (90m / project)
  1e. In case of disagreement among the panel, the question will get escalated and discussed with some of the more senior people in the field.”
  I expect the time of the panel, as well as the time of the more senior people in the field are the most valuable resources that could be wasted by this process, and the current process gives very little insight into whether that time is well-spent or not. In a simple public forum setup, it would be easy to see whether the overall process is working, and whether the contributions of top people are making a significant difference.
  - Jan_Kulveit Mar 21, 2019, 12:50 AM
    2 points
    0 ∶ 0
    With the first part, I’m not sure what you would imagine as the alternative: having access to the evaluators’ Google Drive so you can count how much time they spent writing? The time estimate is something like an estimate of how long it can take for volunteer evaluators. If all you need is on the order of 5m, you are either really fast or not explaining your decisions.
    I expect much more expert time will be wasted in the forum discussions you propose.
    - Habryka Mar 21, 2019, 1:27 AM
      8 points
      0 ∶ 0
      I think in a forum discussion, it’s relatively easy to see how much someone is participating in the discussion, and to get a sense of how much time they spent on stuff. I am not super confident that less time would be wasted in the forum discussions I am proposing, but I am confident that I and others would notice if lots of people’s time was wasted, which is something I am not at all confident about for your proposal and which strongly limits the downside for the forum case.
      - Jan_Kulveit Mar 21, 2019, 1:56 AM
        3 points
        0 ∶ 0
        On the contrary: on Slack, it is relatively easy to see the upper bound of attention spent. On the forum, you should look not just at the time spent writing comments, but also at the time and attention of people not posting. I would be quite interested in how much time, for example, CEA+FHI+GPI employees spend reading the forum, in aggregate (I guess you can technically count this).
        - Habryka Mar 21, 2019, 2:04 AM
          4 points
          0 ∶ 0
          *nods* I do agree that you, as the person organizing the project, will have some sense of how much time has been spent, but I think it won’t be super easy for you to communicate that knowledge, and it won’t by default help other people get better at estimating the time spent on things like this. It also requires everyone watching to trust you to accurately report those numbers, which I do think I do, but I don’t think everyone necessarily has reason to.
          I do think on Slack you also have to take into account the time of all the people not posting, and while I do think that there will be more time spent just reading and not writing on the EA Forum, I generally think the time spent reading is usually worth it for people individually (and importantly people are under no commitment to read things on the EA Forum, whereas the volunteers involved here would have a commitment to their role, making it more likely that it will turn out to be net-negative for them, though I recognize that there are some caveats where sometimes there are controversial topics that cause a lot of people to pay attention to make sure that nothing explodes).
- Habryka Mar 21, 2019, 12:41 AM
  1 point
  0 ∶ 0
  Nevermind.