I might write a longer summary at some point, but some brief thoughts on how this week went:
Overall, I think the first ~3 days were a good use of my time, but I’m not sure about the last one or two. From the perspective of “understanding what it’s like to be a junior safety researcher” I feel like I got most of the learning within three days.
I managed to come up with a solution which handled the breakers of the regularizers listed in the original document, though it was subject to a very analogous breaker. I don’t feel like humanity is noticeably closer to solving the alignment problem by virtue of my solution, but before the week started I think I would’ve estimated a ~1/3 chance that I would make even this small amount of progress. (Mostly calibrating off Ajeya saying that it would have taken her a full week in expectation, and assuming I’m substantially less qualified than her.) So overall I feel relatively happy with my work.
I feel more optimistic about humanity’s ability to solve the alignment problem now. Partially this is a reflection of me having recently been reading Eliezer’s debates, where he presents a very pessimistic view of our chance of success.
This contest seems like a really great opportunity for people to get involved in alignment research, and I’m very grateful to the ARC team for running it.
There are not very many people who could run a contest like this, and I assume that in a couple of weeks the ARC team is (justifiably) going to go back to doing their research. I feel sad that there aren’t more people who could run a contest like this, and I’m not sure how to create more of them. If others have thoughts on how I/CEA could do this, I would be very interested in hearing them!