My path so far as an independent AI safety research engineer
(Also posted at https://forum.effectivealtruism.org/posts/GG3qJ2koFqitkvE3b/my-ai-safety-independent-research-engineer-path-so-far )
This post highlights the mistakes and successes that I believe mattered most on my journey. It is meant as a guide to my past self, and perhaps to others in similar situations, to help them avoid the mistakes I made along the way.
I became interested in AI safety in 2018, when I conducted a career review using the 80,000 Hours guide. I doubted my suitability for the AI safety path at the time, but by 2022 I was more certain and decided to transition from software developer to AI safety research engineer. This was a difficult decision because I was, and still am, uncertain about many of the common claims made about AI and its development, especially about how much I can contribute and to what extent AI safety is a solvable, neglected problem. It is not easy to look back and remember everything, but in this post I have tried to compile a list of a few important mistakes and successes I experienced.
Mistake 1, 2022: Underestimating the complexity of the transition and being too theoretical. At the start of 2022, I had several years of senior software development experience, and I planned about six months for the transition, aiming to secure a research engineer role at an AI lab that year. Reality turned out differently. Although I received support from Open Philanthropy and successfully completed an ML safety bootcamp that came with a stipend, this was not a stable income, nor did it give me a clear understanding of my fit. Moreover, the geopolitical situation quickly became dire: in February 2022 the Russia-Ukraine war began, followed by mobilization, and in the summer of 2022 I moved to another country while trying to study the new field intensively. The biggest drawback was that my English skills were insufficient for theoretical study. I learned that to read scientific literature, write notes, publish, and discuss, you need to skim effectively and read at 200-300 words per minute, simply because of the volume of information. I also spent too much time on the theoretical side of AI safety (reading the AGISF curriculum and taking notes), even though I was clearly better suited for empirical engineering roles.
Mistake 2, 2023: Over-investment in a risky long-term project. I invested 453 hours over roughly 10-12 months in a risky project during AI Safety Camp in 2023. In hindsight, we should have stopped after the presentation at the end of the camp and published what we had at that point, because the additional months of investment did not bring better results. After several more months we did write a paper that was accepted at a NeurIPS workshop, but I still don't think continuing was a good decision. I learned that it is better to publish many reasonably good research reports or experiments, each answering a specific question, than to attempt one long, comprehensive research project targeting multiple questions. It is essential to maintain good velocity, measured as the number of research questions answered per unit of time, and to try many things early on (see OODA loops and the lean startup). The aim is to increase your visibility by publishing more and to learn from your mistakes more quickly.
Successes in 2023 and 2024: I consider it a success to have been accepted into the UK AISI bounty program with two of my projects, one of which received the bounty while the other was postponed. Writing those two applications took little effort (only a few hours each), and implementing the evaluation for one of them took about three weeks of work, yet the outcome was more significant than my participation in the MATS program or the ARENA program, which required much larger time investments. This suggests that in an environment where a few receive almost all the rewards (extremistan, as Nassim Taleb puts it), it is better to move quickly and learn from your mistakes. The ML field is highly competitive now, as seen in these claims, which I believe still hold true. Another fortunate choice was specializing in chain-of-thought encoded reasoning/communication. The field moves exceedingly fast, and your research might become outdated within weeks because some research group at Google or OpenAI has run similar or related experiments. In that situation, an independent researcher or a small group of researchers is better off targeting small, feasible research directions; such research questions can often be found in the discussion or future-work sections of recent blog posts and papers.
In short, my advice to my past self is this:
- Find a structured program (MATS, ARENA, MLSS) and work hard to be one of the best there.
- Start with small, feasible projects (1-2 months, not 6+ months) approved by senior researchers or the community.
- Find your specialization.
- Balance theory and practice 60/40, not 90/10.