Writing an annual review article has become a tradition for me. After writing one for 6 consecutive years (link), I decided to continue this year and share my journey in the tech industry, both as a software engineer at Datadog and through the side projects I explore in my free time. Hopefully, it will help you learn something new or inspire you to create your own story. Now, let’s get started!
I joined Datadog at the end of 2019 as a software engineer working for the Event Platform. This year, I was part of the automation team, which improves the platform’s reliability, reduces toil, and facilitates engineers’ daily work. My contributions are around automation development and can be summarized in 4 parts:
- Operation workflows. I developed operation-related workflows that create or delete complex data pipelines or data stores. Before we had these workflows, we had to create or delete them manually, and the process was complex and time-consuming. With workflows, we can perform these actions more efficiently: the configuration, deployment, registration, notification, and other tasks are automated. They advance the platform’s goals, such as data-store migration, incident remediation, and data isolation improvement. The operation workflows are developed using the workflow engine Temporal. You can see the case study Temporal at Datadog by Kevin Devroede or the YouTube video Temporal at Datadog by Jacob LeGrone to learn more.
- Automation safety. To ensure the automation can be performed safely without any customer impact, I also developed a verification process, which collects data in different dimensions and makes decisions based on business requirements. Thanks to this framework, we were able to see the state of the whole system in one place and perform operations with great confidence. It allowed the storage team to migrate all data stores from the old system to the new one. The framework is also extensible: it’s easy to add additional deciders to make additional decisions.
- Automation utilities. Writing workflows can be very complex. We need different utilities to facilitate the development. This year, I developed a RESTful API client in Go, some frontend integration for workflows, a custom release mechanism, an analytics helper, some workflow templates, a Go package for config manipulation, and some other tools to simplify the development for other engineers. For example, a junior engineer was able to create a new workflow in less than 2 days during the hackathon, which was pretty cool!
- Other work. I also contributed in other areas: cross-team collaboration, interviewing, being an interrupt handler (IH), being on call, participating in RFC and code reviews, documentation, presentations, and more.
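To make the "automation safety" item above more concrete, here is a minimal sketch of an extensible verification step. It is not the framework used at Datadog: the snapshot keys (`healthy_replicas`, `ingestion_lag_seconds`), the thresholds, and every function name are hypothetical, chosen only to illustrate the idea of collecting data in several dimensions and letting pluggable deciders judge it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical snapshot of the system state: one value per dimension.
Snapshot = Dict[str, float]


@dataclass(frozen=True)
class Decision:
    decider: str  # name of the decider that produced this decision
    safe: bool    # True if the operation may proceed for this dimension
    reason: str   # human-readable explanation


# A decider inspects the snapshot and returns exactly one decision.
Decider = Callable[[Snapshot], Decision]


def replica_decider(snapshot: Snapshot) -> Decision:
    # Assumed business rule: at least 2 healthy replicas are required.
    healthy = snapshot.get("healthy_replicas", 0)
    return Decision("replicas", healthy >= 2, f"{healthy:g} healthy replicas (need >= 2)")


def lag_decider(snapshot: Snapshot) -> Decision:
    # Assumed business rule: ingestion lag must stay at or below 60 seconds.
    # Missing data is treated as unsafe by defaulting to infinity.
    lag = snapshot.get("ingestion_lag_seconds", float("inf"))
    return Decision("lag", lag <= 60, f"ingestion lag is {lag:g}s (limit 60s)")


def verify(snapshot: Snapshot, deciders: List[Decider]) -> List[Decision]:
    # Run every decider so that the whole state is visible in one place.
    return [decide(snapshot) for decide in deciders]


def is_safe(decisions: List[Decision]) -> bool:
    # The operation proceeds only if every decider agrees.
    return all(d.safe for d in decisions)
```

In this sketch, adding a new check means writing one more function and appending it to the list passed to `verify`, which is the kind of extensibility the framework aims for.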
Compared to last year, the main difference in my contributions is that I was more involved in system understanding, system design, planning, and team collaboration. These go beyond coding: I needed to understand what people need, how our team can help, how to design an extensible framework, and more. It was great! For the coming year, I want to continue supporting the growth of the platform and other storage systems by bringing more solutions related to SRE and automation.
This year I put less energy into blogging and prioritized my work at Datadog. The blog was mainly about system design (5 posts). If I had to pick 3 articles to share with you, I would pick this mini-series on SDK design, composed of the GitLab, Temporal, and Elasticsearch API clients. They will help you understand how to write an SDK and the different aspects you should care about when doing so:
- Internal Working of the GitLab API Go Client
- Internal Working of the Temporal Java Service Client
- Internal Working of the Elasticsearch Java High-Level REST Client
To improve the user experience, I developed several new features for my blog: rebranding the home page, adding the search capability, and introducing the post id. This wasn’t easy because each feature required some work. For the home page, I needed to get some inspiration from other websites, modify the Jekyll template, and adjust some CSS. For the search feature, I needed to set up a search service with Java and Elasticsearch, and I also needed to integrate it with Datadog. But I loved this side project as it is useful for the community and helped me gain more experience in development and operations.
What’s next? Last year I was too ambitious so this year I want to keep it simple 😀 I want to keep a stable delivery cadence and write 17 posts next year (1 post every 3 weeks). The content should focus on one or two main topics, e.g. microservices and system design. Also, I want to share these posts on Medium so that more people can subscribe easily.
Finance toolkit is a small library helping you to understand your personal financial situation by extracting, transforming, and aggregating transactions from different companies into a single place. The companies supported are: BNP Paribas, Boursorama, Revolut, and some others. It generates CSV files that can be used for data visualization. It was created in 2019 and written in Python by Jingwen Zheng and me, with some help from Mickaël Schoentgen. This year, we improved the Revolut integration, added support for multiple currencies (EUR, USD), brought some technical improvements (e.g. logging, testing, error handling), and open-sourced the project on GitHub. This tool helped us increase our savings by 7.3 times over the last 4 years.
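The extract, transform, and aggregate steps described above can be sketched as follows. This is not the Finance Toolkit's actual code: the two export formats, the column names, and all function names are assumptions made up for illustration, and real bank exports differ.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date, datetime
from decimal import Decimal
from typing import List


@dataclass(frozen=True)
class Transaction:
    day: date
    account: str
    label: str
    amount: Decimal
    currency: str


def parse_bank_a(text: str, account: str) -> List[Transaction]:
    # Hypothetical export: semicolon-separated, DD/MM/YYYY dates,
    # comma as decimal separator, amounts always in EUR.
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return [
        Transaction(
            day=datetime.strptime(row["date"], "%d/%m/%Y").date(),
            account=account,
            label=row["label"].strip(),
            amount=Decimal(row["amount"].replace(",", ".")),
            currency="EUR",
        )
        for row in reader
    ]


def parse_bank_b(text: str, account: str) -> List[Transaction]:
    # Hypothetical export: comma-separated, ISO dates, explicit currency column.
    reader = csv.DictReader(io.StringIO(text))
    return [
        Transaction(
            day=date.fromisoformat(row["date"]),
            account=account,
            label=row["label"].strip(),
            amount=Decimal(row["amount"]),
            currency=row["currency"],
        )
        for row in reader
    ]


def aggregate(*batches: List[Transaction]) -> List[Transaction]:
    # Merge all sources into a single list, ordered by date.
    merged = [t for batch in batches for t in batch]
    return sorted(merged, key=lambda t: (t.day, t.account, t.label))


def to_csv(transactions: List[Transaction]) -> str:
    # Write one unified CSV that a visualization tool can consume.
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["date", "account", "label", "amount", "currency"])
    for t in transactions:
        writer.writerow([t.day.isoformat(), t.account, t.label, str(t.amount), t.currency])
    return out.getvalue()
```

Using `Decimal` rather than `float` avoids rounding surprises when summing amounts, and keeping the currency on each row leaves any conversion between EUR and USD to a later, explicit step.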
Thank you for reading this article. In it, I shared some projects that I did at Datadog as a software engineer, my blogging experience, and the finance toolkit that I open-sourced this year. Interested in knowing more? You can subscribe to the feed of my blog, or follow me on Twitter or GitHub. I hope you enjoyed this article. See you next time!