From October to December 2024 I was looking for new job opportunities. To make the most of my time during this gap, I embarked on a little project with the aim of deepening my Rust experience by implementing DDD and Microservices, together with a simple web UI in Rust via WASM. In this project I also explored the use of LLMs in my coding for the first time. I wrote daily updates and reflections, which can be found in the Project Repo. Below is a collection of my LinkedIn posts that summarize, in compact form, the progress I made and the main takeaways.
The domain I used as an example was a simple volleyball referee management tool, as I am very familiar with this domain: I was a volleyball referee for many years, and I also implemented a (much more complex) tool in Java (likewise following DDD) for the Vorarlberg Volleyball Association, which I used in production during my tenure as volleyball referee manager of Vorarlberg.
I eventually settled on using Cursor and was extremely impressed by the way it supported me and by its INSANE context awareness 🔥 In particular, I was impressed by its ability to:
✅ generate boilerplate/bootstrapping code for backend REST handlers and for Leptos UI components with server requests.
✅ help with project-wide refactoring and writing tests. It sometimes felt like refactoring changes and test cases were writing themselves - all I had to do was type out some comments and a description in the README, and Cursor seemed able to derive all its suggestions from that. Most worked out of the box.
✅ help with simple refactorings, such as renaming and changing types, with a better UX than the VSCode refactoring tooling.
✅ help with writing documentation, by giving great suggestions based on the context and the first few words I wrote.
I have the impression that LLMs easily boost the productivity of Software Engineers 2-10x. However, to make the most of them you need a VERY clear understanding of what you want to do and of the concepts behind it, otherwise you end up in a chaotic mess.
Therefore, I am not very concerned that LLMs/AI are going to replace Software Engineers - after all, what they do for you is come up with suggestions based on the thoughts you express via commands in chats and comments in code 🤖
This should not come as a big surprise - after all, we must not forget that Software Engineering is not a production process but a learning process (see https://lnkd.in/enJSDdj). It is not about typing but about thinking. So LLMs are our friends, boosting our productivity: they take away most of the tedious typing and give us more space and capacity to think 🧠
After 3 weeks of moderate work, I have finished my tech prototype of monolithic DDD in Rust with a WASM web UI, built using LLMs through Cursor. Here are my main takeaways:
✅ When it comes to implementing DDD in Rust, the main challenge is how to implement the persistence layer. This is not very surprising, given that Rust doesn't have an amazing technology like Hibernate (yet), so a lot of thought had to go into how to abstract Repositories and implement transactional boundaries while still keeping application services/functionality testable via unit tests using mocks (see the sketch right after these takeaways).
🔥 Using WASM to implement a web UI is very convenient, as it allows implementing everything from backend to frontend in the same language, without the need for something like OpenAPI. I am using Leptos and found it to be pretty straightforward, especially if you are familiar with React.
✅ Cursor/the use of LLMs was a tremendous help and surpassed all my expectations. Although I started out by asking it to generate code up-front, I eventually settled on using it only for its suggestions, giving it some contextual information either by starting to type something or by opening files. However, it also became very clear that without a clear plan you end up nowhere - you have to know where you want to go and do the thinking; LLMs only support you on the way.
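To make the first takeaway concrete, here is a minimal sketch of the repository abstraction idea - not the project's actual code. The domain type and trait names (`Referee`, `RefereeRepository`) are invented for illustration, and a real implementation would likely be async and backed by Postgres:

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Hypothetical domain type for illustration.
#[derive(Clone, Debug, PartialEq)]
pub struct Referee {
    pub id: u64,
    pub name: String,
}

// Abstracting persistence behind a trait keeps application services
// testable: production code gets a Postgres-backed implementation,
// unit tests get an in-memory mock.
pub trait RefereeRepository {
    fn find_by_id(&self, id: u64) -> Option<Referee>;
    fn save(&self, referee: Referee);
}

// In-memory mock used in unit tests.
pub struct InMemoryRefereeRepository {
    store: Mutex<HashMap<u64, Referee>>,
}

impl InMemoryRefereeRepository {
    pub fn new() -> Self {
        Self { store: Mutex::new(HashMap::new()) }
    }
}

impl RefereeRepository for InMemoryRefereeRepository {
    fn find_by_id(&self, id: u64) -> Option<Referee> {
        self.store.lock().unwrap().get(&id).cloned()
    }

    fn save(&self, referee: Referee) {
        self.store.lock().unwrap().insert(referee.id, referee);
    }
}

// An application service depends only on the trait, not on the DB.
pub fn rename_referee(
    repo: &dyn RefereeRepository,
    id: u64,
    new_name: String,
) -> Result<(), String> {
    let mut referee = repo.find_by_id(id).ok_or("referee not found")?;
    referee.name = new_name;
    repo.save(referee);
    Ok(())
}
```

A unit test can then exercise `rename_referee` against `InMemoryRefereeRepository` without touching a database, which is exactly what the mocking approach buys you.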
I then sliced ✂️✂️✂️ the Monolith of my DDD Rust project into Microservices. Here are my main takeaways:
✅ I decided to slice every Aggregate and its corresponding DB table into a separate Microservice, which resulted in 6 services.
✅ Each Microservice has its own DB; I settled on Postgres for all of them.
✅ Because each service now has its own DB, there is no way to use JOINs, so the services need to resolve (foreign) IDs by requesting data from other services. I decided to use Redis to cache these requests to avoid excessive round-trips and improve performance (a small sketch follows after this list).
✅ I am using Kafka to broadcast Domain Events, to which all Microservices listen via consumer groups. To really put an emphasis on scaling out, each Microservice has 2 instances running and the Kafka topic has 2 partitions, so the two instances in the same consumer group take turns processing the next message.
✅ I added Jaeger for distributed tracing and observability, which makes it very easy to follow what happens when, and how long it takes.
✅ For load balancing between the 2 instances of each Microservice, and as an "API Gateway", I used Nginx.
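As a rough illustration of the ID-resolution point, here is a minimal sketch of resolving a foreign ID through another service with Redis as a read-through cache. It assumes the synchronous `redis` and `reqwest` (blocking feature) crates; the service URL, cache key scheme, and TTL are made up:

```rust
use redis::Commands;

// Resolve a referee's name owned by another service, caching the
// answer in Redis to avoid a cross-service round-trip on every call.
fn resolve_referee_name(
    redis: &mut redis::Connection,
    referee_id: &str,
) -> Result<String, Box<dyn std::error::Error>> {
    let cache_key = format!("referee:{referee_id}:name");

    // 1. Try the cache first.
    if let Some(cached) = redis.get::<_, Option<String>>(&cache_key)? {
        return Ok(cached);
    }

    // 2. Cache miss: ask the owning service over REST (URL invented).
    let url = format!("http://referee-service/referees/{referee_id}/name");
    let name = reqwest::blocking::get(&url)?.text()?;

    // 3. Cache the result with a TTL so stale data eventually expires.
    let _: () = redis.set_ex(&cache_key, &name, 60)?;

    Ok(name)
}
```

The TTL is the knob that trades freshness against round-trips; invalidation could also be driven by the Domain Events the services already broadcast.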
I am very happy with Rust's abilities for implementing Microservices: all the libraries I am using are of very high quality, have good examples, and worked out of the box 🔥 Docker plays a fundamental role in implementing such a Microservices project locally, as it makes spinning up 6 Postgres DBs and all the other infrastructure extremely easy. I continued to use AI via the Cursor IDE and, as in the weeks before, it was a tremendous help 🤖
The current solution, however, has a few subtle drawbacks:
👉 There is the potential of processing the same Domain Event multiple times, so I need some form of dedup mechanism (see https://lnkd.in/dmFC28qc).
👉 There is an edge case where committing the DB Tx goes through but the Kafka commit fails for whatever reason - in this case we would lose the Domain Event, potentially leading to inconsistencies.
👉 My current Saga implementation for committing Referee Assignments is not robust and results in inconsistencies in case of infrastructure failures.
I am currently reading up on Microservices to expand my knowledge of how to address the above drawbacks - in particular, I am looking into the following books 📚
👉 "Building Microservices, 2nd Edition" by Sam Newman. More high-level and conceptual.
👉 "Microservices Patterns" by Chris Richardson. Very technical, goes into the dirty implementation details.
I feel that reading is a good approach to learning and deepening (new) concepts: start out coding by yourself to build some initial experience, and when you feel you have exhausted how far your current knowledge can take you, complement it with well-written books in the field 🧠
After one week of interviews and reading up 📖 on the topic of resiliency in Microservices, it has now become clearer 💡 how to address the shortcomings of my Saga implementation in my Rust DDD Microservices project:
👉 Uwe Friedrichsen's blog post "The Limits of the Saga Pattern" (https://lnkd.in/dXF37WEi) states it clearly: "The Saga pattern can only be used to logically roll back transactions due to business errors. The Saga pattern cannot be used to respond to technical errors. This leaves the question: How can we deal with technical errors? In general, the only way is to strive for eventual completion. If you face a technical error, you need to retry in some (non-naive) way until you eventually overcome the error and your activity succeeds. This means retrying in combination with waiting, escalation strategies, etc." (A minimal retry sketch follows after the next point.)
👉 The key to how to "strive for eventual completion" while still retaining the scaling benefits of Microservices is found in Pat Helland's fascinating, but arguably rather abstract, paper "Life beyond Distributed Transactions" (see https://lnkd.in/d4EUmiwj) - it is to "Assume a transactional boundary of a single item".
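To make the "retrying in some (non-naive) way" from the quote above concrete, here is a minimal, generic sketch of retrying with exponential backoff. It is only an illustration of the idea: a real service would add jitter, a delay cap, and escalation, and would use its async runtime's timers instead of a blocking sleep:

```rust
use std::thread::sleep;
use std::time::Duration;

// Keep retrying a fallible side effect with exponential backoff,
// striving for eventual completion instead of rolling back.
fn retry_with_backoff<T, E>(
    mut attempt: impl FnMut() -> Result<T, E>,
    max_retries: u32,
) -> Result<T, E> {
    let mut delay = Duration::from_millis(100);
    let mut retries = 0;
    loop {
        match attempt() {
            Ok(value) => return Ok(value),
            // Out of retries: surface the error (escalation point).
            Err(err) if retries >= max_retries => return Err(err),
            Err(_) => {
                // Wait before the next attempt; double the delay each time.
                sleep(delay);
                delay = delay.saturating_mul(2);
                retries += 1;
            }
        }
    }
}
```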
Essentially, Helland's paper is saying that we need to reconcile the side-effect contexts of the DB and the Domain Event - that is, committing the DB Tx and finalizing the consuming or producing of Domain Events in Kafka - which is done by combining them into the DB Tx context 🧠
I have started to write a longer reflection on this, which is still WIP (see https://lnkd.in/dXy9H3pa), but the main idea is the following 💡
✅ Assign a UUID to each emitted Domain Event.
✅ When emitting a Domain Event, rather than sending it directly via Kafka, store it in a new outbox table and commit the INSERT in the same DB Tx that makes the changes to the Aggregate (see the code sketch below).
✅ A new async processor goes through the unsent Domain Events in the outbox table and emits them via Kafka. Due to the different side-effect contexts, we need to mark the Domain Event as sent in a separate DB Tx; if committing this DB Tx fails after the Kafka offset has been committed, the only thing we can do is retry. Therefore we end up with duplicate sends, resulting in "at-least-once" Domain Event semantics.
✅ On the receiving end, introduce an inbox table into which Domain Events consumed from Kafka are INSERTed in the same DB Tx that makes the changes to the Aggregate. In this case, when committing the Kafka offset fails after the DB Tx has committed successfully, the only thing to do is retry the event consumption. However, we can then detect via the inbox table that we have already processed the Domain Event, and therefore skip re-processing it, only committing the Kafka offset.
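Here is the sketch mentioned above - a minimal illustration of the producing side of the outbox idea, assuming sqlx 0.7 (with the `postgres`, `uuid`, and `json` features); the table, column, and function names are invented and the real project's schema will differ:

```rust
use sqlx::PgPool;
use uuid::Uuid;

// Assumed outbox schema, for illustration:
//   CREATE TABLE outbox (
//       event_id   UUID PRIMARY KEY,
//       event_type TEXT NOT NULL,
//       payload    JSONB NOT NULL,
//       sent       BOOLEAN NOT NULL DEFAULT FALSE
//   );
async fn assign_referee(
    pool: &PgPool,
    assignment_id: Uuid,
    referee_id: Uuid,
) -> Result<(), sqlx::Error> {
    let mut tx = pool.begin().await?;

    // 1. Change the Aggregate.
    sqlx::query("UPDATE assignments SET referee_id = $1 WHERE id = $2")
        .bind(referee_id)
        .bind(assignment_id)
        .execute(&mut *tx)
        .await?;

    // 2. Store the Domain Event in the outbox table - in the SAME Tx,
    //    so the state change and the event commit or roll back together.
    sqlx::query(
        "INSERT INTO outbox (event_id, event_type, payload) VALUES ($1, $2, $3)",
    )
    .bind(Uuid::new_v4())
    .bind("RefereeAssigned")
    .bind(serde_json::json!({
        "assignment_id": assignment_id,
        "referee_id": referee_id,
    }))
    .execute(&mut *tx)
    .await?;

    // A single commit covers both writes; an async relay (later replaced
    // by Debezium - see the posts below) ships unsent rows to Kafka.
    tx.commit().await?;
    Ok(())
}
```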
My plan is to start the implementation within the next few days 💪 and to continue my reading on Microservices, focusing on the more technical book "Microservices Patterns" by Chris Richardson, as I expect a lot of detail on the above issues there 🔥
After landing and preparing for a new job (which I'm officially going to start on 1st January - stay tuned 🔥) over the last weeks, I finally found time to wrap up my Microservices Rust project (see https://lnkd.in/dzAmiMdq) by refactoring it to use Debezium 💪
👉 In my last post (see https://lnkd.in/dTQcx_Zm) I briefly explained why and how I solved the problem of different transactional contexts across the database and Kafka, which becomes relevant when you are emitting Domain Events from your services. The approach I followed is the so-called "Transactional Outbox Pattern"; see https://lnkd.in/dQ622Jdf for a high-level explanation from Chris Richardson's excellent "Microservices Patterns" book 🧠
👉 The issue with my approach was that it was a "hand-made" implementation that relied on DB triggers and the notification system of Postgres. It worked, but I kept wondering whether something better exists that comes out of the box. I told myself: "Surely, I can't be the first one to solve this problem" ❓
👉 Indeed, rather accidentally while preparing for my new job, I ran into Debezium, a piece of infrastructure that allows implementing the Outbox Pattern without resorting to DB triggers/polling/notifications, taking away a lot of complexity in the code and shifting it to the infrastructure side. See this blog entry https://lnkd.in/dXPtAbk9 for an excellent technical explanation of the outbox pattern and how to implement it with Debezium 💡
The most complicated part turned out to be setting up and configuring Debezium, getting it to talk to Kafka, and creating the corresponding Kafka topics (a sketch of such a connector configuration follows below) - refactoring the code was straightforward thanks to Rust's type system and my E2E test coverage 💪
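For a flavor of what that configuration looks like, here is a hedged sketch of a connector registration for Kafka Connect, assuming Debezium 2.x with the Postgres connector and its outbox EventRouter SMT. All hostnames, credentials, and table names are placeholders, and the EventRouter by default expects the outbox table to follow its own column layout (id, aggregatetype, aggregateid, type, payload), not necessarily the one sketched earlier:

```json
{
  "name": "referee-outbox-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "plugin.name": "pgoutput",
    "database.hostname": "postgres",
    "database.port": "5432",
    "database.user": "postgres",
    "database.password": "postgres",
    "database.dbname": "referees",
    "topic.prefix": "referee-service",
    "table.include.list": "public.outbox",
    "transforms": "outbox",
    "transforms.outbox.type": "io.debezium.transforms.outbox.EventRouter"
  }
}
```

Such a JSON document gets POSTed to Kafka Connect's /connectors endpoint; Debezium then tails the Postgres WAL and routes each committed outbox row to a Kafka topic, so no triggers or notification polling are needed in the application code.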
👉 Concluding, I would say that the usage of a solution like Debezium should always be a priority when implementing Microservices, because it is a very mature and proven technology. However, one must be aware that although the code complexity is reduced, the infrastructure/deployment complexity goes up a lot and must not be underestimated.