Around IT in 256 seconds

#16: Akka: distributed actor-based toolkit for the JVM

September 23, 2020 | 3 Minute Read

Akka is a toolkit for building highly scalable, concurrent applications. It’s written in Scala and based on the ideas from Erlang. Its approach to achieve concurrency is quite radical. Rather than mutexes, semaphores and shared memory, Akka uses so-called actor model. An actor is a small, stateful object that doesn’t expose traditional methods. Instead, actors send and receive asynchronous messages with each other. There is no other way to interact with an actor. If you want an actor to do something or give you some information, message passing is the only way. Send a message, actor will receive it at some point in time, consume it and optionally send a response back.

There are some fundamental rules in Akka. First of all, no actor, ever, handles more than one message at a time. Actors are essentially single-threaded. If two messages arrive at the same time, they are put into a queue, known as mailbox. Once the first message is handled, the second one is delivered. Secondly, the deployment location of an actor is transparent. An actor can live in the same JVM but just as well it can be deployed in a different data center. As long as you understand asynchronous nature and at-most-once guarantees, it makes no difference. Akka either passes a message in-process or serializes it over the network. This feature allows scaling out Akka applications without changing a single line of code.

I said that actors are single-threaded. As a matter of fact you could implement Akka by creating a new thread for each actor. But that defeats the purpose of this project. Akka encourages non-blocking actors. This means that handling a message should not block on I/O. For example when you make an HTTP request, you don’t wait for a response. Once it’s available, your actor will get notified. By an incoming message, of course. Exploiting this feature means you can manage hundreds of thousands of actors with just a few threads. Ideally, matching the number of CPU cores. An actor per request or per user is not unusual. Memory footprint of an actor is measured in kilobytes. A thread is more like a megabyte.

If you must block inside an actor or your workload is CPU bound, Akka is there for you. A special load-balancer dispatches work in between multiple instances of your actor class. And guess what? These instances can be scaled out onto multiple machines!

Another fundamental concept of Akka is fault tolerance. “Have you tried turning it off and on again?” Akka is all about that! When an actor crashes with an exception, by default it’s restarted. Chances are that a fresh instance will recover. But it gets even better! Actors form a hierarchy. Optionally, when an actor crashes, its parent and all siblings can be restarted as well. Just in case. This can propagate further up.

I worked on Akka clusters deployed on thousands of machines, distributing financial computations. However, the programming model is quite unusual. Despite the fact that it is close to original concept of object-oriented programing in Smalltalk. It can do wonders in terms of scalability, but maintaining a large codebase can be challenging. Also troubleshooting Akka, even on a single node, is not very straightforward.

Historically actors were not type-safe. You could literally send anything to an actor, hoping it can handle it. And you had no way of knowing if it succeeded, because messages are asynchronous. These days Akka has typed actors which helps a lot.

That’s it, thanks for listening, bye!

More materials

Tags: actor-model, akka, erlang, smalltalk

Be the first to listen to new episodes!

To get exclusive content: