§Cassandra Read-Side support
This page is specifically about Lagom’s support for Cassandra read-sides. Before reading this, you should familiarize yourself with Lagom’s general read-side support.
§Query the Read-Side Database
Let us first look at how a service implementation can retrieve data from Cassandra.
import akka.NotUsed;
import com.lightbend.lagom.javadsl.api.ServiceCall;
import com.lightbend.lagom.javadsl.persistence.cassandra.CassandraSession;
import java.util.concurrent.CompletableFuture;
import javax.inject.Inject;
import akka.stream.javadsl.Source;

public class BlogServiceImpl implements BlogService {

  private final CassandraSession cassandraSession;

  @Inject
  public BlogServiceImpl(CassandraSession cassandraSession) {
    this.cassandraSession = cassandraSession;
  }

  @Override
  public ServiceCall<NotUsed, Source<PostSummary, ?>> getPostSummaries() {
    return request -> {
      // Stream the rows and map each one to a PostSummary
      Source<PostSummary, ?> summaries = cassandraSession.select(
          "SELECT id, title FROM blogsummary;").map(row ->
              new PostSummary(row.getString("id"), row.getString("title")));
      return CompletableFuture.completedFuture(summaries);
    };
  }
}
Note that the CassandraSession is injected in the constructor. CassandraSession provides several methods in different flavors for executing queries. The one used in the above example returns a Source, i.e. a streamed response. There are also methods for retrieving a list of rows, which can be useful when you know that the result set is small, e.g. when you have included a LIMIT clause.
All methods in CassandraSession are non-blocking and they return a CompletionStage or a Source. The statements are expressed in Cassandra Query Language (CQL) syntax. See Querying tables for information about CQL queries.
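To illustrate the non-blocking style for a small, limited result set, here is a minimal stand-alone sketch. It does not use Lagom: `fetchRows` is a hypothetical stand-in for a list-returning session method, and rows are modelled as `String[]{id, title}` rather than driver rows. The point is only that the result is transformed with `thenApply` instead of being waited for.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

// Stand-alone sketch; fetchRows and the String[] row model are assumptions,
// not part of the CassandraSession API.
final class SmallResultSketch {

    static final class PostSummary {
        final String id;
        final String title;

        PostSummary(String id, String title) {
            this.id = id;
            this.title = title;
        }
    }

    // Hypothetical stand-in for a query with a LIMIT clause:
    // asynchronously returns at most `limit` rows of the table.
    static CompletionStage<List<String[]>> fetchRows(List<String[]> table, int limit) {
        return CompletableFuture.supplyAsync(
            () -> new ArrayList<String[]>(table.subList(0, Math.min(limit, table.size()))));
    }

    // Non-blocking transformation: the rows are mapped once they arrive,
    // without calling get() or join() on the service code path.
    static CompletionStage<List<PostSummary>> firstSummaries(List<String[]> table, int limit) {
        return fetchRows(table, limit).thenApply(rows -> {
            List<PostSummary> result = new ArrayList<>();
            for (String[] row : rows) {
                result.add(new PostSummary(row[0], row[1]));
            }
            return result;
        });
    }

    public static void main(String[] args) {
        List<String[]> table = Arrays.asList(
            new String[] {"1", "Hello"},
            new String[] {"2", "World"},
            new String[] {"3", "Again"});
        // join() is used here only so that the demo can print the result.
        List<PostSummary> summaries = firstSummaries(table, 2).toCompletableFuture().join();
        for (PostSummary summary : summaries) {
            System.out.println(summary.id + ": " + summary.title);
        }
    }
}
```

In a real service the returned CompletionStage would be handed back to the framework rather than joined.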
§Update the Read-Side
We need to transform the events generated by the Persistent Entities into database tables that can be queried as illustrated in the previous section. For that we will implement a ReadSideProcessor with assistance from the CassandraReadSide support component. It will consume events produced by persistent entities and update one or more tables in Cassandra that are optimized for queries.
This is what a ReadSideProcessor class looks like before filling in the implementation details:
import akka.Done;
import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.PreparedStatement;
import com.lightbend.lagom.javadsl.persistence.AggregateEventTag;
import com.lightbend.lagom.javadsl.persistence.ReadSideProcessor;
import com.lightbend.lagom.javadsl.persistence.cassandra.CassandraReadSide;
import com.lightbend.lagom.javadsl.persistence.cassandra.CassandraSession;
import org.pcollections.PSequence;
import javax.inject.Inject;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletionStage;
import static com.lightbend.lagom.javadsl.persistence.cassandra.CassandraReadSide.*;

public class BlogEventProcessor extends ReadSideProcessor<BlogEvent> {

  private final CassandraSession session;
  private final CassandraReadSide readSide;

  @Inject
  public BlogEventProcessor(CassandraSession session, CassandraReadSide readSide) {
    this.session = session;
    this.readSide = readSide;
  }

  @Override
  public ReadSideProcessor.ReadSideHandler<BlogEvent> buildHandler() {
    // TODO build read side handler
    return null;
  }

  @Override
  public PSequence<AggregateEventTag<BlogEvent>> aggregateTags() {
    // TODO return the tag for the events
    return null;
  }
}
You can see that we have injected the Cassandra session and the Cassandra read-side support. Both will be needed later.
You should already have implemented tagging for your events as described in the Read-Side documentation, so first we’ll implement the aggregateTags method in our read-side processor stub, like so:
@Override
public PSequence<AggregateEventTag<BlogEvent>> aggregateTags() {
  return BlogEvent.TAG.allTags();
}
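The tags returned by aggregateTags determine which event streams the processor consumes. As a rough illustration of how a sharded tag set might be enumerated, and how all events of one entity could map to a single tag, here is a stand-alone sketch. The class `ShardedTagSketch`, the naming scheme (base name plus shard number) and the hash-based shard selection are assumptions made for illustration; Lagom's own implementation may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of sharded event tags; the names and the hashing
// scheme are assumptions, not Lagom's actual implementation.
final class ShardedTagSketch {

    private final String baseName;
    private final int numShards;

    ShardedTagSketch(String baseName, int numShards) {
        this.baseName = baseName;
        this.numShards = numShards;
    }

    // Every tag that a read-side processor would need to cover.
    List<String> allTags() {
        List<String> tags = new ArrayList<>();
        for (int shard = 0; shard < numShards; shard++) {
            tags.add(baseName + shard);
        }
        return tags;
    }

    // The single tag under which all events of one entity are grouped.
    String tagFor(String entityId) {
        // floorMod keeps the shard index non-negative for negative hash codes.
        return baseName + Math.floorMod(entityId.hashCode(), numShards);
    }

    public static void main(String[] args) {
        ShardedTagSketch tags = new ShardedTagSketch("BlogEvent", 4);
        System.out.println(tags.allTags());
        System.out.println(tags.tagFor("post-42"));
    }
}
```

Because an entity always maps to the same tag, the events of that entity stay ordered within one stream even when the tags are spread over several nodes.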
§Building the read-side handler
The other method on the ReadSideProcessor is buildHandler. This is responsible for creating the ReadSideHandler that will handle events. It also gives the opportunity to run two callbacks: a global prepare callback and a regular prepare callback.
CassandraReadSide has a builder method for creating a builder for these handlers. The handler that this builder creates will automatically manage read-side offsets for you. It can be created like so:
CassandraReadSide.ReadSideHandlerBuilder<BlogEvent> builder =
    readSide.builder("blogsummaryoffset");
The argument passed to this method is the ID of the event processor that Lagom will use when it persists offsets to its offset store. The offset store is a Cassandra table, which will be created for you if it doesn’t exist. You can also create this table manually if you wish. The DDL for its creation is as follows:
CREATE TABLE IF NOT EXISTS offsetStore (
  eventProcessorId text,
  tag text,
  timeUuidOffset timeuuid,
  sequenceOffset bigint,
  PRIMARY KEY (eventProcessorId, tag)
)
§Global prepare
The global prepare callback runs at least once across the whole cluster. It is intended for doing things like creating tables and preparing any data that needs to be available before read side processing starts. Read side processors may be sharded across many nodes, and so tasks like creating tables should usually only be done from one node.
The global prepare callback is run from an Akka cluster singleton. It may be run multiple times - every time a new node becomes the new singleton, the callback will be run. Consequently, the task must be idempotent. If it fails, it will be run again using an exponential backoff, and the read side processing of the whole cluster will not start until it has run successfully.
Setting a global prepare callback is completely optional. You may prefer to manage Cassandra tables manually, but it is very convenient for development and test environments to use this callback to create them for you.
Below is an example method that we’ve implemented to create tables:
private CompletionStage<Done> createTable() {
  return session.executeCreateTable("CREATE TABLE IF NOT EXISTS blogsummary ( " +
      "id TEXT, title TEXT, PRIMARY KEY (id))");
}
It can then be registered as the global prepare callback in the buildHandler method:
builder.setGlobalPrepare(this::createTable);
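The retry behaviour described for the global prepare callback (rerun on failure with exponential backoff, which is why the task must be idempotent) can be pictured with the following stand-alone sketch. It is not Lagom's implementation: the `retry` helper, the delay values and the simulated `createTable` task are all assumptions chosen for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Illustrative sketch only; Lagom's actual retry logic is not shown here.
final class BackoffRetrySketch {

    // Runs the task until it succeeds, doubling the delay after each
    // failure up to maxDelayMs.
    static <T> CompletionStage<T> retry(Supplier<CompletionStage<T>> task,
                                        ScheduledExecutorService scheduler,
                                        long delayMs, long maxDelayMs) {
        CompletableFuture<T> result = new CompletableFuture<>();
        attempt(task, scheduler, delayMs, maxDelayMs, result);
        return result;
    }

    private static <T> void attempt(Supplier<CompletionStage<T>> task,
                                    ScheduledExecutorService scheduler,
                                    long delayMs, long maxDelayMs,
                                    CompletableFuture<T> result) {
        task.get().whenComplete((value, error) -> {
            if (error == null) {
                result.complete(value);
            } else {
                // Schedule the next attempt with a doubled (capped) delay.
                long nextDelayMs = Math.min(delayMs * 2, maxDelayMs);
                scheduler.schedule(
                    () -> attempt(task, scheduler, nextDelayMs, maxDelayMs, result),
                    delayMs, TimeUnit.MILLISECONDS);
            }
        });
    }

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger calls = new AtomicInteger();
        // Simulated idempotent task that fails twice before succeeding.
        Supplier<CompletionStage<String>> createTable = () -> {
            CompletableFuture<String> attemptResult = new CompletableFuture<>();
            if (calls.incrementAndGet() < 3) {
                attemptResult.completeExceptionally(new IllegalStateException("not ready"));
            } else {
                attemptResult.complete("done");
            }
            return attemptResult;
        };
        String outcome = retry(createTable, scheduler, 10, 100).toCompletableFuture().join();
        scheduler.shutdown();
        System.out.println(outcome + " after " + calls.get() + " attempts");
    }
}
```

Because the task may run several times, a statement such as CREATE TABLE IF NOT EXISTS is a natural fit: repeating it has no further effect.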
§Prepare
In addition to the global prepare callback, there is also a prepare callback. This will be executed once per shard, when the read side processor starts up. It can be used for preparing statements in order to optimize Cassandra’s handling of them.
Again, this callback is optional. Here is an example of how to prepare a statement for updating the table:
private PreparedStatement writeTitle = null; // initialized in prepare

private CompletionStage<Done> prepareWriteTitle() {
  return session.prepare("INSERT INTO blogsummary (id, title) VALUES (?, ?)")
      .thenApply(ps -> {
        this.writeTitle = ps;
        return Done.getInstance();
      });
}
It can then be registered as the prepare callback:
builder.setPrepare(tag -> prepareWriteTitle());
§Event handlers
The event handlers take an event and return a list of bound statements. Rather than executing updates in the handler itself, it is recommended that you return the statements that you want to execute to Lagom. This allows Lagom to batch those statements with the offset table update statement and execute them together as a logged batch, which Cassandra executes atomically. By doing this you can ensure exactly-once processing of all events; otherwise, processing may be at-least-once.
Here’s an example callback for handling the PostAdded event:
private CompletionStage<List<BoundStatement>> processPostAdded(BlogEvent.PostAdded event) {
  BoundStatement bindWriteTitle = writeTitle.bind();
  bindWriteTitle.setString("id", event.getPostId());
  bindWriteTitle.setString("title", event.getContent().getTitle());
  return completedStatements(Arrays.asList(bindWriteTitle));
}
This can then be registered with the builder using setEventHandler:
builder.setEventHandler(BlogEvent.PostAdded.class, this::processPostAdded);
Once you have finished registering all your event handlers, you can invoke the build method and return the built handler:
return builder.build();
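Why returning statements matters can be seen in a small in-memory model. The sketch below is not Lagom or Cassandra code: `AtomicOffsetSketch`, its fields and the `applyBatch` method are hypothetical. It assumes that a processor resumes from the stored offset, which is modelled here as skipping any event whose offset is not newer than the stored one. Since the writes and the offset change in one atomic step, a redelivered event is never applied twice.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory model of "table writes + offset update in one atomic batch".
// All names are hypothetical; this is not how Lagom is implemented.
final class AtomicOffsetSketch {

    private final Map<String, String> blogSummary = new HashMap<>(); // id -> title
    private long storedOffset = 0;  // offset of the last processed event
    private int appliedBatches = 0; // how many events actually changed state

    // Applies the inserts and the offset update together. Returns false
    // for an event that was already covered by the stored offset.
    synchronized boolean applyBatch(long eventOffset, List<String[]> inserts) {
        if (eventOffset <= storedOffset) {
            return false; // redelivery: its effects were stored with its offset
        }
        for (String[] insert : inserts) {
            blogSummary.put(insert[0], insert[1]);
        }
        storedOffset = eventOffset;
        appliedBatches++;
        return true;
    }

    synchronized int appliedBatches() {
        return appliedBatches;
    }

    synchronized String title(String id) {
        return blogSummary.get(id);
    }

    public static void main(String[] args) {
        AtomicOffsetSketch store = new AtomicOffsetSketch();
        store.applyBatch(1, Collections.singletonList(new String[] {"p1", "First post"}));
        store.applyBatch(2, Collections.singletonList(new String[] {"p2", "Second post"}));
        // Simulate a redelivery of event 2 after a restart.
        boolean applied = store.applyBatch(2, Collections.singletonList(new String[] {"p2", "Second post"}));
        System.out.println("redelivery applied: " + applied);
        System.out.println("batches applied: " + store.appliedBatches());
    }
}
```

If the handler instead executed its updates directly and the offset were written separately, a crash between the two steps would leave the update applied but the offset not advanced, so the event would be processed again on restart.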
§Underlying implementation
The CassandraSession uses the Datastax Java Driver for Apache Cassandra.
Each ReadSideProcessor instance is executed by an Actor that is managed by Akka Cluster Sharding. The processor consumes a stream of persistent events delivered by the eventsByTag Persistence Query implemented by akka-persistence-cassandra. The tag corresponds to the tag defined by the AggregateEventTag.