Cluster deployment
This page describes how you can deploy a distributed Restate cluster.
Configuration
To deploy a distributed Restate cluster without external dependencies, you need to configure the following settings in your server configuration:
# Let every node run all roles
roles = ["metadata-server", "admin", "worker", "log-server"]

# Every node needs to have a unique node name
node-name = "UNIQUE_NODE_NAME"

# All nodes need to have the same cluster name
cluster-name = "CLUSTER_NAME"

# Make sure it does not conflict with the other nodes
advertised-address = "ADVERTISED_ADDRESS"

# At most one node can be configured with auto-provision = true
auto-provision = false

[bifrost]
# Only the replicated Bifrost provider can be used in a distributed deployment
default-provider = "replicated"

[bifrost.replicated-loglet]
# Replicate the data to 2 nodes. This requires that the cluster has at least 2 nodes to
# become operational. If the cluster has at least 3 nodes, then it can tolerate 1 node failure.
default-replication-property = 2

[metadata-server]
# To tolerate node failures, use the embedded metadata server
type = "embedded"

[metadata-store-client]
# Use the embedded metadata store
type = "embedded"
# List all the advertised addresses of the nodes that run the metadata-server role
addresses = ["ADVERTISED_ADDRESS", "ADVERTISED_ADDRESS_NODE_X"]

[admin]
# Make sure it does not conflict with the other nodes
bind-address = "ADMIN_BIND_ADDRESS"

[ingress]
# Make sure it does not conflict with other nodes
bind-address = "INGRESS_BIND_ADDRESS"

[admin.query-engine]
# Make sure it does not conflict with other nodes
pgsql-bind-address = "PGSQL_BIND_ADDRESS"
Every Restate server you start must have a unique node-name. All servers that are part of the cluster must have the same cluster-name. At most one server can be configured with auto-provision = true.
If no server is allowed to auto-provision, you have to provision the cluster manually. Refer to the Cluster provisioning section below for more information.
The log provider needs to be configured with default-provider = "replicated". The default-replication-property should be set to the number of servers that the data should be replicated to. If you run at least 2 * default-replication-property - 1 servers, then the cluster can tolerate default-replication-property - 1 server failures. See the log documentation for how to configure the log to tolerate up to n server failures.
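For instance, to replicate the log to 3 nodes, so that a cluster of at least 5 nodes can tolerate 2 node failures, the Bifrost section could look like this (a minimal sketch based on the configuration above):

[bifrost]
default-provider = "replicated"

[bifrost.replicated-loglet]
default-replication-property = 3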
The metadata server type should be set to embedded to tolerate server failures. Every server that runs the metadata-server role will join the metadata store cluster. To tolerate n metadata server failures, you need to run at least 2 * n + 1 Restate servers with the metadata-server role configured. The metadata-store-client should be set to embedded and configured with the advertised addresses of all servers that run the metadata-server role.
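For example, in a three-node cluster where every node runs the metadata-server role, each node could point the metadata store client at all three peers (the host names and the port 5122 are illustrative placeholders; use the advertised addresses of your own nodes):

[metadata-store-client]
type = "embedded"
addresses = ["http://restate-1:5122", "http://restate-2:5122", "http://restate-3:5122"]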
Every Restate server that runs the worker role will also run the ingress server and accept incoming invocations. For servers that run on the same machine, make sure that their ports do not conflict.
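As an illustration, a second node started on a machine that already hosts another node could override the conflicting settings like this (node name, ports, and addresses are placeholders chosen for this sketch):

node-name = "node-2"
advertised-address = "http://localhost:25122"
auto-provision = false

[admin]
bind-address = "0.0.0.0:29070"

[ingress]
bind-address = "0.0.0.0:28080"

[admin.query-engine]
pgsql-bind-address = "0.0.0.0:29071"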
Cluster provisioning
Once you start the server that is configured with auto-provision = true, it will provision the cluster so that the other servers can join. The provisioning step initializes the metadata store and writes the initial NodesConfiguration, containing the initial cluster configuration, to the metadata store. If none of the servers is allowed to auto-provision, you need to provision the cluster manually via restatectl.
restatectl cluster provision --addresses <SERVER_TO_PROVISION> --yes
This provisions the cluster with the default settings specified in the server configuration. See the restatectl documentation for more information about how to provision and operate the cluster.
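Once provisioning has completed, you can verify that all nodes have joined the cluster, for example with restatectl's status overview (assuming restatectl is configured to reach one of the running nodes):

restatectl status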
Growing the cluster
It is possible to grow the cluster after it has been started. To do so, start a new server with the same cluster-name and with at least one address of a running server in metadata-store-client.addresses. The latter is needed so that the new server can discover the metadata servers and join the cluster. If you are using an external metadata store, then metadata-store-client should point to the external metadata store.
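As a sketch, a new node joining an existing three-node cluster could be configured like this (node name and addresses are placeholders; the metadata-store-client addresses must point to running metadata servers):

roles = ["metadata-server", "admin", "worker", "log-server"]
node-name = "node-4"
cluster-name = "CLUSTER_NAME"
advertised-address = "http://restate-4:5122"
# The cluster is already provisioned, so the new node must not auto-provision
auto-provision = false

[metadata-store-client]
type = "embedded"
addresses = ["http://restate-1:5122", "http://restate-2:5122", "http://restate-3:5122"]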
If you plan to grow your cluster after some time, we strongly recommend enabling snapshotting. Otherwise, you risk that newly added nodes cannot be fully utilized by the system. See the snapshotting documentation for more information.
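As a rough sketch, snapshotting is configured in the worker section of the server configuration; the exact options and supported destinations are described in the snapshotting documentation (the bucket name and interval below are placeholders):

[worker.snapshots]
destination = "s3://my-restate-snapshots/cluster-prefix"
snapshot-interval-num-records = 10000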
Currently, shrinking the cluster is not supported. This means that servers that have joined the cluster once cannot be stopped! We will add support for removing servers from the cluster soon.