Cold start and SLA

I am trying to build a telco application. Is there any guarantee or SLA from Kalix, for 99.99% availability or similar? Is there any cold start in Kalix, and if so, how long is the cold-start latency? Also, how long can a function instance live? For example, in AWS Lambda a user function instance has a 15-minute time to live before it is destroyed.

We do have an SLA for paying customers, though off the top of my head I’m not sure of the exact conditions.

At present, there is no cold start in Kalix, and deployments are not destroyed on a schedule. We may add scale-to-zero features in future, but because Kalix is stateful, starting up and destroying instances of a deployment is not free the way it is for something like AWS Lambda. When a Kalix service starts up, it forms a cluster, and that cluster gossips information such as which entities are sharded to which nodes. Adding or removing nodes requires rebalancing the sharding state and other state, so it’s counter-productive to start and stop nodes on a whim like AWS Lambda does; we have to scale up and down more slowly. As for cold starting the cluster itself: to guarantee that only one cluster is ever formed (if two clusters were bootstrapped, you’d end up with big consistency issues in the state), a cold start takes around 10 seconds, which is too long for scale-to-zero.
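To make the rebalancing cost concrete, here is a rough Python sketch. It assumes a naive hash-modulo placement of entities on nodes, which is only an illustration and not how Kalix actually allocates shards; `node_for`, `moved_entities` and the `subscriber-N` ids are made-up names. It counts how many entities would have to move when a cluster grows from 3 to 4 nodes.

```python
import hashlib


def node_for(entity_id: str, node_count: int) -> int:
    # Hypothetical placement rule: stable hash of the entity id, modulo node count.
    digest = hashlib.sha256(entity_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def moved_entities(entity_ids, old_count: int, new_count: int):
    # Entities whose owning node changes when the cluster is resized.
    return [
        e for e in entity_ids
        if node_for(e, old_count) != node_for(e, new_count)
    ]


entity_ids = [f"subscriber-{i}" for i in range(1000)]
moved = moved_entities(entity_ids, 3, 4)
print(f"{len(moved)} of {len(entity_ids)} entities change node when scaling 3 -> 4")
```

Every moved entity means state that has to be handed over to another node, which is why frequent start/stop cycles are expensive for a stateful service. Real sharding schemes are designed to move far fewer entities than this naive rule, but the cost never drops to zero.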

When we investigate scale-to-zero, we’ll likely have very low-resource pilot cluster nodes that just queue the requests they receive until higher-resource worker nodes have started and joined the cluster. That way we can keep the cluster always bootstrapped and avoid the long cold starts. But that’s an optimisation that we hope to add later.
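The pilot-node idea can be sketched in a few lines of Python. This is purely illustrative and assumes a single in-process queue; `PilotNode`, `handle` and `worker_joined` are hypothetical names, not part of any Kalix API, and the feature itself is only a possible future design.

```python
from collections import deque


class PilotNode:
    """Hypothetical low-resource node that buffers requests until a worker joins."""

    def __init__(self):
        self.pending = deque()  # requests received while no worker is available
        self.worker = None      # callable that does the real work, once joined

    def handle(self, request):
        if self.worker is None:
            # No worker yet: queue the request instead of rejecting it.
            self.pending.append(request)
            return None
        return self.worker(request)

    def worker_joined(self, worker):
        # A worker node has started and joined: drain the queue in arrival order.
        self.worker = worker
        results = []
        while self.pending:
            results.append(worker(self.pending.popleft()))
        return results


pilot = PilotNode()
pilot.handle("call-setup-1")
pilot.handle("call-setup-2")
replayed = pilot.worker_joined(lambda r: f"handled {r}")
direct = pilot.handle("call-setup-3")
```

In this sketch the first two requests wait in the queue and are processed once the worker joins, while the third goes straight through. The caller still sees extra latency on the queued requests, but the cluster itself never has to be bootstrapped from nothing.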

Thanks for the answer. If at least one instance is always available, what about the cost? Is it included in the compute cost?

Has this feature been implemented? Will there be a cold start issue in that case?