Tezos Consensus Signing with AWS Lambda, DynamoDB and KMS

MIDL.dev
6 min read · Oct 26, 2023

--

by Nicolas Ochem

TLDR: we introduce a new method for safely signing Tezos baking consensus operations on AWS. It is suitable for enterprises and large bakers, and costs only about $10 a month. The source code and documentation are here. Deploy it from the Amazon Serverless Application Repository.

A Lambda signing a parchment paper, by DALL-E (not quite a lambda, in fact)

We present a minimalist, but secure and cheap solution for Tezos consensus signing.

Key Management Systems

AWS KMS (Key Management Service) is mostly used for signing TLS certificate requests and encrypting secrets. It also supports ECDSA signatures on the secp256k1 curve, commonly used in cryptocurrency. In Tezos, the public key hash of such a key is prefixed by tz2.

A cryptographic secret key is only as secure as the hardware it’s running on. A KMS such as the one offered by Amazon provides keys that never leave the hardware where they were generated: it only takes unsigned messages and gives back signed messages. Think of it as a cloud version of a Ledger wallet.

But wait a second! It’s very hard to extract the private key from a KMS signer, but it does not really matter if you can sign whatever you want. You can still take all the money and run! Just sign a message sending it all to an address that you control and voilà.

This is an underappreciated fact of secure key management: it is worthless unless surrounded by a bastion that severely restricts what can be signed.

Ideally, this restriction would also be part of the secure element, as it is on a Ledger. But this is not yet possible in the cloud.

As an alternative, we:

  • make a list of all the restrictions that the bastion must enforce,
  • build a simple gating system made of AWS components,
  • secure it properly.

The potential damage an intruder can do is thus contained: unauthorized access to the baker may at worst cause denial of service (the bakery stops operating). But slashing or theft of funds requires access to the signing bastion, which is much harder to get into.

Organizationally, you can define policies where a small group has signer access, while a wider group has access to the baker. As most issues with Tezos baking can be troubleshot without signer access, a less trusted, larger group of engineers can have their hands on the baking infra without compromising your security posture.

Consensus Keys

While it is possible to bake on Tezos with the delegate key, a recommended method is to assign the right to sign consensus messages to a consensus key.

The consensus key only concerns itself with signing blocks, preattestations and attestations. Other functions such as staking, unstaking, moving tokens, and voting in governance remain the responsibility of the delegate key.

There are several reasons why it is advantageous to use a consensus key with a cloud baking setup:

  • the delegate key can be kept cold or in custody, only used occasionally for voting or transfers,
  • having the consensus key in the cloud is an acceptable trade-off: indeed, in the case where cloud access is lost or denied, it is possible to switch to a new key,
  • all consensus operations listed above (blocks, preattestations and attestations) are prefixed by a magic byte. There is no policy in the Lambda that allows other kinds of signatures. This makes the code simple and robust.
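As a rough illustration of the magic byte restriction, here is a minimal sketch in JavaScript. It is not the code of the actual Lambda: the function and parameter names are hypothetical, and it assumes the payload arrives as a hex string. The three byte values are the Tenderbake signing watermarks (0x11 for blocks, 0x12 for preattestations, 0x13 for attestations).

```javascript
// Tenderbake watermarks: the first byte of every payload the baker asks us to sign.
const ALLOWED_MAGIC_BYTES = new Map([
  [0x11, "block"],
  [0x12, "preattestation"],
  [0x13, "attestation"],
]);

// Returns the operation kind, or throws if the payload is not a consensus operation.
// `payloadHex` is a hypothetical name for the hex string submitted by the baker.
function classifyPayload(payloadHex) {
  if (typeof payloadHex !== "string" || !/^([0-9a-fA-F]{2})+$/.test(payloadHex)) {
    throw new Error("payload must be a non-empty, even-length hex string");
  }
  const magic = parseInt(payloadHex.slice(0, 2), 16);
  const kind = ALLOWED_MAGIC_BYTES.get(magic);
  if (kind === undefined) {
    const shown = magic.toString(16).padStart(2, "0");
    throw new Error(`refusing to sign: magic byte 0x${shown} is not allowed`);
  }
  return kind;
}
```

With an allow-list like this, a payload starting with 0x03 (the prefix used for generic operations such as transfers) is rejected before the key is ever involved.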

Lambda Functions

Once touted as a new computing paradigm, Lambda Functions are small stateless chunks of code that get triggered by events, typically an HTTP(S) call. Each time they get triggered, the amount of time and RAM used is metered. You are then billed by the millisecond-MB.

Tezos offers a remote signing interface that supports HTTPS, which makes Lambdas suitable for a remote signer frontend.

The lambda runtime is another advantage: we used a Node 18 runtime. It already includes the AWS SDK to interface with other components such as DynamoDB and KMS.

This is superior to (say) a Docker container. Indeed, a typical container contains a multitude of files. In contrast, the entire Lambda runtime is provided by Amazon, which has a stake in keeping it secure. Just import a zip file with your JavaScript code and npm dependencies and you are good to go.

Our Lambda function consists of about 400 lines of code, with only two dependencies, base58check and noble-curves. This is drastically smaller than full-fledged solutions such as Signatory. The flip side is that it is much less configurable, and only supports consensus key signing.

The combination of a locked-down runtime, a low dependency count and a small code base makes this solution easy to audit.

High Watermarks and Concurrency

The magic byte restriction will prevent signature of unauthorized operations such as transfer or drain, but we also need to prevent double signatures. Indeed, the signer must never sign two operations at the same level and round. This is considered equivocating, and anyone in possession of two contradictory operations may submit a denunciation on-chain and slash the baker.

As Lambdas have no state, we are using DynamoDB, a popular hosted NoSQL database, to store the high watermark for level and round: the number can only go up. Any violation will cause the signature request to be rejected.
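The "can only go up" rule can be sketched as a comparison on (level, round) pairs. This is an illustration rather than the Lambda's actual code: the function name and object shape are hypothetical, and it assumes the level and round have already been parsed from the payload and that a separate watermark is kept for each operation kind.

```javascript
// Returns true if `requested` is strictly above the stored watermark.
// Both arguments are { level, round } objects; `stored` is undefined
// when nothing has been signed yet for this operation kind.
function isAboveWatermark(stored, requested) {
  if (stored === undefined) return true;
  if (requested.level !== stored.level) return requested.level > stored.level;
  // Same level: only a strictly higher round is acceptable.
  return requested.round > stored.round;
}

const stored = { level: 100, round: 1 };
console.log(isAboveWatermark(stored, { level: 100, round: 1 })); // false: same level and round
console.log(isAboveWatermark(stored, { level: 100, round: 2 })); // true: higher round
console.log(isAboveWatermark(stored, { level: 99, round: 7 }));  // false: lower level
```

A request for which this check returns false is rejected without calling KMS.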

We are left with the issue of concurrency: an attacker might flood the signer with signature requests. KMS signatures are not instantaneous, therefore two requests at the same level and round might get signed before DynamoDB has a chance to update itself.

We solve this by making the transaction atomic using DynamoDB Conditional Updates. It goes like this:

  • receive request, query high watermark
  • sign request
  • write new high watermark, but only if it hasn’t changed since we last read it. Otherwise, toss the signature.
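The three steps above can be sketched as follows. To keep the example runnable without AWS, an in-memory class stands in for the DynamoDB table; all names are hypothetical and this is not the Lambda's actual code. The `buildConditionalUpdate` helper shows what an equivalent DynamoDB UpdateItem input could look like, using a `ConditionExpression` so that the write fails (DynamoDB reports a ConditionalCheckFailedException) if the watermark has moved since it was read.

```javascript
// In-memory stand-in for the DynamoDB table, for illustration only.
class InMemoryWatermarks {
  constructor() {
    this.items = new Map();
  }
  read(kind) {
    return this.items.get(kind);
  }
  // Mimics a conditional write: succeeds only if the stored watermark
  // still equals `expected` (the value read at the start of the request).
  writeIfUnchanged(kind, expected, next) {
    const current = this.items.get(kind);
    const unchanged =
      expected === undefined
        ? current === undefined
        : current !== undefined &&
          current.level === expected.level &&
          current.round === expected.round;
    if (!unchanged) return false;
    this.items.set(kind, { level: next.level, round: next.round });
    return true;
  }
}

// Last step of a request: release the signature only if the write succeeds.
function releaseSignature(table, kind, seen, requested, signature) {
  return table.writeIfUnchanged(kind, seen, requested) ? signature : null;
}

// Shape of an equivalent DynamoDB UpdateItem input.
// Table, key and attribute names are hypothetical.
function buildConditionalUpdate(tableName, kind, seen, requested) {
  const values = {
    ":newLevel": { N: String(requested.level) },
    ":newRound": { N: String(requested.round) },
  };
  let condition = "attribute_not_exists(#lvl)"; // first ever write
  if (seen !== undefined) {
    condition = "#lvl = :oldLevel AND #rnd = :oldRound";
    values[":oldLevel"] = { N: String(seen.level) };
    values[":oldRound"] = { N: String(seen.round) };
  }
  return {
    TableName: tableName,
    Key: { kind: { S: kind } },
    UpdateExpression: "SET #lvl = :newLevel, #rnd = :newRound",
    ConditionExpression: condition,
    ExpressionAttributeNames: { "#lvl": "level", "#rnd": "round" },
    ExpressionAttributeValues: values,
  };
}

// Two concurrent requests at level 101 both read the watermark (100, 0)
// and both obtain a signature, but only the first write goes through.
const table = new InMemoryWatermarks();
table.writeIfUnchanged("attestation", undefined, { level: 100, round: 0 });
const seenByA = table.read("attestation");
const seenByB = table.read("attestation");
const next = { level: 101, round: 0 };
const first = releaseSignature(table, "attestation", seenByA, next, "signature-A");
const second = releaseSignature(table, "attestation", seenByB, next, "signature-B");
console.log(first, second); // signature-A null
```

The second signature exists in memory for a moment, but it is never returned to the caller, so it cannot be used for a denunciation.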

Baker Authentication

The bastion exposes an HTTPS interface to the outside. Typical solutions for securing it involve API keys or JWTs, but Tezos offers a little-known native solution: authorized_keys.

When enabled, the signer only accepts signature requests authenticated by a key (yes, the signature request itself is signed!).

The baker treats this key just as any Tezos address in its client directory. In order to limit the number of dependencies in the Lambda, the authorized_key must also be of the same secp256k1 type as the KMS key itself.

The public authorized key must be passed to the Lambda as an environment variable. The associated private key is known only by the baker and should be kept secure.

Usage of an authorized_key is mandatory. It ensures that only the intended baker can submit signature requests. Otherwise, anyone can trivially DoS the signer by submitting one dummy request to set the high watermark really high.

As a benefit, the signer interface can be on the Internet: the Lambda function is directly exposed on the Internet through an API gateway. (Users might still opt to only expose the signer within a VPC to increase security or satisfy third-party requirements.)

Tying it all together with IAM

We create IAM policies to ensure that only the Lambda function can interface with the KMS key and the DynamoDB table.
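As an illustration, the permissions of the Lambda's execution role could be limited to four actions. The sketch below builds such a policy document as a JavaScript object; the function name, statement IDs and ARNs are hypothetical, and the policies shipped with the actual project may differ. Restricting who else can use the key (for example through the KMS key policy) is a separate step not shown here.

```javascript
// Builds a least-privilege IAM policy document for the signer Lambda.
// `kmsKeyArn` and `tableArn` are the ARNs of the consensus key and of
// the watermark table; the values used below are placeholders.
function buildSignerRolePolicy(kmsKeyArn, tableArn) {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Sid: "SignWithConsensusKey",
        Effect: "Allow",
        Action: ["kms:Sign", "kms:GetPublicKey"],
        Resource: kmsKeyArn,
      },
      {
        Sid: "ReadAndAdvanceWatermark",
        Effect: "Allow",
        Action: ["dynamodb:GetItem", "dynamodb:UpdateItem"],
        Resource: tableArn,
      },
    ],
  };
}

const policy = buildSignerRolePolicy(
  "arn:aws:kms:eu-west-1:111122223333:key/EXAMPLE-KEY-ID",
  "arn:aws:dynamodb:eu-west-1:111122223333:table/example-watermarks"
);
console.log(JSON.stringify(policy, null, 2));
```

Note the absence of wildcards: the role can neither manage the key nor touch any other table.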

By default, any privileged AWS user (including the root account) can alter any IAM policy and gain access to the bastion. Therefore, it is advisable to deploy this setup into a dedicated AWS account to which only trusted operators have access. Otherwise, it is possible to lock down access in an account with a combination of authorized and unauthorized roles. We recommend undergoing IAM training and certification before engaging in this.

In Conclusion

We have presented a simple setup to sign consensus operations on Tezos. It has few moving parts, little code, and relies heavily on off-the-shelf Amazon products. It can be deployed entirely from the AWS web console; no CLI is required. Simplicity is an asset, as it reduces confusion and decreases the likelihood of incidents.

The costs are very reasonable due to its low footprint: a full-fledged Tezos mainnet baker signing an attestation every 15 seconds can expect under $10 of monthly costs.
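A back-of-the-envelope estimate of the KMS share of that bill can be sketched as below. Everything in it is an assumption to check against your own setup and the current AWS pricing page: a 30-day month, two signatures per level (one preattestation and one attestation), and illustrative prices of $0.15 per 10,000 asymmetric signing requests plus $1 per key per month. Lambda, API Gateway and DynamoDB charges come on top.

```javascript
const SECONDS_PER_MONTH = 30 * 24 * 60 * 60; // assumes a 30-day month

// Number of signing requests per month for a given block time.
function monthlySigningRequests(blockTimeSeconds, signaturesPerLevel) {
  return Math.floor(SECONDS_PER_MONTH / blockTimeSeconds) * signaturesPerLevel;
}

// KMS cost only; both prices are inputs so they can be updated.
function monthlyKmsCost(requests, pricePer10kRequests, keyMonthlyFee) {
  return (requests / 10000) * pricePer10kRequests + keyMonthlyFee;
}

const requests = monthlySigningRequests(15, 2); // 172,800 levels, 2 signatures each
const kmsCost = monthlyKmsCost(requests, 0.15, 1.0); // assumed prices, see above
console.log(requests, kmsCost.toFixed(2));
```

Under these assumptions the KMS share comes to roughly $6 a month, which is compatible with the overall figure quoted above.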

It is also fast: each signature request takes a few hundred milliseconds to return a response. If your baker and signer are geographically close enough, good attesting performance is achievable.

How to deploy

See the deployment guide.

Special thanks to Aryeh Harris
