AWS Lambda Under the Hood
19 Mar 2024
Lambda Architecture
- Lambda is a serverless compute system that lets users execute code on demand without managing servers.
- Lambda supports synchronous and asynchronous invocation models.
- Lambda's tenets include availability, efficiency, scale, security, and performance.
Invocation and Execution
- The invoke request routing layer connects Lambda's microservices and provides availability, scale, and access to execution environments.
- The worker manager reuses previously created sandboxes to reduce initialization latency.
- The assignment service replaced the worker manager to provide reliable, distributed, and durable storage for sandbox state.
- A newly introduced node can rebuild its state from the log, which significantly increases availability and makes the system tolerant to single-host failures and Availability Zone events.
- The distributed, consistent sandbox state is maintained per region, with a leader-follower architecture for quick failover.
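The log-based recovery and sandbox reuse described above can be illustrated with a minimal in-memory sketch. This is not Lambda's implementation: the class names (`SandboxJournal`, `AssignmentNode`), the state labels, and the sandbox ID format are all hypothetical, and a plain Python list stands in for the durable, replicated log.

```python
from dataclasses import dataclass, field


@dataclass
class SandboxJournal:
    """Append-only log of sandbox state changes (hypothetical stand-in for a durable log)."""
    entries: list = field(default_factory=list)

    def append(self, sandbox_id, function_id, state):
        self.entries.append((sandbox_id, function_id, state))


class AssignmentNode:
    """Holds an in-memory view of sandbox states that is derived only from the journal."""

    def __init__(self, journal):
        self.journal = journal
        self.sandboxes = {}  # sandbox_id -> (function_id, state)
        self.replay()

    def replay(self):
        # Rebuild the whole view by reading the log from the beginning.
        self.sandboxes = {}
        for sandbox_id, function_id, state in self.journal.entries:
            self.sandboxes[sandbox_id] = (function_id, state)

    def _record(self, sandbox_id, function_id, state):
        # Write to the log first, then update the in-memory view.
        self.journal.append(sandbox_id, function_id, state)
        self.sandboxes[sandbox_id] = (function_id, state)

    def assign(self, function_id):
        # Prefer an idle sandbox that already belongs to this function (warm start).
        idle = [sid for sid, (fid, state) in self.sandboxes.items()
                if fid == function_id and state == "idle"]
        if idle:
            self._record(idle[0], function_id, "busy")
            return idle[0], "warm"
        # Otherwise create a new sandbox (cold start).
        sandbox_id = f"sb-{len(self.journal.entries)}"
        self._record(sandbox_id, function_id, "busy")
        return sandbox_id, "cold"

    def release(self, sandbox_id):
        function_id, _ = self.sandboxes[sandbox_id]
        self._record(sandbox_id, function_id, "idle")
```

In this sketch, a replacement node constructed with the same journal replays every entry and ends up with the same view as the node it replaces, so it can keep handing out warm sandboxes. A real system would also need leader election and log replication, which are left out here.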
Compute Fabric
- The compute fabric owns all the infrastructure required to run code, including the worker fleets, capacity manager, placement, and a data science team that supports smart decision-making.
- The new service was written in Rust, which increased the efficiency and performance of every host, improved processing volume, and reduced overhead latency.
Isolation and Security
- Data isolation is crucial to prevent interference between different functions running on the same worker.
- Virtual machine isolation provides sufficient guarantees to run arbitrary code in a multi-tenant compute system.
- Firecracker is a fast virtualization technology specifically designed for serverless compute needs, allowing multiplexing of thousands of functions from different customers on the same worker with consistent performance.
- Firecracker provides strong isolation boundaries, is very fast with little system overhead, and decorrelates demand from resources, which gives better control of worker fleet heat.
- A custom indirection layer enforces strict copy-on-read to eliminate shared memory and prevent security threats in a multi-tenant execution environment.
- Snapshotting reduces the cost of creating new execution environments: VMs are resumed from snapshots instead of being initialized from scratch.
- A callback interface lets code restore its uniqueness after multiple VMs are resumed from the same snapshot.
- On-demand chunk loading reduces snapshot distribution time and improves performance.
- Convergent encryption deduplicates common chunks across container images and increases cache locality.
- Recording page access patterns allows snapshot loading to be optimized, which addresses inefficient memory access after resume.
- Lambda SnapStart for Java functions makes VM snapshot functionality available to users.
- Lambda uses a distributed cache across multiple Availability Zones to keep a coherent copy of the configuration database, making lookups faster.
- The speaker is open to discussing how Lambda functions can be built in a company's own data center during a follow-up talk.
- The same techniques used in Firecracker could be used to make EBS snapshots faster, but it would require more work due to the complexity of hardware and virtualization layers.
- Services communicate with each other using a mixture of synchronous request-response calls and gRPC and HTTP/2 streams, depending on the requirements of the particular communication.
- Lambda runs Firecracker on bare-metal instances because they meet the requirements of the system, while nested virtualization would be much slower.
- During Lambda function updates, the previous function version is used until the snapshot of the updated function is finished, at which point the system switches to the latest version.
- The engineering process balances security, efficiency, and latency, with security being the top priority.
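The chunk deduplication and on-demand loading mentioned in the list above can be sketched as follows. This is a toy under stated assumptions, not the scheme Lambda uses: the SHA-256 counter-mode keystream is a stand-in for a real cipher and is not suitable for production, the 4-byte chunk size is chosen only for readability, and `ChunkStore`, `put_image`, and `read_chunk` are hypothetical names. The property it demonstrates is the core of convergent encryption: the key is derived from the chunk's own content, so identical plaintext chunks always yield identical ciphertext and can be stored once.

```python
import hashlib

CHUNK_SIZE = 4  # deliberately tiny so the example is easy to follow


def _keystream(key: bytes, length: int) -> bytes:
    # Toy keystream: SHA-256 over key + counter. Illustration only.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt_chunk(chunk: bytes):
    # Convergent encryption: the key is a hash of the plaintext itself.
    key = hashlib.sha256(chunk).digest()
    cipher = bytes(a ^ b for a, b in zip(chunk, _keystream(key, len(chunk))))
    return key, cipher


def decrypt_chunk(key: bytes, cipher: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(cipher, _keystream(key, len(cipher))))


class ChunkStore:
    """Shared store that keeps each distinct encrypted chunk exactly once."""

    def __init__(self):
        self.blobs = {}  # ciphertext digest -> ciphertext

    def put_image(self, data: bytes, chunk_size: int = CHUNK_SIZE):
        # Returns a manifest of (chunk_id, key) pairs; only the owner holds the keys.
        manifest = []
        for i in range(0, len(data), chunk_size):
            key, cipher = encrypt_chunk(data[i:i + chunk_size])
            chunk_id = hashlib.sha256(cipher).hexdigest()
            self.blobs.setdefault(chunk_id, cipher)  # duplicates are stored once
            manifest.append((chunk_id, key))
        return manifest

    def read_chunk(self, manifest, index: int) -> bytes:
        # On-demand loading: fetch and decrypt only the requested chunk.
        chunk_id, key = manifest[index]
        return decrypt_chunk(key, self.blobs[chunk_id])
```

Two images that share a common base, for example `store.put_image(base + b"1111")` and `store.put_image(base + b"2222")`, end up sharing the stored chunks of `base`, and a reader can call `read_chunk` for a single index without pulling the rest of the image.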