Managing 238M Memberships at Netflix
22 Feb 2024
Netflix's Membership Team
- The membership team at Netflix is responsible for signups, streaming, and managing user accounts.
- They operate a dozen microservices at four-nines (99.99%) availability and serve millions of requests per second.
- Key user touchpoints include the "Start Membership" button, the "Play" button, and the account page.
- Netflix's membership services manage user sign-ups, plan changes, and partner bundle activations.
- All memberships are saved in Cassandra, a persistent store.
- Netflix's tech footprint consists of a distributed system architecture optimized for high read RPS.
- The stack includes more than a dozen microservices, gRPC at the HTTP layer for service-to-service communication, Java and Kotlin as the primary programming languages, Spring Boot for server-side development, Kafka for message passing, and Spark and Flink for offline reconciliation.
- Netflix uses extensive request and response logging, dashboards, and Kibana with Elasticsearch to monitor and troubleshoot errors.
- Distributed tracing is used to isolate and identify issues in specific microservices.
- Production alerts are set up for key metrics such as latency and sign-up rate.
- Machine learning models are used to detect anomalies and trigger automatic corrections or alerts.
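To make the alerting idea concrete, here is a minimal sketch of anomaly detection on a metric such as sign-up rate. It is not Netflix's model: the talk summary only says that machine learning models are used, so a simple rolling z-score stands in for them here. The class name `RateAnomalyDetector`, the window size, the threshold, and the sample data are all hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


class RateAnomalyDetector:
    """Flags a sample that deviates sharply from a rolling baseline.

    Hypothetical stand-in for a real anomaly model: a plain z-score
    over the most recent `window` normal samples.
    """

    def __init__(self, window=30, threshold=3.0):
        self.samples = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, value):
        """Return True if `value` is anomalous relative to the baseline."""
        anomalous = False
        if len(self.samples) >= 5:  # require some history before judging
            mu = mean(self.samples)
            sigma = pstdev(self.samples)
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                anomalous = True
        # Only normal samples update the baseline, so a sustained
        # outage does not quietly become the new "normal".
        if not anomalous:
            self.samples.append(value)
        return anomalous


# Illustrative sign-ups per minute; the final value simulates a sharp drop.
detector = RateAnomalyDetector(window=30, threshold=3.0)
signups_per_minute = [100, 102, 98, 101, 99, 103, 97, 100, 40]
alerts = [detector.observe(v) for v in signups_per_minute]
```

In this sketch only the last sample is flagged. A production setup would more likely account for seasonality (time of day, region) and route the flag to an alert or an automated correction.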
Case Studies
- The first case study examines the evolution of Netflix's pricing technology choices, highlighting the challenges of scaling a simple in-memory library to support global expansion, multiple plans, and various domains.
- The second case study discusses the importance of cataloging every modification made to records in their systems, ensuring data integrity and enabling them to answer questions about historical changes.
- The third case study focuses on scaling, detailing how Netflix evolved its architecture to meet increasing demands and requirements.
Member History Service
- The member history architecture initially relied on application-level events for tracing historical state and triggering actions.
- A new service was created to capture the direct delta of changes on key operational data sources using a change data capture pattern.
- Member History Service tracks and persists all membership rights updates in an append-only fashion, solving the split-brain problem and enabling powerful debugging and tracing capabilities.
- Member History Service replaced all app-level events with a view on top of member history, providing a single interface for all other systems within Netflix to access membership data.
- Member History Service uses Iceberg tables to persist events, enabling reconciliation with core membership systems and replay of events for downstream analytics.
- Member History Service helped save the day during a data corruption incident by replaying corrupted records and restoring data integrity.
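The append-only pattern described above can be sketched in a few lines. This is an illustrative in-memory model under stated assumptions, not the actual service: the names `ChangeEvent` and `MemberHistory`, the before/after delta shape, and the global sequence number are all hypothetical, and a Python list stands in for the Iceberg tables mentioned in the summary.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One captured change: the record before and after a write."""
    member_id: str
    sequence: int                      # position in the global log
    before: Optional[Dict[str, Any]]   # None for a newly created record
    after: Optional[Dict[str, Any]]    # None for a deleted record


class MemberHistory:
    """Append-only log of membership changes (hypothetical sketch)."""

    def __init__(self):
        self._events: List[ChangeEvent] = []

    def record(self, member_id, before, after):
        """Append a change event; existing events are never modified."""
        event = ChangeEvent(
            member_id=member_id,
            sequence=len(self._events),
            before=dict(before) if before is not None else None,
            after=dict(after) if after is not None else None,
        )
        self._events.append(event)
        return event

    def history_for(self, member_id):
        """All events for one member, oldest first."""
        return [e for e in self._events if e.member_id == member_id]

    def replay(self, member_id, up_to=None):
        """Rebuild a member's state as of sequence `up_to` (inclusive)."""
        state = None
        for event in self.history_for(member_id):
            if up_to is not None and event.sequence > up_to:
                break
            state = event.after
        return state


history = MemberHistory()
basic = {"plan": "basic", "status": "active"}
premium = {"plan": "premium", "status": "active"}
history.record("m1", None, basic)                       # sequence 0
history.record("m1", basic, premium)                    # sequence 1
history.record("m2", None, basic)                       # sequence 2
history.record("m1", premium, {"plan": None, "status": "active"})  # 3: bad write

# Recover the last known-good state by replaying up to sequence 2.
restored = history.replay("m1", up_to=2)
```

Because events are only ever appended, the bad write at sequence 3 stays in the log for debugging while the earlier state remains recoverable, which is the property the data corruption anecdote relies on.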
Conclusion
- Architectural choices can pay off significantly in the long run, as demonstrated by the Member History Service journey.
- The evolution of subscription ecosystems requires continuous innovation and investment in architectural evolution to stay ahead of scalability challenges.
- Pricing technology choices should be future-proofed to avoid the need for reactive pivots.
- It is important to have the courage to invest in big bets.
- The evolution of member subscriptions is an ongoing process, and there is always room for improvement and innovation.