Managing 238M Memberships at Netflix

22 Feb 2024
Netflix's Membership Team

  • The membership team at Netflix is responsible for sign-ups, streaming, and managing user accounts.
  • They operate a dozen microservices at four-nines (99.99%) availability and serve millions of requests per second.
  • Key user touchpoints include the "Start Membership" button, the "Play" button, and the account page.
  • Netflix's membership services manage user sign-ups, plan changes, and partner bundle activations.
  • All memberships are saved in Cassandra, a persistent store.

Netflix's Tech Footprint

  • Netflix's tech footprint consists of a distributed system architecture optimized for high read RPS.
  • It includes over 12 microservices, with gRPC at the HTTP layer, Java and Kotlin as the primary programming languages, Spring Boot for server-side development, Kafka for message passing, and Spark and Flink for offline reconciliation.
  • Netflix uses extensive request and response logging, dashboards, and Kibana with Elasticsearch to monitor and troubleshoot errors.
  • Distributed tracing is used to isolate and identify issues in specific microservices.
  • Production alerts are set up for key metrics such as latency and sign-up rate.
  • Machine learning models are used to detect anomalies and trigger automatic corrections or alerts.
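
The alerting idea in the last two bullets can be illustrated with a small sketch. The summary does not describe the models Netflix actually uses, so this swaps in a much simpler technique, a rolling z-score over recent samples of a metric such as sign-up rate. The class name, window size, and threshold are all assumptions made for illustration.

```python
from collections import deque
from statistics import mean, stdev


class RollingZScoreAlert:
    """Flags a metric sample that deviates sharply from recent history.

    Hypothetical stand-in for the anomaly detection mentioned above;
    not the model the talk describes.
    """

    def __init__(self, window: int = 30, threshold: float = 3.0):
        self.samples = deque(maxlen=window)  # recent "normal" samples
        self.threshold = threshold           # z-score that triggers an alert

    def observe(self, value: float) -> bool:
        """Record a sample and return True if it looks anomalous."""
        anomalous = False
        if len(self.samples) >= 5:  # wait for a few points before judging
            mu = mean(self.samples)
            sigma = stdev(self.samples)
            # A perfectly flat baseline (sigma == 0) never alerts in this sketch.
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                anomalous = True
        if not anomalous:
            # Keep anomalies out of the baseline so one bad spike
            # does not mask the next one.
            self.samples.append(value)
        return anomalous


alert = RollingZScoreAlert()
for rate in [100, 101, 99, 100, 102, 98, 100]:
    alert.observe(rate)
if alert.observe(40):
    print("sign-up rate anomaly: page the on-call or trigger a correction")
```

A production system would also account for seasonality (time of day, day of week), which a plain z-score ignores.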

Case Studies

  • The first case study examines the evolution of Netflix's pricing technology choices, highlighting the challenges of scaling a simple in-memory library to support global expansion, multiple plans, and various domains.
  • The second case study discusses the importance of cataloging every modification made to records in their systems, ensuring data integrity and enabling them to answer questions about historical changes.
  • The third case study focuses on the journey of scale expansion, detailing how Netflix evolved its architecture to meet increasing demands and requirements.

Member History Service

  • The member history architecture initially relied on application-level events for tracing historical state and triggering actions.
  • A new service was created to capture the direct delta of changes on key operational data sources using a change data capture pattern.
  • Member History Service tracks and persists all membership rights updates in an append-only fashion, solving the split-brain problem and enabling powerful debugging and tracing capabilities.
  • Member History Service replaced all app-level events with a view on top of member history, providing a single interface for all other systems within Netflix to access membership data.
  • Member History Service uses Iceberg tables to persist events, enabling reconciliation with core membership systems and replay of events for downstream analytics.
  • During a data corruption incident, Member History Service made it possible to restore data integrity by replaying the history of the corrupted records.
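
The append-only and replay ideas above can be sketched in a few lines. This is a minimal in-memory illustration, not the actual service: the summary says events are persisted to Iceberg tables, whereas here a plain list stands in for storage, and the names `ChangeEvent`, `MemberHistory`, `append`, and `replay` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One captured delta on a membership record (hypothetical shape)."""
    seq: int                # position in the log, starting at 1
    member_id: str
    delta: Dict[str, Any]   # changed fields and their new values


class MemberHistory:
    """Append-only log of membership changes; events are never edited."""

    def __init__(self) -> None:
        self._events: List[ChangeEvent] = []

    def append(self, member_id: str, delta: Dict[str, Any]) -> ChangeEvent:
        event = ChangeEvent(len(self._events) + 1, member_id, dict(delta))
        self._events.append(event)
        return event

    def replay(self, member_id: str, up_to_seq: Optional[int] = None) -> Dict[str, Any]:
        """Rebuild a member's state by applying deltas in log order."""
        state: Dict[str, Any] = {}
        for event in self._events:
            if up_to_seq is not None and event.seq > up_to_seq:
                break  # the log is ordered, so nothing later qualifies
            if event.member_id == member_id:
                state.update(event.delta)
        return state


history = MemberHistory()
history.append("m1", {"plan": "basic", "status": "active"})
history.append("m1", {"plan": "standard"})
bad = history.append("m1", {"plan": None})  # simulated corrupt write

# Recover the last good state by replaying up to just before the bad event.
restored = history.replay("m1", up_to_seq=bad.seq - 1)
print(restored)
```

Because the log is never mutated, the same replay can answer "what did this membership look like at point X" for debugging as well as for recovery.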

Conclusion

  • Architectural choices can pay off significantly in the long run, as the Member History Service journey shows, so it is worth having the courage to invest in big bets.
  • The evolution of subscription ecosystems requires continuous innovation and investment in architecture to stay ahead of scalability challenges.
  • Pricing technology choices should be future-proofed to avoid reactive pivots.
  • The evolution of member subscriptions is an ongoing process, and there is always room for improvement and innovation.
