How Amazon’s Custom AI Chips Work | WSJ Tech Behind
Generative AI chips (0s)
- Interest in and discussion of generative AI have risen sharply over the past year.
- Demand for AI chips has risen sharply, with projections for the size of the AI chip market climbing from $150 billion to over $400 billion.
- Major tech companies are developing custom AI chips to run AI applications faster and more efficiently, and they see these chips as the future of technology.
Amazon’s chip lab (45s)
- Amazon designs its custom AI chips, named Inferentia and Trainium, at its chip lab in Austin, Texas, for use in AWS servers.
- The chips begin as wafers containing dice, which are individual chips with tens of billions of transistors.
- AI chips differ from CPUs in that they have many more cores that operate in parallel, letting them process large amounts of data simultaneously, for example when generating images of cats.
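
The effect of having many parallel cores can be illustrated with a toy model. This is only a sketch under simplifying assumptions: every work item is independent, each core handles one item per round, and the image size and core counts are hypothetical numbers chosen to show the trend, not figures for Inferentia, Trainium, or any real CPU.

```python
def rounds_needed(num_items: int, num_cores: int) -> int:
    """Rounds of work required if every core handles one item per round."""
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")
    # Integer ceiling division: the last round may be only partly full.
    return (num_items + num_cores - 1) // num_cores


# Hypothetical workload: one independent operation per pixel of a 1024x1024 image.
pixels = 1024 * 1024

# Hypothetical core counts, from CPU-like to accelerator-like.
for cores in (8, 1024, 65536):
    print(f"{cores:>6} cores -> {rounds_needed(pixels, cores):>7} rounds")
```

In this model the number of rounds falls in proportion to the number of cores, which is the intuition behind using highly parallel chips for image generation. Real hardware is also limited by memory bandwidth and by work that cannot be split up, which the model ignores.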
Breakdown of the tech (2m18s)
- AI chips need to be integrated into a package to function and are used in two key processes: training and inference.
- Training involves exposing an AI model to millions of examples to teach it to recognize patterns, whereas inference involves the model generating an original output like an image of a cat.
- Training requires tens of thousands of chips due to its complexity, while inference typically uses 1 to 16 chips.
- AI chips produce significant heat, so they are tested for reliability under controlled temperatures and rely on heat sinks for cooling.
- In Amazon's AWS cloud, the chips are mounted on servers designed to work together on the same task. When powering an AI chatbot, for example, CPUs and AI chips such as Inferentia2 work together to perform the large-scale computations and deliver results.
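
The difference between training and inference described above can be sketched in a few lines. This is a minimal illustration, not Amazon's software: a two-parameter linear model stands in for a large neural network, the examples are synthetic points on the line y = 2x + 1, and the function names `train` and `infer` are hypothetical.

```python
def train(examples, epochs=5000, lr=0.01):
    """Training: repeatedly adjust parameters to reduce error on the examples."""
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in examples:
            err = (w * x + b) - y          # prediction error on one example
            grad_w += 2 * err * x / n      # gradient of mean squared error w.r.t. w
            grad_b += 2 * err / n          # gradient of mean squared error w.r.t. b
        w -= lr * grad_w                   # gradient descent step
        b -= lr * grad_b
    return w, b


def infer(params, x):
    """Inference: apply the already trained parameters to a new input."""
    w, b = params
    return w * x + b


# Synthetic training examples following y = 2x + 1 (an assumption for this sketch).
examples = [(x, 2 * x + 1) for x in range(10)]

params = train(examples)       # expensive: many passes over all the examples
print(infer(params, 20))       # cheap: a single pass for one new input
```

Because the examples follow y = 2x + 1, the printed value should be close to 41 even though the input 20 never appeared in training. The cost asymmetry is also visible: `train` loops over every example thousands of times, while `infer` does one small calculation, which mirrors why training needs far more chips than inference.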