A Primer on AI for Architects with Anthony Alford

16 Sep 2024

Machine Learning Concepts

  • Architects need to understand AI and machine learning concepts to have intelligent conversations with their co-workers. (2m30s)
  • When most people talk about AI, they are referring to machine learning, specifically deep learning and neural networks. (3m21s)

Machine Learning Models

  • Software developers can think of machine learning models as functions that take complex inputs, such as images or audio, and produce complex outputs, such as transcripts or summaries. (4m20s)
  • Tensors are multi-dimensional arrays used in machine learning models. (5m57s)
  • Machine learning models are trained using a process called supervised learning, which involves providing the model with inputs and expected outputs, similar to unit tests in software development. (6m15s)
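
As a concrete illustration of the points above, here is a minimal sketch (not from the episode; it assumes PyTorch and uses arbitrary toy data): the model is a function from input tensors to output tensors, and supervised learning adjusts its parameters until the outputs match the expected outputs.

```python
import torch
import torch.nn as nn

# Toy "dataset": inputs and expected outputs, analogous to unit-test cases.
inputs = torch.randn(100, 4)            # 100 examples, 4 features each (a 2-D tensor)
expected = torch.randint(0, 2, (100,))  # expected class label (0 or 1) for each example

# The model: a function that maps a 4-dimensional input to 2 class scores.
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Supervised learning: compare the model's output to the expected output
# and nudge the parameters to reduce the difference.
for epoch in range(20):
    optimizer.zero_grad()
    predictions = model(inputs)          # call the model like a function
    loss = loss_fn(predictions, expected)
    loss.backward()
    optimizer.step()

print(model(torch.randn(1, 4)))          # output is a tensor of 2 class scores
```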

Language Models

  • Language models, such as the one behind ChatGPT, are trained on vast amounts of text data to predict the probability of the next word in a sequence (illustrated in the sketch after this list). (8m39s)
  • Large language models (LLMs) are characterized by having tens or hundreds of billions of parameters. (12m42s)
  • Hugging Face is a platform similar to GitHub, hosting and providing access to LLMs, including smaller models that can run on personal laptops. (13m39s)
  • LLMs utilize "tokens," which are units of text smaller than words, allowing them to generate novel words and phrases not found in a standard vocabulary. (17m2s)
  • Tokenization is a process that breaks down text into smaller units, typically larger than a character but smaller than a word. (17m54s)
  • Commercial LLM services such as ChatGPT and the OpenAI API meter usage and billing in tokens. (18m9s)
  • The "T" in GPT stands for Transformer, a neural network architecture that utilizes an "attention" mechanism to process and generate text. (20m59s)

Large Language Model Access

  • There are publicly available, commercial large language models (LLMs) such as GPT-4, ChatGPT, Claude, Google's Gemini, and offerings from AWS. These can be accessed through web-based APIs and integrated using SDKs (see the sketch after this list). (23m56s)
  • While using commercial LLMs can be cost-effective in the short term due to a pay-per-token model, long-term cost and privacy concerns may arise. (24m34s)
  • Open-source LLMs offer an alternative to commercial options, allowing for in-house implementation and greater control over data privacy. (25m19s)
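
A minimal sketch of the commercial route, assuming the openai Python SDK (v1+) and an OPENAI_API_KEY environment variable; the model name is illustrative, not a recommendation.

```python
from openai import OpenAI

client = OpenAI()  # reads the API key from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[
        {"role": "system", "content": "You are a concise assistant for software architects."},
        {"role": "user", "content": "In two sentences, what is a vector database?"},
    ],
)

print(response.choices[0].message.content)
# Billing for commercial APIs is typically metered per token:
print(response.usage.prompt_tokens, response.usage.completion_tokens)
```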

Large Language Model Characteristics

  • Large language models (LLMs) are non-deterministic: the same input will not always produce the same output. (29m5s)
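
A short sketch of that non-determinism, assuming transformers and PyTorch: with sampling enabled, the same prompt yields a different completion on each run.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = tokenizer("The role of a software architect is", return_tensors="pt")

for run in range(3):
    output = model.generate(
        **prompt,
        max_new_tokens=20,
        do_sample=True,      # sample from the token distribution instead of picking the max
        temperature=0.9,     # higher temperature -> more randomness
        pad_token_id=tokenizer.eos_token_id,
    )
    print(run, tokenizer.decode(output[0], skip_special_tokens=True))
```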

Retrieval Augmented Generation (RAG)

  • Retrieval augmented generation (RAG) is a technique that can improve the quality of LLM results by providing the model with relevant context from a knowledge base. (30m38s)
  • RAG works by converting documents and user queries into vectors, then finding the closest matching vectors to provide context to the LLM. (32m0s)
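
A minimal RAG sketch, assuming the sentence-transformers library; the embedding model name and the documents are illustrative. Documents and the query are embedded as vectors, the closest document is retrieved, and it is placed into the prompt as context.

```python
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative embedding model

documents = [
    "Our billing service retries failed payments three times before alerting on-call.",
    "The architecture review board meets every second Tuesday.",
    "Deployments to production require a green canary run in staging.",
]
doc_vectors = embedder.encode(documents, convert_to_tensor=True)

query = "How many times do we retry a failed payment?"
query_vector = embedder.encode(query, convert_to_tensor=True)

# Find the document whose vector is closest to the query vector.
scores = util.cos_sim(query_vector, doc_vectors)[0]
best = documents[int(scores.argmax())]

# The retrieved context is then handed to the LLM along with the question.
prompt = f"Answer using only this context:\n{best}\n\nQuestion: {query}"
print(prompt)
```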

Transfer Learning and Fine-tuning

  • Transfer learning is a machine learning technique used to pre-train a model for general purposes and then fine-tune it for specific tasks. (34m37s)
  • Fine-tuning continues the training process on a smaller dataset specific to the desired outcome, allowing the model's responses to be adjusted. (35m1s)
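
A minimal fine-tuning sketch, assuming transformers and PyTorch; the model name and the two-example dataset are illustrative. A pre-trained model is loaded and training continues on a small, task-specific dataset.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
# Pre-trained weights are reused; only the new classification head starts from scratch.
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

# Task-specific examples: label 1 = "architecture-related", 0 = "not".
texts = ["We should split this service along team boundaries.",
         "The cafeteria menu changes on Fridays."]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for epoch in range(3):
    optimizer.zero_grad()
    outputs = model(**batch, labels=labels)  # loss is computed against the expected labels
    outputs.loss.backward()
    optimizer.step()
```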

Vector Databases

  • Vector databases, often used in semantic or neural search, employ nearest neighbor search algorithms to efficiently find vectors similar to a given input vector, enabling the retrieval of related content based on meaning. (38m9s)
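
A brute-force sketch of the underlying idea in plain NumPy; a real vector database replaces this linear scan with an approximate nearest-neighbor index so it stays fast at scale.

```python
import numpy as np

rng = np.random.default_rng(0)
stored_vectors = rng.normal(size=(10_000, 384))   # e.g., embeddings of 10k documents
query_vector = rng.normal(size=384)

# Cosine similarity between the query and every stored vector.
norms = np.linalg.norm(stored_vectors, axis=1) * np.linalg.norm(query_vector)
similarities = stored_vectors @ query_vector / norms

# The ids of the 5 most similar vectors -> the most semantically related documents.
top_k = np.argsort(similarities)[-5:][::-1]
print(top_k, similarities[top_k])
```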

Large Language Model Applications

  • LLMs can be understood as tools for natural language processing tasks such as named entity recognition and part-of-speech tagging (see the sketch after this list). (40m6s)
  • LLMs are versatile: they can be fine-tuned for a specific use case, or a model can be chosen based on how closely it already matches the desired application, trading off cost, quality, and speed. (40m50s)
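
A minimal sketch of using an LLM for one of these tasks, named entity recognition, via prompting; it assumes the openai SDK, and the model name and output format are illustrative.

```python
from openai import OpenAI

client = OpenAI()

text = "Anthony Alford spoke about Hugging Face and AWS on a podcast."

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[
        {"role": "system",
         "content": 'Extract named entities as JSON: {"people": [], "organizations": []}.'},
        {"role": "user", "content": text},
    ],
)

# Expected to be a JSON object listing people and organizations.
print(response.choices[0].message.content)
```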

AI Co-pilots and AI Agents

  • A distinction is made between AI co-pilots, which require user initiation, and AI agents, which possess a degree of autonomy. (43m8s)
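
A purely illustrative sketch of that distinction; ask_llm and run_tool are hypothetical stubs, not a real framework. The co-pilot responds once when the user asks; the agent loops on its own, deciding at each step whether to call a tool or stop.

```python
def ask_llm(prompt: str) -> str:
    """Stub standing in for a call to an LLM."""
    return "DONE: stubbed answer"

def run_tool(action: str) -> str:
    """Stub tool executor (e.g., search, code execution)."""
    return f"executed {action!r}"

# Co-pilot style: the user initiates, the model responds once.
def copilot(user_request: str) -> str:
    return ask_llm(user_request)

# Agent style: given a goal, the system loops autonomously, deciding at each
# step whether to call a tool or finish.
def agent(goal: str, max_steps: int = 5) -> str:
    history = [goal]
    for _ in range(max_steps):
        decision = ask_llm("\n".join(history))
        if decision.startswith("DONE"):
            return decision
        history.append(f"tool result: {run_tool(decision)}")
    return "gave up"

print(copilot("Summarize this design doc"))
print(agent("Find and fix the flaky test"))
```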
