Mastering Observability: Unlocking Customer Insights with Gojko Adzic
19 Oct 2024
QCon San Francisco and Engineering Culture Podcast
- The upcoming QCon San Francisco conference will take place from November 18 to 22, where engineers will share how they implement innovations in real-world scenarios, exploring tracks like architectures, engineering productivity, and generative AI in production (10s).
- Shane Hastie is hosting the InfoQ Engineering Culture podcast, where he is joined by Gojko Adzic, a developer with over 12 years of experience in building software for himself and operating his own products (45s).
- Gojko Adzic started developing software at a young age by copying code from German magazines on a Commodore 64. Since then he has been building software for himself, which means doing everything from pre-sales, product management, support, and development to making coffee in the office (1m13s).
- Adzic has written nine books, mostly as a way to free up his memory. In his past life he did contract development and helped others improve their process (2m1s).
Gojko Adzic's Background and "Specification by Example"
- One of Adzic's notable books is "Specification by Example," which was driven by his anger towards a conference speaker who claimed that nobody actually uses Test-Driven Development (TDD) and Behavior-Driven Development (BDD) in the real world (2m50s).
- Adzic interviewed people with good processes and documented their names, companies, and experiences, resulting in a book that showed that many people and companies, including investment banks and startups, successfully use TDD and BDD in their projects (3m41s).
Reasons for Not Adopting TDD/BDD and Measuring Value
- Organizations may not adopt Test-Driven Development (TDD) and Behavior-Driven Development (BDD) because they have too much money and can afford to repeat the same processes over and over, with incentives focused on effort rather than actual value delivered to the market (4m25s).
- The most common metrics used to measure progress are related to effort, effectively measuring how much money is being spent rather than the value being delivered (5m13s).
- The product management community has been working on figuring out how to measure value and deliver valuable products, and prove that value is being delivered to the market (5m41s).
- Organizations may not prioritize high-quality software delivery because software is not a bottleneck for them, and they may pay lip service to the idea but not actually implement it (5m57s).
- Personal incentives, politics, and discomfort with communication can also prevent organizations from adopting TDD and BDD, as many people in the industry prefer working with computers to working with people (6m10s).
- The focus on effort can lead to people keeping themselves busy or being kept busy by organizations, making it difficult to pause and coordinate as a team to figure out what to do (6m45s).
- Investing more effort in things that should not be done in the first place is a form of waste, as noted by W. Edwards Deming (7m15s).
Measuring Value Delivery and User Behavior
- To ensure that their work is adding the most value, technologists can start looking at outcomes rather than effort, and consider models such as the value exchange system or value exchange loop presented by Melissa Perri in her book "Escaping the Build Trap" (7m37s).
- Many organizations struggle to track the value delivered by their products, and users often get value from products in ways that are difficult to measure (8m15s).
- Ron Kohavi co-authored a book called "Trustworthy Online Controlled Experiments," which shows that only about 30% of experiments at companies like Microsoft and Slack have positive results (8m31s).
- This means that around 70% of the work done by these companies may be wasted, and data from companies like Netflix suggests that up to 90% of their early efforts were unsuccessful (9m7s).
Improving Software by Eliminating Wasted Work
- This low success rate presents a massive opportunity for improvement, particularly in areas like integration testing and maintenance (10m3s).
- By identifying and eliminating work that does not deliver value, organizations can reduce rework, improve efficiency, and develop better products (11m10s).
Tracking User Behavior Changes and Value Delivery
- One way to measure value delivery is to track changes in user behavior, as this can indicate whether a solution is solving a problem or not (11m54s).
- If user behavior does not change or changes in a negative way, it may be a sign that the solution is not delivering value (11m31s).
- By focusing on behavior change, organizations can develop more effective solutions and improve their competitiveness (11m14s).
- Behavior changes are a model for value delivered to the market. They can be observed and measured soon after something is delivered, unlike profit, market share, and revenue, which only respond after a long delay (12m47s).
- Behavior changes are leading indicators that can be measured quickly, providing insight into whether a product is likely to win in the market, rather than just indicating whether it is winning (13m25s).
- Technical people can measure behavior changes using techniques similar to those used to measure system behavior, such as observability, tracing, distributed measurements, and combining logs and error measurements (13m37s).
- These techniques can be extended to measure user behavior, helping to determine if a product is actually helping users achieve their goals faster, easier, and sooner (14m3s).
- A trivial example of this is a payment form that required users to enter a tax ID, but the form's validation caused confusion and obstacles for users, particularly those from the European Union (14m21s).
- European Union countries have different tax ID formats, with some having a prefix, and users are not expected to know these details, making it important to simplify the payment process (14m58s).
- The payment provider's validation of the tax ID caused issues for users who did not enter the prefix, leading to errors and frustration (15m23s).
- Some users even selected a different country to avoid entering their tax ID, highlighting the need for a simpler payment process (15m40s).
- The issue was resolved by streamlining the form and removing the tax ID field, making it easier for users to complete the payment process (15m57s).
- A website was modified to ask users for a tax ID, which was thought to be a good idea, but it caused an even bigger problem as people were unable to enter the tax ID and were scared that their payment would not be processed (16m5s).
- By measuring behavior changes, it was discovered that the modification was causing more problems, and the idea was proven to be bad, highlighting the importance of measuring behavior changes to inform decision-making (16m33s).
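The measurement described above can be sketched as a comparison of payment completion before and after a form change. This is only an illustration: the event names (`payment_form_opened`, `payment_completed`), the `session` helper, and the sample data are hypothetical and do not come from the podcast.

```python
def completion_rate(events):
    """Share of sessions that opened the payment form and also completed payment."""
    opened = {e["session"] for e in events if e["name"] == "payment_form_opened"}
    completed = {e["session"] for e in events if e["name"] == "payment_completed"}
    if not opened:
        return 0.0
    return len(opened & completed) / len(opened)


def behavior_change(before, after):
    """Difference in completion rate; a negative value means the change hurt users."""
    return completion_rate(after) - completion_rate(before)


def session(session_id, completed):
    """Hypothetical helper that builds the events of one checkout session."""
    events = [{"session": session_id, "name": "payment_form_opened"}]
    if completed:
        events.append({"session": session_id, "name": "payment_completed"})
    return events


# Invented sample data: sessions recorded before and after a form change.
before = (session("b1", True) + session("b2", True)
          + session("b3", True) + session("b4", False))
after = (session("a1", True) + session("a2", False)
         + session("a3", True) + session("a4", False))

print(f"before: {completion_rate(before):.0%}, after: {completion_rate(after):.0%}")
print(f"change: {behavior_change(before, after):+.0%}")
```

A real comparison would need far more sessions and ideally a controlled experiment, but even a rough before/after view like this is available within days, long before revenue figures move.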
Understanding "Lizard" Behavior and Product Opportunities
- The concept of a "lizard" refers to users who do things that no rational human would do; their actions cannot be comprehended, yet they still do them, as described by Scott Alexander in his blog post about Phantom Lizardman (17m11s).
- Lizards are the percentage of users who are confused, distracted, or otherwise affected in ways that change their behavior, so their actions cannot be explained by rational human logic (18m21s).
- An example of a lizard behavior is people paying to convert blank PowerPoint presentations to videos, which makes no sense for a rational human, but it was discovered that they were using the text-to-speech function to extract an audio track (18m50s).
- By understanding the lizard logic, a product opportunity was discovered, as making an audio file is operationally cheaper than making a video, and a new product could be created to cater to this need (19m39s).
- A simple screen was created to allow users to upload Word documents and get an audio track, increasing product usage significantly as it made the process easier for users who were struggling, and this improvement benefited all users (19m49s).
"Lizard Optimization" and User Behavior Tracking
- The idea of "lizard optimization" involves making products better for users who are struggling in some way, which can be achieved by having analytics and observability built in to see what users are doing with the products (20m24s).
- Technical people in the industry have a massive opportunity to extend existing tools and figure out what unexpected things users are doing with their products (20m39s).
- Most analytics used by product people look for expected things like clicks, conversions, and purchases, but devops and observability tools can be used to look for unexpected things like exceptions, mismatches, and deviations (20m59s).
- These tools can be extended for user behavior tracking and to spot weird things that are happening, allowing for the extraction of signal from noise (21m40s).
Logging User Errors and Unexpected Behavior
- A simple thing that can be done is logging whenever a user sees an error screen or does something that causes an unexpected path in the application (21m54s).
- Many applications ignore this level of logging, but it can provide valuable insights into user behavior, such as when a user tries to select a file of an unknown type or a type that the software can't support (22m30s).
- For example, a text-to-speech system logs when users try to convert a JPEG file or an APK file, which can provide insights into unexpected user behavior (22m54s).
- People often try to upload incorrect file types, such as MP3 files, into a text-to-speech system, but occasionally, they attempt to upload files that make sense to support, like subtitle files, which are essentially text files with timestamps (23m26s).
- Allowing the upload of subtitle files led to complaints about the system reading timestamps, which were later removed, and then complaints about the speaking speed, revealing that users actually wanted to create alternate audio tracks for videos (23m51s).
- This feedback loop led to the development of a feature to create synchronized audio tracks when uploading subtitle files, which was already 99% functional, and this feature became highly profitable when a massive enterprise software company started using the system (24m50s).
- This example illustrates the importance of monitoring user errors and using this information as product management signals to make better products and provide more value to users (25m33s).
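One way to capture the signal described above is to log every rejected upload and count rejections by file extension. The sketch below is an assumption-laden illustration: the supported extensions, the logger name, and the file names are invented, and a production system would send these events to its existing observability tooling rather than an in-memory counter.

```python
import logging
from collections import Counter
from pathlib import PurePath

# Hypothetical list of formats the converter accepts.
SUPPORTED = {".txt", ".docx", ".pptx"}

logger = logging.getLogger("user_behavior")
rejected_extensions = Counter()


def check_upload(filename):
    """Return True for supported files; otherwise record the rejected extension."""
    ext = PurePath(filename).suffix.lower()
    if ext in SUPPORTED:
        return True
    rejected_extensions[ext] += 1
    # Logged as a product signal, not as an application error.
    logger.info("unsupported_upload extension=%s", ext or "(none)")
    return False


# Invented sample of upload attempts.
for name in ["talk.pptx", "photo.jpeg", "app.apk", "movie.srt", "episode2.SRT", "song.mp3"]:
    check_upload(name)

# The most frequently rejected extensions are candidates for a closer look.
for ext, count in rejected_extensions.most_common(3):
    print(ext, count)
```

Most rejected types will be noise, but a recurring extension such as a subtitle format can point to a need the product could serve.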
Five Stages of Growth and Product Evolution
- The five stages of growth, a concept from the book "Lean Analytics" by Alistair Croll and Benjamin Yoskovitz, describe how products evolve and what people should focus on at each stage: empathy, stickiness, virality, revenue, and scale (26m14s).
- The empathy stage is about determining if a product is solving a good problem and finding a good solution, while the stickiness stage is about proving there is a market for the product (26m43s).
- Understanding these stages can help technologists and operations people provide valuable signals to improve products and make better decisions (25m47s).
- When developing a product, it's essential to determine if there's a market for it by checking if people use it frequently and if the product is sticky, meaning users stay and come back, indicating potential market demand (27m7s).
- In the early stages, iterating around different user groups can help find the right market, as seen in the example where people were initially building videos but later shifted to building more audio files when the product was repositioned (27m30s).
- The product-market fit stage is crucial, where the product is used extensively by the target market, and the next stage is growth, which involves proving that the market can be reached through various means such as advertising or word of mouth (27m58s).
- The growth stage is not just about virality but also about finding the right growth engine to reach the target customers (28m4s).
- After growth, the revenue stage involves building a sustainable business model, which may require optimizing how the product makes money (28m24s).
- The final stage is scale, which involves consolidating business operations, optimizing profit, and capturing more value from the market (28m36s).
- The value exchange loop consists of delivering value to the market and capturing value from the market, with the first two stages focusing on delivering value and the last two stages focusing on capturing value (28m49s).
- As engineers, it's essential to focus on the right stages and not optimize for performance or scalability too early, as this can be a waste of time and resources if the product is not yet profitable or market-ready (29m11s).
Consequences of Not Prioritizing Stickiness
- Adzic's experience as CTO of a company about 16 years ago highlights the importance of understanding these stages: the company failed to focus on making the product sticky and instead jumped into marketing too early (29m56s).
- The consequences of not optimizing the stickiness of the product can be severe, as seen in the example where only 1 or 2 out of every 1000 people who came to the website came back, indicating a lack of market demand (30m34s).
- A startup invested its entire marketing budget into bringing a million people to its website, but only 2,000 to 3,000 people stayed due to the product not being sticky, resulting in the company not doing well afterwards (30m40s).
- The outcome would have been different if the company had focused on making the product sticky before bringing in a large number of users (31m13s).
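The figures above work out to a return rate of roughly 0.2% to 0.3% (2,000 to 3,000 out of 1,000,000 visitors). A minimal sketch of how such a rate might be computed from a visit log follows; the definition of "returning" as visiting on more than one distinct day, and the sample data, are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


def return_rate(visits):
    """Fraction of users who visited on more than one distinct day.

    `visits` is an iterable of (user_id, day) pairs.
    """
    days_by_user = defaultdict(set)
    for user, day in visits:
        days_by_user[user].add(day)
    if not days_by_user:
        return 0.0
    returning = sum(1 for days in days_by_user.values() if len(days) > 1)
    return returning / len(days_by_user)


# Invented sample: only user "a" comes back on a later day.
visits = [
    ("a", date(2024, 1, 1)), ("a", date(2024, 1, 3)),
    ("b", date(2024, 1, 1)),
    ("c", date(2024, 1, 2)), ("c", date(2024, 1, 2)),  # same day twice, not a return
    ("d", date(2024, 1, 2)),
]
print(f"return rate: {return_rate(visits):.0%}")

# The arithmetic from the example in the text.
print(f"{2_000 / 1_000_000:.1%} to {3_000 / 1_000_000:.1%}")
```

Tracking a number like this before spending on acquisition shows whether new visitors are likely to stay.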
Using Growth Stages and Guardrail Metrics
- The five stages of growth are used to decide when to focus on a particular aspect and when to stop: the stage a product is in determines what to optimize, while another stage supplies the guardrail metrics (31m34s).
- Guardrail metrics are used to ensure that certain aspects, such as operational costs, do not grow out of proportion while focusing on growth (31m53s).
- When scaling, it's essential to avoid damaging stickiness, and guardrail metrics can help determine if changes are having a negative impact (32m27s).
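A guardrail check can be sketched as a comparison of a current metric value against a baseline with an allowed tolerance. The class, metric names, baselines, and tolerances below are hypothetical; the podcast does not describe a specific implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guardrail:
    name: str
    baseline: float          # value measured before the change; must be positive
    tolerance: float         # allowed relative worsening, e.g. 0.10 for 10%
    higher_is_better: bool   # True for return rate, False for cost


def is_breached(guardrail, current):
    """Return True if the metric worsened by more than the allowed tolerance."""
    if guardrail.baseline <= 0:
        raise ValueError("baseline must be positive")
    if guardrail.higher_is_better:
        worsening = (guardrail.baseline - current) / guardrail.baseline
    else:
        worsening = (current - guardrail.baseline) / guardrail.baseline
    return worsening > guardrail.tolerance


# Invented guardrails for a team that is currently optimizing growth.
stickiness = Guardrail("return_rate", baseline=0.30, tolerance=0.10, higher_is_better=True)
cost = Guardrail("cost_per_user", baseline=0.50, tolerance=0.20, higher_is_better=False)

print(stickiness.name, is_breached(stickiness, current=0.25))
print(cost.name, is_breached(cost, current=0.55))
```

In this invented data the return rate dropped by about 17% against a 10% tolerance, so the stickiness guardrail trips, while the 10% cost increase stays within its 20% allowance.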
Problem Frames and Solution Frames
- A viable solution is one that solves a problem, and measuring behavior changes and value is more important than measuring the product itself (32m41s).
- The concept of problem frames, as described by the software engineering researcher Michael Jackson, involves understanding the problem frame and the solution frame and how they interact (33m3s).
- Software developers often focus on the solution frame, but it's essential to consider the problem frame to evaluate whether the solution is sensible (33m31s).
- Adzic can be found by searching online for his unusual name, although a footballer with the same name may appear in the results first (33m53s).
- Searching for his name on Google works even in Serbia (34m15s).
- He appreciates the opportunity to talk and thanks the host for taking the time to speak with him (34m22s).