AI is fixing – and ruining – our photos | The Vergecast
09 Oct 2024
- The key to Camera Control on the iPhone 16 is that it acts when you let go of the button, not when you press it (3s).
- A decision has been made to take fewer photos and videos on the phone, especially of life memories, to avoid ruining the memory of the moment (18s).
- The goal is to have photos without taking them in a way that negatively impacts the experience, and to avoid mindless phone use, such as checking fantasy teams (26s).
- There are many camera options available, including old phones, such as old Pixels and iPhones, and other devices like the DJI Osmo Pocket and the Sony Alpha 6000 (45s).
- Instead of buying a new camera, such as the Fuji X100, consideration is being given to repurposing one of the existing gadgets, like the Humane AI Pin or the Rabbit R1 (1m15s).
- Alternative options, like using an old camera or buying disposables, are also being considered to change the way the camera is used (1m40s).
Exploring Camera Options (1m43s)
- The topic of discussion is photos, specifically how smartphone cameras have become more AI-based, changing the idea of what happens when the shutter button is pressed and what the resulting photo looks like (1m54s).
- The show will explore this topic by talking to the creators of the popular camera app Halide, discussing their new feature called "Process Zero" which removes Apple's processing from iPhone photos, resulting in lower quality images that people surprisingly like (2m32s).
- The team will also discuss an AI photo experiment conducted by Allison Johnson, where she explores how AI changes the way photos are taken, edited, and remembered (2m45s).
- A hotline question unrelated to photos will also be addressed later in the show (3m0s).
- The host had to pause the show to find a micro USB cable to charge a Sony camera, which led to a humorous discovery of having 12 extra micro USB cables (3m29s).
- The show will continue with a discussion about the Halide app and its features (3m42s).
Deep Dive into Halide (3m47s)
- The iPhone camera has undergone significant changes over the years, not only in terms of technology and design but also in its intended purpose and functionality (3m47s).
- Ben Sandofsky and Sebastian de With, co-founders of Lux Optics, have been developing camera apps, including Halide and Kino, and exploring what kind of camera app people need now given the advancements in smartphone technology (4m9s).
- The pronunciation of the app name Halide has been a topic of discussion, with Ben Sandofsky alternating between "Halide" and "Highlight," and the team eventually settling on "Halide" (4m54s).
- The team has had to deal with pronunciation issues in the past, including Apple's keynote team having trouble with the name "Kino" due to its similarity to "keynote" (6m9s).
- The conversation touches on the philosophical implications of building a camera app, with the team realizing that it involves making decisions about the universe and how people perceive and interact with photography (7m6s).
- The discussion aims to explore the changes in iPhone photography, particularly with the introduction of Process Zero, and the team's thoughts on what a camera app should be and do (6m42s).
- The initial development of the app Halide began in 2014 with a few lines of code, but it wasn't until 2016 that significant progress was made, including features like focus peaking and manual value adjustments. At that time, most photography apps were simple interfaces for Apple's APIs, offering limited control over the final image. (7m13s)
- In 2017, the release of the iPhone X marked a shift in computational photography with the introduction of features like Smart HDR, which involved complex image processing to enhance photo quality. This period also saw the emergence of night mode and other advanced features that transformed photography into a data science, allowing small sensors to produce high-quality images. (8m20s)
- By 2020, there was a convergence of advanced computational photography tools, raising questions about the necessity of third-party apps like Halide as built-in camera apps improved significantly. However, Halide's developers, including Sebastian, continued to focus on raw photography, which allows users to capture sensor data and process images independently in apps like Lightroom. (9m53s)
- The advancement in smartphone camera technology has enabled users, including those who don't know much about photography, to take high-quality photos with just a tap of a button, thanks to improved hardware and software capabilities (10m28s).
- The term "raw" in photography has become a topic of discussion, with some people suggesting that shooting in raw format can improve image quality, but this can be misleading as raw files require processing to produce a usable image (10m52s).
- The suggestion to shoot in raw format is often made in response to users expressing dissatisfaction with their smartphone camera, but this can be a "horrible suggestion" as it results in large files that require processing to achieve a desired look (11m7s).
- Apple's ProRAW system allows users to generate a file that can be processed to turn off certain features like Ultra HDR, but this is not the same as true raw format (11m26s).
- The difference between native raw and ProRAW lies in the level of processing applied to the sensor data, with native raw being the unprocessed data and ProRAW being a partially processed file (11m52s).
- Native Raw refers to the unprocessed sensor data, which contains voltage values from light converted to numbers, and requires an algorithm to interpret the data and produce a usable image (12m16s).
- Apple's ProRAW is a partially processed file that allows users to make some adjustments, but it is not the same as native raw and is more like a "half-baked cake" that still requires processing to produce a final image (12m46s).
- ProRAW is designed to make it easier for users to work with raw files, but it can be confusing for those who are not familiar with the technology, and the name "ProRAW" can be misleading (12m58s).
- The concept of a "raw photo" is also misleading, as a raw file is not a photo in itself, but rather the unprocessed data that requires processing to produce a usable image (13m48s).
- A raw file from a camera is essentially a giant spreadsheet of sensor data that must be processed to create a displayable image. In a ProRAW file this processing step is already done, producing an image that carries separate information for tone mapping, white balance, sharpening, and noise reduction (13m58s).
- The processing required to create an image from a raw file involves combining data from the sensor, and this process can be compared to film photography, where the negative would be developed and printed on paper, with techniques like dodging and burning used to recover dynamic range (14m1s).
- Apple's processing of images involves algorithms that can adjust tone mapping, white balance, and sharpening, and noise reduction is baked into the process, but this can result in an image that lacks authenticity and imperfection, which some people prefer (14m7s).
- The concept of a digital negative is similar to a film negative, and the processing of this negative can be compared to the development of a film photograph, where the photographer would make adjustments to create the final image (15m11s).
- Ansel Adams, a famous landscape photographer, used analog techniques to adjust the dynamic range of his photographs, and Apple's algorithms can make similar adjustments, but with limited options for the user (15m37s).
- Some people prefer the look of images that have not been heavily processed, and this preference for authenticity through imperfection is an interesting phenomenon, especially given the advanced processing capabilities of modern smartphones (16m40s).
- The perception that iPhone cameras are bad is a common feeling among people, despite the fact that the cameras are capable of producing high-quality images with advanced processing techniques (16m54s).
- The quality of photos has improved over time, but people seem to like them less, creating a disconnect between quality and preference (17m2s).
- The iPhone has a tendency to remove noise and shadows from photos, which can make them look unnatural and overly processed (17m12s).
- The idea that imperfections in photos are desirable is gaining traction, with some people preferring the look of older, less polished photos (17m26s).
- As camera hardware has improved, software has become more opinionated about how a photo should look, often resulting in an incongruous or unnatural appearance (17m38s).
- There is a growing desire for more choice and flexibility in photo editing, rather than being limited to a single, overly processed look (18m2s).
- Process Zero is described as a rebranded raw image format that keeps noise and detail rather than aiming for a polished, noise-free appearance (18m11s).
- The value of noise in photos is a matter of opinion, with some people seeing it as "analog dithering" that adds character and detail to an image (18m36s).
- The challenge of balancing noise reduction with image detail is a complex one, and there is no one-size-fits-all solution (18m48s).
- The trend towards overly processed, HDR-style photos is not universally popular, with some people preferring a more natural, film-like aesthetic (19m0s).
- As camera technology continues to evolve, the question of how to balance image quality with artistic intent will become increasingly important (19m27s).
- The line between a "real" and "fake" looking image is blurry, and will require careful consideration as photo editing software becomes more advanced (19m36s).
- The homogenization of camera styles, with many phones producing similar-looking images, is a result of designing cameras for the greatest common denominator (20m2s).
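The description above of a raw file as a spreadsheet of sensor numbers that needs an algorithm before it becomes a picture can be illustrated with a small sketch. This is not Halide's or Apple's pipeline. The function name `develop`, the RGGB sensor layout, the white level of 1023, the white balance gains, and the gamma value are all assumptions chosen for illustration.

```python
def develop(raw, white_level=1023, wb_gains=(2.0, 1.0, 1.5), gamma=2.2):
    """Turn a grid of raw sensor counts into 8-bit RGB pixels.

    Assumes (hypothetically) an RGGB Bayer layout: every 2x2 block of the
    grid holds one red, two green, and one blue sample.
    """
    height, width = len(raw), len(raw[0])
    image = []
    for y in range(0, height - 1, 2):
        row = []
        for x in range(0, width - 1, 2):
            # Collapse each 2x2 block into a single RGB pixel.
            r = raw[y][x]
            g = (raw[y][x + 1] + raw[y + 1][x]) / 2
            b = raw[y + 1][x + 1]
            pixel = []
            for value, gain in zip((r, g, b), wb_gains):
                # Normalize to 0..1, apply the white balance gain, clip highlights.
                linear = min(1.0, value / white_level * gain)
                # Apply a simple gamma curve as a stand-in for tone mapping.
                encoded = linear ** (1 / gamma)
                pixel.append(round(encoded * 255))
            row.append(tuple(pixel))
        image.append(row)
    return image


# A tiny made-up 2x4 "spreadsheet" of sensor counts.
raw = [
    [100, 400, 120, 420],
    [380, 150, 400, 160],
]
image = develop(raw)
```

Changing `wb_gains` or `gamma` produces a different picture from the same numbers, which is the sense in which a raw file is not yet a photo: someone, or some software, still has to choose how to interpret it.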
AI Photography Experiment with Allison Johnson (40m25s)
- The Vergecast is sponsored by 1Password, a password management service that combines industry-leading security with user-friendly design, allowing users to securely manage their passwords and switch between devices with ease (40m25s).
- 1Password offers a free two-week trial for listeners at onepassword.com/vergecast, providing a secure solution for businesses to manage their passwords and protect against data breaches (41m22s).
- The Vergecast is also sponsored by Grammarly, an AI-powered writing tool that helps users communicate clearly and efficiently, reducing miscommunication and unnecessary meetings (41m40s).
- Grammarly integrates with over 500,000 apps and websites, making it a trusted writing partner for 70,000 teams and 30 million people (42m28s).
- Allison Johnson, a reviewer for The Verge, conducted an experiment to see what life would be like if she fully leaned into the AI capabilities of her phone camera, using all the features and upgrades available (42m47s).
- The experiment aimed to explore what would happen if she didn't have any existential crises about AI photos and cameras and just bought into the technology (43m14s).
- Allison Johnson joined the Vergecast to discuss the results of her experiment and share her experiences with using AI-powered phone cameras (43m26s).
- The experiment involved using a phone camera to capture memories, not just photos, and editing them using Google Photos' AI tools, specifically the "Reimagine" tool, to enhance the memories (44m14s).
- The goal was to live out Google's definition of a camera being for memories, not just photos, and to see how AI can be used to make memories more special (46m30s).
- The process involved taking a thousand pictures of a child over a week, then editing them using Google Photos' AI tools to remove distractions, add elements to the scene, and make the memories feel more magical (44m33s).
- The editing process was uncomfortable for the person conducting the experiment, as they normally don't edit phone photos and prefer to rely on computational photography (46m55s).
- The person had to clear their mind and focus on what the moment felt like, rather than what was actually in the photo, to decide how to edit it (47m9s).
- The editing process involved removing distractions, such as people in the background, and adding elements to the scene to make it feel more special (47m15s).
- The person also imagined what could be added to the scene to make it feel more magical, such as removing snot from a child's nose or changing their expression (47m47s).
- The experiment was designed to test the idea that editing a photo can change the way we remember the moment it was taken, and to see how AI can be used to enhance those memories (45m40s).
Ethical Dilemmas (48m0s)
- The process of editing photos using AI can make one think differently about the kinds of photos they take and how they set up to take those pictures, potentially changing their approach to photography (48m2s).
- The use of AI in editing can also make one question whether they need to take a certain picture in the first place, as they can potentially fix or enhance it later (49m2s).
- The experience of using AI in editing can be similar to forcing oneself to take a photo when they normally wouldn't, with the intention of fixing it later, which can lead to taking more shots (49m16s).
- This approach can be seen in the behavior of a "classic parent photographer," who takes multiple photos in hopes of capturing the perfect moment, even if it doesn't cooperate (49m49s).
- The editing process typically involves selecting the best photo from a set and making minor adjustments, such as cropping, straightening, and adjusting saturation, before sharing it on social media (50m31s).
- When editing photos using AI, one may enter a headspace where they consider the memory or feeling behind the photo and try to enhance or recreate it (50m42s).
- The use of AI in editing can also lead to a more relaxed approach to photography, where one doesn't worry as much about getting the perfect shot, knowing that they can fix it later (48m55s).
AI's Role in Memory and Photography (50m45s)
- The discussion explores the philosophical and emotional aspects of editing personal photos using AI, highlighting the discomfort and reflection involved in altering memories. (50m46s)
- A specific example is given where a diaper bag was removed from a photo of a husband holding a child, demonstrating how generative AI can easily edit out elements without leaving noticeable traces. (52m20s)
- The edited photos were shared on Instagram, leading to feelings of unease about potentially deceiving friends, as the changes made the images materially different from the original scenes. (53m27s)
- The tools used for editing included Google Photos, even for photos taken on an iPhone, and the edits included adding elements like birds in the sky, which were not originally present. (53m2s)
- A notable example of AI editing involved a photo of a child in a park, where the changes were significant enough to be described as "cheesy," illustrating the extent of transformation possible with AI tools. (54m6s)
- A photo was edited to include a dinosaur in the background, which was intentionally added to be humorous and obvious, and although it's not a well-made dinosaur, it gives the photo a fun, Jurassic Park-like vibe (54m17s).
- Another photo was edited to include a flock of birds above a person's head while they're running in a field next to industrial machinery, making it a lovely but potentially AI-generated shot (54m53s).
- The birds in the second photo were actually added using AI, and there were no birds in the original photo, with the editor drawing inspiration from a nearby iron bird sculpture that flapped its wings when a pulley was pulled (56m2s).
- The editor added the birds to capture the feeling of the moment, which was a happy and beautiful day outdoors, but doesn't think the edited photo will be printed or displayed, despite the birds making the photo cooler to look at (56m40s).
- The addition of the birds helps communicate the feeling of the photo, but it's the person's smile and running pose that does most of the work in conveying the emotion, and the editor finds it fascinating that the AI edit didn't improve their personal connection to the photo (57m30s).
- The use of AI to edit photos raises questions about the purpose of the edits, whether it's to make the photo look more like what's in the person's mind or to portray something on social media that looks more ideal (57m56s).
- Professional photography has a set of expectations, such as capturing the right moment and light, and AI tools may be intended to help achieve this without being a professional photographer (58m19s).
- However, there's a limit to what can be recreated with AI, and some things need to be present in the original photo, such as the right light and a genuine moment (59m6s).
- A professional photographer would know how to capture a moment when the subjects are not screaming or making a sad face, and the memory of the day is more important than the imperfections in the photo (59m31s).
- The idea of editing out imperfections, such as a child's snot or a scratch on their face, raises questions about where to draw the line between acceptable and unacceptable edits (59m50s).
- The generative AI in Google Photos does not allow edits to be made to people, but the "Best Take" feature can be used to pick the desired expression from a burst of photos (1h0m41s).
- The effectiveness of this feature is limited when there's only one subject, but it can be useful when there are multiple people in the scene (1h1m11s).
- The decision to make minor edits, such as cleaning up boogers, can lead to questions about where to stop and how much editing is acceptable (1h1m27s).
- Editing photos with AI can lead to a slippery slope, where one might start by removing minor distractions but end up removing essential context, resulting in a bland picture that lacks character (1h1m40s).
- Overly doctored photos can make it seem like the scene is happening in a vacuum, devoid of people or context, which can be unnatural and weird (1h2m34s).
- Removing people from a photo can make it seem like the location is empty, which can be unrealistic, such as a beach on a nice day (1h2m43s).
- Using AI to edit photos can also lead to the removal of important details, making the scene seem unnatural or even post-apocalyptic (1h2m59s).
- In one example, a photo of Len playing with a truck on a table was edited to remove a blurry person in the background, but the editor struggled to decide what else to remove without making the photo worse (1h3m18s).
- The same photo had a glass of milk added to it using AI, which was convincingly realistic, but also distracting and not part of the original scene (1h3m26s).
- The glass of milk was added to replace an orange juice bottle, which the parent thought was too unhealthy to be in the photo (1h4m12s).
- The parent used AI to add the glass of milk to make the scene seem more wholesome, but it also changed the context of the photo (1h4m37s).
- Another photo of Len in a Jeep in a park was edited to remove a building in the background, which was not offensive but did not fit with the desired aesthetic (1h4m56s).
- The process of editing photos using AI can make them look generic and less authentic, losing the original context and details of the moment they were taken (1h5m7s).
- This raises questions about how people will think about and remember these edited photos over time, and whether they will forget that certain elements were added or removed (1h5m27s).
- Changing photos can alter how people remember the events they depict, and this can be a complex and philosophical issue (1h5m54s).
- Removing people from photos can also change how people remember experiences, and this can be a difficult decision to make when editing photos (1h6m10s).
- For some people, Google Photos serves as a journal, and editing photos can be seen as altering memories and potentially losing important details (1h6m25s).
- The decision to edit photos and remove or add elements can be seen as lying to one's journal, and this can be a strange and uncomfortable feeling (1h7m23s).
- Figuring out what details are important to remember and what can be altered is a slippery slope, and it's difficult to know what will be significant in the future (1h7m35s).
- Making decisions about what to edit and what to keep in photos can be uncomfortable, even when it's useful, and it's a complex issue to navigate (1h8m25s).
- The ability to edit photos using AI can also change the way people take photos, with some people taking more photos in the hopes of finding a good one that can be edited later (1h8m40s).
- The same tools used to make photos more realistic can also be used to make them less realistic, and it's up to individuals to decide if they're okay with that (1h14m57s).
- The capabilities of photo editing technology are not getting worse anytime soon and will likely continue to improve rapidly (1h15m17s).
- A disastrous example of photo editing was when an attempt was made to remove people from a photo of a waterfall in Iceland, resulting in a boring photo, and the people were later put back in (1h15m31s).
- Samsung makes the Moon look beautiful in photos so users don't have to worry about editing it (1h16m13s).
- Miro's Innovation workspace is a platform that encourages collaboration and curiosity, and it comes loaded with AI tools to make operations more efficient (1h17m4s).
- Miro's platform can produce AI-generated summaries of meeting notes, lengthy documents, and research summaries, and it functions seamlessly on a limitless blank canvas (1h17m25s).
- Life360 is a family connection and safety app that makes it easy to keep track of family members, including their location, and provides features such as crash detection, roadside assistance, and SOS with emergency dispatch (1h17m57s).
- Life360 is a service that provides location history and low battery alerts, allowing users to have peace of mind with their loved ones, and is offering a promotion for one month of the gold package for free with code "Verge" (1h18m39s).
- Toyota is promoting their Crown family of vehicles, which come with quality, reliability, and bold elegance, and are available with a Hybrid MAX powertrain and a bi-tone exterior finish (1h18m57s).
- The show is sponsored by Toyota and Life360 (1h18m54s).
- A listener, Zach from Las Vegas, is considering spending a few hundred dollars on a new gadget, either the Pixel Buds Pro 2 or the Meta Ray-Ban smart glasses, and is seeking advice on which one to purchase (1h20m15s).
- The Meta Ray-Ban smart glasses are a new class of tech that incorporates sunglasses and speakers for music, but the listener is hesitant to add a new assistant or AI to their life since they are already invested in the Google ecosystem and Pixel products (1h20m40s).
- The question of whether the Meta Ray-Ban smart glasses can be a useful multi-purpose exercise accessory is discussed, and it is noted that the show has received this question multiple times before (1h21m23s).
- The host and a guest, V, have both tried and are fans of the Shokz OpenRun headphones, which are a type of bone conduction headphones (1h21m36s).
- The host is wearing the Shokz OpenRun Pro 2 headphones for running and loves them, having previously used the Beats Fit Pro (1h21m58s).
- AfterShokz (now Shokz) bone conduction headphones have decent bass and good sound quality for bone conduction, using a combination of air conduction speakers and bone conduction, but may not be suitable for those who want strong bass (1h22m23s).
- The headphones are secure but may not be comfortable for people with narrow ear-to-head space, especially when wearing glasses, due to the crowded airspace (1h23m6s).
- Meta Ray-Ban smart glasses are great for consolidating sunglasses and audio, suitable for running in quiet areas, but may not be effective in loud environments, such as near highways (1h23m32s).
- In terms of running in loud environments, traditional headphones are the best option, followed by bone conduction, and then Meta Ray-Ban smart glasses, which may not be able to block out loud noises (1h24m5s).
- The Meta Ray-Ban smart glasses are not ideal for running races, as they may not be able to block out loud music and may cause distraction (1h24m22s).
- The glasses are heavier and larger than regular sunglasses, which may be a concern for some users, especially for long runs (1h25m4s).
- For runs of 5K or less, the weight of the glasses may not be a significant issue, but for longer runs, especially in hot and humid weather, the sweat and weight may cause discomfort (1h25m30s).
- The Bose Tempo glasses are suitable for individuals with a low nose bridge, but they may slip due to sweat and sunscreen during intense activities like running (1h26m3s).
- For long-distance runners, the glasses' weight and battery life might be concerns, but for casual runners, these issues are less likely to be a problem (1h26m21s).
- The glasses are not ideal for loud environments, such as big races, as they may not provide sufficient sound isolation (1h27m55s).
- Fit is a personal aspect, and it is recommended to try the glasses before purchasing to ensure a comfortable fit (1h27m38s).
- The Bose Tempo glasses are a smart option for running, especially in sunny conditions, but they may not be suitable for loud environments or big races (1h27m45s).
- Smart glasses from sport brands, such as Oakley, are likely to be the next step in wearable technology, potentially offering a more convenient and stylish option for athletes (1h28m28s).
- Bose has already developed a similar product, the Bose Frames Tempo, but it did not manage to capitalize on the market demand for such a product (1h29m2s).
- The discussion starts with a personal anecdote about running and music preferences, mentioning Stray Kids, a K-pop group, as a favorite to run to due to their energetic and electronic music style, which provides a good beat to run to without being too distracting or bubblegum-like (1h29m39s).
- The importance of a strong beat in running music is emphasized, as it helps to maintain motivation and pace, and K-pop is preferred because the lyrics can be easily ignored (1h30m1s).
- A recommendation is made to try the Meta Ray-Bans as a fun gadget for running, as they are enjoyable to use and can be taken on a test run to see if they work well (1h30m52s).
- If the Ray-Bans do not work out, Shokz bone conduction headphones are highly recommended, as they are popular among running influencers on TikTok and have received positive reviews for their functionality and comfort (1h31m17s).
- The latest version of the Shokz OpenRun has been praised for its performance in various activities, including running, dog walks, and pushing a stroller, and has become a popular choice among serious runners (1h31m29s).
- The durability of Shokz is also highlighted, as they have been shown to work well even in rainy conditions, unlike the Meta Ray-Bans, which may not be as reliable in such weather (1h32m13s).