The confusing state of Apple Intelligence | The Vergecast

25 Oct 2024

Intro (0s)

The Vergecast podcast is hosted by David Pierce, who had a tumultuous week due to the "jelly scrolling" scandal, but was later vindicated (10s).
Richard Lawler, another host, is unaware of the "jelly scrolling" scandal and doesn't pay attention to Apple drama (27s).
Richard Lawler jokingly offered a Bitcoin to listeners who vote for The Vergecast, though the offer is a gag about a "quantum locked" Bitcoin (57s).
Richard Lawler pointed out that companies like Starbucks and Nike, which heavily promoted NFTs a few years ago, are now struggling, but not because of NFTs (1m15s).
The theory is that these companies had a problem in their boardroom, where executives were not focused on what customers wanted, and NFTs were a symptom of this problem (1m45s).
The issue is that these companies think selling nothing, or experiences behind digital tokens, is a functional business, which is not the case (2m0s).
The hosts will discuss this topic further, as well as other topics, in the podcast (2m28s).

Apple's iPad Mini Review: Disappointment Unveiled (2m31s)

A review of the iPad Mini was conducted, which was described as a disappointing experience (2m32s).
The review process was likened to a "hostage situation" for the reviewer, David (2m37s).
Apple is set to release iOS 18.1, but it is expected to be followed by iOS 18.2, which will contain the actual features (2m44s).
The release of iOS 18.1 and the subsequent iOS 18.2 is considered confusing (2m53s).
A series of Mac announcements is expected to take place over the course of a week, with Greg Joswiak currently working on the coverage (2m56s).
There have been various AI-related news stories, including some humorous updates from Humane (3m0s).
The discussion will also include a lightning round of topics (3m5s).

Apple's Upcoming Announcements and AI News (3m6s)

The iPad Mini was recently updated with a new chip, but the update was seen as minimal, with some considering it to be the "literal least" Apple could have done, and this was viewed as disappointing (3m6s).
The updated iPad Mini supports Apple Intelligence and has 8 gigs of RAM, but the update was not seen as sufficient, with the device feeling like it was "designed by a supply chain, not by a designer" (3m27s).
The iPad Mini is considered a self-selecting device, meaning that people who want one will buy it regardless of the updates, which may be why Apple did not try to make it better (4m32s).
The device is still good for what it is, but it is not as good as it could be, and the lack of effort from Apple is seen as a missed opportunity (5m3s).
The old iPad Mini will likely still be available for sale at a discount, and the only meaningful difference between the old and new models is the support for Apple Intelligence (5m48s).
The iPad Mini is often used as a secondary device, and the features that Apple Intelligence provides may not be as important for this type of device (6m3s).
The need for Apple Intelligence on the iPad Mini is not seen as a major concern, as people are unlikely to use it as a primary device for tasks such as writing emails or using Siri (6m15s).
The update to the iPad Mini has highlighted the tension between what the iPad is and what people want it to be, with some feeling that it is not living up to its potential (5m29s).
The current state of Apple Intelligence does not justify purchasing any Apple product, and the iPad Mini needs it least of all at the moment (6m46s).
The iPad Mini uses the A17 processor from the iPhone, rather than the M4 processor used in other tablets and laptops, and benchmarks slightly below the iPhone 15 Pro (7m17s).
The A17 processor in the iPad Mini has one fewer GPU core than the one in the iPhone 15 Pro, suggesting it may be a binned chip, that is, one that came out of manufacturing without all of its cores working (7m33s).
This practice of using imperfectly manufactured versions of better processors is common among companies, but it is unusual for a brand-new product to use a worse version of last year's chip (7m47s).
Apple's emphasis on the need for high performance to run Apple Intelligence, combined with the upgraded performance of other iPads, makes the iPad Mini's processor seem like the "worst of the worst" (8m8s).
The use of a lower-performance processor in the iPad Mini, despite its premium branding, has been likened to putting a Cadillac logo on a Chevy or a Pontiac (8m16s).

The Jelly Scrolling Controversy Revisited (8m28s)

The iPad Mini has effectively no competition in the Android market, and it has the iPad app library, which doesn't exist on the Android side (8m35s).
The iPad Mini has a notorious issue with "jelly scrolling," where one side of the screen moves faster than the other when scrolling, causing a warped effect (9m45s).
Jelly scrolling is a problem that some people can see, while others can't, but once it's noticed, it's difficult to ignore, similar to motion smoothing (9m57s).
A 2021 review of the iPad Mini by Marques Brownlee highlighted the issue of jelly scrolling and provided a test to check for it (10m14s).
Apple initially downplayed the issue in 2021, stating it was a normal problem that didn't require concern, but the company has since attempted to fix it (10m52s).
The 2024 iPad Mini has improved jelly scrolling compared to the 2021 model, but the issue still exists, as confirmed by a reviewer who spent 90 minutes comparing the two devices (11m22s).
The reviewer initially questioned their own observation of jelly scrolling in the 2024 iPad Mini, but after re-checking, they confirmed that the issue still persists (12m7s).
The discussion revolves around the scrolling issue on Apple devices, specifically the iPad Mini, where the right side of the screen appears to pull down and then up faster than the left side when scrolling quickly, creating a "jelly" effect (13m1s).
This issue is compared to the 2021 model, which had a more pronounced "jelly" effect, but the new model is considered "meaningfully better" (12m44s).
The issue is noticeable when scrolling quickly on websites with a lot of white boxes and images, such as the Amazon homepage (12m53s).
Not everyone notices the issue, and some reviewers didn't mention it in their reviews, but once it's pointed out, it's hard to ignore (13m34s).
The author received mixed reactions to their review, with some people agreeing and others disagreeing, but eventually, some people came around to seeing the issue (13m48s).
The author believes that the "jelly" effect is not a deal-breaker and that people should not avoid buying the device because of it (13m59s).
The issue is more noticeable when the device is not strapped to the body, such as when walking or using it in a pocket (14m17s).
The author mentions that learning to evaluate screens can be a negative experience, as it can make people more critical of minor issues (15m14s).
Apple has lined up a series of announcements for the upcoming week, including the launch of Apple Intelligence, a week of Mac announcements, and the release of iOS 18.1, which may cause confusion among casual users due to the sheer amount of information being released at once (15m43s).
iOS 18.1 is launching out of beta on Monday, featuring some Apple Intelligence capabilities, such as help with writing and message summaries, although some of these features may not be entirely useful (17m1s).
iOS 18.2 is currently in developer beta and is expected to bring more significant changes, but the exact release timeline is unclear, with some speculating it may be released between now and spring (17m47s).
The upcoming week of Mac news is expected to include announcements about new Macs, possibly including M4 MacBook Pros, a new Mac Mini with a smaller case, and other Mac models, generating significant headlines (18m19s).
The release of iOS 18.2 is seen as the real iOS 18, with the current iOS 18.1 being a precursor to the more substantial update (18m11s).
The Apple Intelligence features, including those showcased in iPhone 16 ads, have been criticized for being disingenuous and not entirely representative of the actual capabilities of the technology (16m40s).
Some of the Apple Intelligence features, such as message summaries, have been found to be humorous but not particularly useful, with examples including a Ring notification being summarized as "5 to 10 people are at your door" (17m22s).
Apple Intelligence is now available, but the features people will have on their phones are minimal, including message summaries, a new Siri animation, and a "help me write" feature (19m1s).
The actual features people want, such as alternative browsers and message clients, arrive in iOS 18.2, which is currently in developer beta and includes big sweeping changes to iOS (19m27s).
iOS 18.2 also includes features like visual intelligence, Genmoji, and ChatGPT integration, which are the features people thought would be available from the beginning (19m50s).
Genmoji is expected to be a big deal and might be sticky, as it is the next step after tapback emojis, which have become normalized in messaging (20m15s).
Tapback emojis have become a good piece of messaging UI, and the hosts wish they were available in email, where they would let users signify receipt of an email without needing to respond (20m42s).
A neutral emoji or a "check" emoji is desired to signify receipt of an email without expressing any further reaction (21m30s).
The raised hands emoji is sometimes used to acknowledge receipt of a message, but its meaning can be misinterpreted (21m52s).
Genmoji and ChatGPT are the features that people are expected to care about, but it remains to be seen if ChatGPT will be successful (22m34s).
iOS 18.2 is expected to have significant changes, including the introduction of Apple Intelligence features, which may take longer to develop than other features, and could be available in a developer beta for an extended period (22m45s).
The update will allow users to set default phone and messaging apps, which is a major change to iOS, and will also introduce different browser engines in the EU, a huge change to iOS (23m21s).
The new browser engine feature will enable users to have a different browser engine, rather than just a different interface on top of Safari, allowing for more features and capabilities (23m49s).
The update will also bring small features to various apps, such as categories in the mail app, on-device AI scanning for child safety, and new features in iMessage, which may be more noticeable than the Apple Intelligence features (24m17s).
The real promise of Apple Intelligence is a smart Siri that can perform tasks, but this is still far from being realized (25m14s).
The ability to change default messaging apps could be significant, but it depends on whether SMS and RCS in the US will be supported on these apps (25m55s).
Different browser engines have been allowed for nine months, but no major browsers have been built with them yet, raising questions about the feasibility of this feature (25m36s).

Anthropic's New AI Update (42m35s)

Anthropic has released a new update to its chatbot, Claude, which is a competitor to GPT-4o and Llama, and it can now use a computer to perform tasks for the user (42m36s).
The update was demonstrated in a video where a researcher asked Claude to pull data from a spreadsheet and put it into an email, and Claude was able to click around, search for the data, and complete the task (42m55s).
The demo showed Claude's ability to track its actions and use the computer to pull data from various sources, including vendor software (43m20s).
Although the demo is not a shipping product, it is available in the API for other developers to build on top of, and someone has already built an app that can perform similar tasks on a Mac (43m34s).
Anthropic is productizing its AI better and faster than its competitors, making it an interesting development in the AI industry (43m44s).
The technology is similar to what Rabbit is doing with its large action model, which also clicks around web pages to perform tasks, but Anthropic's version is more elegant (43m58s).
The AI industry is still in its early stages, and it is unclear whether these technologies can be used to do anything useful, but companies like Anthropic and Rabbit are making progress (44m22s).
Anthropic's success in productizing its AI is attributed to its head of product, Mike Krieger, and the company's ability to hire great product people, such as Kevin Weil (44m36s).

The Future of AI Assistants (44m54s)

The concept of Apple Intelligence is often associated with building the ultimate Google Assistant or "super Siri," which can understand and respond to user queries, but current AI technology is limited in its ability to comprehend actual language, requiring specific queries to be written and matched (44m55s).
The solution to this limitation seems to be using computers to perform tasks, with AI solving problems, but this approach is still in its early stages (45m45s).
The Anthropic video provides an interesting breakdown of how this technology works by taking screenshots, using OCR to extract information, and performing simple tasks to achieve a goal (45m52s).
The key to this technology is breaking down complex tasks into simple, discrete steps that can be explained to a computer, but this is still a challenging task (46m15s).
Now that computers can understand language and images, the next step is to figure out how to feed them this information to perform tasks (46m39s).
The concept of turning the universe into a series of screenshots for AI to process is similar to how autonomous vehicles work, taking pictures and reacting to them (47m5s).
Many people building AI products use the self-driving car metaphor, which is a similar problem and challenge to overcome (47m35s).
The Claude demo showed the AI asking itself to generate code to solve a problem, demonstrating a recursive ability that is not possible with self-driving cars (48m2s).
This recursive ability allows Claude to solve problems by opening a web browser, asking itself for an answer, and providing a solution (48m12s).
Benchmarks in the world of AI are considered too early to be meaningful, but they do show that Claude scores around 7% on a test where humans score 70%, which is twice as good as the next nearest thing (48m23s).
The idea that a computer can figure out how to use itself and generically use any computer to reason its way to a solution is powerful, as seen in the suggestion to use Claude to solve a problem (49m0s).
The industry needs to come to grips with the idea that programming computers is no longer the primary approach, as APIs are becoming the best way to use services like Spotify (49m16s).
There is a shift towards a place where tasks are accomplished by telling the computer what to do and having it do it deterministically, rather than relying on traditional programming methods (49m31s).
The future of AI companies may rely on robots taking an infinite number of screenshots of a Windows desktop and clicking on stuff to support their billion-dollar valuations, but this approach is uncertain and potentially broken (49m34s).
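The screenshot-driven loop described in this section (look at the screen, read it, take one small action, repeat) can be sketched in a few lines of Python. This is only an illustrative toy, not Anthropic's implementation: FakeDesktop, read_screen, and run_agent are hypothetical names, the "screenshot" is plain text rather than pixels, and a string parser stands in for OCR.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDesktop:
    """Hypothetical stand-in for a real screen: a spreadsheet and an email draft."""
    spreadsheet: dict = field(default_factory=lambda: {"Vendor": "Acme", "Total": "1200"})
    email_body: str = ""
    focused: str = "spreadsheet"

    def screenshot(self) -> str:
        # A real agent would capture pixels; here the "screenshot" is plain text.
        if self.focused == "spreadsheet":
            rows = "\n".join(f"{k}: {v}" for k, v in self.spreadsheet.items())
            return "WINDOW spreadsheet\n" + rows
        return "WINDOW email\nBody: " + self.email_body

    def click(self, target: str) -> None:
        # Clicking a window name brings that window to the front.
        self.focused = target

    def type_text(self, text: str) -> None:
        # Typing only has an effect while the email window is focused.
        if self.focused == "email":
            self.email_body += text


def read_screen(image: str) -> dict:
    """Stand-in for OCR: turn the screenshot text into a window name and fields."""
    lines = image.splitlines()
    window = lines[0].split(" ", 1)[1]
    fields = {}
    for line in lines[1:]:
        if ": " in line:
            key, value = line.split(": ", 1)
            fields[key] = value
    return {"window": window, "fields": fields}


def run_agent(desktop: FakeDesktop, wanted: str, max_steps: int = 10) -> list:
    """Copy one spreadsheet field into the email, one discrete step per screenshot."""
    log = []
    remembered = None
    for _ in range(max_steps):
        seen = read_screen(desktop.screenshot())  # observe
        if seen["window"] == "spreadsheet" and remembered is None:
            remembered = seen["fields"][wanted]   # step 1: read the value
            log.append(f"read {wanted}")
        elif seen["window"] == "spreadsheet":
            desktop.click("email")                # step 2: switch windows
            log.append("click email")
        elif seen["fields"].get("Body", "") == "":
            text = f"{wanted}={remembered}"
            desktop.type_text(text)               # step 3: type into the email
            log.append(f"type {text}")
        else:
            log.append("done")                    # the goal is visible on screen
            break
    return log


if __name__ == "__main__":
    desktop = FakeDesktop()
    for entry in run_agent(desktop, "Total"):
        print(entry)
```

Each pass through the loop decides on exactly one action from what is currently visible, which mirrors the point above about breaking a task into simple, discrete steps. In a real system the hard-coded rules in run_agent would be replaced by a model choosing the next action.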

Humane's AI Ambitions (49m56s)

Apple is working on a new version of Siri, which is expected to be more agentic, allowing developers to open up their apps to Siri's capabilities, potentially giving Apple a significant advantage in the AI space (49m57s).
Apple's control over its operating system and apps gives it an edge in building agentic technology, as it can do more API layer work and has more access to structured data (50m30s).
Humane, a company working on AI technology, has cut the price of its AI pin from $699 to $499, and is still offering a $24 a month subscription, as it continues to develop and ship its products (52m8s).
Humane is also working on an operating system called Cosmos, which it plans to license to other companies, similar to the webOS story, in an effort to become the AI operating system of choice (52m43s).
There are several companies, including Amazon and Google, that are also vying to become the dominant AI operating system, with Amazon planning to spend a significant amount of money to promote Alexa as a potential solution (53m21s).
Apple's ability to tell developers to give it access to their data and have them comply is a significant advantage, as it would allow Siri to focus on understanding user requests and figuring out how to fulfill them, rather than having to solve complicated AI problems (51m15s).
Netflix is one of the few companies that has resisted Apple's requests for integration, and it remains to be seen whether Apple will be able to convince other developers to open up their APIs to Siri (51m31s).
Companies like Humane are working on integrating AI into their products, with the goal of making most of their money by baking their technology into other products, which is considered a significant win if successful (53m37s).
Humane's Cosmos OS replaces traditional applications with "agents" and adds more blocks to understand user requests, but this is a complex, industry-wide project that cannot be easily shipped to other companies like HP to put in a car (54m12s).
Despite the introduction of agents, traditional applications are still needed, and the easiest way to access services like music is often through established APIs, as seen in the partnership between Rivian and Apple Music (54m54s).
The idea of agents replacing traditional computer interfaces is confusing, and it's unclear why companies would give up on established methods just to license their technology (55m32s).
AI systems like Siri could potentially replace some functions of Android Auto or CarPlay over time, but it's unlikely to immediately overthrow the entire ecosystem (55m51s).
Agents may be useful and powerful in certain instances, but the idea that they will immediately revolutionize the current ecosystem is incorrect (56m10s).
Companies like Humane are betting on becoming the controllers of all agents, but there are currently few agents, and they are not very good, and no one needs them yet (56m31s).
Even if agents work perfectly, they may still require users to click around and use them like traditional interfaces, which can be frustrating when errors occur (56m53s).
The idea of having a computer that can be more honest and transparent about its limitations and issues is appealing, such as saying "sorry bro" when a connection is not working, rather than providing generic responses like Siri does (57m19s).
The current state of AI agents and chatbots is unclear, and it seems that the business imperative behind investing in these technologies is not yet fully understood (57m59s).
The potential applications of AI, such as chatbots, image generators, and disinformation, may not be the primary drivers of investment in this technology, but rather the possibility of creating an all-powerful robot that can perform tasks for users (58m3s).
The idea of a robot that can use a computer for a user, fill out spreadsheets, and perform other tasks is seen as a potential business opportunity, but it is unclear if this technology can be successfully developed (58m38s).
The concept of talking to a computer and having it perform tasks is an appealing one, but there are doubts about whether current technology can deliver on this promise (58m57s).

Copyright vs AI: The Legal Battles (59m5s)

Perplexity, a company that uses AI models trained on copyrighted information, is facing lawsuits for copyright infringement, including a recent lawsuit from News Corp, which owns The Wall Street Journal and the New York Post (59m18s).
Perplexity has been accused of infringing on copyrighted content, with some publications, such as Wired, calling it a "content thief" and pointing out that it has stolen stories about itself (59m42s).
The New York Times has also sent a cease and desist letter to Perplexity, alleging copyright infringement, and has an ongoing lawsuit against OpenAI (1h0m11s).
Vox Media, the company behind The Verge, has signed a deal with OpenAI, as have other publishers such as The Atlantic, which has developed OpenAI-powered tools on its website (1h0m21s).
The foundation of the AI industry is built on alleged copyright infringement, with many creators, including Kevin Bacon and Kate McKinnon, speaking out against the use of their work without compensation (1h1m8s).
There are differing opinions on the issue, with some seeing it as a major crime and others arguing that it is simply a computer reading and summarizing information (1h1m41s).
The future of AI assistants, such as Alexa and Siri, is built on the foundation of these AI models, which are currently the subject of unresolved copyright lawsuits (1h1m56s).
The AI industry is moving forward despite these unresolved lawsuits, with some seeing it as a major risk and others as a necessary step towards progress (1h2m14s).
Multiple companies, including OpenAI and Anthropic, are currently trying to raise large amounts of money, with OpenAI having raised the biggest round of funding ever, indicating a sense that they will need vast amounts of cash to build and sell their products, protect themselves from lawsuits, and make their technology lucrative (1h2m23s).
These companies are simultaneously promising a great future and preparing for potential problems, creating a bizarre dynamic where they are trying to make a case for the technology's potential to be big and lucrative in order to fund the fight to get there (1h3m5s).
Microsoft and OpenAI are giving money to newsrooms to create more AI content, which is seen as a bizarre inversion of the situation: the companies either pay for content or take it for free, and then pay again for newsrooms to make AI stuff. The announced total is $10 million, including $5 million in software credits (1h3m15s).
The credits are not the same as money, and the situation is complex, with everything running as fast as possible in mutually exclusive ways, but it feels like if the money gets big enough, it will be okay (1h3m48s).
There is a concern that if the money gets big enough, judges in copyright cases will say that the companies should pay the people whose work they used without permission, as the value transfer doesn't make sense, and the companies should take some of the money they raised and pay it to the people whose work they need to make their product (1h4m15s).
OpenAI and other companies may try to build a big enough war chest to buy out or wait out their problems, including tech, regulatory, and fights with publishers, but there is a wild card that a judicial decision could knife through the whole industry (1h5m9s).
The companies are also being attacked from the output side, with accusations of copyright infringement and questions about who is responsible for what these tools do, including a lawsuit against Character AI and Google (1h5m44s).
There are many unanswered questions about the legal responsibility of AI tool providers when users interact with their generative AI tools, such as who is responsible for the consequences of the interaction and what safety measures should be implemented (1h6m1s).
The lack of clear rules and regulations for AI tools is a significant problem, especially when companies advertise these tools as therapy tools, implying a level of responsibility for the outcomes (1h6m30s).
A statement signed by 11,500 people, including artists and creatives, claims that the unlicensed use of creative works for training generative AI is a major unjust threat to the livelihoods of the people behind those works and must not be permitted (1h7m4s).
The statement is seen as a mild response to the issue, with some arguing that it does not go far enough in addressing the problem (1h7m29s).
The issue of AI-generated content and its potential impact on the livelihoods of creators is a pressing concern, with some arguing that it is already too late to address the problem (1h7m56s).
OpenAI has raised $7 billion in funding, which has been criticized as excessive and potentially problematic (1h8m5s).
Companies that make AI tools, such as Google and Apple, are slowly starting to add labels and metadata to their work to address concerns about AI-generated content (1h8m19s).
Google is open-sourcing its watermarking tool for AI-generated text, and Google Photos will show AI-generated metadata in photos (1h8m34s).
Apple's Craig Federighi has stated that he wants photos to be photos, not fantasy, and that the company is working to show more information in photos (1h8m54s).
Despite these efforts, many argue that the companies are not doing enough to address the issue of AI-generated content and its potential impact on society (1h9m4s).
Companies are slow to develop and implement AI technology due to the need for extensive testing, hardening, and ensuring it works seamlessly within their products, which is in contrast to the rapid development of other tools that may contribute to problems in the first place (1h9m40s).
The development of AI technology is being taken more seriously than other tools, with companies being cautious and taking their time, whereas other teams within the same companies are quickly developing and releasing other products (1h10m2s).
The "fire team" at Google is mentioned as an example of a team that quickly develops and releases products, with one of the buildings in Mountain View being referred to as the "just add gun building" (1h10m13s).
The Boox Palma 2, a smartphone-sized e-reader, has been released, with a price tag of $280, which is considered too expensive (1h12m7s).
The Boox Palma 2 fixes some issues with the previous model, including being slow and running an outdated version of Android, but the version of Android it runs is still not current (1h12m23s).
The device has the same size, screen, RAM, and storage as the previous model, with the upgrade being described as gentle (1h12m45s).
Boox is described as a company that releases a wide range of products, similar to Dyson, but for e-ink devices, with its website offering e-ink devices in various dimensions (1h13m8s).
The Boox Palma 2, an e-ink device with a fan base, is a second revision, suggesting the product is successful and meaningful to people (1h13m17s).
A version of the device launched in China does not have a camera, which is seen as a positive feature for security and privacy reasons (1h13m39s).
The global version of the device still has a camera, despite some users not using it, and it is suggested that removing the camera could reduce the price (1h14m1s).
The device's thesis is that the camera can be used for scanning and digitizing, but it is argued that this feature is not necessary (1h14m20s).
The device's combination of size, e-ink screen, and ability to run Android apps is seen as a positive feature (1h14m38s).
Panos Panay, the former Microsoft executive who now runs devices at Amazon, refused to look at the device on camera, which was seen as upsetting (1h14m49s).
The device's price is seen as a problem, with $280 being too much for a device that may not be used frequently (1h16m35s).
The Kindle Paperwhite is mentioned as a comparison, with its lower price point making it a more attractive option for some users (1h14m59s).
The Boox Palma 2's press photo is mocked for featuring a person who appears uninterested in the device (1h15m7s).
Despite the device's flaws, it is acknowledged that it is good that devices like this exist, and that someone should make an Android device with an e-ink screen (1h15m48s).

The Click-to-Cancel Rule Controversy (1h20m59s)

The Federal Trade Commission (FTC) has passed the "Click to cancel" rule, which requires companies to make it as easy to cancel a subscription as it was to sign up, by allowing customers to cancel with just one click (1h21m1s).
The rule aims to address the issue of negative option contracts, where companies make it difficult for customers to cancel their subscriptions, and is seen as a positive move for consumers (1h21m21s).
However, several companies, including ISPs, wireless carriers, and advertisers, have sued to stop the rule, arguing that the FTC is overstepping its authority and trying to regulate consumer contracts across all industries (1h22m6s).
The companies involved in the lawsuit include Comcast, Charter, Cox, Disney, AMC, Paramount, Warner Bros. Discovery, Google, Netflix, Amazon, Meta, and the NFL, among others (1h22m26s).
The lawsuit argues that the FTC's rule-making authority is being used to regulate consumer contracts in a way that is not authorized by law, and that the rule is an overreach of the agency's powers (1h23m13s).
The FTC's rule-making authority has been impacted by a recent Supreme Court decision that overturned the concept of Chevron deference, which allowed courts to defer to agency decisions (1h24m7s).
The lawsuit is seen as an attempt by companies to avoid making it easier for customers to cancel their subscriptions, and is viewed as a negative move for consumers (1h24m36s).
The current state of streaming TV has essentially reinvented cable, but with the added benefit of easier cancellation, which is considered a significant victory despite the often worse user experience and similar costs (1h24m52s).
Many subscription services, such as The Wall Street Journal, make it difficult for users to cancel their subscriptions, often requiring a phone call and a conversation with a representative (1h25m21s).
This issue is not unique to The Wall Street Journal, as many services, including gyms, make it hard for users to cancel their memberships, sometimes requiring in-person visits (1h25m55s).
The difficulty in canceling subscriptions can lead to frustration and a sense of being trapped in unwanted memberships (1h26m10s).
The Vergecast, a podcast, jokingly suggests that listeners who want to cancel their subscription have to call a specific number and explain why they want to cancel, but in reality, it is not possible to cancel a podcast subscription in the same way as other services (1h26m21s).
The Vergecast is a production of The Verge and the Vox Media Podcast Network, and is produced by Liam James, Will Poor, and Eric Gomez (1h26m46s).