How We Should Prevent ‘Sextortion’ Scams on Snapchat
Issie Lapowsky, reporting for Fast Company:
In the excruciating hours after her 17-year-old son Jordan DeMay was found dead of an apparent suicide in March of 2022, Jennifer Buta wracked her brain for an explanation.
“This is not my happy child,” Buta remembers thinking, recalling the high school football player who used to delight in going shopping with his mom and taking long walks with her around Lake Superior, not far from their Michigan home. “I’m banging my head asking: What happened?”
It wasn’t long before Buta got her answer: Shortly before he died, DeMay had received an Instagram message from someone who appeared to be a teenage girl named Dani Robertts. The two began talking, and when “Dani” asked DeMay to send her a sexually explicit photo of himself, he complied. That’s when the conversation took a turn.
According to a Department of Justice indictment issued in May 2023, the scammer on the other end of what turned out to be a hacked account began threatening to share DeMay’s photos widely unless he paid $1,000. When DeMay sent the scammer $300, the threats continued. When DeMay told the scammer he was going to kill himself, the scammer wrote back, “Good. Do that fast.”
The sorrowful story of DeMay’s death is tragically not unique. Regular readers will know I’m typically against placing the onus of protecting children on the platforms where people communicate rather than on parents, but this is a lone and important exception. Stopping heartless scammers from extorting children and manipulating them sexually is a separate conundrum, one that should be investigated and solved by governments and law enforcement. But the suicide issue — what makes a few pixels on a smartphone screen turn into a deadly attack — is solely on the platform owners to deal with. There is a lot of content on the internet, and only some of it is capable of driving an innocent child to their death. Platforms need to recognize this and act.
The truth is that platforms know when this deadly communication occurs, and they have the tools to stop it. Even when messages are end-to-end encrypted — which Snapchat direct messages aren’t — the client-side applications can identify sexual content, and even the intent of the messages being sent, via artificial intelligence. This is not a complicated content moderation problem: If Snapchat or Instagram identifies an unknown stranger telling anyone that they need to pay money to stop their explicit images from being shared with the world, the app should immediately educate the victim about this crime, tell them they’re not alone, and explain how to stay safe. It might sound like unnecessary friction, but this is an emotional debate, not one that requires much logic. Someone in a sound state of mind knows, logically, that suicide is worse than having nude images leaked, but people driven to the brink need a reality check from the platform they’re on. This is a psychological issue, not a logical one.
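To make the mechanics concrete, here is a minimal sketch — purely hypothetical, not how Snap or Meta actually implement anything — of what a client-side gate could look like. The regex patterns and thresholds are stand-ins for the on-device machine learning classifier a real platform would use:

```python
import re

# Hypothetical stand-in for an on-device ML model that scores a message
# for extortion intent. A real system would use a trained classifier, not regex.
EXTORTION_PATTERNS = [
    r"\bpay\b.*\b(or|else)\b.*\b(share|post|send|leak)\b",
    r"\b(send|pay)\b.*\$\d+",
    r"\bi (will|'ll) (share|post|leak|send) (your|ur) (photos?|pics?|nudes?)\b",
]

def extortion_score(message: str) -> float:
    """Fraction of known extortion patterns the message matches (0.0 to 1.0)."""
    text = message.lower()
    hits = sum(1 for pattern in EXTORTION_PATTERNS if re.search(pattern, text))
    return hits / len(EXTORTION_PATTERNS)

def should_show_safety_interstitial(message: str, sender_is_new_contact: bool) -> bool:
    """Decide whether to show a 'you are not alone' resource screen before the message."""
    threshold = 0.3 if sender_is_new_contact else 0.6  # be stricter with strangers
    return extortion_score(message) >= threshold

if __name__ == "__main__":
    msg = "Pay me $500 or else I will share your photos with everyone you know"
    print(should_show_safety_interstitial(msg, sender_is_new_contact=True))  # True
```

Because a check like this runs entirely on the device, it works even for end-to-end encrypted chats — which is the point: the platform never needs to read messages server-side to intervene.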
In addition to showing a “You’re not alone” message when such content is identified, regardless of the ages of both parties in a conversation, platforms can and should intelligently prevent these images from being shared. Snapchat tells a user when another person has taken a screenshot of a chat, so why can’t it tell someone when an image they’ve shared has been saved? And why can’t someone disallow the saving or screenshotting of the photos they’ve sent? How about asking the sender for permission every time a recipient wants to save a photo? Adults who work for and use these social media platforms will scoff at such suggestions, saying the prompts are redundant and cumbersome for adult users who are already aware of the risks of sending explicit pictures online, but false positives are better than suicides. There should be a checkbox that lets people always allow the photos they send to be saved automatically, but that checkbox should come with a disclaimer educating users on the risks of sextortion scams.
Education, prompts, alerts, and barriers to simple tasks are usually dismissed as friction in the world of technology, but they shouldn’t be. When content on a screen drives someone to end their life, education is important. Prevention matters more than direct action because, oftentimes, action is impossible: these criminals create new accounts as soon as they’re done with their last victim, and tracking them down is nearly impossible. Snapchat on Tuesday announced new features to prevent minors from talking to people they don’t know, but this won’t prevent any deaths — children lie about their age to get access to restricted services. The solution to this epidemic is not ostracizing the youngest users of social media; it’s educating them and giving them the tools to protect themselves independently.
Further reading: Casey Newton for Platformer; the National Center for Missing and Exploited Children; Chris Moody for The Washington Post; and Snapchat’s new safety features, via Jagmeet Singh for TechCrunch.
Debunking E.U. Claims About Apple Violating the DMA
From the European Commission’s press release:
Today, the European Commission has informed Apple of its preliminary view that its App Store rules are in breach of the Digital Markets Act (DMA), as they prevent app developers from freely steering consumers to alternative channels for offers and content.
In addition, the Commission opened a new non-compliance procedure against Apple over concerns that its new contractual requirements for third-party app developers and app stores, including Apple’s new “Core Technology Fee”, fall short of ensuring effective compliance with Apple’s obligations under the DMA.
Dan Moren, writing for Six Colors:
At the root of this decision is the EC’s contention that Apple is overly limiting the way developers are allowed to send potential customers to their own storefronts. That includes both the actual design restrictions of external links, as well as Apple’s fee structure (the company takes a cut of any digital good or service up to seven days after the customer follows the external link). Such moves would seem to be in violation of the DMA regulation that developers can advertise and direct users to their own sites without cost.
So, two problems:
- The commission doesn’t like Apple’s “scare screens,” the prompts that discourage users from accessing and downloading third-party app marketplaces and external payment processors. I surmise this is the main issue the commission has with Apple’s implementation, knowing its vibes-based approach to regulation.
- The commission also doesn’t like the 10-to-17 percent¹ cut Apple takes when a developer has opted into the new financial terms and distributes their app on the App Store with an alternative payment processor. Apple has two sets of terms: the old ones, which only allow developers to operate on the App Store and use In-App Purchase, and the new ones — called the “Alternative Terms Addendum” — which allow developers to operate in third-party app marketplaces and use alternative payment providers. Per these new terms, when an app is distributed in a third-party marketplace, a per-download Core Technology Fee applies; when an app is distributed on the App Store, a per-purchase commission applies. (A rough worked example of how these fees stack up follows this list.)
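To put rough numbers on those two fee mechanisms, here is a back-of-the-envelope sketch based on my reading of Apple’s published E.U. terms, under which the commission is 17 percent (10 percent for small developers) and the Core Technology Fee is €0.50 per first annual install beyond one million. The developer and figures below are hypothetical, and Apple’s schedule has further wrinkles, like an optional 3 percent fee for using Apple’s own payment processing:

```python
def core_technology_fee(first_annual_installs: int) -> float:
    """€0.50 per first annual install beyond the first million (per Apple's
    published E.U. terms as of mid-2024; rates and thresholds could change)."""
    return max(0, first_annual_installs - 1_000_000) * 0.50

def app_store_commission(digital_sales_eur: float, small_business: bool = False) -> float:
    """Commission on digital sales for an app distributed on the App Store under
    the Alternative Terms Addendum, whether paid via IAP or an external processor."""
    rate = 0.10 if small_business else 0.17
    return digital_sales_eur * rate

# Hypothetical developer: 2.5 million first annual installs, €4 million in sales.
installs, sales = 2_500_000, 4_000_000.0
print(f"Core Technology Fee: €{core_technology_fee(installs):,.0f}")  # €750,000
print(f"Commission:          €{app_store_commission(sales):,.0f}")    # €680,000
```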
Speaking of the CTF:
Simultaneous to this decision, the EC has also announced a new non-compliance investigation, its third into Apple. This action specifically looks into Apple’s developer terms in the EU, including alternative app stores and distribution methods. At the heart of this matter are three issues: whether the process for users taking advantage of alternative app distribution is too onerous, whether Apple is too restrictive in its eligibility terms (such as the rule that developers must be “of good standing” to qualify), and the existence of the Core Technology Fee.
Again, vibes-based regulation. The DMA doesn’t actually prohibit Apple from being restrictive in its terms; it just requires “gatekeepers” to allow third-party app marketplaces at all. It also doesn’t rule out the possibility of a per-download fee like the CTF, but because European regulators simply don’t like it, they’re able to launch another one of their investigations. And the legislation certainly doesn’t define what an “onerous” requirement might be because, again, it doesn’t even contemplate the possibility. The commission can’t possibly levy a fine for violating a law that doesn’t exist.
About that second snag Apple was found “guilty” of: As Moren notes, the DMA does tell gatekeepers that they must allow developers to link out to their own payment processors “free of charge,” which is exactly what Apple allows them to do when they opt into the new terms, although the steps for ditching the fee are more convoluted. When a developer opts into the Alternative Terms Addendum, Apple takes a commission of 17 percent for each external, non-IAP purchase — but that commission is for App Store distribution access; it is not a royalty for linking to a third-party payment processor. The DMA says that “the gatekeeper shall allow business users, free of charge, to communicate and promote offers…” The “free of charge” clause applies to the “communicate and promote offers” part of the law.
If a developer wants to get around this 17 percent commission and pay Apple zero for distribution in the European Union, they can distribute their app via a third-party app marketplace, in which case Apple would not take a commission aside from the $99-a-year developer fee for access to Apple technologies. That’s not what Apple is being dinged for here; it’s being fined for the 10-to-17 percent fee for distribution on the App Store. There is a way to be exempt from paying fees; it just requires distribution via a third-party app marketplace — and that behavior is allowed per the rules of the DMA. (See: Article 5, Section 4.)
Neither of the policies Apple is being fined for is illegal under the DMA. And the new non-compliance investigation penalizes Apple for its new developer terms purely based on feelings, not on facts, which is a horrible way to regulate. The DMA also doesn’t make a per-download CTF illegal, and the European Commission knows that — but in a few weeks, Brussels will come back with some more bad news for Cupertino because it’s set out to put technology companies in their place. Monday’s ruling is complete nonsense.
¹ The cut is 10 percent for developers who make less than $1 million a year on the App Store, and 17 percent for everyone else.
The Debate About AI Scraping
Kali Hays, reporting for Business Insider:
The world’s top two AI startups are ignoring requests by media publishers to stop scraping their web content for free model training data, Business Insider has learned.
OpenAI and Anthropic have been found to be either ignoring or circumventing an established web rule, called robots.txt, that prevents automated scraping of websites.
TollBit, a startup aiming to broker paid licensing deals between publishers and AI companies, found several AI companies are acting in this way and informed certain large publishers in a Friday letter, which was reported earlier by Reuters. The letter did not include the names of any of the AI companies accused of skirting the rule.
Yours truly, writing on Wednesday about Perplexity, another artificial intelligence firm, doing the same thing:
What makes this different from the New York Times lawsuit against OpenAI from last year is that there is a way to opt out of ChatGPT data scraping by adding two lines to a website’s robots.txt file. Additionally, ChatGPT doesn’t lie about reporting that it sources from other websites.
That aged well. I haven’t been able to replicate Business Insider’s or TollBit’s findings yet through my own ChatGPT requests, but if they’re true, they’re concerning. Hays asked OpenAI for comment, but a spokeswoman for the company refused to say anything more than that it already respects robots.txt files. This brings me back to Perplexity. Mark Sullivan, interviewing Aravind Srinivas, Perplexity’s chief executive, for Fast Company:
“Perplexity is not ignoring the Robot Exclusions Protocol and then lying about it,” said Perplexity cofounder and CEO Aravind Srinivas in a phone interview Friday. “I think there is a basic misunderstanding of the way this works,” Srinivas said. “We don’t just rely on our own web crawlers, we rely on third-party web crawlers as well.”
What a cop-out answer — it just proves Srinivas is a pathological liar and that his company makes its fortune by stealing other people’s work. Perplexity is ignoring the Robots Exclusion Protocol, and it is lying about it. By saying Perplexity isn’t lying about it, Srinivas is fibbing. It’s comical and entirely unacceptable. On top of that, he audaciously tells people that they’re the ones misunderstanding him, not the other way around.
Some people, like Federico Viticci and John Voorhees, who write the Apple-focused blog MacStories, have taken particular offense to this AI scraping, which they do not consent to. If it is true that OpenAI and Anthropic are ignoring the Robots Exclusion Protocol, then yes, they deserve to be taken to task; they’ll have to explain why they’re defying a “No Trespassing” sign, as I wrote on Wednesday. But I’ve been pondering this ethical dilemma for the past few days, and I’ve concluded that AI scraping in its entirety is not a bad thing. If a site doesn’t disallow AI scraping, it is a core tenet of the open web that anyone can use that content to learn. Granted, if the chatbot is partaking in plagiarism — copying words without attribution — as Perplexity does, that’s both morally and probably legally wrong. But if a site doesn’t have disallow rules in place, I think it’s perfectly fine for an AI company to scrape it to help its chatbot learn.
In my case, I’ve disallowed AI chatbot scraping from all the major AI companies for now, but that’s subject to change. (I suspect it will change in the near future.) If OpenAI and Anthropic can prove that they aren’t ignoring robots.txt rules, I’ll be glad to remove them from my disallow list and allow their chatbots to learn from my writing to improve their products. I think these products have every right to learn from the open web — copyright protects my particular expression, not the underlying ideas. So if a chatbot is learning the patterns and ideas in my writing rather than regurgitating my exact words, I think it should be able to. That’s not what Perplexity is doing, though: it’s been caught red-handed blatantly copying authors’ work and then lying about it. (It does that to my articles, too.) That’s unethical and wrong; it’s a violation of copyright law.
I don’t frown on Viticci and Voorhees for being so down on AI scraping. Though I might disagree with their ethical stance that AI scraping of the open web is bad, period, I think they have every right to be annoyed about these reckless AI companies stealing their content when they don’t consent to it. That’s the golden word here: consent. If a publisher doesn’t consent to their content being used by scrapers, it shouldn’t be — but if they haven’t put up disallow rules, it’s a free-for-all unless content is being plagiarized one-to-one. Every writer, no matter how famous, has learned how to write from other people, and large language models should be able to do the same. But if I copied and pasted someone else’s work without attribution, and then lied about taking their words, that would be unethical and illegal. That’s what Perplexity is doing.
I do think we need new legislation to make the robots.txt file of a website legally binding, though. Most writers don’t work for a company with a legal team that can write airtight terms of service for their website, so the robots.txt file should be enough to tell AI companies how they can use the data on a site. If an LLM violates that “contract,” the copyright owner should be able to sue. I can’t imagine legislators will take this simple approach to AI regulation, however, which is why I’m wary of dragging the government into this debate — it’ll almost certainly make the situation worse. But for now, here’s my stance: AI companies should continue to sign deals with large publishers and respect robots.txt files. If they’re not barred from a website, they can scrape it. And writers on the internet should think for themselves about whether they’d like LLMs to learn from their writing: if they’re not comfortable, they should put up a “No Trespassing” sign in their robots.txt file.
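For anyone who wants to put up that sign, the disallow rules really are just a couple of lines per crawler in a site’s robots.txt file — something like the following, though each vendor’s current user-agent token should be checked against its documentation, since these do change:

```
# Block OpenAI's training crawler
User-agent: GPTBot
Disallow: /

# Block Anthropic's crawler
User-agent: ClaudeBot
Disallow: /

# Block Apple from using Applebot's crawl for AI training
User-agent: Applebot-Extended
Disallow: /

# Block Perplexity's advertised crawler
User-agent: PerplexityBot
Disallow: /
```

Of course, a robots.txt entry is only a request, not an enforcement mechanism — which is exactly why reports of crawlers ignoring or circumventing it matter so much.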
Europeans Finally Understand What Regulation Does
Samuel Stolton and Mark Gurman, reporting for Bloomberg:
Apple Inc. is withholding a raft of new technologies from hundreds of millions of consumers in the European Union, citing concerns posed by the bloc’s regulatory attempts to rein in Big Tech.
The company announced Friday that it would block the release of Apple Intelligence, iPhone Mirroring, and SharePlay Screen Sharing from users in the EU this year, because the Digital Markets Act allegedly forces it to downgrade the security of its products and services.
“We are concerned that the interoperability requirements of the DMA could force us to compromise the integrity of our products in ways that risk user privacy and data security,” Apple said in a statement.
In response to this, the most friendly, levelheaded, understanding, not-angry-all-the-time people in the world — European users of Mastodon — are raging hard, not at the European Commission, but at Apple. Of course. Let me make it clear: This is not a move of retaliation from Apple, nor is it meant to snub E.U. users purely for the sake of it. Know-it-alls on Mastodon can say that all they want, but it’s purely nonsensical from a cynical business perspective. As Gurman writes on X, Apple needs to sell as many iPhones 15 and 16 Pro as possible precisely because Apple Intelligence is limited to those models. By cutting Apple Intelligence off from the iPhone’s second-biggest market, even temporarily, Apple loses an incentive for customers to buy more high-end iPhone models.
Let me put it another way: When Apple keeps Apple TV+ or Apple Intelligence out of China due to similar regulatory concerns, do Chinese people blame Apple for “retaliating” against the Chinese government and its people, or do they blame their authoritarian regime for policing what they’re able to do, say, and watch? It’s impossible to know for certain — thanks, Chinese Communist Party — but I’m guessing it’s the latter. The same goes for those who live in Russia or North Korea. But a minute subset of Europeans feel such a raging sense of self-entitlement that if a company withholds certain features from their home market, they assume it’s doing so for nefarious purposes.
Europe, as John Gruber, the author of Daring Fireball, writes on Mastodon, enforces the spirit of the DMA, not the actual letter of the law. How is Apple supposed to bring new features that integrate with its other products — with any amount of certainty — when Europe is destined to penalize it over and over again with no reason or justification? Take the Core Technology Fee, which Apple has scaled back so that it only affects the largest conglomerates that both accept the new business terms and set up a third-party app marketplace. European legislators in Brussels never even contemplated such a fee as a possibility and began prematurely celebrating with champagne at just the thought of American “Big Tech” giants having to pay up. But Apple did the work and, through its lawyers, determined the fee was legal and a clever way of complying with the law. The commission did not like that, so it said it was about to fine Apple for non-compliance.
Because Europeans don’t express any skepticism toward their government’s autocratic actions whatsoever, they really do think Apple failed to comply with the DMA. In actuality, to anyone who has read the law, the Core Technology Fee certainly does comply with it, because there is no clause against it. Europe’s terribly written law says nothing about “gatekeepers” being unable to charge a per-download fee to offset the costs of complying with the regulation. But regardless, European regulators take a vibes-based approach to applying the rules. This is a hostile environment in which to operate any business, so Apple simply chose to exercise its right not to do business. What will the European Commission do, levy a fine because Apple chose to withhold a feature from its dear kingdom’s citizens for some time? We’ll see how that works out.
Europeans will continue to be mad at Apple because they don’t understand what their government is doing. They don’t understand what their law says. They don’t even have the patience to understand that a democratically elected government can be wrong sometimes because they’re always caught up in “Big Tech is bad, Big Tech is bad.” Now, they’re making the argument that Apple’s new features aren’t illegal under the DMA and that Apple is purposely punishing Europeans because it’s dissatisfied with the regulation, but that argument is moot once the big picture becomes clear: Europe doesn’t regulate according to the law, but to its feelings.
If Apple Intelligence makes a mistake, European commissioners will immediately designate Apple Intelligence as a “very large online platform” under the Digital Services Act, a related law that regulates social media platforms. Then, once enough Europeans complain about Image Playground’s creation of racially diverse Nazis, or whatever the case may be, Europe will slam its gavel down and fine Apple 10 percent of its global annual revenue for “repeat infractions.” Is bringing Apple Intelligence to Europe illegal according to the DMA? Absolutely not. But doing business in the European Union as a large company apparently is. Europe is criminalizing business by applying its fees however it pleases, so it comes as no surprise that Apple wants to be cautious when it does business there.
If Apple brings iPhone Mirroring to macOS in the European Union, my best guess is that it will be punished under the DMA for not opening it up to Android. The European Commission will say that limiting such a useful feature to its own devices is gatekeeping and prevents competition from thriving, and thus, Apple must be penalized unless it builds the same feature into an Android app for a competitor’s product. It sounds ridiculous now, but so does “E.U. Fines Meta for Charging Users to Access Its Product.” That’s a real story — the headline obviously punched up for humor — but it isn’t untrue. The European Commission will go to the craziest lengths to make its money, and I think Apple was within its rights to withhold these features from a hostile regime until it can ready them for the regulatory scrutiny they will inevitably receive.
Meta Users Sue to Regain Access to Lost Accounts
Karissa Bell, reporting for Engadget:
Last month, Ray Palena boarded a plane from New Jersey to California to appear in court. He found himself engaged in a legal dispute against one of the largest corporations in the world, and improbably, the venue for their David-versus-Goliath showdown would be San Mateo’s small claims court.
Over the course of eight months and an estimated $700 (mostly in travel expenses), he was able to claw back what all other methods had failed to render: his personal Facebook account.
Those may be extraordinary lengths to regain a digital profile with no relation to its owner’s livelihood, but Palena is one of a growing number of frustrated users of Meta’s services who, unable to get help from an actual human through normal channels of recourse, are using the court system instead. And in many cases, it’s working.
Engadget spoke with five individuals who have sued Meta in small claims court over the last two years in four different states. In three cases, the plaintiffs were able to restore access to at least one lost account. One person was also able to win financial damages and another reached a cash settlement. Two cases were dismissed. In every case, the plaintiffs were at least able to get the attention of Meta’s legal team, which appears to have something of a playbook for handling these claims.
What a wild, fascinating story. Meta users, primarily on Facebook, receive no support from Meta’s account recovery teams, so they sue the company in small claims court for up to $10,000. Meta usually asks plaintiffs to drop the case, but when they don’t, it rarely shows up to court to defend itself, resulting in a victory and financial recourse for the plaintiffs. It’s a genius way to get compensated for a very prominent problem so many people face: Either the user makes some money, or they regain access to their account because Meta doesn’t want to litigate the suit.
Meta can’t possibly have a large enough legal team to show up to court for every small claims suit it has to defend, so it simply doesn’t. I don’t think any company on the planet has that much time. What it should do, however, is build out its customer support team to adequately address users’ concerns, especially when their accounts are hacked or suspended for no reason. These are common issues on social platforms, but because Meta ran the cost-benefit analysis and determined that the occasional lawsuit is cheaper than hiring more support staff, customers are stuck at the receiving end of Meta’s failures.
As Bell writes, yes, these are extraordinary lengths — but they’re also lengths to hold the world’s largest platforms accountable for their actions. Google, Meta, Apple, and Microsoft quite literally are integral parts of people’s livelihoods, so their support staff should be, if anything, more capable than the government’s bureaucrats. (Arguably, government bureaucrats, such as the ones who work for the Internal Revenue Service, are also useless.) These large platforms essentially act as governments of the private sector; what would happen to the world if Microsoft erroneously banned a whole Fortune 500 company’s accounts? A massive chunk of the economy could fall apart.
Customer service shouldn’t just be limited to “paying” customers — it should be available to everyone, regardless of whether they pay for an account or not, because these companies are so crucial to so many people’s lives. Social media isn’t just a fun section of the web for the nerdy anymore, and platforms need to begin treating it like the essential service it is.
Perplexity is a Thief and Serial Fabulist
Dhruv Mehrotra and Tim Marchman, reporting for Wired:
A WIRED analysis and one carried out by developer Robb Knight suggest that Perplexity is able to achieve this partly through apparently ignoring a widely accepted web standard known as the Robots Exclusion Protocol to surreptitiously scrape areas of websites that operators do not want accessed by bots, despite claiming that it won’t. WIRED observed a machine tied to Perplexity—more specifically, one on an Amazon server and almost certainly operated by Perplexity—doing this on WIRED.com and across other Condé Nast publications.
The WIRED analysis also demonstrates that, despite claims that Perplexity’s tools provide “instant, reliable answers to any question with complete sources and citations included,” doing away with the need to “click on different links,” its chatbot, which is capable of accurately summarizing journalistic work with appropriate credit, is also prone to bullshitting, in the technical sense of the word.
WIRED provided the Perplexity chatbot with the headlines of dozens of articles published on our website this year, as well as prompts about the subjects of WIRED reporting. The results showed the chatbot at times closely paraphrasing WIRED stories, and at times summarizing stories inaccurately and with minimal attribution. In one case, the text it generated falsely claimed that WIRED had reported that a specific police officer in California had committed a crime. (The AP similarly identified an instance of the chatbot attributing fake quotes to real people.) Despite its apparent access to original WIRED reporting and its site hosting original WIRED art, though, none of the IP addresses publicly listed by the company left any identifiable trace in our server logs, raising the question of how exactly Perplexity’s system works.
Relatedly, Sara Fischer, reporting for Axios:
Forbes sent a letter to the CEO of AI search startup Perplexity accusing the company of stealing text and images in a “willful infringement” of Forbes’ copyright rights, according to a copy of the letter obtained by Axios…
The letter, dated last Thursday, demands that Perplexity remove the misleading source articles, reimburse Forbes for all advertising revenues Perplexity earned via the infringement, and provide “satisfactory evidence and written assurances” that it has removed the infringing articles.
What makes this different from the New York Times lawsuit against OpenAI from last year is that there is a way to opt out of ChatGPT data scraping by adding two lines to a website’s robots.txt file. Additionally, ChatGPT doesn’t lie about reporting that it sources from other websites. Perplexity not only sleazily ignores disallow rules on sites it crawls by using a different user agent than it advertises on its website and support documentation but also lies about journalists’ reporting to users, potentially making the publisher suddenly liable for defamation claims and other legal nonsense. Perplexity is both a thief and a serial fabulist.
I maintain my position that scraping the open web is not illegal, but simply unethical — and there are exceptions for when it is acceptable to scrape without permission. But I’m no ethicist, and while I have AI scraping disabled on my own website, I’m not sure how to feel about misattribution when quoting other websites. I do feel it’s a threat to journalism, however, and companies should focus on signing content deals with publishers like OpenAI did. Stealing, however, is a red line: If a company tells an AI scraper not to touch its website, masquerading as a completely different computer with a different IP address and user agent is disingenuous and probably illegal. If a property owner calls the police and has an unwanted visitor trespassed from their premises, and that person comes back the next day in a different jacket, they’re still breaking the law. The owner has trespassed the unwanted visitor, so no matter what jacket they’re wearing, they’re still somewhere they’re not allowed to be.
It’s not illegal to walk into a shop you’re not barred from entering when the shop is open to the public. A flag in a robots.txt file is the internet equivalent of a trespass notice telling AI bots not to scrape a website. If the website doesn’t have that flag, I think it’s fair game for AI companies to crawl it; this is why I wasn’t particularly disappointed in Apple for scraping the open web. I wish Apple had told publishers how to disable Applebot-Extended — its AI training scraper — before it began training Apple Intelligence’s foundation models, but it doesn’t really matter in the grand scheme: I allowed my website to be scraped by Apple’s robots, so I can’t be mad, only disappointed. (I’ve now disallowed Applebot-Extended from scraping this website.) The same is true for The New York Times and OpenAI, but that’s not the case for Perplexity, which is putting on a disguise, trespassing, and stealing.
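Honoring the protocol is trivial for a crawler that wants to honor it — Python even ships a parser in its standard library. Here is a minimal sketch of the check a well-behaved bot performs before fetching a page (the site, path, and user agent are just illustrative):

```python
from urllib.robotparser import RobotFileParser

def may_fetch(site: str, path: str, user_agent: str) -> bool:
    """Return True if the site's robots.txt permits `user_agent` to fetch `path`."""
    parser = RobotFileParser()
    parser.set_url(f"{site}/robots.txt")
    parser.read()  # downloads and parses the site's robots.txt
    return parser.can_fetch(user_agent, f"{site}{path}")

# A compliant crawler runs this check and walks away when the answer is no.
# The accusation against Perplexity is that it fetches anyway, under a different
# user agent than the "PerplexityBot" it tells publishers to block.
print(may_fetch("https://www.wired.com", "/story/example-article/", "PerplexityBot"))
```

The protocol has no teeth of its own, which is the whole point of the “No Trespassing” analogy: the sign only works if the crawler chooses to read it.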
Perplexity is doing the equivalent of breaking into a Rolex store, stealing a bunch of watches, tearing the Rolex logo off of them, and then selling them on the street for 10 times the price while saying “I made these watches.” It’s purely disingenuous and almost certainly illegal because the robots.txt file acts as a de facto terms of service for a website. Websites like Wired and Forbes, owned by multinational media conglomerates, almost certainly have clauses in their terms of service that disallow AI scraping, and if Perplexity violates those terms, the companies have a right to send it a cease and desist. Would suing go a step too far? Probably, but I also don’t see how it wouldn’t be legally sound, unlike The Times’ suit against OpenAI.
You might think I’m playing favorites with Silicon Valley’s golden child AI startup, but I’m not — these are just two different cases. One company, Perplexity, is actively violating websites’ terms of service every single day. ChatGPT scraped The Times’ website before The Times could “trespass” OpenAI after ChatGPT’s launch, and that’s entirely fair game. On top of that, The Times used contrived prompts to coax its articles out of ChatGPT, whereas Perplexity’s model just plagiarized without even being asked. Perplexity is designed by its makers to disobey copyright law and is actively encouraged to plagiarize. If Perplexity didn’t want to do harm, it could just switch back to the “PerplexityBot” user agent it told publishers to block, but even with the company in the news for being nefarious, it’s still not budging. In fact, Aravind Srinivas, Perplexity’s chief executive, had the audacity to say Wired’s reporters were the ones who didn’t know how the internet works, not his company. Shameful. Perplexity is a morally bankrupt institution.
Ilya Sutskever and Friends Found Safe Superintelligence Inc.
Dr. Ilya Sutskever, Daniel Gross, and Daniel Levy, writing on the website of their new company:
Building safe superintelligence (SSI) is the most important technical problem of our time.
We have started the world’s first straight-shot SSI lab, with one goal and one product: a safe superintelligence.
It’s called Safe Superintelligence Inc…
Our singular focus means no distraction by management overhead or product cycles, and our business model means safety, security, and progress are all insulated from short-term commercial pressures.
“Superintelligence” isn’t a word you’ll find in most dictionaries; it’s used as a catch-all term for a computer system as smart as — or, more precisely, smarter than — humans, a step beyond artificial general intelligence. Dr. Sutskever is one of OpenAI’s co-founders, and he served as its chief scientist until he suddenly resigned in May. Levy is another OpenAI alumnus, while Gross previously led machine learning efforts at Apple; OpenAI’s stated mission, as posted on its website, is to “ensure that artificial general intelligence benefits all of humanity.” I assume Dr. Sutskever’s new company is using “superintelligence” instead of “AGI” or simply “artificial intelligence” because he tried to accomplish the latter with OpenAI and apparently failed — so now, the mission has to be slightly modified to try it all again.
The last line I quoted about “distraction by management overhead” seemingly alludes to OpenAI’s obvious loss of direction. It’s true that OpenAI has become commercialized, which is potentially concerning for the safe development of AGI — OpenAI’s mission — but I guess the mission doesn’t matter anymore if Sam Altman, the chief executive, wants to eliminate board oversight of his company in the near future. Thus, Safe Superintelligence — a boring name for a potentially boring company. Safe Superintelligence probably won’t create the next GPT-4 — the large language model that powers ChatGPT — or advance major research projects, because it’ll struggle to raise the kind of capital OpenAI has. It won’t have deals with Apple or Microsoft and certainly won’t be motivated by profit in the way Altman’s company now is. Safe Superintelligence is the new OpenAI, whereas the real OpenAI is more akin to “Commercial AI.”
Is the commercialization of AI a bad thing? Probably not, but there are some doomsayers who believe it is because AI could “go rogue” and destroy humanity. I think the likelihood of such an event is minimal, but nonetheless, I also believe AI research institutes like Safe Superintelligence should exist to study the effects of powerful computer systems on society. I don’t think Safe Superintelligence should build anything new like how OpenAI did — it’s best to leave the building to the companies with capital — but the oversight should exist in a well-balanced industry. If OpenAI cooks up a contraption that has the potential to do harm, Safe Superintelligence should be able to probe it and understand how it works. It’s best to think of Safe Superintelligence and OpenAI as collaborators, not just competitors, especially if OpenAI truly does disband its board.
Let’s hope Safe Superintelligence actually lives up to its name, unlike OpenAI. AI is like a drug for the business world right now: OpenAI dabbled with making a consumer product, ChatGPT — which was intended to be a limited research preview when it launched in November 2022 — and when the product went viral, its entire corporate strategy shifted from safe AGI development to moneymaking. If Safe Superintelligence, contrary to my prediction, achieves a scientific breakthrough and a hit consumer product, it’s quite possible it’ll get carried away just like OpenAI did. Either Safe Superintelligence has more self-restraint than OpenAI (probably the case), or it’ll suffer the same fate.
Apple Rejects Non-JIT Version of UTM via Notarization
Also from Benjamin Mayo for 9to5Mac:
App Review has rejected a submission from the developers of UTM, a generic PC system emulator for iPhone and iPad.
The open source app was submitted to the store, given the recent rule change that allows retro game console emulators, like Delta or Folium. App Review rejected UTM, deciding that a “PC is not a console”. What is more surprising, is the fact that UTM says that Apple is also blocking the app from being listed in third-party app stores in the EU.
As written in the App Review Guidelines, Rule 4.7 covers “mini apps, mini games, streaming games, chatbots, plug-ins and game emulators”.
UTM says Apple refused to notarize the app because of the violation of rule 4.7, as that is included in Notarization Review Guidelines. However, the App Review Guidelines page disagrees. It does not annotate rule 4.7 as being part of the Notarization Review Guidelines. Indeed, if you select the “Show Notarization Review Guidelines Only” toggle, rule 4.7 is greyed out as not being applicable.
This is confusing, but I think what Apple is saying is that, even with notarization, apps are not allowed to “download executable code.” Rule 2.5.2 says apps may not “download, install, or execute code” except for limited educational purposes. Rule 4.7 makes an exception to this so that retro game emulators and some other app types can run code “that is not embedded in the binary.” This is grayed out when you select Show Notarization Review Guidelines Only, meaning that the exception only applies within the App Store. Thus, the general prohibition remains in effect for App Marketplaces and Web Distribution.
This is a clear instance of Apple itself being confused by its own perplexing guidelines. Rule 4.7 says:
Apps may offer certain software that is not embedded in the binary, specifically HTML5 mini apps and mini games, streaming games, chatbots, and plug-ins. Additionally, retro game console emulator apps can offer to download games. You are responsible for all such software offered in your app, including ensuring that such software complies with these Guidelines and all applicable laws.
Apple later “clarified” to UTM that it was not being barred from the App Store because of Rule 4.7, but because of Rule 2.5.2, which effectively bans just-in-time compilation. Rule 4.7 purports to be an exception to Rule 2.5.2 for “retro game console emulator apps,” but it is not one in practice, because no app with a JIT compiler has been able to make it through App Review. Delta, a retro game console emulator by Riley Testut, also had a JIT compiler, but Testut had to remove it from the App Store and third-party app marketplace versions of the app — Rule 4.7 didn’t give him the exception it hints it might.
Rule 4.7, however, allows “retro game console emulator apps” on the App Store — and thus disallows any emulators that aren’t for “game consoles.” But crucially, this only applies to apps submitted to the App Store, not third-party app marketplaces, meaning any emulator should be allowed on a third-party app marketplace even if it can’t be on the App Store, because Rule 4.7 is not part of the “Notarization Review Guidelines,” which govern third-party app marketplaces. (Apps distributed through those marketplaces must be notarized by Apple, but their content is not reviewed.) In other words, there’s no restriction on PC emulators in third-party app marketplaces. Apple applied Rule 4.7 to both third-party app marketplaces and the App Store, which is incorrect.
Michael Tsai is correct: Apple most likely forbids any just-in-time compilers from running on iOS, period, regardless of whether the app is a game emulator. But I don’t think the disagreement should involve Rule 2.5.2 at all, because that rule is most likely a blanket, general ban on JIT compilers, regardless of whether the app is on the App Store or not — hence, only Rule 4.7 is excluded from the Notarization Review Guidelines, not Rule 2.5.2. Instead, Apple originally said it was barring UTM from operating on iOS outright because a PC is not a “console” — a Rule 4.7 infraction.
Rule 2.5.2 would have applied if UTM SE used a JIT compiler, but here’s the kicker: it doesn’t. Instead, because Apple realized its original decision to apply Rule 4.7 was incorrect, it quickly switched to blaming Rule 2.5.2, which doesn’t even apply in this scenario — if anything, Rule 4.7 does, but only to the App Store version, not the one submitted for notarization for third-party distribution. In the case of Rule 4.7, the semantics of “console” versus “PC” matter because that one word determines whether an app is allowed on the App Store at all.
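For readers wondering what the technical distinction actually is, here is a toy illustration — emphatically not UTM’s code — of the difference the rules hinge on. An interpreter walks guest instructions one at a time as ordinary application code and data; a JIT compiler instead generates native machine code at runtime and jumps into it, which requires writable-then-executable memory that iOS denies to third-party apps:

```python
# Toy "guest CPU" interpreter: each instruction is just data dispatched by
# ordinary host code. Nothing here downloads or executes native code, which is
# why an interpreter-only emulator like UTM SE doesn't raise the JIT concern in
# Rule 2.5.2. A JIT would instead translate these instructions into real machine
# code at runtime and execute that buffer — the thing iOS forbids third-party apps from doing.

def run(program: list[tuple], registers: dict[str, int]) -> dict[str, int]:
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        if op == "LOAD":            # LOAD reg, constant
            registers[args[0]] = args[1]
        elif op == "ADD":           # ADD dst, src
            registers[args[0]] += registers[args[1]]
        elif op == "JNZ":           # JNZ reg, target: jump if register is non-zero
            if registers[args[0]] != 0:
                pc = args[1]
                continue
        pc += 1
    return registers

# Sum 3 + 2 + 1 into r0, using r1 as a countdown and r2 as a constant -1.
program = [
    ("LOAD", "r0", 0),
    ("LOAD", "r1", 3),
    ("ADD", "r0", "r1"),
    ("LOAD", "r2", -1),
    ("ADD", "r1", "r2"),
    ("JNZ", "r1", 2),
]
print(run(program, {}))  # {'r0': 6, 'r1': 0, 'r2': -1}
```

Whether the guest being emulated is a Game Boy or a PC makes no technical difference to a loop like this, which is why the “console” versus “PC” wording ends up doing all the work in Apple’s rejection.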
What Tsai argues is that for apps that (a) aren’t console emulators and (b) aren’t on the App Store, Apple prohibits JIT compilation as per 2.5.2, which the European Union allows Apple to enforce as part of the clause in the Digital Markets Act that allows gatekeepers to bar apps that might be a security risk. But that guideline doesn’t even matter in this context because (a) UTM SE — the version of the app UTM submitted — doesn’t include a JIT compiler, and (b) Apple barred UTM from operating on both the App Store and third-party app marketplaces on the basis of wording, not the JIT compiler, before it backtracked. Now, Apple wants to conveniently ignore its original flawed reasoning.
Apple can’t apply Rule 4.7 to apps that want access to a third-party marketplace because it is not a notarization guideline, only an App Store one. This behavior is illegal under the DMA: Apple extended its ability to bar UTM from the App Store to third-party app marketplaces as well, which it can’t do. When it got caught red-handed, it defaulted to an unrelated rule UTM SE already passed. Because App Review can’t read its own rules, it backtracked, its backtracking was also incorrect, UTM got rejected, and both of Apple’s stated reasons for rejecting the app were abysmally false. This kerfuffle should have had nothing to do with Rule 2.5.2, which would only apply if UTM SE used a just-in-time compiler — which, again, it doesn’t. If it did, yes, the decision would fall back to Rule 2.5.2, which applies throughout iOS — but the only rule that matters here is Rule 4.7, which was applied incorrectly the first time around.
I’m sure the European Commission will cite this mess when it fines Apple.
Sources: Apple Preparing Cheaper Vision Pro for 2025
Benjamin Mayo, reporting for 9to5Mac:
Apple is reportedly working on a cheaper, cut-down version of the Apple Vision Pro, scheduled to arrive by the end of 2025, according to The Information. At the same time, the publication says development work on a second-generation high-end model of the Vision Pro has been shelved, seemingly to prioritize the cheaper hardware path…
The Information says it is possible Apple could resume work on a high-end second-gen Vision Pro at some point, but it seems relatively confident that the move reflects a change in strategy for the time being…
The Information says the number of employees assigned to the second-gen Vision Pro had been gradually declining over the course of the last year, as attention turned to the cheaper model.
Many news outlets are running with the headline “Apple Halts Work on Second-Generation Vision Pro.” While I guess that’s technically true, the Apple Vision Pro team at Apple is still relatively small. It’s only going to focus on one core product in the lineup at a time, and I think switching attention to the cheaper version now that the full-featured “professional” model is out is the better strategy. If Apple instead went full speed ahead on developing another marginally improved Apple Vision Pro, as it does for its already segmented products, it would never be able to break into new markets. Incremental year-over-year upgrades should come once there is already a market for the product; until the user base stabilizes, Apple should focus on bringing the price down. After that, it can use what it learned from the cheaper product to shape the true “next-generation” high-end Apple Vision Pro.
I don’t think the cheaper “Apple Vision” product will eclipse Apple Vision Pro in Apple’s lineup for now, but it will eclipse the older version in sales. That’s precisely the point — unlike with product lines like the iPhone or iPad. When the first iPhone was introduced in 2007, Apple immediately went to work on the iPhone 3G; the same went for the iPad. But Apple Vision Pro isn’t like either of those products because it’s so astronomically expensive. It’s more akin to the Macintosh — if February’s Apple Vision Pro is the Macintosh 128K from January 1984, the low-cost headset is the iMac. The “Classic Macintosh” line of Macs is no more, and the same will be true for the first-generation Apple Vision Pro. It’s better to think of the Apple Vision Pro line as a new generation of computers for Apple rather than an accessory to the Mac, as the iPod and iPhone originally were.
The bottom line is this: I wouldn’t be too worried about the first-generation Apple Vision Pro fading into obscurity quickly. Nor do I think Apple Vision Pro buyers should buy the cheaper headset when it comes out — it’s destined to be worse. But it’s important to note that the first generation of this all-new platform doesn’t exist to be a consumer product; it’s there for developers and video producers to make content for the platform at large. Once the content and apps exist, Apple needs to sell the total package in a palatable product for most normal buyers, probably priced at $1,000 to $1,500. That’s exactly what we’re seeing here, and I think it’s a good strategic move. Once it makes the iMac of the Vision line, it can make the Mac Pro — and that actually good Apple Vision Pro will eventually cost much less than $3,500 because Apple will have mastered producing the product at scale.
E.U. Will Fine Apple for Violating DMA
Javier Espinoza and Michael Acton, reporting for The Financial Times:
Brussels is set to charge Apple over allegedly stifling competition on its mobile app store, the first time EU regulators have used new digital rules to target a Big Tech group.
The European Commission has determined that the iPhone maker is not complying with obligations to allow app developers to “steer” users to offers outside its App Store without imposing fees on them, according to three people with close knowledge of its investigation.
The charges would be the first brought against a tech company under the Digital Markets Act, landmark legislation designed to force powerful “online gatekeepers” to open up their businesses to competition in the EU…
If found to be breaking the DMA, Apple faces daily penalties for non-compliance of up to 5 per cent of its average daily worldwide turnover, which is currently just over $1bn.
Firstly, it’s hilarious that this was leaked by Europe to The Financial Times.
Secondly, this is entirely unsurprising to anyone who understands how the European Commission, the European Union’s executive branch, functions. The DMA was written to punish “Big Tech” companies — specifically American ones — not to regulate them. Moreover, the commission’s enforcement of the DMA has continuously proven to be draconian because it bends the rules however it wants to levy whatever punishments it wants. The DMA was just a facade of democracy, meant to show the world that the commission wouldn’t “regulate” the technology industry autocratically and that reining in Apple, Google, Meta, and the rest was in the interest and wishes of Europeans. The DMA, in reality, works as a free pass for the European Commission to do whatever it wants — it’s a badly written law with no real footing in legal doctrine that exists only to further strengthen the commission’s control over the market.
When the commission fully reveals why it’s fining Apple, it’ll point to a clause in the DMA that doesn’t exist, just like it did to Meta when it began its investigation of the Facebook parent. In Meta’s case, it demanded the company offer a free way for users to opt out of tracking on its services, when the DMA only required “gatekeepers” to offer a way for users to opt out at all, even if that option cost money. Meta’s lawyers aren’t stupid or incompetent: they knew the DMA only required gatekeepers to offer a tracking-free option, so they advised Meta to offer a paid, ad-free subscription. The commission didn’t like that for some reason, so it launched an investigation. That’s not a fair application of the law — it’s an application of a law that doesn’t exist.
Just as it did with Meta, the commission will probably target the Core Technology Fee, which Apple has modified so that only large companies have to pay it. But because the commission didn’t think of a per-download fee as even an option a gatekeeper could employ, it’ll erroneously target it with a law that doesn’t exist. By every measure, the Core Technology Fee — especially the amended version from May — is within the scope of the DMA and follows the laws of Europe. Apple wouldn’t risk violating the law because it knows what’s at stake here — its lawyers are competent in E.U. law and aren’t going to tell Apple to be sly about obeying. But the commission is treating Apple as if it has no interest in complying, which leads me to believe that maybe Apple shouldn’t comply.
The European Commission will fine Apple, Google, Amazon, Meta, and the rest of its long list of gatekeepers indeterminate amounts of money however it pleases because it gave itself the keys to the antitrust kingdom. These companies are dealing with a branch of government with an unchecked amount of power: it writes the law, it enforces the law, and it chooses how to enforce it. The law does not act as a check on the commission as it does in the United States, so why should Apple even comply? Apple has no chance of winning this fight against one of the most powerful regulatory bodies in the world, so it just shouldn’t. In fact, I’d say Apple should go rogue entirely and see what happens. It should increase its In-App Purchase fee to 50 percent in the European Union, tighten anti-steering rules, and subject E.U. apps to extra scrutiny in the App Review process.
What would the European Commission do in response to this blatant, unapologetic defiance of the law? Fine Apple 5 percent, which it was going to do anyway even after Apple put in all the work to comply. It’s a lose-lose situation for Apple no matter what it does because the commission has gone rogue. When your boss goes rogue and you can stand the consequences — and I’m sure Apple can; 5 percent of global daily revenue isn’t much — you should go rogue, too. Instead of applying the principle of malicious compliance, Apple should apply malicious defiance. What would Europe do, ban Apple devices from the bloc? Europeans would travel to Brussels to riot because that would be undemocratic. Would Europe pass more laws? That’s also possible, but if it fines Apple too much, Apple should just leave Europe and let the riots ensue.
I wasn’t all that supportive of the DMA when it was first passed and applied, but I never thought I’d tell Apple to break the laws of a region in which it operates. Now, that seems like the best course of action, because no matter what, it’s destined to lose.
Gurman: Apple Following in Ive’s Footsteps
Mark Gurman, reporting in his Power On newsletter for Bloomberg:
Over the past several years, Apple appeared to be shifting away from making devices as thin and light as possible. The MacBook Pro got thicker to accommodate bigger batteries, more powerful processors, and more ports. The Apple Watch got a heftier option as well: an Ultra model with more features and a longer life. And the iPhone was fattened up a bit too, making room for better cameras and more battery power.
When Apple unveiled the new iPad Pro in May, it marked a return to form. The company rolled out a super-thin tablet with the same battery life as prior models, an impressive screen, and an M4 chip that made it as powerful as a desktop computer. In other words, Apple has figured out how to make its devices thinner again while still adding major new features. And I expect this approach to filter down to other devices over the next couple of years.
I’m told that Apple is now focused on developing a significantly skinnier phone in time for the iPhone 17 line in 2025. It’s also working to make the MacBook Pro and Apple Watch thinner. The plan is for the latest iPad Pro to be the beginning of a new class of Apple devices that should be the thinnest and lightest products in their categories across the whole tech industry.
We do not need this. I’d much rather take extra battery life — which has suffered in recent years across most of Apple’s product lines — than thinness, which doesn’t make sense to obsess over on “professional” products. While I do support making the MacBook Air or Apple Watch thinner, the MacBook Pro should be off-limits because there’s always more to add to that product. Imagine a thicker MacBook Pro with a larger battery and an M4 Ultra processor, for example — or perhaps better cooling or improved speakers. The entire premise of the “Pro” lineup is to pack as many features into the product as possible.
Jony Ive, Apple’s former design chief who obsessed over thinness to the point where Apple’s products began to suffer severely, is slowly inching his way back into the company, albeit not directly. He clearly still has influence over the top designers, and now that Evans Hankey, who succeeded Ive, has also left the company, there’s a lack of direction from within. Take the iPhones 17 Pro, for example: Apple already thinned the phone down significantly last year, but now it wants to do so again, even though battery life has suffered. No iPhone has had better battery life than the iPhone 13 Pro Max, and that was not a fluke: it was one of the thickest iPhones Apple had ever offered at that point, and users loved it.
I shouldn’t need to reiterate this basic design principle to Apple’s engineers over and over again. There should be a limit to sleekness, and when every other company is focused on adding more features and larger batteries to its products each year, Apple should do the same — not go in the other direction. I don’t want the MacBook Pro to become thinner, even though I think it’s heavy and cumbersome to carry around, because its power will inevitably suffer. The reaction to this statement is always something like, “Apple made the iPad Pro thinner and it still works fine,” but that’s a misunderstanding. If Apple had kept the thickness the same — the iPad Pro was already thin enough, in my opinion — while adding the organic LED display, which is more compact, it could have fit a larger battery, which would address the iPad’s abysmal standby time.
I’m not frustrated by Apple’s thinness spiel with the iPad, mostly because I don’t think of the iPad as a “professional” device. I do, however, take offense at Apple applying the same flawed mentality to arguably its most professional product, the MacBook Pro. Apple can do what it wants to the MacBook Air, the lower-end iPhones, or even the iPad — but it shouldn’t think in even remotely the same direction for its important high-end products.
Why Apple Intelligence is the Future of Apple Platforms
Apple’s suite of AI tools is here. How will it change how people use their devices?

Apple on Monday announced a new suite of artificial intelligence features at its Worldwide Developers Conference, held from its Apple Park headquarters in Cupertino, California. The new features, together called “Apple Intelligence,” allow users to summarize articles, emails, text messages, and notifications; improve and generate new writing in system-wide text fields; pull data from across their apps like Mail, Photos, and Contacts to power a wide range of natural language processing features; and interact with a new version of Siri, which can now be typed to and can perform actions within apps using an improved version of a technology called App Intents.
It also allows users to generate new AI images and emojis with features like “Genmoji” and “Image Playground” integrated into Messages and other third-party apps, as well as have AI stitch photos together into short videos with motion effects and music — a feature called “memory movies.” Users can also remove unwanted objects from the background of photos, search their libraries using natural language, and edit images with effects and filters automatically. Apple Intelligence runs both on-device and in the cloud, depending on what Apple’s internal logic believes is necessary for the task. It leverages a breakthrough called Private Cloud Compute, utilizing the security of Apple silicon processors to handle sensitive user data — ensuring, Apple says, that the data is never stored or made accessible to Apple. Private Cloud Compute servers run an operating system that can be inspected by outside security researchers, Apple said, via software images that can be verified to ensure they are the ones running on Apple’s servers. Greg Joswiak, Apple’s marketing chief, said the servers run on 100 percent renewable energy. These servers were easily the most intriguing technical demonstration of the day.
Apple also announced a partnership with OpenAI to bring ChatGPT, its flagship large language model, to iOS 18, iPadOS 18, and macOS 15 Sequoia — the new operating systems coming to Apple devices this fall — via Apple Intelligence, powering general knowledge queries and complicated creative writing assignments Apple deems too intensive for its own LLMs, both in the cloud and on-device. The integration — also coming in the fall — does not build a chatbot into the operating systems; rather, ChatGPT is used as a fallback for Apple Intelligence when it needs to search the web or generate lengthier pieces of text. When ChatGPT is used, a user’s IP address is obscured and Apple makes the call to ChatGPT directly, asking the user to confirm whether it is OK to use the external service to handle the query. Apple stressed that the feature would be turned off by default and that no personal data would be handed over to ChatGPT, a marked difference from its own foundation models. It also announced that more models would become available soon, presumably as the company signs contracts with other AI makers, such as Google.
Together, the new features, which will be enabled in the fall for beta testers, finally catch Apple up to the AI buzz that has engulfed the technology industry since the launch of ChatGPT in November 2022. Investors have quizzed Tim Cook, Apple’s chief executive, on every post-earnings call since then about when Apple would join the AI frenzy, and now its answer is officially here. Apple Intelligence does things differently, however, in a way that reflects the company that built it: It focuses on privacy and on-device intelligence more than the flashy gimmicks other tech companies like Google and Microsoft have launched. Yes, by adding AI to its flagship operating systems used by billions around the world, Apple becomes vulnerable to hallucinations — phenomena where chatbots confidently provide incorrect answers — and involves itself in the difficult business of content moderation. But it also sets a new gold standard for privacy, security, and safety in the industry while bringing novel technology to its widest audience yet.
That being said, no technology comes without reservations. For one, Apple Intelligence’s Image Playground features look cheaply made, generating poor-quality images that most artists would rather do without. The systems will also be easy to abuse, including by users asking them to synthesize illegal, sexually explicit, or otherwise immoral content that Apple Intelligence may be tricked into creating even though Apple prohibits it. But Apple has said that it has thought through these issues: In response to a question from John Gruber, the author of Daring Fireball, Apple executives said Apple Intelligence isn’t made to be a general-purpose AI tool as much as it is a personal assistant that uses people’s personal data to provide helpful, customized answers. One example a presenter demonstrated onstage was the question, “When should I leave to pick up my mom from the airport?” Siri, in this case, was able to surface the appropriate information in Messages, track the flight, and then use geolocation and traffic data to map directions and estimate the travel time. Apple Intelligence is not meant to answer questions about the world — it’s intended to act as a companion in iOS and macOS.
Apple Intelligence has one glaring compromise above all, though: It only works on iPhones 15 Pro or later, iPads with the M1 chip or later, and Apple silicon Mac computers. The narrow compatibility list will inevitably cause furor within broader communities outside of the tech media, with cynicism that Apple artificially created the limitation to boost sales of new devices already spiraling on social media — but the reason this bottleneck exists is rather simple: AI requires significant computing power. Intel Macs don’t have the neural processing units called “Neural Engines” specialized for LLMs, and older iPhones — or current-generation iPhones with less powerful processors — lack enough “grunt,” as John Giannandrea, Apple’s machine learning chief, put it Tuesday at “The Talk Show,” live from WWDC. Add to that the enormous memory constraints that come with running an entire language model on a mobile device, and the requirement begins to make sense: When an LLM needs to answer a question, the whole model — which can be many gigabytes in size — needs to fit in a computer’s volatile memory.
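To put rough numbers on that constraint, here is a back-of-the-envelope calculation. The figures are my own assumptions, using a model on the order of 3 billion parameters, roughly the size reported for Apple’s on-device model:

```latex
\[
\text{memory} \;\approx\; N_{\text{params}} \times \frac{\text{bits per weight}}{8}
\]
\[
3\times10^{9} \times \tfrac{16}{8}\ \text{bytes} \approx 6\ \text{GB (16-bit weights)}
\qquad
3\times10^{9} \times \tfrac{4}{8}\ \text{bytes} \approx 1.5\ \text{GB (4-bit quantized)}
\]
```

Even the aggressively quantized case has to sit in RAM alongside the operating system and every open app, which helps explain why devices with 8 gigabytes of memory, like the iPhone 15 Pro and the base M1 Macs, appear to be the floor.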
After mulling over the announcements from Monday for a few days, I have thoughts on each of the integrations and how users might use them. I think Monday was one of the most impressive, remarkable, and noteworthy developer conferences Apple has hosted in recent years — at least since 2020 — and while I haven’t tried Apple Intelligence yet, I’m very intrigued to learn more about its capabilities and how it will shape the nascent future of Apple’s platforms. Here are my takeaways from the Apple Intelligence portion of Monday’s keynote.
Siri and App Intents

Siri finally received a much-needed update, further integrating the assistant within the system and allowing it to perform actions within apps. The new version of Siri uses “richer natural language understanding,” powered by Apple Intelligence, to allow users to query the assistant just as they would a person, adding pauses in speech, correcting mistakes, and more. It can also transform into what is essentially an AI chatbot: Users can type to it by double-tapping at the bottom of their iPhone or iPad screen, which brings up a new, rounded interface with an animation that wraps around the device’s bezel, with Apple Intelligence parsing the questions. Siri also knows exactly what is on the screen of someone’s device at a given moment; instead of having to ask Siri about a particular show, for example, a user can ask: “Who stars in this?” If a notification pops up, Siri knows its contents and can perform actions based on the newfound context.
Siri now utilizes personal information from across apps, adding emails, text messages, phone call summaries, notes, and calendar events — all information stored on iCloud or someone’s phone — to what amounts to a knowledge graph the foundation models can draw on, which Apple calls the Semantic Index. This information is used as personal context for Siri, and any app can contribute its data to the context pool. The current version of Siri in iOS 17 does perform searches, but those searches are only keyword-based, i.e., if someone asks for a specific detail from an old text message thread, Siri wouldn’t be able to find it. The new version leverages its own intuition to search through user-generated content, going beyond basic regular expressions and keywords and using semantic searches instead. Additionally, Apple Intelligence can use its summary capabilities to catch users up on messages, emails, and notes, similar to the Humane Ai Pin’s and Rabbit R1’s ambitions.
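To make the keyword-versus-semantic distinction concrete, here is a toy sketch in Swift. Nothing in it is Apple API; the Semantic Index isn’t public, and the “embedding” is a crude letter-count stand-in for the learned representations a real system would use. It only illustrates that semantic search ranks results by similarity instead of demanding an exact word match.

```swift
import Foundation

// Keyword search: only finds messages containing the literal query string.
func keywordSearch(_ query: String, in messages: [String]) -> [String] {
    messages.filter { $0.localizedCaseInsensitiveContains(query) }
}

// Stand-in "embedding": counts of each lowercase letter a–z. A real system would
// use a learned model; this exists only to keep the example self-contained.
func embed(_ text: String) -> [Double] {
    var counts = [Double](repeating: 0, count: 26)
    for scalar in text.lowercased().unicodeScalars where scalar.value >= 97 && scalar.value <= 122 {
        counts[Int(scalar.value) - 97] += 1
    }
    return counts
}

// Cosine similarity between two vectors.
func cosine(_ a: [Double], _ b: [Double]) -> Double {
    let dot = zip(a, b).map(*).reduce(0, +)
    let norm = { (v: [Double]) -> Double in v.map { $0 * $0 }.reduce(0, +).squareRoot() }
    return dot / max(norm(a) * norm(b), .ulpOfOne)
}

// Semantic search: ranks every message by similarity to the query instead of
// requiring an exact keyword hit.
func semanticSearch(_ query: String, in messages: [String], topK: Int = 3) -> [String] {
    let q = embed(query)
    return Array(messages.sorted { cosine(embed($0), q) > cosine(embed($1), q) }.prefix(topK))
}
```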
The most remarkable new feature is Siri’s ability to take action in apps. Using a technology called App Intents, which exposes actions from apps to the system, Siri can use a prompt to decide what actions to run without intervention from a user. Because Siri has the advantage of personal context, it already knows what data is available to be acted upon, so if a user wants to, say, send a note made earlier as an email, they can simply instruct Siri to do so without having to name the note or say where it is located in the system, such as what app it’s in. Siri also draws on what is on the screen as context — a user can ask Siri to fetch a particular photo simply by describing it, and then ask for it to be inserted into the current document. It’s a textbook example of “late but still great”: Apple is combining four features — LLMs, personal context, on-screen context, and App Intents — into one without even notifying the user of each step. It’s nothing short of magic.
Developers of apps that belong to any category in Apple’s predefined list — examples include word processing, browsing, and camera apps — can add App Intents for the Apple Intelligence-powered version of Siri to use with some modifications to their code, just as they would to add support for interactive widgets or Shortcuts. Somewhat interestingly, apps that aren’t part of Apple’s list aren’t eligible to be used with the new Apple Intelligence version of Siri. They can still expose shortcuts to Siri, just as they did in previous versions of Apple’s operating systems, but Siri will be unable to interface with other apps to perform actions in one step. Apple says it’ll be adding more app categories in the coming months, but some niche apps inevitably won’t be supported at all, which is a shame. Skimming the rumors over the past year, I expected Apple to use a more visually focused approach — learning the behavior of user-facing buttons and controls within apps — but Siri’s actions are all programmatic.
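For a sense of what “exposing actions to the system” looks like in practice, here is a minimal sketch of an intent built with Apple’s existing App Intents framework. The intent and its parameters are hypothetical, not Apple’s; my assumption is that the Apple Intelligence version of Siri calls this same kind of intent, with whatever additional domain-specific conformances Apple defines for its supported app categories.

```swift
import AppIntents

// Hypothetical intent from a note-taking app. The type and parameter names are
// illustrative; only the AppIntents framework itself is real.
struct SendNoteAsEmailIntent: AppIntent {
    static var title: LocalizedStringResource = "Send Note as Email"

    @Parameter(title: "Note Title")
    var noteTitle: String

    @Parameter(title: "Recipient")
    var recipient: String

    func perform() async throws -> some IntentResult & ProvidesDialog {
        // The app's own logic would look up the note and hand it off to Mail here.
        return .result(dialog: "Sent “\(noteTitle)” to \(recipient).")
    }
}
```

In today’s Shortcuts-based world, a user would invoke something like this by name; the promise of the new Siri is that a request like “email the note I made this morning to Jane” resolves to the right intent and parameters on its own.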
Either way, the new version of Siri amounts to two things: an AI chatbot with a voice mode, and a “large action model.” That combination will sound familiar to keen observers because it’s exactly what Rabbit aimed to achieve with the R1 in April — except that time, it “relied” heavily on vision to learn the user-facing graphical interfaces of websites to perform actions on behalf of users. (It didn’t do that — it was a scam.) Apple, in contrast, has constructed a much more foolproof solution, but one that will also inevitably be neglected by large app developers for an indefinite amount of time. Here’s why: Developers who integrate App Intents will notice that the amount of time people spend in their apps drops significantly, because reducing that time is inherently the entire point of a virtual assistant. Large developers owned by corporate giants see that as the antithesis of their existence on the App Store — they’re there to make money and advertise while tracking users, and Apple’s latest technology will not let them accomplish that central goal.
For the few apps that support it, it’ll feel like true magic, because in many ways, it is magic. It’s not Apple’s fault: This is just the cost of doing business with humans rather than robots — humans have their own thoughts about how they want to conduct trade, and those thoughts will clash with Apple’s ideas, even if Apple’s approach is more beneficial to the user. For Apple’s apps, which most people use anyway, the new version of Siri will, for the first time in Siri’s 13-year career, feel intelligent and remarkable. Just hearing about it makes me excited because of how much technical work went into combining each of these features into harmonic software bliss. But Apple also did what it unfortunately so often does: It put the onus on developers instead of itself. Apple and its users will ask why app developers won’t support true magic, but, getting down to brass tacks, the answer is clear: money. Taking into account the greediness of the world’s largest app developers, like Meta and Google, I have a tough time imagining this portion of Apple Intelligence will thoroughly change how people use their devices.
What will make a difference in the way people interact with their devices is the chatbot capability of Siri alone. Because Siri is now powered by LLMs and the Semantic Index, it’s naturally much smarter. Siri will no longer stumble over simple questions the way it does today, when it can’t map complicated, human-like sentences to its corpus of knowledge, because it will soon have that added context. For example, if someone wants to know what is on their screen — say, they just want to look it up — they can double-tap the bottom of their screen and ask Siri. Siri can then send it to someone, add it to a note, or do both in one step. It’s an AI chatbot, similar to ChatGPT, except it’s more focused on answering personal questions rather than general knowledge ones. When Siri does need to connect to the internet, as it often will to answer people’s myriad curiosities, it can either perform a normal web search or hand off to ChatGPT.
By bringing ChatGPT — not its chatbot interface, as leakers have speculated, but just the model1 — into Siri, and by extension, the entire system, it becomes genuinely intelligent. There’s no need to be thrown into an external app or interface because ChatGPT’s answers appear inline, just like other Siri answers from previous versions of iOS, but this time, those results are personalized, useful, and link to the web only when necessary. ChatGPT almost certainly will hallucinate, but (a) Apple provides an on-screen warning when connecting with ChatGPT which states sensitive information should be double-checked manually, and (b) that is simply the limit of this technology in 2024. OpenAI may cut down on hallucinations in the future, probably as part of a new GPT-5 model, but for now, Apple has done everything that it can to make Siri as smart as possible.
Siri will continue to make web searches, but as the web gets worse, the best hope for finding information effortlessly is ChatGPT. Coupled with personal context, having an Apple-made chatbot built into every iPhone is a feature many millions of people will enjoy. With Apple Intelligence, Apple has fully realized Siri’s potential — the one it architected in 2011. Siri is no longer just an “assistant” unable to understand most human queries while deflecting to Bing. It is the future of computing, a future start-ups like Humane and Rabbit have been trying to conquer before Apple single-handedly put them to shame in two hours on a Monday. While Apple won’t call it a chatbot, it’s an Apple chatbot, building in the privacy and security Apple customers have come to expect from Cupertino, all the while enabling the future of computing. This, without a doubt, is the most groundbreaking component of Apple Intelligence.
Summaries

One of the tasks at which LLMs typically succeed is summarization of text, so long as the wall of information fits within the model’s context window. Naturally, Apple has added summarization features to every place in its operating systems imaginable, such as Mail, Notes, Messages, notifications, and Safari. These blurbs are written by Apple’s own foundation models, which Cook, Apple’s chief executive, has said have a success rate near 100 percent, and so Apple doesn’t even bother adding labels to summarized content. Giannandrea, the Apple ML chief, told Gruber on “The Talk Show” that Apple will also be more permissive in what Apple Intelligence summarizes: While Apple Intelligence will refuse to generate illegal or explicit content, it will not refuse to summarize content it has already been given, even if that content goes against Apple’s creation guidelines. I find this a relief: If a user provides questionable material to ChatGPT and asks it to summarize or rewrite it, for example, it will refuse even when it shouldn’t. AI researchers, such as Giannandrea, work to minimize these so-called “refusals,” which will make the models more helpful.
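The context-window caveat is worth dwelling on for a moment: a long document has to be truncated or split before a model can summarize it. Here is a minimal sketch of that splitting step, approximating tokens with whitespace-separated words; real tokenizers, including whatever Apple’s models use, count differently.

```swift
// Splits text into chunks that each stay under a rough token budget, so every
// chunk can be summarized separately and the partial summaries combined.
// Word count is used as a crude proxy for token count.
func chunks(of text: String, maxTokens: Int) -> [String] {
    let words = text.split(separator: " ")
    return stride(from: 0, to: words.count, by: maxTokens).map { start in
        words[start..<min(start + maxTokens, words.count)].joined(separator: " ")
    }
}
```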
In Mail and notifications, Apple Intelligence enables new “priority” summaries, handpicking conversations and notifications the system deems important. For example, instead of just showing the first two lines of an email in Mail — or the subject — Apple Intelligence will condense the main points of the correspondence into a sentence that provides just enough information at a glance. It’ll then surface the most important summaries, perhaps from a user’s most important contacts or crucial alerts from companies, at the top of the inbox, complete with an icon indicating that the message has been summarized. Mail will also categorize emails, similar to Gmail, into four discrete sections at the top of the inbox for easy organization. Notifications also receive the same treatment, with priority notifications summarized and placed at the top of the notification stack. If someone sends multiple text messages in a row, for example, they will be condensed and placed in the summary. These small additions will prove handy, especially when a user is away from their devices for a while. I’m a fan.
The same summarization of notifications is also used to power a “Minimize Distractions” Focus, which is offered alongside Do Not Disturb. While Do Not Disturb, by default, silences all notifications, Minimize Distractions asks Apple Intelligence to weigh the content and context of a notification to determine whether it is important enough to break through the filter. While I assume users will be able to manually select contacts and apps that’ll always remain whitelisted, similar to any other Focus, the system does most of the work in this mode. When Apple Intelligence surmises a notification is important, it will label it as “Maybe Important,” akin to “Time Sensitive” labels in current versions of iOS. Messages labeled “Maybe Important” will be summarized and grouped automatically, parallel to “priority” notifications. I think Minimize Distractions should be the new default Do Not Disturb mode for most people — it’s versatile, I think it’ll work well, and it lifts the burden of customizing a Focus from the user to the operating system.
Mail, Phone, and Notes also now feature summaries at the top of conversations. In Mail, a Summarize button can be tapped to reveal a longer summary — roughly a paragraph — and in Phone and Notes, users can now record a call and generate a summary in the Notes app after it’s over. Without a doubt, the latter feature will mostly be used to create text-only notes for personal use, because many jurisdictions require both parties of a call to consent to a recording (this is why iOS has prohibited call recording since its introduction), but I think the feature is clever, and it’ll come in handy for long, information-dense calls. Also in Mail, Smart Reply will scan emails for questions, then prompt a user to answer each one so they don’t miss an important detail. These prompts are in the form of Yes/No questions presented in a modal sheet, and tapping on a suggestion automatically writes the answer into the email.
Safari’s summarization feature, however, is destined to be the most used: Near the Reader icon in the toolbar, users can choose to quickly summarize an article to receive the gist of it. These summaries are created through Reader Mode — the Safari view which allows users to read a clutter-free version of an article — and rely on Apple’s models to provide quick summarization. For once, it’s nice to see an AI tool that interfaces with the web and doesn’t disincentivize going to websites and giving publishers traffic. This is easily one of the best use cases for AI tools, and I’m glad to see Apple embracing it.
More broadly, the central idea of Apple Intelligence begins to crystallize in its text summarization features: AI assistants — whether Siri, Google Assistant, or Alexa — have always required active engagement to be helpful. Someone asks an assistant a question, but a good human assistant never needs to be asked for help. Assistants should work passively, helping with busy work nobody wants to do. Summarizing notifications, replacing (worthless) two-line previews in the email inbox with one-sentence blurbs, filtering unnecessary messages and whittling them down to the bare minimum, and quickly drafting call notes are all examples of Apple entering into the lives of millions to assist with tasks many don’t even know need to be done. Nobody thinks the two-line message previews in Mail are useless, only because that is how email has appeared since its conception. Now, there’s no need for a subject or a preview whose first line is almost always a greeting — AI can make email more enjoyable and quick.
While the new Siri features are, as I said before, examples of active assistance, i.e., a user must first ask for help, Apple Intelligence is also meant to proactively involve itself in its users’ lives — and come to think of it, that’s logical. AI might flub or make up answers confidently, but so would a person; nobody would discard an email based on the summary alone. They’d use it to determine whether it’s worth reading immediately or later. Similarly, by passively engaging users, the system decreases human reliance on AI while simultaneously making a meaningful difference in everyday scut work. This should be a core tenet of AI that other companies take note of — while one might think these features are just text summarization, they add up to a much broader theme. Apple, chiefly, is leveraging its No. 1 advantage over OpenAI or Microsoft: It uniquely can blend into people’s lives passively, without interruption or nuisance, while also providing a helpful service. I know the phrase gets overused, but this is something only Apple could do.
Writing Tools

Apple continued its practice of “sherlocking”2 by practically adding a supercharged version of Grammarly to every system-wide native text field in iOS and macOS. What Apple means by “native text field” is unclear, but I have to assume it’s referring to fields made with Apple’s own developer frameworks for writing text. Examples presented onstage as supporting Writing Tools, the suite of features, include Bear, Craft, and Apple’s own Pages, Notes, and Keynote. The suite encompasses a summarization tool for users to have their own text summarized, as well as tools to write key bullet points and create tables or lists out of data in paragraph form — a feature I think many will find welcome because of how arduous tables can be to put together. The two grammar correction features allow users to have the system proofread and rewrite their text — both tools use the language models’ reasoning capabilities to understand the context of the writing and modify it depending on a user’s demands.
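If my assumption about “native text field” is right, adoption should be close to free for most apps. Here is a minimal sketch under that assumption: a plain SwiftUI view using only the stock TextEditor, with no Writing Tools-specific code at all.

```swift
import SwiftUI

// A plain SwiftUI text editor. The assumption (mine, based on Apple's framing) is
// that a stock system text view like this surfaces Writing Tools automatically on
// supported devices, with no extra adoption work by the developer.
struct DraftView: View {
    @State private var draft = "rough, casual cover letter text…"

    var body: some View {
        TextEditor(text: $draft)
            .font(.body)
            .padding()
    }
}
```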
One humorous example Apple presenters highlighted onstage was rewriting a résumé more professionally when it was originally casual, but it perfectly illustrated the benefits of having a system-wide, contextually aware writing assistant within cursor’s reach. The proofreading feature underlines parts of the writing that may have grammar mistakes, similar to Grammarly, and suggests how to correct them — Federighi highlighted how all suggestions can be accepted with just one tap or click, too. If none of the pre-made suggestions in Writing Tools are applicable, a user can describe what kind of changes they’d like Apple Intelligence to make using the “Describe your change” item at the top of the menu, which launches a chatbot-like interface for text modifications. The feature set seems well thought-out, and I think it’s a major boon to have a smart, aware grammar checker built into operating systems used by billions.
While Apple’s foundation models — which run on-device and in the cloud via Private Cloud Compute depending on the complexity and length of the text, I surmise — are programmed to assist with modifying writing the user has already generated, ChatGPT was demonstrated as able to write stories and other creative works with just the click of a button and a prompt in the Writing Tools pane. People who use Apple devices shouldn’t have to go to the ChatGPT app or website anymore to have OpenAI’s chatbot write something or help them conduct research because it’ll be built into the system. I think this is the most useful and clear example of Apple’s ChatGPT integration shown in the keynote. Apple is transparent about when it is sending a request to ChatGPT; even if a user explicitly asks for ChatGPT to handle the query, the system prompts them one more time to confirm and tells them that ChatGPT’s work may contain errors due to hallucinations. Still, I think this specific, intentional integration is more helpful than building a full-on GPT-4o interface into iOS, for instance.
Apple evidently wants to draw a boundary between ChatGPT and its own foundation models while concurrently having the partnership jibe well with the rest of its features. It doesn’t feel out of place, but it’s clearly an afterthought; I could easily envision Apple Intelligence without OpenAI’s help. Still, even with all of this de-emphasis, OpenAI seems more than willing to trade free service for Apple customers for the exposure that comes with its logo appearing in front of billions. OpenAI wants to be to generative artificial intelligence what Sharpie is to permanent markers, and since Google is the company’s largest competitor, it’s working on a “the enemy of my enemy is my friend” philosophy. As I’ve said before, OpenAI seems to be in the “spend venture capital like it doesn’t matter” phase of its existence, which is bound to be time-limited, but for now, Apple’s negotiators struck an amazing deal — free.
Part of me wants to think ChatGPT isn’t Apple Intelligence, but nevertheless, it is — it just happens to be a less-emphasized part of the overall package. I don’t mind that: I’m impressed Apple is able to handle this much of the processing by itself. In fact, based on what has been shown this week, I’m almost certain Apple will soon3 drop OpenAI as a partner and go it alone once its own models can generate full blocks of text, something it currently is not very confident in. But since Apple has offloaded the pressure of text generation, it has also coincidentally absolved itself of the difficult task of content moderation. As I wrote earlier in this article, Apple Intelligence will not refuse to improve a text, no matter how egregious or illegal it may be, because Apple understands that it is not the fault of the chatbot if the user decides to write something objectionable. I favor this approach, and while some naysayers might blame the company for “rogue” responses, I think the onus should be placed on the prompters rather than the robot. If ChatGPT were given the task of summarizing everything a user wrote, it would fail, because the safety engineering is hard-coded into the model. With Apple’s own LLMs, it isn’t.
Image Playground and Genmoji

In the last section, I commended Apple for taking a more laissez-faire approach to content moderation, something I usually wouldn’t commend a technology giant for. I think it is the responsibility of a multi-trillion-dollar corporation like Apple to minimize the social harm its products can do, which is why I’m both repulsed and irritated by its new image generation features, called Image Playground and Genmoji. Both features are similar in that they (a) primarily handle prompting, i.e., they write a detailed prompt for the AI image generator from the user’s simple request; and (b) refrain from creating human-like imagery because of its high susceptibility to misuse. Both features are available system-wide but were primarily advertised in Messages due to their expressiveness, which leads me to believe that Apple felt pressured to create an image generation feature and thought of a semi-sensible place to put it at the last minute. While Genmoji — terrible name aside — was leaked by Mark Gurman of Bloomberg earlier, Image Playground is novel, and information about it is scarce.
Genmoji — a portmanteau of “generated” and “emoji” — generates AI emojis based on a user’s prompt, then renders them like any other text so they fit in with the words and emojis in a text message. I believe these synthetic emojis are only available in Messages because they aren’t part of the Unicode emoji standard, so Apple has to do the work to make them render properly and fit within the bounds of text as part of its own proprietary iMessage protocol. If a person sends a Genmoji to an Android user, it arrives as a normal image attached to the text message. A user can describe any combination of existing emojis, or even entirely new ones, such as a giant cucumber. Genmoji can also be used to create cartoon-like images of people in one’s contacts, so a user can ask for a contact “dressed like a superhero,” for instance. Genmoji typically creates a few icons from a prompt so a user can choose which one they’d like to use.
Image Playground is Apple’s version of OpenAI’s DALL-E or Midjourney: Users can create a “novel” image based on their description and choose from a variety of prompt suggestions that appear around a colorful bubble interface surrounding the generated image. The feature verges on a one-to-one copy of other AI image tools on the market, but perhaps with a more appealing, easy-to-use interface that proactively suggests additions to prompts. Users can also choose themes, such as seasons, to further customize the image — from there, they can save it to Photos or Files, or copy it. Image Playground isn’t limited to Messages and can be integrated into third-party apps via an application programming interface Apple has provided to developers. There is also a dedicated Image Playground app that will be pre-installed on iOS devices for people to easily describe, modify, generate, and share AI images. Users can also circle pictures they’ve drawn and turn them into AI-generated pieces with a feature called Magic Wand, which is first coming to Notes. Like Genmoji, images made using Image Playground can resemble a person, depending on data derived from personal photos.
The entire concept of AI-generated photography is abhorrent to me and many others, especially those who work in creative industries or who draw artwork themselves. While Apple has negated the safety concerns that arise from AI-generated artwork — the predefined styles are intentionally not photorealistic, and each image carries internal metadata indicating it was generated by AI — it has not put to rest the concerns of artists alarmed by AI’s cheapening of their industry. Frankly, AI-generated artwork is disturbing, unrealistic, and not elegant to look at. It looks shoddily designed and of poor quality, with lifeless features and colors. If AI images looked like people had made them, a different problem would be at the forefront of the conversation, but currently, AI images are cheap, filthy creations. They’re not creative; they instead disincentivize and discourage creativity while inundating the internet with deceptive photos that trick people and feel spammy and artificial.
It’s tough to describe the feelings AI images cultivate, but they aren’t pleasant. To add insult to injury, Apple hasn’t provided any information as to how its models were trained, leaving open the possibility that real artists’ work was used without permission.4 I expect this kind of behavior from companies like OpenAI and Google, which have both habitually degraded the quality of good artwork and information, but not from Apple, whose late founder, Steve Jobs, proclaimed Apple was at the intersection of technology and the liberal arts. The company has slowly but surely drifted away from the roots that made it so reputable in the first place, and it’s disheartening to observe. AI-generated art, whether it’s presented with a cute bow and ribbon or on a desolate webpage littered with obnoxious advertisements, is neither technology nor liberal arts — it is slop, a word that at this rate should probably win Word of the Year.
I’m less concerned about the social justice angle many seem to have staked their positions on and more about the feelings this feature creates. Apple users, engineers, and designers all share the conviction that software should be beautiful, elegant, and inspiring, but oftentimes, the wishes of shareholders eclipse that essential ideal. This is one such eclipse — a misstep in the eyes of engineers and designers, but a boon to the pockets of investors. Apple has calculated that the potential uproar from a relatively small slice of its user base is worth enduring for the deep monetary incentives, and that math works out for the C-suite. Will Image Playground and Genmoji change the way people use and feel about their devices? Possibly, maybe for the better or maybe for the worse — but what they will do with resolute certainty is upend the value of digital artwork.
Photos

Apple, alongside all of its image generation efforts, also brought updates to photo editing and searching, similar to Google in May. Users can search their photo libraries by “describing” what they’re looking for using natural language: This differs from Apple’s current implementation where users can search for individual items like lakes, trees, etc., because now people can combine multiple queries and refine searches by adding specific details. Think of it as a chatbot that can use visual processing to categorize photos, because that’s exactly what it is. People can also generate videos called “memory movies,” short clips made from specific moments created by AI, typically complemented with music and effects. The Photos app already creates Memories, which are similar, but this time, users can describe exactly what they’d like the video to be of. Examples include trips, people, or themes from images.
The most appreciated feature ought to be the Clean Up tool, which works exactly like Google’s Magic Eraser, first introduced with the Pixel 6 and 6 Pro in 2021. Apple Intelligence automatically identifies objects and people in the background of shots that might be distracting and offers to remove them from within the Photos app. Users can then circle the distraction, and the image will be recreated just as if it weren’t there. Notably, this does not compete with Adobe’s Generative Fill or other similar features — it doesn’t create what wasn’t already there. As I wrote earlier, Apple’s features aren’t whiz-bang demonstrations; they’re practical applications of AI in the most commonly used apps. I’d assume these features will be powered solely by on-device processors, but they work on photos taken on any camera, not just an iPhone.
Unlike photo generation, photo editing is an area in which generative AI can assist with the more arduous work. Photoshop has been able to remove objects from the backgrounds of photos for decades, but it requires skills and a large, powerful computer. Now, those powerful computers are in the pockets of millions, and thus, there is no need to learn these skills except for when the result truly matters. For the smallest of touch-ups, so many people are going to be empowered by having an assistant that can perform these tasks automatically. Finding photos has always been hard, but now, Apple has essentially added a librarian to the photo library. Editing photos previously required skill and know-how, but now, it’s just one tap. It’s little things like these that make the experience of using technology more delightful, and I’m glad to see Apple finally embracing them.
What Apple announced on Monday might not sound revolutionary at first glance, but keen observers will realize that the announcements and their features change how people use their devices. Technology shouldn’t do my artwork and writing for me so I can do the dishes — it should do the dishes so I can do my writing and artwork. Apple Intelligence isn’t doing anyone’s dishes yet, but it’s one step closer: It’s doing the digital version of the dishes. Apple Intelligence subtly yet conspicuously weaves itself into every corner of Apple’s beloved operating systems for a reason: People shouldn’t have to learn how to use the computer; the computer should learn from the user. For the first time, Apple’s computers are truly intelligent. Yes, I believe the company has misstepped in certain areas, like its image generation features, but the broad, overarching theme of Monday was that the computer is now learning from humans. The intelligence no longer lives in a browser tab or an app — it’s woven into the devices we carry with us everywhere. The future is now, or, I guess, whenever Apple Intelligence goes into beta later this year.
1. Apple said ChatGPT Plus subscribers can sign in with their accounts to gain access to quicker, better models. As I’ve said earlier, this partnership feels a lot like Apple and Google’s deal to bring Google Search, Maps, and YouTube to the iPhone. ↩︎
2. “Sherlocked”: “The phenomenon of Apple releasing a feature that supplants or obviates third-party software…” ↩︎
3. I don’t have a timeline for this prediction, but I believe it’ll happen within the next few years, especially if OpenAI demands payment when it runs out of VC money. That time is coming soon, and I think Apple will be ready to ditch both Google Gemini — if it adds it in the first place; Federighi didn’t confirm anything — and ChatGPT as soon as it owes either company enormous royalties. Apple wants to be independent eventually, unlike with search engines. See: iCloud Mail or Apple Maps. ↩︎
4. Apple says Apple Intelligence was trained on a mix of licensed and public data from the internet. That public data most likely includes most websites, since the user agent to disallow was only made public after Monday. Dan Moren of Six Colors wrote about how to disable Applebot-Extended on any website to prevent Apple from scraping its contents. ↩︎
Gurman: Apple AI to Be Called ‘Apple Intelligence’
Mark Gurman, leaker extraordinaire, reporting for Bloomberg:
At its annual Worldwide Developers Conference Monday, Apple will announce plans to deeply integrate AI into its major apps and features — all while reiterating a commitment to privacy and security.
The company’s new AI system will be called Apple Intelligence, and it will come to new versions of the iPhone, iPad, and Mac operating systems, according to people familiar with the plans. There also will be a partnership with OpenAI that powers a ChatGPT-like chatbot.
As John Gruber, the author of Daring Fireball, wrote on Threads, I’m keen to see where this artificial intelligence chatbot will be placed within the operating systems. I speculated in May that the partnership might simply consist of a pre-installed ChatGPT app, but the more I hear about the deal, the more I think it’ll be integrated within iOS. I don’t think it’ll be a part of Siri, however — Apple won’t want to destroy its own brand just to replace it with OpenAI’s chatbot. This is a curious aspect of the deal with OpenAI, though, because there aren’t many places I’d want an AI chatbot other than in an app or on the web, and OpenAI already has both of those cases covered by itself. My final long-shot guess is that it’s built into Spotlight or the URL field in Safari, which already acts as a pseudo-search engine.
“Apple Intelligence,” as off-putting as the name may sound, is quite clever. Apple knows that people will still call it “AI,” so it might as well make the name a play on the acronym. What’s more noteworthy is that Apple will presumably not be renaming Siri entirely — and neither will it use the Siri name for its AI products. Apple Intelligence is separate from Siri, yet Siri uses Apple Intelligence to answer questions. I assume Apple will market the new version of Siri that The Information leaked in May as “Siri powered by Apple Intelligence” because it does not want to destroy the fame of Siri — it just wants to improve it. Confusingly, Siri has always been used as a general moniker for Apple’s machine learning technology, like Siri Suggestions, which doesn’t even involve the voice assistant, and that makes the situation all the more peculiar. Here’s how I’d draw the chart: Siri is machine learning and Apple Intelligence is generative artificial intelligence — Siri uses Apple Intelligence, but not the converse. Apple Intelligence is the consumer name for Ajax.
The new capabilities will be opt-in, meaning Apple won’t make users adopt them if they don’t want to. The company will also position them as a beta version. The processing requirements of AI will mean that users need an iPhone 15 Pro or one of the models coming out this year. If they’re using iPads or Macs, they’ll need models with an M1 chip at least.
I hope this will calm the inevitable furor from conservative technology users, such as those who still willingly use Mastodon as their primary social network. Microsoft’s new Recall feature, available on its new Copilot+ PCs, has sparked anger from the community over how the feature is enabled by default on all compatible machines, so Apple’s choice to label Apple Intelligence as a beta and have it disabled by default is a good one. (On Friday, Microsoft said it would make the feature opt-in.) I would also guess that Apple will advertise the AI features somewhere in the operating system, such as when setting up a new device for the first time, because they’re the new shiny highlights of Monday’s developer conference. Apple doesn’t want to hide them; it just wants to make them easy to ignore.
The processing requirements are also understandable, and I don’t think anyone was seriously holding out hope that Intel Macs from five years ago would be able to run the new AI features. I’m even surprised M1 Macs are supported. I think macOS 15 — the upcoming version — will finally begin to sunset Intel Macs, though I don’t think they’ll lose support entirely until next year. On the iPhone side, I can already see headlines and posts on social media about how Apple is “ripping off” its consumers by requiring the latest-generation iPhones to run the new features, but I truly do think the A17 Pro is required to run large language models — the technology that powers generative AI — on-device due to memory limitations.
Xcode, Apple’s software for developing apps, is getting a big AI infusion too. It will work similarly to Microsoft Corp.’s GitHub Copilot, which can complete code for programmers automatically. Though Apple has already been using this new developer tool internally, it’s unlikely to release it in its full form to third-party developers until next year.
This is extremely exciting, though I also would like some kind of chatbot interface, perhaps developed by OpenAI, that explains and elaborates on Apple’s Swift and SwiftUI developer documentation. Beta versions of Apple’s latest software development kits are often under-documented and a chatbot could prove quite useful.
The Settings app, which has remained generally unchanged since the first version of the iPhone, is getting updated on iOS, iPadOS, and macOS with a focus on improved navigation, better organization, and more reliable search.
The System Settings app on macOS is one of the worst pieces of software Apple has ever produced, and it is shameful that it took the company two years to rectify it. “Better organization” might be helpful, but there are also many interface tweaks that must be made to SwiftUI on the Mac to make the app feel more intuitive. For example, when typing in a text field in System Settings in English, a left-to-right language, the field is aligned to the right. How is that acceptable? The entire app needs a gut-and-redo with a focus on a normal organizational structure, less modality, and customizable window sizes.
Apple is launching a Passwords app for iOS 18, iPadOS 18, and macOS 15 that will offer an alternative to the 1Password and LastPass services. This will essentially be an app version of the company’s long-existing iCloud Keychain feature, which is currently hidden in the Settings app.
At this point, I’m desperate to switch away from 1Password, so I’m excited to see what Apple has created here.
‘How the Humane Ai Pin Flopped’
Tripp Mickle and Erin Griffith, reporting for The New York Times:
Days before gadget reviewers weighed in on the Humane Ai Pin, a futuristic wearable device powered by artificial intelligence, the founders of the company gathered their employees and encouraged them to brace themselves. The reviews might be disappointing, they warned.
Humane’s founders, Bethany Bongiorno and Imran Chaudhri, were right. In April, reviewers brutally panned the new $699 product, which Humane had marketed for a year with ads and at glitzy events like Paris Fashion Week. The Ai Pin was “totally broken” and had “glaring flaws,” some reviewers said. One declared it “the worst product I’ve ever reviewed.”
It is literally the most embarrassing thing for any company to assume the reviews for its product are going to be “disappointing” before reviewers even publish their work. That’s how confident Humane was: not very confident at all. No good product maker would sell a product it thinks is sub-par from the get-go, but of course, Humane did just that because the Ai Pin is a cheap grift designed to please its venture capitalist investors — and it knows that. Humane never cared about making a product; it just cared about making money. Speaking of money:
About a week after the reviews came out, Humane started talking to HP, the computer and printer company, about selling itself for more than $1 billion, three people with knowledge of the conversations said. Other potential buyers have emerged, though talks have been casual and no formal sales process has begun.
Humane retained Tidal Partners, an investment bank, to help navigate the discussions while also managing a new funding round that would value it at $1.1 billion, three people with knowledge of the plans said.
Humane now wants to sell itself to HP because it has done the job it promised to investors: to deliver a product and sell it successfully. “Successfully” is really only defined in the eye of the beholder, but I assume Humane just thinks it means “scam enough people out of $700 to make a dent in the balance sheets.” Whatever it is, Humane now needs to make money to pay its investors and severance for its employees, and the owners will book it into the woods with whatever is left of the money pot. It’s a classic failed Silicon Valley startup. But why didn’t Humane profit earlier, and why isn’t it profitable now?
As of early April, Humane had received around 10,000 orders for the Ai Pin, a small fraction of the 100,000 that it hoped to sell this year, two people familiar with its sales said. In recent months, the company has also grappled with employee departures and changed a return policy to address canceled orders. On Wednesday, it asked customers to stop using the Ai Pin charging case because of a fire risk associated with its battery.
That explains it. To be profitable, or to meet any standard of success, the company needed to make a good product. Instead, the Humane Ai Pin explodes while it’s charging. On top of that, it’s essentially worthless, slow, and expensive, so nobody other than a few enthusiasts bought one. Humane only fulfilled 10 percent of its self-set quota, so to speak, and that wasn’t enough to be profitable. Profit is a byproduct of making a good product; it is not the product itself, and it certainly shouldn’t be the goal. Humane’s primary objective, as I stated earlier, was not to make a good product — it already knew its device was garbage — but to make money, and when a company works with that ethos, it’s designed to fail.
Many current and former employees said Mr. Chaudhri and Ms. Bongiorno preferred positivity over criticism, leading them to disregard warnings about the Ai Pin’s poor battery life and power consumption. A senior software engineer was dismissed after raising questions about the product, they said, while others left out of frustration.
That doesn’t surprise me, because I know why this gadget was even created in the first place: A Buddhist monk led Humane’s founders to an angel investor who persuaded them to build a miraculous phone replacement that would later become the Ai Pin. This is a true story. With directionless, la-la-land founders like that, the project was designed to fail. I don’t think the problem is toxic positivity, as The Times says — instead, I believe Chaudhri and Bongiorno’s immense egos prevented staffers from raising questions about the device’s premise. It took the general public a day of mulling over Humane’s horrible launch video to realize the company was dead in the water, but Humane’s intelligent workers weren’t able to say the same after five years of work? I doubt it.
One was the device’s laser display, which consumed tremendous power and would cause the pin to overheat. Before showing the gadget to prospective partners and investors, Humane executives often chilled it on ice packs so it would last longer, three people familiar with the demonstrations said. Those employees said such measures could be common early in a product development cycle.
Never work for a company whose founders are egotistic.
In January, Humane laid off about 10 employees. A month later, a senior software engineer was let go after she questioned whether the Ai Pin would be ready by April. In a company meeting after the dismissal, Mr. Chaudhri and Ms. Bongiorno said the employee had violated policy by talking negatively about Humane, two attendees said.
That is an unbelievably ridiculous policy, one that I don’t think even Apple has in place. Surely letting an employee go for speaking negatively, internally, about the company they work for is illegal.
Humane, as of now, is a disaster. Ai Pins are internally combusting, the company is in financial crisis, its employees are dissatisfied, its founders are listening to Buddhist monks while reprimanding smart people, and it’s now trying to sell to a printer company that has been in the news for disabling printers when customers don’t want to pay for subscriptions. All of this, and Humane is still selling its tiny stovetop for $700 while charging users $25 a month. And if Humane does sell to HP and closes up shop, every last Humane Ai Pin will become e-waste, because without the backend subscription Humane operates, the pin is a paperweight.
Now, please, no more about this company.
Apple’s AI Ambitions Become More Clear
Mark Gurman has spent the past few months slowly leaking Apple’s artificial intelligence ambitions and the features to be revealed at June’s Worldwide Developers Conference, but his report on Sunday in his Power On newsletter for Bloomberg is the most complete picture we’ve seen yet.
Apple is preparing to spend a good portion of its Worldwide Developers Conference laying out its AI-related features. At the heart of the new strategy is Project Greymatter — a set of AI tools that the company will integrate into core apps like Safari, Photos, and Notes. The push also includes operating system features such as enhanced notifications.
I’m curious to learn more about “enhanced notifications” and how they’ll be powered by AI. Something I want Apple to be careful with is shoving AI into everything — if something doesn’t need AI, it shouldn’t use it. “Project Greymatter” is also an interesting name we haven’t heard before. (I have no idea what “Greymatter” refers to; Wikipedia says it’s a type of blogging software developed “by Noah Grey in November 2000.”)
The system will work as follows: Much of the processing for less computing-intensive AI features will run entirely on the device. But if a feature requires more horsepower, the work will be pushed to the cloud.
Apple is bringing the new AI features to iOS 18 and macOS 15 — and both operating systems will include software that determines whether a task should be handled on the device or via the cloud.
I’m glad we’re starting to see some clarification on which tasks will be allocated to the on-device chips and which ones will be handled by cloud infrastructure, powered by Apple’s in-house M2 Ultra processors. I assume the “software” that delegates tasks is just an internal daemon that intelligently determines which operations are processor-intensive enough to warrant the extra complexity and network overhead of sending data to the cloud, but it’s also interesting to know that specific tasks aren’t always set to run on-device or in the cloud. I had assumed tasks Apple thinks are less intensive from the get-go — like summarization of articles — would be hard-coded to always use internal processors.
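For what it’s worth, here is the kind of routing logic I imagine that daemon performing. This is purely my own sketch; none of these types or thresholds come from Apple. It simply captures the split Gurman describes, where small, personal-context work stays on-device and long or knowledge-heavy requests go to the cloud.

```swift
// Hypothetical sketch of on-device vs. cloud task routing. Not Apple API.
enum ExecutionTarget {
    case onDevice
    case privateCloud
}

struct AIRequest {
    let estimatedTokens: Int        // rough size of the prompt plus expected output
    let needsWorldKnowledge: Bool   // general-knowledge queries can't be answered locally
}

func route(_ request: AIRequest, hasNeuralEngine: Bool) -> ExecutionTarget {
    // Older hardware without a capable Neural Engine can't run the model at all.
    guard hasNeuralEngine else { return .privateCloud }
    // Long or knowledge-heavy requests exceed what a small on-device model handles well.
    if request.needsWorldKnowledge || request.estimatedTokens > 2_000 {
        return .privateCloud
    }
    return .onDevice
}
```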
One standout feature will bring generative AI to emojis. The company is developing software that can create custom emojis on the fly, based on what users are texting. That means you’ll suddenly have an all-new emoji for any occasion, beyond the catalog of options that Apple currently offers on the iPhone and other devices.
I have two emoji requests: a side-eye emoji and a chef’s kiss emoji. That being said, this rumored feature feels like a gimmick — I don’t think these will actually be legitimate emojis available from the keyboard because they’d all have to be part of the Unicode standard to be viewable on all devices. Instead, they’ll probably just be stickers like the iMessage stickers currently available, exportable to PNGs by dragging and dropping.
Another fun improvement (unrelated to AI) will be the revamped iPhone home screen. That will let users change the color of app icons and put them wherever they want. For instance, you can make all your social icons blue or finance-related ones green — and they won’t need to be placed in the standard grid that has existed since day one in 2007.
I’m having a tough time understanding how this will work. Does that mean users will finally be able to change app icons to whatever image they’d like, just like on Android? If so, I’m excited. But if this feature simply adds a filter to developers’ existing app icons, it’s underwhelming. (I’ve also never understood the craze for being able to place apps anywhere on the Home Screen, but I’m sure someone is excited about it.)
A big part of the effort is creating smart recaps. The technology will be able to provide users with summaries of their missed notifications and individual text messages, as well as of web pages, news articles, documents, notes, and other forms of media.
Gurman indicates that all of these features will be powered by Apple’s own bespoke large language model, called “Ajax” internally — though it is unclear if that is the final name; I don’t think it’s very Apple-esque to give the underlying technology a name other than “Siri Intelligence” or something similar — while OpenAI’s generative pre-trained transformer, the one that powers ChatGPT, will only be used to power a chatbot, which Apple hasn’t been able to develop yet. As Gurman writes:
There’s also no Apple-designed chatbot, at least not yet. That means the company won’t be competing in the highest-profile area of AI: a market that caught fire after OpenAI released ChatGPT in late 2022.
Though some of Apple’s executives are philosophically opposed to the idea of an in-house chatbot, there’s no getting around the need for one. And the version that Apple has been developing itself is simply not up to snuff.
The solution: a partnership. On that front, the company has held talks with both Google and OpenAI about integrating their chatbots into iOS 18. In March, it seemed like Apple and Google were nearing an agreement, and people on both sides felt like something could be hammered out by WWDC. But Apple ultimately sealed the deal sooner with OpenAI Chief Executive Officer Sam Altman, and their partnership will be a component of the WWDC announcement.
I assume this excerpt is Gurman insinuating the deal between Apple and OpenAI has officially been signed, and that Altman will be presenting the partnership akin to how Hans Vestberg, the chief executive of Verizon, announced the 5G partnership between Verizon and Apple during Apple’s iPhone 12 “Hi, Speed” event in October 2020. I am curious about what a chatbot “built into” iOS 18 means — it’s not powering Siri, as Gurman says in the newsletter, and OpenAI already has ChatGPT apps for iOS and macOS that are native and reliable. How much more integrated could the chatbot be? Will it be contextually aware, similar to Google’s “Circle to Search” feature, or does this just mean the ChatGPT app will be pre-installed on new iPhones, akin to the first YouTube app?
It might also be that ChatGPT won’t generate Siri’s answers per se, but will instead be used to create answers when Siri deems a query too complex, much as Apple partnered with Wolfram Alpha in Siri’s earliest days. That would be potentially interesting, but it also feels like a step back for Apple. (I would take anything that makes Siri better, though.)
Altman has grown increasingly controversial in the AI world, even before a spat last week with Scarlett Johansson. OpenAI also has a precarious corporate structure. Altman was briefly ousted as CEO last year, generating a crisis for employees and its chief backer, Microsoft.
In other words, Apple can’t be that comfortable with OpenAI as a single-source supplier for one of iOS’s major new features. That’s why it’s still working to hash out an agreement with Google to provide Gemini as an option, but don’t expect this to be showcased in June.
That secondary agreement with Google — which apparently has not been signed yet — is very unusual, and I’m surprised OpenAI even agreed to the possibility. I assume the contract Apple and OpenAI signed is non-exclusive, meaning Apple can partner with any other company it wants. Even though Gurman cites Apple executives’ discomfort with the unpredictability of Altman’s company, I don’t think users will care. Apple wants the best technology to be available to its customers, and OpenAI makes the best LLMs — not Google. Yes, Apple prefers reliability and “old faithful” over flashy new companies — which is why it would make sense for it to extend the partnership it already has with Google for Google Search — but “reliability” also means product reliability.
Google’s Gemini suite of AI products is anything but reliable, generating racially diverse Nazis and pulling Reddit answers into search summaries. AI, no matter who makes it, has generated significant controversy over the past few years, and whichever company Apple partners with, it will continue to generate media headlines. If I were Apple, I’d opt for the company with the better product, because it’s not as if Google’s public relations record has been excellent either.
Also, consumers who aren’t attuned to the news every day aren’t going to know whenever there is a new feud at OpenAI. Users want better products, and if Gemini tells Apple users to eat rocks and make pasta with gasoline, it’s a poor reflection on Apple, not Google.
Google Search Summaries Tell People to Eat Glue
Jason Koebler, reporting for 404 Media:
The complete destruction of Google Search via forced AI adoption and the carnage it is wreaking on the internet is deeply depressing, but there are bright spots. For example, as the prophecy foretold, we are learning exactly what Google is paying Reddit $60 million annually for. And that is to confidently serve its customers ideas like, to make cheese stick on a pizza, “you can also add about 1/8 cup of non-toxic glue” to pizza sauce, which comes directly from the mind of a Reddit user who calls themselves “Fucksmith” and posted about putting glue on pizza 11 years ago.
Here is what I wrote about Google’s artificial intelligence right after the company’s I/O conference earlier in May:
The summaries are also prone to making mistakes and fabricating information, even though they’re placed front-and-center in the usually reliable Google Search interface. This is extremely dangerous: Google users are accustomed to reliable, correct answers appearing in Google Search and might not be able to distinguish between the new AI-generated summaries and the old content snippets, which remain below the Gemini blurb. No matter how many disclaimers Google adds, I think it is still too early to add this feature to a product used by billions. I am not entirely pessimistic about the concept of AI summaries in search — I actually think this is the best use case for generative artificial intelligence — but in its current state, it is best to leave this as a beta feature for savvy or curious users to enable for themselves.
Google, in a statement to The Verge, claimed these incidents are overblown — that they are isolated and appear only in results for uncommon queries. (Sundar Pichai, Google’s chief executive, said the same in an interview with Nilay Patel, The Verge’s editor in chief, although in a slightly backhanded way.) Meghann Farnsworth, a spokesperson for Google, said the company believes the mistakes come from “generally very uncommon queries,” even though that theory has been proven false time and time again. Generative artificial intelligence is prone to making mistakes because of the way large language models — the technology that powers generative AI — are built. Google knows it cannot solve that problem singlehandedly without further research, so it labels the AI-generated blurbs at the top of Google search results as “experimental.”
Google’s mission when it announced that it would bring AI search summaries to all U.S. users by the end of the year was not to improve search for anyone — it was to signal to shareholders that the company’s AI prowess hasn’t been diminished by OpenAI, its chief rival. All press might be good press, but I truly don’t think this many incidents of Google’s AI flubbing the most basic of tests is good for the company’s image. Google is known for being reputable and trustworthy, and it has shattered that painstakingly built reputation in just a matter of weeks. The public’s perception of Google, and of Google Search in particular, has already been in a steady decline for the past few years, and what people from all over the internet have surfaced over the past week has further cemented the idea that Google’s main product is no longer as useful or capable as it once was.
These are not isolated incidents, and when representatives for Google have been confronted with that fact, they have never once tried to digest it and make improvements, as any sane, fast-moving company with a clear and effective hierarchical structure would. Google does not have effective leadership — proven by Pichai’s nonsensical answer to Patel — so it is instead deflecting the blame and chastising users for typing in “uncommon queries.” Google itself has boasted about how thousands of never-before-seen queries are typed into Google each day, yet now it is unable to manage its star, most popular product the way it once did. Google Search is not dying — Bing and DuckDuckGo had an outage on Thursday and hardly anyone noticed — but it is suffering from incompetent leadership.
For now, Google needs to take the financial and perhaps emotional hit and pull search summaries from public view, because recommending that people eat glue is beyond ridiculous. And I think the company needs a fundamental reworking of its organizational structure to address the setbacks and issues that are preventing employees from voicing their concerns. The most employees have been able to do is add a “Web” filter to Google Search so users can view just blue links with no AI cruft. There is no more quality control at Google — just like a Silicon Valley start-up — and there is also no fast-paced innovation, unlike a Silicon Valley start-up. Google is borrowing the worst limitations of small companies and combining them with the operational headaches of running a large multinational corporation. That can only be attributed to ineffective leadership.
Microsoft Announces ‘Copilot+’ PCs
Umar Shakir, reporting for The Verge:
Microsoft brought Windows, AI, and Arm processors together at a Surface event on May 20th…
The big news of the day was Microsoft’s new class of PCs, dubbed Copilot Plus PCs. These computers have processors with NPUs built in so they can do more AI-oriented tasks directly on the computer instead of the cloud. The AI-oriented tasks include using a new Windows feature called Recall.
Microsoft also announced a new Surface Laptop and Surface Pro Tablet powered by Qualcomm’s Snapdragon X processors. That means they should be thinner, lighter, and have better battery-life while also handling AI and processor heavy tasks. And Microsoft wasn’t the only one at the event showing off new laptops. HP, Asus, Lenovo, Dell, and other laptop makers all have new Copilot Plus PCs.
An important thing to note is that “Copilot+” is not a new software feature — it’s the brand name for Microsoft’s new class of computers, many of which aren’t even Microsoft’s own Surface machines. “Copilot+” computers have specification requirements for memory, storage, and neural processing units, or NPUs: 16 gigabytes of RAM, 256 gigabytes of storage, and an NPU rated at 40 trillion operations per second to run the artificial intelligence features built into the latest version of Windows. Those AI features are called “Copilot,” a brand name that has been around for about a year. Here is Andrew Cunningham, reporting for Ars Technica:
At a minimum, systems will need 16GB of RAM and 256GB of storage, to accommodate both the memory requirements and the on-disk storage requirements needed for things like large language models (LLMs; even so-called “small language models” like Microsoft’s Phi-3, still use several billion parameters). Microsoft says that all of the Snapdragon X Plus and Elite-powered PCs being announced today will come with the Copilot+ features pre-installed, and that they’ll begin shipping on June 18th.
But the biggest new requirement, and the blocker for virtually every Windows PC in use today, will be for an integrated neural processing unit, or NPU. Microsoft requires an NPU with performance rated at 40 trillion operations per second (TOPS), a high-level performance figure that Microsoft, Qualcomm, Apple, and others use for NPU performance comparisons. Right now, that requirement can only be met by a single chip in the Windows PC ecosystem, one that isn’t even quite available yet: Qualcomm’s Snapdragon X Elite and X Plus, launching in the new Surface and a number of PCs from the likes of Dell, Lenovo, HP, Asus, Acer, and other major PC OEMs in the next couple of months. All of those chips have NPUs capable of 45 TOPS, just a shade more than Microsoft’s minimum requirement.
These new requirements, as Cunningham writes, essentially exclude most computers with x86 processors made by Intel and Advanced Micro Devices. Microsoft and its partners are instead relying on Qualcomm’s Arm-based Snapdragon processors, which have capable NPUs and are more battery-efficient in laptops, to power their latest Copilot+ computers. Microsoft says its two Arm-based machines, the Surface Laptop and Surface Pro tablet, run up to 58 percent faster than Apple’s newly released M3 MacBook Air, though it didn’t provide specifics on how it measured the Qualcomm chips’ performance. I don’t believe the company’s numbers, especially since it also says the new Surface machines have better battery life than the MacBook Air, which would truly be a feat.
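For reference, the hardware bar is simple enough to express as a checklist. Here is a minimal sketch, in Python, of the published Copilot+ minimums described above; the class, field names, and example machines are my own illustrations, not anything Microsoft ships.

```python
from dataclasses import dataclass

# Published Copilot+ PC minimums, per Microsoft: 16 GB of RAM, 256 GB of storage,
# and an NPU rated at 40 TOPS. (The class and example machines are illustrative only.)
@dataclass
class MachineSpec:
    ram_gb: int
    storage_gb: int
    npu_tops: float

def meets_copilot_plus_minimums(spec: MachineSpec) -> bool:
    return spec.ram_gb >= 16 and spec.storage_gb >= 256 and spec.npu_tops >= 40

# A Snapdragon X Elite-class laptop: its 45 TOPS NPU just clears the 40 TOPS bar.
print(meets_copilot_plus_minimums(MachineSpec(ram_gb=16, storage_gb=512, npu_tops=45)))   # True
# A typical current x86 laptop with a ~10 TOPS NPU falls short, whatever its RAM.
print(meets_copilot_plus_minimums(MachineSpec(ram_gb=32, storage_gb=1024, npu_tops=10)))  # False
```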
The new processors and specifications power new Copilot features in Windows, which will come to Windows 11 — not a new version called Windows 12, as some have speculated — in June. Some of the features run on-device to protect privacy, while others run on Microsoft’s Azure servers, just as they did before. Microsoft announced that it would deploy access to GPT-4o, its partner OpenAI’s latest large language model announced earlier in May, as part of the standard version of Copilot later this year, and it also announced new image generation features in certain apps. The new version of Windows, which includes an x86-to-Arm translator called Prism, has been designed for Arm chips, and Microsoft says it has collaborated with leading developers, such as Adobe, to bring Arm versions of popular apps to the platform. (Where have I heard that before?)
The biggest new software feature exclusive to the Copilot+ PCs is called “Recall.” Here is Tom Warren, reporting for The Verge:
Microsoft’s launching Recall for Copilot Plus PCs, a new Windows 11 tool that keeps track of everything you see and do on your computer and, in return, gives you the ability to search and retrieve anything you’ve done on the device.
The scope of Recall, which Microsoft has internally called AI Explorer, is incredibly vast — it includes logging things you do in apps, tracking communications in live meetings, remembering all websites you’ve visited for research, and more. All you need to do is perform a “Recall” action, which is like an AI-powered search, and it’ll present a snapshot of that period of time that gives you context of the memory…
Microsoft is promising users that the Recall index remains local and private on-device. You can pause, stop, or delete captured content or choose to exclude specific apps or websites. Recall won’t take snapshots of InPrivate web browsing sessions in Microsoft Edge and DRM-protected content, either, says Microsoft, but it doesn’t “perform content moderation” and won’t actively hide sensitive information like passwords and financial account numbers.
What makes Recall special — other than that none of the data it captures is sent back to Microsoft’s servers, which would be both incredibly invasive and entirely predictable for Microsoft — is that it only captures screenshots periodically as work is being done on Windows. Users can go to the Recall section of Windows and simply type a natural-language query, which prompts an on-device LLM to search the library of automatically captured screenshots. The models search text, videos, and images using multimodal functionality, and can even transcribe spoken language via a new feature called “Live Captions,” also announced Monday.
Recall reminds me of Rewind, the Apple silicon-exclusive Mac app touted last year by a group of Silicon Valley entrepreneurs that continuously records one’s Mac screen so an LLM can search everything done on it. That app sparked privacy concerns because the processing was done in the cloud, not on-device, whereas Microsoft has repeatedly stated that no Recall screenshots leave the device. I think it’s neat, but I’m unsure of its practicality.
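To make that architecture concrete, here is a rough sketch of how a Recall-style pipeline could work in principle: periodically capture the screen, extract its text locally, embed it, and answer natural-language queries by semantic similarity. To be clear, this is my own illustration under those assumptions (using the pillow, pytesseract, and sentence-transformers packages), not Microsoft’s implementation, which also indexes images, video, and audio on far more capable hardware.

```python
# A minimal, illustrative Recall-style pipeline: capture -> OCR -> embed -> semantic search.
# This is a sketch of the general idea, not Microsoft's implementation.
# Assumes: pillow, pytesseract (with the Tesseract binary installed), sentence-transformers, numpy.
import time

import numpy as np
import pytesseract
from PIL import ImageGrab
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # small embedding model that runs locally
snapshots = []  # each entry: (timestamp, extracted_text, embedding)

def capture_snapshot():
    """Grab the screen, OCR its text, and store a local embedding of that text."""
    image = ImageGrab.grab()
    text = pytesseract.image_to_string(image)
    embedding = model.encode(text, normalize_embeddings=True)
    snapshots.append((time.time(), text, embedding))

def recall(query: str, top_k: int = 3):
    """Return the snapshots whose text is most semantically similar to the query."""
    q = model.encode(query, normalize_embeddings=True)
    ranked = sorted(snapshots, key=lambda s: float(np.dot(q, s[2])), reverse=True)
    return [(timestamp, text[:200]) for timestamp, text, _ in ranked[:top_k]]

# e.g., call capture_snapshot() on a timer, then later:
# recall("the flight confirmation number I saw this morning")
```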
Live Captions also translates 44 languages into English, whether the content is playing in Windows or the microphones are listening to a conversation, and it processes everything on-device using the NPUs. It transcribes audio and video content from all apps, not just ones that support it — meaning content from every website and program will receive automatic, mostly accurate subtitles. (This is something I hope Apple adds in iOS 18.)
I think Monday’s announcements are extremely intriguing, especially regarding the bombastic claims by Microsoft as to the new AI PCs’ battery life and performance, and I’m sure reviewers will thoroughly benchmark the new machines when they arrive in June. And the new Copilot features — while I’m still not a fan of the dedicated Copilot Key — also seem interesting, especially “Recall.” I can’t wait to see what people use it for.
Scarlett Johansson: OpenAI Hired a Soundalike Without My Permission
Jacob Kastrenakes, reporting for The Verge:
Scarlett Johansson says that OpenAI asked her to be the voice behind ChatGPT — but that when she declined, the company went ahead and created a voice that sounded just like her. In a statement shared to NPR, Johansson says that she has now been “forced to hire legal counsel” and has sent two letters to OpenAI inquiring how the soundalike ChatGPT voice, known as Sky, was made.
“Last September, I received an offer from Sam Altman, who wanted to hire me to voice the current ChatGPT 4.0 system,” Johansson writes. She says that Altman contacted her agent as recently as two days before the company first demoed the ChatGPT voice asking for her to reconsider.
Altman has made it clear that he admires Johansson’s work. He’s said that Her, which features Johansson as an AI voice assistant, is his favorite film; after the ChatGPT event last week, he posted the word “her,” seemingly in reference to the voice demo the company presented, which featured an assistant that sounded just like Johansson.
OpenAI said this morning that it was pulling the voice of Sky in order to address questions around “how we chose the voices in ChatGPT.” The Verge has reached out to OpenAI for comment.
Johansson says she was “shocked, angered and in disbelief” over how “eerily similar” the voice of Sky sounded to herself. OpenAI said the voice comes from an actor who they hired who is speaking in their normal speaking voice. The company declined to share the actor’s name, citing privacy concerns.
You can read Johansson’s letter here, and I encourage you to do so. Here is the story from her side:
- OpenAI asks Johansson to be the voice for ChatGPT. Johansson refuses, citing personal reasons.
- In September of last year, OpenAI goes out and hires another voice actor who sounds like her. The company launches the voice later in the year.
- OpenAI launches a new model earlier in May that is more expressive, highlighting the similarities between the voice, “Sky,” and Johansson.
I have absolutely no idea what Altman, OpenAI’s chief executive, was thinking with this atrocious decision. It clearly shows the company’s lack of regard for copyright laws and exemplifies the need for strong protections for actors in the age of artificial intelligence. And as if the sleazy maneuver itself weren’t enough, Altman went ahead and posted the word “her” on the social media website X after the Monday “Spring Update” keynote, hosted by Mira Murati, the company’s chief technology officer. Did OpenAI seriously think Johansson, one of Hollywood’s most famous actresses, wouldn’t pursue legal action over this?
Altman could have claimed plausible deniability because he wasn’t directly involved in hiring the new voice actress, but posting about the movie in which Johansson stars links him to the chaos. And posting about the movie makes him look even worse from a moral standpoint; it’s almost a “just because you didn’t agree doesn’t mean I can’t clone your voice” kind of sinister thinking, but maybe that’s just me being cynical. Even if Altman hadn’t posted, I still would have believed he was involved because of his affinity for the film and because the voice sounds so eerily similar to Johansson’s.
Johansson isn’t out to get OpenAI — I don’t even think she’s very upset — but she does want some transparency as to whom it hired for the voice and how they were chosen. (Clearly, they were chosen because they sound like Johansson, though I find it unlikely that OpenAI will willingly admit that.) I want to know this information too, because in an age when deepfakes are so prevalent, transparency and openness are crucial. OpenAI, as the leader of the AI revolution, needs to take accountability for this and respect copyright laws.
And no, I highly doubt this will alter Apple’s negotiations with OpenAI for iOS 18.
Slack Admits It’s Training LLMs on Private Messages
Will Shanklin, reporting for Engadget:
Slack trains machine-learning models on user messages, files, and other content without explicit permission. The training is opt-out, meaning your private data will be leeched by default. Making matters worse, you’ll have to ask your organization’s Slack admin (human resources, IT, etc.) to email the company to ask it to stop. (You can’t do it yourself.) Welcome to the dark side of the new AI training data gold rush.
Corey Quinn, an executive at DuckBill Group, spotted the policy in a blurb in Slack’s Privacy Principles and posted about it on X (via PCMag). The section reads (emphasis ours), “To develop AI/ML models, our systems analyze Customer Data (e.g. messages, content, and files) submitted to Slack as well as Other Information (including usage information) as defined in our Privacy Policy and in your customer agreement.”
The opt-out process requires you to do all the work to protect your data. According to the privacy notice, “To opt out, please have your Org or Workspace Owners or Primary Owner contact our Customer Experience team at feedback@slack.com with your Workspace/Org URL and the subject line ‘Slack Global model opt-out request.’ We will process your request and respond once the opt out has been completed.”
This is horrifying. I’m usually not one to be all too worried about public writing being used to train large language models, but private direct messages and conversations within restricted Slacks ought to be off-limits. Slack tries to cover for this by distinguishing between its official premium Slack LLMs — which cost money — and workspace-specific search tools, but there is no difference: they’re both artificial intelligence products, and they’re both trained on private, presumably encrypted-at-rest data. It is malpractice for Slack to hide this information in a document written by seasoned legal experts that no normal person will ever read, and the entire company should be ashamed of itself. Salesforce continues to pull nonsense like this on its customers for no reason other than maximizing profit, and it is shameful. If there were a better product than Slack in its market, the Slack division of Salesforce would go bankrupt.
What makes matters worse — yes, even worse than training LLMs on private messages — is that customers have no way of opting out unless they ask their Slack administrator to email the company’s feedback address requesting an opt-out. There are two problems here: individual users can’t opt out of the training on their own data, and administrators have to email the company to stop their employees’ data from being harvested by Salesforce. How is this kind of behavior legal, especially in Europe? Some rather frustrated Slack users are demanding the company make the default behavior opt-in rather than opt-out, but I wouldn’t even go that far. Slack needs to build a toggle for every employee or Slack user to turn data sharing off for themselves — and it needs to do it fast. Anything short of that is unacceptable. These are private messages, not public articles or social media posts.
I don’t know how anyone can justify this behavior. It’s sleazy, rude, disrespectful, and probably in violation of some European privacy regulations. People have been able to trick LLMs into leaking their training data with relative ease, and that is not something Salesforce and Slack can mitigate with a couple of lines of code, because the flaw is inherent to the design of the models. This bogus statement from Slack’s social media public relations team was written by someone who is clueless about how these models work and how data can be extracted from them, and that, plainly, is wrong. Private user data should never be used to train any AI model whatsoever, regardless of who can use or access it. If training happens at all, it should be constrained to on-device machine learning, as in Apple Photos, for example. Moreover, burying the disclosure of this data scraping in a few lines of a privacy policy not a single customer will read is irresponsible. Shame on Salesforce, and shame on Slack.
Google Plays Catch-Up to OpenAI at This Year’s I/O
Google threw things at the wall — now, it hopes some will stick

At the opening keynote of its I/O developer conference on Tuesday, Google employed a strategy born of sheer desperation: Throw things at the wall and see what sticks. The company, famed for leading the artificial intelligence revolution within Silicon Valley for years, has been overtaken by none other than a scrappy neighbor with some help from Microsoft, one of its most notable archenemies. That neighbor, OpenAI, stunned the world just a day prior on Monday with the announcement of a new omni-modal large language model, GPT-4o, which features a remarkably capable and humanlike text-to-speech apparatus and state-of-the-art visual recognition technology. OpenAI first took the world by storm in November 2022 with the launch of its chatbot, ChatGPT, which instantly became one of the fastest-growing consumer technology products ever. From there, it has only been smooth sailing for the company, and everyone else has been trying to catch up — including Google.
Google quickly went into overdrive, declaring a “code red” and putting all hands on deck after Microsoft announced a new partnership with OpenAI to bring the generative pre-trained transformer technology to Bing. Last year, Google announced Bard, its AI chatbot meant to rival OpenAI’s, only for OpenAI’s latest GPT-4 to run laps around it. Bard would consistently flub answers through hallucinations — the phenomenon in which chatbots confidently provide wrong answers due to a quirk in their design — fail to provide references, and ignore commands, placing it dead last in the rankings against its rivals. At I/O last year, Google hurriedly began adding the model to its existing Google Workspace products, like Google Docs and Gmail, but most users didn’t find it very useful because of its constant mistakes.
Later in the year, Google announced three new models to better compete with OpenAI: Gemini Nano, Gemini Pro, and Gemini Ultra1. The three models — each with varying parameter counts and context window sizes — were poised to each perform different tasks, and Google quickly touted how Gemini Pro was comparable to GPT-3.5 and how Gemini Ultra even beat GPT-4 in some circumstances. It put out a demonstration showcasing the multimodal features of Gemini Ultra, showed off Gemini Pro’s deep integration with Google products like YouTube and Google Search, and pre-installed the smaller Gemini Nano model on Pixel phones in the fall to perform quick on-device tasks. And most important of all, to shed Bard’s brand reputation, Google renamed its AI product and chatbot to Gemini. Eventually, it attempted to put Gemini everywhere: in Google Assistant, in Google Search by way of the Search Generative Experience, and in its own app and website. It was a fragmented mess — the models were average at best, and there were too many of them in too many places, cluttering Google’s already complex ecosystem of products.
So, with the stage set, expectations were high for Tuesday’s I/O event, where Google was poised to clean up the clutter and consolidate the AI mess it had so hastily entangled itself in over the last 16 months. And, in typical Google fashion, the company utterly flopped. Instead, Google leaned into the mess, throwing Gemini into every product imaginable. Google Search now has Gemini built in for content summaries, replacing SGE for all U.S. users beginning this fall; Gmail now has Gemini search and summaries to shorten threads, find old emails, and draft responses; Android now has a contextually aware version of Gemini that can be asked questions based on what a user selects; and every nook and cranny of Google’s services has been dusted with the illustrious sparkles of AI in some capacity. I tried to make sense of the muddied features, and here is what I believe Google’s current master plan is:
- Let developers toy with Gemini however they would like, lowering prices for the Gemini application programming interface and making new open-source LLMs to lead the way in the development and production of AI-focused third-party applications.
- Bring Gemini to every consumer product for free to increase user engagement and deliver shareholder value to please Wall Street.
- Unveil new moonshot projects to excite people and sell them on the prospect of AI.
I came up with this thesis after closely observing Google’s announcements on Tuesday, and I think it makes sense from an organizational, business perspective. In practice, however, it just looks desperate. Tuesday was catch-up day for Google — the company did not announce anything genuinely revolutionary or never seen before but rather focused its efforts on reclaiming its top spot in the AI space. Whether the strategy will yield a positive result is to be determined. In the meantime, though, consumers are left with boring, uninteresting, unexciting events that mainly function as shareholder advertisements instead of places to showcase new technology. Google I/O was such an event, with its steam stolen by OpenAI’s presentation just the day prior — and that is entirely the fault of Google, not OpenAI. Here are my takeaways from the keynote this year.
Gemini for the Web
Since the advent of ChatGPT, AI chatbots and their makers have been intent on upending the norms of the web. Publishers have reported frustration over decreased traffic, users are inundated with cheap AI-generated spam whenever they make a Google search, and it is harder than ever to ensure answers are accurate. Google, without a doubt, bears some responsibility for this after its beta introduction of SGE last year, which automatically queries the web and quickly writes a summary pinned to the top of the results page. And even before that, Gemini was engineered to search the web to generate its answers, providing inline citations for users to fact-check its responses.
In practice, though, the citations and links to other websites are minuscule and are rarely clicked because most of the time, they’re simply unneeded. Instead of taking steps to address this information conundrum that has plagued the web for over a year, Google leaned into it at I/O this year — both in Google Search and Gemini, the chatbot.
First, Gemini: The chatbot had fallen behind OpenAI’s GPT-4 in sheer number of features, so Google announced some remedies to better compete in the saturated chatbot market. The company said it would build a conversational, two-way voice mode into Gemini — both the web version and the mobile app — similar to OpenAI’s announcements from Monday, allowing users to speak to the robot directly and receive speedy answers. It said the feature, which will become available later this year, will be conversational, unlike Google Assistant, which currently only speaks answers aloud without asking follow-up questions.
However, it is unclear how this differs from the Gemini-powered Google Assistant mode available to Pixel users now. Google Assistant on Pixel phones has two modes: the standard Google Assistant mode and Gemini, which uses the chatbot to generate answers. Moreover, there is already feature parity between the Gemini app and Google Assistant on Android, further muddling the feature sets of Google’s AI products. This is what I mean by Gemini coming to every nook and cranny of Google’s software. Google needs to clean up this product line.
The new version of Gemini will also allow users to create custom, task-specific mini chatbots called “Gems,” a clever play on “Gemini.” The feature is meant to rival OpenAI’s “GPTs,” customizable GPT-4-powered chatbots that can each be given instructions to perform a specific task. For example, a GPT can be programmed to search for grammar mistakes whenever a user uploads a file — that way, there is no need to describe what to do with every uploaded file, as someone would have to with the normal version of ChatGPT. Gems are a one-to-one knockoff of GPTs — users can make their own Gems and program them ahead of time to perform specific tasks. Gems will be able to access the web, potentially becoming useful research tools, and they will also have multimodal functionality for paying Gemini Advanced users, allowing image and video uploads. Google says Gems will be available sometime in the summer for all users in the Gemini app on Android, the Google app on iOS, and on the web.
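Conceptually, a Gem or a GPT is little more than a reusable set of instructions (a system prompt) bolted onto a general-purpose model. Since Gems aren’t publicly programmable, here is a loose sketch of the idea using OpenAI’s chat completions API as a stand-in; the model name, prompt, and function are my own illustrations.

```python
# A loose sketch of the idea behind Gems/GPTs: a fixed instruction reused across
# conversations so the user never has to restate the task for every upload or message.
# Uses OpenAI's chat completions API as a stand-in; Gems themselves aren't built this way by users.
from openai import OpenAI

client = OpenAI()  # assumes an OPENAI_API_KEY environment variable

GRAMMAR_GEM_INSTRUCTIONS = (
    "You are a grammar checker. For any text the user sends, list every grammatical "
    "error you find and then provide a corrected version. Do nothing else."
)

def run_grammar_gem(user_text: str) -> str:
    """Send the user's text to the 'Gem': the fixed instructions plus the new message."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": GRAMMAR_GEM_INSTRUCTIONS},
            {"role": "user", "content": user_text},
        ],
    )
    return response.choices[0].message.content

# print(run_grammar_gem("Me and him goes to the store yesterday."))
```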
And then, there is Google Search: Since the winter, Google has been slowly rolling out its SGE summaries to all web users on Google. The summaries appear with an “Experimental” badge and big, bold answers, and typically generate a second or two after a search is made. The company has now renamed the experimental feature “search summaries,” removed it from beta testing (it was previously only available through Google’s “Labs” portal), and vowed to expand it to all U.S. users by the end of the year. The change has the potential to entirely rewrite the internet, killing traffic to publishers that rely on Google Search to survive and sell advertisements on their pages, as well as disincentivizing high-quality, handwritten answers on the web. The Gemini-powered search summaries do provide sources, but they are often buried below the summary and seldom clicked by users, who are commonly content with the short AI-generated blurb.
The summaries are also prone to making mistakes and fabricating information, even though they’re placed front-and-center in the usually reliable Google Search interface. This is extremely dangerous: Google users are accustomed to reliable, correct answers appearing in Google Search and might not be able to distinguish between the new AI-generated summaries and the old content snippets, which remain below the Gemini blurb. No matter how many disclaimers Google adds, I think it is still too early to add this feature to a product used by billions. I am not entirely pessimistic about the concept of AI summaries in search — I actually think this is the best use case for generative artificial intelligence — but in its current state, it is best to leave this as a beta feature for savvy or curious users to enable for themselves. The expansion and improvement of the summaries were a marquee feature of Tuesday’s presentation, taking up a decent chunk of the address, and yet Google made an egregious error in its promotional video for the product, as spotted by Nilay Patel, the editor in chief of The Verge. That says a lot.
Google did improve its summaries feature before beginning the mass rollout, though: it touted what it calls “multi-step reasoning,” which allows Google Search to essentially function as the Gemini chatbot itself, so users can enter multiple questions at once into the search bar. Most Google searches aren’t typically conversational; most people perform several searches in a row to fully learn something. This practice, as Casey Newton wrote for Platformer, used to be enjoyable. Finding an answer, repeating the search with more information, and clicking another one of the 10 blue links is a ritual practiced by hundreds of millions of people daily, and Google seems intent on destroying it.
Why the company has decided to upend its core search product is obvious: Google Search is bad now. Nowadays, Google recommends AI-generated pages engineered for maximum clicks and advertising revenue rather than useful, human-written sites, leading users to append “Reddit” or “Twitter” to their queries to find real answers written by real people. Google has tacitly shown that it has no interest in fixing the core problem at hand — instead, it is just closing up shop and redirecting users to an inferior product.
Google’s objective at I/O was to circumvent the problem of the internet no longer being helpful by making AI perform searches automatically. Google showcased queries that notably included the word “and” in them — for example: “What is the best Pilates studio in Boston and how long would it take to walk there from Beacon Hill?” Before Tuesday, one would have to split that question into two: “What is the best Pilates studio in Boston?” and “Travel time between the studio and home.” (The latter would probably be a Google Maps search.)
It is a highly specific yet somehow absolutely relevant example of Google throwing in the towel on web search. When Google detects a multi-step query, it does not present 10 blue links that might have the answer to both questions, because that would be all but impossible. (Very few websites would have such specific information.) It instead generates an AI summary of information pulled from all over the web — including from Google Maps — effectively negating the need to do further research. While this might sound positive, in reality it kills the usefulness of the internet by relegating the task of searching for information to a robot.
People will learn less from this technology, they will enjoy using the internet less, and, as a result, publishers will be less incentivized to add to the corpus of information Gemini uses to provide answers. The new AI features are decent short-term solutions to improve the usefulness of the world’s information superhighway, but they create a major chicken-and-egg problem that Google has continuously ignored or purposefully neglected. This pressing issue does not fit well into the quick pace of a presentation, but it will accelerate an already noticeable decline in high-quality information on the web. It is a short-term bandage over the wound that is lazy, money-hungry analytics firms — once the bandage withers and expires, the wound will still be there.
That is not to say that Google should not invest in AI at all, because AI pessimism is a conservative, cowardly ideology not rooted in fact. Instead, Google should use AI to remedy the major problem at hand, which it caused itself. AI can be used to find good information, improve recommendation algorithms, and help users find answers to their questions in fewer words. Google is more than capable of taking a thoughtful approach to this glitch in the information ecosystem, and that is apparent from its latest enhancements to its traditional search product: asking with video and Circle to Search.
Asking questions with video is exactly the type of enhancement AI can bring without uprooting the vast library of information on the web. The new search feature is built into Google Lens but uses Google’s multimodal generative AI to analyze video clips recorded through the Google mobile app along with a quick voice prompt. When a recording is initiated, the app asks users to describe a problem, such as why a pictured record player isn’t working. It then uses AI to understand the prompt and video and generates an answer with sources pulled from the web.
This is more groundbreaking than worrisome because it (a) enables people to learn more than they would otherwise, (b) adds a qualitative improvement to the user experience, and (c) encourages authors to contribute information to be featured as one of the sources for the explanation. It is just enough of a change to the habits of the internet that the result is a net positive. Google is doing more than simply performing Google searches by itself and paraphrasing the answers — it is understanding a query using a neural network, gathering sources, and then explaining them while also providing credit. In other words, it isn’t a summary; it’s a new, remarkable piece of work.
It is safe to say that for now, I am pessimistic about Google’s rethinking of the web. Google’s chatbots consistently provide incorrect answers to prompts, the summaries’ placement alongside the 10 blue links — which aren’t even 10 blue links anymore — can be confusing to non-savvy users, and the new features feel more like ignorant, soulless bets on an illustrious “new internet” rather than true innovations that will improve people’s lives. But that isn’t to say there is no future for generative AI in search — there is in myriad ways. But the sheer unwillingness on Google’s end to truly embrace generative AI’s quirks is astonishing.
Gemini for Users
Google’s apparent attempt to reinvent the internet does not stop at the web — it also extends to its personal services, like Google Photos and Gmail. That extension began last year at Google I/O, and many of Tuesday’s announcements felt like déjà vu, but this year the company seemed more intent on using the multimodal capabilities and larger context windows of its latest LLMs to improve search and provide better summaries, an advantage it hadn’t developed last May.
First, Google Photos, which the company surprisingly opened the event with. Google described a limitation of basic optical character recognition-based search: Say someone wanted to find their license plate number in a sea of images of various cars and other vehicles. Previously, they would have to sift through the photos until they found one of their car, but with multimodal AI, Gemini can locate the photos of one’s car automatically and then display the license plate number in a cropped format. This enhanced, contextual search functions like a chatbot within Google Photos to make searching and categorizing photos easier. The feature, which uses Gemini under the hood, draws on data from a user’s photo library, such as facial recognition data and geolocation, to find photos that fit specific parameters or a theme. (One of the examples shown onstage was a user asking for photos of their daughter growing up.)
In Gmail, Google announced new email summarization features to “catch up” on threads via Gemini-written synopses. Additionally, the search bar in Gmail will let users sift through messages from a particular sender to find specific bits of information, such as a date for an event or a deadline for a task, without having to comb through each email individually. The new features — while not improving the traditional Gmail search experience used to find attachments and sort between categories like sender and send date — do fill the role of a personal assistant in many ways. And they’re also present in the Gemini chatbot interface, so users can ask Gemini to fetch emails about a given subject in the middle of a pre-existing chat conversation. Google said the new features would roll out to all users beginning Tuesday.
The new additions are reminiscent of the Microsoft Outlook / Microsoft 365 features that debuted last year, and I surmise that is the point. Google’s flagship Gmail service had next to zero AI features, whereas now it can summarize emails and write drafts for new ones, all inline. However, these new Gemini-powered AI features create an interesting paradox I outlined last year: Users will send emails using AI only for the receiver to summarize them using AI and draft responses synthetically, which the sender will then receive and summarize using AI. It is an endless, unnecessary cycle that exists because of the quirks of human communication. I do not think this is the fault of Google — it’s just interesting to see why these tools were developed in the first place and to observe how they might be used in the real world.
My favorite addition, however, is the one that settles the AI hardware debate that has become such a hot topic in recent weeks: Gemini in Circle to Search. Circle to Search — first announced earlier this year — lets users capture a screenshot of sorts, then circle a subject for Google Lens to analyze. Now, Circle to Search adds the multimodal version of Gemini, Gemini Ultra, as well as Gemini Nano, which runs locally on Pixel phones for smaller, more lightweight queries. This one simple-on-paper addition to Circle to Search, an already unsophisticated feature, nearly kills both the Rabbit R1 and the Humane Ai Pin. With just a simple swipe gesture, any object — physical or virtual — can be analyzed and researched by an intelligent, capable LLM. It’s novel, inventive, and eliminates the often substantial barrier between trying to understand something on the spur of the moment and accessing information. It makes the process of searching simple, which is exactly Google’s mission statement.
Circle to Search does not summarize the web the way other Gemini features do because it is mostly powered by a lightweight, on-device model with a smaller context window. Instead, it falls back to the web in most instances, but what it does do is write the Google search itself. Instead of having to type a query like “orange box with AI designed by Teenage Engineering” into Google, a simple screenshot can automatically write that search and present links to the Rabbit R1. It is a perfect, elegant, amazing implementation of AI, now supercharged by an LLM. Google says this type of searching is context-aware, which is a crucial tenet of useful information gathering, because information is of little use without context. On Google, that context must be manually entered or inferred, but with Circle to Search, the system knows precisely what is happening on a user’s screen.
This might sound like standard Google Lens, but it is much more advanced than that. It can summarize text, explain a topic, or use existing user data, such as calendar events or notes, to personalize its responses. And because it has the advantage of context awareness, it can be more personal, succinct, and knowledgeable — exactly what the AI devices from Rabbit and Humane lack. Circle to Search with Gemini is built into the most important technological device, the smartphone, and it is exactly the best use for AI. Yes, it might reduce the number of Google searches typed in, upsetting publishers, but it makes using computers more intuitive and personal. Google should run with Circle to Search — it is a winner.
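Strip away the gesture and the on-device plumbing, and the core trick is turning whatever is on screen into a well-formed search query. Here is a loose sketch of that idea using Google’s public google-generativeai Python SDK with a multimodal Gemini model; the model name, prompt, and file name are my assumptions, and the real feature adds on-device Gemini Nano and much deeper context from the running app.

```python
# A loose sketch of the core Circle to Search idea: hand a captured screen region to a
# multimodal model and have it write the Google search a user would otherwise type by hand.
# Assumes the google-generativeai SDK, a GOOGLE_API_KEY, and a multimodal Gemini model name;
# the real feature runs largely on-device and is far more context-aware than this.
import os

import google.generativeai as genai
from PIL import Image

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")

def search_query_from_screenshot(path: str) -> str:
    """Ask the model to describe the circled object as a concise web search query."""
    region = Image.open(path)
    response = model.generate_content(
        [region, "Write a concise Google search query that would identify this object."]
    )
    return response.text.strip()

# A cropped screenshot of an orange AI gadget might yield something like
# "orange handheld AI device designed by Teenage Engineering" -- i.e., the Rabbit R1.
# print(search_query_from_screenshot("circled_region.png"))
```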
Circle to Search is also powered by a new LLM Google announced during its presentation2, called LearnLM, which is designed for educational settings and based on Gemini. LearnLM was demonstrated with a Circle to Search query in which some algebra homework was presented — the chatbot was able to explain the answer thoroughly, using the correct typography and notation, too. Presenters also said the LLM would be available in Google Classroom, Google’s learning management software, and on YouTube, to explain “educational videos.” The YouTube chatbot interface, which was first beta tested among select YouTube Premium subscribers last year, will become more broadly available and will let users ask questions about certain videos and find comments more easily. It is unclear exactly how LearnLM differs from Gemini, but I assume LearnLM has a smaller, more specific training set to address hallucinations.
Here are some miscellaneous additions also announced Tuesday:
- NotebookLM, Google’s LLM-powered research tool that users can upload custom training data to, now uses Gemini to provide responses. The tool is mainly used to study for tests or better understand notes; it was first released to the general public last year. The most noteworthy addition, however, was the new conversation mode, which simulates two virtual characters having a faux conversation about a topic using the user-provided training data. Users can then interject with a question of their own by clicking a button, which pauses the “conversation” — when a question is asked, the computer-generated voices answer it within the context of the training data.
- On-device AI, powered by Gemini Nano, will now alert users when a phone call might be a scam. This feature will, without a doubt, be helpful for seniors and the less technically inclined. Gemini will listen to calls — even ones it doesn’t automatically flag as spam — and show an alert if it detects it might be malicious.
Google, for years, has excelled at making the smartest smartphones, and this year is no exception. While the company’s web AI features have left me frustrated and skeptical, the user-facing features are much more Google-like, adding delight and usefulness while also putting to rest AI grifts with no value. Many of these features might be Android-exclusive, but that makes me even more excited for the Worldwide Developers Conference, where Apple is rumored to announce similar enhancements and additions to iOS. The on-device AI announcements were the only times I felt somewhat excited about what Google had to say Tuesday, though it might have also helped that those features were revealed toward the beginning of the keynote.
Gemini for Investors
Project Astra is Google’s name for Silicon Valley’s next AI grift. By itself, the technology is quite impressive in the same way Monday’s OpenAI event was: a presenter showed how Project Astra could, in real time, identify objects it looked at through a smartphone camera, then answer questions about them. It was able to read text from a whiteboard, identify Schrödinger’s cat, and name a place just by looking out a window. It’s a real-time, multimodal AI apparatus, just like OpenAI’s, but there is one problem: we don’t know if it will ever exist.
Google has a history of announcing products that do nothing more than hike its stock price, like Google Duplex, a voice AI that was supposed to be able to call restaurants to secure reservations or perform other mundane tasks from a simple text prompt. Project Astra feels exactly like one of those products because of how vague the demonstration was: The company did not provide a release date, more details on what it may be able to do, or even which LLMs might power it. (It doesn’t even have a proper name.) All the audience received on a sunny spring morning in Mountain View, California, was a video of a smartphone, and later some smart glasses, identifying physical objects while answering questions in an eccentric voice.
The world had already received that video just a day prior, except that time, it received a release date too. And that is a perfect place to circle back to the original point I made at the very beginning of this article: OpenAI stole Google’s thunder, ate its lunch, took its money, and got all the fame. That was not OpenAI’s fault — it was Google’s fault for failing to predict the artificial intelligence revolution. For being so disorganized and unmotivated, for having such an incompetent leader, for being unfocused, and for not realizing the potential of its own employees. Google failed, and now the company is in overdrive mode, throwing everything at the wall and seeing what sticks. Tuesday’s event was the final show — it’s summit or bust.
Tuesday’s Google I/O served to please investors more than users. It was painfully evident in every scene how uninspired and apathetic the presenters were. None of them had any ambition or excitement to present their work — they were just there because they had to be. And they were right: Google had to be there on Tuesday, lest its tenure as the leader of AI come to an end. I’d argue that has already happened — Microsoft and OpenAI have already won, and the only way for Google to make a comeback is by fixing itself first. Put on your own oxygen mask before helping others; address your pitfalls before running the marathon.
Google desperately needs a new chief executive, new leadership, and some new life. Mountain View is aimless, and for now, hopeless. The mud is not sticking, Google.
1. Gemini Nano, Gemini Pro, and Gemini Ultra are Google’s last-generation models. Gemini 1.5 Pro is the latest and performs on par with Gemini Ultra, though without multimodal capability. Google also announced Gemini Flash on Tuesday, which is smaller than Gemini Nano. It is unclear whether Gemini Flash is built on the 1.5 architecture or the 1.0 one. ↩︎
2. Here is a handy list of Google’s current LLMs. ↩︎