Ilya Sutskever and Friends Found Safe SuperIntelligence Inc.

Ilya Sutskever, Daniel Gross, and Daniel Levy, writing on the website of their new company:

Building safe superintelligence (SSI) is the most important technical problem of our time.

We have started the world’s first straight-shot SSI lab, with one goal and one product: a safe superintelligence.

It’s called Safe Superintelligence Inc…

Our singular focus means no distraction by management overhead or product cycles, and our business model means safety, security, and progress are all insulated from short-term commercial pressures.

“Superintelligence” is not a word in the dictionary, but it’s meant to be a catch-all, alternative term for artificial general intelligence, a term for a computer system as smart as or even smarter than humans. Sutskever is one of OpenAI’s co-founders, and he served as its chief scientist until he suddenly resigned in May. Gross and Levy are also expatriates of OpenAI, whose mission is to “ensure that artificial general intelligence benefits all of humanity,” as posted on its website. I assume Sutskever’s new company is using “superintelligence” instead of “AGI” or simply “artificial intelligence” because he tried to accomplish that with OpenAI and apparently failed — so now, the mission has to be slightly modified to try it all again.

The last line I quoted about “distraction by management overhead” seemingly alludes to OpenAI’s obvious loss of direction. It’s true that OpenAI has become commercialized, which is potentially concerning for the safe development of AGI (OpenAI’s mission), but I guess the mission doesn’t matter anymore if Sam Altman, the chief executive, wants to eliminate board oversight of his company in the near future. Thus, Safe Superintelligence: a boring name for a potentially boring company. Safe Superintelligence probably won’t create the next GPT-4, the large language model that powers ChatGPT, or advance major research projects because it’ll struggle to raise the capital OpenAI has. It won’t have deals with Apple or Microsoft and certainly won’t be motivated by profit in the same way Altman’s company now is. Safe Superintelligence is the new OpenAI, whereas the real OpenAI is more akin to “Commercial AI.”

Is the commercialization of AI a bad thing? Probably not, but there are some doomsayers who believe it is because AI could “go rogue” and destroy humanity. I think the likelihood of such an event is minimal, but nonetheless, I also believe AI research institutes like Safe Superintelligence should exist to study the effects of powerful computer systems on society. I don’t think Safe Superintelligence should build anything new like how OpenAI did — it’s best to leave the building to the companies with capital — but the oversight should exist in a well-balanced industry. If OpenAI cooks up a contraption that has the potential to do harm, Safe Superintelligence should be able to probe it and understand how it works. It’s best to think of Safe Superintelligence and OpenAI as collaborators, not just competitors, especially if OpenAI truly does disband its board.

Let’s hope Safe Superintelligence actually lives up to its name, unlike OpenAI, though. AI is like drugs for the business industry right now: OpenAI dabbled with making a consumer product, ChatGPT — which was intended to be a limited research preview when it launched in November 2022 — the product went viral, and its entire corporate strategy shifted from safe AGI development to money making. If Safe Superintelligence, contrary to my prediction, achieves a scientific breakthrough and a hit consumer product, it’s quite possible it’ll get carried away just like OpenAI. Either Safe Superintelligence has more self-restraint than OpenAI (probably the case), or it’ll suffer the same fate.

Apple Rejects Non-JIT Version of UTM via Notarization

Benjamin Mayo, reporting for 9to5Mac:

App Review has rejected a submission from the developers of UTM, a generic PC system emulator for iPhone and iPad.

The open source app was submitted to the store, given the recent rule change that allows retro game console emulators, like Delta or Folium. App Review rejected UTM, deciding that a “PC is not a console”. What is more surprising, is the fact that UTM says that Apple is also blocking the app from being listed in third-party app stores in the EU.

As written in the App Review Guidelines, Rule 4.7 covers “mini apps, mini games, streaming games, chatbots, plug-ins and game emulators”.

UTM says Apple refused to notarize the app because of the violation of rule 4.7, as that is included in Notarization Review Guidelines. However, the App Review Guidelines page disagrees. It does not annotate rule 4.7 as being part of the Notarization Review Guidelines. Indeed, if you select the “Show Notarization Review Guidelines Only” toggle, rule 4.7 is greyed out as not being applicable.

Michael Tsai:

This is confusing, but I think what Apple is saying is that, even with notarization, apps are not allowed to “download executable code.” Rule 2.5.2 says apps may not “download, install, or execute code” except for limited educational purposes. Rule 4.7 makes an exception to this so that retro game emulators and some other app types can run code “that is not embedded in the binary.” This is grayed out when you select Show Notarization Review Guidelines Only, meaning that the exception only applies within the App Store. Thus, the general prohibition remains in effect for App Marketplaces and Web Distribution.

This is a clear instance of Apple itself being confused by its own perplexing guidelines. Rule 4.7 says:

Apps may offer certain software that is not embedded in the binary, specifically HTML5 mini apps and mini games, streaming games, chatbots, and plug-ins. Additionally, retro game console emulator apps can offer to download games. You are responsible for all such software offered in your app, including ensuring that such software complies with these Guidelines and all applicable laws.

Apple later “clarified” to UTM that it was not being barred from the App Store because of Rule 4.7, but because of Rule 2.5.2, which bans just-in-time compilation. Rule 4.7 purports to be an exception to Rule 2.5.2 for “retro game console emulator apps,” but it is not in practice, because no app with a JIT compiler has been able to make it through App Review. Delta, a retro game console emulator by Riley Testut, also had a JIT compiler, but Testut had to remove it in the App Store and third-party app marketplace versions of the app — Rule 4.7 didn’t give him an exception like how it hints it may.

What Rule 4.7 allows, however, is “retro game console emulator apps” on the App Store — and thus, disallows any that aren’t “game console” emulators. But crucially, this only applies to apps submitted to the App Store, not third-party app marketplaces, meaning that any emulator should be allowed on a third-party app marketplace even if it can’t be on the App Store because Rule 4.7 is not part of the “Notarization Review Guidelines,” which govern third-party app marketplaces. (Apps distributed through those marketplaces must be notarized by Apple, but their content is not reviewed.) In other words, there’s no restriction on PC emulators in third-party app marketplaces. Apple applied Rule 4.7 to both third-party app marketplaces and the App Store, which is incorrect.

Tsai is correct: Apple most likely forbids any just-in-time compilers from running on iOS, period, regardless of whether the app is a game emulator. But I don’t think the disagreement should involve Rule 2.5.2 at all because that rule is most likely a blanket, general ban on JIT compilers, regardless of whether the app is on the App Store; that is why only Rule 4.7 is excluded from the Notarization Review Guidelines, not Rule 2.5.2. Instead, Apple originally said it was barring UTM from operating on iOS outright because a PC is not a “console,” which would be a Rule 4.7 infraction.

2.5.2 would have applied if UTM uses a JIT compiler, but here’s the kicker: it doesn’t. Instead, because Apple realized its original decision of applying Rule 4.7 was incorrect, it quickly switched to blaming 2.5.2, which doesn’t even apply in this scenario — if anything, 4.7 does, but only to the App Store version, not the one submitted for notarization for third-party distribution. In the case of Rule 4.7, the semantics of “console” and “PC” would matter because that one change in wording determines if an app is allowed on the App Store or not.

What Tsai argues is that for apps that (a) aren’t console emulators and (b) aren’t on the App Store, Apple prohibits JIT compilation as per 2.5.2, which the European Union allows Apple to enforce as part of the clause in the Digital Markets Act that allows gatekeepers to bar apps that might be a security risk. But that guideline doesn’t even matter in this context because (a) UTM SE — the version of the app UTM submitted — doesn’t include a JIT compiler, and (b) Apple barred UTM from operating on both the App Store and third-party app marketplaces on the basis of wording, not the JIT compiler, before it backtracked. Now, Apple wants to conveniently ignore its original flawed reasoning.

Apple can’t apply Rule 4.7 to apps that want access to a third-party marketplace because it is not a notarization guideline, only an App Store one. This behavior is illegal under the DMA: Apple applied its ability to bar UTM’s access to the App Store to third-party app marketplaces as well, which it can’t do. When it got caught red-handed, it defaulted to an unrelated rule UTM SE already passed. Because App Review can’t read, it backtracked and was incorrect in its backtracking; UTM got rejected, and both of Apple’s stated reasons for rejecting the app were abysmally false. This kerfuffle should have been unrelated to Rule 2.5.2, which would only apply if UTM SE used a just-in-time compiler, which, again, it doesn’t. If it did, yes, the rules would fall back to 2.5.2, which applies throughout iOS, but the only rule that matters is 4.7, which was applied incorrectly the first time.
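To make the distinction concrete, here is a minimal Python sketch of my reading of the guidelines. The channel names, the `Rule` class, and the `rejection_grounds` function are all hypothetical, and the assumption that Rule 2.5.2 carries over to notarization while Rule 4.7 does not is my interpretation, not something Apple has published in this form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    number: str
    # Assumption: True means the rule is also part of the
    # Notarization Review Guidelines (third-party marketplaces).
    applies_to_notarization: bool


# Only the two rules discussed here, encoded per my reading.
RULES = {
    "2.5.2": Rule("2.5.2", applies_to_notarization=True),
    "4.7": Rule("4.7", applies_to_notarization=False),
}


def applicable_rules(channel: str) -> list:
    """Return the rule numbers that can be enforced on a given channel."""
    if channel == "app_store":
        return sorted(RULES)
    if channel == "marketplace":
        return sorted(n for n, r in RULES.items() if r.applies_to_notarization)
    raise ValueError(f"unknown channel: {channel}")


def rejection_grounds(channel: str, is_console_emulator: bool, uses_jit: bool) -> list:
    """List the rules an emulator would violate on the given channel."""
    rules = applicable_rules(channel)
    grounds = []
    if "2.5.2" in rules and uses_jit:
        grounds.append("2.5.2")
    if "4.7" in rules and not is_console_emulator:
        grounds.append("4.7")
    return grounds


# UTM SE as described above: a PC emulator without a JIT compiler.
print(rejection_grounds("app_store", is_console_emulator=False, uses_jit=False))
print(rejection_grounds("marketplace", is_console_emulator=False, uses_jit=False))
```

Under these assumptions, the App Store submission trips only Rule 4.7, and the marketplace submission trips nothing at all, which is the whole point of the argument.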

I’m sure the European Commission will cite this mess when it fines Apple.

Sources: Apple Preparing Cheaper Vision Pro for 2025

Benjamin Mayo, reporting for 9to5Mac:

Apple is reportedly working on a cheaper, cut-down version of the Apple Vision Pro, scheduled to arrive by the end of 2025, according to The Information. At the same time, the publication says development work on a second-generation high-end model of the Vision Pro has been shelved, seemingly to prioritize the cheaper hardware path…

The Information says it is possible Apple could resume work on a high-end second-gen Vision Pro at some point, but it seems relatively confident that the move reflects a change in strategy for the time being…

The Information says the number of employees assigned to the second-gen Vision Pro had been gradually declining over the course of the last year, as attention turned to the cheaper model.

Many news outlets are running with the headline, “Apple Halts Work on Second-Generation Vision Pro.” While I guess that’s technically true, the Apple Vision Pro team at Apple is still relatively small. They’re only going to focus on one core product for the lineup at a time, and I think switching attention to the cheaper version now that the full-featured “professional” model is out is a better strategy. If Apple instead went full speed ahead on developing another marginally improved Apple Vision Pro, as it does for its already segmented products, it would never be able to break into new markets. The incremental year-over-year upgrades should come once there is already a market for the product, but until the user base is stabilized, it should focus on bringing the price down. After that, it can use what it learned from the cheaper product to shape the true “next-generation” high-end Apple Vision Pro.

I don’t think the cheaper “Apple Vision” product will eclipse Apple Vision Pro in Apple’s lineup for now, but it will eclipse the older version in sales. That’s precisely the point, unlike with product lines like the iPhone or iPad. When the first iPhone was introduced in 2007, Apple immediately went to work on iPhone 3G; the same went for the iPad. But Apple Vision Pro isn’t like either of those products because it’s so astronomically expensive. It’s more akin to the Macintosh — if February’s Apple Vision Pro is the Macintosh 128K from January 1984, the low-cost headset is the iMac. The “Classic Macintosh” line of Macs is no more, and the same will be true for the first-generation Apple Vision Pro. It’s better to think of the Apple Vision Pro product line as a new generation of computers for Apple rather than accessories to the Mac like the iPod or iPhone once originally were.

The bottom line is this: I wouldn’t be too worried about this first-generation Apple Vision Pro fading into obscurity quickly. And neither do I think Apple Vision Pro buyers should buy the cheaper headset when it comes out — it’s destined to be worse. But it’s important to note that the first generation of this all-new platform doesn’t exist to be a consumer product, it’s there for developers and video producers to make content for the overall platform at large. Once the content and apps exist, Apple needs to sell a total package in a palatable product for most normal buyers, probably priced at $1,000 to $1,500. That’s exactly what we’re seeing here, and I think it’s a good strategic move. Once it makes the iMac of the Vision line, it can make the Mac Pro — and that actually good Apple Vision Pro will eventually cost much less than $3,500 because Apple has mastered producing the product at scale.

E.U. Will Fine Apple for Violating DMA

Javier Espinoza and Michael Acton, reporting for The Financial Times:

Brussels is set to charge Apple over allegedly stifling competition on its mobile app store, the first time EU regulators have used new digital rules to target a Big Tech group.

The European Commission has determined that the iPhone maker is not complying with obligations to allow app developers to “steer” users to offers outside its App Store without imposing fees on them, according to three people with close knowledge of its investigation.

The charges would be the first brought against a tech company under the Digital Markets Act, landmark legislation designed to force powerful “online gatekeepers” to open up their businesses to competition in the EU…

If found to be breaking the DMA, Apple faces daily penalties for non-compliance of up to 5 per cent of its average daily worldwide turnover, which is currently just over $1bn.

Firstly, it’s hilarious that this was leaked by Europe to The Financial Times.

Secondly, this is entirely unsurprising to anyone who understands how the European Commission, the European Union’s executive branch, functions. The reason the DMA was written was to punish “Big Tech” companies — specifically American ones — not regulate them. Moreover, the commission’s enforcement of the DMA has continuously proven to be draconian because it’s bending the rules however it wants to levy whatever punishments it wants. The DMA was just a facade for democracy, to show the world that the commission wouldn’t “regulate” the technology industry autocratically; and that regulating Apple, Google, Meta, etc., was in the interest and wishes of Europeans. The DMA, in reality, works as a free pass for the European Commission to do whatever it wants — it’s a badly written law with no real footing in legal doctrine and only exists to further strengthen the commission’s control over the market.

When the commission fully reveals why it’s fining Apple, it’ll point to a clause in the DMA that doesn’t exist, just like it did to Meta when it began its investigation of the Facebook parent. In the case of Meta, it forced the company to offer a free way for users to opt out of tracking on its services, when the DMA only required “gatekeepers” to offer a way for users to opt out entirely, even if that way cost money. Meta’s lawyers aren’t stupid or incompetent: they knew the DMA was written only for gatekeepers to offer a tracking-free service, so they advised Meta to offer a paid, ad-free subscription. The commission didn’t like that for some reason, so it launched an investigation. That’s not a fair application of the law; it’s an application of a law that doesn’t exist.

Just as it did with Meta, the commission will probably target the Core Technology Fee, which Apple has modified so that only large companies have to pay it. But because the commission didn’t think of a per-download fee as even an option a gatekeeper could employ, it’ll erroneously target it with a law that doesn’t exist. By every measure, the Core Technology Fee — especially the amended version from May — is within the scope of the DMA and follows the laws of Europe. Apple wouldn’t risk violating the law because it knows what’s at stake here — its lawyers are competent in E.U. law and aren’t going to tell Apple to be sly about obeying. But the commission is treating Apple as if it has no interest in complying, which leads me to believe that maybe Apple shouldn’t comply.

The European Commission will fine Apple, Google, Amazon, Meta, and the rest of its long list of gatekeepers indeterminate amounts of money however it pleases because it gave itself the keys to the antitrust kingdom. These companies are dealing with a branch of government with an unchecked amount of power: it writes the law, it enforces the law, and it chooses how to enforce it. The law does not act as a check on the commission as it does in the United States, so why should Apple even comply? Apple has no chance of winning this fight against one of the most powerful regulatory bodies in the world, so it just shouldn’t. In fact, I’d say Apple should go rogue entirely and see what happens. It should increase its In-App Purchase fee to 50 percent in the European Union, tighten anti-steering rules, and subject E.U. apps to extra scrutiny in the App Review process.

What would the European Commission do in response to this blatant, unapologetic defiance of the law? Fine Apple 5 percent, which it was going to do anyway even after Apple put in all the work to comply. It’s a lose-lose situation for Apple no matter what it does because the commission has gone rogue. When your boss goes rogue and you can stand the consequences — and I’m sure Apple can; 5 percent of global daily revenue isn’t much — you should go rogue, too. Instead of applying the principle of malicious compliance, Apple should apply malicious defiance. What would Europe do, ban Apple devices from the bloc? Europeans would travel to Brussels to riot because that would be undemocratic. Would Europe pass more laws? That’s also possible, but if it fines Apple too much, Apple should just leave Europe and let the riots ensue.

I wasn’t all that supportive of the DMA when it was first passed and applied, but I never thought I’d tell Apple to break the laws of a region in which it operates. Now, that seems like the best course of action, because no matter what, it’s destined to lose.

Gurman: Apple Following in Ive’s Footsteps

Mark Gurman, reporting in his Power On newsletter for Bloomberg:

Over the past several years, Apple appeared to be shifting away from making devices as thin and light as possible. The MacBook Pro got thicker to accommodate bigger batteries, more powerful processors, and more ports. The Apple Watch got a heftier option as well: an Ultra model with more features and a longer life. And the iPhone was fattened up a bit too, making room for better cameras and more battery power.

When Apple unveiled the new iPad Pro in May, it marked a return to form. The company rolled out a super-thin tablet with the same battery life as prior models, an impressive screen, and an M4 chip that made it as powerful as a desktop computer. In other words, Apple has figured out how to make its devices thinner again while still adding major new features. And I expect this approach to filter down to other devices over the next couple of years.

I’m told that Apple is now focused on developing a significantly skinnier phone in time for the iPhone 17 line in 2025. It’s also working to make the MacBook Pro and Apple Watch thinner. The plan is for the latest iPad Pro to be the beginning of a new class of Apple devices that should be the thinnest and lightest products in their categories across the whole tech industry.

We do not need this. I’d much rather take extra battery life, which has suffered in recent years, on most of Apple’s product lines than thinness, which doesn’t make sense to obsess over on “professional” products. While I do support making the MacBook Air or Apple Watch thinner, the MacBook Pro should be off-limits because there’s always more to add to that product. Imagine a thicker MacBook Pro with a larger battery and M4 Ultra processor, for example — or perhaps better cooling or improved speakers. The entire premise of the “Pro” lineup is inherently to pack the maximum amount of features into the product as possible.

Jony Ive, Apple’s former design chief who obsessed over thinness to the point where Apple’s products began to suffer severely, is slowly inching his way back into the company, albeit not directly. He clearly still has influence over the top designers, and now that Evans Hankey, who succeeded Ive, has also left the company, there’s a lack of direction from within. Take the iPhones 17 Pro, for example: Last year, Apple already thinned the phone down significantly, but now it wants to do that again, even when battery life has suffered. No iPhone has had better battery life than iPhone 13 Pro Max, and that was not a fluke. That model was one of the thickest iPhones Apple had offered before 2021, but users loved it.

I shouldn’t need to reiterate this basic design principle to Apple’s engineers over and over again. There should be a limit to sleekness, and when every other company is focusing on adding more features and larger batteries to their products each year, Apple should do the same — not go in the other direction. I don’t want the MacBook Pro to become thinner, even though I think it’s heavy and cumbersome to carry around, because its power will inevitably suffer. The reaction to this statement is always something like: “Apple made the iPad Pro thinner and it still works fine,” but that’s a misunderstanding. If Apple kept the thickness the same — the iPad Pro was already thin enough, in my opinion — but added the organic-LED display, which is more compact, it could’ve added a larger battery which would address the iPad’s abysmal standby time.

I’m not frustrated by Apple’s thinness spiel with the iPad mostly because I don’t think of the iPad as a “professional” device. I do, however, take offense to Apple applying the same flawed mentality to arguably its most professional product, the MacBook Pro. Apple can do what it wants to the MacBook Air, the lower-end iPhones, or even the iPad — but it shouldn’t think in even remotely the same direction in relation to the high-end important products.

Why Apple Intelligence is the Future of Apple Platforms

Apple’s suite of AI tools is here. How will it change how people use their devices?

An image of various Apple Intelligence features running on Apple devices. Apple Intelligence. Image: Apple.

Apple on Monday announced a new suite of artificial intelligence features at its Worldwide Developers Conference, held from its Apple Park headquarters in Cupertino, California. The new features, together called “Apple Intelligence,” allow users to summarize articles, emails, text messages, and notifications; improve and generate new writing in system-wide text fields; pull data from across their apps like Mail, Photos, and Contacts to power a wide range of natural language processing features; and interact with a new version of Siri, which can now be typed to and can perform actions within apps using an improved version of a technology called App Intents.

It also allows users to generate new AI images and emojis with features like “Genmoji” and “Image Playground” integrated into Messages and third-party apps, as well as have AI create videos of photos coupled together with motion effects and music, a feature called “memory movies.” Users can also remove unwanted objects from the background of photos, search their libraries using natural language, and edit images with effects and filters automatically. Apple Intelligence runs both on-device and in the cloud depending on what Apple’s internal logic believes is necessary for the task. It leverages a breakthrough called Private Cloud Compute, utilizing the security of Apple silicon processors to handle sensitive user data while ensuring it remains end-to-end encrypted. Private Cloud Compute servers run an operating system that can be inspected by outside security researchers, Apple said, via software images that can be verified to ensure they are the ones running on Apple’s servers. Greg Joswiak, Apple’s marketing chief, said the servers run on 100 percent renewable energy. These servers were easily the most intriguing technical demonstration of the day.

Apple also announced a partnership with OpenAI to bring ChatGPT, its flagship large language model, to iOS 18, iPadOS 18, and macOS 15 Sequoia — the new operating systems coming to Apple devices this fall — via Apple Intelligence, powering general knowledge queries and complicated creative writing assignments Apple deems are too intensive for its own LLMs, both in the cloud and on-device. The integration — also coming in the fall — does not build a chatbot into the operating systems, but rather is used as a fallback for Apple Intelligence when it needs to search the web or generate more lengthy pieces of text. When ChatGPT is used, a user’s IP address is obscured and Apple makes the call to ChatGPT directly, asking a user to confirm if it is OK to use the external service to handle the query. Apple stressed that the feature would be turned off by default and that no personal data would be handed over to ChatGPT, a marked difference from its own foundation models. It also announced that more models would become available soon, presumably as the company signs more contracts with other AI makers, such as Google.

Together, the new features, which will be enabled in the fall for beta testers, finally catch Apple up to the AI buzz that has engulfed the technology industry since the launch of ChatGPT in November 2022. Investors have quizzed Tim Cook, Apple’s chief executive, on every post-earnings call since then about when Apple would join the AI frenzy, and now, its answer is officially here. Apple Intelligence does things differently, however, due to the ethics of who it’s made by: Apple Intelligence focuses on privacy and on-device intelligence more than fancy gimmicks other tech companies like Google and Microsoft have launched. Yes, by adding AI to its flagship operating systems used by billions around the world, Apple becomes vulnerable to hallucinations — phenomena where chatbots confidently provide incorrect answers — and involves itself in the difficult business of content moderation. But it also sets a new gold standard for privacy, security, and safety in the industry while bringing novel technology to its widest audience yet.

That being said, no technology comes without reservations. For one, Apple Intelligence’s Image Playground features look cheaply made, generating poor-quality images that most artists would rather do without. The systems will also easily be subjected to abuse by their users, including being asked to synthesize illegal, sexually explicit, and immoral content that Apple Intelligence may be tricked into creating even if prohibited by Apple. But Apple has said that it has thought of these issues: In response to a question from John Gruber, the author of Daring Fireball, Apple executives said Apple Intelligence isn’t made to be a general-purpose AI tool as much as it is a personal assistant that uses people’s personal data to provide helpful, customized data and answers. One example a presenter demonstrated onstage was the question, “When should I leave to pick up my mom from the airport?” Siri, in this case, was able to surface the appropriate information in Messages, track the flight, and then use geolocation and traffic data to map directions and receive the estimated travel time. Apple Intelligence is not meant to answer questions about the world — it’s intended to act as a companion in iOS and macOS.

Apple Intelligence has one glaring compromise above all, though: It only works on iPhones 15 Pro or later, iPads with the M1 chip or later, and Apple silicon Mac computers. The narrow compatibility list will inevitably cause furor within broader communities outside of the tech media, with cynicism that Apple artificially created the limitation to boost sales of new devices already spiraling on social media. But the reason this bottleneck exists is rather simple: AI requires significant computing power. Intel Macs don’t have neural processing units called “Neural Engines” specialized for LLMs, and older iPhones (or current-generation iPhones with less powerful processors) lack enough “grunt,” as John Giannandrea, Apple’s machine learning chief, put it Tuesday at “The Talk Show,” live from WWDC. Add to that the enormous memory constraints that come with running an entire language model on a mobile device, and the requirement begins to make sense: When an LLM needs to answer a question, the whole model, which can be many gigabytes in size, needs to fit in a computer’s volatile memory.
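As a rough back-of-the-envelope sketch of that constraint, the memory needed just to hold a model’s weights is the parameter count times the bytes per weight. The 3-billion-parameter figure and the precisions below are illustrative assumptions, not numbers from Apple’s presentation, and the estimate ignores the working memory needed for the computation itself.

```python
def model_memory_gb(num_parameters: float, bits_per_weight: int) -> float:
    """Approximate memory (in gigabytes) needed to hold a model's weights."""
    bytes_total = num_parameters * bits_per_weight / 8
    return bytes_total / 1e9


# Hypothetical on-device model: 3 billion parameters (an assumption).
PARAMS = 3e9

for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {model_memory_gb(PARAMS, bits):.1f} GB")
```

Even with aggressive compression, a model in this range takes up a meaningful share of a phone’s memory before the operating system and apps get any, which is why devices with less memory are left out.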

After mulling over the announcements from Monday for a few days, I have thoughts on each of the integrations and how users might use them. I think Monday was one of the most impressive, remarkable, and noteworthy developer conferences Apple has hosted in recent years — at least since 2020 — and while I haven’t tried Apple Intelligence yet, I’m very intrigued to learn more about its capabilities and how it will shape the nascent future of Apple’s platforms. Here are my takeaways from the Apple Intelligence portion of Monday’s keynote.


Siri and App Intents

An image of the new Siri running on a variety of Apple devices. The new Siri. Image: Apple.

Siri finally received a much-needed update, further integrating the assistant within the system and allowing it to perform actions within apps. The new version of Siri uses “richer natural language understanding,” powered by Apple Intelligence, to allow users to query the assistant just as they would a person, adding pauses in speech, correcting mistakes, and more. It also can transform into what is essentially an AI chatbot by allowing users to type into a text field by double-tapping at the bottom of their iPhone or iPad screen, featuring a new, rounded interface and animation that wraps around the device’s bezel and using Apple Intelligence to parse questions. Siri also knows exactly what is on the screen of someone’s device at a given moment; instead of having to ask Siri about a particular show, for example, a user can ask: “Who stars in this?” If a notification pops up, Siri knows of its contents and can perform actions based on the newfound context.

Siri now utilizes personal information from all apps, adding emails, text messages, phone call summaries, notes, and calendar events (all information stored on iCloud or someone’s phone) to what amounts to a knowledge graph that is part of the foundation models’ training data, which Apple calls the Semantic Index. This information is used as personal context for Siri, and any app can contribute its data to the context pool. The current version of Siri in iOS 17 does perform searches, but those searches are only keyword-based; if someone asks for a specific detail from an old text message thread, for example, Siri wouldn’t be able to find it. The new version leverages its own intuition to search through user-generated content, going beyond basic regular expressions and keywords and using semantic searches instead. Additionally, Apple Intelligence can use its summary capabilities to catch users up on messages, emails, and notes, similar to the Humane Ai Pin and Rabbit R1’s ambitions.
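The difference between keyword and semantic search can be sketched in a few lines of Python. This is a toy illustration, not how the Semantic Index works: the messages are made up, and the three-number vectors are hand-written stand-ins for the embeddings a real model would produce.

```python
import math

# Made-up messages mapped to hand-written "embedding" vectors (assumptions).
MESSAGES = {
    "Mom lands at 6:40 pm, terminal B": [0.9, 0.1, 0.0],
    "Dinner reservation moved to Friday": [0.1, 0.9, 0.1],
    "Your package was delivered": [0.0, 0.1, 0.9],
}


def keyword_search(query: str, messages: dict) -> list:
    """Return messages that share at least one exact word with the query."""
    words = set(query.lower().split())
    return [m for m in messages if words & set(m.lower().split())]


def cosine(a: list, b: list) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def semantic_search(query_vector: list, messages: dict) -> str:
    """Return the message whose vector is closest in meaning to the query."""
    return max(messages, key=lambda m: cosine(query_vector, messages[m]))


query = "when does the flight arrive"
query_vector = [0.8, 0.2, 0.1]  # hypothetical embedding of the query

print(keyword_search(query, MESSAGES))
print(semantic_search(query_vector, MESSAGES))
```

The keyword search comes back empty because the message about the flight never uses the words “flight” or “arrive,” while the vector comparison still surfaces it.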

The most remarkable new feature is Siri’s ability to take action in apps. Using a technology called App Intents, which exposes actions from apps to the system, Siri can use a prompt to decide what actions to run without intervention from a user. Because Siri has the advantage of personal context, it already knows what data is available to be acted upon, so if a user wants to, say, send a note made earlier as an email, they can simply instruct Siri to do so without having to name the note or where it is located in the system, such as what app it’s in. Siri also uses its vision capability to use what is on the screen as context — a user can ask Siri to fetch a particular photo simply by describing it, and then ask for it to be inserted into the current document. It’s a perfect example of “late but still great” that Apple perfectly achieves: Apple is combining four features — LLMs, personal context, on-screen context, and App Intents — into one without even notifying the user of each step. It’s nothing short of magic.

Developers of apps that belong to any category in Apple’s predefined list — examples include word processing, browsing, and camera apps — can add App Intents for the Apple Intelligence-powered version of Siri to use with some modifications to their code, just as they would to add support for interactive widgets or Shortcuts. Somewhat interestingly, apps that aren’t part of Apple’s list aren’t eligible to be used with the new Apple Intelligence version of Siri. They can still expose shortcuts to Siri, just as they did in previous versions of Apple’s operating systems, but Siri will be unable to interface with other apps to perform actions in one step. Apple says it’ll be adding more app categories in the coming months, but some niche apps inevitably won’t be supported at all, which is a shame. Skimming the rumors over the past year, I expected Apple would be using a more visually focused approach, learning the behavior of user-facing buttons and controls within apps, but Siri’s actions are all programmatic.

Either way, the new version of Siri amounts to two things: an AI chatbot with a voice mode, and a “large action model.” That combination will sound familiar to keen observers because it’s exactly what Rabbit aimed to achieve with the R1 in April — except that time, it “relied” heavily on vision to learn the user-facing graphical user interfaces of websites to perform actions on behalf of users. (It didn’t do that — it was a scam.) Apple, in contrast, has constructed a much more foolproof solution, but one that will also inevitably be neglected by large app developers for an indefinite amount of time. Here’s why: Developers who integrate App Intents will notice that the amount of time people spend in their apps will drop significantly, because reducing that time is inherently the entire point of virtual assistants. Large developers owned by corporate giants see that as the antithesis of their existence on the App Store entirely — they’re there to make money and advertise while tracking users, and Apple’s latest technology will not let them accomplish that central goal.

For the few apps that support it, it’ll feel like true magic, because in many ways, it is magic. It’s not Apple’s fault: This is just the cost of doing business with humans rather than robots — humans have their own thoughts about how they want to conduct trade, and those thoughts will clash with Apple’s ideas, even if Apple’s approach is more beneficial to the user. For Apple’s apps, which most people use anyway, the new version of Siri will, for the first time in Siri’s 13-year-long career, feel intelligent and remarkable. Just hearing about it makes me excited because of how much technical work went into combining each of these features into harmonic software bliss. But Apple also did what Apple, unfortunately, so often does: it put the onus on developers instead of itself. Apple and its users will ask why app developers won’t support true magic because it is magic, but, getting down to brass tacks, the answer is clear: money. When taking into account the greediness of the world’s largest app developers like Meta and Google, I have a tough time imagining this portion of Apple Intelligence will thoroughly change how people use their devices.

What will make a difference in the way people interact with their devices is the chatbot capabilities of Siri alone. Because Siri is now powered by LLMs and the Semantic Index, it’s naturally much smarter. No longer will Siri fail to understand simple questions because of its inability, still present in the current version, to map complicated, human-like sentences to its corpus of knowledge; it will soon have added context. For example, if someone wants to know what is on their screen — say, they just want to look it up — they can double-tap the bottom of their screen and ask Siri. Siri can then send it to someone, add it to a note, or add it to a note and send it to someone all in one step. It’s an AI chatbot, similar to ChatGPT, except it’s more focused on answering personal questions rather than general knowledge ones. When Siri does need to connect to the internet, as it often does to answer people’s myriad curiosities, it can either perform a normal web search or integrate with ChatGPT.

By bringing ChatGPT — not its chatbot interface, as leakers have speculated, but just the model[1] — into Siri, and by extension, the entire system, Apple makes Siri genuinely intelligent. There’s no need to be thrown into an external app or interface because ChatGPT’s answers appear inline, just like other Siri answers from previous versions of iOS, but this time, those results are personalized, useful, and link to the web only when necessary. ChatGPT almost certainly will hallucinate, but (a) Apple provides an on-screen warning when connecting with ChatGPT, which states that sensitive information should be double-checked manually, and (b) that is simply the limit of this technology in 2024. OpenAI may cut down on hallucinations in the future, probably as part of a new GPT-5 model, but for now, Apple has done everything that it can to make Siri as smart as possible.

Siri will continue to make web searches, but as the web gets worse, the best hope for finding information effortlessly is ChatGPT. Coupled with personal context, having an Apple-made chatbot built into every iPhone in the future will be a feature many millions of people will enjoy. With Apple Intelligence, Apple has fully realized Siri’s potential — the one it architected in 2011. Siri is no longer just an “assistant” unable to understand most human queries while deflecting to Bing. It is the future of computing, a future start-ups like Humane and Rabbit have been trying to conquer before Apple single-handedly put them to shame in two hours on a Monday. While Apple won’t call it a chatbot, it’s an Apple chatbot, building in the privacy and security Apple customers have come to expect from Cupertino, all the while enabling the future of computing. This, without a doubt, is the most groundbreaking component of Apple Intelligence.


Summaries

An image of priority notifications in iOS 18. Priority notification summaries in iOS 18. Image: Apple.

One of the tasks in which LLMs typically succeed is summarization of text, so long as the wall of information fits within the model’s context window. Naturally, Apple has added summarization features to every place in its operating systems imaginable, such as Mail, Notes, Messages, notifications, and Safari. These blurbs are written by Apple’s own foundation models, which Cook, Apple’s chief executive, has said have a nearly 100 percent success rate, and so Apple doesn’t even bother with adding labels to summarized content. Gianandrea, the Apple ML chief, told Gruber on “The Talk Show” that Apple will also be more permissive in the content Apple Intelligence summarizes: While Apple Intelligence will refuse to generate illegal or explicit content, it will not refuse to summarize content it has already been given, even if that content goes against Apple’s creation guidelines. I find this a relief: If a user provides questionable material to ChatGPT and asks it to summarize or rewrite it, for example, it will refuse even when it shouldn’t. AI researchers, such as Gianandrea, work to minimize these so-called “refusals,” which will make the models more helpful.

In Mail and notifications, Apple Intelligence enables new “priority” summaries, handpicking conversations and notifications the system deems important. For example, instead of just showing the first two lines of an email in Mail — or the subject — Apple Intelligence will condense the main points of the correspondence into a sentence that provides just enough information at a glance. It’ll then surface the most important summaries, perhaps from a user’s most important contacts or crucial alerts from companies, at the top of the inbox, complete with an icon indicating that the message has been summarized. Mail will also categorize emails, similar to Gmail, into four discrete sections at the top of the inbox for easy organization. Notifications also receive the same treatment, with priority notifications summarized and placed at the top of the notification stack. If someone sends multiple text messages in a row, for example, they will be condensed and placed in the summary. These small additions will prove handy, especially when a user is away from their devices for a while. I’m a fan.

The same summarization of notifications is also used to power a “Minimize Distractions” Focus, which is offered alongside Do Not Disturb. While Do Not Disturb, by default, silences all notifications, Minimize Distractions queries Apple Intelligence to take into consideration the content and context of a notification to determine if it is important enough to break through the filter or not. While I assume users will be able to manually select contacts and apps that’ll always remain whitelisted, similar to any other Focus, the system does most of the work in this mode. When Apple Intelligence surmises a notification is important, it will label it as “Maybe Important,” akin to “Time Sensitive” labels in current versions of iOS. Messages labeled “Maybe Important” will be summarized and grouped automatically, parallel to “priority” notifications. I think Minimize Distractions should be the new default Do Not Disturb mode for most people — it’s versatile, I think it’ll work well, and it lifts the burden of customizing a Focus from the user to the operating system.

Mail, Phone, and Notes also now feature summarizations at the top of conversations. In Mail, a Summarize button can be tapped to reveal a longer summary — roughly a paragraph — and in Notes and Phone, users can now record a call to generate a summary after it’s over in the Notes app. Without a doubt, the latter feature will be used to create text-only notes for personal use because many jurisdictions require both parties of a call to consent to a recording (this is why iOS has prohibited call recording since its introduction), but I think the feature is clever, and it’ll come in handy for long, information-dense calls. Also in Mail, Smart Reply will scan emails for questions, then prompt a user to answer each one so they don’t miss an important detail. These prompts are in the form of Yes/No questions presented in a modal sheet, and tapping on a suggestion automatically writes the answer into the email.

Safari’s summarization feature, however, is destined to be the most used: Near the Reader icon in the toolbar, users can choose to quickly summarize an article to receive the gist of it. These summaries are created through Reader Mode — the Safari view which allows users to read a clutter-free version of an article — and rely on Apple’s models to provide quick summarization. For once, it’s nice to see an AI tool that interfaces with the web and doesn’t disincentivize going to websites and giving publishers traffic. This is easily one of the best use cases for AI tools, and I’m glad to see Apple embracing it.

More broadly, the central idea of Apple Intelligence begins to crystallize in the case of its text summarization features: AI assistants — whether they be Siri, Google Assistant, Alexa — have always required active engagement to be helpful. Someone asks an assistant a question, but a good human assistant never needs to be asked for help. Assistants should work passively, helping with busy work nobody wants to do. Summarizing notifications, replacing (worthless) two-line previews in the email inbox with one-sentence blurbs, filtering unnecessary messages and whittling them down to the bare minimum, and quickly drafting call notes are all examples of Apple entering into the lives of millions to assist with tasks many don’t even know need to be done. Nobody thinks the two-line message previews in Mail are useless because, from the conception of email and the internet, that was always how they appeared. Now, there’s no need for a subject or preview where the first line is almost always a greeting — AI can make email more enjoyable and quick.

While the new Siri features are, as I said before, examples of active assistance, i.e., a user must first ask for help, Apple Intelligence is also meant to proactively involve itself in its users’ lives — and come to think of it, it’s logical. AI might flub or make up answers confidently, but so would a person; nobody would discard an email just from the summary. They’d use it to determine if it’s worth reading immediately or later. Similarly, by passively engaging users, the system decreases human reliance on AI while simultaneously making a meaningful difference in everyday scut work. This should be a core tenet of AI that other companies should make a note of — while one might think these features are just text summarization, they compose a much broader theme. Apple, chiefly, is leveraging its No. 1 advantage over OpenAI or Microsoft, that it uniquely can blend into people’s lives passively, without interruption or nuisance, while also providing a helpful service. I know the phrase gets overused, but this is something only Apple could do.


Writing Tools

An image of the writing tools menu in macOS 15 Sequoia. Writing Tools in macOS 15 Sequoia. Image: Apple.

Apple continued its practice of “Sherlocking”[2] by practically adding a supercharged version of Grammarly into every system-wide native text field in iOS and macOS. What Apple means by “native text field” is unclear, but I have to assume it’s referring to fields made with Apple’s own developer technologies for writing text. Examples presented onstage as supporting Writing Tools, the suite of features, include Bear, Craft, and Apple’s own Pages, Notes, and Keynote. The suite encompasses a summarization tool for users to have their own text summarized, as well as tools to write key bullet points and create tables or lists out of data in paragraph form — a feature I think many will find comforting because of how arduous graphs and tables can be to put together. The two grammar correction features allow users to have the system proofread and rewrite their text — both tools use the language models’ reasoning capabilities to understand the context of the writing and modify it depending on a user’s demands.

One humorous example Apple presenters highlighted onstage was rewriting a résumé more professionally when it was originally casual, but it perfectly illustrated the benefits of having a system-wide, contextually aware writing assistant within cursor’s reach. The proofreading feature underlines parts of the writing that may have grammar mistakes, similar to Grammarly, and suggests how to correct them — Federighi highlighted how all suggestions can be accepted with just one tap or click, too. If none of the pre-made suggestions in Writing Tools are applicable, a user can describe what kind of changes they’d like Apple Intelligence to make using the “Describe your change” item at the top of the menu, which launches a chatbot-like interface for text modifications. The feature set seems well thought-out, and I think it’s a major boon to have a smart, aware grammar checker built into operating systems used by billions.

While Apple’s foundation models — which run on-device and in the cloud via Private Cloud Compute depending on the complexity and length of the text, I surmise — are programmed to assist with modifying already user-generated writing, ChatGPT was demonstrated as able to write stories and other creative works with just the click of a button and a prompt in the Writing Tools pane. People who use Apple devices shouldn’t have to go to the ChatGPT app or website anymore to have OpenAI’s chatbot write something or help them conduct research because it’ll be built into the system. I think this is the most useful and clear example of Apple’s ChatGPT introduction shown in the keynote. Apple is transparent about when it is sending a request to ChatGPT; even if a user explicitly asks for ChatGPT to handle the query, the system prompts them one more time to confirm and tells them that ChatGPT’s work may have errors due to hallucinations. Still, I think this specific, intentional integration is more helpful than building a full-on GPT-4o interface into iOS, for instance.

Apple evidently wants to draw a boundary between ChatGPT and its own foundation models while concurrently having the partnership jibe well with the rest of its features. It doesn’t feel out of place, but it’s easily an afterthought; I could envision Apple Intelligence without OpenAI’s help easily. Still, with all of its down-ranking, OpenAI seems more than willing to trade providing free services to Apple customers for the exposure that comes with its logo appearing in front of billions. OpenAI wants to be to generative artificial intelligence what Sharpies are to permanent markers, and since Google is the company’s largest competitor, it’s working on a “the enemy of my enemy is my friend” philosophy. As I’ve said before, OpenAI seems to be in the “spend venture capital like it doesn’t matter” phase of its existence, which is bound to be time-limited, but for now, Apple’s negotiators struck an amazing deal: free.

Part of me wants to think ChatGPT isn’t Apple Intelligence, but nevertheless, it is — it just happens to be a less-emphasized part of the overall package. I don’t mind that: In fact, I’m impressed Apple is able to handle this much of the processing by itself. I’m almost certain, based on what has been shown this week, that Apple will soon[3] drop OpenAI as a partner and go all-in by itself once it’s able to generate full blocks of text on its own, something it currently is not very confident in. But since Apple has offloaded the pressure of text generation, it has also coincidentally absolved itself of the difficult task of content moderation. As I wrote earlier in this article, Apple Intelligence will not refuse to improve a text, no matter how egregious or illegal it may be, because Apple understands that it is not the fault of the chatbot if the user decides to write something objectionable. I favor this approach, and while some naysayers might blame the company for “rogue” responses, I think the onus should be placed on the prompters rather than the robot. If ChatGPT were given the task of summarizing everything a user wrote, it would fail, because the safety engineering is hard-coded into the model. With Apple’s own LLMs, it isn’t.


Image Playground and Genmoji

An image of the Image Playground app running in iPadOS 18. The Image Playground app in iPadOS 18. Image: Apple.

In the last section, I commended Apple for taking a more laissez-faire approach to content moderation, something I usually wouldn’t commend a technology giant for. I think it is the responsibility of a multi-trillion-dollar corporation like Apple to minimize the social harm its products can do, which is why I’m both profoundly repulsed and irritated by its new image generation features, called Image Playground and Genmoji. Both features are similar in that they (a) primarily handle prompting, i.e., they write a detailed prompt from the user’s simple request for the AI image generator; and (b) refrain from creating human-like imagery because of its high susceptibility to misuse. Both features are available system-wide but were primarily advertised in Messages due to their expressiveness, which leads me to believe that Apple felt pressured to create an image generation feature and thought of a semi-sensible place to put it at the last minute. While Genmoji — terrible name aside — was leaked by Mark Gurman of Bloomberg earlier, Image Playground is novel, and information about it is scarce.

Genmoji — a portmanteau of “generated” and “emoji” — generates AI emojis based on a user’s prompt, then renders them like any other text so they fit in with the words and emojis in a text message. I believe these synthetic emojis are only available in Messages because they aren’t part of the Unicode emoji standard, so Apple has to do the work to make them render properly and fit within the bounds of text as part of its own proprietary iMessage protocol. If a person sends a Genmoji to an Android user, the recipient will receive it as a normal image attached to the text message. A user can describe any combination of existing emojis, or even new ones entirely, such as a giant cucumber. Genmoji can also be used to create cartoon-like images of people one has in their contacts, so a user can ask for a contact “dressed like a superhero,” for instance. Genmoji typically creates a few icons from a prompt so a user can choose which one they’d like to use.

Image Playground is Apple’s version of DALL-E from OpenAI or Midjourney: Users can create a “novel” image based on their description and choose from a variety of prompt suggestions that appear outside of a unique colorful bubble interface surrounding the generated photo. The feature is verging on a one-to-one copy of other AI image tools on the market, but perhaps with a more appealing, easy-to-use interface that suggests additions to prompts proactively. Users can also choose themes, such as seasons, to further customize the image — from there, they can save it to Photos, Files, or copy it. Image Playground isn’t limited to Messages and can be integrated into third-party apps via an application programming interface Apple has provided developers. There is also a dedicated Image Playground app that will be pre-installed on iOS devices for people to easily describe, modify, generate, and share AI images. Users can also circle pictures they’ve drawn and turn them into AI-generated pieces with a feature called Magic Wand, which is first coming to Notes. Like Genmoji, images made using Image Playground can resemble a person depending on data derived from personal photos.

The entire concept of AI-generated photography is abhorrent to me and many others, especially those who work in creative industries or who draw artwork themselves. While Apple has negated the safety concerns that arise from AI-generated artwork — the four pre-defined styles are intentionally not photorealistic, and each image has internal metadata indicating it is generated by AI — it has not put to ease concerns from artists alarmed by AI’s cheapening of the arts industry. Frankly, AI-generated artwork is disturbing, unrealistic, and not elegant to look at. It looks shoddily designed and of poor quality, with lifeless features and colors. If AI images looked like people had made them, a different problem would be at the forefront of the conversation, but currently, AI images are cheap, filthy creations. They’re not creative; they instead disincentivize and discourage creativity while inundating the internet with deceptive photos that trick people and feel spammy and artificial.

It’s tough to describe the feelings AI images cultivate, but they aren’t pleasant. And furthermore, to add insult to injury, Apple hasn’t provided any information as to how its models were trained, leaving open the possibility that real artists’ work was used without permission.[4] I expect this kind of behavior from companies like OpenAI and Google, which have both consistently degraded the quality of good artwork and information almost habitually, but not from Apple, whose late founder, Steve Jobs, proclaimed Apple was at the intersection of technology and liberal arts. The company has slowly but surely drifted away from those roots that made it so reputable in the first place, and it’s disheartening to observe. AI-generated art, whether it be presented in a cute bow and ribbon or a desolate webpage littered with obnoxious advertisements, is neither technology nor liberal arts — it is slop, a word that at this rate should probably win Word of the Year.

I’m less concerned about the social justice angle many have seemed to stake their beliefs in and more about the feelings this feature creates. Apple users, engineers, and designers all share the conviction that software should be beautiful, elegant, and inspiring, but oftentimes, the wishes of shareholders eclipse that unwaveringly essential ideal. This is one such occurrence of that eclipse — a misstep in the eyes of engineers and designers, but a benison to the pockets of investors. Apple has calculated that the potential uproar within a relatively, and probably measurably, minor slice of its user base is outweighed by the deep monetary incentives, and it worked for the C-suite executives. Will Image Playground and Genmoji change the way people use and feel about their devices? Possibly, maybe for the best, or maybe for the worse — but what they will do with resolute certainty is upend the value of digital artwork.


Photos

An image of the Photos app in iOS 18. The Photos app in iOS 18. Image: Apple.

Apple, alongside all of its image generation efforts, also brought updates to photo editing and searching, similar to Google in May. Users can search their photo libraries by “describing” what they’re looking for using natural language: This differs from Apple’s current implementation where users can search for individual items like lakes, trees, etc., because now people can combine multiple queries and refine searches by adding specific details. Think of it as a chatbot that can use visual processing to categorize photos, because that’s exactly what it is. People can also generate videos called “memory movies,” short clips made from specific moments created by AI, typically complemented with music and effects. The Photos app already creates Memories, which are similar, but this time, users can describe exactly what they’d like the video to be of. Examples include trips, people, or themes from images.

The most appreciated feature ought to be the Clean Up tool, which works exactly like Google’s Magic Eraser, which debuted with the Pixel 6 and 6 Pro in 2021. Apple Intelligence automatically identifies objects and people in the background of shots that might be distracting and offers to remove them from within the Photos app. Users can also circle a distraction themselves, and the image will be recreated just as if it weren’t there. Notably, this does not compete with Adobe’s Generative Fill or other similar features — it doesn’t create what wasn’t already there. As I wrote earlier, Apple’s features aren’t whiz-bang demonstrations; they’re practical applications of AI in the most commonly used apps. I’d assume these features will be powered solely by on-device processors, but they work on photos taken on any camera, not just an iPhone.

Unlike photo generation, photo editing is an area in which generative AI can assist with the more arduous work. Photoshop has been able to remove objects from the backgrounds of photos for decades, but it requires skills and a large, powerful computer. Now, those powerful computers are in the pockets of millions, and thus, there is no need to learn these skills except for when the result truly matters. For the smallest of touch-ups, so many people are going to be empowered by having an assistant that can perform these tasks automatically. Finding photos has always been hard, but now, Apple has essentially added a librarian to the photo library. Editing photos previously required skill and know-how, but now, it’s just one tap. It’s little things like these that make the experience of using technology more delightful, and I’m glad to see Apple finally embracing them.


What Apple announced on Monday might not sound revolutionary at first glance, but keen observers will realize that the announcements and their features change how people use their devices. Technology shouldn’t do my artwork and writing for me so I can do the dishes — it should do the dishes so I can do my writing and artwork. Apple Intelligence isn’t doing anyone’s dishes yet, but it’s one step closer: It’s doing the digital version of the dishes. Apple Intelligence subtly yet conspicuously weaves itself into every corner of Apple’s beloved operating systems for a reason: people shouldn’t have to learn how to use the computer; the computer should learn from the user. For the first time ever, Apple’s computers are truly intelligent. Yes, I believe the company has misstepped in certain areas, like its image generation features, but the broad, overarching theme of Monday was that the computer is now learning from humans. The intelligence no longer lives in a browser tab or an app — it’s everywhere, enveloped in the devices we carry with us everywhere. The future is now, or, I guess, whenever Apple Intelligence goes into beta later this year.


  1. Apple said ChatGPT Plus subscribers can sign in with their accounts to gain access to quicker, better models. As I’ve said earlier, this partnership feels a lot like Apple and Google’s deal to bring Google Search, Maps, and YouTube to the iPhone.

  2. “Sherlocked”: “The phenomenon of Apple releasing a feature that supplants or obviates third-party software…”

  3. I don’t have a timeline for this prediction, but I believe it’ll happen within the next few years, especially if OpenAI demands payment when it runs out of VC money. That time is coming soon, and I think Apple will be ready to ditch both Google Gemini — if it adds it in the first place; Federighi didn’t confirm anything — and ChatGPT as soon as it owes either company enormous royalties. Apple wants to be independent eventually, unlike with search engines. See: iCloud Mail or Apple Maps.

  4. Apple says Apple Intelligence was trained on a mix of licensed and public data from the internet. That public data most likely includes most websites since the user agent to disallow was only made public after Monday. Dan Moren of Six Colors wrote about how to disable Applebot-Extended on any website to prevent Apple from scraping its contents.
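For anyone who wants to sanity-check their own site’s rules, here is a small sketch using Python’s standard `urllib.robotparser`. The `Applebot-Extended` agent name is the one mentioned in the footnote above; the robots.txt contents and the example.com URL are illustrative assumptions of mine, and this only checks how the rules parse, not what Apple’s crawler actually does with them.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt (an assumption, not copied from any real site):
# opt out of Applebot-Extended while leaving every other crawler alone.
ROBOTS_TXT = """\
User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""

# Parse the rules from the string instead of fetching them over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Placeholder URL used only for the check.
page = "https://example.com/blog/post"

extended_allowed = parser.can_fetch("Applebot-Extended", page)
search_allowed = parser.can_fetch("Applebot", page)

print("Applebot-Extended allowed:", extended_allowed)
print("Applebot allowed:", search_allowed)
```

With these rules, the check reports the page as off-limits to Applebot-Extended while still open to the regular Applebot agent, so the opt-out wouldn’t also remove a site from Apple’s search features.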

Gurman: Apple AI to Be Called ‘Apple Intelligence’

Mark Gurman, leaker extraordinaire, reporting for Bloomberg:

At its annual Worldwide Developers Conference Monday, Apple will announce plans to deeply integrate AI into its major apps and features — all while reiterating a commitment to privacy and security.

The company’s new AI system will be called Apple Intelligence, and it will come to new versions of the iPhone, iPad, and Mac operating systems, according to people familiar with the plans. There also will be a partnership with OpenAI that powers a ChatGPT-like chatbot.

As John Gruber, the author of Daring Fireball, wrote on Threads, I’m keen to see where this artificial intelligence chatbot will be placed within the operating systems. I speculated in May that the partnership might simply consist of a pre-installed ChatGPT app, but the more that I hear about the deal, I think it’ll be more integrated within iOS. I don’t think it’ll be a part of Siri, however — Apple won’t want to destroy its own brand just to replace it with OpenAI’s chatbot. This is a curious aspect of the deal with OpenAI though because there aren’t that many places I’d want an AI chatbot except in an app or on the web, and OpenAI already has both of those cases covered by itself. My final long-shot guess is that it’s built into Spotlight or the URL field in Safari, which already acts as a pseudo-search engine.

“Apple Intelligence,” as off-putting as the name may sound, is quite clever. Apple knows that people will still call it “AI,” so it might as well be a play on words. What’s more noteworthy is that Apple will presumably not be renaming Siri entirely, and neither will it use the Siri name for its AI products. Apple Intelligence is separate from Siri, yet Siri uses Apple Intelligence to answer questions. I assume Apple will market the new version of Siri, which The Information leaked in May, as “Siri powered by Apple Intelligence” because it does not want to destroy the fame of Siri; it just wants to improve it. Confusingly, Siri has always been used as a general moniker for Apple’s machine learning technology, like Siri Suggestions, which don’t involve the voice assistant at all, and that makes the situation all the more peculiar. Here’s how I’d draw the chart: Siri is machine learning and Apple Intelligence is generative artificial intelligence. Siri uses Apple Intelligence, but Apple Intelligence doesn’t use Siri. Apple Intelligence is the consumer name for Ajax.

The new capabilities will be opt-in, meaning Apple won’t make users adopt them if they don’t want to. The company will also position them as a beta version. The processing requirements of AI will mean that users need an iPhone 15 Pro or one of the models coming out this year. If they’re using iPads or Macs, they’ll need models with an M1 chip at least.

I hope this will calm the inevitable furor from conservative technology users such as those who still willingly use Mastodon as their primary social network. Microsoft’s new Recall feature, available on its new Copilot+ PCs, has sparked anger from the community because it is enabled by default on all compatible machines, so Apple’s decision to label Apple Intelligence as a beta and leave it disabled by default is a good one. (Microsoft said on Friday that it would make the feature opt-in.) I would also guess that Apple will advertise the AI features somewhere in the operating system, such as when setting up a new device for the first time, because they’re the shiny new highlights of Monday’s developer conference. Apple doesn’t want to hide them; it just wants to make them easy to ignore.

The processing requirements are also understandable, and I don’t think anyone was seriously holding out hope that Intel Macs from five years ago would be able to run the new AI features. I’m even surprised M1 Macs are supported. I think macOS 15 — the upcoming version — will finally begin to sunset Intel Macs, though I don’t think they’ll lose support entirely until next year. On the iPhone side, I can already see headlines and posts on social media about how Apple is “ripping off” its consumers by requiring the latest-generation iPhones to run the new features, but I truly do think the A17 Pro is required to run large language models — the technology that powers generative AI — on-device due to memory limitations.

Xcode, Apple’s software for developing apps, is getting a big AI infusion too. It will work similarly to Microsoft Corp.’s GitHub Copilot, which can complete code for programmers automatically. Though Apple has already been using this new developer tool internally, it’s unlikely to release it in its full form to third-party developers until next year.

This is extremely exciting, though I also would like some kind of chatbot interface, perhaps developed by OpenAI, that explains and elaborates on Apple’s Swift and SwiftUI developer documentation. Beta versions of Apple’s latest software development kits are often under-documented and a chatbot could prove quite useful.

The Settings app, which has remained generally unchanged since the first version of the iPhone, is getting updated on iOS, iPadOS, and macOS with a focus on improved navigation, better organization, and more reliable search.

The System Settings app on macOS is one of the worst pieces of software Apple has ever produced, and it is shameful that it took the company two years to rectify it. “Better organization” might be helpful, but there are also many interface tweaks that must be made to SwiftUI on the Mac to make the app feel more intuitive. For example, when typing in a text field in System Settings in English, a left-to-right language, the field is aligned to the right. How is that acceptable? The entire app needs a gut-and-redo with a focus on a normal organizational structure, less modality, and customizable window sizes.

Apple is launching a Passwords app for iOS 18, iPadOS 18, and macOS 15 that will offer an alternative to the 1Password and LastPass services. This will essentially be an app version of the company’s long-existing iCloud Keychain feature, which is currently hidden in the Settings app.

At this point, I’m desperate to switch away from 1Password, so I’m excited to see what Apple has created here.

‘How the Humane Ai Pin Flopped’

Tripp Mickle and Erin Griffith, reporting for The New York Times:

Days before gadget reviewers weighed in on the Humane Ai Pin, a futuristic wearable device powered by artificial intelligence, the founders of the company gathered their employees and encouraged them to brace themselves. The reviews might be disappointing, they warned.

Humane’s founders, Bethany Bongiorno and Imran Chaudhri, were right. In April, reviewers brutally panned the new $699 product, which Humane had marketed for a year with ads and at glitzy events like Paris Fashion Week. The Ai Pin was “totally broken” and had “glaring flaws,” some reviewers said. One declared it “the worst product I’ve ever reviewed.”

It is literally the most embarrassing thing for any company to assume the reviews for its product are going to be “disappointing” before reviewers even publish their work. That’s how confident Humane was: not very confident at all. No good product maker would sell a product it thinks is sub-par from the get-go, but of course, Humane did just that because the Ai Pin is a cheap grift designed to please its venture capitalist investors, and it knows that. Humane never cared about making a product; it just cared about making money. Speaking of money:

About a week after the reviews came out, Humane started talking to HP, the computer and printer company, about selling itself for more than $1 billion, three people with knowledge of the conversations said. Other potential buyers have emerged, though talks have been casual and no formal sales process has begun.

Humane retained Tidal Partners, an investment bank, to help navigate the discussions while also managing a new funding round that would value it at $1.1 billion, three people with knowledge of the plans said.

Humane now wants to sell itself to HP because it has done the job it promised to investors: to deliver a product and sell it successfully. “Successfully” is really only defined in the eye of the beholder, but I assume Humane just thinks that means “scam enough people out of $700 to make a dent in the balance sheets.” Whatever it is, Humane now needs to make money to pay its investors and severance for its employees, and the owners will book it into the woods with whatever is left of the money pot. It’s a classic failed Silicon Valley startup. But why didn’t Humane profit earlier; why isn’t it profitable now?

As of early April, Humane had received around 10,000 orders for the Ai Pin, a small fraction of the 100,000 that it hoped to sell this year, two people familiar with its sales said. In recent months, the company has also grappled with employee departures and changed a return policy to address canceled orders. On Wednesday, it asked customers to stop using the Ai Pin charging case because of a fire risk associated with its battery.

That explains it. To be profitable, or to meet a standard of success, the company needed to make a good product. Instead, the Ai Pin’s charging case is a fire risk. On top of that, the device is essentially worthless, slow, and expensive, so nobody other than a few enthusiasts bought one. Humane reached only 10 percent of its self-set sales target, and that wasn’t enough to be profitable. Profit is a byproduct of making a good product, but it is not the product itself and certainly shouldn’t be the goal. Humane’s primary objective, as I stated earlier, was not to make a good product (it already knew its device was garbage) but to make money, and when a company works with that ethos, it’s destined to fail.

Many current and former employees said Mr. Chaudhri and Ms. Bongiorno preferred positivity over criticism, leading them to disregard warnings about the Ai Pin’s poor battery life and power consumption. A senior software engineer was dismissed after raising questions about the product, they said, while others left out of frustration.

That doesn’t surprise me because I know why this gadget was even created in the first place: A Buddhist monk led Humane’s founders to an angel investor who persuaded them to build a miraculous phone replacement that would later become the Ai Pin. This is a true story. With directionless la-la-land founders like that, the project was designed to fail. I don’t think the problem is toxic positivity, as The Times suggests. Instead, I believe Chaudhri and Bongiorno’s immense egos prevented staffers from raising questions about the device’s premise. It took the general public a day of mulling over Humane’s horrible launch video to realize Humane was dead in the water, but Humane’s intelligent workers weren’t able to say the same after five years of work? I doubt it.

One was the device’s laser display, which consumed tremendous power and would cause the pin to overheat. Before showing the gadget to prospective partners and investors, Humane executives often chilled it on ice packs so it would last longer, three people familiar with the demonstrations said. Those employees said such measures could be common early in a product development cycle.

Never work for a company whose founders are egotistic.

In January, Humane laid off about 10 employees. A month later, a senior software engineer was let go after she questioned whether the Ai Pin would be ready by April. In a company meeting after the dismissal, Mr. Chaudhri and Ms. Bongiorno said the employee had violated policy by talking negatively about Humane, two attendees said.

That is an unbelievably ridiculous policy, one that I don’t even think Apple has in place. Surely letting go of an employee for internal criticism of the company they work for is illegal.

Humane, as of now, is a disaster. Ai Pins are internally combusting, the company is in financial crisis, its employees are dissatisfied, its founders are listening to Buddhist monks while reprimanding smart people, and it’s now trying to sell to a printer company in the news for disabling printers because customers don’t want to pay subscriptions. All of this, and Humane is still selling its tiny stovetop for $700 while charging users $25 a month. And if Humane does sell to HP and closes up shop, every last Humane Ai Pin will become e-waste, because without the backend subscription Humane operates, the pin is a paperweight.

Now, please, no more about this company.

Apple’s AI Ambitions Become More Clear

Mark Gurman has been slowly leaking Apple’s artificial intelligence ambitions and features to be revealed at June’s Worldwide Developers Conference over the past few months, but his report on Sunday in his Power On newsletter for Bloomberg is the most complete picture we’ve seen yet.

Apple is preparing to spend a good portion of its Worldwide Developers Conference laying out its AI-related features. At the heart of the new strategy is Project Greymatter — a set of AI tools that the company will integrate into core apps like Safari, Photos, and Notes. The push also includes operating system features such as enhanced notifications.

I’m curious to learn more about “enhanced notifications” and how they’ll be powered by AI. Something I want Apple to be careful with is shoving AI into everything: if something doesn’t need AI, it shouldn’t use it. “Project Greymatter” is also an interesting name we haven’t heard before. (I have no idea what “Greymatter” is; Wikipedia says it’s a type of blogging software developed “by Noah Grey in November 2000.”)

The system will work as follows: Much of the processing for less computing-intensive AI features will run entirely on the device. But if a feature requires more horsepower, the work will be pushed to the cloud.

Apple is bringing the new AI features to iOS 18 and macOS 15 — and both operating systems will include software that determines whether a task should be handled on the device or via the cloud.

I’m glad we’re starting to see some clarification on which tasks will be allocated to the on-device chips and which ones will be handled by cloud infrastructure, powered by Apple’s in-house M2 Ultra processors. I assume the “software” that delegates tasks is just an internal daemon that runs to intelligently determine which operations are processor-intensive enough to warrant the extra complexities and network issues that come with sending data to the cloud, but it’s also interesting to know that specific tasks aren’t always set to run on-device or in the cloud. I had assumed tasks that Apple thinks from the get-go are less intensive — like summarization of articles, for example — would be hard-coded to always use internal processors.
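To make my guess about that delegating software concrete, here is a rough Python sketch of a router that decides per task rather than by a hard-coded list. Everything in it is hypothetical: the `Task` and `Target` types, the `estimated_cost` score, the `ON_DEVICE_BUDGET` threshold, and the offline fallback are my own illustrative assumptions, not anything Gurman reported or Apple has described.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    ON_DEVICE = "on-device"
    CLOUD = "cloud"


@dataclass(frozen=True)
class Task:
    name: str
    estimated_cost: float  # made-up unit of compute, not a real metric


# Hypothetical budget; Gurman's report gives no real numbers.
ON_DEVICE_BUDGET = 10.0


def route(task: Task, network_available: bool = True,
          budget: float = ON_DEVICE_BUDGET) -> Target:
    """Pick where a task runs based on its estimated cost.

    Tasks within the budget stay local. Heavier tasks go to the cloud,
    but only when a network connection exists; otherwise they fall back
    to the device.
    """
    if task.estimated_cost <= budget or not network_available:
        return Target.ON_DEVICE
    return Target.CLOUD


if __name__ == "__main__":
    # Both tasks and their costs are invented for illustration.
    for task in (Task("summarize article", 4.0), Task("generate image", 40.0)):
        print(f"{task.name}: {route(task).value}")
```

The point of the sketch is only that the decision is made at run time from a cost estimate, which matches the report's claim that tasks aren't permanently pinned to one side or the other.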

One standout feature will bring generative AI to emojis. The company is developing software that can create custom emojis on the fly, based on what users are texting. That means you’ll suddenly have an all-new emoji for any occasion, beyond the catalog of options that Apple currently offers on the iPhone and other devices.

I have two emoji requests: a side-eye emoji and a chef’s kiss emoji. That being said, this rumored feature feels like a gimmick — I don’t think these will actually be legitimate emojis available from the keyboard because they’d all have to be part of the Unicode standard to be viewable on all devices. Instead, they’ll probably just be stickers like the iMessage stickers currently available, exportable to PNGs by dragging and dropping.

Another fun improvement (unrelated to AI) will be the revamped iPhone home screen. That will let users change the color of app icons and put them wherever they want. For instance, you can make all your social icons blue or finance-related ones green — and they won’t need to be placed in the standard grid that has existed since day one in 2007.

I’m having a tough time understanding how this will work. Does that mean users will finally be able to change app icons to whatever image they’d like, just like on Android? If so, I’m excited. But if this feature simply adds a filter to developers’ existing app icons, it’s underwhelming. (I’ve also never understood the craze for being able to place apps anywhere on the Home Screen, but I’m sure someone is excited about it.)

A big part of the effort is creating smart recaps. The technology will be able to provide users with summaries of their missed notifications and individual text messages, as well as of web pages, news articles, documents, notes, and other forms of media.

Gurman indicates that all of these features will be powered by Apple’s own bespoke large language model, called “Ajax” internally, while OpenAI’s generative pre-trained transformer, which powers ChatGPT, will only be used to power a chatbot, which Apple hasn’t been able to develop yet. It is unclear if Ajax is the final name; I don’t think it’s very Apple-esque to give the underlying technology a name other than “Siri Intelligence” or something similar. As Gurman writes:

There’s also no Apple-designed chatbot, at least not yet. That means the company won’t be competing in the highest-profile area of AI: a market that caught fire after OpenAI released ChatGPT in late 2022.

Though some of Apple’s executives are philosophically opposed to the idea of an in-house chatbot, there’s no getting around the need for one. And the version that Apple has been developing itself is simply not up to snuff.

The solution: a partnership. On that front, the company has held talks with both Google and OpenAI about integrating their chatbots into iOS 18. In March, it seemed like Apple and Google were nearing an agreement, and people on both sides felt like something could be hammered out by WWDC. But Apple ultimately sealed the deal sooner with OpenAI Chief Executive Officer Sam Altman, and their partnership will be a component of the WWDC announcement.

I assume this excerpt is Gurman insinuating the deal between Apple and OpenAI has officially been signed, and that Altman will present the partnership much as Hans Vestberg, the chief executive of Verizon, announced the 5G partnership between Verizon and Apple during Apple’s iPhone 12 “Hi, Speed” event in October 2020. I am curious about what a chatbot “built into” iOS 18 means. According to Gurman’s newsletter, it’s not powering Siri, and OpenAI already has ChatGPT apps for iOS and macOS that are native and reliable. How much more integrated could the chatbot be? Will it be contextually aware, similar to Google’s “Circle to Search” feature, or does this just mean the ChatGPT app will be pre-installed on new iPhones, akin to the first YouTube app?

It might also be that ChatGPT won’t generate Siri’s answers per se, but will instead be used to create answers when Siri thinks the query is too complex, much as Apple partnered with Wolfram Alpha in Siri’s earliest days. That would be potentially interesting, but it feels like a step back for Apple. (I would take anything to make Siri better, though.)

Altman has grown increasingly controversial in the AI world, even before a spat last week with Scarlett Johansson. OpenAI also has a precarious corporate structure. Altman was briefly ousted as CEO last year, generating a crisis for employees and its chief backer, Microsoft.

In other words, Apple can’t be that comfortable with OpenAI as a single-source supplier for one of iOS’s major new features. That’s why it’s still working to hash out an agreement with Google to provide Gemini as an option, but don’t expect this to be showcased in June.

That secondary agreement with Google — which apparently has not been signed yet — is very unusual, and I’m surprised OpenAI even agreed to its possibility. I guess the contract Apple and OpenAI signed is non-exclusive, meaning Apple can partner with any other company it wants. Even though Gurman cites Apple executives being uncomfortable with Altman’s company’s unpredictability, I don’t think users will care. Apple wants the best technology to be available to Apple consumers, and OpenAI makes the best LLMs — not Google. Yes, Apple prefers reliability and “old faithful” over flashy new companies — which is why it would make sense for it to extend its partnership with Google that it already has for Google Search — but “reliability” also means product reliability.

Google’s Gemini suite of AI products is anything but reliable, generating racially diverse Nazis and using Reddit answers in search summaries. AI, regardless of who it is made by, has generated significant controversy over the past few years, and no matter which company Apple partners with, it will continue to generate media headlines. If I were Apple, I’d opt for the company with the better product, because it’s not like Google’s public relations have been excellent either.

Also, consumers who aren’t attuned to the news every day aren’t going to know whenever there is a new feud at OpenAI. Users want better products, and if Gemini tells Apple users to eat rocks and gasoline pasta, it’s a poor reflection on Apple, not Google.

Google Search Summaries Tell People to Eat Glue

Jason Koebler, reporting for 404 Media:

The complete destruction of Google Search via forced AI adoption and the carnage it is wreaking on the internet is deeply depressing, but there are bright spots. For example, as the prophecy foretold, we are learning exactly what Google is paying Reddit $60 million annually for. And that is to confidently serve its customers ideas like, to make cheese stick on a pizza, “you can also add about 1/8 cup of non-toxic glue” to pizza sauce, which comes directly from the mind of a Reddit user who calls themselves “Fucksmith” and posted about putting glue on pizza 11 years ago.

Here is what I wrote about Google’s artificial intelligence right after the company’s I/O conference earlier in May:

The summaries are also prone to making mistakes and fabricating information, even though they’re placed front-and-center in the usually reliable Google Search interface. This is extremely dangerous: Google users are accustomed to reliable, correct answers appearing in Google Search and might not be able to distinguish between the new AI-generated summaries and the old content snippets, which remain below the Gemini blurb. No matter how many disclaimers Google adds, I think it is still too early to add this feature to a product used by billions. I am not entirely pessimistic about the concept of AI summaries in search — I actually think this is the best use case for generative artificial intelligence — but in its current state, it is best to leave this as a beta feature for savvy or curious users to enable for themselves.

In a statement to The Verge, Google dismissed these incidents as much ado about nothing, claiming they are isolated and appear only in results for uncommon queries. (Sundar Pichai, Google’s chief executive, said the same in an interview with Nilay Patel, The Verge’s editor in chief, although in a slightly backhanded way.) Meghann Farnsworth, a spokesperson for Google, said the company believes the mistakes come from “generally very uncommon queries,” even though that theory has been proven false time and time again. Generative artificial intelligence is prone to making mistakes because of the way large language models, the technology that powers generative AI, are made. Google knows it cannot solve that problem singlehandedly without further research, so it labels AI-generated blurbs at the top of Google search results as “experimental.”

Google’s mission when it announced that it would be bringing AI search summaries to all U.S. users by the end of the year was not to improve search for anyone; it was to signal to shareholders that the company’s AI prowess hasn’t been diminished by OpenAI, its chief rival. All press might be good press, but I truly don’t think this many incidents of Google’s AI flubbing the most basic of tests are very good for the company’s image. Google is known for being reputable and trustworthy, and it has shattered the reputation it so painstakingly built for itself in just a matter of weeks. The public’s perception of Google, and in particular, Google Search, has already been in a steady decline for the past few years, and the findings of people from all over the internet over the past week have further cemented the idea that Google’s main product is no longer as useful or capable as it once was.

These are not isolated incidents, and whenever representatives for Google have been confronted with that fact, they have never once tried to digest it and make improvements, as any sane, fast-moving company with a clear and effective hierarchical organizational structure would. Google does not have effective leadership, as Pichai’s nonsensical answer to Patel proves, so it is instead deflecting the blame and chastising users for typing in “uncommon queries.” Google itself has boasted about how thousands of new, never-seen-before queries are typed into Google each day, but now it is unable to manage its star product as it once did. Google Search is not dying (Bing and DuckDuckGo had an outage on Thursday and hardly anyone noticed), but it is suffering from incompetent leadership.

For now, Google needs to take the financial and perhaps emotional hit and pull search summaries from the public’s view, because recommending people eat glue is beyond ridiculous. And I think the company needs a fundamental reworking of its organizational structure to address the setbacks and issues that are preventing employees from voicing their concerns. The most employees have been able to do is add a “Web” filter to Google Search for users to be able to view just blue links with no AI cruft. There is no more quality control at Google, just like at a Silicon Valley start-up, and there is also no fast-paced innovation, unlike at a Silicon Valley start-up. Google is now borrowing the worst limitations from small companies and combining them with the operational headaches of running a large multinational corporation. That can only be attributed to ineffective leadership.

Microsoft Announces ‘Copilot+’ PCs

Umar Shakir, reporting for The Verge:

Microsoft brought Windows, AI, and Arm processors together at a Surface event on May 20th…

The big news of the day was Microsoft’s new class of PCs, dubbed Copilot Plus PCs. These computers have processors with NPUs built in so they can do more AI-oriented tasks directly on the computer instead of the cloud. The AI-oriented tasks include using a new Windows feature called Recall.

Microsoft also announced a new Surface Laptop and Surface Pro Tablet powered by Qualcomm’s Snapdragon X processors. That means they should be thinner, lighter, and have better battery-life while also handling AI and processor heavy tasks. And Microsoft wasn’t the only one at the event showing off new laptops. HP, Asus, Lenovo, Dell, and other laptop makers all have new Copilot Plus PCs.

An important thing to note is that “Copilot+” is not a new software feature. It’s the brand name for Microsoft’s new class of computers, many of which aren’t even made by Microsoft itself as part of its Surface line. “Copilot+” computers have minimum requirements for memory, storage, and neural processing units, or NPUs for short: 16 gigabytes of RAM, 256 GB of storage, and an NPU rated at 40 trillion operations per second to run the artificial intelligence features built into the latest version of Windows. These new AI features are called “Copilot,” a brand name that has been around for about a year. Here is Andrew Cunningham, reporting for Ars Technica:

At a minimum, systems will need 16GB of RAM and 256GB of storage, to accommodate both the memory requirements and the on-disk storage requirements needed for things like large language models (LLMs; even so-called “small language models” like Microsoft’s Phi-3, still use several billion parameters). Microsoft says that all of the Snapdragon X Plus and Elite-powered PCs being announced today will come with the Copilot+ features pre-installed, and that they’ll begin shipping on June 18th.

But the biggest new requirement, and the blocker for virtually every Windows PC in use today, will be for an integrated neural processing unit, or NPU. Microsoft requires an NPU with performance rated at 40 trillion operations per second (TOPS), a high-level performance figure that Microsoft, Qualcomm, Apple, and others use for NPU performance comparisons. Right now, that requirement can only be met by a single chip in the Windows PC ecosystem, one that isn’t even quite available yet: Qualcomm’s Snapdragon X Elite and X Plus, launching in the new Surface and a number of PCs from the likes of Dell, Lenovo, HP, Asus, Acer, and other major PC OEMs in the next couple of months. All of those chips have NPUs capable of 45 TOPS, just a shade more than Microsoft’s minimum requirement.

These new requirements, as Cunningham writes, essentially exclude most computers with processors made by Intel and Advanced Micro Devices built on the x86 platform. Microsoft and its partners are instead relying on Qualcomm’s Snapdragon Arm-based processors, which have capable NPUs and are more battery-efficient for laptops, to power their latest Copilot+ computers. Microsoft’s two Arm-based machines, the Surface Laptop and Surface Pro Tablet, run up to 58 percent faster than Apple’s newly released M3 MacBook Air, says Microsoft, though it didn’t specify how it measured the performance of the Qualcomm chips. I don’t believe the company’s numbers, especially since it says the new Surface machines have better battery life than the MacBook Air, which would truly be a feat.
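The Copilot+ minimums quoted above amount to a simple three-part check, which can be sketched in Python as follows. The three thresholds and the 45 TOPS figure for the Snapdragon X chips come from the Ars Technica excerpt; the `PCSpec` type, the `meets_copilot_plus` helper, and the two example machines (including their RAM and storage figures) are made up for illustration and don't describe any real product listing.

```python
from dataclasses import dataclass

# Minimums as reported by Ars Technica and quoted above.
MIN_RAM_GB = 16
MIN_STORAGE_GB = 256
MIN_NPU_TOPS = 40


@dataclass(frozen=True)
class PCSpec:
    name: str
    ram_gb: int
    storage_gb: int
    npu_tops: float  # use 0 for machines without an NPU


def meets_copilot_plus(spec: PCSpec) -> bool:
    """Return True only if all three reported minimums are satisfied."""
    return (
        spec.ram_gb >= MIN_RAM_GB
        and spec.storage_gb >= MIN_STORAGE_GB
        and spec.npu_tops >= MIN_NPU_TOPS
    )


if __name__ == "__main__":
    # Hypothetical configurations; only the 45 TOPS figure is from the article.
    machines = (
        PCSpec("Snapdragon X laptop (example)", ram_gb=16, storage_gb=512, npu_tops=45),
        PCSpec("Older x86 laptop (example)", ram_gb=32, storage_gb=1024, npu_tops=0),
    )
    for machine in machines:
        label = "qualifies" if meets_copilot_plus(machine) else "does not qualify"
        print(f"{machine.name}: {label}")
```

The second example shows why the NPU is the real blocker: plenty of RAM and storage doesn't help a machine that has no NPU, or one rated below 40 TOPS.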

The new processors and specifications power new Copilot features in Windows, which will be coming in June to Windows 11, not to a new version called Windows 12, as some had speculated. Some of the features run on-device to protect privacy, while others run on Microsoft’s Azure servers just like they did before. Microsoft announced that it would be deploying access to GPT-4o, its partner OpenAI’s latest large language model announced earlier in May, as part of the normal version of Copilot later this year, and it also announced new image generation features in certain apps. The new version of Windows, which includes an x86-to-Arm translator called Prism, has been designed for Arm chips, and Microsoft announced that it has collaborated with leading developers, such as Adobe, to bring Arm versions of popular apps to the new version of Windows. (Where have I heard that before?)

The biggest new software feature exclusive to the Copilot+ PCs is called “Recall.” Here is Tom Warren, reporting for The Verge:

Microsoft’s launching Recall for Copilot Plus PCs, a new Windows 11 tool that keeps track of everything you see and do on your computer and, in return, gives you the ability to search and retrieve anything you’ve done on the device.

The scope of Recall, which Microsoft has internally called AI Explorer, is incredibly vast — it includes logging things you do in apps, tracking communications in live meetings, remembering all websites you’ve visited for research, and more. All you need to do is perform a “Recall” action, which is like an AI-powered search, and it’ll present a snapshot of that period of time that gives you context of the memory…

Microsoft is promising users that the Recall index remains local and private on-device. You can pause, stop, or delete captured content or choose to exclude specific apps or websites. Recall won’t take snapshots of InPrivate web browsing sessions in Microsoft Edge and DRM-protected content, either, says Microsoft, but it doesn’t “perform content moderation” and won’t actively hide sensitive information like passwords and financial account numbers.

What makes Recall special is that it captures screenshots only periodically as work is being done on Windows, and that none of the data it captures is sent back to Microsoft’s servers. (Sending it would be both incredibly invasive and entirely predictable for Microsoft.) Users can go to the Recall section of Windows and simply type a query in natural language to prompt an on-device LLM to search the library of automatically captured screenshots. The LLMs search text, videos, and images using multimodal functionality, and even transcribe spoken language using a new feature called “Live Captions,” also announced Monday.
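The general shape of that capture-then-search loop can be shown with a toy Python sketch. This is not how Microsoft built Recall: the `Snapshot` and `SnapshotIndex` types are hypothetical, the text of each snapshot is assumed to have been extracted already, and plain keyword overlap stands in for the on-device LLM search described above. The only behaviors borrowed from the reporting are that snapshots stay in a local store and that specific apps can be excluded.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    timestamp: int  # seconds since an arbitrary start
    app: str
    text: str       # text assumed to be already extracted from the screenshot


@dataclass
class SnapshotIndex:
    excluded_apps: set[str] = field(default_factory=set)
    snapshots: list[Snapshot] = field(default_factory=list)

    def capture(self, snapshot: Snapshot) -> bool:
        """Store a snapshot unless its app is excluded; report whether it was kept."""
        if snapshot.app in self.excluded_apps:
            return False
        self.snapshots.append(snapshot)
        return True

    def search(self, query: str) -> list[Snapshot]:
        """Return snapshots sharing at least one word with the query, best match first."""
        words = set(query.lower().split())
        scored = []
        for snap in self.snapshots:
            overlap = len(words & set(snap.text.lower().split()))
            if overlap:
                scored.append((overlap, snap))
        # Stable sort: ties keep their capture order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [snap for _, snap in scored]


if __name__ == "__main__":
    index = SnapshotIndex(excluded_apps={"Banking"})
    index.capture(Snapshot(0, "Browser", "recipe for lemon pasta"))
    index.capture(Snapshot(60, "Banking", "account number and balance"))
    index.capture(Snapshot(120, "Mail", "lunch plans on friday"))
    for hit in index.search("lemon pasta recipe"):
        print(hit.timestamp, hit.app, hit.text)
```

A real system would need far more than word matching to answer natural-language questions, which is presumably why Microsoft leans on an LLM and an NPU for the job.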

Recall reminds me of Rewind, the Apple silicon-exclusive Mac app touted last year by a group of Silicon Valley entrepreneurs that continuously records one’s Mac screen to allow an LLM to search everything someone does on it. The app sparked privacy concerns because the processing was done in the cloud, not on-device, whereas Microsoft has repeatedly stated that no screenshots leave the device. I think Recall is neat, but I’m unsure of its practicality.

Live Captions also translates 44 languages into English, whether the content is being played in Windows or picked up by the microphones during a conversation. It works entirely on-device, using the NPU, and it transcribes audio and video content from all apps, not just ones that support it, which means content from every website and program will be able to receive automatic, mostly accurate subtitles. (This is something I hope Apple adds in iOS 18.)

I think Monday’s announcements are extremely intriguing, especially regarding the bombastic claims by Microsoft as to the new AI PCs’ battery life and performance, and I’m sure reviewers will thoroughly benchmark the new machines when they arrive in June. And the new Copilot features — while I’m still not a fan of the dedicated Copilot Key — also seem interesting, especially “Recall.” I can’t wait to see what people use it for.

Scarlett Johansson: OpenAI Hired a Soundalike Without My Permission

Jacob Kastrenakes, reporting for The Verge:

Scarlett Johansson says that OpenAI asked her to be the voice behind ChatGPT — but that when she declined, the company went ahead and created a voice that sounded just like her. In a statement shared to NPR, Johansson says that she has now been “forced to hire legal counsel” and has sent two letters to OpenAI inquiring how the soundalike ChatGPT voice, known as Sky, was made.

“Last September, I received an offer from Sam Altman, who wanted to hire me to voice the current ChatGPT 4.0 system,” Johansson writes. She says that Altman contacted her agent as recently as two days before the company first demoed the ChatGPT voice asking for her to reconsider.

Altman has made it clear that he admires Johansson’s work. He’s said that Her, which features Johansson as an AI voice assistant, is his favorite film; after the ChatGPT event last week, he posted the word “her,” seemingly in reference to the voice demo the company presented, which featured an assistant that sounded just like Johansson.

OpenAI said this morning that it was pulling the voice of Sky in order to address questions around “how we chose the voices in ChatGPT.” The Verge has reached out to OpenAI for comment.

Johansson says she was “shocked, angered and in disbelief” over how “eerily similar” the voice of Sky sounded to herself. OpenAI said the voice comes from an actor who they hired who is speaking in their normal speaking voice. The company declined to share the actor’s name, citing privacy concerns.

You can read Johansson’s letter here, and I encourage you to do so. Here is the story from her side:

  1. OpenAI asks Johansson to be the voice for ChatGPT. Johansson refuses, citing personal reasons.
  2. OpenAI goes out and hires another voice actor who sounds like her in September of last year. The company launches the voice later in the year.
  3. OpenAI launches a new model earlier in May that is more expressive, highlighting the similarities between the voice, “Sky,” and Johansson.

I have absolutely no idea what Altman, OpenAI’s chief executive, was thinking with this atrocious decision. It clearly shows the company’s lack of regard for copyright laws and exemplifies the need for strong protections for actors in the age of artificial intelligence. As if this sleazy maneuver wasn’t enough to keep under wraps, Altman went ahead and posted on the social media website X: “her” after the Monday “Spring Update” keynote, hosted by Mira Murati, the company’s chief technology officer. Did OpenAI seriously think Johansson, one of Hollywood’s most famous actresses, wouldn’t pursue legal action against this?

Altman could’ve claimed plausible deniability because he wasn’t directly involved in the hiring of the new voice actress, but by posting about the movie, in which Johansson stars, he linked himself to the chaos. And posting about the movie makes him look even worse from a moral standpoint; it’s almost like a “just because you didn’t agree doesn’t mean I can’t clone your voice” type of sinister thinking, but maybe that’s just me being cynical. Even if Altman hadn’t posted, I still would’ve believed that he was involved because of his affinity for the film and because the voice sounds so eerily similar to Johansson’s.

Johansson isn’t out to get OpenAI, and I don’t even think she’s very upset, but she does want some transparency as to who it hired for the voice and how they were chosen. (Clearly, because they sound like Johansson, though I find it difficult to believe that OpenAI will willingly admit that.) I wish to know this information too, because in an age where deepfakes are so prevalent, transparency and openness are crucial. OpenAI, as the leader of the AI revolution, needs to take accountability for this and respect copyright laws.

And no, I highly doubt this will alter Apple’s negotiations with OpenAI for iOS 18.

Slack Admits It’s Training LLMs on Private Messages

Will Shanklin, reporting for Engadget:

Slack trains machine-learning models on user messages, files, and other content without explicit permission. The training is opt-out, meaning your private data will be leeched by default. Making matters worse, you’ll have to ask your organization’s Slack admin (human resources, IT, etc.) to email the company to ask it to stop. (You can’t do it yourself.) Welcome to the dark side of the new AI training data gold rush.

Corey Quinn, an executive at DuckBill Group, spotted the policy in a blurb in Slack’s Privacy Principles and posted about it on X (via PCMag). The section reads (emphasis ours), “To develop AI/ML models, our systems analyze Customer Data (e.g. messages, content, and files) submitted to Slack as well as Other Information (including usage information) as defined in our Privacy Policy and in your customer agreement.”

The opt-out process requires you to do all the work to protect your data. According to the privacy notice, “To opt out, please have your Org or Workspace Owners or Primary Owner contact our Customer Experience team at feedback@slack.com with your Workspace/Org URL and the subject line ‘Slack Global model opt-out request.’ We will process your request and respond once the opt out has been completed.”

This is horrifying. I’m usually not one to be all too worried about public writing being used for large language models, but private direct messages and conversations within restricted Slacks ought to be off-limits. Slack is covering up here by distinguishing between its official premium Slack LLMs — which cost money — and workspace-specific search tools, but there is no difference. They’re both artificial intelligence products, and they’re both trained on private, presumably encrypted-at-rest data. It is malpractice for Slack to hide this information in a document written by seasoned legal experts that no normal person will ever read, and the entire company should be ashamed of itself. Salesforce continues to pull nonsense like this on its customers for no reason other than maximum profit making, and it is shameful. If there were a better product than Slack in its market, the Slack division of Salesforce would go bankrupt.

What makes matters worse (yes, even worse than training LLMs on private messages) is that customers have no way of opting out unless they ask their Slack administrator to email the company’s Feedback address requesting to opt out. There are two problems here: individual users can’t opt out of training on their own data, and administrators have to email the company to prevent their employees’ data from being harvested by Salesforce. How is this kind of behavior legal, especially in Europe? Some rather frustrated Slack users are demanding the company make the default behavior to opt into training rather than opt out, but I wouldn’t even go that far. Slack needs to build a toggle switch for every employee or Slack user to turn data sharing off for themselves, and it needs to do it fast. Anything short of that is beyond unacceptable. These are private messages, not public articles or social media posts.

I don’t know how anyone can justify this behavior. It’s sleazy, rude, disrespectful, and probably violating some European privacy regulations. People have been able to trick LLMs into leaking their training data with relative ease and that is not something Salesforce/Slack can mitigate with a couple of lines of code because the flaw is inherent to the design of the models. This bogus statement from Slack’s social media public relations department was written by someone who is absolutely clueless about how these models work and how data can be extracted from them, and that, plainly, is wrong. Private user data should never be used to train any AI model whatsoever, regardless of who can use it or access it. The training, if it happens, should only be constrained to on-device machine learning, like Apple Photos, for example. And moreover, burying the information about data scraping in a few lines in a privacy policy not a single customer will read is irresponsible. Shame on Salesforce, and shame on Slack.

Google Plays Catch-Up to OpenAI at This Year’s I/O

Google threw things at the wall — now, it hopes some will stick

Sundar Pichai, Google’s chief executive, onstage at Google I/O 2024. Image: Google.

At the opening keynote of its I/O developer conference on Tuesday, Google employed a strategy born of sheer desperation: Throw things at the wall and see what sticks. The company, famed for leading the artificial intelligence revolution within Silicon Valley for years, has been overtaken by none other than a scrappy neighbor with some help from Microsoft, one of its most notable archenemies. That neighbor, OpenAI, stunned the world just a day prior on Monday with the announcement of a new omni-modal large language model, GPT-4o, which features a remarkably capable and humanlike text-to-speech apparatus and state-of-the-art visual recognition technology. OpenAI first took the world by storm in November 2022 with the launch of its chatbot, ChatGPT, which instantly became one of the fastest-growing consumer technology products ever. From there, it has only been smooth sailing for the company, and everyone else has been trying to catch up — including Google.

In a hurry, Google quickly went into overdrive, declaring a “code red” and putting all hands on deck after Microsoft announced a new partnership with OpenAI to bring the new generative pre-trained transformer technology to Bing. Last year, Google announced Bard, its AI chatbot meant to rival OpenAI, only for OpenAI’s latest GPT-4 to run laps around it. Bard would consistently flub answers through hallucinations — phenomena where chatbots confidently provide wrong answers unknowingly due to a quirk in their design — fail to provide references, and ignore commands, placing it dead last in the rankings against its rivals. At Google’s I/O conference last year, Google began trying to add the model hurriedly to its existing Google Workspace products, like Google Docs and Gmail, but most users didn’t find it very useful due to its constant mistakes.

Later in the year, Google announced three new models to better compete with OpenAI: Gemini Nano, Gemini Pro, and Gemini Ultra1. The three models — each with varying parameter and context token sizes — were poised to perform different tasks each, but Google quickly touted how Gemini Pro was comparable to GPT-3.5 and Gemini Ultra even beat GPT-4 in some circumstances. It put out a demonstration showcasing the multimodal features of Gemini Ultra, showed off Gemini Pro’s deep interaction with Google products like YouTube and Google Search, and pre-installed the smaller Gemini Nano model onto Pixel phones in the fall to perform quick on-device tasks. And most importantly of all, to change Bard’s brand reputation, Google changed the name of its AI product and chatbot to Gemini. Eventually, it attempted to put Gemini everywhere: in Google Assistant, in Google Search by way of Search Generative Experience, and in its own app and website. It was a fragmented mess — while the models were average at best, there were too many of them in too many places. They cluttered Google’s already complex ecosystem of products.

So, with the stage set, expectations were high for Tuesday’s I/O event, where Google was poised to clean up the clutter and consolidate the AI mess it had so hastily made for itself over the last 16 months. And, in typical Google fashion, the company utterly flopped. Instead, Google leaned into the mess, throwing Gemini into every Google product imaginable. Google Search now has Gemini built in for content summaries, replacing SGE for all U.S. users beginning this fall; Gmail now has Gemini search and summaries to shorten threads, find old emails, and draft responses; Android now has a contextually aware version of Gemini which can be asked questions depending on user selections; and every nook and cranny of Google’s services has been dusted with the illustrious sparkles of AI in some capacity. I tried to make some sense out of the muddied features, and here is what I believe Google’s current master plan is:

  1. Let developers toy with Gemini however they would like, lowering prices for the Gemini application programming interface and making new open-source LLMs to lead the way in the development and production of AI-focused third-party applications.

  2. Bring Gemini to every consumer product for free to increase user engagement and deliver shareholder value to please Wall Street.

  3. Unveil new moonshot projects to excite people and sell them on the prospect of AI.

I came up with this thesis after closely observing Google’s announcements on Tuesday, and I think it makes sense from an organizational, business perspective. In practice, however, it just looks desperate. Tuesday was catch-up day for Google: the company did not announce anything genuinely revolutionary or never seen before but rather focused its efforts on reclaiming its top spot in the AI space. Whether the strategy will yield a positive result is to be determined. In the meantime, though, consumers are left with boring, uninteresting, unexciting events that mainly function as shareholder advertisements instead of places to showcase new technology. Google I/O was such an event, with its thunder stolen by OpenAI’s presentation just the day prior, and that is entirely the fault of Google, not OpenAI. Here are my takeaways from the keynote this year.


Gemini for the Web

Since the advent of ChatGPT, AI chatbots and their makers have been intent on upending the norms of the web. Publishers have reported frustration due to decreased traffic, users are inundated with cheap AI-generated spam whenever they make a Google search, and it is harder than ever to ensure answers’ accuracy. Google, without a doubt, bears some responsibility for this after its beta introduction of SGE last year, which automatically queries the web and then quickly writes a summary pinned to the top of the results page. And even before that, Gemini was engineered to search the web to generate its answers, providing inline citations for users to fact-check its responses.

In practice, though, the citations and links to other websites are minuscule and are rarely clicked because most of the time, they’re simply unneeded. Instead of taking steps to address this information conundrum that has plagued the web for over a year, Google leaned into it at I/O this year — both in Google Search and Gemini, the chatbot.

First, Gemini: Gemini had fallen behind in sheer number of features compared to OpenAI’s GPT-4, so Google announced some remedies to better compete in the saturated chatbot market. The company announced it would build a conversational two-way voice mode into Gemini (both the web version and mobile app), similar to OpenAI’s announcements from Monday, allowing users to speak to the robot directly and receive speedy answers. It said the feature, which will become available later this year, will be conversational, unlike Google Assistant, which currently only speaks aloud answers to user queries without asking follow-up questions.

However, it is unclear how this differs from the Gemini Google Assistant mode available for Pixel users now. Google Assistant on Pixel phones has two modes: the standard Google Assistant mode and Gemini, which uses the chatbot to generate answers. Moreover, there is already feature parity between the Gemini app and Google Assistant on Android, further muddling feature sets between Google’s AI products. This is what I mean by Gemini coming to every nook and cranny of Google’s software. Google needs to clean up this product line.

The new version of Gemini will also allow users to create custom, task-specific mini chatbots called “Gems,” a clever play on “Gemini.” This feature is meant to rival OpenAI’s “GPTs,” customizable GPT-4-powered chatbots that can be individually given instructions to perform a specific task. For example, a GPT can be programmed to search for grammar mistakes whenever a user uploads a file — that way, there is no need to describe what to do with every file that is uploaded on the user’s end as someone would have to do with the normal version of ChatGPT. Gems are a one-to-one knockoff of GPTs — users can make their own Gems and program them to perform specific tasks beforehand. Gems will be able to access the web, potentially becoming useful research tools, and they will also have multimodal functionality for paying Gemini Advanced users, allowing image and video uploads. Google says Gems will be available sometime in the summer for all users in the Gemini app on Android, Google app on iOS, and on the web.

And then, there is Google Search: Since the winter, Google has been slowly rolling out its SGE summaries to all web users on Google. The summaries appear with an “Experimental” badge and big, bold answers, and typically generate a second or two after the search has been made. The company now has fully renamed the experimental feature to “search summaries,” removing the feature from beta testing (it was only available through Google’s “Labs” portal) and vowing to expand it to all U.S. users by the end of the year. The change has the potential to entirely rewrite the internet, killing traffic to publishers that rely on Google Search to survive and sell advertisements on their pages, as well as disincentivizing high-quality handwritten answers on the web. The Gemini-powered search summaries do provide sources, but they are often buried below the summary and seldom clicked on by users, who are commonly content with the short AI-generated blurb.

The summaries are also prone to making mistakes and fabricating information, even though they’re placed front-and-center in the usually reliable Google Search interface. This is extremely dangerous: Google users are accustomed to reliable, correct answers appearing in Google Search and might not be able to distinguish between the new AI-generated summaries and the old content snippets, which remain below the Gemini blurb. No matter how many disclaimers Google adds, I think it is still too early to add this feature to a product used by billions. I am not entirely pessimistic about the concept of AI summaries in search — I actually think this is the best use case for generative artificial intelligence — but in its current state, it is best to leave this as a beta feature for savvy or curious users to enable for themselves. The expansion and improvement of the summaries were a marquee feature of Tuesday’s presentation, taking up a decent chunk of the address, and yet Google made an egregious error in its promotional video for the product, as spotted by Nilay Patel, the editor in chief of The Verge. That says a lot.

Google improved its summaries feature before beginning the mass rollout, though: it touted what it called “multi-step reasoning,” allowing Google Search to essentially function as the Gemini chatbot itself so users can enter multiple questions at once into the search bar. Most Google searches aren’t conversational; most people perform several searches in a row to fully learn something. This practice, as Casey Newton wrote for Platformer, used to be enjoyable. Finding an answer, repeating the search with more information, and clicking another one of the 10 blue links is a ritual practiced by hundreds of millions of people daily, and Google seems intent on destroying it.

Why the company has decided to upend its core search product is obvious: Google Search is bad now. Nowadays, Google recommends AI-generated pages engineered for maximum clicks and advertising revenue rather than useful, human-written sites, leading users to append “Reddit” or “Twitter” to their queries to find real answers written by real people. Google has tacitly shown that it has no interest in fixing the core problem at hand — instead, it is just closing up shop and redirecting users to an inferior product.

Google’s objective at I/O was to circumvent the problem of the internet no longer being helpful by making AI perform searches automatically. Google showcased queries that notably included the word “and” in them. For example: “What is the best Pilates studio in Boston and how long would it take to walk there from Beacon Hill?” Before Tuesday, one would have to split that question into two: “What is the best Pilates studio in Boston?” and “Travel time between the studio and home.” (The latter would probably be a Google Maps search.)

It is a highly specific yet somehow absolutely relevant example of Google throwing in the towel on web search. When Google detects a multi-step query, it does not present 10 blue links that might have the answer to both questions, because that would be all but impossible. (Very few websites would have such specific information.) It instead generates an AI summary of information pulled from all over the web — including from Google Maps — effectively negating the need to do further research. While this might sound positive, it in reality kills the usefulness of the internet by relegating the task of searching for information to a robot.

People will learn less from this technology, they will enjoy using the internet less, and as a result, publishers will be less incentivized to add to the corpus of information Gemini uses to provide answers. The new AI features are good short-term solutions to improve the usefulness of the world’s information superhighway, but they cause a major chicken-and-egg problem that Google has continuously either ignored or chosen to purposefully neglect. This pressing issue does not fit well in the quick pace of a presentation, but it will cause an already noticeable decline in high-quality information on the web. It is a short-term bandage over the wound that is lazy, money-hungry analytics firms — once the bandage withers and expires, the wound will still be there.

That is not to say that Google should not invest in AI at all, because AI pessimism is a conservative, cowardly ideology not rooted in fact. Instead, Google should use AI to remedy the major problem at hand, which it caused itself. AI can be used to find good information, improve recommendation algorithms, and help users find answers to their questions in fewer words. Google is more than capable of taking a thoughtful approach to this glitch in the information ecosystem, and that is apparent because of its latest enhancements to its traditional search product: ask with video and Circle to Search.

Asking questions with video is exactly the type of enhancement AI can bring without uprooting the vast library of information on the web. The new search feature is built into Google Lens but utilizes Google’s multimodal generative AI to analyze video clips recorded through the Google mobile app along with a quick voice prompt. When a recording is initiated, the app asks users to describe a problem, such as why a pictured record player isn’t working. It then uses AI to understand the prompt and video, then generate an answer with sources pulled from the web.

This is more groundbreaking than worrisome because it (a) enables people to learn more than they would otherwise, (b) adds a qualitative improvement to the user experience, and (c) encourages authors to contribute information to be featured as one of the sources for the explanation. It is just enough of a change to the habits of the internet that the result is a net positive. Google is doing more than simply performing Google searches by itself, then paraphrasing the answers: it is understanding a query using a neural network, gathering sources, and then explaining them while also providing credit. In other words, it isn’t a summary; it’s a new, remarkable piece of work.

It is safe to say that for now, I am pessimistic about Google’s rethinking of the web. Google’s chatbots consistently provide incorrect answers to prompts, the summaries’ placement alongside the 10 blue links (which aren’t even 10 blue links anymore) can be confusing to non-savvy users, and the new features feel more like ignorant, soulless bets on an illustrious “new internet” rather than true innovations that will improve people’s lives. That isn’t to say there is no future for generative AI in search; there is, in myriad ways. But the sheer unwillingness on Google’s end to truly embrace generative AI’s quirks is astonishing.


Gemini for Users

Google’s apparent attempt to reinvent the internet does not just stop at the web — it also extends to its personal services, like Google Photos and Gmail. This extension first took place last year at Google I/O, and many of Tuesday’s announcements seemed like déjà vu, but this year the company seemed more intent on utilizing the multimodal capabilities and larger context lengths of its latest LLMs to improve search capabilities and provide better summaries, an advantage it hadn’t developed last May.

First, Google Photos, which the company opened the event with, surprisingly. Google described a limitation of basic optical character recognition-based search: Say someone wanted to find their license plate number in a sea of images of various cars and other vehicles. Previously, they would have to sift through the photos until they found one of their car, but with multimodal AI, Gemini can locate the photos of one’s car automatically, and then display the license plate number in a cropped format. This enhanced, contextual search functions like a chatbot within Google Photos to make searching and categorizing photos easier. The AI, which uses Gemini under the hood, uses data from a user’s photo library, such as facial recognition data and geolocation, to find photos that might fit specific parameters or a theme. (One of the examples shown onstage was a user asking for photos of their daughter growing up.)

In Gmail, Google announced new email summarization features to “catch up” on threads via Gemini-written synopses. Additionally, the search bar in Gmail will allow users to sift through messages from a particular sender to find specified bits of information, such as a date for an event or a deadline for a task, without having to enumerate each email individually. The new features — while not improving the traditional Gmail search experience used to find attachments and sort between categories like the sender and send date — do fill the role of a personal assistant in many ways. And they’re also present in the Gemini chatbot interface, so users can ask Gemini to fetch emails about a given subject in the middle of a pre-existing chat conversation. Google said the new features would roll out to all users beginning Tuesday.

The new additions are reminiscent of Microsoft’s Outlook / Microsoft 365 features first debuted last year, and I surmise that is the point. Google’s flagship Gmail service had next to zero AI features, whereas now it can summarize emails and write drafts for new ones, all inline. However, these new Gemini-powered AI features create an interesting paradox I outlined last year: Users will send emails using AI only for the receiver to summarize them using AI and draft responses synthetically, which the sender will receive and summarize using AI. It is an endless, unnecessary cycle that exists due to the quirks of human communication. I do not think this is the fault of Google — it’s just interesting to see why these tools were developed in the first place and to observe how they might be used in the real world.

My favorite addition, however, is what settles the AI hardware debate that has become a hot topic in recent weeks: Gemini in Circle to Search. Circle to Search, first announced earlier this year, allows users to capture a screenshot of sorts, then circle a subject for Google Lens to analyze. Now, Circle to Search adds the multimodal version of Gemini, Gemini Ultra, as well as Gemini Nano, which runs locally on Pixel phones for smaller, more lightweight queries. This one, simple-on-paper addition to Circle to Search, an already unsophisticated feature, nearly kills both the Rabbit R1 and Humane Ai Pin. With just a simple swipe gesture, any object, physical or virtual, can be analyzed and researched by an intelligent, capable LLM. It’s novel, inventive, and eliminates the often substantial barrier between trying to understand something in the spur of the moment and accessing information. It makes the process of searching simple, which is exactly Google’s mission statement.

Circle to Search does not summarize the web in the way other Gemini features do because it is mostly powered by a lightweight model with a smaller context window that runs on-device. Instead, it falls back to the web in most instances, but what it does do is perform the task of writing the Google search. Instead of having to enter into Google a query like “orange box with AI designed by Teenage Engineering,” a simple screenshot can automatically write that search and present links to the Rabbit R1. It is a perfect, elegant, amazing implementation of AI now supercharged by an LLM. Google says this type of searching is context-aware, which is a crucial tenet of useful information gathering because information is of no use if it is not contextual. On Google, that awareness must be manually entered or inferred, but with Circle to Search, the system knows precisely what is happening on a user’s screen.

This might sound like the standard Google Lens, but it is much more advanced than that. It can summarize text, explain a topic, or use existing user data, such as calendar events or notes, to personalize its responses. And because it has the advantage of context awareness, it can be more personal, succinct, and knowledgeable — exactly what the AI devices from Rabbit and Humane lack. Circle to Search with Gemini is built into the most important technological device, and it is exactly the best use for AI. Yes, it might reduce the number of Google searches typed in, upsetting publishers, but it makes using computers more intuitive and personal. Google should run with Circle to Search — it is a winner.

Circle to Search is also powered by a new LLM Google announced during its presentation2, called LearnLM, designed for educational settings and based on Gemini. LearnLM was demonstrated with a Circle to Search query where some algebra homework was presented; the chatbot was able to explain the answer thoroughly, using the correct typography and notation, too. Presenters also described the LLM as available on Google Classroom, Google’s learning management software, and YouTube, to explain “educational videos.” The YouTube chatbot interface, which was first beta tested amongst select YouTube Premium subscribers last year, will be available more broadly and will enable users to ask questions about certain videos and find comments more easily. It is unclear what the difference is between LearnLM and Gemini exactly, but I assume LearnLM has a smaller, more specific set of training data to address hallucinations.

Here are some miscellaneous additions also announced Tuesday:

  • NotebookLM, Google’s LLM-powered research tool that users can upload custom training data to, now uses Gemini to provide responses. The tool is mainly used to study for tests or better understand notes; it was first released to the general public last year. The most noteworthy addition, however, was the new conversation mode, which simulates two virtual characters having a faux conversation about a topic using the user-provided training data. Users can then interject with a question of their own by clicking a button, which pauses the “conversation” — when a question is asked, the computer-generated voices answer it within the context of the training data.

  • On-device AI, powered by Gemini Nano, will now alert users when a phone call might be a scam. This feature will, without a doubt, be helpful for seniors and the less technically inclined. Gemini will listen to calls, even ones that aren’t automatically flagged as spam, and show an alert if it detects that a call might be malicious.

Google, for years, has excelled at making the smartest smartphones, and this year is no exception. While the company’s web AI features have left me frustrated and skeptical, the user-end features are much more Google-like, adding delight and usefulness while also putting to rest AI grifts with no value. Many of these features might be Android-exclusive, but that makes me even more excited for the Worldwide Developers Conference when Apple is rumored to announce similar enhancements and additions to iOS. The on-device AI feature announcements at Google I/O this year were the only times I felt somewhat excited about what Google had to announce Tuesday, though it might have also helped that those features were revealed toward the beginning of the keynote.


Gemini for Investors

Project Astra is Google’s name for Silicon Valley’s next AI grift. By itself, the technology is quite impressive in the same way that Monday’s OpenAI event was: a presenter showcased how Project Astra could, in real-time, identify objects it looked at via a smartphone camera, then answer questions about them. It was able to read text from a whiteboard, identify Schrödinger’s cat, and name a place just from looking outside a window. It’s a real-time, multimodal AI apparatus, just like OpenAI’s, but there is only one problem: we don’t know if it will ever exist.

Google has a history of announcing products to do nothing more than hike its stock price, like Google Duplex, a voice-to-text AI model that was poised to be able to make calls to secure reservations or perform other mundane tasks with a simple text prompt. Project Astra feels exactly like one of those products because of how vague the demonstration was: The company did not provide a release date, more details on what it may be able to do, or even what LLMs it might be powered by. (It doesn’t even have a proper name.) All the audience received on a sunny spring morning in Mountain View, California, was a video of a smartphone, and later some smart glasses, identifying physical objects while answering questions in an eccentric voice.

The world had already received that video just a day prior, except that time, it received a release date too. And that is a perfect place to circle back to the original point I made at the very beginning of this article: OpenAI stole Google’s thunder, ate its lunch, took its money, and got all the fame. That was not OpenAI’s fault — it was Google’s fault for failing to predict the artificial intelligence revolution. For being so disorganized and unmotivated, for having such an incompetent leader, for being unfocused, and for not realizing the potential of its own employees. Google failed, and now the company is in overdrive mode, throwing everything at the wall and seeing what sticks. Tuesday’s event was the final show — it’s summit or bust.


More than to please users, Tuesday’s Google I/O served the purpose of pleasing investors. It was painfully evident in every scene how uninspired and apathetic the presenters were. None of them had any ambition or excitement to present their work — they were just there because they had to. And they were right: Google had to be there on Tuesday, lest its tenure as the leader of AI come to an end. I’d argue that has already happened — Microsoft and OpenAI have already won, and the only way for Google to make a comeback is by fixing itself first. Put on your oxygen mask before helping others; address your pitfalls before running the marathon.

Google desperately needs a new chief executive, new leadership, and some new life. Mountain View is aimless, and for now, hopeless. The mud is not sticking, Google.


  1. Gemini Nano, Gemini Pro, and Gemini Ultra are Google’s last-generation models. Gemini 1.5 Pro is the latest, and performs equally to Gemini Ultra, though without multimodal capability. Google also announced Gemini Flash on Tuesday, which is smaller than Gemini Nano. It is unclear if Gemini Flash is built on the 1.5 architecture or the 1.0 one. ↩︎

  2. Here is a handy list of Google’s current LLMs. ↩︎

OpenAI Launches ChatGPT 4o, New Voice Mode, and Mac App

OpenAI on Monday announced a slew of new additions to ChatGPT, its artificial intelligence chatbot, in a “Spring Update” event streamed in front of a live audience of employees in its San Francisco office. Mira Murati, the company’s chief technology officer, led the announcements alongside some engineers who worked on their development while Sam Altman, OpenAI’s chief executive, live-posted from the audience on the social media website X. I highly recommend watching the entire presentation, as it is truly one of the most mind-blowing demonstrations one will ever see. It is just 26 minutes long and is available for free on OpenAI’s website. But here is the rundown of the main announcements:

  1. A new large language model, called GPT-4o, with “O” standing for “omni.” It is significantly speedier at producing responses than GPT-4 while being as intelligent as the older version of the generative pre-trained transformer.
  2. A new, improved voice mode that integrates a live camera so ChatGPT can see and speak concurrently. Users can interrupt the robot while it speaks, and the model acts more expressively, tuning its responses to the user’s emotions.
  3. A native ChatGPT application for macOS with which users can ask the chatbot questions with a keyboard shortcut, share their screen for questions, and ask ChatGPT about clipboard contents.

Again, the video presentation is compulsory viewing, even for the less technically inclined. No written summary will be able to describe the emotional rush felt while watching a robot act like a human being. The most compelling portion of the demonstration was when the two engineers spoke to the chatbot on an iPhone, through the app, and watched it rattle off eloquent, human-like responses to questions asked naturally. It really is something to behold.

However, something stuck out to me throughout the banter between the humans and the chatbot: the expressiveness. Virtual assistants, no matter how good their text-to-speech capabilities may be, still speak like inanimate non-player characters, in a way. Their responses are tailored specifically to questions posed by the users, but they still sound pre-written and artificial due to the way they speak. Humans use filler words, like “um,” “uh,” and “like” frequently; they take long pauses to finish thoughts before speaking them aloud; and they read and speak expressively, with each word sounding different each time. Emphasis might be placed on different parts of the word, it might be said at different speeds — the point is, humans do not speak perfectly. They speak like humans.

The new voice mode version of ChatGPT, ChatGPT 4o, speaks just like a real person would. It laughs, it takes pauses, it places emphasis on different parts of words and sentences, and it speaks loosely. It acts more like a compassionate friend than a professional assistant: it does not aim to be formal in any way, but it also tries to maintain some degree of clarity. It won’t meander like a person may, but it does sound like it may meander. For example, when the chatbot viewed a piece of paper with the words “I ♥ ChatGPT,” it responded oddly carefully: “Oh, stop it, you’re making me blush!” Aside from the fact that robots cannot blush, the way it said “oh” and the pause that came after it had the same expression and emotion that it would carry if a human had said it. The chatbot sounded surprised, befuddled, and flustered, even though it had prepared that response right after solving what was essentially just a tough algebra problem.

Other instances, however, seemed pretty awkward: ChatGPT seemed very talkative in the demonstration, such as when the presenters made mistakes or asked the robot to wait a second. Instead of simply replying “Sure” or just firing back with an “mhmm” as a person would, it gave an annoyingly verbose answer: “Sure, I’d love to see it whenever you’re ready!” No person would speak like that unless they were trying to be extra flattering or appear overly attentive. It could be that ChatGPT’s makers programmed the robot to perform this way for the presentation just so that the audience could hear more of the Scarlett Johansson-esque voice straight from the movie “Her,” but the constant talkativeness broke the immersion and made me want to frankly tell it to quiet down a bit.

The robot also seemed oddly witty, as if it carried some sass in its responses. It wasn’t rude, of course, but it sounded like a very confident salesperson when it should’ve been more subdued. It liked to use words like “Whoops!” and added some small humor to its responses, which are, again, signs of wordiness. I assume the reason for this is to make the robot sound more humanlike because awkward silences are unpleasant and may lead users to think ChatGPT is processing information or not ready to receive a request. In fact, while in voice mode, it’s always processing information and ready to receive requests. It can be interrupted with no qualms, it can be asked different questions, and it can wait on standby for more information. Because GPT-4o is so quick at generating responses, there is virtually no latency between questions, which is jarring to adjust to but also mimics personal interactions.

Because ChatGPT has no facial expressions, it has to rely on sometimes annoying audio cues to keep the conversation flowing. That doesn’t mean ChatGPT can’t sense users’ emotions or feelings, though — the “O” in GPT-4o enables it to understand tacit intricacies in speech. It can also use the camera to detect facial expressions, but the more interesting use was what it could do with its virtual apparatus. Not only can users speak to ChatGPT while it is looking at something by way of its “omni-modal” capabilities, but users can share their computer screens and make selections on the fly to receive guidance from ChatGPT as if it were a friend looking over their shoulder. An intriguing demonstration was when the robot was able to guide a user through solving a math equation, identifying mistakes as they were made on the paper without any additional input. That was seriously impressive. Another example was with writing code: ChatGPT could look at some code in a document and describe what it did, then make modifications to it.

ChatGPT 4o’s underlying technology is still OpenAI’s flagship GPT-4 LLM, which is still available for paying customers — though I wouldn’t know why one would use it as it’s worse and has lower usage limits. But the new LLM is now trained on audio and visual data in addition to text. Previously, as Murati described during the event, ChatGPT would have to perform a dance of transcribing speech, describing images, processing the information like a normal LLM text query, and then finally running the answer through a text-to-speech model. GPT-4o performs all of those steps inherently as part of its processing pipeline. It natively supports multimodal input and processes it naturally without performing any modifications. It knows what objects are in real life, it knows how people speak, and it knows how to speak like them. It is truly advanced technology, and I can’t wait to use it when it launches “in the coming weeks.”

While the concept of a truly humanlike chatbot is still unsettling to me, I feel like we’ll all become accustomed to assistants such as the one OpenAI announced on Monday. And I also believe they’ll be more intertwined with our daily lives due to their deep integration with our current technology like iPhones and Macs, unlike AI-focused devices (grifts) like the ones from Humane and Rabbit. (The new Mac app is awesome.) It’s an exciting, amazing time for technology.

Good Riddance to that ‘Crush!’ Ad

Tim Nudd, reporting for Ad Age:

Apple apologized Thursday for a new iPad Pro commercial that was met with fierce criticism from creatives for depicting an array of creative tools and objects—from a piano, to a camera, to cans of paint—being destroyed by an industrial crusher.

The tech giant no longer plans to run the commercial on TV…

But many viewers had a more chilling interpretation, seeing the spot as a grim representation of technology crushing the history of human creativity—something the creative industry is already existentially worried about with the rise of AI.

In an exclusive statement obtained by Ad Age, Apple apologized for the “Crush” spot and said it didn’t mean to cause offense among its creative audience.

“Creativity is in our DNA at Apple, and it’s incredibly important to us to design products that empower creatives all over the world,” said Tor Myhren, the company’s VP of marketing communications. “Our goal is to always celebrate the myriad of ways users express themselves and bring their ideas to life through iPad. We missed the mark with this video, and we’re sorry.”

The spot rolled out on Apple’s YouTube and CEO Tim Cook’s X account on Tuesday, but had not received any paid media. Plans for a TV run have now been scrapped.

This is the video in question. Two things:

  1. This is the first time in recent memory that I have seen Apple pull an advertisement. The backlash was fierce this time around, with many feeling frustrated and upset at the (terrible) visual of these beautiful instruments and pieces of technology being crushed by what looked like a hydraulic press. I understand what Apple was aiming for here (that the new iPad Pro is powerful enough to replace all of these tools while being remarkably thin), and in a way, the imagery fits the theme. But in practice, looking at the commercial is just sad. I understand why so many professionals, who are also the target market for the advertisement, were disturbed by this video, and I think Apple made the right decision here. I appreciate how the company has handled this situation; it takes courage to remove the main commercial for a star product just two days after it was announced.

  2. When I first viewed the advertisement during Apple’s Tuesday event, I wasn’t very perturbed by it, but that was mostly because I wasn’t paying much attention. But after Cook posted the video on the social media website X, I watched it again after reading some posts from filmmakers and other creators about how it made them feel, and I was suddenly uneasy. This commercial comes at a time when much of the creative industry is alarmed by the advent of generative artificial intelligence. For their precious tools, like guitars, pianos, and paints, to be destroyed and replaced by a slender tablet marketed as an “AI-focused” device is cruel. I think Apple could’ve instead offered a brighter picture of how the new iPad Pro could be used, featuring creators in their working spaces using the iPad to enhance their workflows. Nobody is seriously going to throw out their drum kit to replace it with the AI-powered drummer in the new version of Logic Pro announced Tuesday, so why advertise the device like that?

Apple, in the words of Myhren, the company spokesperson, truly did “miss the mark.” It’s unusual coming from Cupertino, which typically makes the very best, most awe-inspiring advertisements. For example, I thought the digital campaign that followed the event comparing the new iPad Pro to a teal iPod nano was great. It is peak Apple, just as Steve Jobs would’ve intended. I know Apple values and loves physical, antique objects, even if they’re from another era; just look at how much the company celebrates its history in so many of its advertisements. I don’t know why the team tasked with producing this commercial chose to portray the new iPad Pro this way in a stunning deviation from decorum.

Thoughts on Apple’s ‘Let Loose’ Event

The thinnest, most powerful iPads take center stage

An artistic graphic made by Apple of a bunch of hand-drawn Apple logos, used as promotional material for the “Let loose” event. Let loose. Image: Apple.

Apple on Tuesday announced updates to its iPad lineup, including refreshed iPad Air and iPad Pro models, adding a new, larger size to the iPads Air and new screen technology and processors to both new iPads Pro. The company also announced new accessories, such as a new Apple Pencil Pro and Magic Keyboard for the iPads, as well as software updates to its Pro apps on iPadOS. The new announcements come at a time when Apple’s iPad lineup has remained stagnant: the company had not announced new tablets since October 2022, when the iPad Pro was last updated with the M2 chip. On Tuesday, Apple gave the iPad Air the M2, an upgrade from the M1 it received when it was last updated in 2022, and the iPad Pro the M4, a new processor with more cores, a custom Display Engine, and an enhanced Neural Engine for artificial intelligence tasks.

Most iPad announcements as of late aren’t particularly groundbreaking. More often than not, iPad refreshes feature marginal improvements to battery life and processors, and Apple usually resorts to rehashing old iPadOS feature announcements during its keynotes to fill the time. Tuesday’s event, however, was a notable exception: Apple packed the 38-minute virtual address chock-full of feature enhancements to the high-end iPads, with Tim Cook, the company’s chief executive, calling Tuesday “the biggest day for iPad since its introduction” at the very beginning of the event. I tend to agree with that statement: The iPad Pro, for the first time ever, debuted with a new M-series Apple silicon processor before the Mac itself; it now features a “tandem” organic-LED display with two panels squished together to appear brighter; and it’s now thinner and lighter than ever before. These are not minor changes.

But, as I’ve said many times before, I think the biggest limitation to the iPad’s success is not the lack of killer hardware, but the lack of professional software that allows people to create and invent with the iPad. While Apple’s “magical sheet of glass” is now “impossibly thin” and more powerful than Cupertino’s lowest-end $1,600 MacBook Pro announced just last October, its software, iPadOS, continues to be worthless for anything more than basic computing tasks, like checking email or browsing the web. And while the new accessories, like the new Magic Keyboard made out of aluminum featuring a function row, are more professional and sturdy, they still don’t do anything to make the device more capable for professional users. Add to that the $200 price increase — the base-model 11-inch iPad Pro now starts at $1,000, while the larger 13-inch model starts at $1,300 — and the new high-end iPads feel disappointing. I don’t think the new iPads Pro are bad — they’re hardly so — or even a bad value, knowing how magical the iPad feels, but I wish they did more software-wise.

Here are my takeaways from Tuesday’s “Let loose” Apple event.


iPads Air

The easiest-to-cover announcement was the new iPads Air, plural. Before Tuesday, the iPad Air, Apple’s mid-range tablet, only came in one size: 10.9 inches. Now, the device comes in two sizes: the same smaller version, now marketed as 11 inches, and a new 13-inch form factor. Aside from the size, the two models are identical in their specifications. Both models feature M2 chips, their cameras have been relocated to the horizontal edge to make framing easier due to how most users hold iPads, and new storage options have been added, now up to 1 terabyte. The smaller model’s price also remains the same, starting at $600, and the 13-inch version starts at $800. Starting storage has also been increased to 128 gigabytes, and there is now a 512-GB variant.

The new iPads Air, otherwise, are identical to the last-generation model, with the same camera and screen resolutions and mostly identical accessories support. The first-generation Magic Keyboard from 2020 remains compatible, but the second-generation Apple Pencil from 2018 that worked with the previous model is not. (More on this later.) They both come in four colors — Space Gray, Blue, Purple, and Starlight — and ship May 15, with pre-orders open on Tuesday.

I am perplexed by the iPads Air, particularly the smaller version, which is often more expensive than a refurbished last-generation iPad Pro of the same size. Choosing to buy the latter would be more cost-effective, and the iPads Pro also have Face ID and a 120-hertz ProMotion display. Add to that the better camera system and identical processor, and I truly don’t see a reason to purchase a new (smaller) iPad Air. The larger model is a bit of a different case, since buying a larger refurbished iPad Pro would presumably be more expensive, so I can understand if buyers might want to buy the newer 13-inch iPad Air for its larger screen, but the low-end model continues to be a fantastically bad value.


The M4

Rather than use October’s M3 processor in the new iPads Pro, Apple revealed a new system-on-a-chip to power the new high-end tablet: the M4. Exactly as predicted by Mark Gurman, a reporter at Bloomberg with an astonishing track record for Apple leaks¹, the new M4 is built on Taiwan Semiconductor Manufacturing Company’s enhanced second-generation 3-nanometer fabrication process called N3E. The new process will presumably provide efficiency and speed enhancements, but I think they will be negligible due to iPadOS’ limited feature set and software bottlenecks. The processor, by default, is binned² to a nine-core central processing unit (with three performance cores and six efficiency cores) and a 10-core graphics processor, but users who buy the 1- or 2-TB models will receive a non-binned 10-core CPU with four performance cores. The low-end storage tiers also only have 8 GB of memory, whereas the high-end versions have 16 GB, though both versions still have the same memory bandwidth at 120 gigabytes per second.

John Ternus, Apple’s senior vice president of hardware engineering, repeatedly mentioned during the event that the new iPad Pro would not be “possible” without the M4 chip, but I struggle to see how that is true. The new processor has what Apple calls a “Display Engine,” which Apple only made a passing reference to, presumably because it is not very impressive. As far as I know, the M3’s “Display Engine,” so to speak — which is already present in MacBooks Pro with the M3 — powers two external displays, so I’m having a hard time understanding what is so special about the OLED display found in the new iPads that warrants the upgraded, dedicated Display Engine. (It isn’t even listed on Apple’s “tech specs” page for the iPads Pro, for what it’s worth.)

Whatever the Display Engine’s purpose may be, Apple claims the M4 is “1.5 times faster” in CPU performance than the M2, though, once again, I don’t see a reason for the performance improvements because iPadOS is so neutered compared to macOS. I have never had a performance issue with my M2 iPad Pro, and I don’t think I will notice any difference when I use the M4 model. Other than for the cynical reason of trying to shift more iPad sales during Apple’s next fiscal quarter, I don’t see a reason for the M4’s existence at all. I’m unsurprised by its announcement, but also awfully confused. Expect to see this processor in refreshed Mac laptops in the fall, too.


iPads Pro

The star of the show, per se, was the new iPad Pro lineup, both the 11-inch and 13-inch models. (There is no longer a “12.9-inch” model, which I am grateful for.) Both models have been “completely redesigned” and feature new displays, cases, processors, and accessories. The update is the largest since the complete redesign and nixing of the Home Button and Lightning port in 2018, but it isn’t as monumental as that year’s revamp. From afar, the new models look identical to 2022’s versions, aside from the redesigned camera arrangement, which is now color-matched to the device’s aluminum body à la iPhones, whereas it was previously just made out of black glass. The displays are now “tandem OLED” panels, which use a special technology to fuse two OLED panels for maximum brightness and earn the display a new name of “Ultra Retina XDR.” (The iPhone’s non-tandem OLED display is called “Super Retina XDR,” and the previous generation’s 12.9-inch model’s mini-LED display was called the “Liquid Retina XDR” display.) And just like the iPads Air, the iPads Pro’s front-facing camera has been relocated to the horizontal position.

Most impressively of all, Apple managed to thin the iPads down significantly from their previous girth. Apple, in a Jony Ive-like move, called the new 13-inch model the “thinnest device” it has “ever made,” even thinner than the iPod nano, which held the title previously. Ternus, the Apple executive, also assured viewers that the device didn’t compromise on build quality or durability, though I would imagine the new model is easier to bend and break than before. (Tough feat.) I do not understand the obsession over thinness here, but the new model is also lighter than ever before due to the more compact OLED display. The new iPads Pro are so thin that the Apple Pencil hangs off the edge when magnetically attached to the side, which may be inconvenient when the iPad is set on a table; Thunderbolt cables plugged into the iPad also protrude upward from the body, a consequence of the sheer thinness. One thing is for certain, however: The new iPads Pro do look slick, especially in the new Space Black finish.

The thinness is a byproduct — or consequence, rather — of the new beautiful OLED display found on both models, replacing the LED “Liquid Retina” display of the last-generation 11-inch model and mini-LED display of the 12.9-inch version. While the mini-LED display was able to reproduce high-dynamic-range content with high brightness levels down to a specific “zone” of the panel, it also suffered from a phenomenon called “blooming,” where bright objects on a dark background would display a glowing halo just outside of the object. OLED displays feature individually lit pixels, allowing for precise control over the image, alleviating this issue. The panel’s specifications are impressive on their own: 1,000 nits of peak brightness when displaying standard-dynamic-range content, 1,600 nits of peak localized brightness when content is in HDR, a two-million-to-one contrast ratio, and a ProMotion refresh rate from 10 hertz to 120 hertz. The new display, as Apple says, truly is “the most advanced display in a device” of the iPad’s kind. I would argue it’s one of the most advanced displays in a consumer electronics device, period, aside from probably Apple’s own Vision Pro. It truly is a marvel of technological prowess, and Apple should be proud of its work.

Apple allows buyers who purchase a 1- or 2-TB model the option to coat the display in a nano-texture finish for a $100 premium, which will virtually eliminate glare and provide for a smoother writing and drawing experience when using the Apple Pencil. The finish is the same as found on the Pro Display XDR and Studio Display, and while I don’t think it is for me, I appreciate the option. (I do wonder how wiping away fingerprints would work, though, since this is the first time Apple has applied the coating to a touch device.) One quirk of the nano-texture coating, however, is that it cannot cover the Face ID sensors, located at the side of the iPad Pro, so the finish stops at the edge of the screen itself, displaying a glossy bezel around the display. I think it looks strange, but this problem couldn’t possibly be alleviated without redesigning Face ID entirely.

Apple has made some noteworthy omissions to the product, however. Most distinctly of all, it has removed the ultra-wide lens at the back of the iPad, a lens it added in the product’s 2020 refresh. Personally, I have never once touched the ultra-wide camera, and I don’t know of anyone who did, but it might be missed by some. To compensate, Apple has added a new AI-powered shadow remover to the document scanner in iPadOS, powered by the M4’s improved Neural Engine and a new ambient light sensor, which takes a prominent space in the iPad’s new camera arrangement. I’m unsure about how I feel about its physical attractiveness — there are only so many ways to design a camera on a tablet computer before it gets boring — but I think the swap is worth the trade-off. (The ultra-wide camera at the front added in 2021, which powers Center Stage, has not been removed.) The SIM card slot has also been removed from cellular-equipped models, mirroring its omission from 2022’s iPhone 14 Pro, and the 5G millimeter-wave antenna located at the side has also been axed reportedly due to a lack of usage.

The new models have both received price increases of $200, with the 11-inch model starting at $1,000, and the 13-inch at $1,300. I think those prices are fair; I expected increases to be more substantial due to the cost of OLED panels. Base storage amounts have also been bumped correspondingly; the new models begin with 256 GB of storage and are configurable up to 2 TB. They ship May 15, just like the iPads Air, and are available for pre-order beginning Tuesday.

Hardware-wise, the new iPads Pro are truly some of the most impressive pieces of hardware Apple has manufactured yet, and I’m very excited to own one. But I can’t help but ask a simple question about these new products: why? Apple has clearly dedicated immense time, energy, and money to these new iPads, and it’s very apparent from the specifications and advertising. Yet when I unbox my new iPad come next week, I’ll probably use it the same, just as I have always used my iPad. It won’t be any better at computing than my M2 iPad Pro I’ve owned for the last year and a half. The Worldwide Developers Conference in June is where the big-ticket software announcements come, but just as Parker Ortolani, a product manager at Vox Media, said on Threads, we have collectively been waiting for “the next big iPadOS update” since the first iPad Pro was launched in 2015 — before iPadOS even existed. iPadOS is a reskinned version of iOS, and Apple must change that this year at WWDC. Until then, the new iPads, while spectacular from every imaginable hardware angle, lack a purpose.


Apple Pencil Pro and Magic Keyboard

Apple announced updates to its two most popular accessories for the iPad Air and iPad Pro: the Apple Pencil, and the Magic Keyboard. The second-generation Apple Pencil, first announced in 2018, has remained unchanged since its debut and has been compatible with all high-end iPads since 2020, and the Magic Keyboard, first announced in 2020, has also been kept untouched. Both products on Tuesday received major overhauls: Apple debuted the Apple Pencil Pro, a new product with haptic feedback and a touch sensor for more controls, and the new Magic Keyboard, which is now finished in a sturdier aluminum, has a function key row, and features a redesigned hinge. Both products are only compatible with Tuesday’s iPads; likewise, prior versions of the Apple Pencil and Magic Keyboard cannot be used with the new iPads Pro or iPads Air, aside from the USB Type-C Apple Pencil released in October, which remains as a more affordable option.

The Magic Keyboard’s redesign, Apple says, makes it a more versatile option for “professional” work on iPadOS. The keys, using the same scissor-switch mechanism as the previous generation, now have a more tactile feel due to the hefty aluminum build, which also adds rigidity for use on a lap — the lack thereof was a pitfall of the older Magic Keyboard. The trackpad is now larger and features haptic feedback, just like Mac laptops, and the hinge is more pronounced, making an audible click sound when shut. The Magic Keyboard also adds a small function row at the very top of the deck, adding an Escape key for anyone bullish enough to code on an iPad. (This would’ve been a great time to put Terminal on iPadOS.) While the new additions will undoubtedly add weight to the whole package, I think the trade-off is worth it because it makes the iPad feel more like a Mac. The new Magic Keyboard retails for the same price: $300 for the 11-inch version, and $350 for the 13-inch one. It, again, ships May 15, with pre-orders available Tuesday.

The Apple Pencil Pro, while not as visually striking an upgrade as the Magic Keyboard, does build on the foundation of the second-generation Apple Pencil well. That stylus, which Apple still sells for older iPads, features a double-tap gesture, which allows quick switching between drawing tools, such as the pen and eraser. The new stylus builds on the double-tap feature, adding a touch sensor to the bottom portion of the stalk that can be squeezed and tapped for more options. Instead of only double-tapping the pencil, users are now able to squeeze it to display a palette of writing tools, not just the eraser. This integration works in apps that support the new PencilKit features in iPadOS; for those that don’t, the double-tap gesture works just as it did before. To select a tool, users simply tap it on the screen with the pencil, as usual.

The Apple Pencil Pro also supports a feature called “barrel roll,” which allows users to move their fingers in a circle around the pencil to finely control its angle on the virtual page, just like someone would do with a real pencil. And when squeezing, double-tapping, or using the barrel roll gesture, a new Haptic Engine added to the pencil will provide tactile feedback for selections. Apple also added Find My functionality to the pencil, though it is unclear if it included Precision Finding, the feature that utilizes the ultra-wideband chip in recent iPhones to locate items down to the inch. (I don’t think it did since the iPad doesn’t have a U2 chip.)

The Apple Pencil Pro retails for $130 — the same price as the second-generation Apple Pencil — and is available for pre-order starting Tuesday, with orders arriving May 15. The more comedic aspect of this launch, however, is the new Apple Pencil Compare page on Apple’s website, which looks genuinely heinous. Apple now produces and sells four different Apple Pencils, all with separate feature sets and a hodgepodge of compatibility checks. To review:

  • Apple Pencil Pro: The latest version is compatible with the M2 iPads Air and M4 iPads Pro announced Tuesday. It retails for $130.
  • Second-generation Apple Pencil: The older version of the Apple Pencil is compatible with iPads Pro from 2018 and newer and the fourth- and fifth-generation iPads Air from 2020 and 2022. It is not compatible with any of the new iPads announced Tuesday. It also sells for $130.
  • USB-C Apple Pencil: The new USB-C Apple Pencil from October, which does not have double-tap or pressure sensitivity, is compatible with every iPad with a USB-C port, including the latest models. It is available for $70.
  • First-generation Apple Pencil: This pencil is for compatibility with older, legacy iPads, as well as the now-discontinued ninth-generation iPad. It costs $100.

No reasonable person will choose to remember that information, so Apple has assembled an Apple Pencil compatibility page, which is absolutely abhorrent. There is even a Contact Us link on the page for those who need assistance to figure out the chaos. “Who wants a stylus?”


Conclusion

As I have stated many times throughout this article, I think the new hardware announced Tuesday is spectacular. The new iPads Air fit in well with the lineup, the 10th-generation iPad has received a price reduction of $50, replacing the archaic ninth-generation model which had a Home Button and Lightning port, and the new iPads Pro are marvels of engineering. I think all models are well-priced, I like the new design of the Magic Keyboard, and I’m thankful the Apple Pencil has been updated.

But none of the above overshadows how disappointed I am in the iPad’s software, iPadOS. As good as the new hardware may be, I don’t think I will use it any differently than I use my current iPad. That’s a shame: For how much work was put into Tuesday’s announcements, the bespoke software for the iPad should do better. Until then, the iPad will continue to remain a product in Apple’s lineup, nothing more and nothing less.


A correction was made on May 5, 2024, at midnight: An earlier version of this article stated that the new M2 iPad Air supports the second-generation Apple Pencil. That is not true; it only supports the USB-C Apple Pencil and the new Apple Pencil Pro. I regret the error.

A correction was made on May 14, 2024, at 2:11 a.m.: An earlier version of this article stated that the USB-C Apple Pencil was released in March. It was actually released in October of last year. I regret the error.


  1. In Gurman we trust. I’ll never make the mistake of doubting him again. ↩︎

  2. I recommend reading my “Wonderlust” event impressions from September to learn more about processor binning. Skip to the section about the A17 Pro. ↩︎

Semafor Interviews Joe Kahn of The New York Times

Ben Smith for Semafor interviewed Joe Kahn, the executive editor of The New York Times. Here is what Kahn had to say in response to Smith’s question about The Times’ role in saving democracy:

It’s our job to cover the full range of issues that people have. At the moment, democracy is one of them. But it’s not the top one — immigration happens to be the top [of polls], and the economy and inflation is the second. Should we stop covering those things because they’re favorable to Trump and minimize them? I don’t even know how it’s supposed to work in the view of Dan Pfeiffer or the White House. We become an instrument of the Biden campaign? We turn ourselves into Xinhua News Agency or Pravda and put out a stream of stuff that’s very, very favorable to them and only write negative stories about the other side? And that would accomplish — what?

I think The New York Times has completely misunderstood what “independent journalism” is. Kahn and other Times journalists, whose work I read regularly, think that we, those accusing The Times of journalistic malpractice, want them to favor the Biden administration or to be against former President Donald Trump somehow. That couldn’t be further from the truth: It is my firm belief that news shouldn’t be biased toward a political candidate.

News, however, should be biased toward the truth, and The Times warps the truth however it wants to fit the public’s narrative. That’s exactly what Kahn is doing here by using the polls as a determinant for what to cover and how to cover it. I understand the core message: that America’s most respected newspaper should cover America’s problems. But, oftentimes, America’s problems and the way it interprets them are disconnected from reality. It is the job of the country’s newspapers of record to influence public opinion, not report on only what Americans seem to care about.

It’s the job of the news media to report the facts without subjectivity, and Kahn clearly knows this and restates it multiple times throughout the interview. But Kahn also let slip this piece of truth: “I think the general public actually believes that he’s responsible for these wars, which is ridiculous, based on the facts that we’ve reported,” referring to President Biden. If the public, by Kahn’s own admission, is so foolish as to believe Biden started the wars in Europe and the Middle East, why should The Times’ newsroom cover reality through the public’s (incorrect) lens, as Kahn says The Times is doing?

The Times’ job is to cover reality, regardless of whether it favors the incumbent or his predecessor. Currently, it’s not doing that. It’s warping the news to please its audience, which is not news-making. Once again, my request is not for The Times to be a knight defending democracy by praising Biden’s every move. I want it to be objective in its reporting. Currently, it isn’t — and I feel like that is on purpose.

AI at Next Week’s Apple Event?

Apple announced its earnings for the second quarter on Thursday, and Tim Cook, the company’s chief executive, gave an interview to CNBC. CNBC wrote the following:

Cook also said Apple has “big plans to announce” from an “AI point of view” during its iPad event next week as well as at the company’s annual developer conference in June.

I don’t even understand why this was reported on, because artificial intelligence is the new craze both in Silicon Valley and on Wall Street. Of course the chief executive of the world’s second-largest technology company, which reported revenue down 4 percent this quarter, would try to pump his stock price, and of course he would do that by saying there will be an AI-related announcement at next week’s hotly anticipated Apple event. It makes logical sense from a business perspective: If Cook can persuade investors to hold off on dumping Apple stock this week, he can launch new iPads next week, point to the sales numbers, and watch the stock climb again. That is his job.

Later, CNBC retracted its original quote, but gave the full context to Zac Hall, editor at large at 9to5Mac, somehow:

We’re getting into a period of time here where we’re extremely excited like I’m in the edge of my seat literally because next week, we’ve got a product event that we’re excited about. And then just a few weeks thereafter, we’ve got the… Worldwide Developers Conference coming up and we’ve got some big plans to announce in both of these events. From an AI point of view…

Cook is not saying there will be AI-related announcements at these events; he is just saying (a) that there are “big plans” and (b) that there will be announcements some time between now and the end of eternity “from an AI point of view.” Those are two separate statements, and it is foolish to assume otherwise because Cook is well-trained before he sits in front of the media. Apple never reveals what it will announce before an event, even when it would be in the interest of the stock price.

So, that all raises the question: Will there be AI at next week’s event or not? It’s impossible to say conclusively, but I think there will certainly be mentions of AI during the presentation. However, I do not believe Apple will announce AI software of its own just a month before WWDC, where software is usually debuted. I imagine the AI references will be limited to passing mentions of how the new iPads Pro are “great for AI computing” and how you can run AI models with apps on the App Store, just like the “Scary fast” Apple event from October, where the company announced the M3 MacBooks Pro. The mentions will exist to please investors and to hold them off just a bit longer for WWDC, where the big-ticket AI features will be introduced via iOS 18.

Next week’s keynote will not be a preview of AI features, or at least, so I think. Instead, it looks like it’ll serve as a filler event to build anticipation for the true announcements coming in the summer, while also finally refreshing the iPads, which is long overdue. This scenario also takes into account Mark Gurman’s report for Bloomberg on Sunday that said Apple will ship the M4 in the new iPads Pro: M4 or not, this event is slated to be hardware-focused, and I think the only AI references next week will exist to appease Wall Street. My final take: No AI features at next week’s event.

The Rabbit R1 is Just an Android App

Mishaal Rahman, reporting for Android Authority on Tuesday:

If everything an AI gadget like the Rabbit R1 can do can be replicated by an Android app, then why aren’t these companies simply releasing an app instead of hardware that costs hundreds of dollars, requires a separate mobile data plan to be useful, and has terrible battery life? It turns out that’s exactly what Rabbit has done… sort of.

See, it turns out that the Rabbit R1 seems to run Android under the hood and the entire interface users interact with is powered by a single Android app. A tipster shared the Rabbit R1’s launcher APK with us, and with a bit of tinkering, we managed to install it on an Android phone, specifically a Pixel 6a.

Once installed, we were able to set up our Android phone as if it were a Rabbit R1. The volume up key on our phone corresponds to the Rabbit R1’s hardware key, allowing us to proceed through the setup wizard, create a “rabbithole” account, and start talking to the AI assistant. Since the Rabbit R1 has a significantly smaller and lower resolution display than the Pixel 6a, the home screen interface only took up a tiny portion of the phone’s display. Still, we were able to fire off a question to the AI assistant as if we were using actual Rabbit R1 hardware, as you can see in the video embedded below.

The Rabbit R1, just like the Humane Ai Pin, is nothing more than a shiny object designed to attract hungry venture capitalists. The entire device is an Android app, a low-end MediaTek processor, and a ChatGPT voice interface wrapped up in a fancy orange trench coat. In other words, it is nothing more than a grift that retails for $200. I’ve said this time and time again: These artificial intelligence-powered “gadgets” are VC money funnels whose entire job is to turn profits, then disappear six months later when Apple and Google add broader AI functionality to their mobile operating systems. In the bustle of the post-October 2022 AI sphere, Rabbit raised a few million dollars in Los Angeles, cobbled together an Android app with a rabbit animation, bulk bought some cheap off-the-shelf electronics from China, engineered a bright orange case, put the parts together, made its founder dress up like an off-brand Steve Jobs, and poof, orders started flooding in by the thousands. Ridiculous.

The Rabbit R1, in many ways, is more insulting than the Humane Ai Pin, which I’ve already bashed enough. It is significantly more affordable, priced at $200 with no subscription — unlike Humane’s $700, $24-a-month product — but it is quite literally worse than the Ai Pin from Rabbit’s chief rival VC funnel in every metric. The entire device, as Marques Brownlee, a YouTuber better known as MKBHD, demonstrated in his excellent review of the device, is a ChatGPT wrapper with an ultra-low-end camera and a knob — or wheel, rather — used in favor of a touch screen presumably to make it seem less like a smartphone. In practice, it is a bad, low-end smartphone that does one thing — and only one task — extraordinarily poorly, consistently flubbing answers and taking seconds to respond. It is a smartphone that does everything poorly aside from looking great. (Teenage Engineering designed the Rabbit R1; I’ll give the product design props.) I am astonished that we are living in a world where this $200 low-end Android smartphone is receiving so much media attention.

Rahman contacted Jesse Lyu, Rabbit’s chief executive and co-founder, for comment on his article, and Lyu, grifter-in-chief at Rabbit, naturally denied the accusations in the stupidest way possible. I don’t even understand how this made it to publication; it’s genuinely laughable. Lyu’s justification for the device is that Rabbit sends data and queries to servers, presumably its own, for processing. Here is a non-comprehensive list of iOS apps with large language models built in that send data to the web for processing: OpenAI’s ChatGPT, Microsoft Copilot, Anthropic Claude, and Perplexity. That is to say, every single AI processing app made by a large corporation, because it is all but impossible to run LLMs on even the most sophisticated, powerful smartphone processors, let alone a random inexpensive MediaTek chip such as the one found in the R1. The Rabbit R1 is an Android app that exchanges data with the internet over a cellular radio and some network calls. Any 15-year-old could engineer this in two weeks from the comfort of their bedroom.

I aggressively smeared the Humane Ai Pin not because I thought it was a grift, but because I thought it had no reason to exist. I thought and still think that Humane built an attractive piece of hardware and that the company still has conviction in creating a product akin to the smartphone in the hopes of eventually eclipsing it. (I think this entire idea is flawed, and that Humane will eventually go bankrupt, but at least Humane’s founders are set on their ambition.) Rabbit as an entire company, by stark contrast, is built on a throne of lies and scams: It came out of the woodwork randomly during the Consumer Electronics Show in January after raising $10 million the month prior from overzealous VC firms, threw a launch party in New York with influencers and press alike, then shipped an Android app to consumers for $200. It’s a cheap smear of hard-working, dedicated hardware makers; it makes a mockery of true innovators in our very complicated technology climate in 2024. These “smartphone replacement” VC attractions ought to be bankrupt by, if not right after, June.