
For the past year, pressure has been ramping up on the tech industry to do stuff with AI. What “AI” actually means can vary, but it usually refers to a large language model (LLM) chatbot that takes natural language input.
While Apple has tried very hard to remind everyone that all the stuff they do with machine learning and the neural engine counts too, the fact is that Apple is perceived as being behind, because it has no LLM chatbot of its own.
LLMs are artificial, but not really intelligent. They can be quite wrong, or simply malfunction. But they are far better at following conversational threads and understanding context than original-recipe voice assistants like Siri. That was enough to spook Apple’s executives and lead them to begin a crash AI program.
OpenAI, Microsoft, Meta, Google, you name it. It’s a land grab. Everyone is trying to find a way around smartphone platforms, search monopolies, data brokers, ad sales, SEO, publishers, photographers, stock footage… pretty much everything. The urgency, the sheer sweatiness of tech companies to show their AI relevance, is palpable.
Apple doesn’t want anyone to see them sweat, but at WWDC they’re going to have to break out the AI buzzwords and show where they fit into the current zeitgeist. Here’s what Apple can learn from the mistakes other companies are making when it comes to demonstrating AI prowess.
Summaries and slop
Don’t show off summarizing a conversation. I know Mark Gurman suggests this might be a new feature, but every demo of it from other companies has gone over like a lead balloon. Summarization demos say one thing: “How can I more efficiently ignore the nuance and humanity of the people around me?” Also, demos of summaries are just plain boring.
Google I/O featured several instances of summarization that were not useful and borderline disrespectful. There was a theoretical conversation between a power couple and their prospective roofer. Google’s “helpful” summary said a quote was agreed to, but didn’t say what the quote was! The price, which seems like a key element of a quote, didn’t appear until a follow-up question. The summary also omitted all the nuance of the roofer’s interactions with the husband in the scenario. Who would trust that summary?
LLM summaries remove words, collapse context, kill tone, and neuter meaning. Busy technology executives eat these demos up, though!
Don’t demo things that snoop on a user’s calls or their device’s screen. At Google I/O, a demo displayed a fraud warning during a phone call. That means there was an AI model listening to the phone conversation. Even if that’s happening entirely on my device, it’s still unnerving that Google is now listening to the contents of my phone calls. The same goes for Microsoft’s Recall, another on-device feature that watches everything you do, which only sounds fine so long as you forget Microsoft’s lousy track record securing people’s devices.
Under no circumstances should there be a chatbot in a conversation with real people that’s jumping in to offer to help coordinate times or issue reminders. Fortunately, Apple doesn’t ship a workplace chat platform, so we’re unlikely to get Google’s demo of “Chip,” the nosy virtual chat kibitzer shown at Google I/O. But I don’t want that bot in my iMessage threads, either.
No generative slop. Don’t show off AI-written poetry or book reports. If people ask for help writing a cover letter, show them an example of a cover letter. AI should point users to vetted and approved templates. (But there should be an AI story with Xcode at WWDC, or why even have it be about AI? It just needs to be respectful of developers’ needs, and actually useful in helping developers with their jobs.)
I think Apple already learned a valuable lesson about visual metaphor when they smashed instruments of human expression into a thin iPad, but just to reiterate: Don’t do that again.
Speaking of creation: Don’t show off images generated out of nothing but a prompt. Any generative elements should be augmented from source images or video. Show off altering aspect ratio on an image, object and lens flare removal, creating thumbnail images, sharpening, denoising, or focus effects.
Even then, keep it grounded to what a reasonable person would want to do with their photos. The Photos app doesn’t need to become Midjourney, or Stable Diffusion, and it certainly doesn’t need to use any models with opaque, legally questionable sources to augment a photo of you smiling at the beach. It should still be that photo at the end of the day.
As for partner demos, I would recommend against demonstrations from companies that have AI models that allow people to make a logo or icon for their company or product without using an artist. Under no circumstances should Midjourney, Dall-E, or any of the other generators that scraped art and photos off the internet be used as a demo. That sends the wrong message, even if it is absolutely a use case that can be demoed to show how the neural engine makes creating a logo 90% faster than on Intel.
Don’t demo video generators. These mostly scare people and impress weirdos. “Look, her hands are boiling!” They’re basically a substitute for artifacting stock footage, and Apple is not a purveyor of artifacting stock footage.
AI video tools that handle retiming, color grading, detail recovery, and noise reduction are all acceptable, especially if they can lean on Apple’s multifaceted imaging pipeline, or can use Apple’s depth data as part of the dataset in processing the footage.
For example: Apple is interested in customers shooting Spatial Video, but there are technical shortcomings with the different lenses. Show us how data can be transferred from one eye to the other to help reduce artifacts, and increase resolution. Do an easy-to-use version of something akin to Ocula.
It is possible to preserve AI/ML as a tool without having AI/ML take over the output. There should always be a kernel of reality in every demo to ground it. It should apply to real life, and not try to compete in the crowded hallucination market.
Hey, Siri
Now that the lede is good and buried, let’s talk about Siri.
We’d all love a senior Apple exec to get on stage and issue a mea culpa before launching the new version, but it’s probably going to be something more like, “Millions of people use Siri every day, which is why we’re excited to announce Siri is even better than before.”
Unfortunately, Mark Gurman has kind of burst the bubble:
The big missing item here is a chatbot. Apple’s generative AI technology isn’t advanced enough for the company to release its own equivalent of ChatGPT or Gemini. Moreover, some of its top executives are allergic to the idea of Apple going in that direction. Chatbot mishaps have brought controversy to companies like Google, and they could hurt Apple’s reputation.
But the company knows consumers will demand such a feature, and so it’s teaming up with OpenAI to add the startup’s technology to iOS 18, the next version of the iPhone’s software. The companies are preparing a major announcement of their partnership at WWDC, with Sam Altman-led OpenAI now racing to ensure it has the capacity to support the influx of users later this year.
Baffling. I have no idea what that demo will look like, but I hope it isn’t “Showing results from ChatGPT on your iPhone” followed by a big modal window of ChatGPT output.
It is worth noting that not everyone is enamored with ChatGPT, despite the enthusiasm over its features.
Apple certainly won’t be demoing the imposter Scarlett Johansson voice from OpenAI at WWDC like OpenAI did at their spring event. You know, on account of them being sued, and all.
That same OpenAI spring presentation had perhaps one of the best demos of an LLM voice interface I’ve seen: one presenter spoke in English, the other spoke in Italian, and ChatGPT with GPT-4o acted as live translator. That was a great demo, and translation is definitely one of the areas Apple is playing catch-up in already. It’s not rumored to be a feature, but it would be a good demo.
Google demoed integration with Google Workspace (Drive, Sheets, Gmail, Gchat (lol), etc.) and Apple should show that Siri can pull in information and context from Mail, Messages, Calendar, Photos, Reminders, etc. Ideally, it would be great to work with apps beyond that, but it needs to be able to plug into at least that data.
That means there needs to be a privacy interface for which apps Siri can access, especially if Siri is relaying their data to a third party, and a privacy story about how Apple won’t be looking into every app on your device if you don’t want it to.
I fear that Apple simply won’t address anything but ChatGPT basics shoved into Siri windows. Which is possibly worse than continuing to work quietly on whatever the hell it is they’re working on. I’ll still run through some examples I’d love to see:
Show us someone asking a HomePod or Watch to do something, and instead of saying it can’t, it’ll execute it on your iPhone. Tell us the story about how Siri is secure and functional across devices under your Apple ID.
Demo someone telling Siri to play something on TV, then asking their Apple Watch to “pause the TV.” Siri should know that “the TV” is the one I started playing something on (and that my iPhone is nearby, based on Bluetooth), even if there are many TVs attached to my Apple ID.
Put on a little show of someone asking Siri where something is in the interface, or how they can do something. “Hey Siri, where are my saved passwords?” It whisks the person right to the Passwords section of Settings. “Hey Siri, I turned down the brightness all the way but it’s still too bright, what can I do?” and it surfaces the Reduce White Point control. Conversationally, “How can I only turn on Reduce White Point late at night?” and it offers a Shortcut based around the sleep and wake-up times.
Demo someone using new Siri with CarPlay, an essential application of Siri, where someone can conversationally tell Siri to “Play ‘Mona Lisa Overdrive’” and then follow that up with “Play the rest of the album,” and it’ll queue up the tracks that come after instead of doing something completely random like it does now.
Absolutely demo someone pausing music on their Mac and telling their HomePod to “play what I was last listening to,” and having it resume playback on the HomePod exactly as if they had just hit play on their Mac.
Demo Siri being able to understand what’s currently on-screen when asked. “Hey Siri, who is the actor in this video?” Then conversationally follow that up with “What have I seen them in recently?” and have it look through what was recently watched in the TV app and check that against the roles that actor has played. That’s not putting anyone out of a job. (Well, except Casey. Sorry, buddy.)
Above all else, demo to the audience that when Siri doesn’t know what to do, it’ll ask. Show us a graceful failure state that reassures people how Apple can behave responsibly.
Let me illustrate what not to do with a recent interaction I had with Current Siri:
Me: “Play the soundtrack for The Last Starfighter”
Siri: “Here’s The Last Starfighter”
[Opens TV app on iOS and starts playing The Last Starfighter from my video library.]
Me: “Play The Last Starfighter soundtrack.”
Siri: “Here’s Dan + Shay”
[Music app starts playing Dan + Shay “Alone Together”.]
Me: “Play The Last Starfighter Original Motion Picture Soundtrack.”
Siri: “Here’s The Last Starfighter by Craig Safan.”
It seems, however, that nothing is really rumored along these lines. Oh well, guess I’ll listen to some more Dan + Shay!
Ethics? Anyone?
A very troubling aspect of these rumors is Apple partnering with OpenAI. OpenAI didn’t ethically buy rights to use the information that trained its models, just like it didn’t take Scarlett Johansson’s no for an answer. It’s in active lawsuits with various media companies.
Even companies that have struck a deal with OpenAI—like Stack Overflow, and Reddit—are getting bought off after their sites were already being scraped. Users, who generated all the value in the site, can’t even delete their posts in protest.
Is Apple going to endorse OpenAI by giving them a thumbs up and slotting them into their next operating system releases without comment? They absolutely shouldn’t show anyone from OpenAI in their WWDC presentation, especially not Sam Altman.
There’s an easy way to draw a parallel to Google. Companies sue Google all the time over rights, and Apple still includes Google.
Of course, Apple is taking money from Google to be the default search engine on iOS, and then trying to have Safari insert Spotlight suggestions to pretend there’s a privacy angle. That Google deal now means that default searches will go through Google’s AI Overviews. So Apple is already going to endorse Google’s approach to AI too, even if they don’t strike a deal for anything more.
And let’s not forget the ethics of Apple’s climate pledge. There should be a point in the WWDC keynote where Apple communicates how they can harness AI and still stay on target for their climate goals. That probably seems like a small thing, but people are getting pretty hand-wavy about maintaining their commitments while also putting their models to use.
Regardless of what happens, I suspect there will be plenty of disappointment and outrage to go around in the aftermath of WWDC. These are the times we live in. I just hope Apple takes some lessons from that thing with the hydraulic press and the iPad and doesn’t step in it too badly, just to show that they’re keeping up with the AI hype from the bozos of the tech world.
Title: The Dos and Don’ts of AI at WWDC
URL: https://sixcolors.com/post/2024/05/the-dos-and-donts-of-ai-at-wwdc/
Source: Six Colors
Source URL: https://sixcolors.com
Date: May 22, 2024 at 06:00PM