Actually, really liked the Apple Intelligence announcement. It must be a very exciting time at Apple as they layer AI on top of the entire OS. A few of the major themes.
Step 1 Multimodal I/O. Enable text/audio/image/video capability, both read and write. These are the native human APIs, so to speak.
Step 2 Agentic. Allow all parts of the OS and apps to inter-operate via “function calling”; kernel process LLM that can schedule and coordinate work across them given user queries.
Step 3 Frictionless. Fully integrate these features in a highly frictionless, fast, “always on”, and contextual way. No going around copy pasting information, prompt engineering, or etc. Adapt the UI accordingly.
Step 4 Initiative. Don’t perform a task given a prompt, anticipate the prompt, suggest, initiate.
Step 5 Delegation hierarchy. Move as much intelligence as you can on device (Apple Silicon very helpful and well-suited), but allow optional dispatch of work to cloud.
Step 6 Modularity. Allow the OS to access and support an entire and growing ecosystem of LLMs (e.g. ChatGPT announcement).
Step 7 Privacy. <3
We’re quickly heading into a world where you can open up your phone and just say stuff. It talks back and it knows you. And it just works. Super exciting and as a user, quite looking forward to it.
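For anyone wondering what Steps 2 and 5 could look like in practice, here is a minimal sketch, not Apple's actual design. It assumes a coordinator that receives a "function call" (as an LLM might emit one) and refuses cloud-only work unless the user allows it. Every name here (`Tool`, `REGISTRY`, `dispatch`, the two example tools) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Tool:
    """A function an app exposes to the coordinator (hypothetical shape)."""
    name: str
    handler: Callable[[dict], str]
    on_device: bool  # assumption: each tool declares whether it can run locally


def set_timer(args: dict) -> str:
    return f"timer set for {args['minutes']} min"


def summarize_long_doc(args: dict) -> str:
    # Stand-in for work that is too heavy for a local model.
    return f"summary of {args['doc']}"


REGISTRY: Dict[str, Tool] = {
    "set_timer": Tool("set_timer", set_timer, on_device=True),
    "summarize_long_doc": Tool("summarize_long_doc", summarize_long_doc, on_device=False),
}


def dispatch(call: dict, allow_cloud: bool) -> str:
    """Run one function call, preferring on-device tools (Step 5)."""
    tool = REGISTRY.get(call["name"])
    if tool is None:
        return f"unknown tool: {call['name']}"
    if not tool.on_device and not allow_cloud:
        # Cloud dispatch is optional, so decline rather than send data out.
        return f"declined: {tool.name} needs the cloud"
    return tool.handler(call["arguments"])
```

Calling `dispatch({"name": "set_timer", "arguments": {"minutes": 5}}, allow_cloud=False)` runs locally, while the same call for `summarize_long_doc` is declined until `allow_cloud=True`.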
Yikes. Just hit em with the ol’ “<3” for privacy. Does not inspire confidence.
How so? Many people want to use AI in privacy, but it’s too hard for most people to set it up for themselves currently.
Having AI at the OS level, so you can use it in almost any app, with processing guaranteed to happen privately on device, will be very useful if done right.
Yeah just like Microsoft Recall right? An AI that has access to every single thing you do (and would also be recording, otherwise how does it know “you”) can never be private by design. Its literal design is to know everything about you, your actions, and your habits. I wouldn’t trust anyone to be able to create an actually secure piece of software that does the above. It will always be able to be stolen/sold/abused.
macOS and Windows could already be doing this today behind your back regardless of any new AI technology. Don’t use an OS you don’t trust.
I don’t use either of those, thankfully :)
That’s fair, but you are misunderstanding the technology if you’re bashing Apple’s AI for making macOS less secure. Most likely it will be just as secure as, for example, their password functionality, although we don’t have details yet. You either trust the OS or you don’t.
Microsoft Recall was designed so badly, there’s no hope for it.
“you can use it in almost any app”
“if done right”
How are you going to be able to use it in “almost any app” in a way that is secure? How are you going to design it so that the apps don’t abuse the AI to get more information on the user out of it than intended? Seems pretty damn inherently insecure to me.
That’s why it’s on the OS level. For example, for text, it seems to work in any text app that uses the standard text input API, which Apple controls.
The user activates the “AI overlay” in the OS, not in the app. The OS reads the selected text from the app and sends text suggestions back.
The app is (possibly) unaware that AI has been used or activated, and has not received any user information.
Of course, if you don’t trust the OS, don’t use this. And I’m 100% speculating here based on what we saw for the macOS demo.
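To make that flow concrete, here is a rough sketch of the idea, again speculative and not Apple's API. It assumes the OS owns the text control and the overlay, so the model and any personal context live on the OS side and the app only ever sees the resulting text. `TextField`, `OSOverlay`, and `fake_on_device_model` are made-up names, and the "model" is a trivial placeholder.

```python
class TextField:
    """Stand-in for a standard text input control provided by the OS."""

    def __init__(self, text: str):
        self.text = text
        self.selection = (0, len(text))  # (start, end) of the selected range

    def selected_text(self) -> str:
        start, end = self.selection
        return self.text[start:end]

    def replace_selection(self, new_text: str) -> None:
        start, end = self.selection
        self.text = self.text[:start] + new_text + self.text[end:]
        self.selection = (start, start + len(new_text))


def fake_on_device_model(text: str) -> str:
    # Placeholder for a real model: collapse whitespace, capitalise first letter.
    cleaned = " ".join(text.split())
    return cleaned[:1].upper() + cleaned[1:]


class OSOverlay:
    """Lives in the OS, not in the app (assumption based on the demo)."""

    def __init__(self, model, user_context: dict):
        self._model = model
        self._user_context = user_context  # never handed to the app or the field

    def rewrite_selection(self, field: TextField) -> str:
        suggestion = self._model(field.selected_text())
        # The app only observes an ordinary text edit on its control.
        field.replace_selection(suggestion)
        return suggestion
```

The app's view is limited to `field.text` changing, which looks the same as the user typing. Whether a real implementation leaks anything more than that is exactly the open question.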
“and it just works”
Has he even used an LLM before?
He sort of invented it, so you have to think he’s commenting on the concept here, not the implementation.
I have tried a lot of medium and small models, and there is just no good replacement for the larger ones for natural text output. And they won’t run on device.
Still, fine-tuning smaller models can do wonders, so my guess would be that Apple Intelligence is really 20+ small, fine-tuned models that kick in based on which action you take.
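A toy version of that guess, purely to illustrate the routing idea: the action the user picks selects a small specialised model, with a general model as the fallback. The action names and the functions standing in for fine-tuned models are all invented here, and nothing about it reflects how Apple actually structures things.

```python
from typing import Callable, Dict


def summarize_model(text: str) -> str:
    # Stand-in for a small model fine-tuned on summarisation: keep first sentence.
    return text.split(".")[0].strip() + "."


def proofread_model(text: str) -> str:
    # Stand-in for a small model fine-tuned on proofreading: fix spacing only.
    return " ".join(text.split())


def general_model(text: str) -> str:
    # Fallback when no specialised model matches; returns the text unchanged.
    return text


# Hypothetical mapping from user action to a specialised model.
SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "summarize": summarize_model,
    "proofread": proofread_model,
}


def route(action: str, text: str) -> str:
    """Pick the model for this action, falling back to the general one."""
    model = SPECIALISTS.get(action, general_model)
    return model(text)
```

For example, `route("proofread", "a  b")` goes to the proofreading stand-in, while an unlisted action such as `"translate"` falls through to `general_model`.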
An LLM has no comprehension of what it says. It’s just a puppy that is really good at performing for treats. This will always yield nonsense a meaningful proportion of the time.
I don’t care how statistically good your model can be under certain constraints and inputs. At the end of the day, all you’ve done is classically condition your computer.
It goes a tad bit beyond classical conditioning… LLMs provide a much better semantic experience than any previous technology, and are great for relating input to meaningful content. Think of it as an improved search engine that gives you more relevant info, actions, tool suggestions, etc. based on where and how you are using it.
Here’s a great article that gives some insight into the knowledge features embedded into a larger model: https://transformer-circuits.pub/2024/scaling-monosemanticity/