Apple built custom servers and OS for its AI cloud

Apple revealed at its Worldwide Developer Conference (WWDC) on Monday that it has created its own datacenter stack – servers built on its in-house silicon and running its own operating system.

Cupertino hasn’t actually announced the servers or OS (and never addressed rumors of its plan to make datacenter-grade processors). Instead, references to the chips and OS can be found scattered across the blizzard of announcements about AI features and product updates.

Those AI features rely on what Apple’s called “Private Cloud Compute” – an off-device environment where the iGiant runs “larger, server-based models” that do AI better than the models Cupertino loads onto its iThings.
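To make that division of labour concrete, here is a purely hypothetical Swift sketch of the routing idea Apple describes – keep a request on the device when the local model can cope, and hand it to the larger server-side models only when it can't. None of these types, names, or thresholds are Apple APIs; they are placeholders for illustration.

```swift
// Hypothetical sketch only – these types and the token threshold are
// illustrative placeholders, not Apple's Private Cloud Compute API.
enum ModelTier {
    case onDevice            // smaller model shipped with the iThing
    case privateCloudCompute // larger server-based models in Apple's cloud
}

struct InferenceRequest {
    let prompt: String
    let estimatedContextTokens: Int
}

/// Pick where to run a request: keep it local when the on-device model can
/// handle it, otherwise route it to the Private Cloud Compute cluster.
func selectTier(for request: InferenceRequest,
                onDeviceTokenLimit: Int = 4_096) -> ModelTier {
    request.estimatedContextTokens <= onDeviceTokenLimit
        ? .onDevice
        : .privateCloudCompute
}
```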

Apple describes the devices in Private Cloud Compute as “custom-built server hardware that brings the power and security of Apple silicon to the datacenter.” Cupertino also uses the term “compute node,” but it’s unclear if that’s a synonym for “server.” Apple has further confused matters by discussing a “Private Cloud Compute cluster” as being pressed into service when iThing users turn to the cloud for AI resources unavailable on their devices.

Whatever the correct term for the machines, and their configuration, Apple says they use “the same hardware security technologies used in iPhone, including the Secure Enclave and Secure Boot.”
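For a sense of what the Secure Enclave does today, here is a minimal sketch of how developers already use it on iPhones and Macs through Apple's CryptoKit framework – key material is generated inside the enclave and application code only ever holds an opaque handle to it. This illustrates the existing device-side API, not anything Apple has disclosed about its server firmware.

```swift
import CryptoKit
import Foundation

// Minimal sketch using Apple's existing CryptoKit API on iPhone/Mac.
// The private key is generated and kept inside the Secure Enclave;
// the calling code never sees the raw key material.
func signWithEnclaveKey(_ message: Data) throws -> P256.Signing.ECDSASignature? {
    // Not all hardware has a Secure Enclave.
    guard SecureEnclave.isAvailable else { return nil }
    let key = try SecureEnclave.P256.Signing.PrivateKey()
    return try key.signature(for: message)
}
```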

The machines run a new operating system that Apple’s described as “a hardened subset of the foundations of iOS and macOS tailored to support Large Language Model (LLM) inference workloads while presenting an extremely narrow attack surface.”

That OS omits “components that are traditionally critical to datacenter administration, such as remote shells and system introspection and observability tools,” Apple wrote. Even the kind of telemetry needed by the site reliability engineers who keep Apple’s cloud running has evidently been minimized to offer “only a small, restricted set of operational metrics” – so that info processed by Apple’s models is inaccessible to humans other than the end-users who provide it. In other words, cloud sysadmins can’t access personal info “even when working to resolve an outage or other severe incident.”

Apple has not otherwise revealed any information about the servers’ CPUs.

The presence of the same Secure Enclave and Secure Boot tech as used in the iPhone suggests the silicon shares some elements of A-series designs Apple uses in its smartphones and lower-end tablet computers.

Recent A-series chips boast a 16-core Neural Engine – the same core count found in the recently announced M4 processor offered in the iPad Pro. It’s unclear exactly how the two engines compare.

The most recent iPhone chip – the A17 Pro – uses Arm’s v8.6-A instruction set. The M4 is thought to use the more modern v9.4-A.

Maybe Apple cooked up a custom chip for these servers. It certainly operates at a scale that makes doing so feasible.

Whatever is inside, Apple’s use of Arm-based silicon for AI servers is yet more evidence – if any were needed – that the Arm architecture is ready for datacenter duty in demanding applications.

AWS, Google, Oracle and Microsoft all offer Arm-powered servers in their public clouds for general purpose workloads, and tout them as offering superior price/performance compared to chips from Intel and AMD on some jobs.

Surely Apple would not be betting its next-gen AI on silicon that isn’t ready to deliver its promised integration of cloudy and on-device action?

In true Apple fashion, we have one more thing to note: nothing associated with these cloud servers suggests the company intends to return to the business of selling hardware for your datacenter – a field it abandoned more than a decade ago.®
