Generative AI is no longer just a futuristic promise—it’s here, reshaping industries with capabilities like real-time language understanding, image generation, autonomous decision-making, and hyper-personalized experiences. But as this technology matures, and confidentiality and cost concerns rise, a critical evolution is taking place: moving generative AI from the cloud to the edge.
For leaders across healthcare, manufacturing, retail, automotive, and defense, this shift isn’t just a technical upgrade—it’s a strategic imperative. Edge AI enables real-time intelligence, reinforces data privacy, ensures operational resilience, and significantly cuts costs. In this expert-led discussion featuring ADLINK and Phison, we break down why the edge is becoming the next frontier of generative AI—and how businesses can seize this pivotal opportunity.
“When milliseconds matter, the cloud is too far,” says Dr. Wei Lin, CTO of Phison Electronics. Autonomous vehicles, factory robots, or medical devices can’t afford latency. By bringing generative AI inference directly to the edge—where data is generated—decisions happen instantly and reliably.
“Customer behavior data is powerful—but it must remain private,” explains Ethan Chen, GM of Edge Computing Platforms BU, ADLINK. On-premise deployment helps ensure sensitive data—whether from healthcare, finance, or defense—never leaves the local device or premises, reducing exposure risks tied to cloud-based AI.
On-premise generative AI deployment lets enterprises draw on internal resources and, most importantly, domain-specific data such as workflows, technical documents, and regulatory content that is already in place. "Through fine-tuning and making the most of existing domain data, it bridges gaps in the model's understanding, making generative AI more aligned with real-world applications and delivering more accurate, professional responses," said Chen.
Cloud dependence introduces a single point of failure. “You can’t count on constant connectivity, especially on factory floors or in the field,” says Dr. Lin. On-premise AI systems keep running even when offline—ensuring uninterrupted operation in remote, rugged, or high-security environments.
Processing data at the edge slashes bandwidth usage and reduces cloud storage costs. “But cost remains a hurdle,” notes Dr. Lin. “If building an AI robot costs the same as hiring a person for 20 years—it’s not viable.” Hardware cost, especially GPU memory, is the key bottleneck.
“AI models can be categorized by scale,” explains Dr. Lin.
~10B parameters: Basic verbal interaction
~100B parameters: Reasoning and inference
~1000B parameters: Advanced models like ChatGPT and DeepSeek
Here’s the catch: GPU memory requirements scale almost linearly. “It’s roughly 1GB of VRAM per 1B model parameters,” adds Chen. That makes large models prohibitively expensive to run on edge devices—until now.
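The rule of thumb quoted above can be turned into a quick back-of-the-envelope calculation. The sketch below applies the roughly 1 GB of VRAM per 1B parameters figure to the three model tiers mentioned; it is an estimate only, since real memory usage also depends on numeric precision, the KV cache, and activation buffers.

```python
# Rough VRAM estimate using the ~1 GB per 1B parameters rule of thumb
# quoted in the article. A sizing sketch, not an exact formula: actual
# usage also varies with precision (FP16 vs. INT8), KV cache, and batch size.
GB_PER_BILLION_PARAMS = 1.0

def estimated_vram_gb(params_billion: float) -> float:
    """Estimate GPU memory (GB) needed to hold a model of the given size."""
    return params_billion * GB_PER_BILLION_PARAMS

for size in (10, 100, 1000):
    print(f"~{size}B parameters -> ~{estimated_vram_gb(size):.0f} GB VRAM")
```

At this rate, even the ~100B "reasoning" tier already exceeds the memory of any single edge GPU, which is exactly the bottleneck the next section addresses.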
“When we saw Phison’s aiDAPTIV+ in action, we knew it was a game-changer,” says Chen. ADLINK integrated it into their latest DLAP edge AI platform, accelerated by NVIDIA Jetson Orin™ NX and Jetson Orin™ Nano. The result? The DLAP Supreme—a platform that turns ~8B hardware capability into support for AI models up to 80–90B.
How?
aiDAPTIV+ breaks the GPU memory barrier.
“We offload parts of the AI model from GPU to NAND flash in our SSDs,” explains Dr. Lin. This means you don’t need 90GB of VRAM to run a 90B model—you might only need 10–20GB, slashing hardware costs dramatically.
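Conceptually, this works like a cache hierarchy: only a small working set of model layers stays resident in GPU memory, while the rest are staged on flash and swapped in as the forward pass reaches them. The sketch below is purely illustrative of that idea, not Phison's actual implementation; the class names, slot counts, and eviction policy are all hypothetical.

```python
from collections import OrderedDict

# Illustrative sketch of layer offloading (NOT Phison's actual design):
# keep a small working set of layers in GPU memory, stage the rest on
# flash, and swap layers in as inference reaches them.
class LayerCache:
    def __init__(self, total_layers: int, gpu_slots: int):
        # All layer weights live on flash (simulated here as a dict).
        self.flash = {i: f"layer-{i}-weights" for i in range(total_layers)}
        self.gpu = OrderedDict()          # layer id -> weights resident on GPU
        self.gpu_slots = gpu_slots        # how many layers fit in VRAM
        self.flash_loads = 0              # count of swap-ins from flash

    def fetch(self, layer_id: int):
        if layer_id in self.gpu:          # hit: already resident on GPU
            return self.gpu[layer_id]
        if len(self.gpu) >= self.gpu_slots:
            self.gpu.popitem(last=False)  # evict the oldest resident layer
        self.flash_loads += 1
        weights = self.flash[layer_id]    # simulate a read from the SSD
        self.gpu[layer_id] = weights
        return weights

# One forward pass over an 80-layer model with room for only 16 layers in VRAM.
cache = LayerCache(total_layers=80, gpu_slots=16)
for layer in range(80):
    cache.fetch(layer)
print(f"GPU-resident layers: {len(cache.gpu)}, flash loads: {cache.flash_loads}")
# -> GPU-resident layers: 16, flash loads: 80
```

The trade-off is that each swap-in costs SSD read latency, which is why this approach suits edge inference workloads that can tolerate slightly slower token generation in exchange for drastically cheaper hardware.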
This makes advanced generative AI not just powerful—but finally affordable at the edge.
Bringing large generative models to the edge, affordably, opens up revolutionary use cases—especially for small and mid-sized enterprises.
In healthcare, portable edge devices can run vision and language models for diagnostics, triage, or even mental health monitoring—without exposing sensitive data to the cloud. Rural clinics could access world-class AI tools offline and at low cost.
In retail, smart mirrors and kiosks can offer personalized recommendations, generate dynamic visuals, or answer customer queries—right on the spot, with zero data leaving the premises.
In logistics and manufacturing, edge AI can analyze sensor data and video feeds in real time to detect safety risks, optimize workflows, or predict maintenance—without needing cloud access.
In transportation and defense, embedded AI in vehicles and drones enables real-time decision-making, language translation, and navigation—all processed locally for speed and security.
Generative AI at the edge is no longer a dream—it’s a deployable, affordable reality. Thanks to innovations from Phison and ADLINK, aiDAPTIV+ and the DLAP Supreme platform, what was once reserved for hyperscalers is now within reach for enterprises of all sizes.