NVIDIA and Microsoft Unveil RTX Spark Superchip for Local Agentic AI

NVIDIA RTX Spark: Local AI Superchip Targets Professional Workstations and On-Device Enterprise Workflows

At Computex 2026 in Taipei, NVIDIA CEO Jensen Huang unveiled the RTX Spark, a consumer-grade superchip developed in close collaboration with Microsoft. It is engineered to run complex artificial intelligence agents directly on local devices rather than routing computation through cloud infrastructure. Announced on 1 June 2026, the RTX Spark is pitched as a shift in how professional workstations are conceived: from a tool that executes user instructions to a platform that can run sustained, autonomous multi-step workflows while the user is away from the desk. The chip will debut in devices including the Microsoft Surface Laptop Ultra, alongside hardware from ASUS, Dell, HP, Lenovo, and MSI, with release scheduled for later in 2026.

For professional services firms, including engineering consultancies, environmental practices, legal teams, and infrastructure developers, this announcement carries implications well beyond the consumer laptop market. The core proposition is that organisations will be able to run very large AI language models, including models with up to 120 billion parameters and context windows of up to one million tokens, entirely on local hardware without transmitting sensitive project data to third-party cloud servers. This directly addresses two persistent concerns in professional services: data residency and the ongoing cost of cloud API access for AI-assisted document processing and analysis workflows.

The collaboration between NVIDIA and Microsoft is notable because it is not simply a hardware announcement. Microsoft has co-developed security primitives within the Windows operating system specifically to govern how local AI agents behave, and a new runtime environment called NVIDIA OpenShell has been designed to contain agent activity within defined boundaries. This degree of vertical integration between chip architecture and operating system is rare in the consumer computing space and signals a serious commercial push toward enterprise-grade local AI execution on standard workstation hardware.

Key details of the RTX Spark superchip architecture and performance

The RTX Spark superchip integrates a 20-core Arm-based central processing unit, co-engineered with MediaTek, alongside a Blackwell-architecture graphics processing unit that contains 6,144 CUDA cores. The two processing units are connected via NVIDIA’s NVLink Chip-to-Chip (C2C) interconnect, which allows them to share a unified pool of up to 128 gigabytes of LPDDR5X memory operating at a bandwidth of 300 gigabytes per second. This unified memory architecture eliminates the bottleneck that traditionally exists when discrete laptop GPUs must transfer data to and from separate CPU memory pools, a constraint that has historically limited the size of AI models that can be run efficiently on portable hardware.
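As a rough illustration of what the quoted memory figures imply, the sketch below estimates how long a single read pass over a block of data would take at 300 gigabytes per second. It assumes decimal gigabytes and that the quoted peak bandwidth is fully sustained, which real workloads rarely achieve. The function name `seconds_to_stream` is illustrative and does not come from the announcement.

```python
# Back-of-the-envelope estimate only: assumes the quoted peak bandwidth
# is fully sustained and uses decimal gigabytes (1 GB = 1e9 bytes).
MEMORY_POOL_GB = 128      # unified LPDDR5X pool quoted in the announcement
BANDWIDTH_GB_PER_S = 300  # quoted memory bandwidth


def seconds_to_stream(size_gb: float,
                      bandwidth_gb_per_s: float = BANDWIDTH_GB_PER_S) -> float:
    """Return the time needed to read size_gb of data once at the given bandwidth."""
    if size_gb < 0 or bandwidth_gb_per_s <= 0:
        raise ValueError("size must be non-negative and bandwidth positive")
    return size_gb / bandwidth_gb_per_s


if __name__ == "__main__":
    full_pool = seconds_to_stream(MEMORY_POOL_GB)
    print(f"One pass over the full {MEMORY_POOL_GB} GB pool: {full_pool:.2f} s")
```

Under these assumptions, one pass over the full 128 GB pool takes a little under half a second. Because generating text requires repeatedly reading model weights from memory, bandwidth as well as capacity shapes how responsive a large local model can be.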

The claimed peak AI compute performance of the RTX Spark is 1 petaflop of local FP4 (4-bit floating point) inference throughput. For context, running a 120-billion-parameter language model locally at useful speeds has until recently required either dedicated server-class GPU clusters or cloud infrastructure. The RTX Spark is said to achieve this through FP4 and FP8 quantisation, which compresses model weights while preserving reasoning quality at a level sufficient for professional document analysis, code auditing, and structured data synthesis tasks. The 1-million-token context window supported by the hardware means the chip can hold and reason over extremely large documents, entire project folders, or extended regulatory correspondence within a single inference session.
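The arithmetic behind the quantisation claim can be sketched as follows. This is a minimal estimate of weight storage only, assuming decimal gigabytes and the 128-gigabyte unified memory pool quoted above. It ignores the key-value cache, activations, and runtime overhead, all of which need further memory, and the helper name `weight_footprint_gb` is illustrative.

```python
# Illustrative arithmetic: weight storage only. Ignores the KV cache,
# activations, and runtime overhead, all of which need additional memory.
BITS_PER_WEIGHT = {"FP16": 16, "FP8": 8, "FP4": 4}
MEMORY_POOL_GB = 128  # unified memory quoted in the announcement


def weight_footprint_gb(parameters: float, bits_per_weight: int) -> float:
    """Return the weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return parameters * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    params = 120e9  # 120-billion-parameter model
    for name, bits in BITS_PER_WEIGHT.items():
        size = weight_footprint_gb(params, bits)
        verdict = "fits" if size <= MEMORY_POOL_GB else "does not fit"
        print(f"{name}: {size:.0f} GB of weights, {verdict} in {MEMORY_POOL_GB} GB")
```

On these assumptions, a 120-billion-parameter model needs about 240 GB of weights at 16-bit precision, 120 GB at FP8, and 60 GB at FP4. Only the quantised formats fit within a 128 GB pool, and only FP4 leaves substantial room for a long context.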

On the software side, NVIDIA and Microsoft have committed to rebuilding industry-standard applications natively for the Arm-Blackwell architecture. Named applications include Adobe Photoshop, Adobe Premiere, Blender, and DaVinci Resolve. For professional services contexts, the more relevant commitment is the native porting of GIS, CAD, and modelling tools to the architecture, though specific software titles beyond the creative suite had not been confirmed at the time of the Computex announcement. Microsoft has also fully optimised its Prism x86 emulation layer for the RTX Spark, and NVIDIA is working with developers of critical digital rights management and anti-cheat technologies, specifically BattlEye and Denuvo, to deliver native Arm support. If delivered, this would address a long-standing compatibility barrier that has limited Windows-on-Arm adoption in professional environments.

The autonomous agent infrastructure on the device runs through open-source runtimes identified as OpenClaw and Hermes, which manage the scheduling and execution of long-running, multi-step AI workflows. These runtimes are designed to allow “continuous agents,” meaning AI processes that run overnight or across extended periods without active user supervision, to execute complex tasks such as processing large volumes of documents, synthesising data across multiple files, or generating structured reports from unstructured source material. The NVIDIA OpenShell runtime environment and native Windows security primitives together define the permission boundaries within which these agents operate, an important consideration for professional firms evaluating data governance obligations.
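To make the idea of a permission boundary concrete, the sketch below shows one generic way such a check could work: an agent's file request is allowed only if it resolves to a location inside an approved folder. This is a hypothetical illustration, not the OpenShell or Windows API, whose interfaces were not described in the announcement. The function name `is_within_boundary` and the example paths are assumptions.

```python
# Hypothetical illustration only: this is NOT the OpenShell or Windows API.
# It shows the general idea of confining an agent to approved folders.
from pathlib import Path


def is_within_boundary(target: str, allowed_roots: list[str]) -> bool:
    """Return True if target resolves to a location inside one of allowed_roots."""
    # resolve() normalises ".." segments and symlinks before comparison,
    # so a request cannot step outside an approved folder by path tricks.
    resolved = Path(target).resolve()
    for root in allowed_roots:
        root_resolved = Path(root).resolve()
        if resolved == root_resolved or root_resolved in resolved.parents:
            return True
    return False


if __name__ == "__main__":
    allowed = ["/projects/site_a"]  # example path, assumed for illustration
    for request in ("/projects/site_a/report.docx",
                    "/projects/site_a/../site_b/report.docx"):
        print(request, "->", "allow" if is_within_boundary(request, allowed) else "deny")
```

The point of the design is that the check runs outside the agent itself, so an overnight job cannot widen its own access. A production runtime would enforce this at the operating system level rather than in application code.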

Australian context: local AI execution and professional services data obligations

Australian professional services firms, particularly those working in regulated industries such as environmental consulting, infrastructure delivery, legal services, and government advisory, operate under a distinct set of data governance obligations that make the RTX Spark’s local execution capability directly relevant. The Privacy Act 1988 (Cth) and the Australian Privacy Principles (APPs) impose obligations on entities that handle personal information, and in some circumstances, transmitting that information to offshore cloud servers can trigger cross-border disclosure obligations that require additional contractual and technical safeguards. The ability to run large AI models entirely on local hardware, without data leaving the device or the firm’s network, can remove this compliance complexity for a broad category of document processing and analysis tasks.

References and related sources

Primary source: nvidianews.nvidia.com
tomshardware.com
pcmag.com

How iEnvi can help

iEnvi integrates technology and data-driven approaches into environmental consulting. We monitor AI and technology developments that affect how environmental professionals deliver services to clients.

This is an iEnvi Machete news summary, prepared by iEnvi to summarise the source article for environmental professionals tracking AI, data, and technology developments that affect consulting and project delivery.

Published: 02 Jun 2026

Need advice on this topic? Speak to an iEnvi expert at info@ienvi.com.au or 1300 043 684, or contact us online.
