The Physical Footprint of AI: The Path to Browser-Based Decentralization

As public opposition to energy-intensive megadata centres grows, local AI execution via WebGPU is emerging as an eco-friendly and sovereign alternative.

A conceptual illustration of a web browser executing AI locally on a device, reducing reliance on massive data centres.

The Physical Paradox of Artificial Intelligence

The nationwide rollout of artificial intelligence is now facing a physical and social reality that is difficult to ignore. While the Canadian government recently unveiled its new national strategy, "AI for All", aiming to accelerate the adoption of these technologies to boost productivity, the public is expressing growing reluctance regarding the required hardware infrastructure. According to a poll published by TVA Nouvelles, seven out of ten Canadians oppose the construction of large data centres near their homes.

This gap between digital ambitions and the social acceptability of infrastructure highlights the invisible environmental cost of AI. Behind the seamless experience of conversational assistants lie highly energy-intensive server farms, requiring millions of litres of water for cooling and megawatts of electricity to run processors. In Quebec, where hydroelectricity has long been perceived as an unlimited resource, managing connection requests for data centres now forces tight strategic choices to avoid overloading the distribution grid.

Understanding the Energy Footprint of Inference

To understand the scope of the problem, it is helpful to distinguish between two key phases in the lifecycle of a language model: training and inference. Training, which involves feeding a model massive volumes of data to teach it to predict the next word, is a one-time but extremely heavy operation, generally reserved for centralized supercomputers. Inference, on the other hand, corresponds to each daily query made by a user.

Although a single query consumes very little energy, multiplying it by millions of users, each sending many requests a day, produces a cumulative load that is unsustainable for centralized infrastructure. According to projections by the International Energy Agency, electricity consumption related to data centres, AI, and cryptocurrencies could double in the near term. In light of this trajectory, the all-cloud model is showing its limits. Sending every simple question to a server hundreds of kilometres away just to receive a text response is like using a semi-trailer truck to deliver a letter.
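To make the cumulative effect concrete, the arithmetic can be sketched in a few lines of TypeScript. The function below simply multiplies users by queries by energy per query. The field names and the sample figures are illustrative placeholders chosen for this sketch; they are not measurements and do not come from the IEA projections cited above.

```typescript
// Hypothetical input shape for this sketch; every value is supplied by the caller.
interface InferenceLoad {
  users: number;                // number of people sending queries
  queriesPerUserPerDay: number; // average daily queries per person
  wattHoursPerQuery: number;    // assumed server-side energy per query, in Wh
}

// Daily energy in kilowatt-hours: users x queries x Wh per query, then Wh -> kWh.
function dailyEnergyKwh(load: InferenceLoad): number {
  const wattHours =
    load.users * load.queriesPerUserPerDay * load.wattHoursPerQuery;
  return wattHours / 1000;
}

// Placeholder values, chosen only to show how the total scales.
const sample: InferenceLoad = {
  users: 1_000_000,
  queriesPerUserPerDay: 10,
  wattHoursPerQuery: 0.3,
};
console.log(`${dailyEnergyKwh(sample)} kWh per day`);
```

Whatever per-query figure is plugged in, the total grows linearly with both the number of users and the number of daily queries, which is why a negligible individual cost can still add up to a significant load.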

The Technological Answer: Local Execution via WebGPU

To bypass this dependence on centralized infrastructure, a technical solution is emerging: executing models locally, directly on the user's device. This approach relies on a modern web standard called WebGPU. This application programming interface (API) gives the browser secure access to the processing power of the graphics processing unit (GPU) in a personal computer or smartphone, without requiring any third-party software installation.
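In practice, a page first has to check whether the browser exposes WebGPU and whether a usable GPU adapter is available. The sketch below shows one way to do that with the standard `navigator.gpu.requestAdapter()` and `adapter.requestDevice()` calls. The helper names (`hasWebGpuApi`, `canRunOnGpu`) and the minimal `NavigatorLike` types are assumptions made for this example so that it compiles without extra type packages; they are not part of any ProductivIA code.

```typescript
// Minimal structural types (hypothetical) so the sketch compiles
// without the @webgpu/types package.
interface GpuAdapterLike {
  requestDevice(): Promise<unknown>;
}
interface GpuLike {
  requestAdapter(): Promise<GpuAdapterLike | null>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

// Synchronous check: does the browser expose the WebGPU entry point at all?
function hasWebGpuApi(nav: NavigatorLike): boolean {
  return nav.gpu !== undefined;
}

// Asynchronous check: the API can exist while no adapter or device is usable,
// so both steps are attempted before reporting success.
async function canRunOnGpu(nav: NavigatorLike): Promise<boolean> {
  if (!nav.gpu) return false;
  try {
    const adapter = await nav.gpu.requestAdapter();
    if (adapter === null) return false;
    await adapter.requestDevice();
    return true;
  } catch {
    return false; // device creation was refused or failed
  }
}

// In a browser, the real navigator object would be passed in:
//   canRunOnGpu(navigator as NavigatorLike).then((ok) => console.log(ok));
```

Taking the navigator as a parameter rather than reading the global keeps the check easy to exercise outside a browser, for example with a stub object.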

Thanks to advances in model compression (quantization, which reduces the size of language models without major loss of accuracy), high-performing models with several billion parameters can now run smoothly inside a simple browser tab. The environmental impact is immediate: once the model has been downloaded to the device, each query is processed locally, eliminating per-request network transit, bandwidth consumption, and reliance on remote servers. The energy consumed is limited to what the user's device draws from its battery or power outlet.
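The idea behind quantization can be illustrated with a deliberately simplified sketch: storing each weight as an 8-bit integer plus one shared scale factor instead of a 32-bit float. This is a basic symmetric int8 scheme chosen for clarity; models that run in the browser typically use more elaborate schemes (often 4-bit, with a scale per group of weights), and the function names here are hypothetical.

```typescript
interface QuantizedTensor {
  values: Int8Array; // one byte per weight instead of four
  scale: number;     // shared factor used to recover approximate floats
}

// Symmetric quantization: map [-maxAbs, +maxAbs] onto the integers [-127, 127].
function quantizeInt8(weights: Float32Array): QuantizedTensor {
  let maxAbs = 0;
  for (let i = 0; i < weights.length; i++) {
    maxAbs = Math.max(maxAbs, Math.abs(weights[i]));
  }
  const scale = maxAbs === 0 ? 1 : maxAbs / 127; // avoid dividing by zero
  const values = new Int8Array(weights.length);
  for (let i = 0; i < weights.length; i++) {
    values[i] = Math.round(weights[i] / scale);
  }
  return { values, scale };
}

// Recover approximate floats; each one is off by at most about scale / 2.
function dequantizeInt8(q: QuantizedTensor): Float32Array {
  const out = new Float32Array(q.values.length);
  for (let i = 0; i < q.values.length; i++) {
    out[i] = q.values[i] * q.scale;
  }
  return out;
}

const weights = new Float32Array([0.5, -1.0, 0.25, 0.0]);
const quantized = quantizeInt8(weights);
console.log(weights.byteLength, "bytes ->", quantized.values.byteLength, "bytes");
console.log(dequantizeInt8(quantized));
```

The storage cost drops by a factor of four while each recovered weight stays within a small, bounded distance of the original, which is the trade-off that makes multi-billion-parameter models fit in a device's memory.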

The ProductivIA Approach: Balancing Performance and Sobriety

Within the ProductivIA application ecosystem, this decentralization philosophy is concretely reflected in the IA Locale application. Designed to work entirely without code or complex configuration, this application allows users to run language models directly in their browser. Unlike conventional solutions that require sending data to external servers, IA Locale uses the standard WebGPU API to process requests on the user's machine.

This architecture offers a threefold advantage:

  • Energy sobriety: By avoiding repeated server calls for writing, summarizing, or proofreading tasks, organizations drastically reduce their indirect digital carbon footprint.
  • Absolute confidentiality: The processed text or document data never leaves the computer's RAM. This feature ensures natural compliance with Quebec's Law 25 requirements, without requiring a complex privacy impact assessment for cross-border data transfers.
  • Operational autonomy: The application remains functional even in the event of an internet outage or central server failure, eliminating the risk of systemic failure inherent in centralized architectures.

For queries requiring greater computing power or model comparison, the platform offers the GoIA application. This tool allows users to switch seamlessly between different engines, whether they are hosted on public infrastructure or on Quebec's sovereign Matania engine. This flexibility allows administrators to reserve server calls for high-value tasks, while prioritizing local execution for everyday work.
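The kind of policy described above, local execution for everyday work and server calls reserved for heavier tasks, can be sketched as a small routing function. Everything in this example is hypothetical: the task categories, the `chooseEngine` name, and the rules themselves are assumptions made for illustration and do not describe how GoIA or IA Locale actually decide.

```typescript
type Engine = "local-webgpu" | "remote-server" | "unavailable";

type TaskKind =
  | "write"
  | "summarize"
  | "proofread"
  | "compare-models"
  | "heavy-analysis";

interface TaskContext {
  kind: TaskKind;
  online: boolean;            // is a network connection available?
  localGpuAvailable: boolean; // did the WebGPU check succeed on this device?
}

// Everyday tasks that this sketch assumes a local model handles well.
const EVERYDAY_TASKS = new Set<TaskKind>(["write", "summarize", "proofread"]);

function chooseEngine(ctx: TaskContext): Engine {
  // Everyday work stays on the device whenever the device can handle it.
  if (EVERYDAY_TASKS.has(ctx.kind) && ctx.localGpuAvailable) {
    return "local-webgpu";
  }
  // Heavier tasks, or devices without a usable GPU, go to a server if reachable.
  if (ctx.online) return "remote-server";
  // Offline: fall back to the local model if there is one, even for heavy tasks.
  return ctx.localGpuAvailable ? "local-webgpu" : "unavailable";
}

console.log(chooseEngine({ kind: "summarize", online: true, localGpuAvailable: true }));
console.log(chooseEngine({ kind: "compare-models", online: true, localGpuAvailable: true }));
console.log(chooseEngine({ kind: "heavy-analysis", online: false, localGpuAvailable: true }));
```

Keeping the rule in one pure function makes the policy easy for an administrator to audit and adjust, for instance by moving a task category from the server list to the local one.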

Toward a Hybrid and Responsible Architecture

The transition toward an AI that respects planetary boundaries will not happen by abandoning technology, but by optimizing its architecture. The combination of a resource-efficient operating system like Boréal-OS, which can extend the useful life of existing computer hardware, and an application suite like ProductivIA that runs AI locally via WebGPU, demonstrates that a viable alternative exists. By giving computing control back to the user, this approach reconciles software innovation with the imperatives of energy sobriety and territorial sovereignty.

© ProductivIA 2026
info@productivia.ca - 581-504-0294
296, rue Saint-Pierre - Matane, QC G4W 2B9
Confidentiality Policy - Legal information