Compute DePINs: Paths to Adoption in an AI-Dominated Market

    Ryan Connor

    Key Takeaways

    • Compute DePINs are one of the best-positioned verticals in crypto for mass adoption, as AI compute demand has exposed structural inefficiencies in the cloud computing and leading-edge compute markets. These structural inefficiencies could last for years, as concentrated and technically complex supply chains are moving too slowly to meet the insatiable demand for compute, and the step-function growth in demand for AI compute persists.
    • The Compute DePIN landscape will likely be fragmented, where projects focus on a single or related set of end use cases that conform to a set of related hardware configurations. There will likely be attempts at an aggregation layer that acts like a DePIN of Compute DePINs. This model is the highest risk / highest reward layer of the Compute DePIN stack.
    • Crypto-native developers and organizations, price-sensitive startups with little cloud lock-in, and academics & researchers are natural first customers. Projects can also position themselves as variable load sources of compute for independent and private data centers, or as synthetic data generation services for foundation model startups, which is a large market where Compute DePINs have a clear price and product advantage.

    The Critical Importance of Compute

    Compute has been an increasingly vital resource to the global economy, from the ascent of the microchip and its use in military and scientific applications in the 50s and 60s, to the PC in the 80s and the smartphone and scientific computing today. Digital services have grown in prominence both in our personal lives and in industry, positioning the resource as absolutely critical to modern civilization.

    The rise in importance of computational resources - semiconductors - has propelled mega-cap tech companies in the United States to become the most dominant companies on the planet. It also aided the rise of geopolitical powers such as the United States, Japan, China, and Europe, underpinning their economic, military, and technological prowess (for a deep dive into why, please see Chip War).

    The Rise of Generative AI

    The importance of compute has greatly increased with the rise of the transformer architecture in 2017, the subsequent “iPhone moments” of generative AI models Dall-E and ChatGPT in 2022, and the exponential adoption of generative AI tools and applications by developers, consumers, and enterprises we’re seeing today. 

    The transformer architecture - a groundbreaking neural network design that utilizes self-attention mechanisms and is powered by parallel processing - gave rise to generative image, generative video, and large language models (LLMs) that have captured the minds of developers, knowledge workers, consumers, and enterprises due to their creativity, labor-saving abilities, dramatic productivity-enhancing features, and sparks of artificial general intelligence.

    The speed of generative AI adoption is unmatched in the history of technology, with profound implications for compute markets. 

    ChatGPT was AI’s iPhone moment, reaching 1 million and then 100 million global users faster than any app in history. Knowledge workers are finding LLMs so indispensable that over 40% of market researchers cannot live without code generation tools and researchers are admitting they’d rather take a pay cut than lose access to LLMs. The number of professionals using LLMs in the workplace is rising at a rapid rate. This adoption is almost universally employee-led but is becoming increasingly enterprise-led, as enterprise budgets for LLM spend are accelerating 2-5x in 2024.

    chatgpt so fast.png

    The amount of developer mindshare that generative models have captured is unprecedented. Stable Diffusion, a generative image model, became the most popular GitHub repo at a rate previously unheard of amongst developers. Hugging Face, a repository for open-source AI/ML tools, libraries, and models, has grown the number of models available on its platform by over 5x in just 16 months. Developers clearly have plans to build AI-centric applications and product features, with an appetite previously unseen in technology markets.

    huggingface models.png

    stable diffusion.png

    But the compute resource and developer dynamics at play in generative AI are starkly different from those of traditional software. In traditional software development, developers were rewarded for optimizing code and algorithms to minimize computational resources and data usage. There were few (if any) problems where increasing compute and data usage by orders of magnitude was worth the tradeoff. Efficient code that could run on “good-enough” hardware was the goal. Moore’s Law would scale, and so would the cloud provider and the end application.

    In contrast, generative AI development rewards developers for using more compute and more data, because AI scaling laws provide a clear, predictable path to increasing the performance of generative AI models, enabling superhuman abilities and labor-saving productivity enhancements. 

    The AI-Compute Flywheel

    The superhuman capabilities that AI models enable result in an AI-Compute Flywheel, where enhanced productivity drives demand for more compute, which drives more productivity, and so on. The business economics and incentives are such that enterprise and consumer applications will only require more computational resources over time - a continuation and acceleration of a trend we’ve observed since the 1950s.

    LLMs are achieving superhuman performance across many areas, including image classification, reading comprehension, visual reasoning, mathematics, and medicine.

    human baseline.png

    These performance gains are directly related to the number of floating point operations (or “FLOPs”, a precise measurement of compute) performed in training and the subsequent parameters of a given model. This relationship is explained in the 2020 paper by OpenAI regarding “scaling laws”. Put simply, as models are trained for longer on more data, their performance increases according to a power-law relationship. As a rule of thumb, in order to double the performance of an AI model, you need to 10x training compute and 10x training data.

    scaling laws.png
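
    Taken at face value, the rule of thumb above pins down a power-law exponent. The sketch below is only an illustration of that arithmetic, not the fitted result from the OpenAI paper: it assumes performance scales as compute raised to a constant exponent (with training data scaled up alongside compute), and the names `ALPHA`, `performance_multiplier`, and `compute_multiplier_for` are hypothetical.

```python
import math

# Assumption: performance is proportional to compute ** ALPHA, with training
# data scaled in lockstep. The rule of thumb (10x compute -> 2x performance)
# implies ALPHA = log10(2).
ALPHA = math.log10(2)


def performance_multiplier(compute_multiplier: float) -> float:
    """Performance gain implied by scaling training compute by a given factor."""
    return compute_multiplier ** ALPHA


def compute_multiplier_for(performance_gain: float) -> float:
    """Compute scale-up needed to reach a given performance gain."""
    return performance_gain ** (1 / ALPHA)


if __name__ == "__main__":
    print(f"implied exponent: {ALPHA:.3f}")
    print(f"10x compute -> {performance_multiplier(10):.2f}x performance")
    print(f"4x performance -> {compute_multiplier_for(4):.0f}x compute")
```

    Under this assumption, each further doubling of performance costs another 10x in compute, so a 4x improvement requires roughly 100x the compute.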

    Importantly, at new levels of compute and data scale, LLMs exhibit emergent properties, i.e. step function improvements in their abilities. Fine-tuning these models, or the re-training of a pre-trained model on a more narrow subset of tasks, pushes these gains and emergent properties even further. This suggests that, even if LLM abilities hit an asymptote, ML researchers and enterprises may still find it worthwhile to throw more compute at generative models to test the thresholds of emergent behavior. 

    emergent properties.png

    As a result of scaling laws - more compute, more data, better model - the compute power required to train a state-of-the-art AI model has increased at a truly extraordinary rate. The total amount of compute required to train a single state of the art (SOTA) AI model has increased by 4.2x per year since 2010. This growth has been so rapid that, when you plot the growth of computational training requirements of SOTA AI models since 1950 on a log scale, the curve remains visually exponential. Remember - we plot curves on a log scale to remove the exponential curve. I am personally not aware of any chart that demonstrates such a rapid exponential rise. 

    FLOPS SOTA models.png
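
    To make the 4.2x annual growth figure concrete, the short sketch below converts it into a doubling time and a cumulative multiple. It assumes the rate is constant over the period, which is a simplification, and the function names are hypothetical.

```python
import math

GROWTH_PER_YEAR = 4.2  # annual growth in SOTA training compute cited in the text


def doubling_time_months(growth_per_year: float) -> float:
    """Months needed for compute to double at a constant annual growth factor."""
    return 12 * math.log(2) / math.log(growth_per_year)


def cumulative_growth(growth_per_year: float, years: float) -> float:
    """Total growth multiple after compounding for the given number of years."""
    return growth_per_year ** years


if __name__ == "__main__":
    print(f"doubling time: {doubling_time_months(GROWTH_PER_YEAR):.1f} months")
    print(f"growth over 10 years: {cumulative_growth(GROWTH_PER_YEAR, 10):.2e}x")
```

    At a constant 4.2x per year, training compute doubles in under six months and grows by a factor of more than a million over a decade.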

    It is essential to amass as many GPUs as possible to train a model by running calculations in parallel. Why? On a single GPU, training GPT-4 would take thousands of years. As a consequence, training is run on large clusters of GPUs in data centers. For example, the preeminent SOTA LLM GPT-4 required 25,000 GPUs, all running in parallel continuously for 90 days, to train, and the total cost of training (i.e. the cost to lease that hardware) was somewhere between $50 and $100 million. As SOTA models become increasingly multimodal - i.e., capable of image, video, and language generation - the size, complexity, and computation required to build and use these models should climb even higher.
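
    The GPT-4 figures above can be sanity-checked with back-of-the-envelope arithmetic. The sketch below uses only the numbers cited in the text; the variable names are hypothetical, and the single-GPU estimate assumes perfectly linear scaling, which real clusters do not achieve.

```python
GPUS = 25_000            # GPUs cited for GPT-4 training
DAYS = 90                # continuous training duration cited
COST_LOW_USD = 50e6      # low end of the cited training cost
COST_HIGH_USD = 100e6    # high end of the cited training cost

# Total GPU time consumed by the training run.
gpu_hours = GPUS * DAYS * 24

# Time the same work would take on one GPU (assumes perfect linear scaling).
single_gpu_years = GPUS * DAYS / 365

# Implied lease price per GPU-hour at each end of the cost range.
rate_low = COST_LOW_USD / gpu_hours
rate_high = COST_HIGH_USD / gpu_hours

if __name__ == "__main__":
    print(f"GPU-hours: {gpu_hours:,}")
    print(f"single-GPU equivalent: {single_gpu_years:,.0f} years")
    print(f"implied price: ${rate_low:.2f} to ${rate_high:.2f} per GPU-hour")
```

    The run works out to 54 million GPU-hours, which puts the implied lease price at roughly $0.93 to $1.85 per GPU-hour across the cited cost range.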

    Serving these models (performing model inference) exhibits similar dynamics, where the cost of serving moves higher with the model’s capabilities. Take, for example, MMLU, a premier benchmark for assessing LLM capabilities. The benchmark assesses performance in scenarios across 57 subjects, including STEM, the humanities, and the social sciences. The smarter the SOTA model, the more costly it is to run in production. Further, the market for inference is estimated to be over 5x larger than the market for training AI models.  

    infernce v mmlu.png

    Computational Efficiency & Jevons Paradox

    High costs incentivize organizations and developers to create new compute-efficient methods for training and serving generative models. While we have no doubt methods like RAG, MoEs, quantization, caching and new methods will reduce COGS for generative models, we think it is obvious that AI workloads and AI-related computational requirements will suffer from Jevons Paradox. Jevons Paradox occurs “when technological progress increases the efficiency with which a resource is used (reducing the amount necessary for any one use), but the falling cost of use induces increases in demand enough that resource use is increased, rather than reduced.” Jevons Paradox is a constant in technology markets, and we view the relationship between AI and compute as no different over long periods.
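
    The mechanism behind Jevons Paradox can be illustrated with a toy demand model. The sketch below is an assumption-laden illustration, not an estimate for AI markets: it assumes constant-elasticity demand, that cost per use falls in proportion to the efficiency gain, and the function name and elasticity values are hypothetical.

```python
def resource_use_multiplier(efficiency_gain: float, elasticity: float) -> float:
    """Change in total resource use after an efficiency gain.

    Assumes cost per use falls by the efficiency gain and that the number of
    uses demanded is proportional to cost_per_use ** (-elasticity).
    """
    demand_multiplier = efficiency_gain ** elasticity  # more uses at lower cost
    return demand_multiplier / efficiency_gain         # each use needs less resource


if __name__ == "__main__":
    for elasticity in (0.5, 1.0, 1.5):
        change = resource_use_multiplier(efficiency_gain=2.0, elasticity=elasticity)
        print(f"elasticity {elasticity}: resource use changes by {change:.2f}x")
```

    In this model, a 2x efficiency gain reduces total resource use only when demand elasticity is below 1. Once elasticity exceeds 1, total use rises despite the efficiency gain, which is the paradox.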

    Market Sizing Estimates

    Driven by artificial intelligence and generative AI, but also by other rapidly growing and compute-intensive workloads like AR/VR rendering, the GPU market is expected to grow at a 32% CAGR to over $200B in five years. AMD’s estimates are even higher: the company sees a $400B market for GPU chips by 2027 (see here, here, and here for sources and more details).
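
    For readers who want to unpack the forecast, the sketch below backs out the starting market size implied by a 32% CAGR and a $200B endpoint. It treats $200B as the exact endpoint (the text says "over $200B"), and the function name is hypothetical.

```python
def implied_base(target: float, cagr: float, years: int) -> float:
    """Starting value implied by an end value, a compound annual growth rate, and a horizon."""
    return target / (1 + cagr) ** years


if __name__ == "__main__":
    growth_multiple = (1 + 0.32) ** 5
    base_usd_b = implied_base(target=200.0, cagr=0.32, years=5)
    print(f"five-year growth multiple: {growth_multiple:.2f}x")
    print(f"implied starting market: ${base_usd_b:.1f}B")
```

    A 32% CAGR compounds to roughly a 4x increase over five years, implying a starting market of about $50B.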

    Compute Market Inefficiencies

    Unfortunately for builders, startups, and enterprises, the market for compute resources today is characterized by a concentration of power, unfair business practices and favoritism that impede time to market, scarcity of supply, and crippling costs.

    New Supply Is Limited

    In elastic markets, new supply is easily brought on in response to rising demand. But semiconductor supply is highly inelastic. It generally takes around $10-20B and 2-4 years to build a new leading-edge semiconductor fabrication plant (or "fab"), greatly limiting how quickly the industry can scale up compute resources over short timeframes. Skilled labor to build and operate leading-edge fabs is in short supply. In the US in particular, the ability to construct fabs is greatly limited by environmental red tape and a lack of talent (see here).

    At the leading edge, it’s well understood that Nvidia is taking full advantage of its dominant position in chip design. For example, the company forbids the use of its consumer graphics cards in data centers, forcing data centers to purchase more expensive models for the public cloud. The company is also favoring strategic partners with H100 allocations and is even favoring particular end-use cases over others. Having the cash for an order isn’t enough. Nvidia wants to know what you’ll use their product for and what’s in it for them.

    Centralized Ownership

    The supply of available leading-edge semiconductors is highly centralized in the data centers of just a few key players. Much of the supply of A100s, the leading chip for AI inference, is privately owned by Meta and Tesla, outside of the public cloud. The distribution of H100s, the most performant AI accelerator for model training, is even more skewed, with the bulk of publicly known H100s concentrated in the hands of a few players, and with Meta privately securing more H100s than the top cloud providers combined.

    Even in the public cloud, in today’s supply-constrained environment, it’s becoming increasingly common for cloud service providers (CSPs) to force startups and enterprises to provide business plans around their use of the compute hardware they hope to rent. Startups are even finding it worthwhile to essentially trade equity for Azure and AWS clusters, further centralizing the technology industry.

    The available supply of compute in the public cloud is largely concentrated in the US. As geopolitical tensions over semiconductors have risen (more on this below), KYC for the public cloud has become increasingly common, extending time to market for those lucky enough to secure GPU access. While not required today, it’s likely the US will propose stringent KYC requirements on CSPs over the coming months and years, impacting the majority of data centers globally, potentially further tightening access to compute and further lengthening time to market for startups and enterprises. 

    Datacenters in US.png

    Concentrated Supply Chain & Geopolitical Risks

    Leading-edge semiconductor production is concentrated in only three firms that have the technological capabilities and expertise to produce leading-edge chips profitably and at scale. ASML is responsible for (literally) 100% of the world’s extreme ultraviolet lithography (EUV) machines, which are responsible for 100% of production at the 7nm node and smaller. Most importantly, Taiwan Semiconductor Manufacturing Company (TSMC) produces 92% of the world’s advanced logic chips - defined as the 10nm node or smaller.

    The geopolitical consequences and risks to TSMC cannot be overstated. China has stepped up its rhetoric regarding China-Taiwan reunification and is increasingly engaged in aggressive military action toward the country. Put bluntly, a seizing of TSMC fabs by China could exacerbate the leading-edge supply shortage, with potentially perilous, though truly unforecastable, implications for global tech startups and enterprises, global GDP, and geopolitical stability.

    These market dynamics of a concentrated supply chain, geopolitical risks, inelastic supply, and centralized ownership of leading-edge compute in the public cloud have resulted in burdensome market inefficiencies. 

    • KYC requirements and slow time to market - Cloud providers are known to pick winners and losers when it comes to GPU allocation. KYC is required and could take weeks. In some cases, startups are required to submit a business plan, which adds time and effort to the process, slowing time to market in a highly competitive space. 
    • Staggered deployments - Larger enterprises might find it useful to build their own data centers. Here, when interfacing directly with Nvidia, a 5000 GPU order might be delivered in three blocks of GPUs over several quarters, if you can get an allocation at all.
    • Limited configurations - Traditional cloud providers often have rigid and predefined cluster configurations, limiting flexibility in terms of model types, networking, and storage options. These limitations can make it challenging to optimize clusters for specific workloads or to achieve the desired performance levels, impacting margins. 
    • High costs - It’s estimated that training a state of the art foundation model requires $50-100 million in compute costs, and that number will grow. Leading foundation model startup Anthropic spends over half of its revenue on cloud computing costs, with gross margins much lower than traditional software companies.

    Decentralized Compute Networks

    Crypto protocols create incentives that coordinate human efforts at a global scale - this is their defining feature. This model essentially began with Bitcoin but was then adopted by Helium, where the protocol used Bitcoin-like token incentives not to validate blocks and secure a network, but to incentivize the global construction and upkeep of a decentralized telecommunications network. We call these networks DePINs - or decentralized physical infrastructure networks.

    Compute DePINs have risen to the occasion, building systems to solve the accessibility bottleneck in compute markets. These crypto-native protocols use token incentives and blockchain rails to bootstrap the provision of compute, matching owners of compute resources with willing purchasers, like an Airbnb or Uber for CPUs and GPUs. The decentralized compute space began with Golem in 2016 and has grown to include projects like Akash, Render, io.net, Fluence, Nosana, Kuzco, Aethir and others.

    These models are primarily focused on sourcing latent compute. Despite the shortage of high quality compute in the public cloud, global compute resources are actually underutilized in aggregate. Much of this untapped compute sits concentrated in a few key areas and out of reach of the public.

    • Consumer graphics cards - Consumer workstations used for graphics and gaming represent a large portion of latent compute globally. For example, we estimate there are ~200 million latent consumer graphics cards available for Compute DePINs globally, versus just 6 million high-end GPUs in the public cloud. 
    • PoW miners - Proof-of-work miners have sat on underutilized hardware for years as Bitcoin’s hash rate has marched higher and made certain chips uncompetitive for mining BTC. The halving will accelerate this trend, where Bitcoin miners will find that the ROI of leasing their GPUs for AI jobs exceeds that of BTC mining today. 
    • Filecoin miners - Filecoin miners typically have high-end machines to provide storage services, but greatly underutilize their compute resources. In the past, Filecoin miners used highly performant CPUs to perform sealing - the process of packaging data into a secure and verifiable format before storing it on the Filecoin network. With the rise of sealing-as-a-service, these CPUs are now underutilized. 
    • Over provisioned data center chips - Despite the high demand for compute in the public cloud, compute resources often sit idle in a number of edge scenarios. Enterprises that have acquired compute in long-term contracts do not fully utilize their compute at all times and have downtime between jobs and after jobs are completed. The data centers and their customers routinely over provision compute resources to prepare for abnormal spikes in demand. Trailing edge chips also sit idle when private data centers reallocate workloads to leading edge chips. 
    • Independent & international data centers - Tier 1 and 2 data centers are often overlooked sources of compute. International data centers lack the distribution channels to compete with large CSPs.

    The Compute DePIN Stack

    There are three layers in the decentralized compute stack: the bare metal, orchestration, and aggregation layers.

    The bare metal layer refers to the physical hardware infrastructure that forms the foundation of the decentralized compute stack. Bare metal layer DePINs focus simply on bootstrapping the raw computational resources and making them accessible via API. We view Filecoin miners as a bare metal pure play because the protocol is amassing compute resources without orchestration geared toward compute workloads. 

    The orchestration layer is responsible for managing, coordinating, and automating the deployment, scheduling, scaling, operation, load balancing, and fault tolerance of the bare metal layer. This layer abstracts away the complexity of managing the underlying hardware and provides a higher-level UI for end-users. Orchestration layer projects today are forming around specific end-markets (more on this below). 

    The aggregation layer sits at the top of the decentralized compute stack and provides a unified interface for users to access and orchestrate resources from multiple Compute DePINs. Instead of serving a narrow set of workloads, aggregation layers will provide UI, orchestration and monitoring tools, and hardware configurations for a wide array of workloads, from AI jobs to rendering, pixel streaming, provenance, and zkML. We view the aggregation layer as a unified UI, orchestration, and distribution service for many Compute DePINs.

    In practice today, Compute DePINs operate at different layers of the stack, and often at more than one at a time, with varying degrees of service at each layer and geared toward different workloads. For example, the Render Network has structured its token to incentivize the provision of a bare metal layer of RTX devices, but its Open Compute Client program makes it possible to access Render Network resources by way of community vote, opening it to use by the aggregation layer. Clients that build orchestration services on top of the Render Network and leverage its bare metal resources include Otoy’s OctaneRender for rendering jobs, with projects like Prime Intellect and io.net in the process of onboarding Render nodes for AI workloads. io.net is positioning itself across the stack, using its token to incentivize bare metal to join the io.net network, and building an orchestration and networking layer that straddles Render, Filecoin, Aethir and io.net nodes (the implementations across these networks today are at varying stages of development). With plans for bare metal nodes, an orchestration service for those nodes, and a roadmap of serving multiple use cases, including an IO Models Store, serverless inference, and cloud gaming, io.net’s roadmap is an ambitious attempt at securing a position as the aggregation layer for Compute DePINs.

    compute depin stack.png

    Hardware & Latency Requirements Drive Workloads and Target Markets

    The market for compute is highly heterogeneous. Different applications and use cases have very different hardware specifications. Key dimensions of heterogeneity include:

    • Parallelism v. serial dependencies - Some workloads are highly parallel and can easily be split across many cores and nodes (e.g. rendering frames for animations). These jobs are best done on GPUs. Other jobs have complex dependencies that limit parallelization, like computational fluid dynamics or environmental simulations, which favor serial processing. Clusters of CPUs are more suitable here. 
    • Memory intensity - Certain workloads like in-memory databases or graph analytics are extremely memory intensive and sensitive to memory bandwidth and latency. Others are much less so.
    • I/O intensity - Workloads like web serving, transactional databases, and streaming analytics can be very I/O intensive and require fast storage and networking provided by the supply side.
    • Specialization - Some emerging workloads benefit greatly from specialized hardware. Deep learning is a well known example, with the proliferation of specialized AI accelerators with tensor cores. Certain encryption methods may also use specialized hardware. Bitcoin ASICs are an extreme example, where these chips can only run one calculation - SHA-256 hashes.
    • Data sovereignty - Regulations or corporate policies may restrict where certain data can be processed or stored. Geographic location of the compute is critical here. 
    • Latency sensitivity - Interactive and real-time workloads can be very latency sensitive, while offline batching and other precompute or preprocessing jobs may be relatively insensitive to latency.

    The confluence of varying hardware specifications and latency requirements implies a fragmented market across different chips and orchestration software, and increased complexity (infra, hiring, GTM, customer acquisition, customer support and service, etc.) and subsequent execution risk for Compute DePINs as they get more general and tackle more end markets. To simplify this dynamic, we view target customers for Compute DePINs at the intersection of hardware specialization requirements and end-user latency profile, and the risk profile of a Compute DePIN partially as a function of generality (i.e. the potential for Jack of all trades -> Master of none).

    Hardware specialization. We’re seeing Compute DePINs focus on coordinating narrow use cases to specific hardware configurations today. For example, rendering workloads are actually more suitable for consumer GPUs, and less suitable for (technically more performant) data center GPUs. While A100s are much more powerful than RTX 4090s from a FLOPS perspective, 4090s are much more performant for rendering workloads like ray tracing, because they have specialized hardware like RT cores that are optimized for the class of calculations used in ray tracing jobs. As such, the Render Network has positioned itself to amass a supply of RTX 3090s and 4090s from graphics artists, whereas io.net and Akash are today focused on latent A100s from data centers geared towards AI workloads to onboard initial customers like AI startups.

    Latency profile. Compute DePIN markets will also form around latency profiles, where end-use cases with high latency sensitivity will require leading-edge, colocated hardware with enterprise-grade networking contributed by professional data center operators like miners and IDCs, while latency-insensitive applications, like academic workloads and some 3D rendering, will gravitate toward trailing-edge hardware, potentially maintained by small firms and even hobbyists. Given that leading-edge, colocated semiconductors are unlikely to be the primary configurations offered by most Compute DePINs, we view the limited supply of these configs that Compute DePINs are able to acquire as a challenging but necessary customer acquisition tool to bridge customers from centralized compute to decentralized compute.
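
    One way to picture the two dimensions above is as a simple routing rule from workload profile to supply type. The sketch below is a deliberately crude illustration of that framing, not how any Compute DePIN actually schedules jobs: the `Workload` fields, the `suggest_supply` function, and the example workloads are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Latency(Enum):
    SENSITIVE = "sensitive"
    INSENSITIVE = "insensitive"


@dataclass(frozen=True)
class Workload:
    name: str
    latency: Latency
    needs_rt_cores: bool = False  # e.g. ray-traced rendering jobs


def suggest_supply(workload: Workload) -> str:
    """Map a workload to the supply category the text associates with it."""
    if workload.latency is Latency.SENSITIVE:
        # Latency-sensitive jobs need colocation and enterprise-grade networking.
        return "colocated leading-edge hardware from professional operators"
    if workload.needs_rt_cores:
        # Rendering favors consumer cards with RT cores over data center GPUs.
        return "consumer RTX GPUs"
    # Latency-insensitive jobs can run on distributed, older hardware.
    return "trailing-edge hardware from small firms or hobbyists"


if __name__ == "__main__":
    jobs = [
        Workload("consumer chat inference", Latency.SENSITIVE),
        Workload("offline 3D render", Latency.INSENSITIVE, needs_rt_cores=True),
        Workload("academic batch job", Latency.INSENSITIVE),
    ]
    for job in jobs:
        print(f"{job.name}: {suggest_supply(job)}")
```

    Real placement decisions would also weigh the other dimensions listed earlier, such as memory intensity, I/O intensity, and data sovereignty.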

    Some example end use cases at the intersection of latency sensitivity and hardware specialization are summarized below. The bold purple text highlights some workloads that we think are the most suitable initial target markets for Compute DePINs. Use cases in light purple text are suitable as well, but less so given their commodity nature. We generally believe that, while highly latency-sensitive workloads are technically possible on Compute DePINs that source compute from IDCs with colocated leading-edge hardware, a perceived lack of uptime guarantees or lack of concurrent maintenance, coupled with the nascent nature of these systems, will make latency-sensitive customers a tough sell, at least initially.

    hardware and latency profile.png

    Value Accrual in Decentralized Compute Stack

    On the condition that a Compute DePIN can overcome the high complexity and risk associated with aggregating and serving heterogeneous compute jobs, we believe the aggregation layer has the highest payoff potential in the stack. This layer benefits from:

    1. Network effects. Aggregators benefit from self-reinforcing cycles where demand drives more supply, which in turn drives more demand. As this self-reinforcing cycle plays out, switching costs increase and user and supplier lock-in develops.
    2. Ownership of the end user relationship. Organizations closer to the end user generally benefit from higher margins and a higher frequency of valuable user data. That data drives tighter product feedback loops, which are critical to staying ahead of the competition and will be critical for Compute DePINs competing with the hyperscalers.
    3. Complementary services. The lock-in associated with network effects allows aggregators to offer complementary services at a greater success rate than otherwise possible.

    The middle layer, orchestration, will capture the second-most value by providing critical coordination and management of resources, leading to switching costs and lock-in effects for customers who rely on their services in narrow markets. However, the orchestration layer may face competition and have less direct control over the end-user relationship compared to aggregators. Picking an orchestration layer protocol necessitates finding a narrow end-market where the DePIN has a wedge. One such area is 3D rendering, where a) AR/VR is set to drive a step function change in compute demand for render workloads, b) CSPs have deprioritized distributed rendering in favor of AI workloads, and c) many latent RTX series chips are sitting idle earning zero ROI, and are prime candidates to contribute supply to a DePIN.

    Finally, the bare metal layer will likely accrue the least value among the three layers. Margins are generally lower here due to limited services and less value provided to the end user. 

    Strengths, Weaknesses, Markets

    Compute DePINs have a number of inherent advantages versus incumbent CSPs. 

    Pricing advantage. A compute DePIN’s asset-lite model coupled with an advantage in sourcing latent supply confers a pricing and margin advantage. First, the required ROI on latent resources is much lower than otherwise, especially for consumer cards that generate zero ROI. Second, in contrast to the CSPs, there is no capex, less overhead, and less people management for a Compute DePIN. 

    Customizability. Specifically at the aggregation layer, end-users have a high degree of flexibility in their configurations.

    Instant accessibility & censorship resistance. Compute DePINs at all layers of the stack benefit greatly from their near-frictionless onboarding. In contrast to the CSPs, customers have the potential to permissionlessly aggregate compute resources in only minutes (see here for an example from io.net and here for Akash Network). As KYC requirements globally grow more arduous, there is a regulatory arbitrage here for projects who choose to optimize for this, as well as the selling point that your cluster will never be shut down at the whims of a centralized entity. 

    Net new features. The highest incremental value Compute DePINs can provide is net new features for customers. We view heterogeneous GPU clustering as a high impact net new feature that could drive significant customer acquisition for Compute DePINs. If feasible, heterogeneous clustering is a completely new unlock for cluster jobs, potentially enabling very large clusters that could rival and even exceed CSP scale.

    Latency, ecosystem, tooling, & privacy disadvantages. Compute DePINs are largely building their stacks from the ground up, and could be orchestrating workloads over internet connections with varying speeds, and in the case of some DePINs like io.net, on globally distributed devices. Moreover, even when devices can be colocated via data center partnerships, it is unlikely that leading edge chips will be the primary offering of Compute DePINs, given the supply/demand and centralization dynamics explained above. Compute DePINs are therefore challenged by latency in many cases. 

    Compute DePINs also lack the familiarity and ease of the cloud, the distribution channels of the CSPs, and, most importantly, the supporting cloud ecosystem infrastructure, services, and tooling. By taking your process flows out of the cloud, you lose the supporting CSP ecosystem resources, e.g. telemetry and resource monitoring applications like CloudWatch. It’s this supporting ecosystem and tooling that has generated such a high degree of developer and enterprise lock-in, lock-in that has been compounding over the last decade.

    Privacy is another crucial issue. Compute jobs whose inputs contain customer PII, or that involve regulated data that must not leave certain geographies, limit the ability of Compute DePINs to serve certain classes of customers. Compute DePINs will, of course, encrypt data and send packets via mesh VPNs. Even so, at best, the prospect of running compute jobs on an unknown individual's hardware leads to longer sales cycles and more customer education for traditional businesses. Compute DePINs like io.net are countering these fears by actively developing solutions to these regulatory challenges, such as advanced encryption methods and compliant operational protocols, by partnering with SOC 2 or HIPAA-compliant data centers, and by positioning their service as a white label UI layer for orchestration.

    We believe the dynamics above favor the following end-markets. 

    Crypto-native customers. Obvious first customers for Compute DePINs are crypto-native developers, startups, and even DAOs. Here, the education phase of customer acquisition is likely far shorter than for non-crypto developers and CTOs, the sales cycles are shorter, there is less handholding, and crypto-native developers and DAOs may favor decentralized tooling ideologically. Moreover, there will be many shots on goal from crypto x AI startups in zkML, AI agents, and gaming. Compute DePINs are a natural fit for serving open source AI models for these startups. 

    Early stage customers. Compute DePINs are best positioned for early stage tech companies with no prior relationship or meaningful lock-in with a CSP. 

    Academics & researchers. Since the ChatGPT moment, academia has taken a back seat to private enterprise in terms of AI research. Enterprises and startups are far better capitalized and will likely outbid most researchers for leading-edge compute. Academics and researchers typically perform workloads that are latency insensitive, however, meaning many of their workloads are actually better suited for globally distributed consumer grade chips, like higher end RTX series chips, and have less of a need for colocation than, say, a consumer app. We believe that this dynamic is responsible for the surge in RTX 3090s used in academic research today. Here, Compute DePINs have a large inherent advantage against the hyperscalers. 

    [Figure: Cited Nvidia chips]

    Variable load service for independent and private data centers. Compute DePINs can position themselves as a variable load service that supplements the base load of compute provided by private and independent data centers (IDCs). This way, instead of competing directly with traditional CSPs, Compute DePINs can complement existing infrastructure, then gradually integrate and scale up their services within the broader compute ecosystem. Whether startups and enterprises will use DePINs as variable load themselves or access DePINs via their cloud provider remains to be seen. We’ve heard conflicting views on the ground regarding the IDCs’ willingness to integrate DePIN APIs into their offerings. 

    Consumer card call option. The efficacy of consumer cards is mixed. They’re suitable for latency-insensitive jobs, academic work, and rendering, but are not suitable for compute- and memory-intensive jobs, like AI training and some LLM inferencing. However, if models get smaller and more efficient, or if engineers are forced to optimize model serving for trailing-edge hardware, demand for consumer cards would rise. We believe consumer cards represent call options on these developments. 

    Synthetic data pipelines. It’s our view that Compute DePINs may possess an advantage in synthetic data generation versus data purchase deals due to the low cost of inference for synthetic data generation on a cluster of consumer hardware like RTX 3090s and 4090s. For example, paying Reddit for its data to use in LLM training will cost $60 million. In contrast, it would take a cluster of RTX 3090s four months and only $5 million in leasing costs to generate 2 trillion tokens at current prices on a decentralized network (link). Synthetic data generation is also far less dependent on cloud tooling and far less latency sensitive than other workloads, which plays to the inherent advantages of Compute DePINs. Synthetic data today can be useful across various application domains such as computer vision, speech, healthcare, and business. The primary benefits of synthetic data are that it addresses the data quality, scarcity, bias, and privacy challenges of traditional human-generated data, while showing similar efficacy in training models. With growing concern that we may run out of data as early as 2026, synthetic data may be the only path forward. 
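
    As a rough sanity check on the synthetic data figures above, the short sketch below backs out what the $5 million / four month / 2 trillion token estimate implies per GPU. The hourly rate is an assumption borrowed from the RTX 3090 price on io.net quoted elsewhere in this report, and the hours-per-month constant is a simple average; the linked estimate may rest on different inputs, so treat the outputs as illustrative only.

```python
# Back-of-the-envelope check of the synthetic data estimate.
# Inputs taken from the text: $5M leasing cost, 4 months, 2 trillion tokens.
LEASING_BUDGET_USD = 5_000_000
TOKENS_TARGET = 2_000_000_000_000
DURATION_MONTHS = 4

# Assumptions (not stated alongside the estimate itself):
PRICE_PER_GPU_HOUR_USD = 0.36   # RTX 3090 rate quoted elsewhere in the report
HOURS_PER_MONTH = 8760 / 12     # average month length in hours

# Total GPU-hours the budget buys at the assumed rate.
gpu_hours = LEASING_BUDGET_USD / PRICE_PER_GPU_HOUR_USD

# Cluster size needed to spend those GPU-hours within the stated duration.
wall_clock_hours = DURATION_MONTHS * HOURS_PER_MONTH
implied_gpus = gpu_hours / wall_clock_hours

# Throughput each GPU must sustain, and the resulting unit cost.
implied_tokens_per_gpu_second = TOKENS_TARGET / (gpu_hours * 3600)
cost_per_million_tokens_usd = LEASING_BUDGET_USD / (TOKENS_TARGET / 1_000_000)

print(f"GPU-hours purchased:       {gpu_hours:,.0f}")
print(f"Implied cluster size:      {implied_gpus:,.0f} GPUs")
print(f"Implied throughput:        {implied_tokens_per_gpu_second:.1f} tokens/s per GPU")
print(f"Cost per million tokens:   ${cost_per_million_tokens_usd:.2f}")
```

    Under these assumptions the estimate corresponds to a cluster of several thousand cards, each sustaining on the order of tens of tokens per second. Whether that throughput is achievable depends on the model and serving setup, which the figures above do not specify.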

    Other Considerations

    Geopolitical hedge

    In light of the political dynamics surrounding TSMC described above and the general fragility of modern semiconductor supply chains, we believe Compute DePINs are a strong hedge against these instabilities, as enterprises could be forced to turn to Compute DePINs in the face of a major supply chain disruption. 

    “DePIN-Fi”

    GPUs are assets that generate return, and any asset that generates return can be collateralized in DeFi. Compute DePINs sit close to decentralized financial markets, which makes this a natural extension. Take, for example, a single A100 GPU purchased at the market rate of $9500. At 1.00 USD/hour and a 50% utilization rate over a 5 year lifecycle, the GPU generates an IRR of 36%. A consumer-grade RTX 3090 purchased for $1748 and rented at 0.36 USD/hour (current price on io.net) yields an 86% IRR under the same assumptions, for a device that was previously idle. Under these IRR assumptions, with (best case) millions of chips earning verifiable income on chain, we can conceive of capital markets where hardware income streams are borrowed against, or tranches of GPUs at various specifications are bundled into financial products and sold to investors. We expect DePIN-Fi specialized protocols to develop around these opportunities, given the specialized knowledge it takes to underwrite financing opportunities in hardware. We’re seeing the early innings of this trend today with DeBunker built on io.net.
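
    The IRR figures above can be reproduced with a minimal sketch. It assumes rental income arrives as equal end-of-year cash flows and ignores power, hosting, network fees, and resale value; none of these assumptions are spelled out in the figures themselves. The function names are illustrative, and the IRR is found by simple bisection rather than with a finance library.

```python
HOURS_PER_YEAR = 8760

def npv(rate, cashflows):
    """Net present value of yearly cash flows, with cashflows[0] at time zero."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

def irr(cashflows, lo=0.0, hi=10.0, tol=1e-9):
    """Find the rate where NPV is zero by bisection.

    Assumes NPV is positive at `lo` and negative at `hi`, which holds for
    one upfront outlay followed by positive income.
    """
    if npv(lo, cashflows) <= 0 or npv(hi, cashflows) >= 0:
        raise ValueError("IRR is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if npv(mid, cashflows) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def gpu_irr(purchase_price, hourly_rate, utilization=0.5, years=5):
    """IRR of buying a GPU and renting it out (simplified: no operating costs)."""
    annual_income = hourly_rate * utilization * HOURS_PER_YEAR
    return irr([-purchase_price] + [annual_income] * years)

a100_irr = gpu_irr(purchase_price=9500, hourly_rate=1.00)
rtx3090_irr = gpu_irr(purchase_price=1748, hourly_rate=0.36)

print(f"A100 IRR:     {a100_irr:.0%}")
print(f"RTX 3090 IRR: {rtx3090_irr:.0%}")
```

    With these simplifications the sketch returns roughly 36% for the A100 and 86% for the RTX 3090, in line with the figures above. Adding electricity or hosting costs, or lowering utilization, would reduce both numbers, which is the kind of underwriting detail a DePIN-Fi protocol would need to model.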

    Risks

    Compute DePINs face several significant risks and challenges that could impact their adoption and growth. They currently face latency and performance limitations due to their reliance on orchestrating workloads over internet connections and globally distributed devices, which may restrict adoption for latency-sensitive applications. The lack of mature ecosystems, familiar tooling, and supporting infrastructure compared to established CSPs could impede adoption and increase customer education and support costs. The revealed preference of startups could be to wait for cloud allocations rather than deal with the pain of managing their own infra. 

    Furthermore, privacy and regulatory concerns around running compute jobs on decentralized networks, especially those involving sensitive data or personally identifiable information (PII), may limit the addressable market and lead to longer sales cycles. Compute DePINs also face intense competition from hyperscalers and established CSPs who have deep pockets, extensive infrastructure, and entrenched customer relationships, putting pressure on their economics. The rapid pace of innovation in AI creates technological uncertainty, requiring projects to continually adapt their strategies and technology choices to remain competitive. 

    Token economics and governance risks, such as poorly designed incentives or centralization of token ownership, could undermine network security, stability, and growth. There is no tried and true token model for DePINs today, and therefore token models must be flexible. Additionally, the evolving regulatory landscape for crypto and decentralized networks may increase compliance costs and limit the flexibility and censorship-resistance of Compute DePINs.

    Last, DePIN projects today are at varying stages of maturity. Projects like Akash and Render are generating revenues onchain (see here and here). Projects like Fluence and Gensyn are not yet fully live. io.net’s plans are highly ambitious: it hopes to train and serve AI models on geographically distributed hardware, a difficult technical challenge on which public research has shown mixed results.

    Final Thoughts

    The rise of AI has exposed significant inefficiencies and bottlenecks in the current centralized cloud computing market. Compute DePINs can emerge as a potential solution to these challenges, leveraging decentralized networks and token incentives to source and allocate compute resources more efficiently and cost-effectively. 

    While Compute DePINs offer several advantages, such as pricing benefits, customizability, and instant accessibility, they also face significant hurdles in terms of latency, ecosystem maturity, tooling, and privacy concerns. The fragmented nature of the market, driven by diverse hardware and latency requirements, suggests that Compute DePINs will likely focus on specific niches in their early days, with the highest value outcomes likely at the aggregator layer if these teams can execute. 

    This research report has been funded by io.net. By providing this disclosure, we aim to ensure that the research reported in this document is conducted with objectivity and transparency. Blockworks Research makes the following disclosures: 1) Research Funding: The research reported in this document has been funded by io.net. The sponsor may have input on the content of the report, but Blockworks Research maintains editorial control over the final report to retain data accuracy and objectivity. All published reports by Blockworks Research are reviewed by internal independent parties to prevent bias. 2) Researchers submit financial conflict of interest (FCOI) disclosures on a monthly basis that are reviewed by appropriate internal parties. Readers are advised to conduct their own independent research and seek the advice of a qualified financial advisor before making any investment decisions.