Breaking out of the brutal battle of training large models#
- Training large models is upstream production: "intellectual/data-intensive research + training-compute-intensive R&D engineering"
- Applying large models is downstream consumption: "domain-knowledge-intensive fine-tuning + inference-compute-intensive operational engineering"
Stepping out of the "building large models" battle, there is, beyond vertical-industry large models, another opportunity that scales: consumer-facing AI agents, the highway of AI agents
The distributed inference compute network is the "logistics" network for deploying large models#
Treat the large model as "goods": goods = model + compute + bandwidth + storage, and every element is indispensable. Breaking the elements down (a code sketch pulling them back together follows the third element):
1. Model#
This covers centralized large models on the public network, privately deployed enterprise models, and "small" edge models
- Centralized large models: suited to integrated hardware/software design, even proprietary architectures, which cuts inference compute demand, speeds up inference for fast responses, and covers operating costs through scale
- Enterprise models: suited to deployment on local networks, where data privacy requirements are strict
- Edge models: deployed first on handheld smart devices and home compute hubs
2. Compute#
The main concern here is inference compute, where cost weighs heaviest
- The model layer will abstract away differences in compute architectures, whether high-end accelerator cards or general-purpose CPUs. There will be a market for converting "traditional compute centers", and enterprises that have invested heavily in data centers will have strong demand for it
- Given the structure of the network itself, the closer the compute sits to the user, the faster the response. Distributed and edge compute therefore hold an advantage, and PCDN players have a chance to build a second growth curve (a routing sketch follows)
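As a minimal sketch of this proximity argument (the node names and figures below are hypothetical), an inference router could simply pick the lowest-latency node that still has spare capacity:

```rust
/// A candidate inference node in the distributed network.
struct Node {
    name: &'static str,
    latency_ms: f32,  // round-trip latency from the consumer
    free_tflops: f32, // spare inference capacity
}

/// Pick the lowest-latency node that can still serve the request.
fn pick_node(nodes: &[Node], required_tflops: f32) -> Option<&Node> {
    nodes
        .iter()
        .filter(|n| n.free_tflops >= required_tflops)
        .min_by(|a, b| a.latency_ms.total_cmp(&b.latency_ms))
}

fn main() {
    // Hypothetical choices: a distant data center vs. a nearby PCDN edge box.
    let nodes = [
        Node { name: "central-dc", latency_ms: 80.0, free_tflops: 500.0 },
        Node { name: "edge-pcdn", latency_ms: 8.0, free_tflops: 20.0 },
    ];
    if let Some(n) = pick_node(&nodes, 10.0) {
        println!("route request to {} ({} ms away)", n.name, n.latency_ms);
    }
}
```

Under this rule the nearby edge box wins whenever it has capacity to spare, which is exactly the advantage the PCDN players would be selling.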
3. Bandwidth + Storage#
Bandwidth and storage can, with some simplification, be treated together
- AI applications such as IoT, connected vehicles, and industrial intelligence will produce long token streams with high demands on both bandwidth and storage; they will need models deployed at the front line, close to where the data is generated
- Base stations may become new AI inference centers, raising the weight of telecom operators. AI agent operators need not invest in developing large models themselves; they can concentrate resources on operating them instead
- Each time a large model is iterated, the update must be distributed to the edge quickly
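Pulling the three elements back together, here is a minimal sketch of the "goods" decomposition (the field names and the deliverability rule are illustrative assumptions, not a real spec):

```rust
/// Where a model is deployed: the three tiers described above.
#[derive(Debug)]
enum ModelTier {
    Centralized, // public-network giants, hardware/software co-design
    Enterprise,  // LAN deployment, strict data privacy
    Edge,        // handheld devices, home hubs, base stations
}

/// goods = model + compute + bandwidth + storage
#[derive(Debug)]
struct Goods {
    tier: ModelTier,
    compute_tflops: f32, // inference compute on hand
    bandwidth_mbps: f32, // link toward consumers
    storage_gb: f32,     // room for weights, contexts, caches
}

impl Goods {
    /// All elements are indispensable: if any one is missing,
    /// the goods cannot be "delivered" to a consumer.
    fn deliverable(&self) -> bool {
        self.compute_tflops > 0.0 && self.bandwidth_mbps > 0.0 && self.storage_gb > 0.0
    }
}

fn main() {
    let edge = Goods {
        tier: ModelTier::Edge,
        compute_tflops: 20.0,
        bandwidth_mbps: 100.0,
        storage_gb: 64.0,
    };
    println!("{:?}, deliverable: {}", edge.tier, edge.deliverable());
}
```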
Benefit distribution in the AI highway logistics network, deeply integrated with web3#
The roles taking part in the large-model era:
- Model makers: a few core giants with independently branded large models, both open and closed source
- Model OEMs: vendors that fine-tune and repackage the handful of base large models
- IaaS resource providers: data centers, small and mid-sized campus facilities, operators of personal compute centers; the players supplying compute, bandwidth, and storage
- Model curators: those who promote model applications, including KOLs, livestreamers, media, and so on
- Model weavers: developers who stitch different models together to accomplish specific tasks; they may partially overlap with curators
When these players come together to serve consumers and earn from what consumers pay, a fair and stable benefit-distribution mechanism is needed to keep the game running.
On the web3 side, operate a public chain or a layer 2
- Let the different roles set the allocation ratio dynamically through bidding, with an "AI coin" as the network's underlying currency (a toy sketch follows this list)
- Roles can also invest their own resources to buy promotional opportunities, e.g., marketing spend by large-model makers
- Even an individual can profit by deploying a model at home and serving the neighbors
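As a toy sketch of bid-driven allocation (the roles mirror the list above; the stakes and the proportional rule are assumptions for illustration, not a protocol spec): each role stakes "AI coin" in a round, and a consumer payment is split in proportion to the stakes.

```rust
/// The network roles listed above.
#[derive(Debug, Clone, Copy)]
enum Role {
    ModelMaker,
    ModelOem,
    IaasProvider,
    Curator,
    Weaver,
}

/// Toy rule: each role's share of a consumer payment is proportional
/// to the "AI coin" it staked in the current bidding round.
fn split_payment(payment: f64, stakes: &[(Role, f64)]) -> Vec<(Role, f64)> {
    let total: f64 = stakes.iter().map(|(_, s)| *s).sum();
    stakes
        .iter()
        .map(|(role, s)| (*role, payment * s / total))
        .collect()
}

fn main() {
    // Hypothetical stakes for one settlement round.
    let stakes = [
        (Role::ModelMaker, 50.0),
        (Role::IaasProvider, 30.0),
        (Role::Curator, 10.0),
        (Role::Weaver, 10.0),
    ];
    for (role, share) in split_payment(100.0, &stakes) {
        println!("{:?} receives {:.1} AI coin", role, share);
    }
}
```

A real network would settle this on-chain per request or per epoch; the point here is only that the ratio is an output of bidding, not a fixed contract.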
AI needs to be containerized#
Inside an AI container sits the orchestration and weaving of APIs: workflows, API URLs, the profit-distribution mechanism, and unique model identifiers. On today's technology stack it would likely be some combination of WebAssembly + Rust + K8s-like software. A possible manifest is sketched below.
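A minimal sketch of what such a container's manifest could carry, with hypothetical field names for the ingredients just listed (workflow, API URLs, profit split, model identifiers):

```rust
/// One step in an agent's workflow: which model to call, at which API URL.
#[derive(Debug)]
struct WorkflowStep {
    model_id: String, // unique identifier of the model
    api_url: String,  // endpoint serving it
}

/// What gets packed into an AI container: the woven workflow,
/// its endpoints, and the profit-distribution rules.
#[derive(Debug)]
struct AgentManifest {
    name: String,
    workflow: Vec<WorkflowStep>,
    revenue_split: Vec<(String, f64)>, // (role, share), echoing the web3 split
}

fn main() {
    let manifest = AgentManifest {
        name: "neighborhood-helper".into(),
        workflow: vec![WorkflowStep {
            model_id: "example-model-v1".into(),
            api_url: "https://edge.example.com/infer".into(),
        }],
        revenue_split: vec![("model_maker".into(), 0.5), ("iaas".into(), 0.5)],
    };
    println!("{:#?}", manifest);
}
```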
Once containerized, AI agents become easier to schedule, distribute, and destroy, because containers smooth over differences in network topology, compute infrastructure, and software, ultimately forming one huge resource pool. Containers can also map onto standard blocks on the blockchain and eventually settle into profits; a lifecycle sketch follows.
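To make that lifecycle concrete, a hedged sketch of a runtime interface for such containers (the trait and its methods are invented for illustration, not an existing K8s or Wasm API):

```rust
/// Hypothetical lifecycle interface for a containerized AI agent.
/// Containers smooth over node differences, so one interface can
/// target edge boxes, base stations, and data centers alike.
trait AgentRuntime {
    fn schedule(&mut self, node: &str);       // place on one node
    fn distribute(&mut self, nodes: &[&str]); // replicate to more nodes
    fn destroy(self) -> f64;                  // tear down and settle profit
}

struct Agent {
    placements: Vec<String>,
    earned: f64, // accumulated "AI coin", settled on destruction
}

impl AgentRuntime for Agent {
    fn schedule(&mut self, node: &str) {
        self.placements.push(node.to_string());
    }
    fn distribute(&mut self, nodes: &[&str]) {
        for n in nodes {
            self.schedule(n);
        }
    }
    fn destroy(self) -> f64 {
        // In the full vision this settlement would land in a block on-chain.
        self.earned
    }
}

fn main() {
    let mut agent = Agent { placements: vec![], earned: 12.5 };
    agent.schedule("edge-pcdn");
    agent.distribute(&["base-station-7", "home-hub-3"]);
    println!("running on {} nodes", agent.placements.len());
    println!("settled {} AI coin", agent.destroy());
}
```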