Are LLMs More Energy Efficient Than Humans?
1. Introduction
Few people would now refuse a capable AI assistant for cognitive work. Large language models (LLMs) are increasingly used not just to answer questions but to carry out extended, multi-step tasks. Running these workflows is energy demanding. A natural question follows: which is more energy efficient at solving a given problem — an LLM or a human?
Adoption of LLMs at scale carries an energy cost that needs to be accounted for in any serious discussion of sustainability. The systems are rapidly becoming more capable, and the cost of compute per task has been growing even faster than the capability gains it buys [1]. If LLMs are substantially less energy efficient than humans, that materially affects the case for scaling LLM-based systems. If they are comparable, or more efficient, the picture changes considerably — not least because LLMs can perform work that is not feasible for humans at the same scale.
LLMs and humans turn out to be within the same order of magnitude in energy efficiency for knowledge work tasks.
More importantly, energy efficiency may not be the most useful lens through which to evaluate LLM adoption — the capability expansion argument can be more consequential.
2. Defining the comparison
Not surprisingly, the hardest part of pulling this piece together was finding actual data on energy consumption and settling on meaningful metrics.
People and companies more deeply involved in the LLM business certainly have more precise figures. However, LLM energy consumption is an emergent property of complex systems, and it is likely treated as commercially sensitive information that providers have little incentive to publish. I therefore restrict myself to publicly available data and settle for order-of-magnitude accuracy in estimating energy consumption for comparable tasks.
Defining a comparable metric for humans is not easy either. Should we include only the energy associated with the task itself, or the broader supply chain that keeps a knowledge worker (human or otherwise) fed, housed, and equipped? As a simple reference point, in most comparisons I use the energy value of an average daily food intake, with reference to total daily per-capita consumption where it matters.
3. How much energy do LLMs consume?
3.1 Tokens
LLMs process text as tokens — common character sequences found in a body of text [2, 3]. A short conversational query might span 100–300 tokens; an extended technical exchange can run to many thousands. Token count is the natural unit for measuring LLM workload, and energy per token the natural efficiency metric.
According to a post by Sam Altman [4], the average query uses about \(0.34\) Wh \(\approx 1.2\) kJ. The number of tokens in a typical query varies widely; assuming a range of 200–1000 gives an energy cost of 1–6 J/token.
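The conversion behind that range can be checked in a few lines. This is a minimal sketch that assumes the 0.34 Wh per-query figure from [4] and the 200–1000 token range assumed above; the function name is illustrative only.

```python
WH_TO_J = 3600.0  # 1 Wh = 3600 J


def joules_per_token(query_energy_wh: float, tokens_per_query: int) -> float:
    """Average energy per token for a query of the given size."""
    return query_energy_wh * WH_TO_J / tokens_per_query


QUERY_ENERGY_WH = 0.34  # per-query energy quoted in [4]

# Assumed range of tokens per query (an assumption of this article, not of [4]).
for tokens in (200, 1000):
    print(f"{tokens:>5} tokens/query: "
          f"{joules_per_token(QUERY_ENERGY_WH, tokens):.1f} J/token")
```

Shorter queries give the higher per-token figure, because the same per-query energy is spread over fewer tokens.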
3.2 Processing throughput
Energy consumption per token depends significantly on model size and hardware utilisation. A 2026 benchmark for the NVIDIA H100 [5] reports peak inference throughput of 3,500–4,000 tokens/s for Llama 70B and 5,000–6,000 tokens/s for Llama 13B.
At 700 W GPU power, peak throughput corresponds to about 0.2 J/token for the 70B model. In practice, processing a million tokens per day occupies the GPU for 2–3 hours once model loading, batching, and queuing overheads are accounted for, which raises the figure to 5–8 J/token. Doubling for facility overhead (cooling, networking, storage) gives a realistic range of 10–15 J/token for a 70B-scale model.
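The steps in this estimate can be sketched as follows. The sketch assumes the GPU draws its full 700 W for the whole time it is occupied and that facility overhead is a flat factor of two, as in the paragraph above; the function and parameter names are hypothetical.

```python
GPU_POWER_W = 700.0        # GPU power used in the text
SECONDS_PER_HOUR = 3600.0


def peak_joules_per_token(power_w: float, tokens_per_s: float) -> float:
    """Energy per token when the GPU runs at peak throughput."""
    return power_w / tokens_per_s


def effective_joules_per_token(power_w: float, busy_hours: float,
                               tokens: float,
                               facility_factor: float = 1.0) -> float:
    """Energy per token when `tokens` occupy the GPU for `busy_hours`.

    Assumes full power draw while busy; `facility_factor` scales the
    result for cooling, networking and storage overhead.
    """
    return power_w * busy_hours * SECONDS_PER_HOUR * facility_factor / tokens


# Peak throughput range reported for the 70B model.
for tps in (3500, 4000):
    print(f"peak at {tps} tok/s: {peak_joules_per_token(GPU_POWER_W, tps):.2f} J/token")

# One million tokens per day occupying the GPU for 2-3 hours.
for hours in (2, 3):
    gpu_only = effective_joules_per_token(GPU_POWER_W, hours, 1e6)
    facility = effective_joules_per_token(GPU_POWER_W, hours, 1e6, facility_factor=2.0)
    print(f"{hours} h busy: {gpu_only:.1f} J/token GPU only, "
          f"{facility:.1f} J/token with facility overhead")
```

The gap between the peak and effective figures comes entirely from utilisation: the same hardware is about 25–40 times less efficient per token when it spends hours on work that peak throughput would finish in minutes.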
That is broadly consistent with the figures reported in the 2025 MIT Technology Review article [6]: approximately 100 J for a "typical" query using Llama 3.1 8B, and approximately 7,000 J for the larger Llama 3.1 405B. For a 100–1,000 token query that corresponds to 0.1–1 J/token and 7–70 J/token respectively.
As a further cross-check, the MELODI framework paper [7] reports energy consumption ranging from \(1\times 10^{-7}\) to \(15\times 10^{-7}\) kWh/token for 2–8 billion parameter models and \(0.9\times 10^{-5}\) to \(1.4\times 10^{-5}\) kWh/token for 70 billion parameter models — corresponding to 0.4–5 J/token and 30–50 J/token respectively.
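The unit conversion for the MELODI figures is simple enough to verify directly. A minimal sketch, using only the kWh/token ranges quoted above; the dictionary labels are illustrative.

```python
KWH_TO_J = 3.6e6  # 1 kWh = 3.6 MJ


def kwh_to_joules(kwh: float) -> float:
    """Convert an energy in kWh to joules."""
    return kwh * KWH_TO_J


# Ranges quoted from [7], in kWh per token.
melodi_kwh_per_token = {
    "2-8B parameter models": (1e-7, 15e-7),
    "70B parameter models": (0.9e-5, 1.4e-5),
}

for label, (low, high) in melodi_kwh_per_token.items():
    print(f"{label}: {kwh_to_joules(low):.1f}-{kwh_to_joules(high):.1f} J/token")
```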
Across these three independent sources, the picture is consistent (Figure 1): energy per token rises from under 5 J/token for small models to several tens of joules per token for frontier-scale models. Since order-of-magnitude accuracy is the aim here, I take 10 J/token as a round working figure for a large frontier-class model and use it in the case studies below.
4. How much energy do humans consume?
A standard daily food intake is approximately 10,000 kJ [8].
A person doing the kind of cognitive work typically delegated to LLMs, however, is not living off a self-sufficient plot of land without electricity. They live in a heated or cooled home, commute, use a computer, and depend on the broader infrastructure of modern life. Per-capita energy consumption varies widely around the world [9]; a reasonable working figure is about 20,000 kWh per capita per year, or approximately 200 MJ per person per day. This is corroborated by the Energy Institute Statistical Review of World Energy 2025 [10], which reports an average world energy supply of about 70 GJ per capita per year (\(\approx\) 190 MJ per person per day).
Total per-capita energy consumption is therefore approximately 20 times higher than food intake alone. This upper bound encompasses everything — heating, transport, the supply chain that feeds us — and presumably includes a small contribution from LLM queries themselves. For the case studies that follow, I use the food-only figure as a lower bound and the full per-capita figure as an upper bound. The truth for any given task falls somewhere between.
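The two human baselines can be reproduced from the figures above. This sketch assumes a 365-day year and uses the 10,000 kJ food intake, the 20,000 kWh working figure, and the 70 GJ figure from [10]; the helper names are illustrative.

```python
KWH_TO_MJ = 3.6      # 1 kWh = 3.6 MJ
GJ_TO_MJ = 1000.0
DAYS_PER_YEAR = 365  # assumption: calendar year, no leap-day correction

FOOD_MJ_PER_DAY = 10.0  # 10,000 kJ daily food intake [8]


def mj_per_day_from_kwh_per_year(kwh_per_year: float) -> float:
    """Daily energy in MJ from an annual figure in kWh."""
    return kwh_per_year * KWH_TO_MJ / DAYS_PER_YEAR


def mj_per_day_from_gj_per_year(gj_per_year: float) -> float:
    """Daily energy in MJ from an annual figure in GJ."""
    return gj_per_year * GJ_TO_MJ / DAYS_PER_YEAR


working_figure = mj_per_day_from_kwh_per_year(20_000)
world_average = mj_per_day_from_gj_per_year(70)

print(f"working figure: {working_figure:.0f} MJ/day")
print(f"world average:  {world_average:.0f} MJ/day")
print(f"per-capita to food ratio: {working_figure / FOOD_MJ_PER_DAY:.1f}")
```

Both annual figures land close to 200 MJ per person per day, which is where the factor of roughly 20 over food intake comes from.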
5. Case studies
One way to compare the energy use of problem-solving with and without LLM assistance is to look at substantial tasks — ones that consume meaningful amounts of both time and tokens. Longer tasks are easier to measure and tend to average out performance fluctuations. Below I look at two examples: one a serious physics project carried out with Claude and documented on Anthropic's blog, the other my personal experience at a much smaller scale.
5.1 Vibe-physics
In March 2026, Matthew Schwartz published a post on Anthropic's blog [11] about his experience writing a high-energy theoretical physics paper while outsourcing calculations and manuscript preparation to Claude. It is a good read for anyone in a broadly similar line of work.
The numbers in the post amount to about \(36 \times 10^6\) tokens consumed and 50–60 hours (\(\approx 7\) days) of human supervision. At 10 J/token, that gives an LLM energy cost of
\[36 \times 10^{6}\ \text{tokens} \times 10\ \text{J/token} = 3.6 \times 10^{8}\ \text{J}.\]
Adding 7 days of human supervision on a food-only basis,
\[7\ \text{days} \times 10^{7}\ \text{J/day} = 7 \times 10^{7}\ \text{J},\]
brings the total LLM-assisted workflow to approximately \(4 \times 10^{8}\) J.
Schwartz estimates that without LLM assistance the same project would have taken 3–5 months. Four months at roughly 20 working days each gives 80 days of unassisted work. On the food-only basis, that is
\[80\ \text{days} \times 10^{7}\ \text{J/day} = 8 \times 10^{8}\ \text{J},\]
about twice the energy of the LLM-assisted workflow. On the full per-capita basis, 80 days at 200 MJ/day amounts to \(1.6 \times 10^{10}\) J, roughly 40 times the LLM workflow total above. The true figure for any given task falls between these bounds, but under either accounting the LLM-assisted approach comes out ahead.
5.2 My recent experience
As a second example, a less ambitious project came up in my own work. I needed code for the numerical solution of a two-phase flow in porous media problem with particular boundary conditions and nonlinear coefficients. It looked straightforward but required some fiddling. I worked through it with Claude Opus.
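The project consumed approximately \(2{-}2.5 \times 10^{6}\) tokens, plus about one day of my own time for oversight and direction. At 10 J/token, the LLM energy cost is
\[2 \times 10^{6}\ \text{tokens} \times 10\ \text{J/token} \approx 2 \times 10^{7}\ \text{J},\]
to which one day of human supervision on a food-only basis (\(10^{7}\) J) brings the total to \(3 \times 10^{7}\) J.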
Without LLM assistance I estimate the same work would have taken about three days, or \(3 \times 10^{7}\) J on the food-only basis — essentially identical to the LLM-assisted workflow. On the full per-capita basis, three days at 200 MJ/day amounts to \(6 \times 10^{8}\) J, about 20 times the LLM workflow. As in the previous example, the two approaches sit within the same order of magnitude on the lower bound, and the LLM workflow is more efficient on the upper.
Summary
Across both examples, LLM-assisted and unassisted workflows sit within the same order of magnitude on the food-only baseline, and the LLM-assisted approach is between 20\(\times\) and 40\(\times\) more efficient on the full per-capita baseline (Table 1).
| | Vibe-physics | Two-phase flow |
|---|---|---|
| LLM tokens | \(36 \times 10^{6}\) | \(2 \times 10^{6}\) |
| LLM energy | \(3.6 \times 10^{8}\) J | \(2 \times 10^{7}\) J |
| Human supervision (food) | \(7 \times 10^{7}\) J | \(1 \times 10^{7}\) J |
| Total LLM workflow | \(4 \times 10^{8}\) J | \(3 \times 10^{7}\) J |
| Unassisted human, food only | \(8 \times 10^{8}\) J | \(3 \times 10^{7}\) J |
| Unassisted human, per capita | \(1.6 \times 10^{10}\) J | \(6 \times 10^{8}\) J |
| Ratio (food basis) | \(2\times\) | \(1\times\) |
| Ratio (per-capita basis) | \(40\times\) | \(20\times\) |
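The table can be recomputed from its inputs with a short script. This is a sketch under the article's own assumptions: 10 J/token, \(10^{7}\) J/day for food, 200 MJ/day per capita, and supervision always counted on the food-only basis. The class and field names are hypothetical.

```python
from dataclasses import dataclass

J_PER_TOKEN = 10.0            # working figure for a large model
FOOD_J_PER_DAY = 1e7          # 10,000 kJ daily food intake
PER_CAPITA_J_PER_DAY = 2e8    # 200 MJ per person per day


@dataclass
class CaseStudy:
    name: str
    tokens: float             # LLM tokens consumed
    supervision_days: float   # human days spent supervising the LLM
    unassisted_days: float    # estimated human days without the LLM

    @property
    def llm_energy(self) -> float:
        return self.tokens * J_PER_TOKEN

    @property
    def workflow_total(self) -> float:
        # Supervision is counted on the food-only basis, as in the table.
        return self.llm_energy + self.supervision_days * FOOD_J_PER_DAY

    def unassisted(self, j_per_day: float) -> float:
        return self.unassisted_days * j_per_day

    def ratio(self, j_per_day: float) -> float:
        """Unassisted energy divided by the LLM-assisted total."""
        return self.unassisted(j_per_day) / self.workflow_total


vibe = CaseStudy("Vibe-physics", tokens=36e6, supervision_days=7, unassisted_days=80)
flow = CaseStudy("Two-phase flow", tokens=2e6, supervision_days=1, unassisted_days=3)

for case in (vibe, flow):
    print(f"{case.name}: total {case.workflow_total:.1e} J, "
          f"food ratio {case.ratio(FOOD_J_PER_DAY):.1f}, "
          f"per-capita ratio {case.ratio(PER_CAPITA_J_PER_DAY):.0f}")
```

Without intermediate rounding, the vibe-physics ratios come out near 1.9 and 37; the table rounds the workflow total to \(4 \times 10^{8}\) J, which gives 2 and 40.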
6. Does this matter?
So, we can accept the working thesis that LLMs are at least as energy-efficient as humans for the kinds of tasks considered here. How that picture evolves is harder to predict: all else equal, larger and more capable models require more energy per token. The trajectory of energy per unit of LLM capability is at least as important to watch as the current snapshot. Does this matter?
From the energy supply point of view, it clearly does, and it sits within a broader trend: the world needs more and more energy. Population growth is a factor but lifestyle improvement is arguably a stronger one, and lifestyle improvements are themselves powered by energy-consuming machines.
Energy efficiency, however, is not the only criterion that matters when choosing a tool. The most important thing the case studies suggest is not just that LLM-assisted workflows are competitive on energy per task — it is that they make some tasks feasible that were previously ruled out by the time and human effort required. Comparing energy per task assumes the task gets done either way. When the alternative is that it does not get done, the comparison changes shape.
A transport analogy helps frame the trade-off. A single person travelling 100 km on foot consumes approximately 20 MJ [12]. Cycling the same distance uses roughly half as much energy [13, 14]. A common petrol car consumes \(\approx 290\) MJ [14] — an order of magnitude more than walking. An electric car uses perhaps a quarter of that; a fully occupied electric train is approximately as efficient as a bicycle (Figure 2).
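For a quick sense of scale, the figures in this analogy can be expressed relative to walking. The sketch below reads "roughly half" and "perhaps a quarter" as exact fractions, which is an assumption made purely for illustration.

```python
# Approximate energy per person for a 100 km trip, in MJ, as quoted in the text.
WALKING_MJ = 20.0
PETROL_CAR_MJ = 290.0

energy_mj = {
    "walking": WALKING_MJ,
    "cycling": WALKING_MJ / 2,           # "roughly half" of walking
    "petrol car": PETROL_CAR_MJ,
    "electric car": PETROL_CAR_MJ / 4,   # "perhaps a quarter" of the petrol car
}
# A fully occupied electric train is taken to match the bicycle.
energy_mj["electric train (full)"] = energy_mj["cycling"]


def relative_to_walking(mode: str) -> float:
    """Energy for `mode` as a multiple of the walking figure."""
    return energy_mj[mode] / WALKING_MJ


for mode in energy_mj:
    print(f"{mode:<22} {energy_mj[mode]:6.1f} MJ  "
          f"({relative_to_walking(mode):.1f}x walking)")
```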
It would be wonderful if all transport were by bicycle and electric train. But society's needs are more varied. A heavy steel beam still has to reach a construction site that may be nowhere near a railway, and walking with it is neither practical nor efficient. The right choice of transport depends on the task, not solely on energy per kilometre.
The same holds for LLM use. There are problems where LLM workflows deliver enormous productivity gains regardless of their energy footprint, and there are cognitive tasks better done at walking pace, e.g. where the overhead of using an LLM exceeds the cost of just doing the work yourself. Recognising which is which is itself a useful skill.
7. Conclusions
For substantial knowledge work tasks, LLM-assisted workflows consume energy of the same order of magnitude as the human work they partially replace, and often less once the broader energy cost of sustaining a knowledge worker is taken into account.
For an individual practitioner, the more useful question is rarely whether an LLM is the more energy-efficient choice in some abstract sense, but whether a given task is well-suited to LLM assistance at all.
References
- [1] T. Ord, "Are the Costs of AI Agents Also Rising Exponentially?" 2025. tobyord.com
- [2] OpenAI, "Tokenizer." OpenAI Platform. platform.openai.com/tokenizer
- [3] S. Wolfram, "What Is ChatGPT Doing … and Why Does It Work?" 2023. writings.stephenwolfram.com
- [4] S. Altman, "The Gentle Singularity," 2025. blog.samaltman.com
- [5] JarvisLabs Team, "NVIDIA H100 Price Guide 2026: GPU Costs, Cloud Pricing & Buy vs Rent," 2026. jarvislabs.ai
- [6] J. O'Donnell and C. Crownhart, "We Did the Math on AI's Energy Footprint. Here's the Story You Haven't Heard." MIT Technology Review, 2025. technologyreview.com
- [7] E. J. Husom, A. Goknil, L. K. Shar, and S. Sen, "The Price of Prompting: Profiling Energy Use in Large Language Models Inference," arXiv:2407.16893, 2024. arxiv.org/abs/2407.16893
- [8] Commonwealth of Australia, "Nutrient Reference Values for Australia and New Zealand Including Recommended Dietary Intakes," 2006. eatforhealth.gov.au
- [9] H. Ritchie, "Global Comparison: How Much Energy Do People Consume?" Our World in Data. ourworldindata.org
- [10] Energy Institute, "Statistical Review of World Energy," 74th ed., 2025. energyinst.org
- [11] M. Schwartz, "Vibe Physics: The AI Grad Student," Anthropic Research, 2026. anthropic.com/research/vibe-physics
- [12] C. Hall, A. Figueroa, B. Fernhall, and J. A. Kanaley, "Energy Expenditure of Walking and Running: Comparison with Prediction Equations," Medicine and Science in Sports and Exercise, vol. 36, no. 12, pp. 2128–2134, Dec. 2004. pubmed.ncbi.nlm.nih.gov/15570150
- [13] A. Mizdrak et al., "Fuelling Walking and Cycling: Human Powered Locomotion Is Associated with Non-Negligible Greenhouse Gas Emissions," Scientific Reports, vol. 10, p. 9196, 2020. pmc.ncbi.nlm.nih.gov
- [14] D. J. C. MacKay, Sustainable Energy — Without the Hot Air. Cambridge: UIT, 2009. withouthotair.com