AI Is Only Expensive When You Measure It Wrong
Why the real AI economics question is not token spend versus salaries — it is profitable output per dollar.
A lot of engineers on X have apparently forgotten basic engineering economics.
The current argument goes like this:
VC money is drying up.
AI compute is getting expensive.
Some AI workflows cost as much as human labor.
So replacing people with AI may not be as cheap as everyone thought.
After a year of layoffs, that is an easy headline to enjoy.
“Humans are back.”
Maybe.
But be careful.
The fact that AI token spend can approach, or even exceed, human labor cost does not automatically make the human-only workflow more economical.
It means the comparison finally has to get serious.
Cost only matters relative to output.
And output only matters relative to profit.
So no, software jobs are not suddenly safe.
Neither are AE (architecture and engineering) jobs.
The risk is just moving from cheap AI hype to actual operating model math.
The wrong comparison
People keep comparing AI spend to salaries like those are interchangeable budget lines.
“This team spent $100 per employee per day on frontier model usage.”
Okay.
What did they produce?
Did they ship more features?
Did they close more tickets?
Did they review more submittals?
Did they answer more RFIs?
Did they reduce rework?
Did they increase billable throughput?
Did they create something the company can actually sell?
Without those answers, the cost number is mostly noise.
A fully human workflow might cost X per unit of output.
A human-plus-AI workflow might cost Y per unit of output.
That is the first comparison.
But it still is not enough.
The better question is:
How much profit does each dollar of spend generate?
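A minimal sketch of those two comparisons, in Python. Every number below is made up for illustration, and the `Workflow` class and its field names are assumptions, not a standard accounting model. The sketch computes cost per unit of output first, then profit per dollar of total spend.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    """One operating model measured over the same period (hypothetical figures)."""
    name: str
    labor_cost: float      # dollars of human labor
    token_cost: float      # dollars of model usage
    units_of_output: int   # e.g. features shipped or RFIs answered
    revenue: float         # dollars that output actually generated

    @property
    def total_cost(self) -> float:
        return self.labor_cost + self.token_cost

    def cost_per_unit(self) -> float:
        # First comparison: X versus Y per unit of output.
        return self.total_cost / self.units_of_output

    def profit_per_dollar(self) -> float:
        # Second comparison: profit generated by each dollar of spend.
        return (self.revenue - self.total_cost) / self.total_cost


# Illustrative inputs only; substitute measured values.
human_only = Workflow("human-only", 10_000, 0, 40, 14_000)
blended = Workflow("human-plus-AI", 10_000, 2_000, 60, 21_000)

for w in (human_only, blended):
    print(f"{w.name}: ${w.cost_per_unit():.2f} per unit, "
          f"${w.profit_per_dollar():.2f} profit per dollar")
```

With these made-up inputs, the blended workflow spends more in total but costs $200 per unit against $250, and returns $0.75 of profit per dollar against $0.40.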
Cost per output is not the final metric
Cost per feature matters.
Cost per RFI response matters.
Cost per submittal review matters.
Cost per drawing revision matters.
But none of those numbers matter by themselves.
A company does not survive by producing output. It survives by producing profitable output.
So the real question is:
“How much sellable work did this cost create?”
In software, the useful metric might start with dollars per shipped feature, then move to revenue, retention, or usage created by that feature.
In engineering and construction administration, it might be dollars per billable hour, dollars per RFI response, dollars per reviewed submittal, or dollars per completed revision.
The unit changes by industry.
The economic relationship does not.
What this looks like for an AE firm
In the AE industry, a lot of engineering-side work is hourly.
That gives us a natural translation layer.
Human time becomes billable hours.
Now token consumption has to fit into that same model.
So we convert token spend into an effective hourly charge based on the person driving the agent. Tokens are not treated like some random software expense sitting off to the side. They are tied to the project work they support.
If an engineer is using agents to support construction administration work, those agents may help draft RFI responses, review submittals, generate revisions, research code paths, assemble context, or improve the depth and quality of a response.
The token spend is attached to the work.
And when the work is billable, the token spend can be translated back into revenue.
We do this now at PermitZIP.
At the time of writing, our effective billing rate is around 6.7x our token spend.
That is before fully measuring the productivity gain.
We have not yet fully quantified how many more RFIs, submittal reviews, revisions, or permit-related tasks one person can complete per day with the right agent workflow.
But we already know the token spend itself can produce a return when it is aimed at billable project work.
So the upside is not just:
“We spent tokens and got work done.”
It is:
“We spent tokens, billed against the work, improved the quality of the output, and may have expanded how much work one person can responsibly handle in a day.”
That is a very different economic picture.
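One way to track that kind of ratio is to log token spend against the project it supported and compare it to what was billed. The sketch below is an assumption about how such a log could be structured, not a description of PermitZIP's actual system. The project IDs and dollar figures are invented, chosen only so the overall ratio lands near the 6.7x mentioned above.

```python
from collections import defaultdict

# Hypothetical usage log: (project_id, token_cost_usd, amount_billed_usd).
usage_log = [
    ("proj-a", 12.00, 90.00),
    ("proj-a", 8.00, 50.00),
    ("proj-b", 10.00, 61.00),
]


def billing_multiple_by_project(log):
    """Return billed dollars per token dollar for each project."""
    spend = defaultdict(float)
    billed = defaultdict(float)
    for project, token_cost, amount_billed in log:
        spend[project] += token_cost
        billed[project] += amount_billed
    return {project: billed[project] / spend[project] for project in spend}


def overall_multiple(log):
    """Return total billed dollars divided by total token dollars."""
    total_spend = sum(token_cost for _, token_cost, _ in log)
    total_billed = sum(amount_billed for _, _, amount_billed in log)
    return total_billed / total_spend


print(billing_multiple_by_project(usage_log))
print(round(overall_multiple(usage_log), 2))
```

Breaking the ratio out per project matters because an average can hide projects where token spend is not earning its keep.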
Humans bill linearly
A human can only bill for the time they spend.
One hour of human time generally becomes one hour of billable time.
That is the traditional professional services model.
It scales linearly-ish.
This is one reason engineering firms often get low valuation multiples. Revenue is tightly coupled to headcount. More work requires more people. More people create more coordination, more management, more review, more overhead, more inefficiencies, and more risk.
AI changes the shape of that relationship.
The objective is to fine-tune how much one good human can drive.
If one engineer spends an hour directing multiple agents across multiple projects, and those agents consume tokens that are tied to billable project work, that engineer may create more than one hour of billable value inside one clock hour.
The right human, driving the right token spend, can produce and support more billable output than the human could produce alone.
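As a rough illustration of the difference in shape, the sketch below compares one billed human hour with the same hour plus token spend that is also tied to billable work. The function name, the hourly rate, the token spend, and the billing multiple are all hypothetical, and the additive model is a simplifying assumption.

```python
def billable_value_per_clock_hour(hourly_rate, token_spend_per_hour=0.0,
                                  billing_multiple=0.0):
    """Dollars billed for one clock hour of an engineer's time.

    Assumes the human hour bills once at `hourly_rate`, and that token spend
    attached to billable project work bills at `billing_multiple` dollars
    per token dollar.
    """
    return hourly_rate + token_spend_per_hour * billing_multiple


rate = 150.0  # hypothetical billing rate, dollars per hour

human_alone = billable_value_per_clock_hour(rate)
with_agents = billable_value_per_clock_hour(
    rate, token_spend_per_hour=20.0, billing_multiple=6.0
)

# Express the result in equivalent billed hours per clock hour.
print(human_alone / rate)   # one hour in, one hour billed
print(with_agents / rate)   # more than one billed hour per clock hour
```

Under these assumed inputs the engineer bills the equivalent of 1.8 hours inside one clock hour. The real figure depends on how much of the token spend is actually billable and how much review time it consumes.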
Some work is still cheaper with humans
Some work is still cheaper with humans.
There are tasks where the prompting, validation, retrying, reviewing, and context loading cost more than just having a competent person do the work.
There are tasks where the model creates output quickly, but also creates just enough uncertainty that a human has to inspect everything anyway.
There are tasks where the token spend is real, the latency is annoying, the error rate matters, and the productivity gain is not enough to justify the workflow.
The economics are task-specific.
Engineers should understand this.
A tool is not good because it is technically impressive. It is good when it improves the system.
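A simple way to check a specific task is to price both paths, including the human time the agent workflow still needs. The functions and figures below are illustrative assumptions. The two example tasks are invented to show the comparison going each way.

```python
def human_only_cost(task_hours, hourly_cost):
    """Cost of a competent person simply doing the task."""
    return task_hours * hourly_cost


def agent_assisted_cost(prompt_hours, review_hours, hourly_cost,
                        token_cost_per_attempt, expected_attempts):
    """Cost of the agent workflow: human prompting and review plus tokens.

    `expected_attempts` covers retries; context loading is counted inside
    `prompt_hours` in this simplified model.
    """
    human_time = (prompt_hours + review_hours) * hourly_cost
    tokens = token_cost_per_attempt * expected_attempts
    return human_time + tokens


hourly_cost = 80.0  # hypothetical loaded labor cost, dollars per hour

# Task A: long structured draft. The agent path wins here.
draft_human = human_only_cost(2.0, hourly_cost)
draft_agent = agent_assisted_cost(0.25, 0.5, hourly_cost, 4.0, 2)

# Task B: short judgment call. Review overhead makes the agent path lose.
call_human = human_only_cost(0.5, hourly_cost)
call_agent = agent_assisted_cost(0.25, 0.5, hourly_cost, 3.0, 3)

print(draft_human, draft_agent)
print(call_human, call_agent)
```

The same workflow is cheap on one task and expensive on the other, which is why the comparison has to be made per task and not per tool.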
Token cost does not map cleanly to labor cost
We have been paying for token consumption for over a year now at PermitZIP.
It is already obvious that token cost does not map cleanly to human labor cost.
A dollar of human labor and a dollar of token spend do not produce the same kind of output.
Human labor is flexible, contextual, and slow.
Token-driven execution is fast, scalable, and uneven.
Both are non-deterministic in their own annoying ways.
Both are also confidently wrong more often than anyone wants to admit.
Humans get distracted, misunderstand priorities, get tired, miss details, defend bad assumptions, have egos, and have bad days.
Models hallucinate, over-answer, miss context, follow bad instructions too confidently, and sometimes create work that looks finished before it is actually useful.
Humans can notice when the request is dumb.
Models can brute force through structured work at a speed humans cannot touch.
Humans carry judgment.
Models carry throughput.
The return profile is not the same.
The advantage comes from blending human judgment with token-driven execution.
Outsized returns show up when the human knows how to direct the system.
The budget shift
A lot of companies staffed up for human-only throughput.
Then they discovered that serious AI usage is not just a few subsidized subscriptions.
It is infrastructure.
It is token budget.
It is tooling.
It is evaluation.
It is workflow redesign.
It is people who know how to direct the system.
So now companies are rebalancing.
The layoffs we are seeing in tech are tied to this recalibration of production costs.
Budgets are moving from headcount-only operating models to blended systems where compute is a real cost center.
Companies usually cannot keep the same labor budget and simply add serious AI spend on top.
The money has to come from somewhere.
In many cases, that means fewer people, higher-leverage people, and more expensive tools.
That sounds harsh.
But supply and demand does not care about your feelings.
The operating model has changed.
The question is whether the blended system produces more profitable output per dollar.
Ignore that at your own risk.
Being laid off today can mean getting locked out of the industry for a long time, because the cost of labor plus tokens is being reconciled in real time.
If you get booted from the human part of the system, do not assume another company is backfilling with more humans.
A lot of them are not.
They are making room for the tokens.
The useful metric is profitable output per dollar
This is where the conversation should be.
Not “AI is cheaper than humans.”
Not “AI is more expensive than humans.”
Those are lazy categories.
The useful questions are:
What is the cost per useful unit of output?
Then:
How much revenue or profit does that unit create?
For software, maybe that starts with cost per feature or features shipped per day. But the company still has to ask whether those features increase revenue, reduce churn, expand usage, or create enterprise value.
For AEC, maybe it is RFIs answered per day, submittals reviewed per day, revisions completed per day, or permits processed per week.
But the output has to connect to dollars.
Can it be billed?
Can it protect margin?
Can it reduce write-offs?
Can it improve cycle time?
Can it increase the amount of work one person can responsibly carry?
Headcount is not the only scarce resource
A lot of people are emotionally attached to headcount as the main measure of capacity.
More people means more output.
Sometimes.
But every manager who has actually run an engineering team knows headcount also creates coordination cost.
More reviews.
More meetings.
More onboarding.
More QA.
More HR.
More process.
More communication paths.
More places for work to hide.
AI has its own costs too.
Prompting.
Context management.
Tooling.
Evaluation.
Human review.
Bad outputs.
Security concerns.
Token burn.
Adoption.
Neither system is free.
The point is not to worship AI or defend headcount.
The point is to compare operating models honestly.
The best AI systems are human-directed
The strongest workflows I have seen are human-directed systems.
A person with judgment defines the task, constrains the output, reviews the result, and decides what matters.
The model handles the parts where speed, volume, pattern matching, drafting, restructuring, extraction, research, or iteration create leverage.
The return comes from designing a system where human judgment sits in the right place.
Many companies treat AI like a toy subscription or a magic labor replacement machine.
Both are bad operating models.
The better model is more uncomfortable:
Fewer people.
Better people.
More expensive tools.
More output.
More revenue per human.
More profit per dollar of total spend.
That is where this is going.
Wishful thinking does not change the math
If AI costs as much as a human engineer per year, that does not automatically make it expensive.
If it produces significantly more profitable output per dollar, it is cheap.
If it produces noise, rework, and review burden, it is expensive.
The annual spend alone does not answer the question.
Neither does the token bill.
Neither does the salary comparison.
The metric is the blended system:
Human cost plus token cost, measured against useful output, then measured against dollars generated.
That is basic engineering economics.
Software engineers should understand this better than almost anyone.
Cost. Throughput. Quality. Constraints. Revenue. Return.
That is what matters now.
That is what the industry is calibrating around.
Employees can ignore it.
Companies can ignore it.
Neither gets a special exemption from the math.
Tech is already laying people off while it figures out the right mix of humans, tools, and compute.
AEC is not immune to that.
If you are an employee and you want to pretend this is hype, fine.
Understand the risk.
If your job gets cut, the replacement may not be another person.
It may be a smaller team with better tools, bigger token budgets, and a different operating model.
Getting back in will be harder if your skills only fit the old model.
Maybe you are locked out for a few months.
Maybe a few years.
Maybe permanently if the work you used to do gets absorbed into the system and the industry stops hiring for it.
That is what happens when production costs get recalibrated.
If you run a company, the warning is just as direct.
Your competitors are doing this math too.
They are testing where humans matter, where agents work, where tokens replace labor, where quality improves, where margins expand, and where cycle time collapses.
Some of them will figure it out before you do.
They will produce more with less.
They will price differently.
They will hire differently.
They will carry more work per person.
They will generate returns you cannot match with a headcount-only model.
So measure the system honestly.
Human cost.
Token cost.
Useful output.
Revenue generated.
Margin protected.
Rework avoided.
Cycle time reduced.
The recalibration is already happening.
You can participate in it, or you can be priced by it.



