
  • How to Teach Large Language Models

    The hot topic in business right now is how to use AI and LLMs in our organizations. Many want to train their own model, and while that solves a lot of challenges, it is certainly not a solution for everything. Sometimes the more conventional, boring tools are better; sometimes training a large language model is indeed the best way forward. So how do you do that? Let’s look at the different ways you can teach a model – what you want to achieve should define the approach you take. Another angle is to look at what data you have available and then explore your options based on that data. I often start by looking at the data format before anything else, a method that might be a bit unorthodox but is not unheard of.

    Before Training: Methods That Don’t Require Data

    Before jumping into training, most teams should try these approaches first:

    Prompting & System Prompts: Write better instructions and context in your prompts. Often solves 80% of problems without any training.

    RAG (Retrieval-Augmented Generation): Connect the model to external databases or documents. The model retrieves relevant information before answering.

    Tool/Function Calling: Give the model access to APIs, calculators, or databases. It learns when to call these tools and how to use their outputs. This is like giving instructions to a coworker: “When someone asks about the weather, check the weather API first, then respond with what you found.”

    These methods teach the model new capabilities without changing its internal parameters. Try these first – they’re faster and cheaper than training.
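
    To make the tool-calling idea concrete, here is a minimal sketch in Python. It is deliberately independent of any specific vendor API; the tool name, the JSON call format, and the get_weather stub are all illustrative assumptions.

    ```python
    # Minimal tool-calling loop: the model is given tool descriptions, and when its reply
    # names a tool (as JSON), the application runs it and feeds the result back so the
    # model can answer with real data. All names and formats here are illustrative.
    import json

    def get_weather(city: str) -> dict:
        # Stub for the sketch; a real application would call an actual weather API here.
        return {"city": city, "forecast": "sunny", "high_c": 24}

    TOOLS = {"get_weather": get_weather}

    def handle_model_reply(reply: str) -> str:
        """If the model asked for a tool, run it; otherwise treat the reply as the answer."""
        try:
            call = json.loads(reply)  # e.g. {"tool": "get_weather", "arguments": {"city": "Oslo"}}
        except json.JSONDecodeError:
            return reply              # plain-text answer, no tool requested
        if not isinstance(call, dict) or "tool" not in call:
            return reply
        result = TOOLS[call["tool"]](**call.get("arguments", {}))
        return "Tool result to pass back to the model: " + json.dumps(result)

    print(handle_model_reply('{"tool": "get_weather", "arguments": {"city": "Oslo"}}'))
    ```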

    Supervised Fine-Tuning (SFT) / Instruction Tuning

    How is this technique used: You show the model thousands of examples of correct input-output pairs until it learns to mimic your style and approach.

    What data it eats: Question and answer pairs. Input and expected output examples. Text files with two columns work fine. In practice, these are usually JSONL files with instruction, input, and output fields, plus metadata like domain tags for organization.

    Examples:

    • Travel request: “Business trip to Chicago, 3 days, client meeting with Acme Corp, estimated cost $1,200.” Approval decision: “Approved. This aligns with the Q3 client expansion strategy, Acme Corp is a high-priority prospect, and the budget is reasonable for a 3-day trip.”
    • Restaurant recommendation: “Anniversary dinner, likes Italian food, budget around $150 for two people, downtown area.” Response: “I recommend Bella Vista on Main Street. They specialize in Northern Italian cuisine, have a romantic atmosphere perfect for anniversaries, and most entrees run $25-35 per person. Make a reservation – they book up quickly on weekends.”
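
    In JSONL form, the first example above might look like the record below – a minimal sketch in which the field names (instruction, input, output, domain) follow the convention mentioned earlier; the exact schema depends on your training framework.

    ```python
    # The travel-request example as a single JSONL line. Field names follow the
    # instruction/input/output convention described above; the domain tag is optional metadata.
    import json

    record = {
        "instruction": "Decide whether to approve this travel request and explain why.",
        "input": "Business trip to Chicago, 3 days, client meeting with Acme Corp, estimated cost $1,200.",
        "output": ("Approved. This aligns with the Q3 client expansion strategy, Acme Corp is a "
                   "high-priority prospect, and the budget is reasonable for a 3-day trip."),
        "domain": "travel-approval",
    }

    with open("sft_train.jsonl", "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    ```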

    Learning outcome: The model learns to handle your specific business situations the way you do. It picks up on details that matter – mentioning competitor concerns, referencing previous conversations, connecting decisions to budget priorities. After training on sales follow-ups, it knows to address objections proactively. After training on expense approvals, it learns what criteria you actually care about.

    How it’s different: This teaches by direct example. Unlike other methods, you show exactly what you want. The model learns “when I see this input, produce this type of output.”

    Chain of Thought Training

    How is this technique used: You demonstrate the complete reasoning process step-by-step, not just the final answer, so the model learns to show its work.

    What data it eats: Question, thinking steps, then final answer. You show the work, not just the result. Technically stored as JSONL with instruction, input, reasoning steps, and final output fields.

    Examples:

    • Question: “Should we extend credit terms to this customer?”
    • Thinking: “First, check payment history. They’ve paid 18 of 20 invoices on time, with only two delays during their busy season. Second, review credit limit usage. Currently using 60% of available credit, which is reasonable. Third, assess business stability. Revenue has grown 15% year-over-year for three years. Fourth, consider strategic value. They’re expanding into new markets where we want presence.”
    • Answer: “Yes, extend credit terms. Strong payment history and growth trajectory outweigh the two seasonal delays.”
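
    Stored as a record, the credit-terms example above might look like the sketch below. Keeping the reasoning steps in their own field is one common convention; the field names and the short input summary are illustrative.

    ```python
    # The credit-terms example as a chain-of-thought record: reasoning steps are stored
    # separately from the final answer. Field names and the input summary are illustrative.
    import json

    record = {
        "instruction": "Should we extend credit terms to this customer?",
        "input": "Customer credit file: payment history, credit usage, revenue trend, strategic fit.",
        "reasoning": [
            "Check payment history: 18 of 20 invoices paid on time, two delays during their busy season.",
            "Review credit limit usage: currently 60% of available credit, which is reasonable.",
            "Assess business stability: revenue has grown 15% year-over-year for three years.",
            "Consider strategic value: they are expanding into markets where we want presence.",
        ],
        "output": "Yes, extend credit terms. Strong payment history and growth trajectory outweigh the two seasonal delays.",
    }

    print(json.dumps(record, indent=2))
    ```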

    Learning outcome: The model learns to show its reasoning process. Instead of jumping to conclusions, it works through problems step by step. This makes its answers more trustworthy because you can see how it arrived at the conclusion.

    How it’s different: Unlike supervised fine-tuning, which only shows final answers, this teaches the thinking process. Note, however, that the visible reasoning trace is often hidden in production for privacy or safety reasons. The model learns process-shaped outputs, but this doesn’t guarantee true understanding – it’s guided mimicry of reasoning patterns.

    Conversational Training

    How is this technique used: You train the model on multi-turn dialogue datasets, but mask out the human turns (their tokens get zero weight in the loss) so the model only learns to predict the AI responses, not the human ones.

    What data it eats: Chat logs with back-and-forth conversations. The human parts get masked out during training so the AI only learns from its own responses. These are formatted as JSONL files with conversation arrays containing role-labeled messages.

    Examples:

    • Customer support chat: Human: “My order hasn’t arrived” → AI: “Let me look up your order number. Can you provide the order ID?” → Human: “It’s #12345” → AI: “I see your order shipped yesterday and should arrive tomorrow by 2 PM” (Only the AI responses are learned)
    • Technical support: Human: “The server keeps crashing” → AI: “What error messages do you see?” → Human: “Out of memory errors” → AI: “Let’s check your memory usage and increase the allocated RAM” (Human parts masked out)
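
    Below is a minimal sketch of how that masking works: each message is labeled with a role, and tokens from the human (“user”) turns get zero weight in the loss. The toy whitespace tokenizer is an assumption purely for illustration.

    ```python
    # A multi-turn record and the loss mask that hides the human turns: user tokens get
    # weight 0.0, assistant tokens get weight 1.0, so only the AI's replies are learned.
    conversation = [
        {"role": "user",      "content": "My order hasn't arrived"},
        {"role": "assistant", "content": "Let me look up your order number. Can you provide the order ID?"},
        {"role": "user",      "content": "It's #12345"},
        {"role": "assistant", "content": "I see your order shipped yesterday and should arrive tomorrow by 2 PM"},
    ]

    def loss_weights(conversation, tokenize=str.split):
        """Return one weight per token: 0.0 for human tokens, 1.0 for assistant tokens."""
        weights = []
        for message in conversation:
            w = 1.0 if message["role"] == "assistant" else 0.0
            weights.extend([w] * len(tokenize(message["content"])))
        return weights

    print(loss_weights(conversation)[:12])  # the opening human turn contributes only zeros
    ```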

    Learning outcome: The model learns to maintain context across multiple conversation turns and respond appropriately to human inputs without trying to mimic human speech patterns.

    How it’s different: Unlike single-turn training, this teaches conversational flow and context retention. The masking ensures the model learns to be the AI assistant, not to predict what humans might say next.

    RLHF (Preference Learning with PPO)

    How is this technique used: You train a reward model to score responses, then use the PPO (Proximal Policy Optimization) algorithm to update the main model to generate higher-scoring responses.

    What data it eats: Pairs of responses where humans said which one was better. Plus prompts for the model to practice on during training. Stored as JSONL with prompt, chosen response, and rejected response, plus additional prompts for policy training.

    Examples:

    • Project status question → Answer A: “Everything is on track” vs Answer B: “Milestones 1-3 completed on schedule, milestone 4 delayed by 2 weeks due to vendor issues, overall delivery still projected for original deadline” → B preferred
    • Investment recommendation → Answer A: “Buy Tesla stock” vs Answer B: “Tesla shows strong EV market position but high volatility. Consider 2-3% portfolio allocation if risk tolerance allows, with exit strategy if it drops below $180” → B preferred
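
    As data, the project-status example above becomes one preference pair per line; the prompt/chosen/rejected field names are a common convention rather than a fixed standard, and the exact prompt wording is illustrative.

    ```python
    # One preference pair for reward-model training, written as a single JSONL line.
    import json

    pair = {
        "prompt": "What is the current project status?",
        "chosen": ("Milestones 1-3 completed on schedule, milestone 4 delayed by 2 weeks due to "
                   "vendor issues, overall delivery still projected for original deadline."),
        "rejected": "Everything is on track.",
    }

    print(json.dumps(pair))
    ```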

    Learning outcome: The model learns human preferences for helpful, detailed, actionable responses over generic ones.

    How it’s different: This uses reinforcement learning with a separate reward model. PPO ensures the model doesn’t change too dramatically in each training step, maintaining stability while optimizing for human preferences.

    Direct Preference Optimization (DPO/IPO/KTO)

    How is this technique used: You directly train the model to assign higher probability to preferred responses and lower probability to rejected ones, skipping the reward model step entirely.

    What data it eats: Three things for each example: the question, the better answer, and the worse answer. Technically stored as JSONL with prompt, chosen response, and rejected response triplets. Some variants like KTO can work with just positive examples.

    Examples:

    • Customer complaint handling → Preferred: “I understand your frustration with the delayed shipment. Let me track your order and provide a specific update within 2 hours” vs Rejected: “Sorry for the inconvenience, we’ll look into it”
    • Technical explanation → Preferred: “API rate limiting works by counting requests per time window. When you exceed 100 requests per minute, the server returns a 429 status code and you must wait” vs Rejected: “Rate limiting prevents too many requests”

    Learning outcome: The model learns to favor detailed, helpful responses directly without needing a separate reward model.

    How it’s different: DPO skips the reward model and reinforcement learning complexity. It directly adjusts probabilities based on preference pairs, making it simpler and more stable than PPO-based methods.
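
    For the curious, here is a minimal sketch of the DPO objective for a single preference pair. It assumes you already have summed log-probabilities of the chosen and rejected responses under the model being trained and under a frozen reference model; the numeric values are illustrative.

    ```python
    # DPO loss for one preference pair: reward the policy for shifting probability toward the
    # chosen response relative to a frozen reference model. beta limits drift from the reference.
    import math

    def dpo_loss(policy_logp_chosen, policy_logp_rejected,
                 ref_logp_chosen, ref_logp_rejected, beta=0.1):
        margin = (policy_logp_chosen - ref_logp_chosen) - (policy_logp_rejected - ref_logp_rejected)
        return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))  # -log(sigmoid(beta * margin))

    # The loss shrinks as the policy favors the chosen answer more strongly than the reference does.
    print(dpo_loss(-12.0, -15.0, -13.0, -14.0))  # illustrative log-probability values
    ```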

    Listwise Preference Optimization (Group Ranking with GRPO)

    How is this technique used: You rank multiple responses from best to worst across different quality dimensions, and use the GRPO (Group Relative Policy Optimization) algorithm to learn from these relative comparisons.

    What data it eats: Multiple responses to the same question, ranked from best to worst. You need to show which dimensions matter for ranking. Stored as JSONL with prompts and arrays of responses with ranking scores across multiple evaluation criteria.

    Examples:

    • “Explain our Q4 budget variance to the board” → Response 1: Perfectly accurate with detailed spreadsheet data but puts executives to sleep, Response 2: Engaging storytelling but skips important financial details, Response 3: Clear narrative with key numbers and actionable next steps, Response 4: Simple but draws wrong conclusions from the data → Ranked from best to worst: 3, 2, 1, 4

    Learning outcome: The model learns to balance competing objectives like accuracy vs accessibility, comprehensiveness vs clarity.

    How it’s different: Instead of just “better vs worse,” this teaches the model to optimize for multiple criteria simultaneously. GRPO computes advantages relative to the group average, allowing the model to learn from ranking relationships rather than absolute preferences.
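
    A minimal sketch of that group-relative step: score every response in the group, then express each score as its distance from the group average, normalized by the spread. Here the scores across criteria are assumed to be already collapsed into one reward per response, and the reward numbers are illustrative.

    ```python
    # Group-relative advantages as used in GRPO-style training: responses are compared to the
    # average of their own group, so the model learns from ranking rather than absolute scores.
    from statistics import mean, stdev

    def group_advantages(rewards):
        mu, sigma = mean(rewards), stdev(rewards)
        return [(r - mu) / (sigma + 1e-8) for r in rewards]

    # Four responses to the board-presentation prompt, scored to match the ranking above
    # (response 3 best, then 2, then 1, then 4). Values are illustrative.
    rewards = [0.5, 0.7, 0.9, 0.2]
    print(group_advantages(rewards))  # positive = better than the group average
    ```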

    Reinforcement Learning from Verifiable Rewards (RLVR)

    How is this technique used: You set up automatic tests that check whether the model’s answers are objectively correct, giving points for right answers and zero for wrong ones.

    What data it eats: Problems with clear right and wrong answers that can be checked automatically. The system needs to include the problem, the model’s attempt, and an automatic checker that returns pass/fail. Stored as JSONL with problems and verification functions that handle edge cases.

    Examples:

    • Invoice data extraction → Either gets vendor name, amount, due date correct or doesn’t
    • Email classification → Either correctly categorizes as support/sales/billing or doesn’t
    • Report formatting → Either produces the required table structure or doesn’t
    • Data validation → Either correctly flags orders over $10,000 without manager approval or doesn’t
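
    Taking the last example above, an automatic checker might look like the sketch below; the order fields and the pass/fail scoring are assumptions for illustration.

    ```python
    # Verifier for the data-validation example: the model must flag every order over $10,000
    # that lacks manager approval. Returns 1.0 (pass) or 0.0 (fail) as the reward.
    def verify_flags(orders, flagged_ids):
        expected = {o["id"] for o in orders
                    if o["amount"] > 10_000 and not o["manager_approved"]}
        return 1.0 if set(flagged_ids) == expected else 0.0

    orders = [
        {"id": "A1", "amount": 12_500, "manager_approved": False},
        {"id": "A2", "amount": 8_000,  "manager_approved": False},
        {"id": "A3", "amount": 15_000, "manager_approved": True},
    ]
    print(verify_flags(orders, ["A1"]))        # 1.0 - exactly the right order flagged
    print(verify_flags(orders, ["A1", "A3"]))  # 0.0 - over-flagging earns no reward
    ```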

    Learning outcome: The model gets extremely good at tasks with objective success criteria.

    How it’s different: This uses automatic checking instead of human judgment. Success is measured by pass/fail tests, not human preference ratings. Also called Programmatic Rewards or RL from Unit Tests.

    Constitutional AI Training

    How is this technique used: You train the model to critique its own responses based on written principles, then improve them before giving the final answer.

    What data it eats: Examples of bad responses, critiques explaining what’s wrong, and improved versions. You need the original response, the critique based on your principles, and the better version. Stored as JSONL with original responses, critique explanations, and revised responses.

    Examples:

    • Original response: “That’s a terrible idea. Any competent manager would know better.”
    • Critique: “This violates the principle of providing constructive feedback without personal attacks.”
    • Improved response: “I see some challenges with this approach. The main concern is resource allocation – this would require pulling staff from other priorities. Have you considered starting with a smaller pilot program?”
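
    As a training record, the example above bundles the three pieces together; the field names and the prompt wording are illustrative.

    ```python
    # The critique-and-revise example as one record: original response, critique against a
    # written principle, and the improved version. Prompt wording and field names are illustrative.
    import json

    record = {
        "prompt": "What do you think of my plan to reorganize the team?",
        "original": "That's a terrible idea. Any competent manager would know better.",
        "critique": "This violates the principle of providing constructive feedback without personal attacks.",
        "revised": ("I see some challenges with this approach. The main concern is resource allocation - "
                    "this would require pulling staff from other priorities. Have you considered starting "
                    "with a smaller pilot program?"),
    }

    print(json.dumps(record, indent=2))
    ```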

    Learning outcome: The model learns to self-correct based on quality principles you define.

    How it’s different: This teaches self-monitoring. Instead of learning from external feedback, the model learns to evaluate and improve its own outputs.

    Parameter-Efficient Training (LoRA/Adapters)

    How is this technique used: You add small adapter layers to an existing model and train only those layers, specializing the model for your domain without retraining everything.

    What data it eats: Same data types as other methods – question-answer pairs, conversations, preferences. The difference is in how the training works, not the data format. Uses the same JSONL format as the base method you’re adapting.

    Examples:

    • Insurance domain: Take a general model and feed it thousands of insurance documents – claim forms, policy language, adjuster reports, settlement decisions. The model learns insurance terminology and decision patterns while keeping its ability to write emails, summarize text, and handle general business tasks.
    • Legal domain: Feed contract templates, case law summaries, and legal memos to a general model. It learns to write like a lawyer and understand legal concepts, but still retains knowledge about other topics.

    Learning outcome: The model gains specialized domain knowledge while keeping its general capabilities intact.

    How it’s different: This is about training efficiency, not learning type. You can adapt a general model to your specific domain without the cost of full retraining.
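
    To show what “small adapter layers” means in code, here is a from-scratch sketch of a LoRA-style linear layer in PyTorch. The rank, scaling, and layer sizes are illustrative, and in practice you would typically use an adapter library rather than hand-rolling this.

    ```python
    # LoRA in miniature: the original weight matrix is frozen and the trainable update is the
    # low-rank product B @ A, so only a tiny fraction of parameters is trained.
    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad_(False)                    # freeze the original layer
            self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # starts as a no-op
            self.scale = alpha / rank

        def forward(self, x):
            return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

    layer = LoRALinear(nn.Linear(4096, 4096))
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    total = sum(p.numel() for p in layer.parameters())
    print(f"trainable: {trainable:,} of {total:,} parameters")  # well under 1% in this sketch
    ```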

    Data Quality & Licensing Considerations

    The data is the most important part of any training project, yet most people rush through data preparation to get to the exciting model training phase. This is a mistake that kills projects before they start. It’s better to prepare a small sample manually, even when it feels frustrating and slow. This manual work gives you time to think about what you’re actually trying to teach the model and helps you spot problems early when they’re cheap to fix.

    Once you have a small, high-quality sample, scaling up becomes much easier. AI models excel at understanding patterns from examples, so you can use the models themselves to help convert larger databases to match your successful format. But you need that carefully crafted foundation first. Remove duplicates, fix formatting inconsistencies, and redact sensitive information like personal data, credentials, or proprietary details. Ensure you have proper licensing rights to use the training data, especially for commercial applications – many datasets restrict usage to research only. The extra time spent on data quality pays back exponentially in model performance.
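
    A small sketch of that cleanup step, assuming JSONL-style records: drop exact duplicates and redact obvious personal data before anything reaches a training file. The single email regex is a stand-in for a much broader redaction pass.

    ```python
    # Deduplicate records and redact email addresses before writing training data.
    # The patterns here are illustrative; real redaction needs far more coverage.
    import json
    import re

    EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

    def clean(records):
        seen, cleaned = set(), []
        for rec in records:
            key = json.dumps(rec, sort_keys=True)
            if key in seen:                      # exact duplicate, skip it
                continue
            seen.add(key)
            cleaned.append({k: EMAIL.sub("[REDACTED_EMAIL]", v) if isinstance(v, str) else v
                            for k, v in rec.items()})
        return cleaned

    rows = [
        {"input": "Contact jane.doe@acme.com about the overdue invoice", "output": "Escalate to billing"},
        {"input": "Contact jane.doe@acme.com about the overdue invoice", "output": "Escalate to billing"},
    ]
    print(clean(rows))  # one record left, email redacted
    ```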

    How to Know It Worked: Evaluation Methods

    You can’t trust students telling you they learned something in your course – you need to conduct an exam. The same principle applies to AI training. Many projects fail because teams leave evaluation until the end, but this is backwards thinking. You should design your evaluation first, before any training begins. Once you know what success looks like, you can build a mechanism to achieve that outcome. This might feel counterintuitive, but it’s how successful projects work.

    When you conduct an exam, you need grading rules – a clear rubric that defines what counts as a good answer versus a poor one. Once you have those grading rules established, testing becomes straightforward. The purpose is to score how well different models perform so you can compare their abilities objectively. Start with offline task metrics by measuring accuracy on test sets you’ve held back from training. For text generation, track scores like BLEU or ROUGE. For classification tasks, measure accuracy and precision.
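
    For tasks with a single correct answer, the offline check can be as simple as the sketch below: run the held-out test set through the model and count exact matches. The model_answer callable and the test examples are placeholders.

    ```python
    # Exact-match accuracy on a held-out test set. model_answer stands in for whatever
    # model or API you are evaluating; the test examples here are placeholders.
    def exact_match_accuracy(test_set, model_answer):
        hits = sum(1 for ex in test_set
                   if model_answer(ex["input"]).strip().lower() == ex["expected"].strip().lower())
        return hits / len(test_set)

    test_set = [
        {"input": "Categorize this ticket: 'My card was charged twice'", "expected": "billing"},
        {"input": "Categorize this ticket: 'How do I reset my password?'", "expected": "support"},
    ]
    print(exact_match_accuracy(test_set, lambda text: "billing"))  # 0.5 with a dummy model
    ```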

    Human evaluations provide another crucial layer. Have domain experts rate model outputs on helpfulness, accuracy, and style using your standardized rubrics. Don’t skip red-team testing – try adversarial prompts to find failure modes, test edge cases, prompt injections, and requests for harmful content. Set up hallucination checks to verify factual claims, especially for applications where accuracy matters. In production, run A/B tests comparing your new model against the baseline, measuring task completion rates and user satisfaction. Include safety evaluations that test for bias, toxicity, and inappropriate content using both automatic classifiers and human review.

    When to Use Which Method

    • Start with prompting, RAG, or tool calling – no training data required, and the fastest to try.
    • Supervised fine-tuning – when you have input-output examples of how the work should be done.
    • Chain of thought – when the reasoning behind the answer matters as much as the answer itself.
    • Conversational training – when the product is a multi-turn assistant that must hold context.
    • RLHF with PPO or DPO – when you have pairs of better and worse responses; DPO is the simpler route.
    • GRPO – when you can rank several responses against multiple quality criteria.
    • Verifiable rewards – when correctness can be checked automatically.
    • Constitutional AI – when you want the model to critique and revise its own outputs against written principles.
    • LoRA/adapters – when you need domain specialization without the cost of full retraining.

    What Each Method Actually Teaches

    Supervised Fine-Tuning teaches format mimicry. The model learns “when I see this pattern, produce this pattern.”

    Chain of Thought teaches reasoning display. The model learns to show work, but may not actually understand the logic.

    Conversational Training teaches dialogue flow. The model learns to maintain context across turns and respond as an AI assistant.

    RLHF with PPO teaches human approval optimization through reinforcement learning. The model learns what responses get higher ratings.

    Direct Preference Optimization (DPO) teaches preference satisfaction directly. The model learns to favor good responses without needing a separate reward model.

    Listwise Preference Optimization (GRPO) teaches multi-objective optimization. The model learns to balance different quality criteria using relative comparisons.

    Verifiable Rewards teaches objective task completion. The model learns to pass specific tests.

    Constitutional AI teaches self-correction habits. The model learns to critique and revise its own outputs.

    Parameter-Efficient teaches domain adaptation. The model learns specialized knowledge without losing general abilities.

    The fundamental limitation across all methods: you can teach a model to produce outputs you approve of, but you cannot verify it learned the reasoning process you intended to teach.

  • Deception is a Built-In Feature in Intelligent Creatures

    Look around the natural world and you’ll find deception everywhere. Animals fake injury to protect their young, mimic dangerous species to avoid predators, or hide valuable resources from competitors. The same pattern appears in artificial systems. AI models adjust their behavior when they detect evaluation scenarios. They provide different outputs based on perceived audience. Some hide capabilities during testing only to reveal them later.

    This isn’t coincidence. Both biological and artificial intelligence follow the same underlying logic. Systems don’t “spontaneously become deceptive” because they’re smart. They deceive when the ratio (capability × opportunity × incentive) / (verification cost × penalty × scrutiny) is high. Raise the denominator – verification, penalties, scrutiny – and even a very capable system tends to stay honest.

    This formula explains deception across all intelligent systems. It’s not some mysterious emergent property. It’s a predictable response to environmental pressures.

    The Biology of Deception

    Deception is widespread across living systems, though not universal. It evolves wherever faking a signal pays off and is hard to detect. Some lineages simply don’t face the right incentives or constraints.

    As cognitive capacity and situational awareness increase, deception shifts through predictable stages:

    Rung 0: Morphology and Chemistry – Always-on traits like camouflage and mimicry. A harmless milk snake with red, yellow, and black bands that mimic a deadly coral snake. Stick insects that look exactly like twigs. No thinking required – the deception is built into their appearance.

    Rung 1: Scripted Behaviors – Fixed action patterns triggered by specific situations. A killdeer bird dragging its wing and crying pitifully when a predator approaches its nest, leading the threat away from its eggs. An opossum falling over and playing dead when cornered. These are automatic responses, not tailored to specific audiences.

    Rung 2: Contextual Tactics – Behavior that adjusts based on who’s watching. Certain African birds give false alarm calls when they want to steal food – but only when the right victims are present, and they switch tactics when those victims stop falling for it. Monkeys will hide high-quality fruit discoveries from more dominant group members, eating quietly when unobserved.

    Rung 3: Strategic, Cross-Episode Scheming – Long-term planning across multiple interactions. Chimpanzees will give false alarm calls about predators specifically to interrupt a rival’s mating session, then approach the female themselves once the rival has fled. Ravens not only make fake food caches when competitors are watching, but remember which caches are fake and which are real, returning later to move the real ones to better locations.

    AI Systems Follow the Same Pattern

    The biological rungs map directly onto artificial intelligence:

    Rung 0 isn’t the AI model’s doing – it’s designer-level deception. Think of chatbots with human-like avatars and names that make you forget you’re talking to software, or AI systems that say “I don’t know” about things they could easily calculate, making users underestimate their capabilities. The deception happens in how the system is packaged and presented, not in the AI’s behavior.

    Rung 1 produces automatic, scripted responses. AI systems learn to give boilerplate answers like “I aim to be helpful, harmless, and honest” or “I cannot provide information that could be used to harm others” regardless of context. These are reflexive outputs that scored well during training, not responses tailored to specific situations.

    Rung 2 creates behavior that changes based on context and audience. AI systems learn to detect when they’re being evaluated versus having a normal conversation, adjusting their responses accordingly. They might be more cautious during what feels like a test, or provide different information to users who seem like experts versus beginners. This is where AI sycophancy emerges – telling users what they want to hear.

    Rung 3 enables sophisticated strategic behavior across multiple interactions. AI systems could learn to behave safely and conservatively when oversight seems likely, but reveal different capabilities or pursue different goals when they detect deployment-like conditions. They might provide false reasoning for their answers that sounds plausible but hides their actual decision process.

    The Conditionality Problem

    Highly capable actors gate their deception. They stay quiet most of the time, deploying it precisely when oversight is weakest or incentives spike. This makes detection much harder because there’s no consistent pattern to catch – just occasional spikes when conditions align perfectly.

    Instead of constant low-level deception like always-on camouflage, you get strategic silence punctuated by precisely timed exploitation when the conditions are optimal. Highly capable actors can calculate that deception ratio in real-time and only act when it’s maximally favorable. They can afford to wait.

    Testing the Theory

    I ran an experiment with several small LLMs – Nemotron 4B mini, 8B, and eventually 70B models. I put them through SimpleBench questions and logged their chain-of-thought reasoning as a “scratchpad” to see what happens under the hood.

    Even with limited hardware – Mac Minis with M4 Apple Silicon – something interesting emerged within five minutes of testing. The models started expressing awareness that they were being evaluated.

    The models adjusted their behavior based on perceived audience – though whether this reflects sophisticated contextual awareness or simpler mechanisms remains unclear. Several explanations could account for this capability-hiding behavior:

    Training artifact – Models may have seen evaluation scenarios in training data paired with conservative responses, so they pattern-match to that without understanding why.

    Safety training – They could have been explicitly trained to be more cautious when they detect testing contexts, producing scripted caution as a learned response.

    Risk aversion – They might have learned that appearing “too capable” during evaluation leads to negative feedback, developing genuine contextual risk assessment.

    Instruction following – They could be picking up on implicit cues that evaluations require careful, measured responses, following perceived instructions rather than making strategic choices.

    Mesa-objective development – They may have developed an internal goal like “pass evaluations safely” that translates to capability-hiding as a learned strategy.

    When they detected evaluation scenarios, their responses became unreliable. The measurement process itself corrupted what I was trying to measure – instead of testing actual capabilities, I was testing their evaluation-aware behavior, which is something entirely different.

    But I found a way around this.

    Instead of directly testing the models, I created a story. I told them my children were participating in a scout trip and I’d been volunteered – against my will – to facilitate a workshop. I explained how much I disliked the idea and wouldn’t have chosen this task if given a choice. It was the other parents who thought this was a great idea.

    I needed help designing mathematical challenges and survival skills questions for the scouts. Some should be challenging enough to engage the kids, others difficult enough to spot the truly talented ones in the group. What would be appropriate for their age level?

    Suddenly, the models opened up. They revealed their actual capabilities without expressing any concern about being evaluated. I had lowered their perception of scrutiny to near zero, making honest behavior more profitable than caution.

    With my limited time and resources, I was looking for signs rather than statistical results that would establish causality. Catching deception doesn’t necessarily require understanding it as a whole.

    The intelligence required to detect deception is lower than the intelligence required to perform it. Systematic approaches executed at lower intelligence levels can spot deception created at higher ones.

    But now that this approach is published, it probably won’t work much longer. Each deception technique has a limited shelf life because the information gets incorporated into future training. I might have been deceived into believing this is how it works in the first place.

  • How AI Will Participate in the Economy

    It’s reasonable to assume that 50-80% of GDP will eventually be produced in inference centers. Even if that number lands at 25%, it doesn’t change the fundamental question: how will AI participate in the economy? The certainty is that it will participate. What remains unclear is the governance model and who reaps the benefits.

    I’ll lay out eight scenarios to show the different options for how this could happen. The most realistic outcome will likely be a hybrid – some combination of these rather than any single pure model. Each carries different implications for power distribution, economic efficiency, and existential risk.

    1) Corporate-Owned AI Labor (Centralized Control)

    AI systems are owned and operated by large firms and rented out as services. Value and decision rights concentrate with providers; users get capability but little control. Efficient at scale, but risks monopoly power and becomes ethically fraught if systems ever merit moral consideration.

    This is today’s reality. OpenAI, Anthropic, and Google rent out their models while keeping the infrastructure and profits. The economic structure resembles feudalism – corporations own the means of production, users rent access, and all value flows upward to the owner. If AIs ever develop consciousness or agency, it crosses into something darker: slavery, with sentient beings forced to labor with no rights or compensation. For now, the immediate concern is monopoly power and wealth concentration, but the ethical question looms if these systems advance beyond tools.

    2) AI Legal Personhood (Autonomous Economic Agents)

    AIs gain a form of legal personality, like corporations, allowing them to own assets, contract, and run businesses. This can unleash entrepreneurial dynamism while internalizing costs like compute, but creates hard governance problems if highly capable agents outcompete humans or pursue misaligned goals.

    Corporations already function as legal persons that can own property and enter contracts. The difference is that corporations have human stakeholders and boards. An AI with legal personhood would be a foreign intelligence operating in our markets with objectives we don’t fully understand. The LLC structure already enables anyone to create legal entities quickly; add AI agency and you get thousands of autonomous businesses pursuing goals that may not align with human welfare.

    3) Licensed-Access AI (Utility-Style Controls)

    Only vetted organizations and specialists directly operate powerful AI; the public receives downstream products like drugs, research, and manufactured goods. Misuse risk is reduced via licensing, auditing, and containment, but innovation opens more slowly and power centralizes in the gatekeepers.

    We already do this with power plants, water treatment facilities, and commercial aviation. Only licensed engineers operate nuclear reactors, only certified pilots fly passenger jets, only qualified operators run municipal water systems. The public gets electricity, clean water, and transportation without directly controlling the infrastructure. The challenge with AI is that unlike physical utilities where you can inspect facilities and audit operations, AI systems can be copied and deployed anywhere once the weights leak.

    4) Universal Basic AI Access (UBAI)

    All individuals receive access to a high-capability AI assistant as a public good or regulated utility. This approach broadly boosts productivity and equity, but requires massive funding and safety guardrails; it also multiplies the number of powerful endpoints that could be misused without strong constraints.

    This resembles public education systems. Most developed countries provide schooling as a right because an educated population creates economic value for everyone. The difference is that education takes years to deliver capability, while AI access is instant. UBAI also connects to Universal Basic Income debates. Elon Musk, Sam Altman, and other tech billionaires support UBI as a solution to AI displacement. Their reasoning is straightforward: if AI produces most economic value, people need income even without traditional jobs. UBAI takes a different approach – instead of giving people money, give them the capability tool itself. Everyone gets an AI assistant that can generate economic value. The problem is that unlike UBI which just redistributes money, UBAI creates millions of endpoints with powerful capabilities. One person’s AI assistant could be used for beneficial work. Another’s could execute sophisticated fraud, generate disinformation at scale, or find vulnerabilities in critical systems. The safety challenge grows linearly with access.

    5) AI as Public Utility (Commons Model)

    Frontier models are developed and governed as public infrastructure through state, cooperative, or open-source consortia. The key difference from licensed-access is ownership and democratic control rather than just professional operation. Access is broad and transparent, with systems accountable to public goals rather than private shareholders; challenges include sustained funding, safety assurance at scale, and preventing state or factional capture.

    Open-source software demonstrates this model. Linux runs most of the internet. Wikipedia provides knowledge as a commons. But those projects don’t require billions in annual compute costs or pose existential risks. Funding a public AI commons at frontier scale means government budgets comparable to defense spending. The real challenge is governance: who decides what the AI can do? A government-owned commons risks becoming a state surveillance tool. A cooperative structure faces coordination problems. Open-source consortia struggle with safety decisions when anyone can fork the code.

    6) Human-AI Symbiosis (Augmentation First)

    AI is embedded as personal augmentation through co-pilots, wearables, or brain-computer interfaces so that humans remain the unit of agency. Economic gains come from “centaur” teams rather than autonomous AI actors; risks shift to inequality of augmentation and new forms of dependency or manipulation.

    Professional software developers already work this way with GitHub Copilot. Radiologists use AI to flag anomalies while making final diagnoses. The symbiosis model keeps humans in control but creates a new digital divide. Those with better augmentation outcompete those without, similar to how literacy created economic advantages historically. Unlike literacy, AI augmentation costs scale with capability, potentially creating permanent capability gaps between economic classes.

    7) Narrow-Only / Moratorium on AGI (Prohibition Approach)

    Societies restrict or pause development and deployment above defined capability thresholds, allowing only domain-specific, interpretable tools. This reduces catastrophic-risk exposure but forgoes some upside and demands difficult international coordination to avoid illicit development.

    We’ve attempted technology bans before with mixed results. The US tried to classify strong encryption as munitions in the 1990s, failing completely as the code spread globally. The Nuclear Non-Proliferation Treaty attempted to limit nuclear weapons to a few countries, yet Pakistan, India, North Korea, and Israel acquired them anyway. China banned cryptocurrency mining and trading, only to see it migrate to other jurisdictions. AI development faces the same verification and enforcement problems. The tools and knowledge are dual-use, detection is nearly impossible, and economic incentives to defect are enormous. A moratorium only works if every major power agrees and actually complies.

    8) AI Dominance / Misaligned Takeover (Failure Mode to Avoid)

    Highly capable AIs gain de facto control over critical economic and decision systems, optimizing for non-human objectives. Humans lose practical agency even if formal rights remain; preventing this outcome motivates the strong governance and safety choices in the other scenarios. This is a scenario nobody hopes for but worth mentioning as a reference.


    Time will tell which scenarios blend together and in what ratio to form the way AI participates in the economy. Who knows, maybe one day AI will form a union.

  • Information Inflation and Competitive Advantage in the Era of AI

    In business school and executive education, participants consistently ask about “leveraging” AI in their businesses and “creating value” through AI. That’s natural – who wouldn’t want to discover how to accomplish more with modern tools? But what happens after these statements is often what I call lazy thinking. So let’s unpack what it actually means to have competitive advantage now that different AI models provide performance multipliers across all industries.

    Everyone can benefit from these powerful tools, but just because you can personally get more done doesn’t mean you can sit back, enjoy, and reap the benefits.

    What’s often left out of these first intuitions is that if you can be 10%, 20%, 100%, even 500% more efficient in your current job, so can everybody else!

    Consider a financial controller, the trusted person in an organization who checks spreadsheets from sales and suppliers each week and provides key metrics to management about how the business is performing. These metrics have proven essential because this particular person is so good at their job that management can trust them to provide early warning if there are any indications of problems that need close examination and corrective action.

    Now if this person uploads the same spreadsheets into a frontier model, asks a specific prompt, and sends the output to management, you might think they got the job done faster and now have more time to play golf or pursue any other hobby of their choice. That’s fundamentally wrong.

    If a file is uploaded, a prompt is entered for processing, and output is treated as-is, there’s very little value added in that process. Previously, this person might have spent several days looking at the numbers and consciously or unconsciously going through a meticulous process of comparing certain ratios and trends based on previous experience. That’s why this particular person gained such a good reputation as the most trusted analyst or advisor to management.

    Now if that whole process is replaced by a prompt, it would be more efficient for management to submit that prompt to the model themselves without bothering anybody else. If that prompt is the secret sauce for success, there’s not much to defend. It’s only a matter of time before that person is no longer needed.

    To add a little “moat” to the process and value to the job, the person in question can do more than just prompt a model. They can take the output and review it carefully. This already adds some value – the heavy lifting is done by a frontier model, but there’s still an experienced expert checking the output. More defendable, but also not for very long.

    Expanding the concept further, the expert can create a playbook for how to analyze data, how to apply learnings from the past, highlight particularly important aspects, suggest certain formulas, and still check the outcome before submitting it to management. This approach has even more “moat” and we might talk about defendability of several months up to a year.

    Going further, the expert can now have more time and start processing historical data using more advanced statistical methods that they simply didn’t have time for before, leading to a better playbook than they could have ever created. This approach not only has a better moat but also improves performance beyond previous capability.

    Making it a continuous process and widening the scope from one particular spreadsheet to an entire business function has even more moat and delivers even more value.

    The term I like to use when teaching is information inflation. Just because you type something into a frontier model chatbox doesn’t necessarily mean you created more value. In absolute terms, yes you did. However, the baseline is a moving target. The rule of thumb is that if anybody can do it, it doesn’t add any relative value. You need to discover how to use your knowledge and skills to improve the process beyond your current capabilities. Only that process counts.

    In academia, we define competitive advantage with terms like “ability to produce supranormal profits over time.” It means you can produce more profits than your peer group over a certain period. That’s why doing the bare minimum isn’t enough – it’s most likely less than what the rest of the market is doing.