The AI-First Company: How to Compete and Win with Artificial Intelligence, by Ash Fontana
What's it about?
The AI-First Company (2021) argues that businesses which deliberately build AI into their core operations from the start – rather than bolting it on later – are the ones poised to dominate their industries. It walks you through how to identify valuable data, build the right teams, integrate AI into existing workflows, and reinvest the gains from automation to keep compounding a competitive edge.
Dogs have a sense of smell roughly 10,000 times more powerful than ours. Lions can sprint at 50 miles per hour. Whales can hear each other across entire ocean basins. And yet none of those animals built a civilization. Humans did – not because they’re faster or stronger or sharper-sensed, but because they gather information, process it collectively, and learn from it faster than anything else on Earth.
Now hand that ability to a machine. One that never sleeps, never forgets, and sharpens itself with every new piece of data it encounters. That’s what this lesson is all about: building an AI product that can gather, process, and communicate information faster than the competition – and stay ahead. The businesses that will dominate the next several decades aren’t just using AI as a tool – they’ve built it into the very DNA of how they operate, compete, and grow.
By the end of this lesson, you’ll have a clear, practical framework for building a company that uses artificial intelligence and compounds its advantage through it, automatically, day after day.
Every era of business has a defining edge. The industrial age rewarded scale – whoever built the biggest factory set the price and squeezed out everyone else. The internet age rewarded network effects – whoever attracted the most users became impossible to displace. Think of the telephone: worthless if you’re the only one who owns one, nearly indispensable once everyone does.
Today’s defining edge is something called a data learning effect, or DLE. The basic mechanic goes like this: your system makes a prediction, a customer acts on it, that action generates new data, the new data improves the model, and the better model makes sharper predictions. The system gets smarter automatically, without anyone having to push it. That’s the compounding part. Small improvements, reliably repeated, become enormous advantages.
What makes DLEs especially powerful is that they don’t rely on just one competitive force. They combine three classic advantages in a single loop: scale, because more data strengthens the model; processing efficiency, because better algorithms extract more signal from the same raw material; and network effects, because more users generate more outcomes, which feed back into the loop and make the product more useful for everyone. When these three forces work together, the resulting advantage is genuinely hard to attack.
That’s not to say DLEs don’t have limitations. Externally, regulators can intervene if a company’s data accumulation starts looking like a monopoly – something that’s increasingly on policymakers’ radar. Internally, data storage gets expensive, and piling on more inputs doesn’t always add value. Sometimes it just adds noise. The rule of thumb is strict: additional data must make the model more accurate, and that increased accuracy must genuinely matter to the customer. Break either link in that chain and the compounding stops cold.
Before you build an AI, you need to ask whether your customer actually needs one.
Think about the kind of decision your customer is trying to make. If it’s a one-off call with high stakes, using structured data from a known source combined with statistical analysis will probably serve them better than any machine learning model. AI earns its keep when decisions are frequent, data is messy, and the environment keeps shifting. In most real business situations, customers need a blend of both approaches – and a common mistake is jumping straight to the sophisticated solution before you’ve validated whether the simpler one does the job.
The smarter path is what the author calls Lean AI. Start with basic statistical tools – histograms, clustering, unsupervised grouping of similar objects. This phase isn’t a compromise or a consolation prize. It’s reconnaissance. You’re learning what’s actually in your data before committing to building anything elaborate on top of it.
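To make that reconnaissance phase concrete, here is a minimal sketch of what "basic statistical tools" might look like in practice. It uses only the Python standard library; the order values, the bin width, and the gap threshold are all invented for illustration, and the gap-based grouping is just one simple stand-in for clustering, not a method the book prescribes.

```python
from collections import Counter

def histogram(values, bin_width):
    """Count how many values fall into each fixed-width bin."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

def group_by_gaps(values, max_gap):
    """Naive 1-D grouping: start a new group whenever the gap between
    neighbouring sorted values exceeds max_gap."""
    groups = []
    for v in sorted(values):
        if groups and v - groups[-1][-1] <= max_gap:
            groups[-1].append(v)
        else:
            groups.append([v])
    return groups

# Hypothetical order values in dollars, made up for this example.
order_values = [12, 15, 14, 90, 95, 11, 88, 13, 300]

print(histogram(order_values, bin_width=50))
# {0: 5, 50: 3, 300: 1}
print(group_by_gaps(order_values, max_gap=20))
# [[11, 12, 13, 14, 15], [88, 90, 95], [300]]
```

Even a toy pass like this tells you something: the data seems to fall into three rough segments, and one of them is a single outlier worth checking before any model gets built on top of it.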
From there, bring in a single data scientist to answer a single, well-defined question. Not a team. Not a platform. Not a roadmap. One person, one question, one clear demonstration of return on investment. Your question could be as small as figuring out how to determine which type of bridge can be seen in a photo.
Once that value is proven, you can start preparing your data more seriously – but even then, carefully. The temptation is to label everything in sight, harvest from every possible source, and build a sprawling infrastructure that feels suitably ambitious. Resist all of it. Momentum comes from validated progress, not from the appearance of scale.
Only after you’ve proven the value of your question should you build your first model. It doesn’t need to be highly accurate yet. It just needs to exist and generate predictions – because predictions are what start the loop turning, and the loop is what the entire long-term strategy is built around.
Not all data is worth pursuing, and confusing quantity for strategic value is one of the most expensive mistakes an AI-first company can make.
The datasets worth fighting for are the ones that are genuinely hard to replicate. These might be physically difficult to access – they may require going to a local council office, or collecting paper files and photocopying them manually. Other datasets may be legally restricted and only harvestable at a slow, painstaking rate. Financial market data providers, for example, only allow you to quote consecutive stock prices at a predetermined time interval, unless you pay more. If you’re offering an AI product that makes predictions on profitable trades, having this data could make all the difference.
When it comes to actually acquiring data, smaller customers are often more valuable partners than they first appear. Large enterprise customers can hand you enormous volumes, but they often come with so many usage constraints that the strategic value gets negotiated away before you’ve signed anything.
Once you have data, it needs to be labeled. Without labels, your model won’t know what it’s looking at. One solution is to outsource the work. This gets you volume but at the cost of specialization. Another option is building an in-house labeling team, which gives you precision but costs more time and money to stand up. Most companies land somewhere in between – domain experts set the quality standard, machines scale the effort, and non-experts correct the machine’s errors as they go.
You could also make use of data synthesis, where you generate new data points from existing ones using a set of pre-defined rules. For example, you can take the basic features of a chair – the length of the legs, whether it has a back, the type of covering – and output all the possible combinations of those components, with the rule that the outputs must all have four legs. This lets you test products and features at a low cost and without touching live customer data, which keeps risk exposure low during the messy early stages of building.
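The chair example can be sketched in a few lines. This is only an illustration of rule-based synthesis: the specific feature values and the extra three-legged option (included so the four-leg rule has something to filter out) are assumptions, not details from the book.

```python
from itertools import product

# Hypothetical feature values, invented for this sketch.
LEG_COUNTS = [3, 4]
LEG_LENGTHS_CM = [40, 45, 50]
HAS_BACK = [True, False]
COVERINGS = ["fabric", "leather", "none"]

def synthesize_chairs():
    """Enumerate every feature combination, keeping only four-legged chairs."""
    chairs = []
    for legs, length, back, covering in product(
        LEG_COUNTS, LEG_LENGTHS_CM, HAS_BACK, COVERINGS
    ):
        if legs != 4:  # the rule from the text: outputs must have four legs
            continue
        chairs.append({
            "legs": legs,
            "leg_length_cm": length,
            "has_back": back,
            "covering": covering,
        })
    return chairs

chairs = synthesize_chairs()
print(len(chairs))  # 18, i.e. 3 leg lengths * 2 back options * 3 coverings
```

Every record here is generated from rules rather than collected from a customer, which is why this kind of data is cheap and low-risk to experiment with.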
An AI-first company needs a genuinely different kind of team – one that spans data infrastructure engineering, machine learning research, statistics, econometrics, and deep domain expertise. That’s a wide range to hire for, and assembling it takes time. A good place to look is university departments, in fields including economics, actuarial science, biology, biostatistics, engineering, and physics. These kinds of backgrounds equip potential hires with a lot of fundamentals required for data handling.
Once your team exists, you move on to the harder challenge: managing it. Data scientists don’t work like engineers – they work like researchers. An engineer gets a specification, builds to it, ships it, and moves on to the next ticket. A data scientist explores. They ask questions, follow threads, sometimes hit dead ends, and often backtrack before finding something useful.
That process looks inefficient from the outside, but it’s exactly how good models get built. Managing it well means scheduling regular touchpoints to keep the questions these scientists are investigating tightly connected to what customers actually need – because without that tether, brilliant research can drift toward problems that nobody will ever pay to solve.
Budget management looks fundamentally different here, too. Running AI systems demands serious computing power, and data scientists need the freedom to experiment without constantly seeking approval. But that freedom has to be paired with genuine oversight of how resources are being consumed, because cloud computing bills can compound just as fast as model performance. The companies that get this balance right – real freedom within clear guardrails – turn their research teams into engines of competitive advantage.
Building your model involves training it to make predictions based on input data. More specifically, you need to tell it which features to look for in the data so that it can identify things.
There are several ways you can achieve this. Sometimes you can code directly. Say you’re building an AI model to identify eyes. You can write in a feature in the code that activates whenever the model comes across a circular, black group of pixels – a pupil.
Other times you need to engage in supervised learning. This is when you give the model labeled data. You might feed it a bunch of pictures labeled as “zebra” and a bunch of others labeled “not zebra”. Your model then finds a set of relevant features for identifying a zebra by comparing both sets of images. Human feedback then becomes important for correcting the model when it makes a wrong prediction.
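As a rough illustration of the zebra example, the sketch below trains a nearest-centroid classifier on a handful of labeled examples. Everything specific here is an assumption: real image models learn their features from pixels, whereas this toy version pretends each picture has already been reduced to two hypothetical numbers (stripe contrast and the share of black-and-white pixels). Nearest-centroid is also just one simple supervised method, chosen for brevity.

```python
def centroid(rows):
    """Mean of each feature column."""
    return [sum(col) / len(col) for col in zip(*rows)]

def train(examples):
    """examples: list of (features, label). Returns one centroid per label."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(rows) for label, rows in by_label.items()}

def predict(model, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def distance(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(model, key=lambda label: distance(model[label]))

# Hypothetical features: (stripe contrast, share of black-and-white pixels)
labeled = [
    ([0.9, 0.8], "zebra"),
    ([0.8, 0.9], "zebra"),
    ([0.2, 0.1], "not zebra"),
    ([0.1, 0.3], "not zebra"),
]

model = train(labeled)
print(predict(model, [0.85, 0.7]))  # zebra
print(predict(model, [0.15, 0.2]))  # not zebra
```

Human feedback fits in naturally: when a person corrects a wrong prediction, the corrected example is appended to `labeled` and `train` is run again.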
Finally, there’s unsupervised learning. This is when your model groups the data by itself, using various programmed methods, and extracts relevant features on its own. Unsupervised learning needs large volumes of data but earns its keep when you’re still figuring out what patterns are even worth looking for.
It’s easy to celebrate when a model first goes live. But that’s not the end of your work. As a model continuously adapts to new data, it can quietly wander away from the reality it was originally built to reflect – it can drift. The outputs gradually become a little less reliable, a little more skewed, until one day a customer notices something is off. That’s why rigorous model management is absolutely essential. Treating it as an afterthought is how companies get embarrassed publicly.
Good model management means three layers of testing, applied consistently. First, test your data quality using statistical process control, or SPC – a method that flags anomalies in output and tracks them back to changes in how your incoming data is distributed. Second, test model accuracy by systematically adjusting your model’s parameters, features, and training periods, mapping results carefully at each step. Third, run integration tests to make sure your model’s code works cleanly alongside your own infrastructure and your customers’ systems.
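The first layer, SPC, can be sketched as a simple control chart. The version below assumes a classic three-sigma rule applied to the daily average of a single input feature; the numbers are invented, and a production setup would likely track many features and use more than one rule.

```python
from statistics import mean, stdev

def control_limits(baseline, sigmas=3.0):
    """Lower and upper control limits from a baseline period."""
    centre = mean(baseline)
    spread = stdev(baseline)
    return centre - sigmas * spread, centre + sigmas * spread

def out_of_control(values, limits):
    """Return (index, value) pairs that fall outside the control limits."""
    lower, upper = limits
    return [(i, v) for i, v in enumerate(values) if not lower <= v <= upper]

# Hypothetical daily averages of one input feature.
baseline = [10.1, 9.8, 10.0, 10.3, 9.9, 10.2, 9.7, 10.0]
incoming = [10.1, 9.9, 12.5, 10.0]

limits = control_limits(baseline)  # roughly (9.4, 10.6)
print(out_of_control(incoming, limits))
# [(2, 12.5)]
```

A flagged point is a prompt to investigate whether the incoming data has shifted, ideally before the change shows up as skewed predictions in front of a customer.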
On the question of bias, the most powerful tool you have is constraints. Decide upfront what your model is and isn't allowed to predict, who can access those predictions, and what they're permitted to do with them. Where those limits are breached, automate a shutdown rather than relying on manual detection.
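One way to picture those constraints is as a thin wrapper around the model. The sketch below is hypothetical throughout: the class name, the roles, the allowed outputs, and the toy prediction rule are all made up, and it makes one particular design choice, which is to refuse unauthorized callers outright but shut the whole model down when it produces a prediction it was never allowed to make.

```python
class ConstrainedModel:
    """Wraps a prediction function and disables itself on a constraint breach."""

    def __init__(self, predict_fn, allowed_outputs, allowed_roles):
        self._predict_fn = predict_fn
        self._allowed_outputs = set(allowed_outputs)
        self._allowed_roles = set(allowed_roles)
        self.active = True

    def predict(self, features, role):
        if not self.active:
            raise RuntimeError("model is shut down pending review")
        if role not in self._allowed_roles:
            raise PermissionError(f"role {role!r} may not request predictions")
        output = self._predict_fn(features)
        if output not in self._allowed_outputs:
            self.active = False  # automated shutdown, no human detection needed
            raise RuntimeError(f"disallowed prediction {output!r}; model shut down")
        return output

def toy_predict(features):
    # Stand-in for a real model: a hypothetical rule for illustration only.
    return "approve" if features["score"] >= 0.5 else "flag_by_postcode"

guarded = ConstrainedModel(
    toy_predict,
    allowed_outputs={"approve", "review"},
    allowed_roles={"underwriter"},
)
print(guarded.predict({"score": 0.9}, role="underwriter"))  # approve
```

If `toy_predict` ever returns `"flag_by_postcode"`, the wrapper raises an error, sets `active` to `False`, and rejects every later call until someone reviews it.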
Learning is a loop. You observe something, act on what you’ve learned, observe the consequences, then act again with better information. Every scientist, every athlete, every manager worth their salary runs some version of this cycle. Data learning effects work the same way.
Take a simple example: an AI-first product predicts the pasta shelf in a store will be empty in 15 minutes and pings the stocker. He checks after lunch, finds it bare, restocks it, and logs it in his app. The in-store camera catches the whole sequence – pasta, no pasta, pasta again. The AI pulls in both signals, confirms the prediction was spot-on, and next time nudges the stocker a few hours earlier so the shelf never actually runs out.
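The predict, act, observe, update cycle can be sketched as a tiny loop. This is a deliberately simplified stand-in for the pasta example: the model is a single number (estimated minutes until the shelf is empty), the "true" value of 60 minutes is invented, and the update rule is plain error correction with a hypothetical learning rate, not anything the book specifies.

```python
class ShelfModel:
    """Keeps one estimate and nudges it toward each observed outcome."""

    def __init__(self, estimate_minutes, learning_rate=0.5):
        self.estimate_minutes = float(estimate_minutes)
        self.learning_rate = learning_rate

    def predict(self):
        return self.estimate_minutes

    def learn(self, observed_minutes):
        """Update the estimate and return the size of the prediction error."""
        error = observed_minutes - self.estimate_minutes
        self.estimate_minutes += self.learning_rate * error
        return abs(error)

TRUE_MINUTES_TO_EMPTY = 60  # hypothetical ground truth seen by camera and app

model = ShelfModel(estimate_minutes=120)
errors = []
for _ in range(4):
    prediction = model.predict()          # 1. the system makes a prediction
    observed = TRUE_MINUTES_TO_EMPTY      # 2. the stocker acts, the outcome is logged
    errors.append(model.learn(observed))  # 3. the new data improves the model

print(errors)  # [60.0, 30.0, 15.0, 7.5]
```

Each pass through the loop halves the error without anyone retuning the model by hand, which is the compounding behavior a data learning effect depends on.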
Any good loop requires letting customers act on your model’s predictions in the real world, not just in controlled tests. Real-world use introduces the kind of healthy unpredictability that generates genuinely new data, and helps the loop compound. This data also becomes the most strategically precious data in your arsenal. Your competitors didn’t collect it. They can’t buy it anywhere. It belongs exclusively to you and forms the basis of your competitive advantage.
Once the loop is running and your data advantage is genuinely compounding, the final moves are about consolidation. Vertical integration is one of your most powerful levers here. It means you move beyond simply solving a technical problem to solving a business problem for your customers. For example, you might answer all customer emails yourself, rather than providing outsourced agents with a list of suggested responses. A more radical example would be to form your own insurance company, rather than just providing an AI tool for existing companies to process claims more effectively. The more you own the stack, the more data flows back to you, and the more revenue and profit you capture directly.
Disruption is the next lever. Disruption starts with providing a specialized product to a niche segment of an incumbent’s customer base, and at a lower cost. For example, an intelligent system that could search for all possible chemical combinations when making a drug could reduce the cost of producing that drug, and ultimately the price for the customer.
Once you’ve captured some of your incumbent’s clients, you direct them to novel, personalized AI features and charge more for them. Google is a great example: it started out charging less for its ads than website banners, but now charges more because of the personalized targeting it offers.
Ultimately, to complete your disruption, you’ll want to provide completely automated products to your most demanding customers. The fully automated market belongs to AI-first companies – old incumbents can’t even set foot there.
The main takeaway of this lesson on The AI-First Company by Ash Fontana is that the companies dominating the next few decades won’t just use AI – they’ll be built around it.
The core engine of AI-first companies is the data learning effect, where predictions generate actions, actions generate data, data sharpens the model, and the loop compounds automatically. Getting there means starting smaller than you think. Start lean – validate simple statistical approaches before committing to machine learning, and prove value with one question before scaling anything. Fight hard for data that competitors genuinely can’t replicate, build a team that thinks like researchers rather than engineers, and manage your models rigorously so they don't quietly drift off course.
Once the loop is running, consolidate it through vertical integration and targeted disruption of incumbents. That’s how the advantage compounds – and keeps compounding.