Thought Leadership

2025-01-03

Is there truly no moat in AI, or does post-training behavior create one after all?

Trust is a key factor in successful AI adoption, often intertwined with more commonly discussed measures of quality. While quality can be evaluated using metrics such as accuracy and latency, trust is often influenced by subtler elements. Among these elements, behavior—the manner in which a model responds, reasons, and interacts—may be one of the most decisive in determining whether users embrace or reject a system.

First, we discuss the notion of a post-training moat. Then we examine why behavior shapes trust, walk through an example of my own, and close with where this may take us this year.

Actually, maybe there is a “moat” after all...

There is a widespread notion that “there is no moat” in AI because large-scale models often use the same public data for pre-training. While it is true that pre-training on openly available data creates fewer barriers to entry, focusing solely on pre-training overlooks the significance of post-training. During the alignment or fine-tuning stage, models develop distinctive behaviors that can be difficult to replicate. These behaviors often draw on privately held or proprietary data, shaping how the model “speaks,” how it balances opposing views, and how it adapts to user needs. This post-training process can create a practical moat, anchored not in the publicly available data but in the unique ways a model has been refined.

Why Behavior Shapes Trust

Behavior is about more than correctness; it reflects how models present information, handle ambiguity, and address diverse viewpoints. For example, consider two AI systems evaluating an analogy or concept you have created. One system takes a “glass half full” approach, emphasizing clever aspects and a creative spin. Another offers a “glass half empty” view, pointing out flaws and potential oversights. Neither system is inherently wrong, but each elicits a distinct emotional or cognitive response from the user. Overly affirmative behavior can appear insincere or sycophantic, eroding trust if users suspect the AI is merely echoing their own ideas. Conversely, overly critical behavior can feel abrasive, particularly if users seek constructive support rather than blunt critique.

Striking a balance that invites engagement without alienating the user is an important part of building trust. This balancing act mirrors human interactions: we often look to trusted advisors for both encouragement and respectful criticism.

Post-Training and the “Trust Moat”

The same public dataset that informs broad linguistic competence cannot guarantee trust-building behavior. Instead, post-training efforts—such as reinforcement learning from human feedback (RLHF) or domain-specific fine-tuning—incorporate user preferences and contextual constraints. These processes develop the “trust moat,” in which the refined behavior of the AI system sets it apart from more generic models. Because the data and feedback loops that inform these refinements are often proprietary or at least highly specialized, replicating them is not straightforward.

Model providers develop proprietary datasets based on human feedback that capture nuanced preferences about how their AI should respond. These datasets aren't just about right/wrong answers, but about style, tone, and judgment calls in complex situations. The iterative process of refining behavior (through techniques like RLHF) involves countless micro-adjustments based on user interactions and expert feedback. It's similar to how a person's communication style is shaped by years of social interactions: you can't simply copy the end result without going through a similar learning process.
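To make the idea of preference data a little more concrete, here is a minimal sketch of what a single human-feedback record and a pairwise preference loss might look like. This is an illustration under my own assumptions, not any provider's actual pipeline: the `PreferencePair` structure and the function name are hypothetical, and the loss shown is the Bradley-Terry style formulation commonly described for reward modeling in RLHF.

```python
import math
from dataclasses import dataclass


@dataclass
class PreferencePair:
    """Hypothetical record: one prompt, two responses, and which one a rater preferred."""
    prompt: str
    chosen: str    # the response the human rater preferred
    rejected: str  # the response the human rater passed over


def pairwise_preference_loss(score_chosen: float, score_rejected: float) -> float:
    """Bradley-Terry style loss: -log(sigmoid(score_chosen - score_rejected)).

    The scores stand in for a reward model's outputs (assumed here).
    The loss is small when the chosen response scores well above the rejected one.
    """
    margin = score_chosen - score_rejected
    # Numerically stable form of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The point of the sketch is that each record encodes a judgment call about style and tone rather than a right/wrong label, which is part of why such datasets are hard to reproduce from public data alone.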

The result is a set of behaviors that differ from model to model and are likely to diverge further over time. (Of course, whether these differences amount to a competitive advantage remains to be seen.)

Measuring Behavior in Passing: The Role of “Eval”

Though traditional metrics still matter, specialized evaluation strategies enable us to track not just what a model says, but how it says it—and that ‘how’ is the foundation of trust. As I wrote in a previous post on “eval,” selecting the right benchmarks for both quality and trust is key. Standard tests quantify performance, whereas human-focused evaluations are usually necessary to reveal how users perceive a model’s helpfulness, neutrality, or tone. Although these processes can be more qualitative, they offer vital insights into how behavior shapes acceptance and trust.
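As a rough sketch of what a human-focused evaluation could look like in practice, the snippet below averages rater scores per model and per behavioral dimension. Everything here is an assumption for illustration: the function name, the 1-to-5 scale, and the dimension labels are hypothetical, and real evaluations would need far more care around rater agreement and sampling.

```python
from collections import defaultdict
from statistics import mean


def summarize_ratings(ratings):
    """Average human ratings per (model, dimension).

    `ratings` is an iterable of (model, dimension, score) tuples, where score
    is assumed to be on a 1-5 scale assigned by a human rater.
    """
    by_key = defaultdict(list)
    for model, dimension, score in ratings:
        by_key[(model, dimension)].append(score)
    return {key: mean(scores) for key, scores in by_key.items()}
```

A summary like this sits alongside standard accuracy and latency numbers rather than replacing them; it simply gives the "how" a place on the scorecard.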

Example in Action

My recent experience with two models illustrated the importance of behavioral “fit.” The “glass half full” model praised my analogy for its ingenuity, while the “glass half empty” model highlighted flaws and clumsiness. Neither response was universally better; each spurred a different reaction in me, the author. I appreciated the flattery yet also valued the candor, and the differences between the two responses were striking and meaningful.

On one hand, affirmation can breed skepticism if it feels disingenuous. On the other hand, excessive negativity can discourage continued exploration of an idea. Which should I choose? The answer, predictably, is 'both'. This interaction reinforced something humans already know to be true but that we are still learning in AI: diverse opinions lead to stronger outcomes. As the 'human in the loop' in this case, I had a chance to weigh both perspectives, pick a path, and then stand by it.

For the record: in this instance, I selected the more optimistic interpretation, largely because it was the holiday season and I was inclined to maintain a festive mood—though I still recognized the critical feedback as valid.

A Path Ahead

As AI systems become more integrated into everyday workflows, trust will hinge on the balance between factual accuracy and perceived alignment with user needs. Behavior, shaped by post-training data and processes, can serve as a durable moat for AI developers, distinguishing one model from another even if both are trained on the same public corpus. By embracing rigorous yet nuanced evaluation methods (“eval”) and refining how models behave in varied situations, developers can ensure that their AI systems foster trust—ultimately guiding users toward meaningful, sustained adoption.

The balance of quality, trust, and nuanced behavior is a path forward for AI adoption—giving users not just results, but results they can rely on.