Why every GTM org will need an AI skill marketplace
Your team's AI fluency gap is widening, and here's what to do about it
If this was forwarded, welcome to the GTM Engineering newsletter. Hi, I’m Alex, one of the first GTM Engineers at Clay, and I’m here to share what I’m building for customers, how to build AI-native GTM, and resources for GTM Engineering. Join 7,000+ GTM operators, founders, and investors.
Most of the time, I have no idea what model to use. The subtitle in the model selector says ‘complex reasoning’. But what does that even mean?
Many of us default to the most powerful model for most tasks. There’s no incentive not to. It’s not a problem yet, but it will be soon.
Think about what that means at scale. Anthropic’s flagship model costs roughly 15x more per token than its smallest one. A rep summarizing a call transcript doesn’t need the flagship. Neither does a workflow that formats CRM notes or drafts a follow-up email. But nobody told them that, there’s no system enforcing it, and nobody’s measuring the difference. Across a 200-person GTM org running AI workflows all day, that adds up to a real budget line nobody owns yet.
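To make "adds up" concrete, here's a rough back-of-the-envelope sketch in Python. Only the 15x price ratio and the 200-person headcount come from the paragraph above. The per-token price, daily token volume, workdays, and the share of work that could run on the smallest model are hypothetical placeholders, so swap in your own numbers before drawing any conclusions.

```python
# Back-of-the-envelope model spend.
# From the article: PRICE_RATIO and HEADCOUNT.
# Everything else is a hypothetical placeholder, not real pricing or usage data.
PRICE_RATIO = 15                      # flagship vs. smallest model, per token
HEADCOUNT = 200                       # size of the GTM org

SMALL_PRICE_PER_MTOK = 1.00           # assumed $ per million tokens, smallest model
TOKENS_PER_PERSON_PER_DAY = 500_000   # assumed daily usage per person
WORKDAYS_PER_YEAR = 250               # assumed
ROUTABLE_SHARE = 0.7                  # assumed share of work that doesn't need the flagship


def annual_spend(flagship_share: float) -> float:
    """Yearly spend in dollars when `flagship_share` of all tokens hit the flagship."""
    mtok_per_year = HEADCOUNT * TOKENS_PER_PERSON_PER_DAY * WORKDAYS_PER_YEAR / 1_000_000
    flagship_price = SMALL_PRICE_PER_MTOK * PRICE_RATIO
    blended_price = (
        flagship_share * flagship_price
        + (1 - flagship_share) * SMALL_PRICE_PER_MTOK
    )
    return mtok_per_year * blended_price


everything_on_flagship = annual_spend(1.0)
routed = annual_spend(1.0 - ROUTABLE_SHARE)
savings = everything_on_flagship - routed

print(f"Everything on the flagship: ${everything_on_flagship:,.0f}")
print(f"Routed by task:             ${routed:,.0f}")
print(f"Difference:                 ${savings:,.0f}")
```

The absolute dollars depend entirely on the placeholders. The point that survives any set of inputs is the shape: with a 15x price gap, the share of tokens sent to the flagship dominates the bill.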
This is just one problem with AI adoption on GTM teams.
AI skill sprawl is another. Lack of centralization is a third. The largest is how inconsistent AI usage is across the team, because individual fluency varies so widely.
I’ve been thinking about this problem for a while. Then I was introduced to Tessl, a company founded by Guy Podjarny, who also started Snyk (a cybersecurity unicorn where I used to work).
Tessl is a management layer that turns risky, sprawling, invisible skills into a governed, measurable system. They argue there are three questions that every AI-native team needs to answer:
Security & governance - if a risky skill ran in your environment, would you even know?
Standardization & reuse - how much are duplicate, outdated skills costing your team?
Continuous optimization - your agents have the skills, but are they actually using them?
There’s a lot more to AI skills to unpack than I realized.
Skills are a new unit of software.
That has implications for security, governance, and more. At a basic level, though, Tessl is building a product I believe will become a default layer of every GTM stack, and of how teams adopt, scale, and use AI effectively across the org.
Defaulting to the most expensive model is one symptom of a bigger problem: GTM teams are building AI skills with zero infrastructure underneath them. And we’ve seen this exact movie before.
A skill marketplace solves for more than cost, and more than giving your organization a single place to find the skills it has built. But let’s dive into the cost problem first.
A parallel to cloud computing
In the early cloud days, every engineer could spin up whatever compute they wanted. It felt like freedom. Then the bills arrived, and an entire discipline — FinOps — was invented to clean up the mess. It still hasn’t fully worked: Flexera’s 2025 State of the Cloud Report found that 27% of cloud spend is wasted, a number that’s barely moved since 2019. At a $675B global cloud market, that’s roughly $180B a year evaporating into idle instances and over-provisioned resources.
Flexera’s own analysts drew the line to what’s next: just as early cloud usage produced unwieldy costs, AI spend is following the same arc — and “FinOps for AI” is already forming as a category to deal with it.
Most GTM leaders haven’t connected this yet: Skills, not infrastructure, decide what AI costs in your org. The prompts, instructions, and workflows your team feeds to Claude, ChatGPT, and your agents determine which model runs, how many tokens burn, and whether the output is even usable. Skills are the unit where cost, quality, and security all get decided. And right now, in almost every GTM org I see, skills are markdown files living in someone’s Slack DMs.
The wild west of AI exploration
The cost problem is the visible one. There are three quieter problems compounding underneath it.
Nobody is testing anything.
Software engineers learned long ago that untested code is broken code — you just don’t know it yet. The dev world is already applying this to AI: Tessl runs task-based evaluations on agent skills and found that evaluated, optimized context produced up to 3.3x improvement in agents using APIs correctly. Now ask yourself: who in your GTM org has ever run an eval on a prospecting skill? On an account research prompt? Anyone? In GTM, “testing” a skill means one person ran it twice and it seemed fine. We’re shipping un-evaluated skills to entire sales teams and wondering why output quality is inconsistent.
Everyone gets different outputs.
When ten reps each write their own version of “research this account before my call,” you get ten different research standards walking into ten different meetings. The whole point of GTM engineering is building systems that make the team’s floor higher than any individual’s ceiling. Unversioned, unshared skills do the opposite — your output quality becomes a lottery based on who happened to write the best prompt.
Everyone is rebuilding the same thing.
I see this constantly. Someone builds a great ICP-scoring skill. Three weeks later, someone two pods over builds a worse version of the same thing because they didn’t know the first one existed. In software, we solved this with package managers — npm, pip — so nobody rewrites a date-parsing library from scratch. GTM has no equivalent. Good work doesn’t travel; it just gets duplicated, badly.
Your team needs an AI skill marketplace
This is the thesis: every company is going to need an internal marketplace for AI skills — a governed, versioned, measurable registry — the same way every company eventually needed a package manager, a CI pipeline, and a FinOps function.
Why?
There’s a structural paradox at the center of modern GTM teams: the gap between what the best rep on the team knows how to do and what the average rep actually does is enormous — and it’s getting wider as AI tooling proliferates.
Tacit knowledge — intuitive skills learned through first-hand practice — is the hardest type to capture and scale. The best practices and tricks that high performers develop through years of customer interactions live inside those individuals, not inside any system. The enablement challenge is codifying that expertise and sharing it across the team.
Meanwhile, AI has made the technical surface area of GTM work dramatically larger. While 81% of sales teams are experimenting with AI, only 26% can scale it beyond pilot programs to generate real ROI. McKinsey reports that 60–70% of tasks performed by sales reps are technically automatable — but the gap between potential and performance is an operator problem.
This is the structural need an “internal skills marketplace” addresses: not a training catalog, but a living library of executable workflows that any rep or operator can pull from and deploy.
The skills gap is quietly widening
92% of sales professionals already use AI in some form, yet 84% say the main benefit is saving time or optimizing processes. That’s table stakes. The reps pulling ahead have moved past time savings into genuine competitive advantage. The gap comes down to AI fluency, not tool access. The top skills that separate them are prompt engineering, AI output verification, and data literacy. These are almost certainly the ones most companies ignore.
Is anyone building this right now?
The dev world is building this right now. Tessl is essentially npm for agent skills: install skills as versioned dependencies, track them in a manifest, run evals before they ship, scan them for security issues, and push updates across the whole team instead of copying markdown files between repos. The framing they use is the right one — skills are software, and software needs a lifecycle.
Skills are the new code
GTM should steal this wholesale.
Here’s the minimum viable version:
A single source of truth. One repo (or Notion database, or shared directory — the tool matters less than the discipline) where every team skill lives. If a skill isn’t in the registry, it doesn’t exist. Kill the Slack-DM distribution model.
An owner per skill. Every skill gets a named maintainer responsible for updates when the underlying tool, model, or playbook changes. Orphaned skills are how you end up with reps running a six-month-old prompt built for a model that’s been deprecated.
A model recommendation baked into every skill. This is the cheapest fix with the biggest cost impact. Each skill should declare which model tier it needs: fast-and-cheap for summarization and formatting, mid-tier for drafting, flagship only for genuinely hard reasoning. Most GTM skills don’t need the flagship — saying so explicitly, inside the skill itself, beats sending anyone a memo about token budgets.
Lightweight evals before anything ships team-wide. You don’t need an eval platform to start. Take 10 real inputs — 10 actual accounts, 10 real transcripts — run the skill against all of them, and have the skill’s owner grade the outputs against a simple rubric. If it can’t pass that, it’s not ready for 50 reps. This one habit alone puts you ahead of nearly every GTM org I’ve seen.
Versioning, even if it’s crude. v1, v2, v3 in the filename beats nothing. When a skill gets improved, the team adopts the improvement automatically — not whichever copy they happened to save. This is also how messaging changes ship: update the pitch skill once, and the whole team is running the new play tomorrow instead of next quarter.
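Here's a minimal sketch of what those five habits could look like in code, assuming Python 3.9+. Every name in it (Skill, Registry, passes_eval, the tier labels, the 0.8 pass threshold, the "alex" owner) is hypothetical and invented for illustration. It is not Tessl's API or any real tool, and the model call and the grading rubric are stand-in functions.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical tier labels mirroring the three tiers described above.
MODEL_TIERS = ("fast", "mid", "flagship")


@dataclass(frozen=True)
class Skill:
    name: str
    version: int
    owner: str        # the named maintainer
    model_tier: str   # the model recommendation baked into the skill
    prompt: str

    def __post_init__(self) -> None:
        if self.model_tier not in MODEL_TIERS:
            raise ValueError(f"unknown model tier: {self.model_tier}")
        if not self.owner:
            raise ValueError("every skill needs an owner")


class Registry:
    """Single source of truth: if a skill isn't in here, it doesn't exist."""

    def __init__(self) -> None:
        self._skills: dict[str, list[Skill]] = {}

    def publish(self, skill: Skill) -> None:
        versions = self._skills.setdefault(skill.name, [])
        if versions and skill.version <= versions[-1].version:
            raise ValueError("new version must be higher than the latest one")
        versions.append(skill)

    def latest(self, name: str) -> Skill:
        return self._skills[name][-1]


def passes_eval(
    run_skill: Callable[[Skill, str], str],
    grade: Callable[[str, str], bool],
    skill: Skill,
    inputs: list[str],
    min_pass_rate: float = 0.8,  # assumed threshold; pick your own
) -> bool:
    """Run the skill on real inputs and let the owner's rubric grade each output."""
    if not inputs:
        raise ValueError("need at least one real input to evaluate against")
    passed = sum(grade(text, run_skill(skill, text)) for text in inputs)
    return passed / len(inputs) >= min_pass_rate


# Stand-ins for a real model call and a human-written rubric.
def fake_run(skill: Skill, text: str) -> str:
    return f"[{skill.model_tier}] summary of: {text}"


def fake_grade(text: str, output: str) -> bool:
    return text in output


registry = Registry()
registry.publish(Skill("call-summary", 1, "alex", "fast", "Summarize this call transcript."))
registry.publish(Skill("call-summary", 2, "alex", "fast", "Summarize this call in five bullets."))

current = registry.latest("call-summary")
ready = passes_eval(fake_run, fake_grade, current, ["transcript A", "transcript B"])
print(current.name, f"v{current.version}", current.model_tier, "ready:", ready)
```

The design choice worth copying is that the owner and the model tier are required fields, so a skill without them can't be published at all. In practice the same structure works just as well as columns in a Notion database or front matter in a markdown file.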
What drives better outputs (relative to cost): context or model?
Tessl used its platform to answer this question by comparing Claude Fable 5 vs Opus 4.8 within the context of skills.
It’s a great example of why all of this matters.
Every scenario in the evaluation is a real agent task tied to a published skill, scored on two axes: instruction-following (does the agent do what it was told, in the way it was told) and task-completion (does it reach the goal). The overall score weights instruction-following at 4 and task-completion at 3, then divides by 7. Each task runs with and without the skill, so the lift from the skill is visible directly. The tasks and skills are public, in the task-evals-for-skills dataset, so you can inspect any scenario yourself.
This design is deliberate. The tasks come from published skills, so they mirror the real work teams write skills for, not frontier puzzles meant to find a model’s ceiling. That is why task-completion runs high for both models and why the signal that separates them is instruction-following: doing the work the specific way the skill asks. — from “Claude Fable 5 vs Opus 4.8: The Mythos Hype Meets Reality”
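The weighting in that passage is simple enough to sketch in a few lines of Python. The 4 and 3 weights and the divide-by-7 come from the quote. The function names, the 0 to 1 score scale, the idea of computing lift as a plain difference, and the sample scores are my assumptions for illustration, not Tessl's code or results.

```python
INSTRUCTION_WEIGHT = 4   # from the quoted methodology
COMPLETION_WEIGHT = 3    # from the quoted methodology


def overall_score(instruction_following: float, task_completion: float) -> float:
    """Weighted average: (4 * instruction-following + 3 * task-completion) / 7."""
    total_weight = INSTRUCTION_WEIGHT + COMPLETION_WEIGHT
    weighted_sum = (
        INSTRUCTION_WEIGHT * instruction_following
        + COMPLETION_WEIGHT * task_completion
    )
    return weighted_sum / total_weight


def skill_lift(with_skill: float, without_skill: float) -> float:
    """Assumed definition of lift: score with the skill minus score without it."""
    return with_skill - without_skill


# Made-up scores on an assumed 0-1 scale, for illustration only.
with_skill = overall_score(instruction_following=0.90, task_completion=0.95)
without_skill = overall_score(instruction_following=0.50, task_completion=0.90)

print(f"with skill:    {with_skill:.3f}")
print(f"without skill: {without_skill:.3f}")
print(f"lift:          {skill_lift(with_skill, without_skill):.3f}")
```

Because instruction-following carries the heavier weight, two runs that both reach the goal can still land far apart overall, which matches the quote's point that instruction-following is the signal separating the models.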
How a skill marketplace helps your team
The registry is the infrastructure. The value shows up in three places — and conveniently, they map to the three people who need to approve this.
For your VP Sales: day-one rep ramp.
Today, a new rep spends their first month reverse-engineering how the best people on the team work — shadowing calls, begging for prompt screenshots, slowly assembling their own janky toolkit. With a skill registry, day one looks different: the new rep gets the full team loadout auto-installed. The account research skill, the call prep skill, the follow-up writer — all the same versions your top performer runs. Ramp time has always been gated by how fast knowledge transfers from senior reps to new ones. A skill registry turns that transfer from months of osmosis into a download.
For your CRO: top-performer extraction.
Every team has one rep whose call prep is unreasonably good. Right now, that advantage retires when they do. Turn their process into a skill, eval it, publish it, and their edge becomes the team’s floor. This is the highest-ROI knowledge management most GTM orgs will ever do, and almost nobody is doing it.
For your CFO and legal team: governance by default.
Model recommendations baked into skills fix the default-to-expensive problem without policing anyone. And legal approves the claims in the outreach skill once, instead of auditing a thousand freelanced prompts after a rep promises something the product doesn’t do. For regulated industries, this is the only realistic way to let reps use AI at all.
The companies that built FinOps muscle early saved millions while their competitors argued about whose AWS bill it was. The same window is open right now for AI skills in GTM — and it’s wide open, because almost nobody is doing this yet. The first RevOps or GTM engineering leader at your company to stand up a skill registry will cut model spend. They’ll also end up owning rep ramp, playbook velocity, and the layer where every AI workflow in the org gets built.
The biggest impact?
Everyone operating at a higher level of execution with the same sharpened tool set.
Subscribe for upcoming content:
How to replace ZoomInfo with Clay
Marketing use cases with Clay you’ve never heard of
How to do context engineering in Clay (or for any agent) to get better outputs
GTM Engineering Benchmarks









