Perspectives in Spec Driven Development

There’s a great weekly online morning meetup called “The Secrets of Product Management”, led by Nils Davis, that I join when I can. Recently the topic of Spec Driven Development came up.

Full disclosure: I didn’t take notes in the meeting, and there were a lot of concepts and thoughts shared verbally and in the chat. Some of what I recall may be off, and I hope that anyone who was present and has better recall will share their thoughts in the comments.

Some thought it was about a Product Manager gathering all of the specifications for a product in advance, and that it led back to waterfall style processes.

Some thought it was building a Proof-of-Concept to serve as the specification.

By the end of the discussion, the one thing everyone (mostly) agreed on is that it works much better when done iteratively, and includes direct references to standards.

As an architect who still codes, my understanding of SDD is that it is about the spec files that are carefully crafted to direct generative AI in how to write the code. It is a way to get better code from the AI that will require less refactoring after the first results.

The different perspectives made me think it was worth doing a little research and summing it up for my reader here. I admit to mentally vacuuming up a lot of content about AI in order to feed my own synthesis on its use, and the key thing that I saw differently was the ownership of the specification used for SDD.

When I presented the question “Who owns the spec in spec driven development?” to an AI, it responded with “…humans own the spec…”, which points out a whole new perspective.

So, that’s what drove me to dig in a little bit to improve my own understanding and share the results.

A Quick History Lesson

Like most things in IT, the earliest signals appear long before what we later label as “modern” computing (a term that conveniently tends to align with when each of us personally got excited about technology). As far back as 1970, Winston Royce’s Managing the Development of Large Software Systems: Concepts and Techniques outlined ideas that closely resemble what many now think of as Specification-Driven Development (SDD). Interestingly, its diagrams reflect structures similar to waterfall methodologies (an ironic reminder that many “new” ideas are refinements of older patterns rather than entirely novel inventions).

These concepts did not evolve in isolation. Over the following decades, they were reinforced by related disciplines such as formal methods and API design principles like Design by Contract* (which emphasized precision, verifiability, and clearly defined interfaces). Later, approaches like Behavior-Driven Development (BDD) carried some of this thinking forward, framing specifications as shared artifacts between humans and systems (but still largely as guidance rather than execution).

What has changed more recently is the role of AI in making specifications actionable. Around 2025, tools began to emerge that transformed specs from passive documentation into active drivers of implementation. Projects like AWS Kiro and GitHub’s spec-kit marked a shift. Specifications became executable guides for coding agents (not just references for developers). In this sense, “modern” has continued to compress (moving from spanning decades to evolving almost in real time), as specs shift from descriptive artifacts to operational components of the development process.

Opinions Still Differ

I don’t think my input in the recent conversation changed anyone’s mind about how they define SDD. And people will definitely have strong opinions on the value of SDD.

In a recent post, Allen Holub said:

“People talk about spec-driven design, but the best spec you can have is a test—a test you write before you write the code. You don’t write a test to see if the code adheres to a spec. The test IS the spec. Don’t write specs. Write tests.”

I agree with TDD proponents: tests are part of a Continuous Testing cycle, a practice that was just starting to really catch on before GenAI went GA and is even more important since. That said, tests are part of the spec; they are just managed a little differently because the developer doesn’t happen to be human. That’s the whole point of SDD. It is how developers work with agents through clear communication. Because, let’s face it, the Agile approach of sitting with a user won’t work with AI until after the code has been written, and pair-programming with an AI was only modern for a moment.
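Holub’s point can be made concrete with a trivial, hypothetical example (the function and its behavior are my own illustration, not from his post): the test is written first and acts as the spec that a human, or an AI agent, codes against.

```python
# A test-first "spec" for a hypothetical email-normalizing function.
# The test is written before the implementation and defines the contract.

def test_normalize_email():
    # The spec: surrounding whitespace is removed and case is folded.
    assert normalize_email("  Alice@Example.COM ") == "alice@example.com"
    assert normalize_email("bob@example.com") == "bob@example.com"

# An implementation written (by a human or an agent) to satisfy the test.
def normalize_email(raw: str) -> str:
    return raw.strip().lower()

test_normalize_email()  # passes silently when the "spec" is met
```

In SDD terms, the assertions are the machine-checkable slice of the spec; the prose around them is the part the agent reads for intent.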

Helpful Tools to Try

Tools make this less painful than it sounds.

GitHub’s spec-kit is a good entry point. It gives you Markdown templates for a “constitution” file of principles, followed by spec.md, plan.md, and tasks.md. You drive it with slash commands in your IDE, and the AI fills in the gaps. The authors put it well: “The specification captures intent clearly, the plan translates it into technical decisions.” ([GitHub Blog], Spec-driven development with AI: Get started with a new open-source toolkit) Amazon’s Kiro uses staged workflows, and Tessl goes further, treating code as a byproduct of the spec. Red Hat recommends “lessons learned” files that feed back into future specs, cutting errors over time ([Red Hat Developers], How spec-driven development improves AI coding quality).
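To make the file layout concrete, here is a hypothetical skeleton of what a minimal spec.md might look like. The file names (constitution, spec.md, plan.md, tasks.md) come from spec-kit’s own description; the section headings and contents below are my illustrative guesses, not the actual templates.

```markdown
<!-- spec.md: hypothetical skeleton, not the real spec-kit template -->

# Feature: Export report as CSV

## Intent
Users need to pull their monthly report into a spreadsheet.

## Acceptance Criteria
- Export respects the current filter settings.
- Column headers match the on-screen labels.

## Out of Scope
- Scheduled or emailed exports (deferred to a later phase in plan.md).

## Standards
- RFC 4180 for CSV formatting.
```

Note the direct reference to a standard in the last section; that was the one point of near-consensus in the meetup discussion.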

Wrapping Up

All in all, my sense is to treat specs like your IaC or database schemas. Human owned from the start, iterated carefully, governed with some structure. Reference standards to ground it. Try it small, on a utility script maybe, and see how it holds up in real work.

If it fits your flow, it can add real velocity with AI. If not, no big loss; plenty of paths forward.

*Side note: Yes, I usually put these inline in parentheses (a habit my AI editors hate), but this one seemed too long for that, so… I did some research with Gemini, where it insisted on a correlation between design by contract and spec driven development, which at first I took to mean it prefers its training data over current information, so I switched to my usual research LLM wrapper, Perplexity. After some hind-brain thinking, it occurred to me that Gemini may have semantically equated specification with contract, which is another quirk of AI: it is so darn literal!

If you found this interesting, please share.

© Scott S. Nelson

Take the Tax out of Taxonomy

TL;DR: Your GenAI output is failing because your local workspace is a disaster. If your desktop is a dumping ground, your enterprise data lake is guaranteed to be a swamp. Stop blaming the model, establish a strict folder taxonomy, and kill the bad data habits before they scale.

For my regular reader, you know I can’t resist a pun, and the initial research note for this post was “Timely topic title: Take the Tax out of Taxonomy”. You also know I digress, so I thought I would get it out of the way at the start. Done. Moving on to the next level…

You are paying a massive hallucination tax. You bought a premium AI subscription or deployed a desktop agent. You pointed it at a project directory full of deprecated drafts, unstructured notes, “versioned” files, maybe even some sample code. Now the AI is confidently generating output based on requirements from three years ago, and maybe Wednesday’s lunch order.

Many AI users assume a better foundation model or more elaborate prompt engineering will fix inaccurate output. It will not. According to the research paper A Comprehensive Taxonomy of Hallucinations in Large Language Models, hallucinations are not merely a bug but a theoretically inevitable feature of computable LLMs, irrespective of architecture or training.

You cannot patch out hallucinations with a clever system prompt. You have to restrict their oxygen.

Generative AI operates entirely on the context it is fed. When you open a workspace (or upload a zip file, or point it at SharePoint), the model uses the folder structure to understand relationships. It assumes every file in the provided directory is equally valid, current, and relevant.

To get faster, accurate output, you must adopt a standardized, hierarchical folder taxonomy. This is not a housekeeping chore. It is a strict data contract for your AI. The academic consensus supports this structural approach. As outlined in A Systematic Framework for Enterprise Knowledge Retrieval, transforming a static blob of data into a navigable, context-rich knowledge architecture significantly improves model accuracy and reduces retrieval latency.

The Prep Station Metaphor

Think of an LLM as a highly skilled line cook with zero short-term memory. If you ask the cook to make an omelet, but point them to a kitchen counter where the fresh eggs are mixed in a pile with old receipts, bleach, and rotten produce, the resulting meal will be toxic.

You have to prep the station before you ask for the work.

This requires changing how you manage your local environments. You must segment your files and organize your folders explicitly by client, project, or (and sometimes “and”) specific activity. When the AI opens that specific folder, the taxonomy forces it to focus strictly on the given task.

The Micro-Macro Data Contagion

Local file structures often mirror enterprise data architecture. If your team’s shared drive is a chaotic dumping ground of nested, unnamed folders, your enterprise data lake is likely more of a content swamp.

Organizations often fund massive, top-down enterprise data transformation projects. They deploy tools to wrangle petabytes of unstructured data. Consultants are brought in to describe how it should be done, walk you through clean up, and leave you with a perfectly indexed wiki on how to maintain it.

So why don’t other organizations do this kind of cleanup? Aside from the few that don’t need it, the rest have someone recruited from an organization that did need it, then did it… then did it again a few years later. At least some had acquisitions as an excuse. The rest just forgot to make being organized part of the organization’s culture.

The report What Is Data Taxonomy? Use Cases & Best Practices points out that taxonomy programs do not fail because the classification structure was wrong. They fail because nobody owned it after launch, or the controlled vocabulary was written for data engineers rather than the business users who needed to adopt it. A taxonomy that nobody actively owns becomes outdated within twelve months.

If you build a pristine enterprise knowledge graph but your teams still save raw client notes to a local desktop folder named “Misc”, your clean data architecture will erode. Bad habits always defeat good infrastructure.

Start locally. Expand globally. Treat your team’s shared folder as the training ground for enterprise AI.

Here is the implementation baseline for engineering a reliable folder taxonomy.

  1. Force Local Discipline: The guide Document Taxonomy Simplified notes a critical reality: AI can read full text, but without consistent indexing and classification, it has a harder time understanding which documents are current or relevant for a specific question. Humans must define the taxonomy. Organizations that rely solely on AI risk amplifying bad data.
  2. Build a Strict Domain Hierarchy: Segment folders strictly by project and lifecycle status. Your AI should never have read access to a “Drafts” directory when you are asking it to write production documentation.
  3. Establish the Data Contract: Metadata like document type, owner, client, date, and status tells AI not just what a document says, but how it should be used. This context improves AI ranking and reduces irrelevant hits that happen to share keywords.
  4. Separate Human and AI-Native Formats: Segment your directories into files meant for humans and files meant for the AI. Lean towards using markdown files, text files, and CSV files for AI consumption. Keep heavy, formatting-rich files in a separate reference folder that the AI does not scan unless explicitly commanded.
  5. Isolate Contextual Boundaries: Open-ended prompts can generate answers that blend multiple disciplines or outdated content. When your library is indexed by department, project, and lifecycle stage, AI can narrow its focus and answer questions within the right slice of your information.
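As a sketch of what enforcing such a data contract might look like in practice, the short script below checks workspace-relative paths against a hypothetical client/project/lifecycle layout. The folder names, lifecycle labels, and naming rules are illustrative assumptions, not a standard.

```python
import re
from pathlib import PurePosixPath

# Hypothetical taxonomy: <client>/<project>/<lifecycle>/<doc-type>/<file>
# The lifecycle labels below are an assumption for this sketch.
LIFECYCLES = {"drafts", "active", "archive"}

def check_path(path: str) -> list[str]:
    """Return a list of taxonomy violations for a workspace-relative path."""
    parts = PurePosixPath(path).parts
    problems = []
    if len(parts) < 4:
        problems.append("too shallow: expected client/project/lifecycle/...")
    elif parts[2] not in LIFECYCLES:
        problems.append(f"unknown lifecycle folder: {parts[2]!r}")
    # Ad-hoc version markers in names are exactly the "versioned files"
    # noise that confuses an AI about which document is current.
    if any(re.search(r"(final|v\d+|copy)", p, re.I) for p in parts):
        problems.append("ad-hoc versioning in name; use the lifecycle folder")
    return problems

print(check_path("acme/website/drafts/notes/meeting.md"))  # []
print(check_path("acme/website/misc/report_final_v2.docx"))
```

A script like this, run as a pre-commit hook or a scheduled check, is one way to make the “occasional reminder (preferably automated)” mentioned below real.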

You’ll note that there is a lack of solid reference examples of good taxonomies. This is, again, related to data cleanliness being driven by culture. The same taxonomy may or may not work for another organization. But a solid taxonomy based on how the organization thinks and processes can easily be maintained through training, communication, and the occasional reminder (preferably automated).

The ROI of a strict folder taxonomy is immediate. Output precision goes up. Token waste goes down.

If your AI is only as reliable as the context it receives, your unstructured file storage is an active threat to your workflow. Build the hierarchy locally. Clean up the directories.

Credit to Dylan Davis’s video 5 Changes That Make ChatGPT & Claude 10x Better for sparking this research.

If you found this interesting, please share.

© Scott S. Nelson

Learning AI and Going Broad or Deep First

It depends.

Yeah, I hate that answer, too, but it’s because we all prefer simple answers to the real ones. We also want to believe in overnight success, one size fits all, and that the plug and play option is all we need. And don’t get me started on long term weather predictions!

But it helps to know what it depends on, which is, in this case, where you are in your AI journey. The same approach really applies to any learning journey where there are multiple aspects, so we’re going to start with looking at it simply from the perspective of learning.

Why and How and When to Start Deep

If this is your first foray into a new realm of knowledge, start by going deep on one aspect.

Pick the area that you are most interested in. Intrinsic motivation is a better driver for learning than any reason that includes “have to”. Once you have picked that topic, dig in and follow your curiosity until you feel you can converse freely on the topic. This is how you build a baseline mental model.

Going deep in one specific corner makes other adjacent areas easier to absorb later because you actually have a frame of reference to hang new information on. When you encounter a new tool, you filter it through existing mental models to facilitate integration of new knowledge. This cognitive filtering means you aren’t starting from scratch every time a model updates. You are simply updating a specific branch of an existing tree. (See The Memory Paradox: Why Our Brains Need Knowledge in an Age of AI)

The Pivot to Breadth: Mapping the Landscape

Once that baseline mental model exists, going broad is more valuable.

This first accumulation of breadth is to understand what’s possible, or available. You aren’t trying to master everything. You’re mapping the space so you know where the boundaries are. This aligns with the “T-shaped professional” model, defined by having deep expertise in a specific area while also possessing broad knowledge across various disciplines. This structure ensures you have enough technical depth to contribute high-value work immediately. It also gives you enough breadth to collaborate with experts in adjacent fields without needing a translator.

Going broad makes it easier to know exactly when and where to go deep next. Knowing what exists and what is possible makes it easier to say “I have an idea of how that can be done” with conviction.

The Trap of Constant Depth

The problem with going deep on one thing at a time, after the initial deep dive, is that when you need the knowledge or skill in a practical situation, there may be something adjacent that would make the task easier or is better suited to it.

If you’re buried in a single silo, you won’t see it. This is why pure specialists struggle when their niche technology shifts. Markets move faster than individual mastery, which is why modern engineering organizations must embed specialists into existing teams. Breadth prevents you from becoming a legacy asset the moment your specific tech is disrupted. It provides the foundation for transferring implicit knowledge, which is the exact kind of knowledge needed to generate creative ways of tackling business problems. Innovation happens at the intersection of two unrelated fields.

Managing the Hierarchy of Ideas

To move between breadth and depth effectively, you have to understand how to categorize information. A practical framework for conceptualizing those categories in a given realm of knowledge is The Hierarchy of Ideas. This concept allows you to mentally zoom in and out of a topic, ensuring you are always operating at the exact level of detail the current problem requires.

Think of a hierarchy using transportation as the frame. At the top, you have the abstract concept of “Transportation”, which includes planes, boats, trains, cars, skateboards, and ox carts. Moving down a level, you find “wheeled vehicles”, which is still broad enough to encompass trains and scooters. Further down, “Cars” includes internal combustion, electric, and pedal-powered. As a mechanic, you will be more interested in learning the distinctions between a Ferrari and a Hyundai, or between the Sonata and the Kona. The higher you go, the more general and broad the idea becomes. The lower you go, the more specific and detailed it gets.

Navigating this hierarchy is done through “chunking.” When you chunk up, you move from the specific Tesla to the broader category of “Transportation” to understand the big picture. When you chunk down, you move from the general concept of “Cars” into the specific components like the “Battery Management System” to find depth. You can also chunk laterally, moving from “Cars” over to “Trains.” This allows you to find alternative solutions that exist at the exact same level of utility.
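The transportation hierarchy above can be sketched as a small tree, with chunking up and down as tree navigation. The data structure and function names are just an illustration of the idea, not a formal model.

```python
# A toy model of the Hierarchy of Ideas using the transportation example.
HIERARCHY = {
    "Transportation": {
        "Wheeled vehicles": {
            "Cars": {"Internal combustion": {}, "Electric": {}, "Pedal-powered": {}},
            "Trains": {},
        },
        "Boats": {},
        "Planes": {},
    }
}

def chunk_up(tree, target, parent=None):
    """Return the parent category of `target` (chunking up)."""
    for name, children in tree.items():
        if name == target:
            return parent
        found = chunk_up(children, target, name)
        if found:
            return found
    return None

def chunk_down(tree, target):
    """Return the subcategories of `target` (chunking down)."""
    for name, children in tree.items():
        if name == target:
            return list(children)
        found = chunk_down(children, target)
        if found is not None:
            return found
    return None

print(chunk_up(HIERARCHY, "Cars"))    # Wheeled vehicles
print(chunk_down(HIERARCHY, "Cars"))  # ['Internal combustion', 'Electric', 'Pedal-powered']
```

Chunking laterally falls out of the same structure: chunk up one level, then chunk back down to list the siblings.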

The AI Sandbox: Applying Levels and Chunks

AI is like transportation in the way that zoology is like geology. They aren’t one giant subject. There’s a hierarchy of distinct concepts, applications, audiences, and values that you have to navigate intentionally.

Start by chunking down into a specific primary aspect. If you choose Software Development, don’t just use a generic chatbot. Master how tools integrate directly into the developer’s workflow. Modern development is shifting toward a model where the AI handles low-level syntax while human engineers and architects manage high-level logic and security.

If you choose Marketing, dive into tools capable of predicting future trends. These platforms move you from general demographic targeting to individual-level behavioral forecasting in real-time. This creates your first deep anchor.

Once you feel steady, chunk up. Skim through news and articles about the overall space so you get a sense of the capabilities. Map the broader landscape—from vector databases to multimodal generation. Staying informed at this high level prevents you from getting blindsided by architectural shifts.

As you build that breadth, you chunk laterally. Then, when something comes up that would benefit greatly from an aspect other than your first specialty, you will recognize that your current focus isn’t the right one in that context. If you are deep in software development but hit a bottleneck in data quality, your broad map will point laterally toward data architecture. You will have a good idea what aspect is better suited. Then you can partner with someone that has that expertise, or learn it deeply yourself, or both.

Effective mastery requires building a foundation deep enough to create your mental anchor, while maintaining a wide enough perimeter to spot the right tools for the job. You cannot specialize into obsolescence and expect to stay relevant in a field that moves this fast. Whether you are ready for your first technical deep dive or you are currently gathering seeds for future growth, the only wrong move is standing still. Pick a starting point and get to work.

For your convenience (plus, I hate to throw away interesting artifacts that AI outputs when researching my articles), below is a non-exhaustive list of areas to consider when conceptualizing the Hierarchy of Ideas around AI.

AI Deep Dive Reference Table

  • Software Development (AI-assisted coding and autonomous agents): Tools handle boilerplate code and test generation. This shifts engineering cycles away from syntax and toward system architecture.
  • Marketing Campaigns (predictive analytics and forecasting): Systems are built for predicting future trends. Marketers deploy these models to adjust budgets preemptively rather than reacting to yesterday’s reports.
  • Prompt Engineering (advanced linguistics and logic structures): Mastery involves navigating how mental models assist in problem-solving. This discipline structures language strictly enough to force an LLM into predictable reasoning patterns.
  • Data Architecture (AI-ready data pipelines and vector databases): Success requires establishing a comprehensive inventory of everything in your ecosystem. AI models hallucinate when fed fragmented data; clean pipelines act as the essential guardrail against garbage outputs.
  • Content Creation (generative text, image, and video workflows): AI enables executing multidisciplinary mental models for solving complex problems. Scaling content now relies on curating outputs against a strict brand voice, not writing from a blank page.
  • Business Intelligence (pattern detection and anomaly resolution): BI teams use AI to deploy real-time anomaly detection. This replaces static dashboards with active alerting systems that diagnose the drop in metrics before leadership even asks.

If you found this interesting, please share.

© Scott S. Nelson

Developers are now Agent Managers: Enter the New Matrix

TL;DR: The use of AI in development has shifted from a coding assistant to a team of agents doing the heavy lifting. This requires developers to skill up in management and forces a fundamental shift in how software roles collaborate.

The Managerial Migration

For those not watching closely (which is most people, and perhaps only a few reading this), the world of software delivery is in the midst of a tectonic shift. The use of AI is evolving from a simple coding assistant to a team of agents, or experts, performing the bulk of the work. This moves the developer role from a traditional software engineer to an agent manager.

The change in role definition and skills is another aspect of the paradigm shift the Age of AI is bringing. If this sounds new to you, it is because you may have missed the transition that the proliferation of the “WWW” subdomain brought to IT in the 90s. We are all going to come out better, but it is going to be a long haul as we re-learn lessons from last time and write new ones for this era.

The Expertise Gap in Management

One common misconception is that this automation means developers can be replaced by Product, Project, or Program managers. This is mostly the “Wall Street Rumor Mill,” which is only just now being revised from “replacing people with AI” to “shifting investment from employees to AI vendors”. At least that is more honest.

The “average” manager often lacks the technical depth to write a precise specification or review the complex output of an agentic workflow. Managing a digital workforce requires the same technical understanding and focus as writing source code. If you cannot perform a rigorous technical review of what AI agents produce, you should not put it into production (unless you suffer from a terminal case of technical hubris).

The Developer Drift

While many managers lack the depth to take over, developers are not guaranteed success in this new role without learning to view technical problems from different angles. Many developers tend to drift from the business context without a reminder, which is why lifecycle ceremonies exist to gather feedback from users and owners. For some, this is a forest-versus-trees effect, while for others, it is the temptation of a “cool” approach over a practical solution.

The speed of AI can take a minor gap in understanding and expand it into a costly chasm. When an agent can produce a week’s worth of logic in seconds, the cost of moving in the wrong direction scales exponentially. The team must find a way to collaborate where agents are a factor beyond just a tool choice.

Grit over Gift

This proficiency is not a magic gift: it is a byproduct of learning, practice, and pushing boundaries. There is a persistent myth that “prompt engineering” is an inherent talent or a shortcut for the lazy. It is actually the opposite. Real proficiency comes from hundreds of hours spent in invisible iteration. You have to break the agents to understand how to fix the workflow. These skills are then applied to context engineering, where the developer becomes the manager and the back-and-forth transitions to a human-in-the-loop system.

Deep experience can sometimes trigger intellectual rigor mortis, where you stop looking for a better way because you already know the “right” way. To succeed now, you need the grit to unlearn habits that are no longer efficient. High ROI in the age of AI belongs to the person who pushes boundaries through practice, not the one waiting for the “perfect” model to arrive.

The Practical Pivot

As we navigate this “.ai moment,” leadership, managers, and developers need a new way to interact. It is no longer about a ticket hand-off: it is about real-time orchestration.
  • Developers: Start treating your AI tools as interns, not calculators. An intern needs guidance, a clear spec, and a rigorous peer review. If they produce garbage, it is a reflection of your management. Mentor your agents by providing better context and documentation.
  • Managers: Help leadership understand that the “silver bullet” still requires expert aim. AI is a force multiplier, but it requires a human who knows where to point the barrel. Use these tools to bridge the communication gap, not to eliminate the experts.
  • Everyone: Support each other in cross-training. Incorporate big-picture product thinking with low-level solutioning. Document the new workflows immediately, as your team now includes transient sessions that lack long-term memory.

Incorporating this new layer requires new connections, shifts in responsibility, and overlaps that act as double-checks from different perspectives rather than simple redundancies.

If you found this interesting, please share.

© Scott S. Nelson

The Beginner’s Mind Stands on a Foundation

TL;DR: Experience is a liability when it kills curiosity. AI proficiency is a byproduct of hundreds of hours spent in invisible iteration.

The Expertise Trap

Deep experience often triggers intellectual rigor mortis. You have seen the “right” way to do things for a decade, so you stop looking for the better way. A beginner mindset is not about being a blank slate (which is just another word for useless). It requires enough of a foundation to know when you are headed in the right direction, or perhaps a parallel path with new perspectives, before getting back on track.

If you have twenty years of experience but no curiosity, you are just a legacy system waiting for a decommission date. The ROI on a “beginner” attitude is higher because it allows for rapid pivoting. You need the basics to provide a compass (to avoid spending days debugging a syntax error), but you need the mindset to explore the paths that lead to breakthroughs.

The AI Sweat Equity

There is a myth that prompting is a low-skill activity. It is not. Most people who are really good at prompting have iterated and learned. The developers currently running multiple agents and building software 4 to 10 times faster than they did last year have put in months of practice to get there.

This is iteration 0 work. It is messy and mostly undocumented (because the tech moves faster than the README files). What makes it daunting for those first starting is that the people who are now good at it did it without formal training. There is a tendency to forget how much effort went into the initial struggle.

Building the Mental Infrastructure

Learning new technical skills requires toggling between different cognitive states. Barbara Oakley (author of Learn Like a Pro: Science-Based Tools to Become Better at Anything and Learning How to Learn, among other great books) describes this as the tension between focused and diffuse modes. Focused mode is for the granular syntax: the structure of a prompt or a script. Diffuse mode is where the beginner’s mindset lives. It is the relaxed, curious state that allows your brain to make the non-linear connections required to solve a problem that does not have a documentation entry yet.

She emphasizes chunking: breaking complex concepts into small, functional units until they become second nature. This prevents cognitive overload when the system throws an error you have never seen before. Curiosity is a tool that keeps you in the diffuse mode long enough to see the “big picture” before diving back into the details. I took her class on Coursera at the start of my AI journey, and I recommend everyone do the same, even if your interests are in other areas. It applies to learning anything, and you will thank yourself for doing so.

Those who are already proficient say “oh, it’s easy, you just do this”, which looks like magic to the beginner. It is not entirely different from visiting a new area and asking a local for directions. Every time they start with “Oh, that’s easy”, there is a good chance you are going to get lost following their directions.

Locals navigate by landmarks that either do not stand out to an outsider or have disappeared from all but the local’s memories. They tell you to turn where the oak tree used to be or past the shop that changed names five years ago. They have internalized the route so deeply they forget the friction of finding it the first time.

And sometimes, that makes the trip more fun. Stop looking for the “perfect” prompt or the “right” workflow. Spend more time being “lost” in the tool. The goal is not to avoid the detour: the goal is to have a strong enough foundation to know how to get back to the main road once the detour stops being productive.


A Simple Roadmap

If you haven’t begun your journey with Generative AI, or feel a bit lost, here’s a simple roadmap to help you along:

  1. Pick one model and stay there: Stop comparing benchmarks and just use one tool (Claude, GPT, or an LLM via API) for a week straight to understand its specific “personality.”
  2. Iterate on a single prompt 50 times: Don’t just accept the first output. Change one variable at a time until you understand exactly what triggers a hallucination vs. a logic block.
  3. Read the system prompt documentation: Most users treat AI like a search engine. Read the actual technical guides on “system roles” and “temperature” to understand the controls.
  4. Practice manual orchestration: Before you try to automate a multi-agent system, act as the agent yourself. Copy the output of one model into another and manually fix the “gotchas” in between.
  5. Fail on purpose: Try to make the model break. If you don’t know the edges of the tool, you won’t know when you are standing on a cliff.
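Step 4 can be sketched in plain code. The two “models” below are stand-in functions rather than real API calls; the point is the shape of the loop you practice manually: generate, inspect and fix, then pass along.

```python
# Manual orchestration, sketched with stand-in "models". In real use, each
# stand-in would be a call to a different LLM; here they are plain functions
# so the hand-off pattern itself is visible.

def drafting_model(prompt: str) -> str:
    # Stand-in for a first LLM call that produces a rough draft.
    return f"DRAFT: {prompt} (with a stray TODO left in)"

def review_model(draft: str) -> str:
    # Stand-in for a second LLM call that polishes the draft.
    return draft.replace("DRAFT: ", "FINAL: ")

def human_in_the_loop(draft: str) -> str:
    # The manual step: you catch the "gotchas" between the two models.
    return draft.replace(" (with a stray TODO left in)", "")

draft = drafting_model("summarize the release notes")
cleaned = human_in_the_loop(draft)
print(review_model(cleaned))  # FINAL: summarize the release notes
```

Once you have done this by hand enough times to know where the gotchas appear, you will know exactly what an automated multi-agent pipeline needs to check for you.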

If you found this interesting, please share.

© Scott S. Nelson