Meet Deadlines and Manage Technical Debt with AI-Assisted Architecture

tl;dr: New platform. Deadline. The instinct is to move fast and clean it up later. That’s where technical debt is born. A well-constructed Claude project, loaded with curated platform documentation and queried with the experience to know what to ask and how to evaluate the responses, tactically compresses the ramp-up without sacrificing strategic design principles.


The Sharp Fork in the Road

Every architect and engineering lead who has given a project to deliver on a new platform or with new technology under deadline pressure knows this fork. Pushing for proper preparation can get you marked (ironically) as a risk from the leadership perspective. Plowing forward using old techniques without understanding the new nuances keeps you up at night…either knowing you are missing something up front, or fixing what you didn’t know during the final death-march phase of a waterfall project that just happens to use Kanban boards, daily stand ups, and sprint ceremonies.

One path: move fast. Learn just enough to ship. Ask support when you hit a wall. Request exceptions when you hit limits. Get it working and tell yourself you’ll revisit the architecture when there’s more time. (There is never more time.) What you build in that mode becomes the foundation everything else is built on, and the cost of fixing it compounds with every sprint.

The other path: slow down. Read the documentation properly. Understand the platform’s constraints before you design around them. Make the right call the first time. This is correct and often impractical. Deadlines are real. The platform is new. The documentation is dense. The team is waiting.

The Contentstack project that prompted this post took the first path and ran into a SaaS governance constraint that happens to be measured recursively. The first time it was hit, the response was typical for teams working with a new SaaS vendor and release date that was set before the first line of code was written: Ask for an exception. Which was granted, hit again and raised again. Fortunately, the third time it happened, an experienced vendor support manager recommended reviewing best practices to avoid the issue. And an experienced architect was on the receiving end of that suggestion, one who had previously dealt with a Salesforce solution that went down three months after launch from relying on similar exceptions.

This post is not about Contentstack architecture. It is about the challenge many teams face with balancing target dates and defensive design decisions, and a tool set to apply in order to keep from tipping too far in either direction.


Claude as a Platform Research Partner

Giving Claude access to a curated set of platform documentation and then working interactively to explore solutions is not a replacement for architectural experience. It is an accelerant for it. It is also not a way to do away with architects or the inclusion of design tasks at the feature or story level. It is how to fulfill the expectation that AI can provide ROI immediately when applied by experienced technologists.

These distinctions matter. It’s never about “ask Claude what to do” (because if you need to ask “what” you won’t know how to ask “how” when the time comes). It is “I understand how systems like this behave, I know which constraints are likely to compound, and I need to move through the analysis faster than I could alone.” Experienced architects and engineers bring the judgment: familiarity with how content models fail at scale, how schema resolvers typically handle recursion, how vendor-imposed limits usually reflect real constraints in the underlying system. Claude brings the recall, the scripting, the cross-referencing, and the tireless patience for the kind of recursive schema analysis that would take a senior engineer the better part of a day.

For those that follow my posts you know that I will often describe theoretical solutions backed by a combination of personal experience where they would have worked linked to examples from others who demonstrated that they work. In this case the experience came before the theory, working backwards from a result where I noticed the process while documenting the solution (because, hey, that is what architects do after they solve something).

The working example was with a Contentstack implementation. It took one focused 2-hour session to identify an obscure root cause, define a strategic solution, discover other areas to apply the same solution, and identify where the solution would cause more harm than good. A second 30-minute session was applied after the first round of refactoring to validate the impact and prioritize the remaining effort. Before Generative AI, this would have been several days of effort that would not have been attempted until the risk was realized in production delay.


The Project is the Architecture

Before a single question gets asked, the project has to be built. This is not setup overhead. This is the work.

A blank Claude chat window and a well-constructed project will give you very different results on the same question. The difference is not the AI. It is the knowledge boundary, the taxonomy, the instructions, and the accumulated session output. Strip those away and you have a general-purpose assistant guessing at context. Keep them and you have something that behaves like a senior researcher who has been on the project for months.

What goes in the project folder:

Downloaded documentation as markdown files, not links. Links go stale, require fetches, and introduce latency. Pull the platform docs that matter, save them as markdown, put them in the folder. For Contentstack: the Global Fields limitations page, the Content Modeling Best Practices guide, the Custom Fields documentation. Not every page in the docs. The ones relevant to the work. Knowing which ones matter is the first place architectural experience shows up.

Actual data from the platform. In this case, exported stack JSON. Claude can read it directly in the sandbox, run scripts against it, and cross-reference findings against the loaded documentation in the same session. That combination of curated docs and live data is what makes the diagnosis precise instead of speculative.

Session summaries. After each working session, have Claude produce a structured summary: the original problem, the conclusions, the evidence, the next steps. That file becomes the cold-start document for the next session. You don’t re-explain the context. You hand Claude the prior session’s output and continue. The knowledge compounds.

At some point (again, much of this requires human intuition gained through real-world experience), have Claude work with you to turn the summaries into a skill scoped to the specific platform, technology, or tool so that when they are in context these lessons learned will be applied automatically going forward.


The Taxonomy Is Not an Afterthought

Separate downloaded reference content from working session output. Nest folders by topic. /reference/, /sessions/, /data/ serve different purposes and should live in different places. This is not pedantry. It is how you make the project instructions work correctly, and how you find things six weeks later without rebuilding context from scratch.

If the platform has extensive documentation, don’t try to enumerate allowed URLs in the project instructions directly. Create a reference-urls.md, or per-topic files like contentstack-docs-urls.md, with an annotated, categorized list of approved sources. Claude works from the list. You maintain the list. It stays current and searchable.

The discipline compounds the same way the session summaries do. A well-organized project from session three makes session fifteen faster than session one.


The Project Instructions Are the Rules of Engagement

The instructions define how Claude behaves inside this knowledge space. Three things they need to do:

Challenge assumptions. If a question implies something not supported by the loaded documentation, say so. Don’t fill gaps with plausible-sounding answers. The most dangerous thing a research assistant can do is answer confidently on insufficient evidence. This instruction eliminates a whole category of hallucination risk before it starts.

Point out mistakes. If the framing of a problem is wrong, say so. This is the instruction most people skip and then complain about later. You want an assistant that pushes back, not one that validates your bad hypothesis and helps you build a case on sand.

Limit web searches to specific URLs. Unconstrained web search in a technical investigation introduces noise: outdated content, inconsistent sourcing, SEO-optimized answers that aren’t accurate. Lock it down. Specify which domains are permitted. For a Contentstack project, that’s contentstack.com/docs. Everything else requires explicit permission. If the approved URL list is long, store it in a markdown file in the project folder and point the instructions at it.


This Requires an Architect

Here is the part that does not get said enough.

You cannot point Claude at an unfamiliar platform, load a few docs, and expect it to diagnose architecture problems. You can try. What you’ll get is fluent, confident, and partially wrong.

There are many engineers capable of setting this up. The value of an architect doing the work is separation of concerns in roles. The architect’s role is to nail down processes and choices that allow engineers to focus on the best way to apply them.

In our Contentstack use case, the single session worked because the person directing it brought a deep understanding of adjacent technologies and the experience to know both what to ask and how to evaluate the responses. Specifically:

  • Recognizing that the error message pointed to a schema limit, not a code problem, because that’s how content platform resolvers typically surface constraint violations
  • Understanding that “recursive” in the documentation meant multiplicative compounding, not additive, based on how similar systems handle nested references
  • Knowing the fix had to leave the content model intact for authors, which ruled out several otherwise obvious approaches
  • Reading a Claude-generated Python script’s output and recognizing that the confident result provided the first time was due to looking in the wrong parts of the schema
  • Looking at a before/after instance table and determining whether the fix was actually complete or just moved the problem

None of that knowledge lives in the documentation itself. It transfers in from adjacent experience: content modeling, schema design, how platform resolvers work under the hood. Claude surfaces the platform-specific detail. The architect determines what it means.

The tool doesn’t replace experience. It supercharges it with speed and specific knowledge.


The Interaction Pattern

What the Contentstack session actually looked like, stripped of the platform specifics:

  1. State the problem. Provide the evidence: the error message, the exported schema, the documentation.
  2. Claude generates a hypothesis. Test it against the data.
  3. Diagnostic script written and run in the sandbox.
  4. Root cause confirmed. Fix designed. Impact predicted before any schema changes are made.
  5. Fix implemented. Follow-up session loads the new export and verifies the result.
  6. Summary file created. Next session’s candidates identified.

No magic. An architect with relevant adjacent experience, a fast and patient research partner, and a well-stocked project folder.


Prompts That Did Actual Work

These are worth examining because the techniques transfer to any platform.

“Describe in detail the cause of home_page_template having 24 instances, and instances of what?”

The second half of that question is the important part. Asking Claude to define what it is counting before giving the count forces precision on both sides. In technical sessions on an unfamiliar platform, jargon can mask shallow understanding without anyone noticing until the fix doesn’t work. The ability to ask that follow-up, to know that “instances” needed a definition before the number meant anything, comes from having debugged similar problems elsewhere. Use this pattern whenever an answer could be technically correct but operationally ambiguous.

“Create a summary file to feed to the next analysis session that includes the conclusions from this session combined with the original inputs. Format and sequence the file so that the next session can be as efficient as possible.”

Besides being familiar with adjacent technology, experience solving complex issues with Generative AI is why this is an approach for architects and engineers. Yes, Claude will now start compacting sessions on its own to improve efficiency, but having the sense that it is time to move to a new session is again an area where human experience beats relying entirely on the AI.

This prompt converts a working session into a durable asset. The phrase “format and sequence for efficiency” is carrying real weight: it tells Claude to think about how the file will be consumed, not just what it contains. The output becomes the cold-start document for the next session. Without it, every session re-derives context the previous one already established.

“Read the attached to get full context of the original issue, then review the contents of [folder] and determine if and how the issue has been improved.”

Sequencing does the work here. Claude gets the full prior-session summary before it touches the new data, so “improved” arrives with a precise definition attached. Without that order, it analyzes the new export without knowing what it’s comparing against. Prime with context before assigning the task, every time.

All three follow the same pattern. Context before task. Output format stated up front. It is not a methodology. It is just how you would brief a colleague who needs to be useful on short notice.


The Setup Is the Differentiator

Two teams, same platform, same error.

Team A has Claude. No curated project, no loaded docs, no taxonomy, no instructions. They get generic answers that feel helpful until they don’t hold up under the actual constraints of the platform.

Team B has a project built by someone with deep experience in adjacent technologies, content modeling, schema design, API behavior under constraint, who knows both what to ask and how to evaluate what comes back. Downloaded reference docs. Exported platform data. Session summaries that carry forward. Instructions that push back on bad assumptions.

Team B gets a root cause analysis, a fix, and a forward-looking roadmap. More importantly, they get it without accumulating the kind of structural debt that shows up six months later as an emergency.

A Note about Choosing Cowork

What I’m describing is not the typical use case for Claude’s project-based workspace. It is aimed at knowledge workers automating routine tasks: organizing files, generating reports, drafting communications. Productivity stuff. This is not that.

My choice of Cowork is based on my day-to-day work being mostly in documents and decks. This could also likely be done using Claude Code in an IDE for those that prefer that interface.

I became aware of how far outside the lines I was operating when someone asked what tool I was using, I explained it, and I watched the look on their face. You know the look.

I have been here before. I spent years using JMeter for continuous functional and regression API testing, which is not what JMeter is for. JMeter is a load and performance testing tool, and there are entire communities of people who will tell you this. They are correct and also missing the point, because once you understand how JMeter handles realistic randomized inputs and configuration-driven test selection, you end up with one codebase doing the work of four. I wrote about it. People told me I was doing it wrong. The tests kept passing, so.

It is common to analogize the similarities between physical tools and technical tools. “When all you have is a hammer, everything looks like a nail”, and “You can use a screwdriver as a chisel, but you really shouldn’t.” I’ve often used those myself. But the opposite analogies are also true. Most tools can be a weapon, and many tools can have multiple uses. While screwdrivers are still terrible chisels, some are great prybars, hole punches, and, yes, weapons. Same with software. Excel has spellcheck, but I’d never paste text into it before posting to a blog, but I have used formulas to parse text rather than writing a script to apply regex rules because it is faster and just as accurate. Use your tools to the extent of their value, and don’t underestimate their value or your ability to innovate.

If you found this interesting, please share.

© Scott S. Nelson

The Gold Rush Was Never Just About Gold

TL;DR: Most people who chased the Gold Rush didn’t know what they were getting into. They saw headlines about fortunes and stories about how easy it was. Many went because their livelihoods were already threatened. Sound familiar?


Let’s be honest about who the average Gold Rush prospector actually was.

Not a rugged adventurer with a prospecting education and a solid savings account. Not someone who had studied geology or mapped the terrain. The typical forty-niner was a farmer whose crops had failed, a tradesman who had lost his shop, or a clerk who had read a breathless newspaper account and decided a long-shot bet beat a certain slow decline.

The California Gold Rush of 1848 and the Klondike rush of 1896 were separated by nearly fifty years and thousands of miles, but they drew from the same well: economic desperation dressed up as opportunity.

The context matters here, because without it the behavior doesn’t make sense.

The years leading up to the California rush included a global recession following the Panic of 1837, crop failures across the Midwest, and a population of young men with limited options. When James Marshall found gold at Sutter’s Mill in January 1848, the news didn’t just spread quickly, it spread selectively. The people who acted first were the ones who needed it most. Same story in 1896, when word of the Klondike strike reached Seattle and San Francisco during a prolonged economic depression that had pushed national unemployment past 20 percent. The ships heading north were not full of people with a plan. They were full of people with a problem.

Not everyone was running from something. Some were adventurers who wanted something different, or already had a good life and wanted something better. And not everyone coming out of a bad situation went in blindly. What almost everyone had in common were expectations that diverged sharply from how things turned out.

The relevant point is not that these people were reckless. It’s that economic pressure meant the average participant arrived undercapitalized, underprepared, and motivated primarily by someone else’s story of overnight success. They were chasing a headline, not a thesis. The results reflected that, in aggregate, almost immediately.

That pattern matters because it is not a 19th-century phenomenon. It is what every hype cycle looks like from the inside.

Each rush also moved in distinct waves. The rules that determined who succeeded in the first wave had almost nothing in common with what it took to win in the second. Most people who got swept up never stopped to ask which wave they were actually in. That question turned out to matter more than almost anything else.


First Wave: Right Place, Right Time, Right Creek

The first wave of California gold hunters had a genuine advantage. Here is what that advantage actually was. Not superior skill. Not better research. Proximity to the news.

Many of the earliest California prospectors were already in the territory: soldiers, settlers, and tradespeople who heard about Marshall’s discovery within weeks and moved fast. The surface deposits in the Sierra Nevada foothills were accessible, concentrated, and required almost no expertise to extract. A pan, a creek, and a willingness to stand in cold water for twelve hours were the main requirements. In that environment, showing up early mattered more than showing up prepared.

The Klondike told a similar first-chapter story. The initial claims along Bonanza and Eldorado Creeks were staked by prospectors already in the Yukon when George Carmack’s group made their discovery in August 1896. They were not the product of a coordinated strategy. They were in the right place when the right thing happened.

First-mover advantage is real. The people who moved fast in that window got a return no amount of later preparation could have replicated. But the window was short, the geography was finite, and it closed before most people had even heard the news.


Second Wave: The Pan Is Not Going to Save You

By 1852, the dynamics of the California Gold Rush had fundamentally changed. The surface deposits were gone. The creek beds that had yielded fortunes with a simple sluice box were picked clean by the first wave. The second wave arrived to find a very different landscape than the one the newspaper stories had described.

The prospectors who succeeded in the Second Wave did so through entirely different means. Hydraulic mining operations used high-pressure water jets to blast entire hillsides and process material through sluices, yielding gold at scale but requiring capital investment and systematic planning. Geologically-informed prospectors who understood quartz reef formations studied where gold veins actually formed and discovered productive sites where random panning had repeatedly failed. Syndicates pooled resources to fund deep shaft mines that reached deposits unreachable by individual surface workers.

Preparation was no longer an advantage. It was the entry requirement.

The Klondike replicated this pattern almost exactly. By the time the mass wave arrived in 1898 after a brutal trek over the Chilkoot Pass, which the Canadian government required each prospector to complete while carrying a year’s worth of supplies, the accessible claims were long staked. The prospectors who completed that crossing and still found nothing with a pan were not unlucky. They were late, and they were underprepared for the wave they had actually entered.

This is also where technology shows up on both sides of the ledger. The Industrial Revolution had already been displacing Eastern tradespeople and artisans for a generation, which goes a long way toward explaining why those gold rushes had the human fuel they did. Factory looms had replaced hand weavers. Steam-powered equipment had displaced skilled craftsmen. The Gold Rush was, in no small part, a downstream consequence of technological disruption seeking an economic escape valve. And then, within the rushes themselves, industrial technology, hydraulic systems, and organized mining operations began displacing the individual prospector. The image of the lone miner with a pan was already obsolete while people were still forming it.


Gold Wasn’t the Only Thing in Them Thar Hills

Some prospectors did strike it rich. The early arrivals at Coloma, the men who staked Bonanza and Eldorado before the word spread, the syndicates that scaled hydraulic operations with enough capital to actually move mountains. These were real winners. Gold was there. People found it. Fortunes were made.

But a parallel economy was running alongside the prospectors, quieter in the moment and, in the long run, more durable.

Sam Brannan did not own a gold claim. He owned a hardware store, and before he told anyone about the gold discovery, he bought up every pick, pan, and shovel in Northern California he could find. Then he walked through San Francisco holding a vial of gold dust, shouting about gold from the American River. He became California’s first millionaire. He did not find a single ounce himself.

Levi Strauss did not mine. He figured out that miners destroyed pants at an extraordinary rate and needed something that could survive the work. He made pants. Generational brand.

Wells Fargo did not mine. They moved money and packages for people who did. They are still here.

The common thread is not that these people were smarter than the prospectors. It is that they studied what the prospectors would certainly need rather than betting on where the gold might be. The uncertain bet was “this particular creek has gold.” The certain bet was “whoever finds the gold will need pants, tools, and a way to move money.” One of those bets required luck. The other required observation.

This path was available in the First Wave and Second Wave equally. It did not depend on timing. It scaled with the rush rather than competing within it. And it generated more durable wealth than almost anyone who was actually in the river.


The Roaring 20’s

Not the flapper and speakeasy era. This is the era of data centers and solopreneurs; dueling model metrics and learning evaluations; digital assistants evolving into personal agents and agentic automation that builds new automation agents. Billion-dollar funding rounds for companies that did not exist three years ago. Job titles that nobody had in 2021, now listed as critical hires. Entire industries trying to figure out if they are the disrupted or the disruptors, and running low on time to decide.

Models released on a Monday that are obsolete by Friday. Consultants who barely knew what a prompt was in 2022, now billing as AI transformation architects. Boardrooms demanding AI strategies before anyone has agreed on what problem they are solving. Vendors with “AI-powered” on the label whether the product has meaningfully changed or not.

The energy is real. The stakes are real. And unlike some previous cycles, so is the underlying technology.

The dot-com boom was real too. It produced Amazon, Google, and the infrastructure of the modern internet alongside thousands of spectacular failures. The AI shift is already demonstrating measurable productivity gains across industries, and the underlying technology is improving faster than most predictions have accounted for. Dismissing it as pure hype is the wrong read, and the people making that call loudest will look exactly like the analysts who declared the internet a fad in 1997.

The problem is not that people are excited about a real thing. The problem is that when real opportunity appears, it activates the same psychological patterns that sent underprepared people over a mountain pass in 1898. The gold rush mentality does not require the gold to be absent. It just requires the promise of gold to be louder than the instructions.

The opportunity is real. The question is whether you are building toward it, or just rushing toward it.


The AI First Wave Already Happened

From roughly 2022 through 2023, companies that moved aggressively into AI-native product development, workflow automation, or customer-facing AI features got real first-mover advantage: lower competition, compounding productivity gains, and a learning curve head start that is genuinely hard to close. Some of this was vision. Some was access. Some was timing. The window was real, and the returns were real.

Most businesses did not catch it. Large organizations move slowly by design, and procurement cycles are not calibrated for technology windows that last 18 months. That is not a criticism. It is a description of how large organizations actually work. (I have been in those rooms. Guilty.)

What it means is that most businesses are now in the Second Wave, whether they have acknowledged that or not.


Second Wave Requires a Different Playbook

The companies treating AI adoption as a First Wave problem in 2025 and 2026 are showing up in California in 1852 with a pan. The accessible value has been captured. What remains requires the methodical approach.

Imagine you could see exactly where your organization loses an hour a day to rework, manual handoffs, and decisions made on bad data. That is what a process audit produces. It is not glamorous. It does not show up in the conference keynote. But it is the difference between knowing where the gold is and hoping the next creek looks promising.

Start there, not with tool selection. Map where time, money, and errors concentrate in your current operations. Identify which problems AI can address with reasonable reliability, and which ones it will make worse by hallucinating confidently inside a business-critical workflow. Run contained pilots with defined success criteria before scaling anything. Build internal AI literacy and governance at the same time you build capability, not after something goes wrong publicly.

Then, only after you understand what AI can reliably do in your specific context, start redesigning processes to take advantage of it rather than bolting it onto what already exists. The order matters. Inverting it is how you end up running hydraulic equipment you do not know how to operate into a hillside you have not assessed.

[True story placeholder: add an example of a project or initiative where the stated plan and the available path did not match, and what it cost to discover that. A rollout, a migration, or a vendor implementation where the “easy button” turned out not to exist.]

Preparation is not glamorous. But it is the entry requirement now. That distinction matters.


The Niche Play Nobody Is Talking About

Here is the thing about Sam Brannan, Levi Strauss, and Wells Fargo: none of them would have been described as gold rush companies.

Brannan was a merchant. Strauss was a dry goods trader. Wells Fargo was an express and banking operation. The Gold Rush was the economic context that made their businesses thrive and scale, but their identity was not “gold rush business.” Their success was driven by the rush. They were not of it.

While the gold rush era was a boon to the merchant class, imagine if technology had been more advanced then. Gold is one of the most effective electrical conductors on earth. It does not corrode. It does not tarnish. It carries signal reliably in conditions that defeat most other materials. Today it is in every smartphone, every circuit board, every aerospace connector, and every implantable medical device. The miners panning those California creek beds were sitting on the raw material for the digital age and had no way to know it. They were chasing the obvious use. The compounding value was in applications that had not been invented yet.

AI is playing the same role for business processes right now, visible to anyone paying attention. It is the super conductor of this moment, not for electrons but for decisions, workflows, and the intelligence buried inside operations that were built for a different era. And just as the real gold economy grew around refining, transporting, and applying the metal rather than simply extracting it, the real AI economy is growing around discovering, implementing, and refining how AI connects to the work that organizations actually do.

Every organization trying to adopt AI will need clean, well-governed data. They will need people who can actually work alongside these tools rather than just technically access them. They will need integration between new AI capabilities and legacy systems that were built for a different era. They will need expertise in figuring out which processes actually benefit from AI involvement and which ones just look like they should.

None of that requires building a foundation model. None of it requires a large AI research budget. All of it requires observation, the same skill that made Sam Brannan wealthy while everyone else was panning creeks.

The businesses that build toward serving those needs may never be described as AI companies. They will be managed service providers, training firms, systems integrators, compliance consultants, data governance specialists. The AI boom will be the context that defines their era, even if it is not the label on their door.

That is not the consolation prize. That is the long game, and it has the most reliable odds.


Your Actual To-Do List

Three questions worth answering honestly before the next AI initiative.

Which wave are you actually in? If you are evaluating AI tools for general business adoption in 2025 or 2026, you are in the Second Wave. The First Wave is not waiting. Adjust your expectations and your approach accordingly.

Are you prospecting or supplying? If you are using AI to improve your own operations, you are prospecting. If you are building toward serving the certain needs AI adoption creates in your industry, you are supplying. Both are valid strategies with very different playbooks.

Are you auditing before you automate? The methodical prospectors of the Second Wave studied the geology before they dug. The equivalent is understanding your current processes, your data quality, your organizational readiness, and your actual use cases before purchasing a platform and announcing an AI strategy.

The Gold Rush did not reward the desperate or the hasty at scale. It rewarded the timely, the prepared, and the observant, in that order, depending on which wave you caught.

The AI boom is running the same playbook. The question is not whether the opportunity is real. It is whether you are building toward it the right way.


WTW Influence Note

Principles applied from Words_That_Work_Reference.md: – Brevity (Rule 2): TL;DR tightened from v2. The long opening paragraph of “The Setup” was split into three shorter paragraphs for better pacing. – Visualization (Rule 8): “Imagine you could see exactly where your organization loses an hour a day to rework, manual handoffs, and decisions made on bad data” in the Second Wave playbook section. Also, the gold-as-conductor passage in The Niche Play gives a concrete physical image before the abstract AI transition. – Aspiration (Rule 7): Forward-looking close added to “The Roaring 20’s.” “That is not the consolation prize” reframes the long game in The Niche Play. Final closing line oriented toward opportunity rather than risk. – Novelty (Rule 5): The gold-as-conductor bridge in The Niche Play section gives readers a genuine “I never thought of it that way” moment, connecting a familiar historical asset to its modern technological applications before pivoting to AI. – Context before claim (Rule 10): The Niche Play now builds through the gold-conductor frame before making the AI claim, rather than asserting the parallel directly.

Not applied / deferred to Scott’s voice: – Personalize and Humanize: WTW recommends specific named individuals in the reader’s demographic. Scott’s voice uses historical examples instead, and the Brannan/Strauss/Wells Fargo structure does that work more effectively for his audience. – Positive beats negative (full inversion): WTW would push harder toward leading with the upside throughout. Scott’s anti-hype voice depends on naming the problem first. Aspiration applied only at section closes.

If you found this interesting, please share.

© Scott S. Nelson

Take the Tax out of Taxonomy

TL;DR: Your GenAI output is failing because your local workspace is a disaster. If your desktop is a dumping ground, your enterprise data lake is guaranteed to be a swamp. Stop blaming the model, establish a strict folder taxonomy, and kill the bad data habits before they scale.

For my regular reader, you know I can’t resist a pun, and the initial research note for this post was “Timely topic title: Take the Tax out of Taxonomy”. You also know I digress, so I thought I would get it out of the way at the start. Done. Moving on to the next level…

You are paying a massive hallucination tax. You bought a premium AI subscription or deployed a desktop agent. You pointed it at a project directory full of deprecated drafts, unstructured notes, “versioned” files, maybe even some sample code. Now the AI is confidently generating output based on requirements from three years ago, and maybe Wednesday’s lunch order.

The AI users assume a better foundation model or highly complex prompt engineering will fix output inaccuracy. They will not. According to the research paper A Comprehensive Taxonomy of Hallucinations in Large Language Models, hallucinations are not merely a bug, but a theoretically inevitable feature of computable LLMs, irrespective of architecture or training.

You cannot patch out hallucinations with a clever system prompt. You have to restrict their oxygen.

Generative AI operates entirely on the context it is fed. When you open a workspace (or upload a zip file, or point it at SharePoint), the model uses the folder structure to understand relationships. It assumes every file in the provided directory is equally valid, current, and relevant.

To get faster, accurate output, you must adopt a standardized, hierarchical folder taxonomy. This is not a housekeeping chore. It is a strict data contract for your AI. The academic consensus supports this structural approach. As outlined in A Systematic Framework for Enterprise Knowledge Retrieval, transforming a static blob of data into a navigable, context-rich knowledge architecture significantly improves model accuracy and reduces retrieval latency.

The Prep Station Metaphor

Think of an LLM as a highly skilled line cook with zero short-term memory. If you ask the cook to make an omelet, but point them to a kitchen counter where the fresh eggs are mixed in a pile with old receipts, bleach, and rotten produce, the resulting meal will be toxic.

You have to prep the station before you ask for the work.

This requires changing how you manage your local environments. You must segment your files and organize your folders explicitly by client, project, or (and sometimes “and”) specific activity. When the AI opens that specific folder, the taxonomy forces it to focus strictly on the given task.

The Micro-Macro Data Contagion

Local file structures often mirror enterprise data architecture. If your team’s shared drive is a chaotic dumping ground of nested, unnamed folders, your enterprise data lake is likely more of a content swamp.

Organizations often fund massive, top-down enterprise data transformation projects. They deploy tools to wrangle petabytes of unstructured data. Consultants are brought in to describe how it should be done, walk you through clean up, and leave you with a perfectly indexed wiki on how to maintain it.

The reason other organizations don’t do this kind of clean up? Aside from the few that don’t need it, the rest have someone recruited from an organization that did need it, then did it…then did it again a few years later. At least some had the excuse of acquisitions as a cause. The rest just forgot to make being organized part of the organization’s culture.

The report What Is Data Taxonomy? Use Cases & Best Practices points out that taxonomy programs do not fail because the classification structure was wrong. They fail because nobody owned it after launch, or the controlled vocabulary was written for data engineers rather than the business users who needed to adopt it. A taxonomy that nobody actively owns becomes outdated within twelve months.

If you build a pristine enterprise knowledge graph but your teams still save raw client notes to a local desktop folder named “Misc”, your clean data architecture will erode. Bad habits always defeat good infrastructure.

Start locally. Expand globally. Treat your team’s shared folder as the training ground for enterprise AI.

Here is the implementation baseline for engineering a reliable folder taxonomy.

  1. Force Local Discipline: The guide Document Taxonomy Simplified notes a critical reality: AI can read full text, but without consistent indexing and classification, it has a harder time understanding which documents are current or relevant for a specific question. Humans must define the taxonomy. Organizations that rely solely on AI risk amplifying bad data.
  2. Build a Strict Domain Hierarchy: Segment folders strictly by project and lifecycle status. Your AI should never have read access to a “Drafts” directory when you are asking it to write production documentation.
  3. Establish the Data Contract: Metadata like document type, owner, client, date, and status tells AI not just what a document says, but how it should be used. This context improves AI ranking and reduces irrelevant hits that happen to share keywords.
  4. Separate Human and AI-Native Formats: Segment your directories into files meant for humans and files meant for the AI. Lean towards using markdown files, text files, and CSV files for AI consumption. Keep heavy, formatting-rich files in a separate reference folder that the AI does not scan unless explicitly commanded.
  5. Isolate Contextual Boundaries: Open-ended prompts can generate answers that blend multiple disciplines or outdated content. When your library is indexed by department, project, and lifecycle stage, AI can narrow its focus and answer questions within the right slice of your information.

You’ll note that there is a lack of solid reference examples of good taxonomies. This is, again, related to data cleanliness being driven by culture. The same taxonomy may or may not work for another organization. But a solid taxonomy based on how the organization thinks and processes can easily be maintained through training, communication, and the occasional reminder (preferably automated).

The ROI of a strict folder taxonomy is immediate. Output precision goes up. Token waste goes down.

If your AI is only as reliable as the context it receives, your unstructured file storage is an active threat to your workflow. Build the hierarchy locally. Clean up the directories.

Credit to Dylan Davis‘s video 5 Changes That Make ChatGPT & Claude 10x Better for sparking this research.

If you found this interesting, please share.

© Scott S. Nelson

Support Agents Can Increase Deflection…and Defection

TL;DR: Large Language Models (LLMs) are excellent at retrieval but fail at self-awareness. Without a structured hand-off architecture, they become expensive loop-engines that drive users to competitors.

The Evolution of Deflection

Support deflection is a three-act play where the customer usually loses.

It started with static FAQs. Digital filing cabinets where users did the manual labor of hunting for answers.

Then came semantic support search. This used natural language processing to map a query to a relevant article.

Eventually, we got chat bots that acted as conversational proxies for that search. Each step was pitched as a win-win: faster answers for the customer and massive overhead savings for the business.

The problem is the “savings” side of the equation. Businesses have prioritized it so aggressively that automation is now being forced beyond its logical limits. We have moved from helping the user find an answer to preventing the user from reaching a human at all costs. (This is usually where the ROI calculation ignores churn.)

A recent example of this friction is an interaction with the Substack support agent.

To be clear, I like the Substack platform.

This interaction is only representative of many similar bad support experiences, and it just happened to be the one cached in my browser history when I was writing this.

The Context Leak

In the first interaction, the user uploads a PDF detailing a failure. The bot acknowledges the delivery failure but immediately drops the core issue (the fact that this was a Substack-recommended configuration that failed).

BOT: The screenshot you uploaded shows that an email you sent to an author at their @substack.com address was not delivered because it was flagged by Substack's automated security protections. To connect with Substack writers, please use the comments, chat, or direct message features by creating a Substack profile or using the Substack app. Let me know if you need help with any of these options or have other questions about contacting authors.

This is a classic context window failure. The bot is reading the “now” but has forgotten the “why.” (Pro tip: Support bots require a persistent Problem Summary in the system prompt to prevent the initial state from being pushed out by chat history.)

The “Loop of Death”

The next phase is the retrieval loop. The bot recommends settings the user has already confirmed are active.

CUSTOMER: I set up Stripe and want to change the email address to my substack address as recommended but email sent to that address are being refused. Content attached and below (in case you block attachments). BOT:Your Substack alias (scottsnelson1@substack.com) is recommended for use as the Stripe support email, but direct emails sent to this address may be blocked by Substack's automated security protections. To ensure you can receive replies forwarded from your Substack alias, set

The bot suggests setting “Who can reply” to “Everyone.” The user confirms this was done before the test. The bot’s response is a generic “thanks for confirming,” followed by an admission that its own security protections might still block the mail.

CUSTOMER: It is already set to Everyone and was before the test. BOT: Thank you for confirming that your

This is the architectural equivalent of a car that won’t start because it’s out of gas, while the manual keeps explaining how to turn the key.


Logic vs. Retrieval

LLMs typically prioritize the most relevant documentation. The “gotcha” is that documentation often contains circular logic or edge cases the model isn’t weighted to resolve.

When the user explicitly asks how to use the recommended address if Substack is going to block it, the bot resets to a template. It is no longer “solving.” It is just “responding.”

CUSTOMER: How can I use the substack address for Stripe if substack is going to block the emails sent to that address? BOT: Your Substack alias (scottsnelson1@substack.com) is intended to forward replies from your newsletter readers, not to serve as a general-purpose support inbox for all external emails. While we recommend using your Substack alias as the Stripe support email for privacy, Substack’s automated security protections may block emails sent directly from services like Stripe or other automated systems, meaning you might not receive those messages. If you need to ensure you receive all communications from Stripe, it’s best to use your main account email or another address you control as your Stripe support email. Would you like guidance on how to update your support email in Stripe, or do you have other questions about managing your publication’s contact details?


Tactical Breakdown

The agent failed to pivot. It was out of its depth by the second prompt, but it lacked the internal logic to admit it. Instead of acknowledging the mismatch between the “recommended setup” and the “blocking error,” it reverted to a script.

The ultimate UX insult happened at the end. After a prolonged exchange full of specific data and a technical PDF, the bot asked the customer to “describe the issue” to conclude the session. It essentially hit ‘Refresh’ on the customer’s frustration.

CUSTOMER: I would prefer you fix your invalid instructions so other people don't have to be aggravated like this in the future. BOT: Thank you for sharing your feedback about the instructions for using your Substack alias as the Stripe support email. To help us address this issue, could you describe the specific bug or confusion you encountered with the Substack product? For example, let us know exactly what happened when you tried to use your Substack alias for Stripe support, and what you expected to happen instead. Your detailed description will help us better understand the problem and improve the experience for other users.

Even when the user mentioned leaving for a competitor, the bot stayed in its polite, useless loop. It missed every red flag in the conversation.


Architecture: The Exit Ramp

This is a design failure, not a model failure. You cannot expect an LLM to manage its own boundaries. You have to build them. (Think of it as adding a “break glass” protocol to your chat logic.)

The Resilient Support Stack

  1. Sentiment-Based Escalation: Use a secondary, lightweight model to monitor the chat for frustration signals. If terms like “frustrated,” “waste of time,” or competitor names appear, kill the bot and ping a human.

  2. Repetition Kill-Switch: Track the number of times a specific documentation link or advice block is served. If the bot repeats itself twice, it has failed. Trigger an automatic hand-off.

  3. Persistent State Management: Store the Original Problem Description outside the chat context window. Force the bot to reference this core state in every turn so it doesn’t forget why the user is there.

  4. The “Depth” Check: If a user’s prompt complexity (like an uploaded log file) exceeds a certain threshold, the bot should immediately preface its response with: “I’m looking at your logs. If I can’t solve this in two steps, I’ll get a human involved.”

Architecture is the differentiator between a tool and a liability in AI adoption. The model is just a component. The success of the implementation depends on the plumbing that surrounds it.

If your current AI adoption feels more like a liability than a multiplier, or is stalled by FUD, Contact Me to discuss options.

If you found this interesting, please share.

© Scott S. Nelson

Developers are now Agent Managers: Enter the New Matrix

TL;DR: The use of AI in development has shifted from a coding assistant to a team of agents doing the heavy lifting. This requires developers to skill up in management and forces a fundamental shift in how software roles collaborate.

The Managerial Migration

For those not watching closely (which is most people, and perhaps only a few reading this), the world of software delivery is in the midst of a tectonic shift. The use of AI is evolving from a simple coding assistant to a team of agents, or experts, performing the bulk of the work. This moves the developer role from a traditional software engineer to an agent manager.
The change in role definition and skills is another aspect of the paradigm shift the Age of AI is bringing. If this sounds new to you, it is because you may have missed the transition that the proliferation of the “WWW” subdomain brought to IT in the 90s. We are all going to come out better, but it is going to be a long haul as we re-learn lessons from last time and write new ones for this era.

The Expertise Gap in Management

One common misconception is that this automation means developers can be replaced by Product, Project, or Program managers. This is mostly the “Wall Street Rumor Mill,” which is only just now being revised from “replacing people with AI” to “shifting investment from employees to AI vendors”. At least that is more honest.
The “average” manager often lacks the technical depth to write a precise specification or review the complex output of an agentic workflow. Managing a digital workforce requires the same technical understanding and focus as writing source code. If you cannot perform a rigorous technical review of what AI agents produce, you should not put it into production (unless you suffer from a terminal case of technical hubris).

The Developer Drift

While many managers lack the depth to take over, developers are not guaranteed success in this new role without learning to view technical problems from different angles. Many developers tend to drift from the business context without a reminder, which is why lifecycle ceremonies exist to gather feedback from users and owners. For some, this is a forest-versus-trees effect, while for others, it is the temptation of a “cool” approach over a practical solution.
The speed of AI can take a minor gap in understanding and expand it into a costly chasm. When an agent can produce a week’s worth of logic in seconds, the cost of moving in the wrong direction scales exponentially. The team must find a way to collaborate where agents are a factor beyond just a tool choice.

Grit over Gift

This proficiency is not a magic gift: it is a byproduct of learning, practice, and pushing boundaries. There is a persistent myth that “prompt engineering” is an inherent talent or a shortcut for the lazy. It is actually the opposite. Real proficiency comes from hundreds of hours spent in invisible iteration. You have to break the agents to understand how to fix the workflow. These skills are then applied to context engineering, where the developer becomes the manager and the back-and-forth transitions to a human-in-the-loop system.
Deep experience can sometimes trigger intellectual rigor mortis, where you stop looking for a better way because you already know the “right” way. To succeed now, you need the grit to unlearn habits that are no longer efficient. High ROI in the age of AI belongs to the person who pushes boundaries through practice, not the one waiting for the “perfect” model to arrive.

The Practical Pivot

As we navigate this “.ai moment,” leadership, managers, and developers need a new way to interact. It is no longer about a ticket hand-off: it is about real-time orchestration.
  • Developers: Start treating your AI tools as interns, not calculators. An intern needs guidance, a clear spec, and a rigorous peer review. If they produce garbage, it is a reflection of your management. Mentor your agents by providing better context and documentation.
  • Managers: Help leadership understand that the “silver bullet” still requires expert aim. AI is a force multiplier, but it requires a human who knows where to point the barrel. Use these tools to bridge the communication gap, not to eliminate the experts.
  • Everyone: Support each other in cross-training. Incorporate big-picture product thinking with low-level solutioning. Document the new workflows immediately, as your team now includes transient sessions that lack long-term memory.
Incorporating this new layer requires new connections, shifts in responsibility, and overlaps that act as double-checks from different perspectives rather than simple redundancies.
If you found this interesting, please share.

© Scott S. Nelson