AI Assisted Archives

Lessons Learned, Lessons Lost: Why AI Retrospectives Rule

August 1, 2026, by Scott S Nelson

TL;DR: The Sprint Retrospective would be the most valuable ceremony in Scrum…if it weren’t also the one with the worst follow-through. AI can run the same ceremony and actually keep the results, because the lesson gets loaded before the next session instead of filed and forgotten. It’s a simple habit to learn, and there are plenty of free examples of how to automate it. It’s a small investment of time and thought that pays real dividends in productivity and confidence.

The Ceremony That Everyone Likes and Nobody Finishes

The 2020 Scrum Guide puts the Sprint Retrospective at the end of the Sprint, timeboxed to three hours for a one-month Sprint (not that many places do one-month Sprints anymore, or allot three hours for improving things). The process is simple: provide a safe space where people can speak their mind about what went well, what didn’t, and how they would change things to be better in the next Sprint. The goal is to capture ways to increase quality and effectiveness while reducing friction and wasted time.

In theory it’s the smartest ceremony of any SDLC. A room full of people who just lived through the same thing, comparing notes while it’s fresh, with permission to say what actually happened.

I was introduced to this ceremony before Agile approaches were common in the enterprise. We ran it after a major release, and I found the process very enlightening. Someone wrote up a summary of the findings and posted it on the company intranet.

This was a long time ago. Years later I remembered that document and got curious about whether anyone had ever gone back to it.

I was the only one who ever looked at it again.

I can’t recall a single piece of wisdom it captured. Not one. What I do remember is the aura of “this makes so much sense” when we discussed the stickies on the wall. And that the same mistakes were repeated across later projects, many with some of the same people in the room who had sat through the meeting where we agreed not to make them again. Sigh.

What Happens to the Findings

Follow the output and the pattern is easy to see.

Some items become real assignments. Someone owns it, someone completes it, and the team is measurably better off. That’s a real win and it’s worth saying so.

The problem is, it is too often a one-off win. The task closes. It’s checked off and moved to Done. Then the same thing comes up three Sprints later. Or the next Sprint.

Because the good stuff that gets surfaced in a retro isn’t something that can be fixed with a story. It’s a lesson. “We committed before we understood the integration.” “We let the staging environment drift.” “We assumed the client would review in three days and they took three weeks.” Those aren’t tickets. They’re rules for how to behave next time, and there is nowhere to put them where people will remember them (yes, there’s an exception, for those who are already ahead of the game with a living PROJECT.md, aka CLAUDE.md, AGENTS.md, or whatever your favorite harness uses).

So they go into a document. Which goes onto an intranet page. Which nobody opens during the next project, because during the next project everyone is busy doing the next project.

The retrospective produces knowledge with no delivery mechanism. Six weeks later a new team, or the same team with two new people, walks into the identical wall and holds an identical meeting about it. We call that continuous improvement. It’s closer to continuous rediscovery.

That failure, organizations not learning from their own experience even after making the capture of lessons a required part of the process, is a big part of why the Best Practices section on this site exists at all.

Run the Same Ceremony, by Hand

Now try it with AI at the end of a working session. At its most basic, this is two prompts.

Review this session. What worked, what didn't, what should change next time?
Give it to me as Start, Stop, Continue.

Start doing this, stop doing that, keep doing this other thing. Read what comes back, edit anything that’s wrong or too vague to act on, then:

Save that to memory and apply it going forward.

That’s the entire ceremony. Under two minutes, no tooling, no setup. (YMMV on the prompt, depending on your type and style of work…BWTM)

Here’s the part that changes everything. It doesn’t file the lesson. It applies it, every time, forever. Next session, and every session after that, those rules are loaded before the first word of work gets done. There is no folder, no onboarding deck, no hoping somebody remembers. The correction is simply in effect.

And it updates. Session ten refines what session three concluded. A rule that turned out to be wrong gets replaced instead of quietly ignored.
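For anyone who wants to see the moving parts, the save, load, and replace loop can be sketched in a few lines of Python. This is only an illustration: the `lessons.json` layout, the function names, and the idea of keying each lesson so a later session can overwrite it are all assumptions of the sketch, not how any particular assistant stores its memory.

```python
import json
from pathlib import Path

CATEGORIES = ("start", "stop", "continue")

def load_lessons(path: Path) -> dict:
    """Return saved lessons, or an empty structure on the first run."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {name: {} for name in CATEGORIES}

def save_lesson(path: Path, category: str, key: str, rule: str) -> None:
    """Add a lesson, or replace the earlier one stored under the same key."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    lessons = load_lessons(path)
    # Reusing a key overwrites the old wording, so a later session
    # refines an earlier conclusion instead of piling up beside it.
    lessons[category][key] = rule
    path.write_text(json.dumps(lessons, indent=2), encoding="utf-8")

def session_preamble(path: Path) -> str:
    """Render every saved lesson as text to place ahead of the first prompt."""
    lessons = load_lessons(path)
    lines = []
    for category in CATEGORIES:
        for rule in lessons[category].values():
            lines.append(f"{category.upper()}: {rule}")
    return "\n".join(lines)
```

The point of the sketch is the last function: the lessons are rendered and loaded before any work starts, which is the step the intranet page never had.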

Same ceremony. Same format. The difference is entirely in what happens after the meeting ends.

The productivity gain is obvious enough. The part that sneaks up on you is the confidence. After a few weeks you stop bracing for the mistake you already corrected once, because it doesn’t come back.

The Gap Was Never Honesty

The lesson here isn’t that AI is disciplined and people are sloppy. People are perfectly capable of discipline. What people don’t have is a place to put a lesson where it will find them again at the moment they need it.

That’s the gap the retrospective has always had. Not insight. Distribution.

AI closes it almost by accident, because persistent memory is just how the thing works. The lesson arrives in the same channel as the work, automatically. You don’t have to remember to go check.

Which reframes a lot of process failures. Most of the time the problem isn’t that we didn’t learn. It’s that the learning had no route back into the work.

Almost Nobody Does This

Here’s the honest part. Very few people take the time to run a retrospective on their AI sessions, and fewer still do it consistently.

It isn’t hard and it isn’t slow, which is exactly what makes it easy to skip. Two minutes is nothing. Two minutes you never spend is also nothing, and it’s what most people land on by about the third busy week. The ceremony that depends on remembering to hold it is the ceremony that quietly stops happening, which is the same failure mode as the intranet page, just faster.

That’s the argument for automating the capture rather than relying on the habit.

You Don’t Have to Invent This

Run it by hand for a few weeks first, though. You want a feel for what a useful captured lesson looks like before you hand the job to something that writes them without asking, because you’ll be reviewing that output later and you need to be able to spot a bad one. Automated capture drifts. It records the frustrated aside instead of the actual rule, or it generalizes a one-off into a standing instruction. Doing it manually first is how you learn to recognize that when you see it.

And you don’t have to tune that prompt from scratch, or build the automation yourself either. There’s an entire cottage industry of shared instruction sets, skills, and rule files built to automate this loop, and most of them are free and MIT-licensed. Read a few and you’ll find better phrasing than mine sitting right there in someone’s repo.

claude-reflect is the most widely adopted of the ones I found. Hooks watch your prompts for correction patterns like “no, use X” or “actually” and queue them automatically. You then run a command to review the queue and approve what gets written into your persistent instructions. Capture is automatic, application requires your sign-off. It also mines your session history for repeated requests and offers to turn them into reusable commands.
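To make the capture-then-approve split concrete, here is a rough Python sketch of the general idea. It is not claude-reflect's code: the two regular expressions, the `ReviewQueue` class, and its method names are hypothetical, chosen only to mirror the "no, use X" and "actually" examples above.

```python
import re

# Hypothetical patterns for illustration; a real tool would tune and extend these.
CORRECTION_PATTERNS = [
    re.compile(r"^\s*no,\s+use\b", re.IGNORECASE),
    re.compile(r"^\s*actually\b", re.IGNORECASE),
]

def looks_like_correction(prompt: str) -> bool:
    """True when the prompt opens with one of the correction phrases."""
    return any(pattern.search(prompt) for pattern in CORRECTION_PATTERNS)

class ReviewQueue:
    """Capture is automatic; nothing is kept until a reviewer approves it."""

    def __init__(self):
        self.pending = []
        self.approved = []

    def capture(self, prompt: str) -> bool:
        """Queue the prompt if it looks like a correction."""
        if looks_like_correction(prompt):
            self.pending.append(prompt.strip())
            return True
        return False

    def review(self, approve) -> None:
        """Keep the items the reviewer approves and drop the rest."""
        for item in self.pending:
            if approve(item):
                self.approved.append(item)
        self.pending = []
```

Nothing reaches `approved` without passing through `review`, which is the sign-off step described above.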

bokan’s self-improvement skill takes a different angle: it fans out parallel agents across your past sessions, ranks friction patterns by how often they recur, and attaches the raw quote where things went sideways. Frequency plus evidence, which makes the finding arguable rather than just asserted.
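The "frequency plus evidence" ranking is easy to picture in code. The sketch below is my own simplification in Python, not the skill's implementation: it assumes someone (or some agent) has already labeled each friction point, it skips the parallel fan-out entirely, and the function name and record shape are made up for the example.

```python
from collections import defaultdict

def rank_friction(observations):
    """Rank friction patterns by how often they recur.

    observations: iterable of (pattern_label, raw_quote) pairs.
    Returns one record per pattern, most frequent first, each carrying
    the first raw quote seen as its evidence.
    """
    quotes = defaultdict(list)
    for label, quote in observations:
        quotes[label].append(quote)
    # Sort by descending count, then by label so ties come out in a stable order.
    ranked = sorted(quotes.items(), key=lambda item: (-len(item[1]), item[0]))
    return [
        {"pattern": label, "count": len(found), "evidence": found[0]}
        for label, found in ranked
    ]
```

Keeping the quote next to the count is what lets you argue with a finding rather than just accept it.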

claude-improve is worth reading even if you never install it, because its README openly credits the six community approaches it was assembled from. One of those, a one-prompt reflection pattern, contributes the single most practical idea in the whole space: write the captured lesson as an enforceable rule. Lead with why, use NEVER and ALWAYS, include a concrete example. Most captured lessons fail because they’re written as vague observations that nothing can act on.
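One way to hold yourself to that format is to refuse to save anything that doesn't fit it. Here is a minimal Python sketch, assuming a three-part rule of why, directive, and example; the `Rule` class and its checks are hypothetical and not taken from claude-improve or any of the approaches it credits.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    why: str
    directive: str
    example: str

    def problems(self) -> list:
        """List what keeps this rule from being enforceable."""
        issues = []
        if not self.why.strip():
            issues.append("missing why")
        if not self.directive.lstrip().startswith(("NEVER", "ALWAYS")):
            issues.append("directive must start with NEVER or ALWAYS")
        if not self.example.strip():
            issues.append("missing example")
        return issues

    def render(self) -> str:
        """Format the rule with the why first, or raise if it is too vague."""
        issues = self.problems()
        if issues:
            raise ValueError("; ".join(issues))
        return f"Why: {self.why}\n{self.directive}\nExample: {self.example}"
```

A lesson like "be careful with staging" fails all three checks, which is exactly the kind of vague observation nothing can act on.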

Cursor users have the same pattern in a different wrapper, usually a self_improve.mdc rule paired with a rules file, as described in Stop Babysitting your AI.

Pick one, use it as-is, or read three of them and build the version that fits how you actually work. Standing on someone else’s structure beats staring at a blank file.

What the Evidence Says

One thing to calibrate before you go shopping, because I’d rather say it than have you find out later.

None of these tools has been independently reviewed or benchmarked. What they have is stars, forks, and enthusiastic author write-ups. A thousand stars measures whether an idea sounds good, not whether it works. There’s no head-to-head comparison, so pick on fit rather than on popularity.

The underlying mechanic, though, has real support. The Reflexion paper from NeurIPS 2023 found that agents that write a short natural-language reflection on what went wrong outperform agents that store the raw record of what happened by eight percentage points absolute. Storing the transcript is worth less than storing the conclusion. That is the academic version of Start, Stop, Continue.

Anthropic’s own context management benchmark reports a 39% performance improvement over baseline when memory and context editing are combined, and 84% fewer tokens consumed on a long multi-step task. Vendor-run, so discount it accordingly, but it’s a real evaluation with published numbers.

The mechanism is sound. The specific tooling is a matter of taste.

Prune It, or You’ve Just Rebuilt the Intranet Page

The one part none of this tooling can do for you: every system here accumulates rules, and none of them has a convincing answer for pruning. claude-reflect ships a dedupe command, which tells you the problem is real enough to need a command.

So budget a few minutes every month or so to read what’s piled up and delete what’s gone stale. That’s the intervention the manual practice trained you for. A memory file nobody audits becomes the same dead document, except this one is actively steering your work instead of quietly sitting on an intranet.
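If you want a starting point for that monthly read-through, something like the following Python sketch can sort the pile into "keep" and "look at this". It rests on assumptions worth stating: it expects each rule to carry a `last_confirmed` date, which I am not aware of any of the tools above recording, and the 90-day threshold is arbitrary. It only flags. The deleting stays with you.

```python
from datetime import date, timedelta

def audit_rules(rules, today: date, max_age_days: int = 90):
    """Split rules into ones to keep and ones a human should review.

    rules: list of dicts with 'text' (str) and 'last_confirmed' (date).
    Returns (keep, review), where review holds (rule, reason) pairs.
    """
    seen = set()
    keep, review = [], []
    for rule in rules:
        # Treat rules that differ only in case or spacing as duplicates.
        normalized = " ".join(rule["text"].lower().split())
        if normalized in seen:
            review.append((rule, "duplicate"))
            continue
        seen.add(normalized)
        if today - rule["last_confirmed"] > timedelta(days=max_age_days):
            review.append((rule, "stale"))
        else:
            keep.append(rule)
    return keep, review
```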

Making the Time Before You Save the Time

All of this runs into the oldest problem in efficiency work: you have to spend the time before you get the time. Every initiative that eventually saves hours starts by costing hours you didn’t have when you decided to do it. Automating your retrospectives is no different. The build is small, but it’s still a build, and it competes with the actual work that made you want the efficiency in the first place.

I get it. New habits are hard. I confess that I don’t do this as regularly as I should. I keep some pre-written prompts in UpNote that are more involved than the example in this post, yet when I do remember, I usually write one off the cuff. Maybe if I spent less time writing about techniques and more time automating them…

Which is exactly why the two-prompt version matters. It costs nothing to start, it works on its own, and it buys you the judgment you’ll need before you spend an afternoon wiring up anything fancier. Start there. Automate when the habit is annoying you enough to be worth removing.

An Answer to the Question Everyone Keeps Asking

“What are we doing with the time we save using AI?”

Fair question, and most of the answers are vapor. Here’s one that isn’t.

Spend a slice of it teaching the tool what it just learned. Those two minutes are the highest-leverage part of the whole workflow, because they’re the only part that compounds. Every other minute you save gets saved once. This one gets saved again on every future session.

It shouldn’t take all of it. And it should take less as time goes on, which is the tell that it’s working. Early sessions produce a lot of corrections because there’s a lot the tool doesn’t know about how you work. Later sessions produce a few refinements. Eventually you’re mostly confirming that the rules still hold.

Use the rest to go touch grass.

That’s not a throwaway. The entire argument for this technology is that it gives us capacity back. If we spend all of the reclaimed time on more work, we didn’t gain capacity, we just raised the quota.

Now Tell Me What You’re Actually Doing

This is the part where a post like this usually promises a follow-up with results. I’d rather ask than promise.

If you’re already running some version of this, I want to hear it. Which tool, or which few lines you wrote yourself. What the capture prompt actually says. Whether you review what gets written or let it accumulate. What broke.

If you’re not doing it yet and this pushed you into trying, say so at the start and tell me what happened three weeks in. Including if the answer is that you did it twice and stopped, because that result is more useful than another enthusiastic write-up. The failure modes are what’s missing from every article I read while putting this together.

If enough of it turns out to be interesting, there’s probably a follow-up in it, built mostly out of what readers send rather than what I think. Credited unless you’d rather not be. A comment, an email, a reply on LinkedIn, whatever’s easiest.

The retrospective finally works. Let’s find out whether we actually run it.

If you found this interesting, please share.

Keep Claude from Forgetting You When Moving to a New PC

July 26, 2026, by Scott S Nelson

TL;DR: Copying your project folders to a new laptop moves the files and leaves the work behind. Everything Claude Desktop actually remembers, your Cowork sessions, your MCP connector config, your Claude Code settings, lives in a Windows package sandbox that a normal search will never find. Zip that folder with the app closed and your history comes with you. Skip it and you start over with a very clean, very empty sidebar.

New Laptop Day

New laptop day is supposed to be a good day. Copy the projects, sign in, get back to work.

For most non-developers it generally is pretty straightforward, especially if your company manages backups and you use defaults. Sign in, wait for the sync icon to stop spinning, open the same three apps you always open. Whatever you’d customized was mostly customized inside something that follows your account, so it follows your account.

For developers and other power users it has never been that straightforward. There are a million little tweaks accumulated over years for personal preference and productivity, and not one of them is written down anywhere. The PATH entry you added at 11pm to make a build work. The tool installed from a zip file into a folder you invented. The config that lives three levels below AppData because that’s where the installer decided to put it. You don’t have a list. You have a vague sense that something will be missing, and you find out which something at the worst possible moment, usually about four days later.

Thanks to the popularity of Claude, now everyone gets those extra steps. And they’re not that straightforward even for the power users.

Here’s how it goes wrong. You copy the project folders. OneDrive handles the synced ones, a USB drive handles the local-only ones, file counts check out, you spot-check a few documents and nothing’s corrupted. Install Claude Desktop, sign in, open Cowork.

Empty.

The files are all there. The work is not. Every session, every bit of accumulated context, every conversation where you finally got Claude to understand your codebase, none of it came along. Because none of that lives in the project folder. It lives somewhere else entirely, and the app gives you no indication where.

Worth sorting out before you wipe the old machine, which is the point of no return most people hit about an hour after they think they’re done.

Two Kinds of Sessions, Only One of Which Is Your Problem

Some good news first. Cowork sessions now run remotely and are tied to your Claude account rather than your hardware. Sign into a new machine and those sessions are already there, waiting, automagically. Nothing to move, nothing to find. For once, the defaults crowd wins outright.

The catch is that this only holds for sessions created after remote sessions shipped. Anything older is a local artifact sitting on the old hard drive. If you’ve been using Cowork for a while, that’s likely most of your history.

So the question isn’t whether to migrate. It’s whether your sessions predate the cutover.

If none of them do, close the tab and enjoy the new laptop. Although before you go, it’s worth a moment’s honest reflection: if you’ve been at this for a while and have nothing accumulated that’s worth the trouble of moving, that’s telling you something. Not about the migration. About how much of Claude you’re actually using. A tool you’ve spent months teaching your projects, your standards, and your preferences to is a different tool than one you open to write emails. If none of this applies to you, the interesting question isn’t how to migrate. It’s what you’ve been leaving on the table.

For everyone else, keep reading.

Where It Actually Lives

Ask Claude where its own session data lives on Windows and it will tell you %APPDATA%\Claude\local-agent-mode-sessions\. It will tell you this confidently. Community migration tooling references the same path, so it isn’t inventing it from nothing.

It isn’t there.

Nor is it at %LOCALAPPDATA%\Claude\, which is the natural second guess and the next thing you’ll be told to try. You can run a recursive search across your entire user profile for audit.jsonl, the conversation log written inside every Cowork session, and get nothing back at all. That result is disorienting enough that you start to wonder whether the sessions ever existed on disk in the first place.

Whether that documented path was ever correct on Windows, or whether it’s a Mac convention that got generalized, isn’t something I can tell you. What matters is that on a current Windows install it’s a dead end, and it’s the dead end you’ll be pointed at first.

The real reason is architectural. Claude Desktop installs as a packaged Windows app, and packaged apps get their own private storage sandbox. Writes to the conventional locations get quietly redirected somewhere like this:

C:\Users\<username>\AppData\Local\Packages\Claude_<packageid>\LocalCache\Roaming\Claude\

Note the shape of that path. It ends in Roaming\Claude, exactly as documented. It’s just sitting under a package container that no reasonable person would think to check, which is why searching for the documented path fails while the documented path is, in a sense, still accurate.

The package ID is a short string of characters, and there’s no reason to guess at yours. Find it:

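# List every folder whose name contains "claude" under the profile locations.
Get-ChildItem -Path $env:APPDATA, $env:LOCALAPPDATA, $env:USERPROFILE -Recurse -Directory -Filter "*claude*" -ErrorAction SilentlyContinue |
  # Emit plain path strings so long sandbox paths are not cut short in a formatted table.
  Select-Object -ExpandProperty FullName |
  Out-File "$env:USERPROFILE\Desktop\claude-folders.txt"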

That writes every Claude-related folder on the machine to your desktop. Open the file, look for the one containing local-agent-mode-sessions, and you have your real path. Two notes: run it in PowerShell, not Command Prompt, and expect it to take a minute or two while it walks the profile.

Depending on how and when Claude was installed, your data may genuinely sit in the plain %APPDATA%\Claude\ path. Both are possible. The one holding session subfolders with long UUID-style names is the one that matters. Don’t assume, check.
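The "don't assume, check" step can also be scripted. Below is a small Python sketch that walks whatever roots you give it, finds every `local-agent-mode-sessions` folder, and counts the UUID-looking subfolders in each. The regular expression is my guess at what "UUID-style" means in practice, and the function name is made up for the example, so treat the counts as a hint rather than proof.

```python
import os
import re
from pathlib import Path

SESSIONS_DIR = "local-agent-mode-sessions"

# Assumption: a session folder name contains a standard 8-4-4-4-12 hex UUID.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)

def find_session_roots(search_roots):
    """Return (sessions_folder, uuid_subfolder_count) pairs, fullest first."""
    results = []
    for root in search_roots:
        # os.walk skips folders it cannot read instead of raising.
        for dirpath, dirnames, _filenames in os.walk(root):
            if SESSIONS_DIR in dirnames:
                candidate = Path(dirpath) / SESSIONS_DIR
                count = sum(
                    1
                    for child in candidate.iterdir()
                    if child.is_dir() and UUID_RE.search(child.name)
                )
                results.append((candidate, count))
    return sorted(results, key=lambda pair: -pair[1])
```

On Windows you would pass something like `[os.environ["LOCALAPPDATA"], os.environ["APPDATA"]]` as the search roots. If two folders come back, the one with the non-zero count is the one that matters.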

Claude Will Help You Move, and You Don’t Even Need to Buy It a Pizza Afterwards

Here’s the part that’s mildly funny in hindsight: the way to find all of this is to ask Claude, in Cowork, on the machine you’re migrating away from. It has shell access to the profile it’s running in. Point it at the problem and it will run the searches, read what comes back, and narrow the path down with you.

It will also send you to those two wrong folders on the way there, because the documented path and the actual path aren’t the same thing, and the documented path is what it reaches for first. That’s not a knock. It’s the ordinary pattern of working with these tools, and the fix is the same as always: make it show you the output rather than accept the summary. The searches above came out of exactly that back and forth. Claude proposed, the filesystem disagreed, and the third attempt found it.

Which is the useful lesson here, more than the folder path. The tool is genuinely good at the mechanical part, walking a filesystem, reading a directory listing, writing the PowerShell you’d otherwise be looking up. It is not reliable about where things live, because that’s a fact about your specific install and not something it can know from training. Use it for the legwork. Verify the destination yourself.

What to Copy and What to Leave

Open that Claude folder and you’ll find far more than sessions. Most of it is cache, crash logs, GPU state, and machine-specific scratch that regenerates on its own. Copying it doesn’t help and can actively cause conflicts.

Bring these:

local-agent-mode-sessions is the whole reason you’re here. Your Cowork session history.
claude_desktop_config.json holds your MCP server configuration. Lose it and every connector gets set up again from scratch. This is the one people don’t realize they needed until it’s gone.
claude-code and claude-code-vm carry Claude Code settings and VM data, if you use it.
vm_bundles for Claude Code VM bundles.
ChromeNativeHost for the Claude in Chrome integration.
config.json and cowork-enabled-cli-ops.json for general app and Cowork settings.
git-worktrees.json if you run Claude Code against git worktrees.
pending-uploads if anything is sitting in it.

Leave IndexedDB alone. It stores app state tied to your current signed-in session, and the new machine builds its own. Overwriting it invites problems you’ll spend an evening diagnosing.

Also grab C:\Users\<username>\.claude\ on the old machine. That’s a separate folder outside the sandbox, holding configuration and skills, and it goes to the same place on the new profile.

Zip It on the Old Machine

Rather than dragging ten items across a USB drive and hoping you got them all, package them once. Quit Claude Desktop completely first. Not the window, the app. Right-click the tray icon and choose Exit. Zipping live data files is how you end up with a corrupted history and no idea why.

Then, in PowerShell, substitute your own package ID into the first line and run this:

$Claude = "$env:LOCALAPPDATA\Packages\Claude_<packageid>\LocalCache\Roaming\Claude"

$Items = @(
  "local-agent-mode-sessions",
  "claude-code",
  "claude-code-vm",
  "vm_bundles",
  "ChromeNativeHost",
  "pending-uploads",
  "claude_desktop_config.json",
  "config.json",
  "cowork-enabled-cli-ops.json",
  "git-worktrees.json"
) | ForEach-Object { Join-Path $Claude $_ } | Where-Object { Test-Path $_ }

Compress-Archive -Path $Items -DestinationPath "$env:USERPROFILE\Desktop\claude-migration.zip" -CompressionLevel Optimal

The Where-Object { Test-Path $_ } line quietly skips anything you don’t have, so you can run it as-is whether or not you use Claude Code.

One detail that makes the other end easy: the archive stores paths relative to the items you named, not their full absolute paths. local-agent-mode-sessions\ and config.json land at the root of the zip. The archive is effectively a snapshot of the Claude folder itself, which means unpacking it is a single step with nothing to rearrange.
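If you'd rather confirm that layout than take my word for it, a few lines of Python will list what sits at the root of the archive. This is a sketch with a made-up function name. It normalizes backslashes on the assumption that some archive tools write them into entry names, which costs nothing if yours doesn't.

```python
import zipfile

def top_level_entries(zip_path):
    """Return the sorted names found at the root of a zip archive."""
    names = set()
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Accept either separator, then keep only the first path segment.
            first = name.replace("\\", "/").split("/", 1)[0]
            if first:
                names.add(first)
    return sorted(names)
```

Run against `claude-migration.zip`, the result should be the item names from the `$Items` list that actually exist on your machine, with no `Users` or `AppData` segments in front of them.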

Then package the profile folder separately, since it lives elsewhere:

Compress-Archive -Path "$env:USERPROFILE\.claude" -DestinationPath "$env:USERPROFILE\Desktop\claude-dotfolder.zip"

Two zips on your desktop. Move them however you like: USB drive, OneDrive, network share.

If Compress-Archive throws a path-length error, that’s the 260-character Windows limit biting, and it’s plausible here given how deep the sandbox path already runs before your session UUIDs even start. Enable long paths in Windows, or fall back to robocopy with the /E switch to stage the folders somewhere shallow like C:\ClaudeMigration\ first, then zip from there.
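You can find out whether you're near that limit before the zip step fails. The Python sketch below walks a folder and reports every path at or over a given length. The function name is hypothetical, and 260 is the classic Windows figure. Whether a particular tool actually trips at that length depends on the tool and on whether long paths are enabled.

```python
import os

MAX_PATH = 260  # classic Windows limit, counting the terminating null

def paths_over_limit(root, limit=MAX_PATH):
    """Return every file or folder path under root whose length reaches the limit."""
    too_long = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            # A path of `limit` characters leaves no room for the terminator.
            if len(full) >= limit:
                too_long.append(full)
    return too_long
```

An empty list means the zip should go through as-is. Anything else is your cue to stage the folders somewhere shallow first.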

Unzip It on the New Machine

Install Claude Desktop, sign in, and open Cowork once. This creates the directory structure you’re about to unpack into. Then quit completely again.
Find the package path on this machine. Run the same Get-ChildItem search from earlier. The package ID can differ between installs, so confirm rather than assume.
Unpack straight into the Claude folder. Because the archive is relative to that folder, everything lands where it belongs on its own:

$Target = "$env:LOCALAPPDATA\Packages\Claude_<packageid>\LocalCache\Roaming\Claude"

Expand-Archive -Path "$env:USERPROFILE\Desktop\claude-migration.zip" `
               -DestinationPath $Target -Force

-Force overwrites files that already exist and leaves everything else untouched. The only collisions are the config files the fresh install generated a minute ago, and replacing those with yours is the entire point.

Restore the profile folder:

Expand-Archive -Path "$env:USERPROFILE\Desktop\claude-dotfolder.zip" `
               -DestinationPath "$env:USERPROFILE" -Force

Restart Claude Desktop. Your old sessions should appear in the sidebar.
Reconnect your workspace folders. These are machine-specific paths and won’t follow you. Copy your project files first, then point Cowork at their new locations.
Reinstall plugins that rely on local MCP servers. Cloud-only plugins come across with your account.
Keep the old machine intact until you’ve verified all of it. Open a few migrated sessions, confirm your connectors work, and only then wipe.

The Same-Account Assumption

Everything above assumes one thing that’s easy to overlook: the Windows username is identical on both machines. When it is, every path inside those session files still resolves and the unpack just works.

When it isn’t, you have a problem the zip won’t solve. Sessions store absolute local paths, so C:\Users\OldName\Documents\... follows you onto a machine where no such user exists. The files arrive fine. The references inside them point at nothing.

If your usernames differ, this is the one case that needs an extra stop. Unpack to a scratch folder instead of straight into the target, run a find-and-replace across the JSON and JSONL files inside the extracted sessions to swap the old profile path for the new one, verify a single session opens correctly, and only then copy the corrected folders into place. Apply the same scrutiny if anything else moved, a different drive letter, a relocated Documents folder, OneDrive redirection that was on before and off now. Any of those breaks the same assumption in the same way.
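The find-and-replace has one trap worth spelling out: inside JSON, a Windows path is usually written with doubled backslashes, so searching for `C:\Users\OldName` as typed can find nothing. Here is a Python sketch of a replace that covers the escaped, forward-slash, and raw forms. The function names are made up, UTF-8 encoding is an assumption, and I have not verified which form Cowork's session files use, so run it on the scratch copy only.

```python
from pathlib import Path

def path_variants(profile_path: str):
    """The same Windows path in the forms it may take inside a text file."""
    return [
        profile_path.replace("\\", "\\\\"),  # JSON-escaped backslashes
        profile_path.replace("\\", "/"),     # forward slashes
        profile_path,                        # raw backslashes
    ]

def rewrite_profile_paths(scratch_dir, old_profile: str, new_profile: str):
    """Rewrite old profile paths in .json and .jsonl files; return changed files.

    Matching is exact and case-sensitive, and a name that is a prefix of
    another (OldName vs OldName2) would also match, so check the results.
    """
    pairs = list(zip(path_variants(old_profile), path_variants(new_profile)))
    changed = []
    for file in sorted(Path(scratch_dir).rglob("*")):
        if not file.is_file() or file.suffix.lower() not in {".json", ".jsonl"}:
            continue
        # newline="" keeps the original line endings untouched.
        with open(file, encoding="utf-8", newline="") as handle:
            text = handle.read()
        updated = text
        for old, new in pairs:
            updated = updated.replace(old, new)
        if updated != text:
            with open(file, "w", encoding="utf-8", newline="") as handle:
                handle.write(updated)
            changed.append(file)
    return changed
```

A call would look like `rewrite_profile_paths(r"C:\ClaudeScratch", r"C:\Users\OldName", r"C:\Users\NewName")`, where the scratch folder name is just an example. The returned list tells you which files to spot-check before anything gets copied into place.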

Setting expectations honestly here: migrated sessions may land as readable history rather than fully resumable conversations. You get the record of what was discussed and decided, which is the part that took months to accumulate. Whether Cowork lets you pick up where you left off is a separate question, and one worth testing with a single session before you count on it for all of them.

With AI, Every How-To Has an Unknown Expiration Date

Everything above is a workaround for a design that’s already being replaced. That much is normal. What’s different now is that nobody can tell you when the replacement lands.

Software used to telegraph its changes. Version numbers, release notes, deprecation warnings, a beta period where the community wrote up what moved. A how-to written against version 4 stayed true until version 5, and version 5 announced itself. You could read a three-year-old post, check the version in the header, and know within seconds whether it still applied.

AI tools don’t work that way. They ship continuously, the client updates itself, and the thing you’re actually interacting with changes underneath a version number that may not move at all. Storage locations migrate. Local features become account features. The answer that was correct on Tuesday is wrong on Thursday, and nothing announces it. There’s no header to check.

Which is exactly what’s happening to this article. With sessions running remotely and tied to your account, the machine stops being where the work lives and becomes just a window onto it. Sign in anywhere and your history, your projects, and your scheduled tasks are already present. No folder to find, no zip to move, no package sandbox to go spelunking in. For anyone starting fresh today, this is already unnecessary.

That’s the right direction, and worth saying plainly rather than grumbling about the transition. Local-first storage bought you privacy and offline access at the cost of making your work a hostage to one piece of hardware. Account-first storage trades that the other way. Reasonable people weigh those differently, but nobody has ever been glad their context was trapped on a laptop with a failing battery.

The catch is the seam, and seams are always where the work is. Tools in transition leave a cohort stranded on the old model, holding data in a format the new model doesn’t automatically reach for. That cohort is anyone who adopted early, which is to say the people with the most accumulated context and the most to lose. Migration guides exist for the gap between what a tool used to be and what it’s becoming.

So treat this one, and every AI how-to you find, as perishable goods with no date stamped on the carton. Verify the path before you trust the instructions. If the folder isn’t where the article says it is, the article is probably older than the software. That will be true of this post eventually, and I can’t tell you when. Neither can anyone else. Until then, somebody has to move the boxes.

What This Actually Costs You

A little perspective on why any of us write these things down.

My first how-to post, back in 2002, was about building web services with Apache Axis. Working it out took the better part of two days: documentation that assumed you already knew the answer, examples that didn’t compile, and a long stretch of staring at a stack trace. The post that came out of it took a reader about twenty minutes. Two days of my confusion, compressed into twenty minutes of somebody else’s afternoon.

This one took about an hour of prompting and probing with Claude. It should take you fifteen minutes. Ten, if you run this post through an AI and ask it to strip out all my digressions, which I’d encourage, though I’d like it noted that the digressions are the part I enjoy.

The ratio held. It’s just faster on both ends now.

Fifteen minutes, then, against rebuilding context that took months to accumulate. That’s the entire calculation, and it’s the same calculation behind every unglamorous piece of groundwork in this business. Nobody gets excited about verifying a folder path. Everybody gets excited about the new laptop. The gap between those two feelings is where work disappears.

The broader lesson has nothing to do with Claude. Any tool you’ve spent real time training to your context is holding state somewhere, and “somewhere” is rarely where the marketing implies. Before you decommission a machine, ask what the tools on it know that the files don’t. Then go find out where they keep it.

Then wipe the old laptop. Not before.


Meet Deadlines and Manage Technical Debt with AI-Assisted Architecture

June 21, 2026 (updated August 1, 2026), by Scott S Nelson

TL;DR: New platform. Deadline. The instinct is to move fast and clean it up later. That’s where technical debt is born. A well-constructed Claude project, loaded with curated platform documentation and queried with the experience to know what to ask and how to evaluate the responses, tactically compresses the ramp-up without sacrificing strategic design principles.

The Sharp Fork in the Road

Every architect and engineering lead who has been given a project to deliver on a new platform or with new technology under deadline pressure knows this fork. Pushing for proper preparation can get you marked (ironically) as a risk from the leadership perspective. Plowing forward using old techniques without understanding the new nuances keeps you up at night…either knowing you are missing something up front, or fixing what you didn’t know during the final death-march phase of a waterfall project that just happens to use Kanban boards, daily stand-ups, and sprint ceremonies.

One path: move fast. Learn just enough to ship. Ask support when you hit a wall. Request exceptions when you hit limits. Get it working and tell yourself you’ll revisit the architecture when there’s more time. (There is never more time.) What you build in that mode becomes the foundation everything else is built on, and the cost of fixing it compounds with every sprint.

The other path: slow down. Read the documentation properly. Understand the platform’s constraints before you design around them. Make the right call the first time. This is correct and often impractical. Deadlines are real. The platform is new. The documentation is dense. The team is waiting.

The Contentstack project that prompted this post took the first path and ran into a SaaS governance constraint that happens to be measured recursively. The first time it was hit, the response was typical for teams working with a new SaaS vendor and a release date that was set before the first line of code was written: ask for an exception. The exception was granted, then the limit was hit again and raised again. Fortunately, the third time it happened, an experienced vendor support manager recommended reviewing best practices to avoid the issue. And an experienced architect was on the receiving end of that suggestion, one who had previously dealt with a Salesforce solution that went down three months after launch from relying on similar exceptions.

This post is not about Contentstack architecture. It is about the challenge many teams face with balancing target dates and defensive design decisions, and a tool set to apply in order to keep from tipping too far in either direction.

Claude as a Platform Research Partner

Giving Claude access to a curated set of platform documentation and then working interactively to explore solutions is not a replacement for architectural experience. It is an accelerant for it. It is also not a way to do away with architects or the inclusion of design tasks at the feature or story level. It is how to fulfill the expectation that AI can provide ROI immediately when applied by experienced technologists.

These distinctions matter. It’s never about “ask Claude what to do” (because if you need to ask “what” you won’t know how to ask “how” when the time comes). It is “I understand how systems like this behave, I know which constraints are likely to compound, and I need to move through the analysis faster than I could alone.” Experienced architects and engineers bring the judgment: familiarity with how content models fail at scale, how schema resolvers typically handle recursion, how vendor-imposed limits usually reflect real constraints in the underlying system. Claude brings the recall, the scripting, the cross-referencing, and the tireless patience for the kind of recursive schema analysis that would take a senior engineer the better part of a day.

Those who follow my posts know that I often describe theoretical solutions, backed by personal experience of where they would have worked and linked to examples from others who demonstrated that they do work. In this case the experience came before the theory, working backwards from a result where I noticed the process while documenting the solution (because, hey, that is what architects do after they solve something).

The working example was with a Contentstack implementation. It took one focused 2-hour session to identify an obscure root cause, define a strategic solution, discover other areas to apply the same solution, and identify where the solution would cause more harm than good. A second 30-minute session was applied after the first round of refactoring to validate the impact and prioritize the remaining effort. Before Generative AI, this would have been several days of effort that would not have been attempted until the risk was realized in production delay.

The Project is the Architecture

Before a single question gets asked, the project has to be built. This is not setup overhead. This is the work.

A blank Claude chat window and a well-constructed project will give you very different results on the same question. The difference is not the AI. It is the knowledge boundary, the taxonomy, the instructions, and the accumulated session output. Strip those away and you have a general-purpose assistant guessing at context. Keep them and you have something that behaves like a senior researcher who has been on the project for months.

What goes in the project folder:

Downloaded documentation as markdown files, not links. Links go stale, require fetches, and introduce latency. Pull the platform docs that matter, save them as markdown, put them in the folder. For Contentstack: the Global Fields limitations page, the Content Modeling Best Practices guide, the Custom Fields documentation. Not every page in the docs. The ones relevant to the work. Knowing which ones matter is the first place architectural experience shows up.

Actual data from the platform. In this case, exported stack JSON. Claude can read it directly in the sandbox, run scripts against it, and cross-reference findings against the loaded documentation in the same session. That combination of curated docs and live data is what makes the diagnosis precise instead of speculative.

Session summaries. After each working session, have Claude produce a structured summary: the original problem, the conclusions, the evidence, the next steps. That file becomes the cold-start document for the next session. You don’t re-explain the context. You hand Claude the prior session’s output and continue. The knowledge compounds.

At some point (again, much of this requires human intuition gained through real-world experience), have Claude work with you to turn the summaries into a skill scoped to the specific platform, technology, or tool so that when they are in context these lessons learned will be applied automatically going forward.
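The four parts of a session summary named above (original problem, conclusions, evidence, next steps) can be pinned to a fixed template so every cold-start file has the same shape, whoever fills it in. What follows is only a minimal sketch: the class name, field names, and heading text are illustrative assumptions, not a format Claude requires.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class SessionSummary:
    """Hypothetical container for one working session's outcome."""
    problem: str
    conclusions: list[str] = field(default_factory=list)
    evidence: list[str] = field(default_factory=list)
    next_steps: list[str] = field(default_factory=list)

    def to_markdown(self, session_date: date) -> str:
        """Render the summary as markdown, one heading per section."""

        def section(title: str, items: list[str]) -> str:
            # Keep empty sections visible so a gap is obvious to the next reader.
            body = "\n".join(f"- {item}" for item in items) or "- (none recorded)"
            return f"## {title}\n\n{body}\n"

        parts = [
            f"# Session summary: {session_date.isoformat()}\n",
            f"## Original problem\n\n{self.problem}\n",
            section("Conclusions", self.conclusions),
            section("Evidence", self.evidence),
            section("Next steps", self.next_steps),
        ]
        return "\n".join(parts)
```

Saving the rendered text under the sessions folder with the date in the filename keeps the summaries sortable, which makes it easy to hand the most recent one to the next session.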

The Taxonomy Is Not an Afterthought

Separate downloaded reference content from working session output. Nest folders by topic. /reference/, /sessions/, /data/ serve different purposes and should live in different places. This is not pedantry. It is how you make the project instructions work correctly, and how you find things six weeks later without rebuilding context from scratch.
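One way to make the taxonomy concrete is to scaffold it before the first session. This is a minimal sketch, assuming the three folder names above; the README wording and the function name are illustrative, and the project root is whatever path you pass in.

```python
from pathlib import Path

# Folder purposes follow the taxonomy described above; the wording is illustrative.
FOLDERS = {
    "reference": "Downloaded platform documentation, saved as markdown.",
    "sessions": "Structured summaries produced at the end of each working session.",
    "data": "Exports from the platform, such as stack JSON.",
}


def scaffold_project(root: Path) -> list[Path]:
    """Create the folder taxonomy under root and return the folders in order."""
    created = []
    for name, purpose in FOLDERS.items():
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        readme = folder / "README.md"
        # Never overwrite a README someone has already edited.
        if not readme.exists():
            readme.write_text(f"# {name}\n\n{purpose}\n", encoding="utf-8")
        created.append(folder)
    return created
```

The one-line README in each folder is a design choice rather than a requirement: it tells both the human and the assistant what belongs where, without relying on the folder name alone.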

If the platform has extensive documentation, don’t try to enumerate allowed URLs in the project instructions directly. Create a reference-urls.md, or per-topic files like contentstack-docs-urls.md, with an annotated, categorized list of approved sources. Claude works from the list. You maintain the list. It stays current and searchable.

The discipline compounds the same way the session summaries do. A well-organized project from session three makes session fifteen faster than session one.

The Project Instructions Are the Rules of Engagement

The instructions define how Claude behaves inside this knowledge space. Three things they need to do:

Challenge assumptions. If a question implies something not supported by the loaded documentation, say so. Don’t fill gaps with plausible-sounding answers. The most dangerous thing a research assistant can do is answer confidently on insufficient evidence. This instruction eliminates a whole category of hallucination risk before it starts.

Point out mistakes. If the framing of a problem is wrong, say so. This is the instruction most people skip and then complain about later. You want an assistant that pushes back, not one that validates your bad hypothesis and helps you build a case on sand.

Limit web searches to specific URLs. Unconstrained web search in a technical investigation introduces noise: outdated content, inconsistent sourcing, SEO-optimized answers that aren’t accurate. Lock it down. Specify which domains are permitted. For a Contentstack project, that’s contentstack.com/docs. Everything else requires explicit permission. If the approved URL list is long, store it in a markdown file in the project folder and point the instructions at it.
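The restriction itself lives in the project instructions as prose, but the approved list can also be checked mechanically, for example to audit the sources cited in a session summary. The sketch below assumes the list is a markdown file containing plain https URLs; the function names and the matching rule (same host, path under an approved prefix) are assumptions for illustration, not something Claude enforces.

```python
import re
from urllib.parse import urlparse

# Matches http(s) URLs up to whitespace or a closing bracket or parenthesis.
URL_PATTERN = re.compile(r"https?://[^\s)>\]]+")


def load_approved_urls(markdown_text: str) -> list[str]:
    """Extract every http(s) URL mentioned in the markdown list."""
    return [match.rstrip(".,;") for match in URL_PATTERN.findall(markdown_text)]


def is_approved(candidate: str, approved: list[str]) -> bool:
    """True if candidate shares a host with an approved URL and sits under its path."""
    cand = urlparse(candidate)
    cand_host = (cand.hostname or "").lower()
    # A trailing slash makes the prefix test respect path segment boundaries.
    cand_path = cand.path.rstrip("/") + "/"
    for entry in approved:
        ok = urlparse(entry)
        ok_path = ok.path.rstrip("/") + "/"
        if cand_host == (ok.hostname or "").lower() and cand_path.startswith(ok_path):
            return True
    return False
```

Matching on a path prefix rather than on exact URLs keeps the list short: one entry for a documentation root covers every page beneath it, while look-alike paths and other hosts are rejected.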

This Requires an Architect

Here is the part that does not get said enough.

You cannot point Claude at an unfamiliar platform, load a few docs, and expect it to diagnose architecture problems. You can try. What you’ll get is fluent, confident, and partially wrong.

There are many engineers capable of setting this up. The value of an architect doing the work is separation of concerns in roles. The architect’s role is to nail down processes and choices that allow engineers to focus on the best way to apply them.

In our Contentstack use case, the single session worked because the person directing it brought a deep understanding of adjacent technologies and the experience to know both what to ask and how to evaluate the responses. Specifically:

Recognizing that the error message pointed to a schema limit, not a code problem, because that’s how content platform resolvers typically surface constraint violations
Understanding that “recursive” in the documentation meant multiplicative compounding, not additive, based on how similar systems handle nested references
Knowing the fix had to leave the content model intact for authors, which ruled out several otherwise obvious approaches
Reading a Claude-generated Python script’s output and recognizing that the confident first result came from looking in the wrong parts of the schema
Looking at a before/after instance table and determining whether the fix was actually complete or just moved the problem
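The difference between additive and multiplicative counting is easier to see with numbers. The sketch below uses an entirely hypothetical schema (the type names and the counting rule are assumptions for illustration, not Contentstack’s actual rule): each reference counts once, plus everything nested inside it, every time it is embedded.

```python
# Hypothetical schema: each type maps to the nested types it embeds.
# A name listed twice means the type is embedded twice.
SCHEMA = {
    "page": ["hero", "hero", "card_grid"],
    "hero": ["cta", "cta"],
    "card_grid": ["card", "card", "card"],
    "card": ["cta"],
    "cta": [],
}


def direct_count(schema: dict[str, list[str]], name: str) -> int:
    """Additive view: only the references declared directly on the type."""
    return len(schema[name])


def recursive_count(schema: dict[str, list[str]], name: str, _path: tuple = ()) -> int:
    """Compounding view: each reference counts once plus everything inside it."""
    if name in _path:
        chain = " -> ".join(_path + (name,))
        raise ValueError(f"cycle detected: {chain}")
    total = 0
    for child in schema[name]:
        total += 1 + recursive_count(schema, child, _path + (name,))
    return total


print(direct_count(SCHEMA, "page"))     # 3
print(recursive_count(SCHEMA, "page"))  # 13
```

Three direct references become thirteen instances once nesting is resolved, which is why a model that looks comfortably under a limit on paper can exceed it in practice. It is also why a question such as “instances of what?” matters: the two functions answer different definitions of the same word.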

None of that knowledge lives in the documentation itself. It transfers in from adjacent experience: content modeling, schema design, how platform resolvers work under the hood. Claude surfaces the platform-specific detail. The architect determines what it means.

The tool doesn’t replace experience. It supercharges it with speed and specific knowledge.

The Interaction Pattern

What the Contentstack session actually looked like, stripped of the platform specifics:

State the problem. Provide the evidence: the error message, the exported schema, the documentation.
Claude generates a hypothesis. Test it against the data.
Diagnostic script written and run in the sandbox.
Root cause confirmed. Fix designed. Impact predicted before any schema changes are made.
Fix implemented. Follow-up session loads the new export and verifies the result.
Summary file created. Next session’s candidates identified.

No magic. An architect with relevant adjacent experience, a fast and patient research partner, and a well-stocked project folder.

Prompts That Did Actual Work

These are worth examining because the techniques transfer to any platform.

“Describe in detail the cause of home_page_template having 24 instances, and instances of what?”

The second half of that question is the important part. Asking Claude to define what it is counting before giving the count forces precision on both sides. In technical sessions on an unfamiliar platform, jargon can mask shallow understanding without anyone noticing until the fix doesn’t work. The ability to ask that follow-up, to know that “instances” needed a definition before the number meant anything, comes from having debugged similar problems elsewhere. Use this pattern whenever an answer could be technically correct but operationally ambiguous.

“Create a summary file to feed to the next analysis session that includes the conclusions from this session combined with the original inputs. Format and sequence the file so that the next session can be as efficient as possible.”

Familiarity with adjacent technology is one reason this is an approach for architects and engineers. Experience solving complex issues with Generative AI is the other. Yes, Claude will now start compacting sessions on its own to improve efficiency, but having the sense that it is time to move to a new session is again an area where human experience beats relying entirely on the AI.

This prompt converts a working session into a durable asset. The phrase “format and sequence for efficiency” is carrying real weight: it tells Claude to think about how the file will be consumed, not just what it contains. The output becomes the cold-start document for the next session. Without it, every session re-derives context the previous one already established.

“Read the attached to get full context of the original issue, then review the contents of [folder] and determine if and how the issue has been improved.”

Sequencing does the work here. Claude gets the full prior-session summary before it touches the new data, so “improved” arrives with a precise definition attached. Without that order, it analyzes the new export without knowing what it’s comparing against. Prime with context before assigning the task, every time.

All three follow the same pattern. Context before task. Output format stated up front. It is not a methodology. It is just how you would brief a colleague who needs to be useful on short notice.

The Setup Is the Differentiator

Two teams, same platform, same error.

Team A has Claude. No curated project, no loaded docs, no taxonomy, no instructions. They get generic answers that feel helpful until they don’t hold up under the actual constraints of the platform.

Team B has a project built by someone with deep experience in adjacent technologies, content modeling, schema design, API behavior under constraint, who knows both what to ask and how to evaluate what comes back. Downloaded reference docs. Exported platform data. Session summaries that carry forward. Instructions that push back on bad assumptions.

Team B gets a root cause analysis, a fix, and a forward-looking roadmap. More importantly, they get it without accumulating the kind of structural debt that shows up six months later as an emergency.

A Note about Choosing Cowork

What I’m describing is not the typical use case for Cowork, Claude’s project-based workspace. Cowork is aimed at knowledge workers automating routine tasks: organizing files, generating reports, drafting communications. Productivity stuff. This is not that.

My choice of Cowork is based on my day-to-day work being mostly in documents and decks. This could likely also be done using Claude Code in an IDE, for those who prefer that interface.

I became aware of how far outside the lines I was operating when someone asked what tool I was using, I explained it, and I watched the look on their face. You know the look.

I have been here before. I spent years using JMeter for continuous functional and regression API testing, which is not what JMeter is for. JMeter is a load and performance testing tool, and there are entire communities of people who will tell you this. They are correct and also missing the point, because once you understand how JMeter handles realistic randomized inputs and configuration-driven test selection, you end up with one codebase doing the work of four. I wrote about it. People told me I was doing it wrong. The tests kept passing, so.

It is common to draw analogies between physical tools and technical tools. “When all you have is a hammer, everything looks like a nail,” and “You can use a screwdriver as a chisel, but you really shouldn’t.” I’ve often used those myself. But the opposite analogies are also true. Most tools can be a weapon, and many tools can have multiple uses. While screwdrivers are still terrible chisels, some are great prybars, hole punches, and, yes, weapons. Same with software. Excel has spellcheck, and I’d never paste text into it before posting to a blog, but I have used its formulas to parse text rather than writing a script to apply regex rules, because it is faster and just as accurate. Use your tools to the extent of their value, and don’t underestimate their value or your ability to innovate.


The Gold Rush Was Never Just About Gold

June 4, 2026 | July 21, 2026 | Scott S Nelson

TL;DR: Most people who chased the Gold Rush didn’t know what they were getting into. They saw headlines about fortunes and stories about how easy it was. Many went because their livelihoods were already threatened. Sound familiar?

Let’s be honest about who the average Gold Rush prospector actually was.

Not a rugged adventurer with a prospecting education and a solid savings account. Not someone who had studied geology or mapped the terrain. The typical forty-niner was a farmer whose crops had failed, a tradesman who had lost his shop, or a clerk who had read a breathless newspaper account and decided a long-shot bet beat a certain slow decline.

The California Gold Rush of 1848 and the Klondike rush of 1896 were separated by nearly fifty years and thousands of miles, but they drew from the same well: economic desperation dressed up as opportunity.

The context matters here, because without it the behavior doesn’t make sense.

The years leading up to the California rush included a global recession following the Panic of 1837, crop failures across the Midwest, and a population of young men with limited options. When James Marshall found gold at Sutter’s Mill in January 1848, the news didn’t just spread quickly, it spread selectively. The people who acted first were the ones who needed it most. Same story in 1896, when word of the Klondike strike reached Seattle and San Francisco during a prolonged economic depression that had pushed national unemployment past 20 percent. The ships heading north were not full of people with a plan. They were full of people with a problem.

Not everyone was running from something. Some were adventurers who wanted something different, or already had a good life and wanted something better. And not everyone coming out of a bad situation went in blindly. What almost everyone had in common were expectations that diverged sharply from how things turned out.

The relevant point is not that these people were reckless. It’s that economic pressure meant the average participant arrived undercapitalized, underprepared, and motivated primarily by someone else’s story of overnight success. They were chasing a headline, not a thesis. The results reflected that, in aggregate, almost immediately.

That pattern matters because it is not a 19th-century phenomenon. It is what every hype cycle looks like from the inside.

Each rush also moved in distinct waves. The rules that determined who succeeded in the first wave had almost nothing in common with what it took to win in the second. Most people who got swept up never stopped to ask which wave they were actually in. That question turned out to matter more than almost anything else.

First Wave: Right Place, Right Time, Right Creek

The first wave of California gold hunters had a genuine advantage. Here is what that advantage actually was. Not superior skill. Not better research. Proximity to the news.

Many of the earliest California prospectors were already in the territory: soldiers, settlers, and tradespeople who heard about Marshall’s discovery within weeks and moved fast. The surface deposits in the Sierra Nevada foothills were accessible, concentrated, and required almost no expertise to extract. A pan, a creek, and a willingness to stand in cold water for twelve hours were the main requirements. In that environment, showing up early mattered more than showing up prepared.

The Klondike told a similar first-chapter story. The initial claims along Bonanza and Eldorado Creeks were staked by prospectors already in the Yukon when George Carmack’s group made their discovery in August 1896. They were not the product of a coordinated strategy. They were in the right place when the right thing happened.

First-mover advantage is real. The people who moved fast in that window got a return no amount of later preparation could have replicated. But the window was short, the geography was finite, and it closed before most people had even heard the news.

Second Wave: The Pan Is Not Going to Save You

By 1852, the dynamics of the California Gold Rush had fundamentally changed. The surface deposits were gone. The creek beds that had yielded fortunes with a simple sluice box were picked clean by the first wave. The second wave arrived to find a very different landscape than the one the newspaper stories had described.

The prospectors who succeeded in the Second Wave did so through entirely different means. Hydraulic mining operations used high-pressure water jets to blast entire hillsides and process material through sluices, yielding gold at scale but requiring capital investment and systematic planning. Geologically-informed prospectors who understood quartz reef formations studied where gold veins actually formed and discovered productive sites where random panning had repeatedly failed. Syndicates pooled resources to fund deep shaft mines that reached deposits unreachable by individual surface workers.

Preparation was no longer an advantage. It was the entry requirement.

The Klondike replicated this pattern almost exactly. By the time the mass wave arrived in 1898 after a brutal trek over the Chilkoot Pass, hauling the year’s worth of supplies that the Canadian authorities required of each prospector, the accessible claims were long staked. The prospectors who completed that crossing and still found nothing with a pan were not unlucky. They were late, and they were underprepared for the wave they had actually entered.

This is also where technology shows up on both sides of the ledger. The Industrial Revolution had already been displacing Eastern tradespeople and artisans for a generation, which goes a long way toward explaining why those gold rushes had the human fuel they did. Factory looms had replaced hand weavers. Steam-powered equipment had displaced skilled craftsmen. The Gold Rush was, in no small part, a downstream consequence of technological disruption seeking an economic escape valve. And then, within the rushes themselves, industrial technology, hydraulic systems, and organized mining operations began displacing the individual prospector. The image of the lone miner with a pan was already obsolete while people were still forming it.

Gold Wasn’t the Only Thing in Them Thar Hills

Some prospectors did strike it rich. The early arrivals at Coloma, the men who staked Bonanza and Eldorado before the word spread, the syndicates that scaled hydraulic operations with enough capital to actually move mountains. These were real winners. Gold was there. People found it. Fortunes were made.

But a parallel economy was running alongside the prospectors, quieter in the moment and, in the long run, more durable.

Sam Brannan did not own a gold claim. He owned a hardware store, and before he told anyone about the gold discovery, he bought up every pick, pan, and shovel in Northern California he could find. Then he walked through San Francisco holding a vial of gold dust, shouting about gold from the American River. He became California’s first millionaire. He did not find a single ounce himself.

Levi Strauss did not mine. He figured out that miners destroyed pants at an extraordinary rate and needed something that could survive the work. He made pants. Generational brand.

Wells Fargo did not mine. They moved money and packages for people who did. They are still here.

The common thread is not that these people were smarter than the prospectors. It is that they studied what the prospectors would certainly need rather than betting on where the gold might be. The uncertain bet was “this particular creek has gold.” The certain bet was “whoever finds the gold will need pants, tools, and a way to move money.” One of those bets required luck. The other required observation.

This path was available in the First Wave and Second Wave equally. It did not depend on timing. It scaled with the rush rather than competing within it. And it generated more durable wealth than almost anyone who was actually in the river.

The Roaring 20’s

Not the flapper and speakeasy era. This is the era of data centers and solopreneurs; dueling model metrics and learning evaluations; digital assistants evolving into personal agents and agentic automation that builds new automation agents. Billion-dollar funding rounds for companies that did not exist three years ago. Job titles that nobody had in 2021, now listed as critical hires. Entire industries trying to figure out if they are the disrupted or the disruptors, and running low on time to decide.

Models released on a Monday that are obsolete by Friday. Consultants who barely knew what a prompt was in 2022, now billing as AI transformation architects. Boardrooms demanding AI strategies before anyone has agreed on what problem they are solving. Vendors with “AI-powered” on the label whether the product has meaningfully changed or not.

The energy is real. The stakes are real. And unlike some previous cycles, so is the underlying technology.

The dot-com boom was real too. It produced Amazon, Google, and the infrastructure of the modern internet alongside thousands of spectacular failures. The AI shift is already demonstrating measurable productivity gains across industries, and the underlying technology is improving faster than most predictions have accounted for. Dismissing it as pure hype is the wrong read, and the people making that call loudest will look exactly like the analysts who declared the internet a fad in 1997.

The problem is not that people are excited about a real thing. The problem is that when real opportunity appears, it activates the same psychological patterns that sent underprepared people over a mountain pass in 1898. The gold rush mentality does not require the gold to be absent. It just requires the promise of gold to be louder than the instructions.

The opportunity is real. The question is whether you are building toward it, or just rushing toward it.

The AI First Wave Already Happened

From roughly 2022 through 2023, companies that moved aggressively into AI-native product development, workflow automation, or customer-facing AI features got real first-mover advantage: lower competition, compounding productivity gains, and a learning curve head start that is genuinely hard to close. Some of this was vision. Some was access. Some was timing. The window was real, and the returns were real.

Most businesses did not catch it. Large organizations move slowly by design, and procurement cycles are not calibrated for technology windows that last 18 months. That is not a criticism. It is a description of how large organizations actually work. (I have been in those rooms. Guilty.)

What it means is that most businesses are now in the Second Wave, whether they have acknowledged that or not.

Second Wave Requires a Different Playbook

The companies treating AI adoption as a First Wave problem in 2025 and 2026 are showing up in California in 1852 with a pan. The accessible value has been captured. What remains requires the methodical approach.

Imagine you could see exactly where your organization loses an hour a day to rework, manual handoffs, and decisions made on bad data. That is what a process audit produces. It is not glamorous. It does not show up in the conference keynote. But it is the difference between knowing where the gold is and hoping the next creek looks promising.

Start there, not with tool selection. Map where time, money, and errors concentrate in your current operations. Identify which problems AI can address with reasonable reliability, and which ones it will make worse by hallucinating confidently inside a business-critical workflow. Run contained pilots with defined success criteria before scaling anything. Build internal AI literacy and governance at the same time you build capability, not after something goes wrong publicly.

Then, only after you understand what AI can reliably do in your specific context, start redesigning processes to take advantage of it rather than bolting it onto what already exists. The order matters. Inverting it is how you end up running hydraulic equipment you do not know how to operate into a hillside you have not assessed.

Preparation is not glamorous. But it is the entry requirement now. That distinction matters.

The Niche Play Nobody Is Talking About

Here is the thing about Sam Brannan, Levi Strauss, and Wells Fargo: none of them would have been described as gold rush companies.

Brannan was a merchant. Strauss was a dry goods trader. Wells Fargo was an express and banking operation. The Gold Rush was the economic context that made their businesses thrive and scale, but their identity was not “gold rush business.” Their success was driven by the rush. They were not of it.

While the gold rush era was a boon to the merchant class, imagine if technology had been more advanced then. Gold is one of the most effective electrical conductors on earth. It does not corrode. It does not tarnish. It carries signal reliably in conditions that defeat most other materials. Today it is in every smartphone, every circuit board, every aerospace connector, and every implantable medical device. The miners panning those California creek beds were sitting on the raw material for the digital age and had no way to know it. They were chasing the obvious use. The compounding value was in applications that had not been invented yet.

AI is playing the same role for business processes right now, visible to anyone paying attention. It is the superconductor of this moment, not for electrons but for decisions, workflows, and the intelligence buried inside operations that were built for a different era. And just as the real gold economy grew around refining, transporting, and applying the metal rather than simply extracting it, the real AI economy is growing around discovering, implementing, and refining how AI connects to the work that organizations actually do.

Every organization trying to adopt AI will need clean, well-governed data. They will need people who can actually work alongside these tools rather than just technically access them. They will need integration between new AI capabilities and legacy systems that were built for a different era. They will need expertise in figuring out which processes actually benefit from AI involvement and which ones just look like they should.

None of that requires building a foundation model. None of it requires a large AI research budget. All of it requires observation, the same skill that made Sam Brannan wealthy while everyone else was panning creeks.

The businesses that build toward serving those needs may never be described as AI companies. They will be managed service providers, training firms, systems integrators, compliance consultants, data governance specialists. The AI boom will be the context that defines their era, even if it is not the label on their door.

That is not the consolation prize. That is the long game, and it has the most reliable odds.

Your Actual To-Do List

Three questions worth answering honestly before the next AI initiative.

Which wave are you actually in? If you are evaluating AI tools for general business adoption in 2025 or 2026, you are in the Second Wave. The First Wave is not waiting. Adjust your expectations and your approach accordingly.

Are you prospecting or supplying? If you are using AI to improve your own operations, you are prospecting. If you are building toward serving the certain needs AI adoption creates in your industry, you are supplying. Both are valid strategies with very different playbooks.

Are you auditing before you automate? The methodical prospectors of the Second Wave studied the geology before they dug. The equivalent is understanding your current processes, your data quality, your organizational readiness, and your actual use cases before purchasing a platform and announcing an AI strategy.

The Gold Rush did not reward the desperate or the hasty at scale. It rewarded the timely, the prepared, and the observant, in that order, depending on which wave you caught.

The AI boom is running the same playbook. The question is not whether the opportunity is real. It is whether you are building toward it the right way.

If you found this interesting, please share.

Take the Tax out of Taxonomy

April 6, 2026 | April 15, 2026 | Scott S Nelson

TL;DR: Your GenAI output is failing because your local workspace is a disaster. If your desktop is a dumping ground, your enterprise data lake is guaranteed to be a swamp. Stop blaming the model, establish a strict folder taxonomy, and kill the bad data habits before they scale.

For my regular reader, you know I can’t resist a pun, and the initial research note for this post was “Timely topic title: Take the Tax out of Taxonomy”. You also know I digress, so I thought I would get it out of the way at the start. Done. Moving on to the next level…

You are paying a massive hallucination tax. You bought a premium AI subscription or deployed a desktop agent. You pointed it at a project directory full of deprecated drafts, unstructured notes, “versioned” files, maybe even some sample code. Now the AI is confidently generating output based on requirements from three years ago, and maybe Wednesday’s lunch order.

Many AI users assume a better foundation model or highly complex prompt engineering will fix output inaccuracy. Neither will. According to the research paper A Comprehensive Taxonomy of Hallucinations in Large Language Models, hallucinations are not merely a bug, but a theoretically inevitable feature of computable LLMs, irrespective of architecture or training.

You cannot patch out hallucinations with a clever system prompt. You have to restrict their oxygen.

Generative AI operates entirely on the context it is fed. When you open a workspace (or upload a zip file, or point it at SharePoint), the model uses the folder structure to understand relationships. It assumes every file in the provided directory is equally valid, current, and relevant.

To get faster, accurate output, you must adopt a standardized, hierarchical folder taxonomy. This is not a housekeeping chore. It is a strict data contract for your AI. The academic consensus supports this structural approach. As outlined in A Systematic Framework for Enterprise Knowledge Retrieval, transforming a static blob of data into a navigable, context-rich knowledge architecture significantly improves model accuracy and reduces retrieval latency.

The Prep Station Metaphor

Think of an LLM as a highly skilled line cook with zero short-term memory. If you ask the cook to make an omelet, but point them to a kitchen counter where the fresh eggs are mixed in a pile with old receipts, bleach, and rotten produce, the resulting meal will be toxic.

You have to prep the station before you ask for the work.

This requires changing how you manage your local environments. You must segment your files and organize your folders explicitly by client, project, or (and sometimes “and”) specific activity. When the AI opens that specific folder, the taxonomy forces it to focus strictly on the given task.
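
As a rough illustration of that segmentation, here is a minimal Python sketch that scaffolds a client / project / activity hierarchy. The folder names (`active`, `drafts`, `archive`, `reference`) and the `scaffold` helper are my own assumptions for the example, not a prescribed standard. Swap in whatever vocabulary your organization already thinks in.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

# Hypothetical activity folders; the names are illustrative only.
ACTIVITY_FOLDERS = ("active", "drafts", "archive", "reference")


def scaffold(root: Path, client: str, project: str) -> list[Path]:
    """Create root/client/project/<activity> folders and return them."""
    created = []
    for activity in ACTIVITY_FOLDERS:
        folder = root / client / project / activity
        folder.mkdir(parents=True, exist_ok=True)  # safe to re-run
        created.append(folder)
    return created


if __name__ == "__main__":
    # Demo in a throwaway directory so nothing real is touched.
    with tempfile.TemporaryDirectory() as tmp:
        for folder in scaffold(Path(tmp), "acme", "website-redesign"):
            print(folder.relative_to(tmp))
```

With a layout like this, pointing the assistant at a single `active` folder gives it one client, one project, and one lifecycle stage, and nothing else.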

The Micro-Macro Data Contagion

Local file structures often mirror enterprise data architecture. If your team’s shared drive is a chaotic dumping ground of nested, unnamed folders, your enterprise data lake is likely more of a content swamp.

Organizations often fund massive, top-down enterprise data transformation projects. They deploy tools to wrangle petabytes of unstructured data. Consultants are brought in to describe how it should be done, walk you through the cleanup, and leave you with a perfectly indexed wiki on how to maintain it.

Why don’t other organizations do this kind of cleanup? Aside from the few that don’t need it, the rest have someone recruited from an organization that did need it, did it…and then had to do it again a few years later. At least some had the excuse of acquisitions as a cause. The rest just forgot to make being organized part of the organization’s culture.

The report What Is Data Taxonomy? Use Cases & Best Practices points out that taxonomy programs do not fail because the classification structure was wrong. They fail because nobody owned it after launch, or the controlled vocabulary was written for data engineers rather than the business users who needed to adopt it. A taxonomy that nobody actively owns becomes outdated within twelve months.

If you build a pristine enterprise knowledge graph but your teams still save raw client notes to a local desktop folder named “Misc”, your clean data architecture will erode. Bad habits always defeat good infrastructure.

Start locally. Expand globally. Treat your team’s shared folder as the training ground for enterprise AI.

Here is the implementation baseline for engineering a reliable folder taxonomy.

Force Local Discipline: The guide Document Taxonomy Simplified notes a critical reality: AI can read full text, but without consistent indexing and classification, it has a harder time understanding which documents are current or relevant for a specific question. Humans must define the taxonomy. Organizations that rely solely on AI risk amplifying bad data.
Build a Strict Domain Hierarchy: Segment folders strictly by project and lifecycle status. Your AI should never have read access to a “Drafts” directory when you are asking it to write production documentation.
Establish the Data Contract: Metadata like document type, owner, client, date, and status tells AI not just what a document says, but how it should be used. This context improves AI ranking and reduces irrelevant hits that happen to share keywords.
Separate Human and AI-Native Formats: Segment your directories into files meant for humans and files meant for the AI. Lean towards using markdown files, text files, and CSV files for AI consumption. Keep heavy, formatting-rich files in a separate reference folder that the AI does not scan unless explicitly commanded.
Isolate Contextual Boundaries: Open-ended prompts can generate answers that blend multiple disciplines or outdated content. When your library is indexed by department, project, and lifecycle stage, AI can narrow its focus and answer questions within the right slice of your information.
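
To make the baseline a little more concrete, here is a minimal Python sketch of how the hierarchy, the data contract, and the format split might combine into one filter that decides which files an assistant gets to see. Everything named here is an assumption for illustration: the `DocRecord` fields mirror the metadata listed above, while the excluded folder names, the `current` status value, and the `select_context` helper are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path

# Assumed conventions, not a standard: which folder names are off-limits
# and which extensions count as "AI-native".
EXCLUDED_FOLDERS = {"drafts", "archive", "reference"}
AI_NATIVE_SUFFIXES = {".md", ".txt", ".csv"}


@dataclass(frozen=True)
class DocRecord:
    """Minimal data contract: how a document should be used, not just what it says."""
    path: Path
    doc_type: str
    owner: str
    client: str
    date: str    # ISO date string, e.g. "2026-04-06"
    status: str  # hypothetical values: "current" or "deprecated"


def select_context(records: list[DocRecord], client: str) -> list[Path]:
    """Return only the files an assistant should be given for this client."""
    selected = []
    for rec in records:
        folders = {name.lower() for name in rec.path.parent.parts}
        if folders & EXCLUDED_FOLDERS:
            continue  # lifecycle boundary: no drafts, archives, or heavy reference files
        if rec.path.suffix.lower() not in AI_NATIVE_SUFFIXES:
            continue  # format boundary: markdown, text, and CSV only
        if rec.client != client or rec.status != "current":
            continue  # contextual boundary: right client, live documents only
        selected.append(rec.path)
    return sorted(selected)
```

The filter is deliberately dumb. The point is that a human defined the rules once, and the AI never has to guess which of two requirements documents is the real one.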

You’ll note that there is a lack of solid reference examples of good taxonomies. This is, again, related to data cleanliness being driven by culture. The same taxonomy may or may not work for another organization. But a solid taxonomy based on how the organization thinks and processes can easily be maintained through training, communication, and the occasional reminder (preferably automated).
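
For the automated reminder, something as small as the following Python sketch could run on a schedule and nag about files that ignore the taxonomy. The rules are assumptions I picked for the example (a three-level client/project/activity depth and a short list of catch-all folder names such as "Misc"), and `audit` is a hypothetical helper, so adjust both to your own conventions.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

# Hypothetical rule set: files must sit at least three folders deep
# (client/project/activity) and never inside a catch-all folder.
CATCH_ALL_NAMES = {"misc", "stuff", "temp", "new folder"}
MIN_DEPTH = 3


def audit(root: Path) -> list[str]:
    """Return one human-readable reminder per file that breaks the rules."""
    reminders = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        folders = rel.parts[:-1]  # everything except the file name
        if any(name.lower() in CATCH_ALL_NAMES for name in folders):
            reminders.append(f"{rel.as_posix()}: move out of catch-all folder")
        elif len(folders) < MIN_DEPTH:
            reminders.append(f"{rel.as_posix()}: file it under client/project/activity")
    return reminders


if __name__ == "__main__":
    # Demo against a throwaway directory tree.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "Misc").mkdir()
        (root / "Misc" / "client-notes.txt").write_text("raw notes")
        for line in audit(root):
            print(line)
```

Send the output wherever your team already looks, whether that is a chat channel or a weekly email, so the reminder arrives without anyone having to remember to run it.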

The ROI of a strict folder taxonomy is immediate. Output precision goes up. Token waste goes down.

If your AI is only as reliable as the context it receives, your unstructured file storage is an active threat to your workflow. Build the hierarchy locally. Clean up the directories.

Credit to Dylan Davis’s video 5 Changes That Make ChatGPT & Claude 10x Better for sparking this research.
