Too big to survive: There is no bailout for technical debt

The only difference between technical debt and financial debt is that costs are more often known in advance when taking on financial debt. Both types of debt are tools when used intelligently, with purpose and a plan to manage them, and both can take a devastating toll when used recklessly or imposed through misdirection or miscommunication.

Acceptable vs unnecessary debt

The original heading here was “Necessary vs unnecessary debt”. On further reflection, though, I realized that the only good reasons for incurring debt are time driven. If time is removed as a factor, there is no reasonable need for debt. So it becomes a question of when time is an important enough factor to make debt acceptable. The only context I can think of where time is universally an acceptable driver for debt is an emergency.

Beyond an emergency, the evaluation of whether debt is acceptable because of time becomes a value proposition. In our personal lives, a first car and a first house are generally considered good reasons to accept debt because both cost enough that they are likely to become more expensive over time, making it harder and harder to save for them in a reasonable period.

Similarly, building a custom in-house application rather than waiting for a Commercial Off-The-Shelf (COTS) solution will incur technical debt in minimally reviewed code and inevitable maintenance costs, but it is worth it for functionality that is key to business value. Having worked for software vendors, I can honestly say that if a product isn’t already Generally Available (GA) with at least its first patch released, it should still be considered unavailable as a COTS solution.

The other common time driver, and one that should generally not be an acceptable reason to take on debt, is impatience. Using a home equity loan to buy the latest television is a poor financial decision, and implementing a new solution without a thorough evaluation and proper training is a gamble that will usually result in higher maintenance costs or a potential system failure.

The old adage “patience is a virtue” is not only true, it is a vast understatement of the value of patience.

Stop debt before it happens

The reason technical debt is an increasing concern at many companies is that it tends to grow exponentially, just like financial debt. And for the same reasons. Of the three drivers for debt mentioned previously (emergency, long-term value, short-sighted impatience), the most frequent cause is the least necessary. Impatience. Problems arising from bad habits will grow until the habit has been replaced by actions that have a more positive effect.
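To make the comparison with compounding financial debt concrete, here is a minimal sketch. It assumes a fixed, hypothetical “interest rate”: every release that ships without paying the debt down adds a constant percentage to the effort eventually needed to correct it. The function name, the 40-hour starting figure and the 10% rate are illustrative assumptions, not measurements.

```python
def debt_after(releases: int, initial_hours: float, rate: float) -> float:
    """Effort needed to clear a debt left untouched for `releases` releases.

    Assumes simple compounding: each release multiplies the outstanding
    effort by (1 + rate). Real projects will not grow this smoothly.
    """
    return initial_hours * (1 + rate) ** releases


# Hypothetical example: a 40-hour shortcut growing 10% per release.
for n in (0, 4, 8, 12):
    print(f"after {n:2d} releases: {debt_after(n, 40.0, 0.10):.1f} hours")
```

Under these assumed numbers the cost of the shortcut roughly triples within twelve releases, which is the same shape as an unpaid credit card balance.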

Without getting too psychological here, impatience is a result of wanting very much either to move towards a reward or to move away from a loss. For some odd reason, the drive towards doesn’t seem to repeat in the same context nearly as much as the drive away. In technology, the drive away is so common that the three key emotions associated with escape-driven impatience have an acronym: FUD (fear, uncertainty, doubt). In the case of IT decisions all three are essentially redundant, or at least a sequence: fear driven by uncertainty and/or doubt. When the decision is about taking on technical debt, the fear is that business owners or customers will be upset if the feature is delayed or reduced, and the uncertainty and doubt are the result of either not asking these stakeholders or asking only half the question.

Asking a stakeholder “Is it a problem if feature X is not in the release?” will frequently get a different answer than “Would you prefer we include feature X in a later release or risk certain delays to all future feature releases by pushing it before we have time to include it in a maintainable manner?” My experience is that most of the time neither question is asked and it is just assumed the world will end if users don’t have access right now to a new option that only 3% will ever use. It is also my experience that when the tradeoff of reliability and stability versus immediacy is explained to stakeholders, they usually opt for the delay. I know many people believe that businesses have lost sight of long-term implications, and I believe that in many cases it is not because they are deliberately ignoring them but because the people who should tell them when and why to be cautious are afraid of saying anything that will be considered “negative”.

To summarize, the best way to reduce the accumulation of technical debt is to have open, honest communication with stakeholders about when decisions involve technical debt, the consequences of that debt, and the options for avoiding it. Then, if the decision is still to choose the right now over the right way, immediately request buy-in for a plan, timeline and budget to reduce the technical debt. Again, my experience is that when the business is presented with a request to ensure functional reliability, they frequently say yes.

Getting out of unavoidable or accepted debt

Taking on some technical debt is inevitable. This is why the modifiers usually, most often, and frequently were used in the previous section rather than more-comforting-yet-inaccurate always, definitely, and every time. Even in a theoretically perfect process where business always opts for debt-free choices and emergencies never happen, there are still going to be debt-inducing choices made either from lack of information or usage of imperfect vendor releases.

In the case where the debt is incurred unknowingly, once it is discovered be sure to document it, communicate it and plan for its correction. The difference when the debt is taken on knowingly, because it is unavoidable without the much larger cost of changing vendors, is to monitor the item with every project and, when there is a reasonable option to correct it, do it. I once had to build something that was a bit kludgy because the vendor had clearly missed an implication of how a particular feature was implemented in their application. We created a defect in the defect tracker, which was reviewed in every release. Eighteen months later, the vendor found the error and corrected it, and we replaced the work-around with the better approach in the next release. For major enterprises it is a good idea to raise a support case with the vendor when such things are identified, which I did not do at the time because the company I was managing this application for was too small to get vendor attention and the feature was not in broad use.

Originally published at InfoWorld.

© Scott S. Nelson

Why you need to change your monolithic architecture

In a perfect world the contents of this section would belong at the end of the article, as part of a conclusion. But a key theme of this article is that there is a lot of unintentional imperfection in the world, and one of those imperfections is a tendency for some to draw conclusions early, so I will start with the end and see if we can meet in the middle.

There will be people that strongly disagree with this article. There will be others that share the sense of epiphany I experienced formulating the outline, and probably more than a few who will have come to the same conclusion before this article was written.

For everyone else, I ask that you look at your own enterprise and decide for yourself if the architectural decisions that drive your IT solution are based on corporate culture more than the best way of providing business value.

The most commonly stated reasons to migrate

Skim the thousands of recent articles and community postings about enterprises adopting a new architecture or process (Microservices and DevOps are the buzzwords at the time of this writing, and I expect those will change several times before this article is no longer relevant) and the driver behind the move will generally translate with ease to one of the following:

  • Improved operational efficiency
  • Higher reliability
  • Faster time to market
  • Better support of business needs (arguably redundant to the first three items)

All of those are excellent reasons to change how things are done.  Moving from the current way of doing things to the new way of doing things will definitely yield those benefits in many (though assuredly not all) enterprises. I’ve been in this industry for over 25 years and here are some of the shifts that I have seen made for the exact same reasons:

  • Single tier to two tier architecture
  • Two tier to n-tier architecture
  • Fat client to thin client
  • Single server to redundant services
  • Redundant services to remote procedure calls (RPC)
  • RPC to Web Services
  • Thin client to app

Yes, Virginia, there are exceptions to every rule and observation

Every one of the above-mentioned shifts resulted in some level of success. And each, except for the last (which I include because irony fascinates me), reflects a cultural shift towards distribution of overall responsibility, isolation of specific responsibility and increased specialization. I can already hear the exclamations of “There is an increase in demand for full-stack developers, which refutes this observation”. I agree that more companies are looking for and hiring full-stack developers. I have observed, with some delightful exceptions, that once the person is hired they are pushed into some type of specialization within a couple of years (often less).

The most frequent real reasons a change is needed

There was a behavioral study done almost 100 years ago that gave rise to a concept known as the Hawthorne Effect, where changes in worker conditions led to increased productivity because of the expectation of improvement rather than the change itself (my spin on the conclusions, many of which are still being debated). When an enterprise architecture or IT process is changed, the result is similar.

There are many common examples of why a change is needed to achieve improvements, regardless of what that change is. Here are some that I have seen from working with dozens of different enterprises in several different industries.

The person that wrote that doesn’t work here anymore

My first few IT-related roles were as an FTE and a consultant for companies that were small enough that I was the sole IT resource involved. While I’m proud of the fact that some of my earliest applications are still in use over two decades later, it has dawned on me while writing this article that it may simply be because I had not learned to properly document applications back then and no one has been able to make any changes for fear of putting the company out of business.

I learned about the value of good documentation when I did my first project for a large multi-national manufacturing company, still as an independent consultant. I knew that I would be leaving these folks on their own with the application once my part was done and that people who would be hired after the project was complete would inherit the code and functionality without the benefit of any knowledge transfer meetings. At that time, I was not unusual in providing documentation as part of my work. What I have learned since is that, like myself when I first started, many full-time employees either see no need to document their work or do not know how to.

In later years, many consultants either reduced or completely stopped providing documentation as a way to ensure more work or (to be fair) decrease costs in an increasingly competitive market.

The string that broke the camel’s back up

Even when best practices are followed with regard to simplicity and reuse for the first release of an application, by the Nth release/enhancement/bug fix the application can reach a state where attempts at any but the most minor modifications result in something else breaking. Did the team’s skill atrophy or is this a result of a less-capable team owning maintenance? No.

Fragility creeps into solutions over time because technical debt piles up. If “technical debt” is a new term for you, I strongly suggest reading up a bit on it. In short, like credit card debt, if it isn’t dealt with early and often it will increase until more effort is allocated to dealing with the problems than to the solutions that caused them.

A culture of identifying, documenting and correcting potential issues and enhancements throughout the lifecycle of projects will extend the longevity of an application’s value and reduce IT costs by minimizing the frequency of technology refreshes that are driven by failing systems rather than by adding business value.

String theory is an anti-pattern

Another heading could be “Spaghetti and hairballs”.  This driver to move is similar to the previously described scenario except it occurs at a lower level. The architecture may still resemble something that is comprehensible and even sensible, but some of the implementation code and configuration has become unmaintainable. Frequent causes of unmaintainable code are:

  • Changes in personnel with little, poor, or no documentation to reference upon inheritance.
  • Changes in personnel with plenty of documentation and no time allotted in the “project plan” to review it before diving into the next set of “enhancements”.
  • No change in personnel and no time allotted for code reviews.
  • No change in personnel and no time allotted to address technical debt.

The common theme here is that haste makes waste. The irony is that the haste is always driven by a desire to reduce waste (or perceived waste in the form of costs associated with the activity that would have actually prevented the waste).

Growing Pains

Earlier I mentioned some of the transitions that I have experienced first-hand.  Here is the list again for context:

  • Single tier to two tier architecture
  • Two tier to n-tier architecture
  • Fat client to thin client
  • Single server to redundant services
  • Redundant services to remote procedure calls (RPC)
  • RPC to Web Services
  • Thin client to app

A side-effect of each of these is that they tended to increase the number of teams necessary to build and maintain solutions. By itself, the sharing of responsibility is a good thing. Efficiencies can be realized by having teams focused on specific areas as long as both technical and human interfaces are aligned to support the same goals. Unfortunately, the same growth can also produce cultures of competition and departmental isolation, where the focus shifts to improving local efficiency at the expense of the original goal.

Your true story here

I would be flabbergasted if I have exhausted the causes here and would really enjoy it if you were to add your own experiences in the comments section for inclusion in future revisions of this article.

How to delay the changes until they are needed

The phrase “Wherever you go, there you are” applies just as aptly to migrating from one IT solution set to another as it does to trying to leave your troubles behind by relocating. If all of the bad patterns come along for the ride, the new will surely resemble what was just left behind sooner or later.

To be both fair and clear, most (if not all) of the common issues enterprises face today that drive them to move to a new platform to resolve their issues did not crop up because someone deliberately sabotaged the processes…they came about because the intention behind a move in the right direction at some point was forgotten and only the motion was left.

Documentation started falling by the wayside, driven by two trends. The first was more intuitive user interfaces that required minimal or no documentation. This was a great idea with the best of intentions. However, some of the results of this trend are not so great, usually with the end users being the ones to suffer. There were many open source projects that ditched documentation by initially simplifying the interfaces. As the projects became popular, books, paid consulting and blogs with advertising were much more lucrative than documenting the more complex version. Since people were used to the software not having documentation (because it originally didn’t need it), this became acceptable.

The second trend arose within the enterprise, where the adoption of Agile practices and the philosophy of documentation being no more than necessary eventually evolved into little or no documentation, both because the skills to properly document atrophied and because budget-pressured management convinced themselves it was no longer needed. While I am probably the most vocal about documentation problems resulting from practices that only fractionally resemble Agile (frAgile for short), there are many long-standing Agile proponents who have recently been calling BS on how enterprises have claimed to adopt Agile and are actually destroying it by calling what they are doing Agile (or Extreme Programming or Scrum, etc.). Two example posts are The Failure of Agile and Dark Scrum.

Ideally, make it part of your project process to capture opportunities for improvement and document any technical debt knowingly incurred. Additionally, make it part of your SDLC to review the backlog of technical debt and technical enhancement recommendations at the start of project planning, and make it mandatory to budget for reducing some level of debt and/or including some improvement.

Alternatively, what I have done for most of my consulting career is to keep a running catalog of such items throughout the project. Towards the end I assemble my notes into a single document (occasionally happily checking off items that were addressed before project completion) as a handoff to management at the completion of each project. Later, I re-circulate the document prior to any follow-on projects.
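For readers who prefer something more structured than notes, the catalog and the planning-time review described above could be sketched as follows. This is only an illustration under stated assumptions: the class names, the fields and the oldest-first selection rule are hypothetical, not a description of any particular tracker or of the process I actually used.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DebtItem:
    """One knowingly or unknowingly incurred piece of technical debt."""
    title: str
    reason: str        # e.g. "emergency", "vendor limitation", "deadline"
    est_hours: float   # rough estimate of the effort to correct it
    logged: date
    resolved: bool = False


@dataclass
class DebtCatalog:
    """Running catalog kept throughout a project (hypothetical structure)."""
    items: list[DebtItem] = field(default_factory=list)

    def log(self, item: DebtItem) -> None:
        self.items.append(item)

    def open_items(self) -> list[DebtItem]:
        # Unresolved items, oldest first, for review at project planning.
        return sorted((i for i in self.items if not i.resolved),
                      key=lambda i: i.logged)

    def plan(self, budget_hours: float) -> list[DebtItem]:
        # Assumed rule: walk the oldest items first and take each one
        # that still fits in the hours budgeted for debt reduction.
        chosen, remaining = [], budget_hours
        for item in self.open_items():
            if item.est_hours <= remaining:
                chosen.append(item)
                remaining -= item.est_hours
        return chosen
```

At the start of planning, calling `plan()` with whatever hours the business has agreed to set aside yields a short list to schedule alongside new features. Oldest-first is just one possible rule; ranking by risk or by how often the affected code changes would be equally reasonable.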

I’m optimistic enough to expect that there will eventually come a time when this article is no longer relevant, and cynical enough to doubt that it will happen in my lifetime. The way I cope with this is to do things as best I can with the resources I can muster and continue to write articles like this to remind people that technology was supposed to make things simpler and easier so that we could spend time focusing on more interesting problems. Please share your coping mechanisms in the comments section.

Originally published at InfoWorld.

© Scott S. Nelson