Is Your Data Ready for AI?

These days, everyone is either trying AI (rare), considering AI (most common), already past a first attempt made before they were ready (with mixed results), or just AI-curious (which doesn’t necessarily preclude the other possibilities). Sooner or later, your organization is going to be in the trying category, and then you will land in either the group that excelled with it or the group that stumbled. One of the key factors that will determine that result is the quality of your data going in and the integrity of your data moving forward.

Let’s take a little time now to consider the relationship between data quality, data integrity, and generative technologies, and then think about how to improve the odds of landing on the successful-adopter side of the coming AI divide.

The Human Edge: Fuzzy Thinking and Pattern Recognition

The current differentiation between AI and human intelligence lies in our capacity for fuzzy thinking and nuanced pattern recognition. Humans possess an innate ability to identify when information doesn’t fit a pattern or context, a skill that AI systems are still developing. While AI can process vast amounts of data at incredible speeds, it may struggle with contextual understanding and adaptability in novel situations.

This limitation in AI’s cognitive flexibility can lead to inefficiencies, particularly when dealing with complex, real-world scenarios. As AI systems attempt to process and make sense of imperfect or inconsistent data, they will consume more computational resources, leading to higher operational costs.

The Rising Costs of Using AI Inefficiently

The inefficiencies in AI processing are already manifesting at a macro level. Major tech companies and AI research institutions are reporting significant increases in power consumption as they scale up their AI offerings and user base. These escalating costs will (eventually and inevitably) be passed on to consumers, likely in the form of changes to service billing structures. Consider the current practice of paying per token: either the cost per token will go up, or the number of tokens required to complete common operations will go up, or both. Think of how coffee used to be sold in 1-lb bags and now we pay more per bag for a bag that holds 10 ounces. AI may become the first digital form of shrinkflation.
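
To make the arithmetic concrete, here is a minimal sketch. Every price and token count in it is a made-up number for illustration, not an actual vendor rate, and `cost_per_operation` is a hypothetical helper rather than anything from a real billing API.

```python
def cost_per_operation(price_per_1k_tokens: float, tokens_per_operation: int) -> float:
    """Cost of one operation under simple per-token billing."""
    return price_per_1k_tokens * tokens_per_operation / 1000


# Hypothetical numbers, for illustration only.
before = cost_per_operation(price_per_1k_tokens=0.010, tokens_per_operation=2000)
after = cost_per_operation(price_per_1k_tokens=0.012, tokens_per_operation=2500)

# A modest rise in both factors compounds into a larger rise per operation.
increase_pct = (after - before) / before * 100

# The coffee comparison: even at an unchanged bag price, going from
# 16 ounces to 10 ounces raises the price per ounce.
coffee_per_ounce_increase_pct = (16 / 10 - 1) * 100

print(f"Cost per operation rises by about {increase_pct:.0f}%")
print(f"Coffee price per ounce rises by about {coffee_per_ounce_increase_pct:.0f}%")
```

The point of the sketch is only that the two levers multiply: a price increase and a token-count increase that each look small on their own add up to a noticeably larger bill.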

Garbage In, Garbage Out…More Garbage In?

Recognizing these challenges, forward-thinking organizations are prioritizing data cleanup as an important first step on their AI adoption journey. However, it’s important to note that data integrity is not the result of a one-time effort. It requires ongoing policies, procedures, and processes to support what is likely the most important commodity any organization owns.

When data stores are initially created, they are typically clean and well-structured (don’t get me started on garbage test data, that is a separate article…coming soon!). The data becomes messy over time (how much time depends on many factors) simply through regular use (and sometimes irregular, but that is also beyond the scope of this post). When AI is added to that use, trained on that same use, it will get messier faster unless the processes that led to the mess are also addressed.

It may be tempting to consider this a training issue. Inadequate training can certainly lead to bad data, but good training may not be sufficient to correct the problem. This is because training is costly to create, costly to deliver, will need to be delivered again for every new team member, will likely need to be repeated periodically for all team members, and still may not always be remembered or followed.

The most reliable and cost-effective way to improve those processes is to automate those that can be automated. Automation may cost more to create than the training process, but then it is one-and-done until the process itself needs to change. The key to cost-effective automation is deciding when it is still OK to kick an edge case out to a human, and having a good process for notifying that person and tracking the task to completion.

Automation offers several advantages over traditional training methods:

  1. Consistency: Automated processes perform tasks the same way every time, reducing human error.
  2. Scalability: Once implemented, automated processes can handle increasing volumes of data without proportional increases in cost.
  3. Long-term cost-effectiveness: While initial implementation may be costly, automation provides ongoing benefits without the need for repeated training sessions.
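
The idea of automating the routine cases while kicking edge cases out to a human can be sketched roughly as follows. This is an illustration under stated assumptions, not a recommended implementation: the record fields, the deliberately naive email check, and the `ReviewTask` structure are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewTask:
    """A tracked item for a human to resolve (hypothetical structure)."""
    record_id: str
    reason: str
    done: bool = False


@dataclass
class CleanupResult:
    cleaned: list = field(default_factory=list)
    review_queue: list = field(default_factory=list)


def clean_records(records: list[dict]) -> CleanupResult:
    """Normalize emails automatically; queue anything unusable for a human."""
    result = CleanupResult()
    for record in records:
        email = (record.get("email") or "").strip().lower()
        # Deliberately simple check, assumed for the sketch only.
        if email.count("@") == 1 and "." in email.split("@")[1]:
            # Routine case: handled the same way every time.
            result.cleaned.append({**record, "email": email})
        else:
            # Edge case: do not guess; hand it off and track it.
            result.review_queue.append(
                ReviewTask(record_id=record["id"], reason=f"unusable email: {email!r}")
            )
    return result


result = clean_records([
    {"id": "001", "email": "  Pat@Example.com "},
    {"id": "002", "email": "unknown"},
    {"id": "003", "email": None},
])
print(len(result.cleaned), "cleaned;", len(result.review_queue), "sent for review")
```

The `done` flag stands in for whatever tracking the organization already uses; the part that matters is that an edge case becomes a visible, assignable task instead of silently bad data.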

Moving forward

Once the organization’s data has been cleaned up and processes put in place to maintain the integrity of that data, automated where possible, then the opportunity to get ahead of the competition through generative technologies is real for your organization. Like many adventures into new territory, there will be plenty of new challenges that will require urgent attention and decisive action. Preparing for what is known and predictable first will leave more resources for managing the unexpected.

And remember, most people heading into new territory seek the help of an experienced guide. Being new territory, it isn’t so important that the guide be experienced with the specific territory, but that they have experience of venturing into other new areas and have lived to tell about it.


Shout out to Jon Ewoniuk and his new podcast, The 360 Salesforce Mastermind Podcast. This article was inspired by his first episode, where his guest spoke about niches (mine being leadership in digital innovation and automation adoption) and the importance of good data to support generative technologies.

© Scott S. Nelson
Freepik rendering of the prompt 6 cats in a line one whispering to the next playing the telephone game

Realizing Agile’s Efficiency

(Feature image by Freepik)
TL;DR: Fostering a culture of trust that leads to calm collaboration up front will yield the benefits that Agile principles promise.
Preface: While agile is in the title of this post, no claim is made that the post is about how to do agile or how SAFe is or is not agile. It is about how the Manifesto for Agile Software Development is self-clarifying in that it concludes with “while there is value in the items on the right, we value the items on the left more.” (italics mine), and how the value of the items on either side should be measured by their effectiveness in a given organization and the organization’s influence on the “self-organizing teams” referenced in the Principles behind the Agile Manifesto. That said…
The value of architecture, documentation, and design reviews in SAFe was illustrated in a scenario that played out over several weeks.
The situation started with the discovery that a particular value coming from SAP had two sources. Well, not a particular value from the perspective of the source. The value had the same name, was constrained to the same list of options, but could and did have different values depending on the source, both of which were related to the same physical asset. For numerous reasons not uncommon to SAP implementations that have evolved for over a decade, it was much more prudent to fetch these values from SAP in batches and store them locally.
The issue of the incorrect source was identified by someone outside the development team when it was found to be commonly missing from the source selected for work prioritization. For various reasons that will be common across a variety of applications that support human workflow, this was considered something that needed to be addressed urgently.
The developer who had implemented the fetch to the correct source was tapped to come up with a solution. Now, one thing about this particular application is that it was a rewrite of a previous version where the value of “Working software over comprehensive documentation” was adhered to without considering the contextual reality that the team developing release one would neither be the team working on the inevitable enhancements nor ever meet that team. The rewrite came about when the system was on its third generation of developers and every enhancement was slowed because there was no way to regression test all of the undocumented parts. Unsurprisingly, the organizational context that resulted in the first version missing documentation also resulted in some table schemas being copied wholesale from the original application and not reviewed, because requirements were late, resources were late, and the timeline was unchanged. So, with no understanding of why not to, the developer provided a temporary solution of copying the data from one table to the other, because it had only been communicated that the data from one source was the correct data for the prioritization filter. Users were able to get their correctly prioritized assignments and the long-term fix went to the backlog.
As luck and timing would have it, when the design phase of the long-term fix was picked up by the architect, the developer was on vacation. Further, while this particular developer had often made time to document his designs, the particular service the long-term fix depended on was one of the few that were not documented. Still further, it had been redesigned after another service was discovered that obtained the same data more reliably. But all of the data currently loaded was from the previous version, so even the attempt at reverse engineering the service to get sample data for evaluation was not possible. These kinds of issues can lead to frustration, which in turn dampens creative thinking, which is to say that had the architect looked at the data instead of following the assumption from the story that the data wasn’t yet readily available, he would have discovered that it was already present.
Eventually the source of the correct value was identified and a design created that would favor the correct value over the incorrect value but use the incorrect value if the correct one was not available, to allow for the assignments to continue, because sometimes the two actual values were the same (which is inspiration for a future post discussing the value of MDM). The design also included updating to the correct value if it became available after the initial values were set. The architect, being thorough, noted in the design a concern about what should be done when the correct value came into the system after the record that was prioritized based on that value had been assigned and processed by a user. After much back and forth, it was finally communicated that while the data was retrieved from the same system and labeled with the same name, the two values were not different because one was incorrect but because they were in fact two separate values meant for two different viewpoints. Which means that the design of attempting to choose and store a single correct value in both tables was invalid and that the records altered for the work-around were now (potentially) invalid. This made the correct solution a (relatively) simple change to the sorting query.
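
For anyone who thinks better in code, the distinction can be sketched like this. It is purely illustrative: the field names, the numeric ranks, and the choice to sort missing values last are all assumptions of mine and not the actual schema or query.

```python
# Hypothetical work records: the same-named value arrives from two sources
# and is stored per source instead of being merged into one "correct" value.
work_items = [
    {"id": "A", "priority": {"source_1": 2, "source_2": 1}},
    {"id": "B", "priority": {"source_1": 1, "source_2": 3}},
    {"id": "C", "priority": {"source_1": 3, "source_2": None}},
]


def sort_for_view(items, source):
    """Order items by the value meant for the given viewpoint.

    Items missing that viewpoint's value sort last (an assumption for this
    sketch) rather than borrowing the other source's value.
    """
    def key(item):
        value = item["priority"].get(source)
        return (value is None, value if value is not None else 0)

    return sorted(items, key=key)


by_source_1 = [item["id"] for item in sort_for_view(work_items, "source_1")]
by_source_2 = [item["id"] for item in sort_for_view(work_items, "source_2")]
print(by_source_1)
print(by_source_2)
```

Once the two values are treated as separate attributes, nothing needs to be copied or overwritten; each viewpoint simply sorts on its own value.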
With the full 20/20 vision of hindsight, it is now clear that if the team did not feel that every issue needed to be treated as an emergency and all of the product, design, and development stakeholders had discussed the issue prior to taking action, about 80 hours of work would have been reduced to 4 hours. Yes, there were other factors that impacted the need for 80 hours to deal with what is a fairly minor flaw, but those factors would not have come into play had the questions been asked up front and clarity reached through collaboration.

Salesforce Native vs App vs Connector

 

Fair warning: This is more about not having written anything in a while than the value of the topic…and the subject matter is more about drawing your own conclusions than relying on what is easily available, so…

App is one of the most over-used and ill-defined terms in the IT lexicon. This is greatly due to it being used by people outside the IT domain. The domain itself has had some whoppers, like the DHTML that was a must-have at the turn of the century even though the only honest definition of the term was that it had no real definition. Microservices runs a close second simply because there is an invisible grey line between SOA and Microservices that is a mile wide and an inch short. But I digress, as is often the case.

What I’m really thinking about today is apps in the world of Salesforce.com. Specifically, apps that run inside the Salesforce CRM platform. I started thinking about this because I was looking into CPQ vendors over the weekend to refresh myself on the market to support a project proposal to select the best option for a particular business. It’s a large space, so it always helps to find someone else’s list to start with and someone had given me a list from a major analyst group as that starting point.

Other than analysts, no one likes long lists with lots of details, so I first wanted to narrow it to those that integrated with Salesforce. It didn’t take me long to remember that Salesforce is the gold standard for CRM, and there were only two that didn’t. I didn’t go through the whole list to get to that count because I’ve done these kinds of evaluations before and figured out after the first half dozen that this was not how I was going to narrow the list. The two were just what was noticed while skinning this cat another way.

The first trimming of the list was by industry focus. The potential client is a tech service, sort of SaaSy, and “High-tech products” was one of the categories, which was much closer to what they did than “Financial services” (though they have customers in that domain) or “Industrial products” (which the analyst seemed to think usually included high-tech, though not sure why).

To spare you the tedium of the several hours of wading through thousands of lines of marketing prose that could have been delivered in a table (ahem, yes, I know, kettle, black, etc.), from just the perspective of Salesforce CRM integration, I found it useful to divide them into three basic styles:

Native: An application that is built entirely in Salesforce.
App: An app that runs inside Salesforce that depends on data and/or functionality managed outside of Salesforce.
Connector: An application that runs independently of Salesforce and has a way to share data with Salesforce.

The terms for these distinctions change often over time and between sources. These definitions are for clarification of the table below and are purposely simplified, as deeper distinctions are less relevant to integration than to other aspects.

In this particular exercise, the ask was to provide some pros and cons to these different styles. My style being one of adapting general terms to technical solutions, I responded with a non-exhaustive list of Benefits and Concerns:

Integration Styles Native App Connector
Benefits
  • Easily accessible in the sales process context.
  • Seamless integration with other native apps.
  • Has gone through Salesforce security review.
  • No data latency.
  • Easily accessible in the sales process context.
  • Access is managed within Salesforce.
  • Has gone through Salesforce security review (only if installed through App Exchange).
  • Control over storage impacts.
  • Broader range of vendors to choose from.
Concerns
  • May require additional Salesforce licensing.
  • May have impacts on storage limitations.
  • Frequently limited functionality.
  • Support may require coordinating the vendor and Salesforce.
  • High potential for latency.
  • Difficult to trouble-shoot.
  • Users must use multiple applications.

Of course, the next question is usually “which is best”, and I must respond with the architect/consultant/writer-needing-higher-word-count answer of “it depends”. And it depends on lots of things, such as who will be maintaining the solution; how are capex and opex prioritized and managed; how do different stakeholders actually need to interact with the solution; and is it clearly understood that this is only one aspect of a vendor selection process and all known aspects must be documented and weighted before giving a recommendation?
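
As a rough illustration of what “documented and weighted” can look like, here is a minimal weighted-scoring sketch. The criteria, weights, vendor names, and 1-to-5 scores are all hypothetical placeholders and do not come from any real evaluation.

```python
import math

# Hypothetical criteria and weights; the weights are expected to sum to 1.0.
weights = {"maintainability": 0.4, "cost_model_fit": 0.3, "stakeholder_fit": 0.3}

# Hypothetical 1-5 scores per vendor, one score per criterion.
scores = {
    "Vendor A": {"maintainability": 4, "cost_model_fit": 3, "stakeholder_fit": 5},
    "Vendor B": {"maintainability": 5, "cost_model_fit": 2, "stakeholder_fit": 3},
    "Vendor C": {"maintainability": 3, "cost_model_fit": 5, "stakeholder_fit": 3},
}


def weighted_total(vendor_scores: dict, criteria_weights: dict) -> float:
    """Sum of score * weight over every documented criterion."""
    return sum(vendor_scores[name] * weight for name, weight in criteria_weights.items())


# Guard against weights that were edited without being rebalanced.
assert math.isclose(sum(weights.values()), 1.0), "weights should sum to 1.0"

totals = {vendor: weighted_total(s, weights) for vendor, s in scores.items()}
ranking = sorted(totals, key=totals.get, reverse=True)

for vendor in ranking:
    print(f"{vendor}: {totals[vendor]:.2f}")
```

The numbers matter less than the discipline: writing down the criteria and weights before scoring makes the “it depends” explicit and open to challenge by the client.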

The real reminder for me when I finished this brief analysis was that context is everything when doing any type of evaluation. The list that I started with included products that were questionable as to whether they really belonged in the report, and many of the products were listed as serving domains that were not mentioned on the vendor’s site, with no compelling reason why the unmentioned domain would want to use them. If I had direct access to the author(s) I might learn something by asking, but the important thing is that I used their input only as a starting point and applied my own analysis, because when the recommendations are provided to a client, those authors’ names will not be on the agenda and they will not be there to answer the questions that hadn’t yet been thought of.


5 Steps to Enterprise Privacy 

Some of my opinions on how to deal with privacy regulations, concluding with a 5-step process for managing the technical aspects, were recently published at https://www.logic2020.com/insight/tactical/5-step-technical-approach-privacy-protection.
