These days, everyone is either trying AI (rare), considering AI (most common), has already tried it before they were ready (with mixed results), or is just AI-curious (which doesn’t necessarily preclude the other possibilities). Sooner or later, your organization is going to be in the trying category, and then you will land either among the folks who excelled with it or among those who stumbled. Among the key factors that will determine that result are the quality of your data going in and the integrity of your data moving forward.
Let’s take a little time now to consider the relationship between data quality, data integrity, and generative technologies, and then think about how to improve the odds of landing on the successful-adopter side of the coming AI divide.
The Human Edge: Fuzzy Thinking and Pattern Recognition
The current differentiation between AI and human intelligence lies in our capacity for fuzzy thinking and nuanced pattern recognition. Humans possess an innate ability to identify when information doesn’t fit a pattern or context, a skill that AI systems are still developing. While AI can process vast amounts of data at incredible speeds, it may struggle with contextual understanding and adaptability in novel situations.
This limitation in AI’s cognitive flexibility can lead to inefficiencies, particularly when dealing with complex, real-world scenarios. As AI systems attempt to process and make sense of imperfect or inconsistent data, they will consume more computational resources, leading to higher operational costs.
The Rising Costs of Using AI Inefficiently
The inefficiencies in AI processing are already manifesting at a macro level. Major tech companies and AI research institutions are reporting significant increases in power consumption as they scale up their AI offerings and user base. These escalating costs will (eventually and inevitably) be passed on to consumers, likely in the form of changes to service billing structures. Consider the current model of paying per token: either the cost per token will go up, or the number of tokens required to complete common operations will grow, or both. Think of how coffee used to be sold in 1-lb bags, and now we pay more per bag while the bag holds only 10 ounces. AI may become the first digital form of shrinkflation.
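As a rough back-of-the-envelope sketch (every number and the `monthly_cost` helper below are made-up assumptions for illustration, not actual vendor pricing), here is how a bump in tokens-per-request from messier data compounds with a higher per-token price:

```python
# Back-of-the-envelope estimate of how messy data and rising token prices
# compound. All numbers are illustrative assumptions, not vendor pricing.

def monthly_cost(requests_per_month, tokens_per_request, price_per_1k_tokens):
    """Total monthly spend for a simple pay-per-token model."""
    return requests_per_month * tokens_per_request * price_per_1k_tokens / 1000

# Baseline: clean data, current pricing
baseline = monthly_cost(requests_per_month=50_000,
                        tokens_per_request=800,
                        price_per_1k_tokens=0.01)

# Messy data tends to need more tokens per call (longer prompts, retries,
# extra clarification), and the per-token price may also rise.
messy = monthly_cost(requests_per_month=50_000,
                     tokens_per_request=1_200,   # ~50% more tokens per call
                     price_per_1k_tokens=0.015)  # 50% price increase

print(f"Baseline: ${baseline:,.2f}/month")        # $400.00/month
print(f"Messy + repriced: ${messy:,.2f}/month")   # $900.00/month
```

Two modest-looking 50% increases more than double the bill, which is the shrinkflation effect in miniature.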
Garbage In, Garbage Out…More Garbage In?
Recognizing these challenges, forward-thinking organizations are prioritizing data cleanup as an important first step on their AI adoption journey. However, it’s important to note that data integrity is not the result of a one-time effort. It requires ongoing policies, procedures, and processes to support what is likely the most important commodity any organization owns.
When data stores are initially created, they are typically clean and well-structured (don’t get me started on garbage test data; that is a separate article…coming soon!). The data becomes messy over time (how much time depends on many factors) simply through regular use (and sometimes irregular use, but that is also beyond the scope of this post). When AI is added to that use, and trained on that same use, the data will get messier faster unless the processes that led to the mess are also addressed.
It may be tempting to consider this a training issue. Inadequate training can certainly lead to bad data, but good training may not be sufficient to correct the problem. Training is costly to create, costly to deliver, must be delivered again for every new team member, will likely need to be repeated periodically for all team members, and still may not always be remembered or followed.
The most reliable and cost-effective way to improve those processes is to automate the ones that can be automated. Automation may cost more to create than the training, but then it is one-and-done until the process itself needs to change. The key to cost-effective automation is determining when it is acceptable to kick an edge case out to a human, and having a good process for notifying that human and tracking the task to completion.
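As a minimal sketch of that idea (the field names, cleanup rules, and review-queue handling below are hypothetical, not a reference to any particular platform), an automated cleanup step might apply the fixes it can make safely and route everything else to a tracked human review queue:

```python
# Minimal sketch of "automate what you can, escalate the rest".
# Field names, rules, and the review queue are illustrative assumptions.

import re

def clean_record(record: dict) -> tuple[dict, list[str]]:
    """Apply safe, deterministic fixes; report anything needing human review."""
    issues = []
    cleaned = dict(record)

    # Safe automated fix: normalize whitespace and casing on email addresses.
    email = cleaned.get("email", "").strip().lower()
    cleaned["email"] = email
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", email):
        issues.append(f"unparseable email: {email!r}")

    # Safe automated fix: strip non-digit characters from phone numbers.
    phone = re.sub(r"\D", "", cleaned.get("phone", ""))
    cleaned["phone"] = phone
    if phone and len(phone) not in (10, 11):
        issues.append(f"suspicious phone length: {phone!r}")

    return cleaned, issues

def process_batch(records: list[dict], review_queue: list[dict]) -> list[dict]:
    """Clean a batch; kick edge cases out to a tracked human review queue."""
    accepted = []
    for record in records:
        cleaned, issues = clean_record(record)
        if issues:
            # In a real pipeline this would open a ticket or notification so
            # the task is tracked to completion, not just logged and forgotten.
            review_queue.append({"record": cleaned, "issues": issues})
        else:
            accepted.append(cleaned)
    return accepted

# Example usage
queue: list[dict] = []
good = process_batch(
    [{"email": "  Pat@Example.com ", "phone": "(555) 123-4567"},
     {"email": "not-an-email", "phone": "12"}],
    review_queue=queue,
)
print(len(good), "clean records;", len(queue), "sent for human review")
```

The design choice worth noting is that the edge case is never silently dropped or guessed at; it is queued with enough context for a human to finish the job and close it out.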
Automation offers several advantages over traditional training methods:
- Consistency: Automated processes perform tasks the same way every time, reducing human error.
- Scalability: Once implemented, automated processes can handle increasing volumes of data without proportional increases in cost.
- Long-term cost-effectiveness: While initial implementation may be costly, automation provides ongoing benefits without the need for repeated training sessions.
Moving Forward
Once the organization’s data has been cleaned up, and processes have been put in place (automated where possible) to maintain the integrity of that data, the opportunity to get ahead of the competition through generative technologies is real for your organization. As with many adventures into new territory, there will be plenty of new challenges that require urgent attention and decisive action. Preparing for what is known and predictable first will leave more resources for managing the unexpected.
And remember, most people heading into new territory seek the help of an experienced guide. The territory being new, it isn’t so important that the guide have experience with that specific territory, but that they have experience venturing into other new areas and have lived to tell about it.
Shout out to Jon Ewoniuk and his new podcast, The 360 Salesforce Mastermind Podcast. This article was inspired by his first episode, where his guest spoke about niches (mine being leadership in digital innovation and automation adoption) and the importance of good data to support generative technologies.
© Scott S. Nelson