

Are Your Generative AI Use Cases Serving Up American Cheese Or Rich Stilton?

Forbes Technology Council

Elise Carmichael is the CTO of Lakeside Software, where she oversees the design and delivery of its digital employee experience platform.

It was only three years ago that Nobel laureate Kazuo Ishiguro published Klara and the Sun, a fascinating novel about an artificial friend powered by the sun. Even then, such an imagined application of artificial intelligence seemed far-fetched. Today, I don’t think a future populated by artificial friends is too distant.

In fact, you could argue that they are already working among us, thanks to the conversational Q&A use cases enabled by GPT-4 and other large language models (LLMs). I’m not talking about a sentient robot buddy who goes on and on about Taylor Swift or Asimov’s robots. Instead, I mean copilot chatbots that can make your job easier and more productive if they are fundamentally great at natural language processing.

Three things must happen, though, before the use cases for conversational generative AI can expand beyond the core ones McKinsey has already identified.

1. The data must be smarter, more relevant and more nuanced.

As I’ve mentioned before, LLM hallucinations are still a problem; with the GPT-4 model, they occur roughly 15% to 20% of the time. There are a few main reasons why LLM outputs are not always great. First, the models are trained on public-domain internet data, which we all know isn’t always accurate, relevant or unbiased.

Second, the models aren’t really "smart"—they work by predicting the next word based on the first set of words provided. Without truly understanding nuance or context, generative AI models often serve up American cheese (bland) when you really want a strong Stilton instead.
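To make the "predicting the next word" point concrete, here is a toy sketch. This is not how a real LLM works internally (those use learned neural representations, not raw counts), but it illustrates the same idea of continuing text from statistics alone, with no understanding of meaning:

```python
from collections import Counter, defaultdict

# A tiny training corpus standing in for the public internet.
corpus = "the cat sat on the mat and the cat ran".split()

# Count which word follows each word -- a crude stand-in for a
# language model's learned distribution over continuations.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word: str) -> str:
    # Pick the most frequent continuation: pure statistics, no nuance.
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # 'cat' -- "the" is followed by "cat" most often
```

The model happily emits the blandest, most frequent answer, which is exactly the American-cheese problem: statistically likely, not necessarily what you wanted.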

So, how do you marry the capabilities of LLMs with the advancement of real-world needs, such as data analysis? I wish I could say it’s as simple as feeding a big data schema into an LLM, asking it to query the data with the provided data query language, and, poof, out comes a clean response, but it’s so much more complex than that. What is clear, however, is that the first step to being able to do a complex task, such as querying datasets with natural language, is to have well-structured, well-defined and broad data in place.
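As an illustration of that first step, here is a minimal, hypothetical sketch of such a pipeline: the prompt is grounded in a well-defined schema, and a guardrail checks any generated query before it runs. The `build_prompt` helper, the schema, and the `generated` string (which stands in for a real model's response) are all invented for illustration, not any vendor's API:

```python
import sqlite3

SCHEMA = """
CREATE TABLE devices (
    hostname     TEXT PRIMARY KEY,
    app_name     TEXT,
    minutes_used INTEGER
);
"""

def build_prompt(schema: str, question: str) -> str:
    # Ground the model in the schema so it can only reference real columns.
    return (
        "Given this SQLite schema:\n" + schema +
        "\nWrite one SELECT statement answering: " + question
    )

def run_generated_sql(conn, sql: str):
    # Guardrail: execute only read-only SELECT statements.
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("refusing to run non-SELECT SQL")
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO devices VALUES (?, ?, ?)",
    [("lt-001", "Visio", 340), ("lt-002", "Visio", 2), ("lt-003", "Acrobat", 90)],
)

# Stand-in for an LLM reply to:
#   build_prompt(SCHEMA, "Who has Visio installed and actually uses it?")
generated = "SELECT hostname FROM devices WHERE app_name = 'Visio' AND minutes_used > 30"
print(run_generated_sql(conn, generated))  # [('lt-001',)]
```

Even this toy version shows why the data must come first: if the schema is muddled or incomplete, no amount of model quality can produce a correct query against it.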

You could say that we’ve seen only the waxy rind of the generative AI cheese wheel. Don’t ask me why I went with the cheese analogy—just bear with me. Let’s continue with the data analysis example because, frankly, that’s what we look at in IT departments when trying to resolve an issue (e.g., "Why is my machine slow?") or find out information (e.g., "Who has Visio installed and actually uses it?"). Many more use cases are possible if the data is there to give context and explainability to answers to natural language prompts or queries that the model can understand.

In the case of IT, you want the model to be smarter than your best systems engineer because it can access and compute much more data than a simple cheeseball human (I’m so sorry). Advanced natural language queries that the model can accurately understand will democratize IT, both for long-time practitioners from the IT trenches who once had to know all the inner workings of computers and code, and for early-entry team members who don’t need to be as technical as the generations before them.

Don’t worry; there’s still room for all the technical gurus, but they’ll be spending their time on the truly challenging problems, not racing down rabbit holes in war rooms in the middle of the night.

2. The trust in the model must outweigh the risk.

Depth, breadth, history and quality of data are crucial for accuracy, as Enterprise Strategy Group Senior Analyst Gabe Knuth explains in “The Essential Role of Data and Data Quality in IT-related AI Model Training.” Large volumes of data, and the right data, are what help ML models become more accurate, which leads to trust in the model. It's kind of like aging cheese to delectable perfection (please forgive me)—time matters. The challenge with leveraging conversational models for mission-critical use cases, however, is that the output must be on target nearly 100% of the time and repeatable.

For good reason, LLMs have not yet earned unwavering trust in any industry. However, that does not mean generative AI doesn’t have high-value uses. As my fellow Forbes Technology Council member Matt Hollingsworth points out, "Generative AI is not sufficiently accurate for healthcare applications" since healthcare "strive[s] for 100% accuracy."

Therefore, keeping a “human in the loop” is imperative for building conversational models you can count on. The human can review results and re-train the model accordingly. It’s worth noting that the “human in the loop” is one undeniable reason why AI is not going to replace humans anytime soon. (Sorry, Klara.)
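The review-and-retrain loop can be sketched in a few lines. Everything here is hypothetical (in practice the reviewer is a person, not a lambda, and rejected answers feed a real retraining pipeline), but it shows the basic gate:

```python
def human_in_the_loop(model_answer: str, reviewer_approves, feedback_log: list):
    """Release an answer only after review; queue rejections for retraining."""
    if reviewer_approves(model_answer):
        return model_answer            # safe to surface to the user
    feedback_log.append(model_answer)  # becomes a retraining example
    return None                        # caller falls back to a human response

retrain_queue = []
# Toy stand-in for a human reviewer: reject answers that hedge with "maybe".
reviewer = lambda answer: "maybe" not in answer.lower()

print(human_in_the_loop("All machines received the latest patch.", reviewer, retrain_queue))
print(human_in_the_loop("Maybe the machines are patched?", reviewer, retrain_queue))
print(retrain_queue)  # ['Maybe the machines are patched?']
```

The key design point is that nothing reaches the user without sign-off, and every rejection is captured rather than discarded, so the model improves from exactly the cases it got wrong.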

3. The AI model must deliver pragmatic value.

Many of us have written about generative AI as the latest shiny object. But unless the AI model delivers pragmatic, proven value, what’s the point? As a chief technology officer, I have made “Show me the value” our team mantra as our data scientists and software engineers perfect IT-related use cases for conversational AI.

Specifically, we are proving the value of AI that speaks IT™. What I mean by that is that our integrated LLM will understand the context of a natural language question (“When was the last time each machine was patched, and what was the latest patch?” or “How many people used Adobe Acrobat in the past year for more than five minutes?”) and serve up real-time, relevant results.

AI is inevitable.

The rapid acceleration of GPT models has been one of the more exciting things in my career to watch unfold. When we talk about the future of AI, I would say that the future is already here. When GPT-3.5 came out, it scored reasonably well on standardized tests. Now, with GPT-4, it’s like the valedictorian. Why? Because training the model on more and better data made all the difference. Now, please pass me the Brie (and please don’t ban me for being cheesy).


Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.
