Arthur shows ‘what’s cooking’ in large language models for optimal performance
ChatGPT has sparked an explosion of interest in large language models, and LLMs are now heading into their next generation.
Since LLMs and foundation models are being used to rethink traditional operations, research and logistics problems, ArthurAI Inc. is working to offer better visibility into what's cooking inside these models through monitoring and tracking tools, according to John Dickerson (pictured, right), co-founder and chief scientist of Arthur.
“Monitoring in general is extremely important once you have one of these LLMs in production, and there have been some changes versus traditional monitoring, which we can dive deeper into, that LLMs have really accelerated,” Dickerson said. “The underlying environment, the data streams, the way users interact with these models: These are all changing over time. Any performance metrics that you care about need to be tracked, both traditional ones, like accuracy, if you can define that for an LLM, and ones around, for example, fairness or bias.”
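For readers who want a concrete picture of that kind of tracking, here is a minimal sketch of logging a per-response quality score and flagging degradation against a launch baseline. The class, thresholds and scoring scheme are hypothetical illustrations, not Arthur's product or API.

```python
# Minimal sketch (hypothetical, not Arthur's API): track a rolling quality
# metric for an LLM endpoint and flag drops against a launch baseline.
from collections import deque
from statistics import mean

class MetricMonitor:
    def __init__(self, baseline: float, window: int = 100, tolerance: float = 0.05):
        self.baseline = baseline              # accuracy-like score measured at launch
        self.scores = deque(maxlen=window)    # most recent per-response scores
        self.tolerance = tolerance            # acceptable drop before alerting

    def record(self, score: float) -> None:
        """Log one graded response (e.g., 1.0 = acceptable, 0.0 = not)."""
        self.scores.append(score)

    def degraded(self) -> bool:
        """Return True once the rolling average falls past the tolerance."""
        if len(self.scores) < self.scores.maxlen:
            return False                      # not enough recent data yet
        return mean(self.scores) < self.baseline - self.tolerance

monitor = MetricMonitor(baseline=0.92)
for graded in [1.0, 1.0, 0.0, 1.0]:           # scores would come from evals or user feedback
    monitor.record(graded)
if monitor.degraded():
    print("Rolling quality dropped below baseline; investigate the model or its inputs.")
```

The same loop can carry fairness or bias scores instead of accuracy; the point is simply that each metric gets recorded continuously rather than measured once at deployment.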
Dickerson and Adam Wenchel (left), chief executive officer of Arthur, spoke with theCUBE industry analyst John Furrier at the AWS Startup Showcase: “Top Startups Building Generative AI on AWS” event, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed how Arthur offers more insights into AI models like LLMs for better results and returns. (* Disclosure below.)
Arthur helps calculate AI model ROI
Given current macroeconomic conditions, expenditures are being slashed. Arthur helps businesses calculate the return on investment of their AI models so they can decide where to optimize, according to Wenchel.
“One of the things that we really help our customers with is really calculating the ROI on these things,” he stated. “If you have models out there performing and you have a new version that you can put out that lifts the performance by 3%, how many tens of millions of dollars does that mean in business benefit?”
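To make that arithmetic concrete, here is a back-of-the-envelope sketch. The revenue figure and upgrade cost are illustrative assumptions, not numbers from the interview; only the 3% lift comes from Wenchel's example.

```python
# Back-of-the-envelope ROI sketch with illustrative numbers (not from the interview).
annual_value_driven_by_model = 500_000_000   # dollars of business value the model influences (assumed)
performance_lift = 0.03                      # the 3% lift Wenchel mentions
upgrade_cost = 2_000_000                     # assumed cost to retrain, validate and deploy

incremental_benefit = annual_value_driven_by_model * performance_lift
roi = (incremental_benefit - upgrade_cost) / upgrade_cost
print(f"Incremental benefit: ${incremental_benefit:,.0f}, ROI: {roi:.1f}x")
# A 3% lift on $500M of model-driven value works out to $15M a year against a $2M upgrade.
```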
Arthur streamlines artificial intelligence models using automation, real-time optimization and metrics. LLMs, meanwhile, have taken off because they are revamping real applications, according to Wenchel.
“What we’re seeing every single day is … applying LLMs to everything from generating code and SQL statements to generating health transcripts and legal briefs; everything you can imagine,” he pointed out. “When you actually sit down and look at these systems and the demos we get of them, the hype is definitely justified.”
Tooling is an important aspect of the LLM space, according to Dickerson, who said it plays an instrumental role in understanding better ways to train the models.
“When I think about the areas where people are really, really focusing right now, tooling is certainly one of them,” he noted. “Like you and I were chatting about LangChain right before this interview started. Two or three people can sit down and create an amazing set of pipes that connect different aspects of the LLM ecosystem.”
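As a flavor of what Dickerson is describing, here is a minimal LangChain-style pipeline that wires a prompt template to a model in a few lines. The exact imports and class names vary by LangChain version, and the code assumes an OpenAI API key is configured, so treat this as an illustrative sketch rather than a canonical recipe.

```python
# Illustrative LangChain sketch (legacy-style API; details vary by version).
# Assumes OPENAI_API_KEY is set in the environment.
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

prompt = PromptTemplate(
    input_variables=["question"],
    template="Answer concisely:\n{question}",
)
chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)
print(chain.run(question="What does model monitoring track?"))
```

A couple of people can chain pieces like this together in an afternoon, which is exactly the kind of "pipes" across the LLM ecosystem Dickerson is pointing to.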
Commercial AI still in its infancy
Even though commercial AI is still in its early days, Arthur is gathering speed in the space through capabilities such as performance analytics, according to Wenchel.
“I think AI as much as it’s been hyped for a while, commercial AI at least is really in its infancy,” he pointed out. “The way we’re able to pioneer new ways to think about performance for computer vision, NLP, LLMs is probably the thing that I’m proudest about. But I think it’s really being able to define what performance means for basically any kind of model type and give people really powerful tools to understand that on an ongoing basis.”
Even as a cloud-like experience is being created for LLMs, caution should not be thrown to the wind, because these models are shaped by human feedback and need careful attention to keep getting better, Dickerson explained.
“From my side, it’s the input data streams, because humans are also exploring how they can use these systems to begin with,” he stated. “It’s really, really hard to predict the type of inputs you’re going to be seeing in production. To me, it’s … an unnatural, shifting input distribution of prompts that you might see for these models.”
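One common way to watch for the shifting prompt distribution Dickerson describes is to compare incoming prompts against a reference window. The sketch below uses a population stability index over prompt lengths; in practice, teams often run the same comparison over prompt embeddings. It is an illustration of the general technique, not Arthur's method.

```python
# Illustrative drift check (not Arthur's method): compare the distribution of
# incoming prompt lengths against a reference window with a population
# stability index (PSI).
import numpy as np

def psi(reference, current, bins=10, eps=1e-6):
    """Population stability index between two samples of a scalar feature."""
    edges = np.histogram_bin_edges(np.concatenate([reference, current]), bins=bins)
    ref_pct = np.histogram(reference, bins=edges)[0] / len(reference) + eps
    cur_pct = np.histogram(current, bins=edges)[0] / len(current) + eps
    return float(np.sum((cur_pct - ref_pct) * np.log(cur_pct / ref_pct)))

reference_lengths = np.random.normal(60, 15, 5000)   # prompt lengths observed at launch
live_lengths = np.random.normal(140, 40, 5000)       # users exploring much longer prompts

score = psi(reference_lengths, live_lengths)
print(f"PSI = {score:.2f}")   # values above roughly 0.25 are usually treated as major drift
```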
Here’s the complete video interview, part of SiliconANGLE’s and theCUBE’s coverage of the AWS Startup Showcase: “Top Startups Building Generative AI on AWS” event:
(* Disclosure: ArthurAI Inc. sponsored this segment of theCUBE. Neither Arthur nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)
Photo: SiliconANGLE