Rapid tech advances and real-time data drive the next wave of innovation
The rapid evolution of technology is not just a buzzword; it’s the driving force behind today’s most significant innovations. Real-time data, once a niche aspect of tech, has now taken center stage, influencing everything from artificial intelligence to cloud ecosystems.
“We started to see the formation a few years ago, where if you’re building an ecosystem on top of a cloud, you have platforms, and platforms require data,” John Furrier said, during a recent discussion on theCUBE, SiliconANGLE Media’s livestreaming studio. “We were always saying that the role of data would be very important, where developers would be coding with data in line, and not just using data.”
This shift has become more pronounced with generative AI, making data’s role not just important, but critical. The number of large data centers operated by hyperscale providers surpassed the thousand mark in early 2024, as noted in a Synergy Research Group report. This explosive growth underscores a new paradigm in which access to and management of real-time data are no longer just advantages, but necessities for staying competitive.
“Meanwhile it has taken just four years for the total capacity of hyperscale data centers to double, as the number of facilities grows rapidly and their average capacity continues to climb,” the research report reads. “Looking ahead, Synergy forecasts that total hyperscale data center capacity will double again in the next four years.”
This feature is part of SiliconANGLE Media’s exploration of real-time data trends ahead of Aerospike Inc.’s Real-time Data Summit, taking place from June 25 to 26.*
How real-time data and AI are transforming development and computing power
When data becomes generative, it effectively creates a runtime environment for assembling data and content, representing a sort of shift left for developers, according to Furrier.
“You’re starting to see early signs, with the large language models and the complex neural network multimodal capabilities,” he said. “We are right in line. That’s where all the hype is, and that’s where all the action is from a developer standpoint. This is the perfect storm, where this is going to shift the game a bit from where it was and to a new paradigm.”
Getting a handle on a new wave of processing power is certain to be a fluid and competitive challenge. Take the ongoing global battle to control the future of computing power.
“The present battle involves several of the world’s biggest tech companies, including Google, Apple, Microsoft, Amazon and Meta, making their own AI chips because they want to control key assets and, thus, their own destiny,” according to Casey Logan, senior principal, supply chain, at Gartner Inc., in a recent report.
The road ahead for innovation in a new generative AI world is anything but clear. Access to real-time data, and the ability to handle vast amounts of it from various sources, is becoming the standard. Companies such as Aerospike Inc. have long worked in real-time data, but it’s becoming increasingly clear that customers need to run multiple streaming technologies simultaneously, including solutions such as Apache Kafka, to manage this influx of data.
What’s the road ahead, both for those coming out of college and for those established in their careers? And how does everyone keep pace when the pace of innovation moves so fast?
The paradigm shift in higher education
Even as generative AI was in its early stages, the commentary came quickly: Should students rethink their college majors in this new age of AI? After all, generative AI is changing everything, and higher education figures to be no exception.
Recently, the Wall Street Journal published an article stating that computer science is hotter than ever at United States universities, though it isn’t always immediately translating into a career. Still, AI talent is in high demand at companies such as Amazon.com Inc., Nvidia Corp. and Meta Platforms Inc., and that’s leading to big changes. It represents a paradigm shift, according to Tim Faulkes, chief developer advocate at Aerospike.
“Technology is moving at such a pace, and the ecosystem has moved at such a pace that the people who are coming straight out of university are more versed than the people who are more mentoring,” Faulkes told theCUBE in a recent interview. “There’s so many moving pieces, and they’re so novel. How they hang together, it’s almost frustrating for the experienced developers … you hear all these terms, how do you put them all together?”
Of course, various domain-specific products can be designed to meet the specific needs of individual industries, such as e-commerce or healthcare. But the ongoing paradigm shift does bring with it new challenges.
“Computers were not built for hallucinating. That is a side effect of the new platform shift. And the new platform shift went from the old way of, you code stuff and you get a response,” Furrier said during a recent interview. “Things are programmatic, it’s deterministic in some levels. Now you have a new model where you don’t know what you’re going to get, and oftentimes it’s not the same answer because it’s generating and the data drives that.”
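Furrier’s contrast between the old programmatic model and the new generative one can be sketched in a few lines. This is a toy illustration only: `deterministic_lookup` and `generative_answer` are hypothetical stand-ins, not real systems, with the generative side simulated by random sampling.

```python
import random

def deterministic_lookup(key, table):
    """Old model: programmatic and deterministic. Same input, same output."""
    return table[key]

def generative_answer(prompt, candidates, rng):
    """Toy stand-in for a generative model: the answer is sampled,
    so repeated calls with the same prompt can differ."""
    return rng.choice(candidates[prompt])

table = {"capital_of_france": "Paris"}
candidates = {
    "describe_paris": ["A historic capital.", "The City of Light.", "France's largest city."]
}

# Deterministic path: identical on every call.
first = deterministic_lookup("capital_of_france", table)
second = deterministic_lookup("capital_of_france", table)

# Generative path: every answer is valid, but not necessarily repeatable.
rng = random.Random()
answers = {generative_answer("describe_paris", candidates, rng) for _ in range(50)}
```

The point of the sketch is the asymmetry: the lookup is testable by equality, while the generated answers can only be validated against a space of acceptable outputs, which is why the data driving generation matters so much.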
Harnessing AI to overcome new challenges in data processing
Real-time data used to be focused on certain applications. Now, it’s mainstream because of AI, which has enabled rapid processing and analysis of vast amounts of information. This shift has transformed industries, enhancing decision-making and operational efficiency across the board.
“Apple pointed out at their recent Worldwide Developer Conference that you’re processing inference in the device. You have device to core, end-to-end work streams. It’s the horizontal scalability, but yet you need to have low latency,” Furrier said. “This is a technical challenge.”
Given these challenges, what’s to be done? For Aerospike, it’s clear that work needs to be done now to address real-time data challenges. The mashup of external data and internal data becomes very important, according to Lenley Hensarling, chief product officer of Aerospike Inc. Being able to supply it in real time so that it’s up-to-date is key when it comes to differentiation and competitiveness.
“The most important thing companies can do is be able to ingest that data, but also to make that data available to other applications, whether they’re ML for training, whether it’s creating the context that you’re going to decide on, the stream of data that provides a signature, a signal that you have to make a decision on in the moment,” Hensarling said.
It’s all about being able to ingest all the data in a way so that it immediately becomes available, Hensarling added. Many different tools do that, including Databricks, Confluent and Redpanda.
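Hensarling’s pattern of ingesting once and immediately fanning data out to every downstream consumer can be sketched as a toy in-memory stream. `ToyStream` is a hypothetical stand-in for illustration, not the API of Databricks, Confluent, Redpanda or any other vendor.

```python
from collections import defaultdict

class ToyStream:
    """Minimal in-memory stand-in for a streaming platform: each ingested
    record is immediately fanned out to every subscribed consumer."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def ingest(self, topic, record):
        # On ingest, the record becomes available to all consumers at once.
        for handler in self.subscribers[topic]:
            handler(record)

stream = ToyStream()
training_buffer = []   # feeds ML training downstream
decision_context = {}  # latest record per user, for in-the-moment decisions

stream.subscribe("payments", training_buffer.append)
stream.subscribe("payments", lambda r: decision_context.update({r["user"]: r}))

stream.ingest("payments", {"user": "u1", "amount": 42})
stream.ingest("payments", {"user": "u1", "amount": 7})
```

After the two ingests, the training buffer holds the full history while the decision context holds only the freshest state per user, illustrating how one stream can serve both training and real-time decisioning.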
Understanding LLMs and real-time data
Ultimately, how does everyone keep pace with innovation, and how best might one respond to a developer seeking a readiness plan? For some, it involves turning to learning tools and readiness frameworks.
But developers must also set the table for building robust applications with multimodal capabilities. The first part involves large language models, according to Faulkes.
“They’ve got to be able to understand what they’re doing with their inputs,” he said. “I’ve got inputs of video and audio and even things like business objects. So, if you’re doing a recommendation engine, you want the business objects that represent your data in there. It’s a combination of all these factors. And you’ve got to use a large language model to do something useful with it.”
However, without context, LLMs are likely to hallucinate. It’s also important to note that LLMs aren’t writing code; they’re writing English.
“You’ve got the LLMs, you’ve got the prompt engineering, you’ve got your vector databases so that you can get the right information out of all your inputs and give it the answer,” Faulkes said. “And then accuracy. It’s a fuzzy search. It’s meant to be, ‘I don’t necessarily want an exact answer. I want things that are related to it.’”
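The pipeline Faulkes describes, retrieving the right information from a vector database to ground an LLM’s answer, can be sketched with cosine similarity over hand-written toy embeddings. The document names and vectors here are made up for illustration; real systems would use a trained embedding model and a vector database.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical pre-computed embeddings for a few documents.
docs = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "gift cards": [0.0, 0.2, 0.9],
}

def retrieve(query_vec, k=2):
    """Rank documents by similarity to the query and return the top k
    to be handed to an LLM as grounding context."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, docs[d]), reverse=True)
    return ranked[:k]

query = [0.85, 0.2, 0.05]  # toy embedding for "how do I get my money back?"
context = retrieve(query)
prompt = f"Answer using only this context: {context}\nQuestion: how do I get my money back?"
```

Note the fuzziness Faulkes describes: the query never matches any document exactly; retrieval just returns whatever is most related.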
With a traditional relational database, the same query always returns the same answer. But things are different when it comes to fuzzy searches.
“There’s going to be a set of ecosystem tools around, ‘Is this the right answer? How do I actually know it’s an approximate nearest neighbor search we all tend to use in vector databases? How proximate is it? Is it the right information or is it wrong information?’” Faulkes said.
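One simple way to make the “how proximate is it?” question concrete is a similarity threshold: accept a nearest neighbor only when it is close enough to trust, and otherwise flag that no answer qualifies. This is a minimal sketch with hypothetical two-dimensional vectors and an arbitrary 0.8 cutoff, not a statement of how any particular vector database works.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def nearest_with_guardrail(query, vectors, threshold=0.8):
    """Return the nearest neighbor only if its similarity clears the
    threshold; otherwise signal that nothing is close enough to trust."""
    best = max(vectors, key=lambda name: cosine(query, vectors[name]))
    score = cosine(query, vectors[best])
    return (best, score) if score >= threshold else (None, score)

vectors = {"dog care": [1.0, 0.0], "tax law": [0.0, 1.0]}

hit, score = nearest_with_guardrail([0.95, 0.1], vectors)  # clearly about dog care
miss, low = nearest_with_guardrail([0.6, 0.62], vectors)   # ambiguous query
```

The second query is roughly equidistant from both topics, so rather than confidently returning possibly wrong information, the guardrail reports that no match is proximate enough.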
Though developers are sure to face more challenges in the weeks and months to come, it’s clear that the convergence of cloud computing, generative AI and real-time data demands a cohesive approach. The pieces may remain a moving target, but embracing a unified strategy appears essential in the new era.
(* Disclosure: TheCUBE is a paid media partner for pre-event coverage of the Real-time Data Summit event. Neither Aerospike Inc., the sponsor of theCUBE’s coverage, nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)