2024 has arguably been the year in which we’ve spent the most time eagerly awaiting new AI releases and announcements.
Despite some disappointments and surprises along the way, AI development has clearly left a positive overall impression. I say this because many of us are incorporating these models into our daily routines more and more.
That said, it’s also true that we always want more—it’s just human nature. With that in mind, let’s explore some challenges and trends that I’m confident big tech companies are already working on as we speak.
Let me clarify: I’m not trying to predict the future—that’s not my goal at all. Instead, I want to highlight areas where you’ve probably noticed, just like I have, that AI still has room to grow.
1) It's all about AI Agents
In recent months, there’s been a growing appetite to better understand this technology. But before we dive in, let’s tackle a simple question: What are AI agents?
These are intelligent systems that can reason, plan, and take action. Essentially, an AI agent can break down complex problems, create multi-step plans, and interact with tools and databases to achieve specific goals. This is why there’s broad agreement on the value of a well-performing AI agent.
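To make that loop concrete, here is a minimal sketch of the reason-plan-act cycle most agent frameworks implement. The `call_llm` function and the tool registry are hypothetical stand-ins, not any particular library’s API.

```python
# A minimal sketch of the reason-plan-act loop behind most AI agents.
# `call_llm` and the tool registry below are hypothetical stand-ins,
# not any particular framework's API.

def call_llm(prompt: str) -> str:
    # Stand-in for a real model call; here it finishes immediately.
    return "FINISH: example answer"

TOOLS = {
    "search_database": lambda query: f"rows matching {query!r}",
    "send_email": lambda body: "email sent",
}

def run_agent(goal: str, max_steps: int = 5) -> str:
    history = [f"Goal: {goal}"]
    for _ in range(max_steps):
        # Reason/plan: ask the model for the next action given the history.
        decision = call_llm(
            "\n".join(history)
            + "\nReply with 'tool_name: input' or 'FINISH: answer'."
        )
        if decision.startswith("FINISH:"):
            return decision.removeprefix("FINISH:").strip()
        # Act: route the chosen action to a tool and record the observation.
        tool_name, _, tool_input = decision.partition(":")
        observation = TOOLS[tool_name.strip()](tool_input.strip())
        history.append(f"{decision} -> {observation}")
    return "Stopped after max_steps without finishing."

print(run_agent("Find last week's sales and email a summary."))
```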
The challenge, however, is that current models often struggle to reason logically and consistently. They’re typically good at executing simple plans, but when it comes to navigating complex scenarios with multiple variables, they tend to lose focus and make decisions that don’t always add up.
AI agents hold the promise of delivering more specific and personalized responses based on the context or input we provide. One of the biggest hurdles is striking the right balance between how autonomous these agents are and the quality of the responses they produce.
To bridge that gap, we’ll need more advanced models in 2025.
2) Rethinking human-in-the-loop AI Systems
You’ve probably heard about the study where a chatbot outperformed doctors in clinical reasoning. In this study, 50 doctors were tasked with diagnosing medical conditions based on case reports. The same information was given to a chatbot, which ended up achieving a higher score than the doctors.
What makes this even more fascinating is that some doctors were randomly assigned to use the chatbot as an assistant during the study. Surprisingly, this group, the doctors working with the chatbot, scored lower than the chatbot working on its own. This points to a breakdown in the human-AI collaboration: ideally, an expert paired with an effective AI system should perform better than either could alone.
Deploying LLM-powered chatbots is no small feat. It requires crafting the right prompts, meaning you need to ask for things in just the right way. To address this, we need better systems that enable professionals to seamlessly incorporate AI tools into their workflows—without having to become AI experts themselves.
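One way to do that is to hide the prompt engineering behind the workflow itself. Here is a minimal sketch of the idea: the professional fills in structured fields and never touches the prompt. `ask_model` and the prompt template are hypothetical stand-ins, not any real product’s interface.

```python
# Sketch: hiding prompt engineering behind a plain function, so the
# professional supplies only domain inputs. `ask_model` is a
# hypothetical stand-in for a call to whatever LLM service is in use.

DIAGNOSIS_PROMPT = """You are assisting with clinical reasoning.
Patient history: {history}
Key findings: {findings}
List the three most likely diagnoses, each with supporting evidence
and one test that would help confirm or rule it out."""

def ask_model(prompt: str) -> str:
    # Stand-in; replace with a real API call.
    return "(model response would appear here)"

def suggest_diagnoses(history: str, findings: str) -> str:
    # The doctor never writes or sees the prompt itself.
    return ask_model(DIAGNOSIS_PROMPT.format(history=history, findings=findings))

print(suggest_diagnoses("58-year-old, chest pain on exertion", "elevated troponin"))
```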
3) The rise of ultra-large AI Models
Large language models are built with an enormous number of parameters that are adjusted during training. The models we’ve seen in 2024 typically range between 1 and 2 trillion parameters. Looking ahead, the next generation of models is expected to be much larger, potentially surpassing 50 trillion parameters.
As we approach the end of 2024, we’re seeing launches like Google’s Gemini 2.0 and OpenAI’s o3, signaling the direction these developments are taking. It’s no surprise that these more advanced models are also paving the way for entirely new business opportunities to emerge alongside them.
4) The potential of compact AI Models
While we’ve talked about the rise of larger models, there’s also a growing opportunity for smaller ones. These models, with just a few billion parameters (which might still sound like a lot), don’t need massive data centers loaded with stacks of GPUs to run. Instead, they can operate on laptops or even smartphones.
Take IBM’s Granite 3 model, for example. With just 2 billion parameters, it can run on a laptop without requiring heavy computational power.
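As a rough sketch of what that looks like in practice, a model of this size can be pulled down and run with the Hugging Face transformers library. The model identifier below is my understanding of IBM’s published Granite 3.0 2B checkpoint; verify the exact name, license, and hardware requirements on the Hugging Face hub before relying on it.

```python
# Sketch: running a ~2B-parameter model on a laptop with Hugging Face
# transformers. The model ID is assumed from IBM's Granite 3.0 release;
# check the hub for the exact name before use.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="ibm-granite/granite-3.0-2b-instruct",
)

prompt = "In two sentences, why do small language models matter?"
print(generator(prompt, max_new_tokens=96)[0]["generated_text"])
```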
Moving forward, we’re likely to see more models of this size tailored for specific tasks, offering efficient solutions without demanding significant resources.
5) The path to near-infinite memory in AI
I still remember the first time I used generative AI to help me write an email. Back then, the context window for the LLM was only 2,000 tokens. Today’s models handle context measured in the hundreds of thousands or even millions of tokens, with the goal of reaching near-infinite memory—where bots can retain everything they know about us at all times.
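To get a feel for how cramped that original 2,000-token window was, you can count tokens yourself. A minimal sketch using OpenAI’s tiktoken library; other model families use different tokenizers, so the counts are approximate.

```python
# Sketch: measuring how much of a 2,000-token context window a text
# consumes. Uses OpenAI's tiktoken; other models tokenize differently.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
conversation = "Hi! Following up on last week's thread about the report... " * 150

used = len(enc.encode(conversation))
print(f"{used} tokens used; {max(0, 2000 - used)} left of a 2,000-token window")
```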
We’re approaching an era where customer service chatbots will remember every conversation they’ve had with us. At first, this might seem like a positive development—though, is it really?
6) Evolving AI Applications
Do you know what the most common business use cases for AI were in 2024?
According to a Harris survey, AI was primarily used to enhance customer experience, improve IT operations and automation, power virtual assistants, and bolster cybersecurity.
As we move into 2025, we can expect to see even more advanced use cases. With increasingly sophisticated multimodal capabilities, customer service bots will likely handle more complex issues instead of just generating support tickets. We may also see AI systems that proactively optimize entire IT networks or security tools that adapt to evolving threats in real-time.
7) The role of inference time
During inference, the model processes the user’s query in real time, applying the patterns it learned during training. New AI models are extending this phase, essentially taking some time to "think" before generating a response. How long this takes depends on the complexity of the reasoning required: a simple query might only take a second or two, while a more complex or large-scale request could take several minutes.
What makes inference-time computation so compelling is that reasoning can be tuned and improved without retraining or modifying the underlying model. This opens up two key opportunities for enhancing reasoning in LLMs: during training, by using better-quality data, and now during inference, by refining the chain-of-thought process.
This dual approach could ultimately result in AI agents that feel significantly "smarter."
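To illustrate inference-time reasoning concretely, here is a minimal sketch of self-consistency, one common technique: sample several chain-of-thought answers and keep the majority vote. The `sample_answer` function is a hypothetical stand-in for a model call at a nonzero temperature.

```python
# Sketch: self-consistency, a common inference-time reasoning technique.
# Sample several chain-of-thought answers, then return the majority vote.
# `sample_answer` is a hypothetical stand-in for a temperature > 0 model call.
from collections import Counter
import random

def sample_answer(question: str) -> str:
    # Stand-in: a real version would ask the model to reason step by step
    # and then extract the final answer from its output.
    return random.choice(["42", "42", "42", "41"])

def self_consistent_answer(question: str, n_samples: int = 9) -> str:
    votes = Counter(sample_answer(question) for _ in range(n_samples))
    return votes.most_common(1)[0][0]

print(self_consistent_answer("What is 6 x 7?"))
```

Spending more compute at inference, whether through longer chains of thought or more samples, trades latency for answer quality without touching the model’s weights.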
What AI-related topics do you think will emerge as a major trend in 2025?
My own answer: omniscience, meaning assistants with access to databases both with and without paywalls.