
Beyond the Bot Ep. 3: Deep Research & GPT-4.5

Tony and Steven for Beyond the Bot Episode 3

In this episode of Beyond the Bot, hosts Tony DeHart and Steven King dive into one of the most exciting AI developments of the year: the introduction of OpenAI's GPT-4.5. Broadcasting from the Blue Sky Lab, they unpack not only the technical nuances of this new model, but also its real-world implications for businesses, developers, and everyday users navigating an increasingly AI-integrated world.


The discussion covers how GPT-4.5 differs from its predecessors, especially in terms of human-like interaction and empathy, while still grappling with the ever-present challenge of hallucinations. With insights on deep research capabilities, model selection, and the evolving cost-efficiency balance, this episode provides a clear-eyed look at the trajectory of generative AI. Whether you're a power user or just beginning your journey with AI tools, the conversation offers valuable context and expert takes on where things are headed.


Transcript:

Tony DeHart: Welcome to the latest episode of Beyond the Bot, where we break down the latest in AI and robotics news—and what it means for you. I'm Tony DeHart.


Steven King: And I'm Steven King.


Tony: We're in the Blue Sky Lab, and today we're talking about some really big advancements in the world of AI. In many ways, we got our first look at the future of agentic infrastructure with Deep Research and scheduled tasks, along with some new updates to the capabilities of ChatGPT's models. We also got a big surprise this week with the first look at OpenAI's newest model: GPT-4.5.


Tony: Steven, you know, this is a big release. In many ways, it's different from some of the most recent OpenAI releases. How is this different?


Steven: For one thing, I think we're seeing a model that's focused on improving in areas where previous models weren't as strong. If you think about how a product develops, it grows in the technology, the math and logic pieces, but it also needs to grow in how it engages with its user. That's what we're seeing here: this model is more humanistic, engages with its user more, and gives back results that feel more like what the user wants, in language that feels more human.


Tony: So, is this model actually better at reasoning or doing complex math and things like that?


Steven: No, and that's where the race has always been: better reasoning. I think they've done really well getting to this point, but this one puts much more emphasis on the human factors and how people are going to engage with the content.


Tony: We've seen a big push on the adoption side—to use some of these chatbots to replace what in the past were human touchpoints. I'm thinking about training conversations and personal or professional development. Is this model improving those experiences?


Steven: Yeah, that’s what I’m really excited about. This model responds in a more human-like way. From an executive’s perspective, I can input some challenges I might be facing with my staff and see how the model thinks I should respond. I can even go back and forth with it or do some role-playing. It helps me prepare because it’s a little more empathetic; it’s more like how an executive in my position would react. I also see value in using the API for how robots interact with people, because now the response is more human and therefore more comfortable for people to engage with.


Tony: Now, I want to drill down on one thing you just said—while it may be better at helping you prepare for interactions with people, it’s really not ready to take over that role yet, is that right?


Steven: Absolutely. I don’t want there to be any confusion about that. We really need to think of this as augmenting me as an executive, not replacing me. I can change the inputs I give the model, and it gives me different responses, so it can be customized to my team. But again, it’s augmenting me, not replacing me.


Tony: So in this race of man versus machine, this gives users superpowers, but it’s not a replacement for you.


Steven: Yeah, I think it’s a great example of using a new tool to make me a better leader.


Tony: What are some real-world implications for users who might be leveraging this model day in and day out?


Steven: When you look at this new version, it’s going to pop up in your ChatGPT screen. You’ll be able to use the selector and choose it. You might find it’s the right one for you, or maybe it’s not. If you’re looking at more mathematical, logical tasks, there might be a better model. But if you’re looking for something closer to how a human might write, this could be the best one. So for users right now, I’d say experiment. You’ve got to be a Plus user to get it, but try the different models and see which ones give you the output you’re looking for.


Tony: So in many ways it’s about finding the right tool for the right job. We talk in tech about vendor soup, and now we kind of have model soup. For new users of ChatGPT, it can be difficult to sort through which model applies to which task.


Steven: Exactly. Most key leaders don’t have time to dig into all the differences. It’s like having a drill—sometimes you put a screwdriver on the end, sometimes a traditional drill bit, sometimes you need a hammer drill. Changing the model is like changing out the drill bit. Sometimes you try with a regular drill and realize you need more power. I’d say try a few and see what works best. Let your developers decide what to use in the API or your products.


Tony: And we've seen some new tools that can go on those drills, right? One of the most exciting recently is Deep Research. It used to be reserved for the highest-tier ChatGPT users, but it’s now becoming more widely available. Have you used Deep Research?


Steven: I’m a big fan of Deep Research. It gives me capabilities that would have cost a lot of time and money. It’s like having a consultant. For example, I recently used it to understand weaknesses in our business. It analyzed competition, products, and provided analysis—not just data. It takes about 30 to 45 minutes, but gives me insights that would’ve taken weeks or thousands of dollars to gather.


Tony: For those unfamiliar, it might seem similar to web search. How is this a step beyond?


Steven: Web search goes out, grabs a fact, and returns. Deep Research dives deeper. It goes down rabbit holes, does analysis, spiders out, and returns a more synthesized, valuable result—more like what a consultant might provide.


Tony: One of my use cases was evaluating pricing across vendors. Instead of finding one price, Deep Research gave me a comparative analysis of multiple vendors. Much more like how I’d research myself.


Steven: Exactly. Sometimes you want a fact. Sometimes you want deeper analysis. That’s where Deep Research shines.


Tony: And Deep Research is compatible with this new model. One key advantage is its lower propensity for hallucinations. Can you explain that?


Steven: Hallucinations are when AI makes up content. This model does better at avoiding that, though it’s not perfect. You still need to check sources and understand where data comes from. Even if hallucinations are reduced by half, that’s still over 20 percent, so human oversight is crucial.


Tony: And it’s worth noting that you don’t get to choose which model Deep Research uses under the hood. You can choose the model that interacts with that data, though.


Steven: That’s right. From OpenAI’s perspective, it’s about choosing the most cost-efficient model that delivers strong results. And for me, Deep Research was so good I didn’t mind not choosing the model. But if the result isn’t quite what I wanted, then I’d want that choice back.


Tony: So this is kind of a culmination of two big frontiers: advanced reasoning and human interaction.


Steven: Right. We're starting to see them blended in a unified customer experience. Soon, we might not choose a model at the start of a chat—it’ll switch based on the query. As a leader, I like testing different models now. But I understand why OpenAI wants to simplify.


Tony: Do you think it’s better to let users choose at the start or have it switch dynamically?


Steven: For most users, switching dynamically is better and more efficient. But as a researcher, I want more control, even if the default is automatic.


Tony: And that model choice affects costs too. GPT-4.5 has significantly higher API costs—30 times higher in some cases. So engineers will need to make smart decisions on when to use which model.


Steven: Right. It’s about using the most efficient model for each task. Gain knowledge with one, pass it to another. That’s how you optimize your tokens and credits.
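The cost-aware routing Tony and Steven describe can be sketched in code. Below is a minimal, hypothetical Python sketch; the model names, per-token prices, and the `pick_model` and `estimated_cost` helpers are illustrative assumptions for this episode's discussion, not OpenAI's actual API or current pricing.

```python
# Hypothetical sketch of cost-aware model routing, as discussed above.
# Model names and per-token prices are illustrative assumptions; check
# your provider's current pricing before relying on these numbers.

MODELS = {
    "human-facing": {"name": "gpt-4.5-preview", "usd_per_1m_input_tokens": 75.00},
    "general":      {"name": "gpt-4o",          "usd_per_1m_input_tokens": 2.50},
}

def pick_model(task_type: str, budget_sensitive: bool) -> str:
    """Route a request to a model tier based on the task and budget.

    An empathy-heavy, human-facing task might justify the pricier model;
    everything else defaults to the cheaper general-purpose tier.
    """
    if task_type == "human-facing" and not budget_sensitive:
        return MODELS["human-facing"]["name"]
    return MODELS["general"]["name"]

def estimated_cost(tier: str, input_tokens: int) -> float:
    """Rough input-token cost estimate in US dollars for a given tier."""
    price = MODELS[tier]["usd_per_1m_input_tokens"]
    return input_tokens / 1_000_000 * price

# Example: a leadership role-play prompt vs. a bulk data-extraction job.
print(pick_model("human-facing", budget_sensitive=False))   # pricier tier
print(pick_model("data-extraction", budget_sensitive=True)) # cheaper tier
```

With the prices assumed here, the same million input tokens would cost roughly $75 on the premium tier versus $2.50 on the general tier, which is the 30x gap Tony mentions and why "gain knowledge with one, pass it to another" saves tokens and credits.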


Tony: So for the average user, what’s the takeaway?


Steven: I feel more confident in the responses I get: fewer hallucinations, more human language. Play with the models now, but know that flexibility may go away in version 5.


Tony: That trade-off might be worth it for ease, but advanced users may miss the control.


Steven: Exactly.


Tony: Steven, it's been a pleasure unpacking this with you. Thank you for joining us for this episode of Beyond the Bot. We'll be back next week with more updates.
