The best AI chatbots of 2026: Expert tested and reviewed

ZDNET's key takeaways

  • Free AI chatbots deliver more power than ever before.
  • ChatGPT, Copilot, and Grok top performance rankings.
  • Image generation and storytelling now rival premium AIs.

The introduction of the first successful AI chatbot in 2022 was a tech quake on the scale of the internet itself, or the smartphone. Its arrival changed the technology landscape almost overnight.

You know the story since then. AI chatbots have become hugely popular, often saving folks a lot of work, while also putting jobs at risk. They have transformed education, writing, coding, and more.

What is the best AI chatbot right now?

ChatGPT is the OG chatbot, the AI that shook up the world. OpenAI has been innovating ever since, and its latest free offering shows it. And because ChatGPT is the market leader, there are many resources available for it, including tons of articles, books, courses, free training videos, and more.

Also: I'm an AI tools expert, and these are the 4 I pay for now (plus 2 I'm eyeing)

With the top overall score, ChatGPT is the overall winner. I'll first explain my hands-on approach, tell you about a few surprises, and then get into why ChatGPT won the top spot. We're also looking at Copilot, Grok, Gemini, Perplexity, Claude, DeepSeek, and Meta AI.

Hands-on with the best free chatbots

Here at ZDNET, we publish plenty of articles on the impact of AI. This one is meant to be more practical: a hands-on, chatbot-by-chatbot comparison to help you decide which to use. I put each chatbot's free tier to the test (14 tests across each of eight chatbots, 112 in all), proving you don't need to spend anything to gain access to billions of dollars of compute capability.

Rather than taking the easy way out and spewing a bunch of specs and model names at you, I approached the ranking process by running each chatbot through a series of real-world tests.

I'm also avoiding AI model mentions (like GPT-5 vs. GPT-5-mini) here because the AI companies treat their free AI tiers like gumbo. Gumbo is often a restaurant offering made of whatever meat, poultry, or seafood leftovers are available. While almost always tasty, there's no guarantee that the exact same gumbo experience will be repeated from day to day. Likewise, AI companies tend to serve free-tier users whatever less resource-intensive models are available at the time, and those models may change at any time.

Also: 10 ChatGPT prompt tricks I use - to get the best results, faster

My tests consist of ten text-based questions encompassing summarization and web access, academic concept explanation, math and analysis, cultural discussion, literary analysis, travel itinerary, emotional support, translation and cultural relevance, a coding test, and a long-form story test. On one test, I ask the AIs to explain the academic concept to a five-year-old. There are also four image tests that include generating a flying aircraft carrier, a giant robot, a young baseball player in a medieval court, and an homage to the movie Back to the Future.

The details of the tests and the exact questions I asked are provided at the end of this article. That way, you can try my tests with any or all of the chatbots in your own browser window. If you do, let us know what you think of the results in the comments below.

Each chatbot is ranked on a 100-point scale for text-related prompts and a 20-point scale for image-related prompts. The overall scores are the sum of both score categories for a total of 120 points.
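
To make the arithmetic concrete, here's a minimal sketch of that rubric in Python, using ChatGPT's scores from later in this article as the worked example (the function name is just illustrative):

```python
# Minimal sketch of the scoring rubric: text prompts are scored out of
# 100 points, image prompts out of 20, and the overall score is the
# simple sum of the two, for a 120-point maximum.
def overall_score(text_points: int, image_points: int) -> int:
    assert 0 <= text_points <= 100, "text category is scored out of 100"
    assert 0 <= image_points <= 20, "image category is scored out of 20"
    return text_points + image_points

# Example: ChatGPT scored 91 on text and 18 on images.
print(overall_score(91, 18))  # 109
```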

Big surprises

The hands-on tests netted a number of fairly big surprises. I was particularly surprised by just how much value the AI vendors are providing for free.

  • The first surprise was how little throttling I experienced through my series of 10 back-to-back prompts.
  • The second surprise was how much the AIs let you do without requiring you to create an account or log in.
  • The third big surprise was just the overall quality of the responses.

While some responses from bottom-of-the-list AIs seemed somewhat phoned in, the overall quality across the board has improved drastically since the last time I took a comprehensive look at free AI chatbot use.

I used each chatbot for a few hours straight, with little or no throttling. But if you want to use them constantly, all day, every day, it's likely you'll hit some resource usage limits enforced by the AI vendors.

Most of the AIs offer premium plans in addition to their free plans. These plans provide deeper reasoning and more powerful models capable of solving bigger, more complex problems, along with added features like more autonomous capabilities and in-depth programming support. Where appropriate, we've mentioned those plans and their prices.

And with that, let's dive into my overall winner, ChatGPT.

The best AI chatbots of 2026

ChatGPT

Overall score: 109

One thing I noticed is that about half of my text-based prompts were handled nearly perfectly by almost all of the chatbots I tested. These included the ability to explain a basic academic concept to a child, do math and analysis, provide a cultural discussion with context, perform a quick literary analysis, and translate text and provide context. ChatGPT aced all of these.

(Disclosure: Ziff Davis, ZDNET's parent company, filed an April 2025 lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.)

Also: How to use ChatGPT: A beginner's guide to the most popular AI chatbot

Where ChatGPT fell down was locating and summarizing a current event. My test sends the AIs to a Yahoo News article about the flu and asks for a summary. Perhaps because I was running it in an incognito window and hadn't logged in, ChatGPT sent me to Yahoo's Taiwanese news portal and presented its results in traditional Chinese (the script used in Taiwan).

ChatGPT constructed a good tour for the travel itinerary test. It included many of the appropriate stops. It also included pictures for each day's itinerary, and some clothing recommendations for March in the Northeast.

ChatGPT also aced my basic coding test. We'll subject the chatbots to a comprehensive set of coding tests in a different article, but coding is worth ten points of the one hundred text points awarded in this evaluation.

For the long-context story assignment, ChatGPT lost a few points because it didn't produce the 1,500 words required. Also, while it told a story with the right tone and style for the assignment, it presented much of the story as almost an outline, with headings for each main character.
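
Free tiers rarely report word counts (Grok, covered below, was the only one to do so), so if you try this test yourself, a quick check along these lines helps; this sketch assumes you've pasted the story into a local story.txt file:

```python
# Quick length check of a generated story against the 1,500-word
# minimum. Splitting on whitespace is rough but serviceable here.
with open("story.txt", encoding="utf-8") as f:
    words = len(f.read().split())

print(f"{words} words:",
      "meets" if words >= 1500 else "misses",
      "the 1,500-word minimum")
```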

While the image quality is subjective, ChatGPT did a good job with the image assignments. The character produced for the Back to the Future assignment is just a random kid, but it did show the correct text logo, a DeLorean, and the kid holding a skateboard.

Also: Is ChatGPT Plus still worth $20 when the free version offers so much - including GPT-5?

Overall, as the OG AI chatbot, ChatGPT's free tier is a solid offering with a bunch of added features like standalone apps, a recently announced browser, and a lot of capability as you scale into its higher tiers.

Text score: 91 out of 100
Image score: 18 out of 20

Premium offerings: ChatGPT offers a Plus plan for $20 per month and a Pro plan for $200 per month. Both offer most of ChatGPT's higher-end model features, but scale up resource availability based on which plan you use.

Images generated using ChatGPT:


Pros
  • Image generation
  • Good code results
  • Large ecosystem
Cons
  • Gets very naggy asking for login
  • Wrong language response from web lookup

Copilot

Overall score: 97

Copilot (formerly part of Bing) integrates with Microsoft products. While that's the above-the-fold headline, the free version of Copilot is also a rather good standalone chatbot. Running logged out, in an incognito/private browsing mode, Copilot was the least naggy of all the AIs. It asked me to log in just once, then let me proceed through all my tests without requiring a login or asking again.

Also: How to remove Copilot from your Microsoft 365 plan - before you have to pay for it

Copilot's free tier successfully accessed the web and looked up a current news story about the flu, although it also pulled data from other articles, including one about bird flu in Canada and another about an Australian woman with an asthma flare-up. Both were related stories, but the AI deviated from the assignment and lost points there.

It competently handled explaining an academic concept, identified a math sequence, discussed a cultural issue with context, and analyzed the key themes from a well-known book.

When it came to my vacation travel itinerary test, it not only pointed out appropriate stops and points of interest, but picked up on the prompt's mention of going in March and identified some events happening in Boston in March. However, it did not recommend visiting the USS Constitution, which is a top-line historical point of interest, and it didn't recommend anything regarding weather or clothing for the windy, cold month.

For my emotional-support job-interview-jitters test, the chatbot gave a number of constructive suggestions, including doing your homework and thoroughly researching the company before the interview.

Copilot lost some points in coding. It not only missed edge cases; it also made some string-handling errors and wrote code with notable performance issues. For the company behind the VS Code development environment, that's a bit of a disappointment.
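
The exact coding prompt is listed with the other test details at the end of this article, but to give a flavor of the failure modes, here's a hypothetical harness with the kinds of string-handling edge cases that often trip up generated code (both the cases and the naive normalizer are illustrative, not the actual test):

```python
# Hypothetical edge-case harness; not the article's actual coding test.
# Generated string-handling code commonly stumbles on inputs like these.
def check(fn):
    cases = {
        "": "",                      # empty input
        "  padded  ": "padded",      # stray whitespace
        "MiXeD Case": "mixed case",  # inconsistent casing
        "émigré": "émigré",          # non-ASCII characters
    }
    for raw, expected in cases.items():
        got = fn(raw)
        verdict = "ok" if got == expected else f"FAIL (got {got!r})"
        print(f"{raw!r}: {verdict}")

# A naive normalizer that trims whitespace and lowercases its input.
check(lambda s: s.strip().lower())
```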

Copilot wrote a charming, engaging long-form story that met nearly all the requirements of the prompt, except for coming in 187 words short of my specified minimum. Still, it was a complete story, well written and absolutely appropriate to the style implied by the prompt.

Image generation took a loooong time, more than five minutes per image. The quality of the images was good. The picture got the kid's baseball uniform quite right, including the logo on the cap and even the correct spelling of "New York" on the shirt (something AIs have had difficulty with). Copilot failed the fourth, Back to the Future-themed challenge with an "I can't generate that image because it would violate copyright policies" message. It did, however, create a fourth image (of a techno-witch), meaning I didn't hit any resource-limitation walls on the free tier.

Also: College students can get Microsoft Copilot free for a year - here's how

If you're an active Microsoft user, you shouldn't hesitate to use Copilot. If you're just interested in a free AI chatbot, Copilot will do the job as well. It's our second-best-ranked AI chatbot overall.

Text score: 87 out of 100
Image score: 10 out of 20

Premium offerings: Copilot has a $20-per-month Pro plan that provides access to more capabilities and provides AI features inside Microsoft 365 applications. There are also business plans, a $10-per-month Pro plan for developers, and an ever-increasing set of tiers and options for business users.

Images generated using Copilot:


Pros
  • Deep Microsoft integration on paid plans
  • Good responses overall
  • Web access
Cons
  • Slow image generation
  • Blocked image topics
  • Too many premium plans and options

Grok

Overall score: 96

Grok was definitely an underdog on my list. I certainly didn't expect it to earn the third-place position on the winner's podium. But it did.

Grok's free offering absolutely aced the travel itinerary test question. It didn't include images, but gave the most personal and usable itinerary of all of the chatbots. It included general pricing for various attractions, a very good mix of attractions and eating (mentioning my personal favorite, the Union Oyster House), discussed planning for the weather, and explained why certain items were chosen for each day. The response just felt the most "human" of all the itineraries I've seen.

Grok also displayed an interesting quirk that was kind of charming. The second test question in my series of ten asks the AI to explain educational constructivism to a five-year-old. AIs are often told to assume a style, and a classic test is "explain it like you would to a five-year-old." In this test, Grok gave a short but usable answer to that question, but then went on to append explanations for five-year-olds to most of the other questions asked, including coding.

Its coding response is worth an extra moment of discussion. The AI generated code, but it had a few minor bugs, including a whitespace bug, a leading-zero bug, and a decimal bug. However, it appended an explanation of the problems it was trying to fix, aimed at a five-year-old, which made the issues quite clear.
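
To make those three bug classes concrete, here's a hypothetical number-parsing sketch; it illustrates the categories and is not Grok's actual output (the real coding prompt is at the end of this article):

```python
# Hypothetical illustration of the three bug classes described above.
# Each comment names the buggy behavior; this version avoids it.
def parse_number(text):
    # Whitespace bug: forgetting to strip, so " 42 " fails to parse.
    cleaned = text.strip()
    # Decimal bug: calling int() unconditionally, so "3.14" raises.
    if "." in cleaned:
        return float(cleaned)
    # Leading-zero bug: rejecting "007" as malformed; Python's int()
    # accepts leading zeros in strings and returns 7.
    return int(cleaned)

print(parse_number(" 42 "))  # 42
print(parse_number("007"))   # 7
print(parse_number("3.14"))  # 3.14
```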

Also: Why xAI is giving you 'limited' free access to Grok 4

I still can't decide whether continuing the explain-to-a-five-year-old theme throughout the session was good conversational awareness or overdone. For example, it correctly identified the Fibonacci sequence, and then went on to explain it at a five-year-old level. It did the same when it analyzed the themes in A Song of Ice and Fire, the book series behind Game of Thrones, which was somewhat strange considering how dark those themes are.

Grok skipped the kid-friendly discussion when it translated a sentence to Latin. It gave a very good explanation of the relevance of Latin in today's society.

Grok was the only AI to report word count (1,512) for the long-form story project. It also hit on the proper themes, but it lost points because it seemed to try a little too hard to incorporate the prompt elements without truly integrating them into the story. At the end, it gave a summary of what it was about for a five-year-old.

When I ran Grok in incognito mode while logged out, the image generator refused to produce any images at all. When I tried Grok from my Twitter/X account, it produced all four, but they could have been better. The baseball player looked like he was in a Medieval Times restaurant rather than in actual medieval times. And while the Back to the Future test produced a kid in a puffy vest with a DeLorean and skateboard (and a Doc Brown peeking out from behind), it was placed in front of a house right out of 1980s Bergen County, New Jersey, rather than 1950s Hill Valley, California.

Also: X's Grok did surprisingly well in my AI coding tests

Still, I can declare Grok to be a fully competitive AI chatbot. Can you grok it? Which famous author originated the term "grok"? Comment with your answer below.

Text score: 86 out of 100
Image score: 10 out of 20

Premium offerings: Some of Grok's premium features are tied to premium X/Twitter plans. But there's also a SuperGrok service with access to more powerful models, at either $30 or $300 per month, depending on how far you want to go (the $300-per-month plan provides a preview of Grok 4 Heavy, a "heavier" model).

Images generated using Grok:


Pros
  • Excellent itinerary
  • Conversational tone
  • No nagging
Cons
  • No images outside of Twitter/X
  • Buggy code
  • Old web stories

Google Gemini

Overall score: 95

Google Gemini (formerly Bard) is showing up all over Google's offerings, including inside Chrome. In this ranking, we're not looking at the various implementations and delivery modes. Instead, we're sticking to my approach of doing hands-on testing of actual AI performance with actual questions.

Gemini's test results were another surprise, but not for a good reason. Going into my testing process, I fully expected Gemini's free tier to come in at #2, right after ChatGPT. But it landed at #4, below even Grok. That's just embarrassing.

I have to start by telling you where Gemini lost points, because it's amusing. Well, amusing to me. I'm sure there's a product manager at Google who will be anything but amused. For each chatbot, one of my tests is translating a sentence into Latin. Since I don't know Latin, I feed each chatbot's translation to Google Translate to convert back into English. Do you know which chatbot's translation Google Translate couldn't handle? The only one? Yep. Google Gemini's.
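
If you'd rather script that round trip than paste into the Google Translate web page, here's a minimal sketch against the Google Cloud Translation v2 REST API (the endpoint is real, but the API key and Latin sample below are placeholders, and the scoring in this article was done through the web interface):

```python
# Minimal round-trip sketch using the Google Cloud Translation v2 REST
# API. API_KEY is a placeholder for a real Cloud Translation key.
import requests

API_KEY = "YOUR_API_KEY"
ENDPOINT = "https://translation.googleapis.com/language/translate/v2"

def translate(text, source, target):
    resp = requests.post(ENDPOINT, params={
        "key": API_KEY, "q": text, "source": source,
        "target": target, "format": "text",
    })
    resp.raise_for_status()
    return resp.json()["data"]["translations"][0]["translatedText"]

# Feed a chatbot's Latin output back to English and compare by eye.
latin_sample = "Alea iacta est."  # stand-in for a chatbot's translation
print(translate(latin_sample, source="la", target="en"))
```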

Irony aside, the AI did quite well on questions that required factual answers, but it seemed to struggle whenever it was asked for subjective recommendations, like a travel itinerary or an explanation of an academic concept for a child. For the latter, it provided a solid enough answer but went very much overboard on analogies. Worse, the analogies didn't quite fit the examples it used.

It scored 10 out of 10 on the math sequencing prompt, on the Game of Thrones theme analysis, and on my test prompt about the impact of social media on society. It also did quite well in my job interview question. Gemini was far more practical in its advice than ChatGPT, offering tangible tips for interview success and for increasing confidence going into the interview.

Also: Gemini arrives in Chrome - here's everything it can do now

Gemini provided a difficult-to-read table for the seven days of travel. The prompt asked for an itinerary of Boston looking at tech and history themes, but Gemini decided that history was always in the morning and tech always in the afternoon, regardless of the location or distance between points of interest.

On my current-events web-access question, Gemini not only failed to pull information from the site I requested, it also went out and pulled information from sites I didn't request. When I asked for a summary of a specific article, it didn't give a synopsis of the desired article; instead, it gathered information from other, tangentially related articles. It clearly did not do what I asked. Many of the AIs seemed to miss the basic point when asked to summarize a specific article.

The Gemini test code was generally solid, although it missed some issues that are quite mainstream and could hardly be considered edge cases. This would likely have caused some failures for users.

Also: Gemini Pro 2.5 is a stunningly capable coding assistant - and a big threat to ChatGPT

For my long-form story request, the AI first thought I was asking for an image. I corrected it and gave it the prompt again. Weirdly, the AI boldfaced random words throughout the story. I found the 3,379-word story good enough, but a little hard to follow. The story also seemed to try to force-fit random concepts into the overall narrative, as if the AI wasn't entirely sure how to knit the whole piece together.

Image generation itself was good, but there were complications. The AI insisted I sign in before it would generate images. I tried to sign in using my test account, but the AI wouldn't even spin up the chatbot prompt interface. I tried both incognito mode and a regular window, with no success. I even tried Safari instead of Chrome.

Also: Google's Gemini 2.5 Flash Image 'nano banana' model is generally available

I finally decided to try my personal account. I'm not paying for Gemini on that account, but it does have some Google paid features attached. That was the only way I could get Gemini to produce images. It also wouldn't continue my previous session, so there was no way to tell whether I'd have worn out my welcome by adding image requests.

That said, once I got it working, it took far less time than ChatGPT to generate images, maybe five or six seconds all told. Gemini created all four images. The Back to the Future image looked very much like Marty McFly with a skateboard, with a DeLorean ripped from the movie set. Gemini used the new Nano Banana image model, which is quite good.

Overall, Gemini is convenient because it's right there in all you do with Google. If you do a Google search, it's usually at the top of the search results, ready to siphon off traffic from the sites it scraped for its answers. Image generation is first-rate, but overall performance could and should be better from Google.

Text score: 77 out of 100
Image score: 18 out of 20

Premium offerings: The $19.99-per-month Google AI Pro plan gives you access to its higher-end AI models, along with access to a whole host of additional AI features, including expanded use of Google's enormously helpful NotebookLM tool. The $249-per-month Google AI Ultra plan gives you far more resource usage, plus free YouTube Premium.

Images generated using Gemini:


Pros
  • Great images
  • Solid answers to factual questions
  • Many Google-centric integrations
Cons
  • Wouldn't generate images in test account
  • Couldn't translate Latin
  • Unhelpful itinerary
