Is ChatGPT king? How top free AI chatbots fared during field testing

While OpenAI’s ChatGPT was the first artificial intelligence (AI)-powered chatbot to captivate the world after its public release in November 2022, a variety of competitors have entered the marketplace since then.

Tech giants Google and Microsoft have launched their own AI chatbots. Google’s Bard removed its waitlist and opened up to more than 180 countries and territories on May 10, after Microsoft beat it to the punch by fully releasing its AI-powered Bing search engine on May 4.

With several chatbots to choose from, Cointelegraph decided to put some of the most well-known through their paces to see which held up best during field testing, and to compare some of their features.

To test the chatbots, Cointelegraph asked each one a series of questions, riddles and more complex prompts to gauge the accuracy and speed of its responses.

Many AI chatbots available today are powered by OpenAI’s GPT models. While these chatbots may give similar results to ChatGPT, their developers can also add their own commands, which may change the results.

OpenAI’s ChatGPT-3.5

While OpenAI has already released ChatGPT-4, which is available to Plus plan users for $20 per month, ChatGPT-3.5 is free to use and is tested here.

ChatGPT-4 significantly outperforms its predecessor with faster response speeds, more accurate responses and less server downtime.

The first AI chatbot to take the world by storm can help with tasks like essay writing, code debugging and even personal finances after only a second or so of processing time.

However, one area where ChatGPT falls short is its inability to search the internet.

This means the model is only as good as the training data fed into it, which goes up until September 2021. OpenAI is rolling out plugins that allow it to source online information using Bing’s search API, but this will be limited to users on the Plus plan.

Despite this shortcoming in the free version, the chatbot is still usually able to suggest resources to help the user with their query, as highlighted in the interaction below.

A screenshot illustrating ChatGPT-3.5’s inability to speak of recent events. Source: OpenAI

ChatGPT-3.5 correctly answered most of the riddles it was given and all the simple math problems, but its answers were less consistently correct when it was given more complex problems.

For example, when asked to solve the quadratic equation 2t^2 + 0.3t – 0.4 = 0, ChatGPT-3.5 returned the correct answer in one out of three attempts and had similar issues multiplying larger numbers.
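For reference, the equation can be checked independently with the standard quadratic formula. The following is a minimal Python sketch and was not part of Cointelegraph's testing; the function name solve_quadratic is purely illustrative, and the code assumes real roots.

```python
import math


def solve_quadratic(a, b, c):
    """Return both real roots of a*t^2 + b*t + c = 0 (illustrative helper)."""
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        raise ValueError("no real roots")
    root = math.sqrt(discriminant)
    return (-b + root) / (2 * a), (-b - root) / (2 * a)


# The equation used in the test: 2t^2 + 0.3t - 0.4 = 0
t1, t2 = solve_quadratic(2, 0.3, -0.4)
print(round(t1, 3), round(t2, 3))  # prints 0.378 -0.528
```

The two roots, roughly 0.378 and -0.528, give a baseline against which a chatbot's answer can be compared.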

ChatGPT-3.5 can also be inaccurate when answering other questions. According to OpenAI’s testing, it correctly answered only 213 of 400 questions (about 53%) on the Uniform Bar Exam, which law graduates in the United States are required to pass before they can become practicing lawyers.

Beyond factual inaccuracies, ChatGPT-3.5 also struggled with questions designed to test its logical ability, such as the one below.

ChatGPT incorrectly answers a question designed to test its logical ability. Source: OpenAI

Microsoft’s Bing

Bing’s chatbot is based on the GPT-4 language model created by OpenAI, but it differs from ChatGPT in several key ways.

The first noticeable difference is that Bing’s chatbot takes much longer to respond to questions, with an average response time of approximately five seconds compared with about one second for OpenAI’s ChatGPT.

It also requires users to use the Microsoft Edge web browser, which is nowhere near as popular as Google Chrome.

On the positive side, Bing’s chatbot utilizes the Bing search engine in its responses, allowing it to answer questions about current events, unlike any other chatbot using GPT-4. It’s also currently available for free.

Additionally, it provides sources for its answers, letting users more easily verify claims made by the chatbot.

Microsoft’s Bing chatbot in action. Source: Bing

Given the same quadratic equation, 2t^2 + 0.3t – 0.4 = 0, Bing linked to Microsoft Math Solver but often gave an incorrect answer, and it had similar trouble with larger multiplications.

On the same logic question about the bookmark that was posed to ChatGPT-3.5, Bing correctly answered that you would expect to see the bookmark on page 120.

Google’s Bard

Google’s recently released AI chatbot is called Bard, and it runs on the company’s PaLM 2 language model.

As pointed out in a Twitter thread by AI enthusiast Moritz Kremb, it can both respond with images and be prompted with them, supports numerous programming languages and, like Bing’s chatbot, can connect to the internet.

When asked how PaLM 2 compares with GPT-4, Bard said that GPT-4 is better at generating text, but PaLM 2 is better at reasoning and logic, adding:

“Ultimately, the best language model for you depends on your needs. If you need an LLM that’s strong at reasoning and logic, then Palm 2 is the better choice. If you need an LLM that’s fast, good at generating text and has proved itself, then GPT-4 is the better choice.”

Bard correctly answered the bookmark question and it explained its answer in more depth than Bing, but the explanations were often nonsensical.

It solved most of the riddles it was given and performed well on the math questions, correctly solving the complex multiplication questions and the quadratic equation in two of the three draft answers it prepared.

YouChat

While it also uses OpenAI’s GPT-3.5, there are some differences between You.com’s YouChat and OpenAI’s ChatGPT.

It lists sources for most of the text it generates and also provides links to several web pages related to the query.

It also connects to the internet, allowing it to access current events, and because it doesn’t have the same level of popularity as OpenAI’s chatbot, downtime is not an issue.

It incorrectly answered the bookmark question, the quadratic equation and the more complex multiplication problem.

It was able to solve most of the riddles given to it but incorrectly answered some.

HuggingChat

HuggingChat is an open-source AI chatbot from the AI firm Hugging Face, released in April.

Asked to solve the same quadratic equation, HuggingChat returned 684 words of text and failed to provide an answer to the question. While it could correctly answer simple problems, it could not multiply larger numbers.

While it sometimes gave direct answers, HuggingChat often returned vast walls of text, which were relevant initially but devolved into something akin to rambling.

For example, it was asked to solve the following riddle: “A barrel of water weighed 60 pounds. Someone put something in it, and now it weighs 40 pounds. What did the person add?”

The correct answer is a hole, but HuggingChat replied “ice cubes” before launching into a 545-word monologue.

What about the rest?

There are many other AI chatbots currently available, designed for more limited use cases than the ones mentioned here, with the market likely to continue growing rapidly.

For example, Socratic is another AI chatbot from Google that can be downloaded onto a smartphone to help users answer questions on science, math, literature and more. It also provides visual explanations of concepts in different subjects and is a useful tool to aid learning.

DeepAI is an AI chatbot that specializes in writing text such as programming code, poems, stories or essays.

Conclusion

While it might be unfair to compare OpenAI’s ChatGPT-3.5 to Bing’s AI chatbot, given that they use different language models, this article is intended to look only at AI chatbots available for free.

Through Bing, users can take advantage of OpenAI’s GPT-4 language model, which is a huge improvement over its predecessor.

While Google’s Bard was promising, Bing generally performed the best of the current freely available AI chatbots, but still made some mistakes.

Other chatbots appear to target narrower use cases, where they could prove more useful, but these three seem to lead the way as development progresses.

The above represents an informal field test of different AI solutions and is by no means exhaustive or representative of Cointelegraph’s position on a particular AI solution.
