Only 2.4% in math: Is ChatGPT turning dumb?

July 20, 2023

Why is ChatGPT in the news?

Recently, researchers Lingjiao Chen and James Zou from Stanford University, and Matei Zaharia from UC Berkeley, tested GPT-3.5 and GPT-4 on four tasks: solving math problems, answering sensitive and dangerous questions, generating code, and visual reasoning. The conclusion: the “performance and behaviour” of both these large language models (LLMs) “can vary greatly over time”. The March version of GPT-4 identified prime numbers with 97.6% accuracy. In the June version, accuracy collapsed to 2.4%. Both models made “more formatting mistakes in code generation in June than in March”.

How did other experts react?

When the findings were published, AI expert Gary Marcus tweeted that “this instability will be LLMs’ undoing”. Jim Fan, senior scientist at Nvidia, opined that in a bid to make GPT-4 “safer”, OpenAI could have made it less useful, “leading to a possible degradation in cognitive skills”. He added that in a bid to cut costs, OpenAI could have reduced the number of parameters in the model. Princeton professor of computer science Arvind Narayanan and a PhD student at the same university co-authored a response in which they argue, among other things, that variance in behaviour does not suggest a degradation in capability.

How is OpenAI reacting to this controversy?

Reacting to user criticism, Peter Welinder, vice-president of OpenAI, which owns ChatGPT, said GPT-4 was getting smarter with each new version. “When you use it more heavily, you start noticing issues you didn’t see before.” Logan Kilpatrick, lead of developer relations at OpenAI, tweeted: “We are actively looking into the reports people shared.”

What does this mean for users and cos?

Human resources tasks like onboarding, training, performance management, and employee queries and complaints can be automated using ChatGPT. But to integrate OpenAI’s application programming interfaces (APIs) with the business workflows of companies, one has to continuously monitor, retrain and fine-tune the models to ensure that they continue to produce accurate output and stay up-to-date. Variance in AI model behaviour only makes it a bigger challenge.
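One way a company might monitor for this kind of variance is to re-run a fixed set of questions against the model at intervals and compare the score with an earlier baseline. The Python sketch below illustrates the idea using the prime-number task from the study. It is only an illustration under stated assumptions: `ask_model` is a hypothetical stand-in for whatever API call a company actually makes, the yes/no answer format is assumed, and the 5-point tolerance is arbitrary.

```python
from typing import Callable, Iterable


def is_prime(n: int) -> bool:
    """Ground truth by trial division; adequate for small test numbers."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def accuracy(ask_model: Callable[[int], str], numbers: Iterable[int]) -> float:
    """Fraction of numbers where the model's yes/no answer matches the truth.

    ask_model is a hypothetical callable that takes a number and returns
    the model's reply as text, assumed to begin with "yes" or "no".
    """
    numbers = list(numbers)
    if not numbers:
        raise ValueError("need at least one test number")
    correct = 0
    for n in numbers:
        said_prime = ask_model(n).strip().lower().startswith("yes")
        if said_prime == is_prime(n):
            correct += 1
    return correct / len(numbers)


def has_drifted(baseline: float, current: float, tolerance: float = 0.05) -> bool:
    """Flag a drop in accuracy larger than the (arbitrary) tolerance."""
    return baseline - current > tolerance


# Stand-in for a real model call: this fake model always answers "No".
def always_no(n: int) -> str:
    return "No"


test_numbers = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
score = accuracy(always_no, test_numbers)
```

A check such as `has_drifted(0.976, 0.024)` would flag the drop reported in the study. In practice the test set would be far larger and drawn from the company's own workflows rather than from prime numbers.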

Is it a boost for open-source LLMs?

The day the paper was released, Meta too released Llama 2, the second version of its free open-source LLM, for research and commercial use, providing an alternative to pricey proprietary offerings such as OpenAI’s ChatGPT Plus and Google’s Bard. Interestingly, Databricks Inc., whose CTO is Zaharia (one of the paper’s authors), has open-sourced its LLM called Dolly 2.0. Hugging Face’s BigScience Large Open-science Open-access Multilingual Language Model (BLOOM), too, is open to researchers to run.
