Friday, November 22, 2024

Five trends that may change the course of Generative AI models


While the potential of these models shows up in the numbers, with ChatGPT garnering more than 100 million users since December, they have also alarmed many, not only because they appear to think and act like humans but also because they can reproduce the work of renowned writers and artists in seconds and could replace thousands of routine jobs. I’ve listed five trends to watch in this space; the list is not exhaustive.


 

1. Rise of smaller open-source LLMs

For those new to this field, even a cursory reading of the history of technology will reveal that big tech companies like Microsoft and Oracle were strongly opposed to open-source technologies but embraced them after realizing that they couldn’t survive without doing so. Open-source language models are demonstrating this once again.

In a leaked document accessed by SemiAnalysis, a Google employee claimed, “Open-source models are faster, more customizable, more private, and pound-for-pound more capable. They are doing things with $100 and 13B params (parameters) that we struggle with at $10M (million) and 540B (billion). And they are doing so in weeks, not months.” The employee believes that people will not pay for a restricted model when free, unrestricted alternatives are comparable in quality. The employee added that “giant models are slowing us down. In the long run, the best models are the ones which can be iterated upon quickly. We should make small variants more than an afterthought now that we know what is possible in the < 20B parameter regime”.

 



Google may or may not subscribe to this point of view, but the fact is that open-source LLMs have not only come of age but also give developers a lighter and much more flexible option. Developers, for instance, are flocking to LLaMA, Meta’s open-source LLM. Meta’s Large Language Model Meta AI (LLaMA) requires “far less computing power and resources to test new approaches, validate others’ work, and explore new use cases”, according to Meta. Foundation models train on a large set of unlabelled data, which makes them ideal for fine-tuning on a variety of tasks. Meta made LLaMA available in several sizes (7B, 13B, 33B, and 65B parameters) and also shared a LLaMA model card that detailed how it built the model, in contrast to the lack of transparency at OpenAI.

According to Meta, smaller models trained on more tokens (pieces of words) are easier to re-train and fine-tune for specific potential product use cases. Meta says it has trained LLaMA 65B and LLaMA 33B on 1.4 trillion tokens. Its smallest model, LLaMA 7B, is trained on one trillion tokens. Like other LLMs, LLaMA takes a sequence of words as input and predicts the next word to generate text recursively. Meta says it chose text from the 20 languages with the most speakers, focusing on those with Latin and Cyrillic alphabets, to train LLaMA.
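To make the "predict the next word, then repeat" loop concrete, here is a minimal sketch in Python. It is not how LLaMA works internally: it swaps the neural network for a toy word-pair counter, and the corpus, the function names `train_bigram` and `generate`, and the greedy "pick the most frequent follower" rule are all assumptions chosen for illustration. Only the outer loop, feeding each predicted word back in as input, mirrors the description above.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def generate(follows, prompt, max_new_words=5):
    """Repeatedly predict the most frequent next word and append it."""
    words = prompt.split()
    for _ in range(max_new_words):
        candidates = follows.get(words[-1])
        if not candidates:
            break  # the last word was never followed by anything in training
        next_word = candidates.most_common(1)[0][0]
        words.append(next_word)  # the prediction becomes part of the next input
    return " ".join(words)


# Hypothetical toy corpus, chosen only for illustration.
corpus = "the model reads text and the model reads the next word"
model = train_bigram(corpus)
print(generate(model, "the", max_new_words=2))  # prints: the model reads
```

A real LLM replaces the counting table with billions of learned parameters and works on tokens rather than whole words, but it generates text with the same append-and-repeat loop.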

Similarly, Low-Rank Adaptation of Large Language Models (LoRA) reduces the number of trainable parameters, which, according to its authors, lowers the storage requirement for LLMs adapted to specific tasks and enables efficient task-switching during deployment without added inference latency. They also state that “LoRA also outperforms several other adaptation methods, including adapter, prefix-tuning, and fine-tuning”. In simple terms, developers can use LoRA to fine-tune LLaMA.
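The saving comes from simple arithmetic. LoRA keeps a weight matrix frozen and trains only two thin matrices whose product is added to it. The sketch below counts the parameters in each case and shows the two thin matrices being folded back into the frozen weights, which is why no extra work is needed at inference time. The 4096 x 4096 layer size, the rank of 8, and the helper names are illustrative assumptions, not figures from the article or from any specific model.

```python
def full_finetune_params(d_out, d_in):
    """Trainable values when the whole d_out x d_in weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Trainable values when only B (d_out x rank) and A (rank x d_in) are updated."""
    return rank * (d_out + d_in)


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, b, a):
    """Fold the low-rank update into the frozen weights: W' = W + B @ A."""
    delta = matmul(b, a)
    return [[wv + dv for wv, dv in zip(w_row, d_row)] for w_row, d_row in zip(w, delta)]


# Assumed layer size and rank, for illustration only.
full = full_finetune_params(4096, 4096)
lora = lora_params(4096, 4096, rank=8)
print(full, lora, full // lora)  # prints: 16777216 65536 256

# Tiny merge example: a 2x2 frozen matrix with a rank-1 update.
frozen = [[1, 0], [0, 1]]
b = [[1], [2]]   # 2 x 1
a = [[3, 4]]     # 1 x 2
print(merge_lora(frozen, b, a))  # prints: [[4, 4], [6, 9]]
```

Under these assumptions the adapter trains 256 times fewer values for that one layer, and a different task only needs a different pair of small matrices rather than a new copy of the model.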

Pythia (from EleutherAI, which itself is likened to an open-source version of OpenAI) comprises 16 LLMs that have been trained on public data and range in size from 70M to 12B parameters.

Databricks Inc. released its LLM called Dolly in March, which it “trained for less than $30 to exhibit ChatGPT-like human interactivity”. A month later, it released Dolly 2.0, a 12B parameter language model based on the EleutherAI Pythia model family “and fine-tuned exclusively on a new, high-quality human-generated instruction following dataset, crowdsourced among Databricks employees”. The company has open-sourced Dolly 2.0 in its entirety, including the training code, dataset and model weights for commercial use, enabling any organization to create, own, and customize powerful LLMs without paying for API access or sharing data with third parties.

Of course, we cannot ignore Hugging Face’s BigScience Large Open-science Open-access Multilingual Language Model (BLOOM) that has 176 billion parameters and is able to generate text in 46 natural languages and 13 programming languages. Researchers can download, run and study BLOOM to investigate the performance and behaviour of recently-developed LLMs. The open-source LLM march has only begun.

2. Is Generative AI really smart?

The power of LLMs, as I have pointed out often in earlier newsletters too, stems from the use of transformer neural networks that are able to read many words (sentences and paragraphs, too) simultaneously, figure out how they are related, and predict the following word. LLMs such as GPT and chatbots like ChatGPT are trained on billions of words from sources such as the internet, books, Common Crawl and Wikipedia, which makes them more “knowledgeable but not necessarily more intelligent” than most humans since they may be able to connect the dots but not necessarily understand what they spew out. This implies that while LLMs such as GPT-3 and models like ChatGPT may outperform humans at some tasks, they may not comprehend what they read or write as we humans do. Moreover, these models rely on human supervisors to make them more sensible and less toxic.
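The "read many words at once and figure out how they are related" step is usually called self-attention. The sketch below is a heavily simplified version in plain Python: every word vector is compared with every other word vector, the similarity scores are turned into weights that sum to one, and each word's output is a weighted blend of all the words. The three two-dimensional vectors are made up for illustration, and real transformers add learned query, key and value projections, multiple attention heads and position information, none of which appear here.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product attention where queries, keys and values are the inputs."""
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Compare this word with every word in the sequence, itself included.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # Blend all word vectors according to how related they are to this word.
        blended = [
            sum(w * vec[j] for w, vec in zip(weights, vectors))
            for j in range(dim)
        ]
        outputs.append(blended)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical embeddings: words 0 and 2 point in nearly the same direction.
embeddings = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
outputs, weights = self_attention(embeddings)
for row in weights:
    print([round(w, 2) for w in row])
```

In the printed weights, word 0 pays more attention to word 2 than to word 1 because their vectors are more alike, which is the sense in which the network "figures out how words are related".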

A new paper by lead author Rylan Schaeffer, a second-year graduate student in computer science at Stanford University, only confirms this line of thinking. “With bigger models, you get better performance,” he says, “but we don’t have evidence to suggest that the whole is greater than the sum of its parts.” In the paper, titled ‘Are Emergent Abilities of Large Language Models a Mirage?’, the researchers conclude that “we find strong supporting evidence that emergent abilities may not be a fundamental property of scaling AI models”.

That said, the developments in the field of AI (and Generative AI) are too rapid for anyone to stick to any one point of view, so all I can say for now is let’s hold our horses till we get more data from the opaque LLMs of OpenAI and Google.

3. Dark side of Generative AI

Alarm bells started ringing louder when Geoffrey Hinton, one of the so-called godfathers of AI, quit Google on 1 May. His reason, according to The New York Times, was that “…he can freely speak out about the risks of AI”. “A part of him, he said, now regrets his life’s work”. Hinton, who obviously deeply understands the technology, said in the above-cited NYT article, “It is hard to see how you can prevent the bad actors from using it for bad things”.

Hinton’s immediate concern, according to the article, is that “the internet will be flooded with false photos, videos and text, and the average person will ‘not be able to know what is true anymore’. He is also worried that AI technologies will, in time, upend the job market.” The fear is that Generative AI is only getting smarter with each passing day, and researchers are unable to understand the ‘How’ of it. Simply put, since large language models (LLMs) like GPT-4 are self-supervised or unsupervised, researchers cannot fully explain how they train themselves and arrive at their conclusions (hence, the term ‘black box’). Further, Tencent, for instance, has reportedly launched a ‘Deepfakes-as-a-Service’ for $145 that needs just three minutes of live-action video and 100 spoken sentences to create a high-definition digital human.

You can read more about this here and here.

4. Generative AI for enterprises

AI was discussed by 17% of CEOs in the January-March quarter of this calendar year, spurred by the release of ChatGPT and the discussions around its potential use cases. Generative AI was specifically discussed in 2.7% of all earnings calls, and conversational AI was mentioned in 0.5% of all earnings calls, up from zero mentions in the October-December quarter, according to the latest ‘What CEOs talked about’ report by IoT Analytics, a Germany-based markets insight and strategic business intelligence provider.

Generative AI multi-modal models and tools, including ChatGPT, Dall-E, Midjourney, Stable Diffusion, Bing, Bard, and LLaMA, are making waves not only due to their ability to write blogs and reviews, create images, make videos, and generate software code, but also because they can aid in speeding up new drug discovery, create entirely new materials, and generate synthetic data too.

That said, once companies adopt Generative AI models, they will need to continuously monitor, re-train, and fine-tune them to ensure the models continue to produce accurate output and stay up-to-date. Further, integrating the models’ application programming interfaces (APIs) with the business workflows of other units has its own set of challenges for companies. Nevertheless, given the frenetic pace at which these models are being trained, and pending the introduction of ChatGPT Business, business executives would benefit from being proactive.

5. Global guardrails are falling into place

The European Union’s AI Act, for instance, now proposes that AI tools should be classified according to their perceived risk level — from minimal to limited, high, and unacceptable.

The US-based National Artificial Intelligence Advisory Committee (NAIAC), among other things, states: “We understand that trustworthy AI is not possible without public trust, and public trust cannot be attained without clear mechanisms for its transparency, accountability, mitigation of harms, and redress. The Administration should require an approach that protects against these risks while allowing the benefits of values-based AI services to accrue to the public.”

India, too, needs to act fast to prevent the unbridled AI horse from running amok. You can read more about this in my earlier newsletter: ‘We must rein in the precocious Generative AI children. But how?’

This article is this week’s edition of Leslie D’Monte’s Tech Talk newsletter. Subscribe here.
