Published on 30/04/2024 | Written by Heather Wright
Slimmed down LLMs offer new use cases…
In a world where AI just keeps getting bigger and mainstream large language models (LLMs) from major vendors have become large and heavy, analyst firm Gartner is eyeing the potential for slimmed-down LLMs to expand GenAI use cases and provide viable alternatives for many Australian and New Zealand sectors.
Ray Valdes, VP analyst at Gartner, says ‘light LLMs’ are GenAI models that are light in their consumption of computational resources, such as CPU and memory, and can run completely disconnected from the internet. While the light LLMs category includes ‘small LLMs’ or small models, light LLMs go further in the use of optimised processing techniques, Valdes says.
“There are scenarios where a heavy general-purpose cloud-based LLM is not the right fit.”
A number of open source models are available, including Bloom LLM from BigScience and SQLCoder, with big-name vendors also seeking to cover their bases and add light LLMs – Google has Gemini Nano and Gemma, and Microsoft Research recently unveiled Phi-2.
Valdes told iStart light LLMs are relevant to many Australian and New Zealand organisations, with sectors including agriculture, mining, and eco-tourism likely to find the slimmed down models valuable.
Gartner is forecasting light LLMs – which it notes are still very much in the ‘emerging’ sphere – will enable new modes of deployment and new types of functions, reduce costs and help GenAI reach its ‘transformative’ potential.
“Although heavy LLMs will dominate the market over the next three years, there are important segments of the market that they cannot address,” Gartner’s Emerging Tech: Light LLMs Broaden the Spectrum of GenAI Use Cases report, of which Valdes was a co-author, says.
It is forecasting light LLMs to reach ‘early majority adoption’ – that’s 16 percent target market adoption – in one to three years, and cites BloombergGPT, a language model trained on financial data including proprietary Bloomberg data, as an early real-world example. Its makers say BloombergGPT, which is being integrated into Bloomberg’s information service for investors and financial professionals, outperforms existing models on financial tasks such as evaluating financial statements, gauging sentiment and question answering.
Valdes notes that mainstream established LLMs from major vendors such as Microsoft, Google and Amazon have gotten large and heavy and can only run in huge cloud data centres run by those vendors.
“For many purposes, this is totally fine,” he says. “However, there are now scenarios where a heavy general-purpose cloud-based LLM is not the right fit.”
That’s where light LLMs could have a big role to play in the future.
They can run on smaller devices, including a server in a company data centre, a laptop or even on phones or edge devices.
“Of course, to deploy AI at these extremes requires tradeoffs,” Valdes notes. These include reduced speed and less complete answers to questions, but he says for many uses that tradeoff is worth it.
“Not only can light LLMs be deployed on small devices, they can also run entirely disconnected from the internet, relying solely on on-device resources. There are certain sectors that could find these capabilities valuable, for example agriculture, mining, eco-tourism, fishing and other situations where internet access is not reliable or available.”
Law enforcement and the military are also likely to find uses for light LLMs, he says – and not just for disconnected operations but also when there is data that needs to be private or confidential.
“Although major cloud vendors like Microsoft and Google are reputable and are trusted by many businesses, there are still some organisations that don’t want to put their own private data in a vendor’s cloud and instead need to store this sensitive data on a local device or local data centre.”
The light LLMs can also be trained or fine-tuned at low cost on task-specific user data, such as code and document repositories, and can be customised in various ways, including embedding in applications. Models can also be orchestrated to work together, often in conjunction with taking action, rather than just producing verbal output.
The Emerging Tech report says there is strong market demand for LLMs that can run in an enterprise data centre or in disconnected mode, without a cloud connection; leverage company data while keeping it private; be tuned to the needs of a particular vertical; and run on small devices at the edge of a network.
Other market concerns which could play into the rise of light LLMs include growing concerns about the cost of operations as LLM usage increases, and the growing carbon footprint of GenAI, with its heavy computational requirements.
Light LLMs, the report says, use techniques including more efficient algorithms and lower-precision numerics to reduce computational requirements.
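To give a rough sense of what lower-precision numerics buys, here is a minimal sketch (not from the report) using NumPy: casting a hypothetical layer’s weights from 32-bit to 16-bit floats halves their memory footprint, which is one reason reduced precision lets models fit on laptops and edge devices. The matrix size is an arbitrary illustration, not a real model’s dimensions.

```python
import numpy as np

# Hypothetical weight matrix for one layer of a model (size chosen for illustration).
weights_fp32 = np.random.rand(4096, 4096).astype(np.float32)

# Casting to half precision halves the bytes needed to store the same weights.
weights_fp16 = weights_fp32.astype(np.float16)

print(f"fp32: {weights_fp32.nbytes / 1024**2:.0f} MiB")  # 64 MiB
print(f"fp16: {weights_fp16.nbytes / 1024**2:.0f} MiB")  # 32 MiB
```

Real light-LLM deployments typically go further, for example quantising weights to 8-bit or 4-bit integers, trading a little answer quality for much lower memory and compute – the tradeoff Valdes describes above.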
“Light LLMs benefit from ongoing innovations in more efficient training, inference, tooling and frameworks,” Gartner says.
“There will continue to be a steady stream of releases by both existing players as well as new entrants. In addition, technical skill sets in implementing LLM-based solutions are becoming more widespread — both among systems integrators as well as in-house teams in organisations that are early adopters of technology — in terms of skills, culture and practice.”
But while Valdes might be optimistic about the potential of light LLMs for Australian and New Zealand organisations, he also sounds a note of caution, saying the technology is still in the emerging stage of its evolution.
“Actually, the GenAI sector as a whole can still be considered immature and emerging. The technology will continue to evolve and improve. Today is the worst it will ever be. Tomorrow will be better.”