How Liquid AI Is Challenging Transformer-Based AI Models

Despite their impressive capabilities, most conventional deep learning AI models suffer from a number of limitations, such as the inability to recall previously learned knowledge after being trained on a new task (catastrophic forgetting) and the inability to keep adapting to new information (loss of plasticity).

Liquid neural networks (LNNs) are a relatively recent development that may address some of these limitations, thanks to a dynamic architecture, along with adaptive and continual learning capabilities.

Introduced in 2020 by a team of researchers from MIT, liquid neural networks are a type of time-continuous recurrent neural network (RNN) that can process sequential data efficiently. In contrast to conventional neural networks, which are generally trained once on a fixed dataset, LNNs can continue adapting to new inputs while still retaining knowledge from previously learned tasks, which helps them avoid problems like catastrophic forgetting and loss of plasticity.
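
To make "time-continuous" concrete, the hidden state of such a network is defined by an ordinary differential equation rather than a discrete layer-by-layer update. A generic form (the notation here is illustrative, not Liquid AI's exact formulation) is:

```latex
\frac{d\mathbf{x}(t)}{dt} = -\frac{\mathbf{x}(t)}{\tau} + f\bigl(\mathbf{x}(t), \mathbf{I}(t); \theta\bigr)
```

Here, x(t) is the hidden state, I(t) is the input signal, τ is a time constant, and f is a learned nonlinearity. Because the state evolves continuously, it can in principle be evaluated at arbitrary points in time, which suits irregularly sampled sequential data; as described below, LNNs go a step further by making the time constant itself depend on the input.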

Liquid AI’s new LNN-based models enhance performance while minimizing memory usage, in contrast to LLMs based on transformers.

The ‘liquid’ nature of LNNs derives from their implementation of a liquid time constant (LTC), which allows the network to adapt to new information by dynamically altering the strength of its connections while remaining robust to noise. Notably, the states of an LNN’s neurons are bounded, meaning that LNNs are not vulnerable to issues like exploding gradients, which can cause a model to become unstable.
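
As a rough illustration of that idea, here is a minimal NumPy sketch of a single LTC-style cell updated with an explicit Euler step. It follows the simplified form of the LTC dynamics from the original research; the weight names, dimensions, and step size are illustrative assumptions, not Liquid AI's actual implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ltc_cell_step(x, I, W, U, b, A, tau=1.0, dt=0.05):
    """One explicit-Euler step of a simplified liquid time-constant (LTC) cell.

    x: hidden state, I: current input, W/U/b: learned weights and bias,
    A: learned bias vector the state is pulled toward, tau: base time
    constant, dt: integration step size. All names here are illustrative.
    """
    # Bounded nonlinearity of the current state and input; because it also
    # scales the decay term below, it acts as an input-dependent ("liquid")
    # time constant rather than a fixed one.
    f = sigmoid(W @ x + U @ I + b)

    # Simplified LTC dynamics: dx/dt = -(1/tau + f) * x + f * A.
    # With f in (0, 1), the effective decay rate stays positive, which keeps
    # the state bounded and the dynamics stable.
    dxdt = -(1.0 / tau + f) * x + f * A
    return x + dt * dxdt

# Toy usage: a 4-unit cell driven by a random 3-dimensional input stream.
rng = np.random.default_rng(0)
n_hidden, n_in = 4, 3
W = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
U = rng.normal(scale=0.1, size=(n_hidden, n_in))
b = np.zeros(n_hidden)
A = rng.normal(size=n_hidden)

x = np.zeros(n_hidden)
for _ in range(200):
    x = ltc_cell_step(x, rng.normal(size=n_in), W, U, b, A)
print(x)  # final hidden state after 200 steps
```

The key point is that the nonlinearity f both drives the state and scales its decay rate, so the effective time constant changes with the input, while the bounded output of the sigmoid keeps the dynamics stable.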

Fewer But Richer Nodes

According to study co-author Ramin Hasani, the inspiration for LNNs came from the nematode C. elegans, a microscopic roundworm with only 302 neurons in its nervous system that can nevertheless “generate unexpectedly complex dynamics,” in contrast to massive deep learning neural networks with thousands of neuronal nodes. With this in mind, the team’s goal was to develop a scaled-down network with “fewer but richer nodes.”


It is these “richer” connections that allow LNNs to operate with relatively small network sizes and, consequently, fewer computational resources, while still permitting them to model complex behavior. This reduction in overall size also means the decisions that LNNs make are more transparent and “interpretable,” in comparison to larger models that function more like inscrutable “black boxes.”

In real-world terms, these features give LNNs an edge in handling a wide variety of data, from images, videos, and natural language to any kind of time series data that requires continuous learning. The smaller size and dynamic architecture of LNNs could mean a boost for robots, self-driving cars, autonomous drones, and data analysis for financial markets and medical diagnosis: basically, any situation where the systems in question might lack the capacity to store and run a large language model.

Enter Liquid AI and Liquid Foundational Models


The huge potential of LNNs has prompted their creators to take the next step: launching what they are calling Liquid Foundational Models (LFMs) via a new startup called Liquid AI (Hasani is co-founder and CEO). This new line of state-of-the-art generative AI models from Liquid AI enhances performance while minimizing memory usage, in contrast to large language models based on transformers, the now-familiar type of deep learning architecture that was introduced by Google back in 2017 and made famous by ChatGPT in 2022.

According to the company, Liquid Foundational Models differ from generative pre-trained transformer (GPT) models in that they use a hybrid computational system based on the “theory of dynamical systems, signal processing, and numerical linear algebra.” This allows LFMs to function as general-purpose models that can be trained on any type of sequential data, whether that’s video, audio, text, time series, or signals, while achieving performance similar to that of traditional deep learning models with fewer neurons.


Most notably, LFMs are much more memory-efficient than transformer-based models, particularly when it comes to long inputs. With transformer-based LLMs, the KV cache grows linearly with sequence length, driving up memory use as inputs get longer, whereas LFMs keep a comparatively small memory footprint and can therefore process longer sequences on the same hardware. Impressively, LFMs are designed to support a context length of 32K tokens, making them well-suited for complex use cases like smarter chatbots or document analysis.
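
As a rough back-of-the-envelope comparison (the model dimensions below are illustrative assumptions, not Liquid AI's published configuration), the memory a transformer spends on its KV cache grows with every token processed, while a recurrent-style model carries a fixed-size state:

```python
def kv_cache_bytes(seq_len, n_layers=32, n_heads=32, head_dim=128, dtype_bytes=2):
    """Approximate transformer KV-cache size: keys and values stored for
    every layer, head, and token (illustrative dimensions, fp16 storage)."""
    return 2 * n_layers * n_heads * head_dim * seq_len * dtype_bytes

def recurrent_state_bytes(state_dim=4096, n_layers=32, dtype_bytes=2):
    """A recurrent-style model keeps a fixed-size state per layer,
    independent of how many tokens it has already processed."""
    return n_layers * state_dim * dtype_bytes

for tokens in (1_024, 8_192, 32_768):
    print(f"{tokens:>6} tokens | KV cache ~{kv_cache_bytes(tokens) / 1e9:.1f} GB"
          f" | fixed state ~{recurrent_state_bytes() / 1e6:.2f} MB")
```

With these example dimensions, the KV cache at 32K tokens comes to roughly 17 GB, while the fixed recurrent state stays well under a megabyte; that widening gap is what the memory-efficiency claim is pointing at.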


In addition, the team’s previous research demonstrates that these systems can function as universal approximators and as expressive continuous-time machine learning systems for sequential data, that they are parameter-efficient in learning new skills, that they are causal and interpretable, and that, when linearized, they can efficiently model very long-term dependencies in sequential data.

There are currently three versions of LFMs, all of which either match or surpass transformer-based models of similar size during tests:

  • LFM-1B: At 1.3 billion parameters, this is the smallest of Liquid AI’s LFMs. It is characterized as a dense model best suited for resource-constrained environments, with initial tests indicating that this is the first time a non-GPT architecture has significantly outperformed transformer-based models of a similar size.
  • LFM-3B: A more robust mid-tier model with 3.1 billion parameters, optimized for edge deployments such as drones and mobile devices.
  • LFM-40B: Designed for running complex tasks in a cloud-based environment, this is a “mixture of experts” model with 40.3 billion parameters.

With their increased efficiency, dynamic adaptability, and multimodal capabilities, LFMs could help push generative AI tech to the next level by challenging the current dominance of GPT-based models. During its recent product launch event, the company also introduced the Liquid DevKit, which offers developers a streamlined yet comprehensive approach to building, scaling, and explaining LFMs. To find out more, you can rewatch the launch event via webcast. The company is also offering demo access to its LFMs via Liquid Playground, Lambda Chat and API, and Perplexity Labs.

