How Foundation Models Are Transforming AI

Sujoy Roy
5 min read · Mar 11, 2024


A broader category of generative AI

Image: Generated by DALL·E 3 via text prompt

There are two broad types of AI: traditional and generative. While traditional AI analyzes data and tells you what it sees, generative AI uses that same data to create new things.
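To make that distinction concrete, here is a minimal sketch in Python using the Hugging Face transformers library (my choice of tooling for illustration, not something this article prescribes). The first model only labels existing text; the second produces new text from a prompt.

```python
# A small illustration: discriminative ("traditional") AI labels what it
# sees, while generative AI produces something new from the same input.
from transformers import pipeline

# Traditional AI: analyze data and report what it sees (a sentiment label).
classifier = pipeline("sentiment-analysis")
print(classifier("The launch went flawlessly."))
# -> [{'label': 'POSITIVE', 'score': 0.99...}]

# Generative AI: create new content from a prompt (a text continuation).
generator = pipeline("text-generation", model="gpt2")
print(generator("The launch went flawlessly, and next we plan to",
                max_new_tokens=25)[0]["generated_text"])
```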

Training a new large language model is a bit like launching a rocket.

It’s exciting,

It’s resource intensive,

It requires massive computing power,

and the training process takes months.

You have to plan and prepare extensively to ensure you have the newest and best technologies ready. Once training starts on the GPUs, the rocket has left the pad: you can't make any more changes to the design, and new ideas have to wait for the next launch.

Just as rocket launches push the boundaries of science, large language models (LLMs) and the broader category of generative AI, called foundation models, are changing how we use AI.

What is a foundation model?

Foundation models (FMs) are big neural networks trained on vast datasets, altering how data scientists tackle machine learning (ML). Instead of building AI from scratch, they use FMs as a base to develop ML models for new applications efficiently. Researchers coined the term “foundation model” to describe ML models trained on diverse, unlabeled data, capable of various tasks like understanding language, generating text and images, and engaging in natural language conversations.
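As a deliberately toy sketch of "FM as a base": the snippet below, which assumes PyTorch and the Hugging Face transformers library, puts a fresh two-class head on top of pretrained BERT weights and takes one fine-tuning step, rather than training a network from scratch. The example texts and labels are made up for illustration.

```python
# Toy fine-tuning sketch: reuse pretrained BERT as the base of a new
# two-class text classifier instead of training from scratch.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # new task head over pretrained weights
)

texts = ["refund my order", "great product, thank you"]  # illustrative data
labels = torch.tensor([0, 1])                            # 0 = complaint, 1 = praise

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
batch = tokenizer(texts, padding=True, return_tensors="pt")
loss = model(**batch, labels=labels).loss  # forward pass with training loss
loss.backward()
optimizer.step()  # one step of task-specific fine-tuning
print(f"training loss: {loss.item():.3f}")
```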

What is unique about foundation models?

A unique feature of foundation models is their adaptability. Driven only by input prompts, a single model can perform a wide range of disparate tasks with a high degree of accuracy, including natural language understanding, question answering, and image classification. A traditional ML model, by contrast, is built and trained for one narrow task, such as sentiment analysis. Foundation models instead serve as base models for developing many specialized applications, the result of over a decade of evolution in size and complexity.
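To illustrate that prompt-driven adaptability, the sketch below feeds three different task prompts to a single instruction-tuned model (FLAN-T5, chosen here purely as a small runnable example; any capable foundation model would do):

```python
# One foundation model, several tasks, selected purely by the prompt.
from transformers import pipeline

fm = pipeline("text2text-generation", model="google/flan-t5-small")

prompts = [
    "Translate English to German: Where is the train station?",             # translation
    "Answer the question: What is the capital of France?",                  # QA
    "Is this review positive or negative? Review: Terrible battery life.",  # sentiment
]
for p in prompts:
    print(fm(p, max_new_tokens=20)[0]["generated_text"])
```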

Consider the growth in scale. BERT, released in 2018, had 340 million parameters and was trained on roughly 16 GB of text. Just five years later, GPT-4 arrived; OpenAI has not disclosed its size, but estimates put it at hundreds of billions to over a trillion parameters, trained on terabytes of data rather than gigabytes. According to OpenAI, the compute used in the largest AI training runs has doubled roughly every 3.4 months since 2012. Modern FMs like Claude 2 and Llama 2, and the text-to-image model Stable Diffusion, handle diverse tasks across domains, from writing blog posts to solving math problems.
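A quick back-of-the-envelope calculation shows what a 3.4-month doubling time implies:

```python
# If the compute behind the largest training runs doubles every 3.4 months,
# it grows roughly an order of magnitude per year.
months_per_doubling = 3.4
per_year = 2 ** (12 / months_per_doubling)
per_five_years = 2 ** (12 * 5 / months_per_doubling)
print(f"~{per_year:.0f}x per year, ~{per_five_years:.1e}x over five years")
# -> roughly 12x per year, on the order of 2e5x over five years
```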

An introduction to generative AI models

Traditional AI processes data to interpret existing information, while generative AI leverages data to innovate. This capability is crucial for businesses, enabling enhancements in customer service, code development, and document analysis.

New use cases are emerging constantly, empowering companies to boost productivity, cut costs, and explore new business avenues. In contrast, traditional machine learning is limited in scope, designed for specific tasks and requiring extensive human involvement.

Foundation models are expansive and versatile, leveraging unsupervised learning to train on large, unlabeled datasets. These models can then be customized for various applications. Their capabilities evolve rapidly, enabling a wide range of tasks.

As generative AI becomes a key factor in business success, it’s crucial to innovate and actively engage in discussions about AI’s future direction. By developing expertise within your organization, you can stay ahead of the curve and avoid simply imitating others’ strategies.

Getting started with generative AI

When starting out, developing expertise is crucial. First, assemble a team that can work proficiently with foundation models, and give it room to experiment with new models and prototype various use cases. Second, select a low-risk internal use case for testing: develop a prototype, assess how it would be deployed, and apply the insights as your team gains experience. Finally, hold detailed discussions to identify the key value and revenue drivers that generative AI can help unlock.

Along the way, you must work out the trustworthiness and regulatory requirements your models must meet in production. These questions become even more important as you move from experimentation to building models for real applications with business impact. Above all, you need to operate with responsibility and transparency.

You need to be clear about data collection, demonstrating what’s included and excluded, and how it’s managed. Explain how your AI makes decisions to ensure fairness, trustworthiness, and compliance with regulations.

Choosing the right evaluation metrics is crucial: they should reflect your business needs and assess the model's strength, fairness, scalability, and deployment cost. Evaluating generative AI can be challenging, but weighing these dimensions may reveal that some use cases aren't worth the cost or risk of running a large model in the cloud.
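One way to make that trade-off explicit is to score each candidate model on your own task and compare accuracy gained against cost added. The sketch below is purely illustrative; the accuracy and cost numbers are hypothetical placeholders, not benchmarks.

```python
# Hypothetical comparison: accuracy gained per extra dollar of serving cost.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    accuracy: float             # measured on your own held-out task data
    usd_per_1k_requests: float  # measured or quoted serving cost

def gain_per_dollar(candidate: Candidate, baseline: Candidate) -> float:
    gain = candidate.accuracy - baseline.accuracy
    extra_cost = candidate.usd_per_1k_requests - baseline.usd_per_1k_requests
    return gain / extra_cost if extra_cost > 0 else float("inf")

small = Candidate("small-specialized", accuracy=0.91, usd_per_1k_requests=0.40)
large = Candidate("large-cloud-llm", accuracy=0.93, usd_per_1k_requests=8.00)

# Two accuracy points at 20x the cost may not be worth it for this use case.
print(f"{gain_per_dollar(large, small):.4f} accuracy points per extra dollar")
```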

Instead of relying on a single massive model, smaller specialized models can suffice. These models, typically with millions to a few billion parameters rather than hundreds of billions, can be just as effective for specific tasks. They offer significant cost savings and are easier to deploy on premises, reducing deployment risk.
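For a sense of scale, the snippet below counts the parameters of one such small, task-specific model (a DistilBERT sentiment classifier, used here only as an example); at tens of millions of parameters it fits comfortably on commodity hardware.

```python
# Footprint check: a small specialized model vs. frontier-scale LLMs.
from transformers import AutoModelForSequenceClassification

small = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english"  # task-specific sentiment model
)
n_params = sum(p.numel() for p in small.parameters())
print(f"{n_params / 1e6:.0f}M parameters")  # ~67M: runs on a laptop CPU
```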

Using platforms to manage models

When starting with generative AI, it's recommended to begin with a pre-trained model and make light customizations through tuning, as sketched below. This lets you tailor the model to specific use cases while leveraging the broad capabilities it already has. Regularly updating pre-trained models, and retraining foundation models several times a year, is essential to reflect change and to comply with evolving regulations and best practices.
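"Light customization" often means parameter-efficient tuning rather than full retraining. The sketch below (assuming the Hugging Face peft library, with GPT-2 standing in as a small base model) attaches low-rank LoRA adapters so that training touches well under one percent of the weights:

```python
# Parameter-efficient tuning sketch: freeze the pretrained base model and
# train only small low-rank (LoRA) adapter matrices injected into attention.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")
config = LoraConfig(
    r=8,                        # adapter rank
    lora_alpha=16,              # adapter scaling factor
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    lora_dropout=0.05,
)
model = get_peft_model(base, config)
model.print_trainable_parameters()
# e.g. roughly 0.3M trainable out of ~125M total parameters (well under 1%)
```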

Choose a platform with expertise in foundation models and governance tools to address ethical concerns and facilitate the transition from experimentation to deployment. As you gain experience and confidence, you can eventually manage and expand models independently.

Although navigating the complexity of AI and foundation models requires effort, the benefits in terms of business success and societal progress make it worthwhile.

Final Thoughts

Embarking on the generative AI journey requires meticulous planning and readiness with cutting-edge technology. Once training begins, much like a rocket after launch, the design can no longer be changed, and new ideas must wait for the next attempt. And just as rocket launches advance scientific frontiers, large language models and the broader family of foundation models mark a paradigm shift in how we use AI.

Foundation models, trained on extensive datasets, redefine machine learning by offering versatility, adaptability, and scalability across tasks. As generative AI becomes a pivotal driver of business differentiation, organizations must cultivate expertise and embrace transparency and responsibility in their AI efforts. Through careful evaluation of use cases, disciplined model management, and strategic use of platforms, businesses can harness generative AI's transformative potential to drive innovation, efficiency, and growth.

Reference: MIT-IBM Watson AI Lab, AI Academy series “Why foundation models are a paradigm shift for AI”.


Sujoy Roy

A technology enthusiast and engineer who likes to speak on #ArtificialIntelligence #Tech #DigitalTransformation #CloudComputing #Fintech. Follow me @sujoyshub.