Red Hat progresses AI offerings for accelerated implementation

Red Hat is making progress in expanding its enterprise AI offering. Thanks to validated third-party AI models and the integration of Llama Stack and the Model Context Protocol, AI implementations at companies should accelerate. Companies will also gain more freedom of choice, and confidence in generative AI applications within hybrid cloud environments should increase.

At the Red Hat Summit in Boston, the open source company is presenting several updates that offer companies more opportunities to deploy generative AI effectively. One of the most important additions is the Red Hat AI Third Party Validated Models, which will be available through Hugging Face. This collection makes it easier for large enterprises to find AI models that meet their specific needs. In addition to the validated models themselves, Red Hat offers implementation guidelines that increase confidence in model performance and the reproducibility of results.

Red Hat has further optimized certain models to make the offering more useful. This involves applying compression techniques to reduce model size and increase inference speed. This helps organizations minimize resource consumption and operational costs. Thanks to the ongoing validation process, customers also stay up to date with the latest optimizations in generative AI innovation.
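To make the idea of compression concrete, here is a minimal, illustrative sketch of symmetric int8 post-training quantization, one common compression technique; this is not Red Hat's actual optimization pipeline, just a toy example of how shrinking weights from 32-bit floats to 8-bit integers roughly quarters storage and speeds up inference:

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map floats to [-127, 127] with one shared scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.02, -1.5, 0.73, 3.1, -0.004]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# Each quantized weight fits in 1 byte instead of the 4 bytes of a float32;
# the reconstruction error is bounded by half the quantization step (scale / 2).
```

Production-grade compression (per-channel scales, calibration data, sparsity) is considerably more involved, but the storage/accuracy trade-off works on the same principle.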

Standardized APIs for AI applications

A second important innovation is the integration of Llama Stack (originally developed by Meta) and Anthropic’s Model Context Protocol (MCP). These additions provide developers with standardized APIs for building and deploying AI applications and agents.

Llama Stack, currently available as a developer preview in Red Hat AI, provides a unified API for inference with vLLM, retrieval-augmented generation (RAG), model evaluation, guardrails, and agents across various generative AI models. MCP enables models to be integrated with external tools by providing a standardized interface for connecting APIs, plugins, and data sources in agent workflows.
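To illustrate what that standardized interface looks like on the wire: MCP frames its messages as JSON-RPC 2.0, with methods such as `tools/call` for invoking a tool from an agent. The sketch below builds such a request with only the standard library; the tool name and arguments are invented for illustration and do not refer to any real MCP server:

```python
import json

def mcp_tool_call(request_id, tool_name, arguments):
    """Build an MCP tools/call request (MCP frames messages as JSON-RPC 2.0)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

# Hypothetical tool: a ticket-lookup API exposed to an agent via MCP.
msg = mcp_tool_call(1, "lookup_ticket", {"ticket_id": "RH-1234"})
```

Because every MCP server accepts this same framing, an agent built against the protocol can swap data sources and plugins without changing its integration code, which is precisely the portability argument behind adopting MCP.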

Tip: How the Model Context Protocol is taking the AI world by storm

Updates for Red Hat OpenShift AI

Of course, OpenShift AI also plays a major role in the new enterprise AI offering. The latest version of Red Hat OpenShift AI (v2.20) includes several enhancements for building, training, deploying, and monitoring both generative and predictive AI models at scale.

For example, an optimized model catalog (in technology preview) provides easy access to validated Red Hat and third-party models. This catalog enables users to deploy models on Red Hat OpenShift AI clusters via the web interface and manages the lifecycle of these models using an integrated registry.

In addition, distributed training via the KubeFlow Training Operator enables InstructLab model tuning and other PyTorch-based training workloads to be distributed across multiple Red Hat OpenShift nodes and GPUs. This includes distributed RDMA network acceleration and optimized GPU utilization to reduce costs.
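For readers unfamiliar with the Kubeflow Training Operator: distributed PyTorch training is declared as a `PyTorchJob` custom resource, and the operator schedules the master and worker replicas across nodes and GPUs. The fragment below is a hedged sketch of the general shape of such a resource; the job name and container image are placeholders, not Red Hat artifacts:

```yaml
apiVersion: kubeflow.org/v1
kind: PyTorchJob
metadata:
  name: instructlab-tuning            # hypothetical job name
spec:
  pytorchReplicaSpecs:
    Master:
      replicas: 1
      restartPolicy: OnFailure
      template:
        spec:
          containers:
            - name: pytorch
              image: example.registry/train:latest   # placeholder image
              resources:
                limits:
                  nvidia.com/gpu: 1
    Worker:
      replicas: 3                     # spread across additional OpenShift nodes
      restartPolicy: OnFailure
      template:
        spec:
          containers:
            - name: pytorch
              image: example.registry/train:latest   # placeholder image
              resources:
                limits:
                  nvidia.com/gpu: 1
```

The operator handles rendezvous between the replicas, so the same training script scales from one GPU to many by changing the replica counts.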

A feature store (also in technology preview), based on the upstream Kubeflow Feast project, provides a centralized repository for managing and serving data for both model training and inference. This streamlines data workflows and improves model accuracy and reusability.
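The core idea of a feature store is that each feature is defined once and served consistently to both training pipelines and online inference, avoiding train/serve skew. The toy in-memory sketch below illustrates that concept only; Feast itself adds persistent offline/online stores, point-in-time joins, and much more:

```python
class FeatureStore:
    """Toy feature store: one definition of each feature, served to both
    training pipelines and online inference (the core idea behind Feast)."""

    def __init__(self):
        self._features = {}  # (entity_id, feature_name) -> value

    def ingest(self, entity_id, features):
        """Register feature values for an entity (e.g. a user or device)."""
        for name, value in features.items():
            self._features[(entity_id, name)] = value

    def get_online(self, entity_id, names):
        """Low-latency lookup for inference."""
        return {n: self._features.get((entity_id, n)) for n in names}

    def get_training_rows(self, entity_ids, names):
        """Batch retrieval for building a training set from the same values."""
        return [self.get_online(e, names) for e in entity_ids]

store = FeatureStore()
store.ingest("user-1", {"clicks_7d": 12, "avg_session_min": 3.4})
online = store.get_online("user-1", ["clicks_7d"])
```

Because training rows and online lookups read from the same registry, the model sees identical feature values in both phases, which is what "improves model accuracy and reusability" refers to in practice.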

Tip: Red Hat makes OpenShift the hybrid AI and cloud pioneer

RHEL AI is also moving forward

Red Hat Enterprise Linux AI is also getting an upgrade with the launch of version 1.5. This version of the platform for developing, testing, and running large language models (LLMs) provides greater availability in the public cloud. The platform is coming to the Google Cloud Marketplace, giving users a third option alongside AWS and Azure. This addition simplifies the implementation and management of AI workloads on Google Cloud.

In addition, the InstructLab project is gaining improved multilingual capabilities for Spanish, German, French, and Italian, an important step from an international perspective. Models can be customized for these languages using custom scripts, opening up new possibilities for multilingual AI applications. Users can also import their own teacher models for more control over model customization and testing for specific use cases and languages. Support for Japanese, Hindi, and Korean is planned for the future.

Red Hat AI Inference Server

The newly announced Red Hat AI Inference Server is another significant component of the new enterprise AI offering. It is designed to provide high-performance inference for generative AI applications. With its introduction, companies can build a complete stack for the development, training, and deployment of AI models within their existing technology infrastructure. We discuss this server in detail in a separate article.
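Inference servers in the vLLM ecosystem typically expose an OpenAI-compatible REST endpoint, so applications talk to them as they would to any chat-completions API. As a hedged sketch, the following builds such a request with the standard library; the endpoint URL and model name are placeholders to be replaced with your deployment's values:

```python
import json
import urllib.request

def build_chat_request(base_url, model, prompt):
    """Build a request for an OpenAI-compatible /v1/chat/completions endpoint,
    the interface vLLM-based inference servers commonly expose."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Placeholder endpoint and model name; substitute your own deployment's values.
req = build_chat_request("http://localhost:8000", "example-model", "Hello")
# Sending it would be: urllib.request.urlopen(req)  (requires a running server)
```

Because the interface is the de facto standard for LLM serving, applications written against it can move between inference backends with a configuration change rather than a code change.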

In addition, the Red Hat AI InstructLab service on IBM Cloud is now generally available. This cloud service further streamlines the model adaptation process, improves scalability and user experience, and enables companies to use their unique data more effectively – with greater control and ease of use.

By offering this combination of validated models, standardized APIs, and optimized infrastructure, Red Hat enables organizations to accelerate their AI implementations without compromising on choice, performance, or reliability.