





























































Table of Contents

CHAPTER 1. OVERVIEW OF OPENSHIFT AI
CHAPTER 2. NEW FEATURES AND ENHANCEMENTS
  2.1. NEW FEATURES
  2.2. ENHANCEMENTS
CHAPTER 3. TECHNOLOGY PREVIEW FEATURES
CHAPTER 4. DEVELOPER PREVIEW FEATURES
CHAPTER 5. SUPPORT REMOVALS
  5.1. DEPRECATED
    5.1.1. Deprecated Kubeflow Training operator v
    5.1.2. Deprecated TrustyAI service CRD v1alpha
    5.1.3. Deprecated KServe Serverless deployment mode
    5.1.4. Deprecated LAB-tuning
    5.1.5. Deprecated embedded Kueue component
    5.1.6. Deprecated CodeFlare Operator
    5.1.7. Deprecated model registry API v1alpha
    5.1.8. Multi-model serving platform (ModelMesh)
    5.1.9. Deprecated Text Generation Inference Server (TGIS)
    5.1.10. Deprecated accelerator profiles
    5.1.11. Deprecated OpenVINO Model Server (OVMS) plugin
    5.1.12. OpenShift AI dashboard user management moved from OdhDashboardConfig to Auth resource
    5.1.13. Deprecated cluster configuration parameters
  5.2. REMOVED FUNCTIONALITY
    5.2.1. Microsoft SQL Server command-line tool removal
    5.2.2. Model registry ML Metadata (MLMD) server removal
    5.2.3. Embedded subscription channel not used in some versions
    5.2.4. Anaconda removal
    5.2.5. Pipeline logs for Python scripts running in Elyra pipelines are no longer stored in S
    5.2.6. Beta subscription channel no longer used
    5.2.7. HabanaAI workbench image removal
CHAPTER 6. RESOLVED ISSUES
  6.1. ISSUES RESOLVED IN RED HAT OPENSHIFT AI 2.
CHAPTER 7. KNOWN ISSUES
CHAPTER 8. PRODUCT FEATURES
CHAPTER 1. OVERVIEW OF OPENSHIFT AI

Red Hat OpenShift AI is a platform for data scientists and developers of artificial intelligence and machine learning (AI/ML) applications.

OpenShift AI provides an environment to develop, train, serve, test, and monitor AI/ML models and applications on-premises or in the cloud.

For data scientists, OpenShift AI includes Jupyter and a collection of default workbench images optimized with the tools and libraries required for model development, and the TensorFlow and PyTorch frameworks. Deploy and host your models, integrate models into external applications, and export models to host them in any hybrid cloud environment. You can enhance your data science projects on OpenShift AI by building portable machine learning (ML) workflows with data science pipelines, using Docker containers. You can also accelerate your data science experiments through the use of graphics processing units (GPUs) and Intel Gaudi AI accelerators.

For administrators, OpenShift AI enables data science workloads in an existing Red Hat OpenShift or ROSA environment. Manage users with your existing OpenShift identity provider, and manage the resources available to workbenches to ensure data scientists have what they require to create, train, and host models. Use accelerators to reduce costs and allow your data scientists to enhance the performance of their end-to-end data science workflows using graphics processing units (GPUs) and Intel Gaudi AI accelerators.

OpenShift AI has two deployment options:

- Self-managed software that you can install on-premises or in the cloud. You can install OpenShift AI Self-Managed in a self-managed environment such as OpenShift Container Platform, or in Red Hat-managed cloud environments such as Red Hat OpenShift Dedicated (with a Customer Cloud Subscription for AWS or GCP), Red Hat OpenShift Service on Amazon Web Services (ROSA classic or ROSA HCP), or Microsoft Azure Red Hat OpenShift.
- A managed cloud service, installed as an add-on in Red Hat OpenShift Dedicated (with a Customer Cloud Subscription for AWS or GCP) or in Red Hat OpenShift Service on Amazon Web Services (ROSA classic). For information about OpenShift AI Cloud Service, see Product Documentation for Red Hat OpenShift AI.

For information about OpenShift AI supported software platforms, components, and dependencies, see the Red Hat OpenShift AI: Supported Configurations Knowledgebase article.

For a detailed view of the 2.25 release lifecycle, including the full support phase window, see the Red Hat OpenShift AI Self-Managed Life Cycle Knowledgebase article.
CHAPTER 2. NEW FEATURES AND ENHANCEMENTS

This section describes new features and enhancements in Red Hat OpenShift AI 2.25.

2.1. NEW FEATURES

Model registry and model catalog general availability
OpenShift AI model registry and model catalog are now available as general availability (GA) features. A model registry acts as a central repository for administrators and data scientists to register, version, and manage the lifecycle of AI models before configuring them for deployment. A model registry is a key component for AI model governance. The model catalog provides a curated library where data scientists and AI engineers can discover and evaluate the available generative AI models to find the best fit for their use cases.

LLM Compressor library added to OpenShift AI workbench images and pipelines
The LLM Compressor library is now generally available and fully integrated into standard OpenShift AI workbench images and pipelines. This library provides a supported, integrated method to optimize large language models for improved inference, particularly for deployment on vLLM, without leaving your OpenShift AI environment. You can run model compression as an interactive notebook task or as a batch job in a pipeline, which significantly reduces hardware costs and improves inference speeds for your generative AI workloads.

Use an existing Argo Workflows instance with pipelines
You can now configure OpenShift AI to use an existing Argo Workflows instance instead of the one included with Data Science Pipelines. This feature supports users who maintain their own Argo Workflows environments and simplifies adoption of pipelines on clusters where Argo Workflows is already deployed. A new global configuration option disables deployment of the embedded Argo WorkflowControllers, allowing clusters that already use Argo Workflows to integrate with pipelines without conflicts. Cluster administrators can choose whether to deploy the embedded controllers or use their own Argo instance and manage both lifecycles independently. For more information, see Configuring pipelines with your own Argo Workflows instance.

Support added for workbench images
You can now install and upgrade Python 3.12 workbench images in OpenShift AI for your JupyterLab and code-server IDEs.

2.2. ENHANCEMENTS

Support for customizing OAuth proxy sidecar resource allocation
You can now customize the CPU and memory requests and limits for the OAuth proxy sidecar in workbench pods. To do this, add one or more of the following annotations to the notebooks custom resource (CR):

- notebooks.opendatahub.io/auth-sidecar-cpu-request
- notebooks.opendatahub.io/auth-sidecar-memory-request

Red Hat OpenShift AI Self-Managed 2.25 Release notes
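As a rough illustration of the OAuth proxy sidecar annotations, the following Python sketch builds a JSON merge patch that sets the two annotation keys listed in these notes on a notebooks CR. The helper name and the example values (100m and 64Mi) are assumptions made for illustration; they are not taken from the release notes, and applying the patch to a cluster is out of scope here.

```python
import json

# Annotation keys as listed in the release notes.
CPU_REQUEST = "notebooks.opendatahub.io/auth-sidecar-cpu-request"
MEMORY_REQUEST = "notebooks.opendatahub.io/auth-sidecar-memory-request"


def build_sidecar_patch(cpu_request=None, memory_request=None):
    """Return a merge-patch dict that sets only the annotations provided."""
    annotations = {}
    if cpu_request is not None:
        annotations[CPU_REQUEST] = cpu_request
    if memory_request is not None:
        annotations[MEMORY_REQUEST] = memory_request
    if not annotations:
        raise ValueError("provide at least one annotation value")
    return {"metadata": {"annotations": annotations}}


# Example values are hypothetical.
patch = build_sidecar_patch(cpu_request="100m", memory_request="64Mi")
print(json.dumps(patch, indent=2))
```

The printed JSON is the kind of document that a merge patch against the workbench's notebooks CR would carry; how you apply it depends on your tooling.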
Administrators can now declaratively manage settings such as LMEval’s allowOnline and allowCodeExecution through the DSC interface, with changes automatically propagated to the TrustyAI operator. This unifies TrustyAI configuration with other OpenShift AI components and removes the need for manual ConfigMap edits or Operator restarts.

Support added to move unwanted files to trash directory
You can now free up container storage by moving unwanted files to a trash directory in the Jupyter notebook and then permanently deleting them. To delete these files, click the Move to Trash icon on your Jupyter notebook toolbar and browse through your trash directory. Select the files that you want to permanently delete, and delete them to keep your notebook storage from filling up.

Updated workbench images
A new set of workbench images is now available. These pre-built workbench images and upgraded packages include Python libraries and frameworks for data analysis and exploration, as well as CUDA and ROCm packages for accelerating compute-intensive tasks. Additionally, they feature runtimes and updated IDEs for RStudio and code-server.
CHAPTER 3. TECHNOLOGY PREVIEW FEATURES
This section describes Technology Preview features in Red Hat OpenShift AI 2.25. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process. For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

IBM Spyre AI Accelerator model serving support on x86 platforms
Model serving with the IBM Spyre AI Accelerator is now available as a Technology Preview feature for x86 platforms. The IBM Spyre Operator automates installation and integrates the device plugin, secondary scheduler, and monitoring. For more information, see the IBM Spyre Operator catalog entry.

Distributed Inference with llm-d
Distributed Inference with llm-d is currently available as a Technology Preview feature. Distributed Inference with llm-d supports multi-model serving, intelligent inference scheduling, and disaggregated serving for improved GPU utilization on GenAI models. For more information, see Deploying models by using Distributed Inference with llm-d.

Build Generative AI Apps with Llama Stack on OpenShift AI
With this release, the Llama Stack Technology Preview feature enables Retrieval-Augmented Generation (RAG) and agentic workflows for building next-generation generative AI applications. It supports remote inference, built-in embeddings, and vector database operations. It also integrates with providers such as TrustyAI’s provider for safety and TrustyAI’s LM-Eval provider for evaluation. This preview includes tools, components, and guidance for enabling the Llama Stack Operator, interacting with the RAG Tool, and automating PDF ingestion and keyword search capabilities to enhance document discovery.

Centralized platform observability
Centralized platform observability, including metrics, traces, and built-in alerts, is available as a Technology Preview feature. This solution introduces a dedicated, pre-configured observability stack for OpenShift AI that allows cluster administrators to perform the following actions:

- View platform metrics (Prometheus) and distributed traces (Tempo) for OpenShift AI components and workloads.
- Manage a set of built-in alerts (Alertmanager) that cover critical component health and performance issues.
- Export platform and workload metrics to external third-party observability tools by editing the DataScienceClusterInitialization (DSCI) custom resource.

You can enable this feature by integrating with the Cluster Observability Operator, Red Hat build of OpenTelemetry, and Tempo Operator. For more information, see Monitoring and observability and Managing observability.
mode is in use for full compatibility if needed. There is no manual setup required. Then, in the DataScienceCluster custom resource for the Red Hat OpenShift AI Operator, set the spec.llamastackoperator.managementState field to Managed. For more information, see Trusty AI FMS Provider on GitHub.

New Feature Store component
You can now install and manage Feature Store as a configurable component in OpenShift AI. Based on the open-source Feast project, Feature Store acts as a bridge between ML models and data, enabling consistent and scalable feature management across the ML lifecycle. This Technology Preview release introduces the following capabilities:

- Centralized feature repository for consistent feature reuse
- Python SDK and CLI for programmatic and command-line interactions to define, manage, and retrieve features for ML models
- Feature definition and management
- Support for a wide range of data sources
- Data ingestion via feature materialization
- Feature retrieval for both online model inference and offline model training
- Role-Based Access Control (RBAC) to protect sensitive features
- Extensibility and integration with third-party data and compute providers
- Scalability to meet enterprise ML needs
- Searchable feature catalog
- Data lineage tracking for enhanced observability

For configuration details, see Configuring Feature Store.

IBM Power and IBM Z architecture support
IBM Power (ppc64le) and IBM Z (s390x) architectures are now supported as a Technology Preview feature. Currently, you can only deploy models in KServe RawDeployment mode on these architectures.

Support for vLLM in IBM Power and IBM Z architectures
vLLM runtime templates are available for use in IBM Power and IBM Z architectures as Technology Preview.

Enable targeted deployment of workbenches to specific worker nodes in Red Hat OpenShift AI Dashboard using node selectors
Hardware profiles are now available as a Technology Preview.
The hardware profiles feature enables users to target specific worker nodes for workbenches or model-serving workloads. It allows users to target specific accelerator types or CPU-only nodes. This feature replaces the current accelerator profiles feature and container size selector field, offering a broader set of capabilities for targeting different hardware configurations. While accelerator profiles, taints, and tolerations provide some capabilities for matching workloads to hardware, they do not ensure that workloads land on specific nodes, especially if some nodes lack the appropriate taints. The hardware profiles feature supports both accelerator and CPU-only configurations, along with node selectors, to enhance targeting capabilities for specific worker nodes. Administrators can configure hardware profiles in the settings menu. Users can select the enabled profiles using the UI for workbenches, model serving, and Data Science Pipelines where applicable.

RStudio Server workbench image
With the RStudio Server workbench image, you can access the RStudio IDE, an integrated development environment for R. The R programming language is used for statistical computing and graphics to support data analysis and predictions. To use the RStudio Server workbench image, you must first build it by creating a secret and triggering the BuildConfig, and then enable it in the OpenShift AI UI by editing the rstudio-rhel image stream. For more information, see Building the RStudio Server workbench images.
Disclaimer: Red Hat supports managing workbenches in OpenShift AI. However, Red Hat does not provide support for the RStudio software. RStudio Server is available through rstudio.org and is subject to their licensing terms. You should review their licensing terms before you use this sample workbench.

CUDA - RStudio Server workbench image
With the CUDA - RStudio Server workbench image, you can access the RStudio IDE and NVIDIA CUDA Toolkit. The RStudio IDE is an integrated development environment for the R programming language for statistical computing and graphics. With the NVIDIA CUDA Toolkit, you can enhance your work by using GPU-accelerated libraries and optimization tools. To use the CUDA - RStudio Server workbench image, you must first build it by creating a secret and triggering the BuildConfig, and then enable it in the OpenShift AI UI by editing the rstudio-rhel image stream. For more information, see Building the RStudio Server workbench images.
Disclaimer: Red Hat supports managing workbenches in OpenShift AI. However, Red Hat does not provide support for the RStudio software. RStudio Server is available through rstudio.org and is subject to their licensing terms. You should review their licensing terms before you use this sample workbench. The CUDA - RStudio Server workbench image contains NVIDIA CUDA technology. CUDA licensing information is available in the CUDA Toolkit documentation. You should review their licensing terms before you use this sample workbench.

Support for multinode deployment of very large models
Serving models over multiple graphics processing unit (GPU) nodes when using a single-model serving runtime is now available as a Technology Preview feature. Deploy your models across multiple GPU nodes to improve efficiency when deploying large models such as large language models (LLMs). For more information, see Deploying models by using multiple GPU nodes.
CHAPTER 5. SUPPORT REMOVALS

This section describes major changes in support for user-facing features in Red Hat OpenShift AI. For information about OpenShift AI supported software platforms, components, and dependencies, see the Red Hat OpenShift AI: Supported Configurations Knowledgebase article.

5.1. DEPRECATED
The Kubeflow Training Operator (v1) is deprecated starting with OpenShift AI 2.25 and is planned for removal in a future release. This deprecation is part of our transition to Kubeflow Trainer v2, which delivers enhanced capabilities and improved functionality.
Starting with OpenShift AI 2.25, the v1alpha1 version of the TrustyAI service CRD is deprecated and planned for removal in an upcoming release. You must update the TrustyAI Operator to version v1 to receive future Operator updates.
Starting with OpenShift AI 2.25, the KServe Serverless deployment mode is deprecated. You can continue to deploy models by migrating to the KServe RawDeployment mode. If you are upgrading to Red Hat OpenShift AI 3.0, you must migrate all workloads that use the retired Serverless or ModelMesh modes before upgrading.
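These notes do not give a migration procedure. As a minimal sketch, assuming the standard KServe annotation serving.kserve.io/deploymentMode is what selects the mode, the following Python helper rewrites an InferenceService manifest to request RawDeployment. The manifest contents (name, model format) are hypothetical, and whether an existing deployment can be switched in place rather than redeployed is not covered here.

```python
import copy

# KServe annotation that selects the deployment mode for an InferenceService.
DEPLOYMENT_MODE = "serving.kserve.io/deploymentMode"


def to_raw_deployment(manifest):
    """Return a copy of an InferenceService manifest set to RawDeployment mode."""
    if manifest.get("kind") != "InferenceService":
        raise ValueError("expected an InferenceService manifest")
    updated = copy.deepcopy(manifest)  # leave the caller's manifest untouched
    annotations = updated.setdefault("metadata", {}).setdefault("annotations", {})
    annotations[DEPLOYMENT_MODE] = "RawDeployment"
    return updated


# Hypothetical manifest, for illustration only.
serverless = {
    "apiVersion": "serving.kserve.io/v1beta1",
    "kind": "InferenceService",
    "metadata": {
        "name": "example-model",
        "annotations": {DEPLOYMENT_MODE: "Serverless"},
    },
    "spec": {"predictor": {"model": {"modelFormat": {"name": "onnx"}}}},
}

raw = to_raw_deployment(serverless)
print(raw["metadata"]["annotations"][DEPLOYMENT_MODE])
```

Working on a copy keeps the original manifest available for comparison or rollback while the migrated version is reviewed.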
Starting with OpenShift AI 2.25, the LAB-tuning feature is deprecated. If you are using LAB-tuning for large language model customization, plan to migrate to alternative fine-tuning or model customization methods as they become available.
Starting with OpenShift AI 2.24, the embedded Kueue component for managing distributed workloads is deprecated. OpenShift AI now uses the Red Hat Build of Kueue Operator to provide enhanced workload scheduling across distributed training, workbench, and model serving workloads. The deprecated embedded Kueue component is not supported in any Extended Update Support (EUS) release. To ensure workloads continue using queue management, you must migrate from the embedded Kueue component to the Red Hat Build of Kueue Operator, which requires OpenShift Container Platform 4.18 or later. To migrate, complete the following steps:
Starting with OpenShift AI 2.24, the CodeFlare Operator is deprecated and will be removed in a future release of OpenShift AI.
This deprecation does not affect the Red Hat OpenShift AI API tiers.
Starting with OpenShift AI 2.24, the model registry API version v1alpha1 is deprecated and will be removed in a future release of OpenShift AI. The latest model registry API version is v1beta.
Starting with OpenShift AI version 2.19, the multi-model serving platform based on ModelMesh is deprecated. You can continue to deploy models on the multi-model serving platform, but it is recommended that you migrate to the single-model serving platform. For more information or for help on using the single-model serving platform, contact your account manager.
Starting with OpenShift AI version 2.19, the Text Generation Inference Server (TGIS) is deprecated. TGIS will continue to be supported through the OpenShift AI 2.16 EUS lifecycle. Caikit-TGIS and Caikit are not affected and will continue to be supported. The out-of-the-box serving runtime template will no longer be deployed. vLLM is recommended as a replacement runtime for TGIS.
Accelerator profiles are now deprecated. To target specific worker nodes for workbenches or model serving workloads, use hardware profiles.
The CUDA plugin for the OpenVINO Model Server (OVMS) is now deprecated and will no longer be available in future releases of OpenShift AI.
Previously, cluster administrators used the groupsConfig option in the OdhDashboardConfig resource to manage the OpenShift groups (both administrators and non-administrators) that can access the OpenShift AI dashboard. Starting with OpenShift AI 2.17, this functionality has moved to the Auth resource. If you have workflows (such as GitOps workflows) that interact with OdhDashboardConfig, you must update them to reference the Auth resource instead.

Table 5.1. Updated configurations
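The contents of Table 5.1 are not reproduced in this text. As a hedged sketch of what updating a GitOps workflow might involve, the following Python helper converts a groupsConfig-style mapping into an Auth manifest. The field names adminGroups and allowedGroups, the comma-separated input format, the apiVersion, and the resource name auth are all assumptions for illustration and should be checked against the product documentation.

```python
def split_groups(value):
    """Split a comma-separated group string into a clean list."""
    return [g.strip() for g in value.split(",") if g.strip()]


def groups_config_to_auth(groups_config):
    """Build an Auth manifest dict from a groupsConfig-style mapping."""
    return {
        "apiVersion": "services.platform.opendatahub.io/v1alpha1",  # assumed
        "kind": "Auth",
        "metadata": {"name": "auth"},  # assumed resource name
        "spec": {
            # Field names are assumed, not taken from the release notes.
            "adminGroups": split_groups(groups_config.get("adminGroups", "")),
            "allowedGroups": split_groups(groups_config.get("allowedGroups", "")),
        },
    }


# Hypothetical groupsConfig values, for illustration only.
old = {
    "adminGroups": "rhods-admins, platform-admins",
    "allowedGroups": "system:authenticated",
}
auth = groups_config_to_auth(old)
print(auth["spec"])
```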
5.2. REMOVED FUNCTIONALITY

Starting with OpenShift AI 2.24, the Microsoft SQL Server command-line tools (sqlcmd, bcp) have been removed from workbenches. You can no longer manage Microsoft SQL Server using the preinstalled command-line client.
Starting with OpenShift AI 2.23, the ML Metadata (MLMD) server has been removed from the model registry component. The model registry now interacts directly with the underlying database by using the existing model registry API and database schema. This change simplifies the overall architecture and ensures the long-term maintainability and efficiency of the model registry by transitioning from the ml-metadata component to direct database access within the model registry itself.

If you see the following error for your model registry deployment, your database schema migration has failed:

error: error connecting to datastore: Dirty database version {version}. Fix and force version.

You can fix this issue by manually resetting the database from a dirty state to 0 before traffic can be routed to the pod. Perform the following steps:
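The steps themselves are not reproduced in this text. As a rough sketch of the idea only, the following Python example resets a dirty flag to 0 in a migration bookkeeping table. It assumes a schema_migrations table with version and dirty columns, which is a common migration-tool convention and is not confirmed by these notes. It uses an in-memory SQLite database purely so that the example is runnable; the real model registry database, its connection details, and the version number (7 here) will differ.

```python
import sqlite3

# Assumed table layout; the table name, columns, and version are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE schema_migrations (version INTEGER PRIMARY KEY, dirty INTEGER NOT NULL)"
)
conn.execute("INSERT INTO schema_migrations (version, dirty) VALUES (?, ?)", (7, 1))


def clear_dirty_flag(connection):
    """Reset the dirty flag to 0 and return the (version, dirty) row."""
    with connection:  # commits on success, rolls back on error
        connection.execute("UPDATE schema_migrations SET dirty = 0")
    return connection.execute(
        "SELECT version, dirty FROM schema_migrations"
    ).fetchone()


row = clear_dirty_flag(conn)
print(row)
```

Clearing the flag only marks the schema as clean; it does not repair a migration that was partially applied, so the schema should be inspected first.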
For OpenShift AI 2.8 to 2.20 and 2.22 to 2.25, the embedded subscription channel is not used. You cannot select the embedded channel for a new installation of the Operator for those versions. For more information about subscription channels, see Installing the Red Hat OpenShift AI Operator.
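When scripting against these version ranges, note that versions such as 2.8 and 2.20 compare incorrectly as strings or floats. The following Python sketch, with hypothetical helper names, parses versions into integer tuples and checks them against the two ranges stated above. It says nothing about versions outside those ranges.

```python
# Inclusive version ranges for which the notes say the embedded channel is not used.
UNUSED_RANGES = [((2, 8), (2, 20)), ((2, 22), (2, 25))]


def parse_version(text):
    """Parse 'major.minor' into a tuple of ints, for example '2.20' -> (2, 20)."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def embedded_channel_unused(version):
    """Return True if the version falls in one of the listed ranges."""
    parsed = parse_version(version)
    return any(low <= parsed <= high for low, high in UNUSED_RANGES)


for v in ("2.8", "2.20", "2.21", "2.25"):
    print(v, embedded_channel_unused(v))
```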
Anaconda is an open source distribution of the Python and R programming languages. Starting with OpenShift AI version 2.18, Anaconda is no longer included in OpenShift AI, and Anaconda resources are no longer supported or managed by OpenShift AI. If you previously installed Anaconda from OpenShift AI, a cluster administrator must complete the following steps from the OpenShift command-line interface to remove the Anaconda-related artifacts:
Logs for Python scripts that run in Elyra pipelines are no longer stored in S3-compatible storage. Starting with OpenShift AI version 2.11, you can view these logs in the pipeline log viewer in the OpenShift AI dashboard.
For this change to take effect, you must use the Elyra runtime images provided in workbench images at version 2024.1 or later. If you have an older workbench image version, update the Version selection field to a compatible workbench image version, for example, 2024.1, as described in Updating a project workbench. Updating your workbench image version will clear any existing runtime image selections for your pipeline. After you have updated your workbench version, open your workbench IDE and update the properties of your pipeline to select a runtime image.
Starting with OpenShift AI 2.5, the beta subscription channel is no longer used. You can no longer select the beta channel for a new installation of the Operator. For more information about subscription channels, see Installing the Red Hat OpenShift AI Operator.
Support for the HabanaAI 1.10 workbench image has been removed. New installations of OpenShift AI from version 2.14 do not include the HabanaAI workbench image. However, if you upgrade OpenShift AI from a previous version, the HabanaAI workbench image remains available, and existing HabanaAI workbench images continue to function.