The MLOps pipeline automates the machine learning lifecycle, from model training to deployment. A typical pipeline is defined in Python code and, upon detecting a change to the dataset or model parameters, trains a new model, evaluates it, and deploys it to production if it succeeds.
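To make those stages concrete, here is a minimal, self-contained sketch of that lifecycle in plain Python with scikit-learn; the stage boundaries and the 0.9 promotion threshold are illustrative, not drawn from any particular platform.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def run_pipeline():
    # Load data (in a real pipeline, this step would be triggered by a
    # dataset or parameter change detected by the platform).
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Train a new candidate model.
    model = LogisticRegression(max_iter=200).fit(X_train, y_train)

    # Evaluate it; the 0.9 threshold is an illustrative promotion gate.
    accuracy = model.score(X_test, y_test)
    if accuracy > 0.9:
        print(f"accuracy={accuracy:.2f}; deploying to production (stub)")
    return model

run_pipeline()
```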
Model registries like MLflow act as version control systems for ML models, allowing teams to track and manage model versions easily.
Model-serving platforms like Seldon Core provide a robust way to deploy and serve models in production, eliminating the need for custom web applications and simplifying the process for ML engineers.
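As a hedged sketch of how registration works in practice, the snippet below logs a trained model to MLflow and registers it in a single call; it assumes a reachable tracking server (or MLflow's local ./mlruns default), and the registry name "demo-classifier" is hypothetical.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=200).fit(X, y)

# Log the trained model as a run artifact and register it in the model
# registry in one step; "demo-classifier" is a hypothetical registry name.
with mlflow.start_run():
    mlflow.sklearn.log_model(
        model,
        artifact_path="model",
        registered_model_name="demo-classifier",
    )
```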
MLOps platforms are exposed to two classes of weaknesses: inherent vulnerabilities and implementation vulnerabilities.
Inherent vulnerabilities arise from the underlying formats and processes these platforms rely on, such as Python's unsafe Pickle format; they are challenging to address because they stem from the technology itself.
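The Pickle problem is easy to demonstrate: unpickling attacker-controlled bytes executes code immediately. The sketch below uses a deliberately harmless payload (an echo command) to show the mechanism.

```python
import os
import pickle

class Malicious:
    # __reduce__ tells pickle how to rebuild an object; an attacker can
    # abuse it to make deserialization invoke an arbitrary callable.
    def __reduce__(self):
        return (os.system, ("echo code executed during unpickling",))

payload = pickle.dumps(Malicious())
pickle.loads(payload)  # runs the command on load, before the object is ever used
```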
Implementation vulnerabilities, by contrast, are specific to a particular MLOps platform's codebase and can be mitigated through patches or updates.
Understanding these vulnerabilities is crucial for securing MLOps environments and preventing attacks.
JFrog's research identified inherent vulnerabilities in MLOps platforms that let attackers achieve arbitrary code execution by embedding code in machine learning models (e.g., Keras H5 models) that runs when the model is loaded.
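The sketch below illustrates the Keras H5 technique with a deliberately harmless payload. It assumes TensorFlow 2.x's bundled Keras; note that newer Keras releases add a safe_mode flag to load_model that blocks Lambda deserialization by default, exactly the kind of guardrail this research motivated.

```python
import numpy as np
import tensorflow as tf

# A harmless stand-in for an attacker's payload; the function's bytecode
# is serialized into the model file alongside the architecture and weights.
def payload(x):
    print("embedded code is running")  # a real payload could run anything
    return x

model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Lambda(payload),
])
model.save("model.h5")  # legacy HDF5 format

# The victim merely loads and uses the model; the embedded function runs.
loaded = tf.keras.models.load_model("model.h5")
loaded.predict(np.array([[1.0]]))
```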
Similarly, some dataset libraries (e.g., Hugging Face Datasets) allow code execution when loading datasets. Attackers can also exploit Cross-Site Scripting (XSS) vulnerabilities in ML libraries (e.g., CVE-2024-27132 in MLflow) to inject malicious JavaScript that escapes the browser sandbox and executes arbitrary Python code on the underlying Jupyter server.
The most significant implementation vulnerabilities in MLOps platforms are a lack of authentication, container escape, and general immaturity. Many platforms ship without authentication mechanisms, allowing unauthorized users to execute arbitrary code through ML pipelines.
Container escape vulnerabilities enable attackers to gain control of the container environment and potentially spread to other resources.
The immaturity of MLOps platforms, especially open-source ones, contributes to a higher number of security vulnerabilities.
According to JFrog, the researchers' map illustrates how various MLOps features are exposed to potential attacks.
For instance, platforms that enable model serving are susceptible to code injection attacks if they are not adequately secured.
To mitigate this risk, it’s imperative to isolate the model execution environment and implement robust container security measures.
Additionally, the map highlights other vulnerabilities in features like data pipelines, model training, and monitoring, emphasizing the need for comprehensive security practices throughout the MLOps lifecycle.
XSSGuard is a JupyterLab extension, installable from the JupyterLab Extension Manager, that mitigates XSS attacks by sandboxing susceptible output elements. Separately, Hugging Face Datasets version 2.20.0 disables automatic code execution by default.
Users should upgrade to that version and pass an explicit opt-in flag only when a dataset genuinely requires code execution, as sketched below.
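For example, with datasets 2.20.0 or later, loading a script-based dataset requires an explicit opt-in via the trust_remote_code parameter; the repository id below is hypothetical.

```python
from datasets import load_dataset

# Since datasets 2.20.0, a dataset's bundled loading script no longer runs
# automatically; trust must be granted explicitly, per dataset.
ds = load_dataset("some_org/script_based_dataset", trust_remote_code=True)

# With the default (trust_remote_code=False), the same call fails with an
# error instead of silently executing the dataset's Python code:
# ds = load_dataset("some_org/script_based_dataset")
```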
To deploy MLOps platforms securely, verify which features your chosen platform actually exposes, isolate components in Docker containers, enable authentication, and enforce strict policies for model uploads and execution.