MLflow Model Registry and MinIO

Introduction

MLflow Model Registry allows you to manage models that are destined for a production environment. This post picks up where my last post on MLflow Tracking left off. In my Tracking post I showed how to log parameters, metrics, artifacts, and models. If you have not read it, then give it a read when you get a chance. In this post I’ll show how to register and version a logged model. As an added bonus I’ll also show how to load a registered model from MLflow Model Registry into memory.

MLflow Model Registry is one of four major components of MLflow. A brief description of each component is given below.

  • MLflow Tracking - An engineer will use this feature the most. It allows experiments to be recorded and queried. It also keeps track of the code, data, configuration and results for each experiment.
  • MLflow Projects - Allows experiments to be reproduced by packaging the code into a platform agnostic format.
  • MLflow Models - Deploys machine learning models to an environment where they can be served.
  • MLflow Model Registry - Allows for the storage, annotation, discovery, and management of models in a central repository.

I’ll assume you already have MLflow installed on your development machine as a remote server using PostgreSQL and MinIO. If you do not have MLflow installed, check out my post on Setting up a Development Machine with MLflow and MinIO, where I showed how to use Docker Compose to run the services shown below.

MinIO is used in this deployment to store models and datasets. It is important to have a quality object store for these artifacts if you are going to be experimenting with models and eventually moving them into production. This is especially true if you are training large language models.

MLflow and MinIO are better together. MinIO is a performant object store (349 GB/s GET and 177 GB/s PUT on 32 nodes of NVMe), capable of backing even the most demanding data lake, analytics, and AI/ML workloads. Data is written to MinIO with strong consistency, and objects are immutable. All objects are protected with inline erasure coding, bitrot hashing, and encryption.

What is a Registered Model?

A registered model is a candidate for your production environment. You should register the best-performing models within an experiment. Once a model is registered, you can add metadata to it, such as tags and a description. Models can also be versioned within the Registry, and each version can have its own description and tags. Additionally, you can specify the `stage` of a model version to track its status. The allowable stages are `None`, `Staging`, `Production`, and `Archived`. I will demo all these features in this post. All code shown in this post can be found here.
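As a preview, here is a minimal sketch of moving a version through stages with the client API (assuming a model named `mnist_nn_model` with a version 1 already in the Registry):

from mlflow import MlflowClient

client = MlflowClient(tracking_uri='http://localhost:5000/')

# Promote version 1 of the model to the Staging stage.
client.transition_model_version_stage(name='mnist_nn_model',
                                      version='1',
                                      stage='Staging')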

To register a model, you must have previously logged it using the `log_model()` function that I demonstrated in my Tracking post. Recall from that post that I used a simple flavor of the `log_model()` function to save the model to MLflow. This line of code is shown below.

# Log the trained model.
model_artifact_path = 'mnistmodel'
mlflow.pytorch.log_model(model, artifact_path=model_artifact_path)

To take advantage of all the features of Model Registry, we need to upgrade this command by adding a signature to it. A signature is basically the schema of the model. It tells MLflow the allowable datatypes for inference. You can see from the modified logging code below that MLflow provides an `infer_signature()` function which, as the name implies, infers the signature of your model based on a few samples you send it from either your training set or test set.

# Log the trained model.
images, labels = next(iter(test_loader))
img = images[0].view(1, 784)
label = labels[0]
print(img.size())
print(label.size())
with torch.no_grad():
  logps = model(img)

model_signature = mlflow.models.infer_signature(img.numpy(), logps.numpy())
model_artifact_path = 'mnistmodel'
mlflow.pytorch.log_model(model, artifact_path=model_artifact_path,
                         signature=model_signature)
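If you are curious about what was inferred, the returned `ModelSignature` object can simply be printed; it renders the input and output schemas that will be saved with the model.

print(model_signature)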

Once we have a model logged and it has a schema, we can register it. There are two ways to register a model. One way is to use the MLflow APIs. The other way is to use the MLflow UI to locate and register a previously logged model. I’ll show both these techniques. Let’s start with the APIs for registering.

Registering a Model using APIs

The best way to register a previously logged model using APIs is shown below. There are two other techniques which I will briefly describe in a moment - but the technique below is best because it allows you to create the top-level collection entity and the new version of a model using two separate API calls. This gives you the flexibility to give each a different description and different tags. Think of the top-level collection entity as an object that maintains a list of all versions of a model with the same name.

from mlflow import MlflowClient
from mlflow.store.artifact.runs_artifact_repo import RunsArtifactRepository

model_name = 'mnist_nn_model'
client = MlflowClient()

# Register top-level collection entity if it has not been previously registered.
filter_string = f"name='{model_name}'"
results = client.search_registered_models(filter_string=filter_string)
if len(results) == 0:
  model_tags = {'framework': 'Pytorch'}
  model_description = 'Various versions of the MNIST model with different hidden layers.'
  client.create_registered_model(model_name, model_tags, model_description)

# Register the new version (active_run and params come from the training run).
run_id = active_run.info.run_id
run_uri = f'runs:/{run_id}/{model_artifact_path}'
model_source = RunsArtifactRepository.get_underlying_uri(run_uri)
version_tags = {'layers': len(params['hidden_sizes'])}
version_description = f'Hidden sizes: {params["hidden_sizes"]}'
model_version = client.create_model_version(model_name, model_source, run_id,
                                            tags=version_tags,
                                            description=version_description)

The `create_registered_model()` function creates the collection entity. Note - it can only be called if the collection does not already exist - otherwise it throws an error. In the code above, I used the `search_registered_models()` function to make sure this function never gets called twice for the same model. The `create_model_version()` function creates a new version of the model. Notice in the code above that the collection and the version have different descriptions and tags.

Once you have registered several versions of a model you will be able to view them in the UI. Let’s take a look at the Model Registry in the MLflow UI and look at the top-level entry and a few versions. Start the UI by navigating to http://localhost:5000/. The default view will show you all your experiments. We want to look at registered models, so find the `Models` tab at the top of the page and click it. You will see a list of all models that have been registered. An example is shown below.

This view shows all the models registered. For each model, this list shows the latest version registered (Version 2), the latest version that is staged (Version 1), and the latest version that is in production (no models are in production in the screenshot above).

Clicking on the model name will show the page below.

Here we see all the versions of this model. The description and tags shown here apply to the top-level entry. Consequently, they should be something that is true of all versions.

As a final note, the MLflow documentation also describes two other sets of APIs that can be used to register models. Some of the logging APIs that were developed specifically for each framework have registering capabilities. (These would be the functions found under mlflow.sklearn for Sklearn and mlflow.pytorch for Pytorch.) They do not give you the same level of control as what I showed above and support varies from framework to framework. There is also an `mlflow.register_model()` function. This function does not give you control over how the collection entity is created and it does not have a parameter for descriptions.
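For completeness, here is a sketch of those two alternatives, reusing `model`, `model_signature`, `model_artifact_path`, and `run_id` from the code above:

# Alternative 1: log and register in a single call using the framework-specific API.
mlflow.pytorch.log_model(model, artifact_path=model_artifact_path,
                         signature=model_signature,
                         registered_model_name='mnist_nn_model')

# Alternative 2: register a previously logged model by its runs:/ URI.
result = mlflow.register_model(f'runs:/{run_id}/{model_artifact_path}', 'mnist_nn_model')
print(result.version)  # The version number assigned by the Registry.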

Registering a Model using the MLflow UI

It turns out that you do not need the APIs to register a model. A previously logged model can be registered using the MLflow UI. This may be the preferred method for registering a model if you want a human to determine which logged model should be promoted to the Registry. Let’s take another tour of the UI to see how this is done.

Start off by navigating to the `Experiments` tab of the UI and select the experiment that contains the model you want to register.

If you add `Model` to the list of columns that are displayed for each run, then you will be able to see which runs have logged models and which logged models are already registered. In the screenshot above, the registered models have a name and a version designation. Select a run that contains an unregistered model that you would like to be in the Model Registry. This will take you to the run details page shown below. Notice the `Register Model` button that appears when you select the model in the artifacts section of this page.

Clicking on the `Register Model` button will show the dialog below.

This is where we will create our top-level collection entity if we need a new one, or we can add the model to an existing collection of registered models. Once you complete this dialog, your model will be registered. Let’s navigate to the Models tab and find our newly registered model.

Select the `mnist_lr_optimized` collection and you will see the list of versions for this collection. This is shown below. Also notice that there is a description and tags at the collection level.

Clicking on a specific version, we can look at the details of that version.

The version has its own description and tags. You can edit these values using this page, and you can also set the stage of the model. Also review the schema of this model. This will come into play when we load the model and perform tests on it, which is what we will do in the next section.

Loading and Testing a Registered Model

The final thing we will do is load a previously registered model. After all, what good is logging and registering a model if you cannot retrieve it and perform further experiments with it? What I will show in this section should not be confused with `Model Serving`. When you serve a model, you take a previously trained and tested model and deploy it, with all its dependencies, where it can perform inference as a stand-alone service. What we will do here is merely retrieve a previously registered model. To use this model, you will need to take care of all dependencies yourself. This is useful if a model is still under development and you want to check out a new version that a colleague just trained. Perhaps the test set was not available when the model was being trained, and you want to see how well the new model performs against data it has never seen.

The code below will load a previously registered model into memory.

import mlflow

model_name = 'mnist_nn_model'
model_version = 6

mlflow.set_tracking_uri('http://localhost:5000/')
model = mlflow.pyfunc.load_model(model_uri=f"models:/{model_name}/{model_version}")
type(model)

All you need to know is the model name and the version number. These values are used to create a model URI, which is passed to the `mlflow.pyfunc.load_model()` function. When run in a notebook cell, the code above will produce the output below.

mlflow.pyfunc.PyFuncModel

The type check in the code above tells us that the model is an instance of `mlflow.pyfunc.PyFuncModel`. MLflow wraps all models using the `PyFuncModel` class. This may seem unnecessary but remember that MLflow is a tool for all frameworks - so if you are on a team that is experimenting with several frameworks like Pytorch, TensorFlow, and Sklearn, then the same downstream testing or inference code will work for all models since they are wrapped in a common class.
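The wrapper also carries the model's metadata, including the signature we logged earlier. If you want to confirm what `predict()` expects before calling it, a quick check (assuming the `model` loaded above) looks like this:

# Inspect the input schema that was saved with the model's signature.
print(model.metadata.get_input_schema())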

Our wrapped model has a `predict()` method, which is used for inference. Let's create a test function that uses this model.

import numpy as np
import torch
from torch.utils.data import DataLoader
from typing import Any, Dict

def test_model(model: mlflow.pyfunc.PyFuncModel, loader: DataLoader) -> Dict[str, Any]:
  correct_count, total_count = 0, 0
  for images, labels in loader:
    for i in range(len(labels)):
      img = images[i].view(1, 784)
      # Turn off gradients to speed up this part.
      with torch.no_grad():
        logps = model.predict(img.numpy())

      # The output of the network is log-probabilities,
      # so take the exponential to get probabilities.
      ps = np.exp(logps)
      probab = list(ps[0])
      pred_label = probab.index(max(probab))
      true_label = labels.numpy()[i]
      if true_label == pred_label:
        correct_count += 1
      total_count += 1

  testing_metrics = {
    'incorrect_count': total_count - correct_count,
    'correct_count': correct_count,
    'accuracy': (correct_count / total_count)
  }
  print('Number Of Images Tested =', total_count)
  print('\nModel Accuracy =', (correct_count / total_count))
  return testing_metrics

For the function above to work, we need a test set. The code below will give us the test set for the MNIST dataset.

from torch.utils.data import DataLoader
from torchvision import datasets, transforms

def load_test_images(batch_size: int) -> DataLoader:
  # Define a transform to normalize the data.
  transform = transforms.Compose([transforms.ToTensor(),
                                  transforms.Normalize((0.5,), (0.5,))])

  # Download and load the testing data.
  test_dataset = datasets.MNIST('./mnistdata', download=True, train=False,
                                transform=transform)
  test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=True)
  return test_loader
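Putting the pieces together, a quick end-to-end check using the functions defined above might look like this (the batch size is arbitrary):

test_loader = load_test_images(batch_size=64)
metrics = test_model(model, test_loader)
print(metrics)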

That is everything you need to load a registered model from Model Registry and experiment further with it. A word of caution: if the wrapped model's `predict()` method throws errors, then there is a good chance that your inferred signature is wrong. In this situation, go back to the code that logs your model and make sure that the data type and shape used in the `infer_signature()` function are the same data type and shape that the wrapped model's `predict()` method is receiving.

Summary

Combining MLflow and MinIO offers a potent synergy that elevates the machine learning development process. MLflow's streamlined experiment tracking, model versioning, and deployment capabilities ensure efficient collaboration and reproducibility among data scientists and engineers. By integrating MinIO's high-performance object storage, the process gains scalability, cost-effectiveness, and robust data management, facilitating the seamless storage and sharing of large-scale datasets and models. This unified methodology enhances team productivity, accelerates innovation, and establishes a solid foundation for building advanced machine learning applications with confidence.