Use Case: Encrypted Inference with ONNX Model#

In this tutorial, we will demonstrate how we can run encrypted machine learning inference from a model saved as an ONNX file.

For simplicity, we will start by prototyping this computation between the different parties locally using pm.LocalMooseRuntime. Then we will execute the same Moose computation over the network with pm.GrpcMooseRuntime. You can also check an additional gRPC example here.

import pathlib

import numpy as np

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

from onnxmltools.convert import convert_sklearn
from skl2onnx.common import data_types as onnx_dtypes

import pymoose as pm

random_state = 5

Use Case#

You are a healthcare AI startup that has trained a model to diagnose heart disease. You would like to serve this model to a hospital to help doctors diagnose potential heart disease in their patients. However, the patients’ data is too sensitive to be shared with the AI startup. For this reason, you would like to encrypt the patients’ data and run the model on the encrypted data.

In this tutorial, we will perform the following steps:

  • Train a model with Scikit-Learn.

  • Convert the trained model to ONNX.

  • Convert the model from ONNX to a Moose computation.

  • Run encrypted inference by evaluating the Moose computation.

Training#

For this tutorial, we use a synthetic dataset. The dataset contains 10 features (X). Each record has a label (y) of 0 or 1, indicating whether or not the patient has heart disease. For the model, we train a logistic regression with Scikit-Learn, but you could experiment with other models such as XGBClassifier from XGBoost or even a multi-layer perceptron.

Once the model is trained, you can convert it to ONNX, which is a format for representing machine learning models. Since it’s a Scikit-Learn model, you can convert it to ONNX with convert_sklearn from ONNXMLTools. Use convert_xgboost for XGBoost models.

n_samples = 1000
n_features = 10
n_classes = 2

# Generate synthetic dataset
X, y = make_classification(
    n_samples=n_samples,
    n_features=n_features,
    n_classes=n_classes,
    random_state=random_state,
)

# Split dataset between training and testing datasets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=random_state
)

# Train logistic regression
lg = LogisticRegression()
lg.fit(X_train, y_train)

# Convert scikit-learn model to ONNX
initial_type = ("float_input", onnx_dtypes.FloatTensorType([None, n_features]))
onnx_proto = convert_sklearn(lg, initial_types=[initial_type])

Convert ONNX to Moose Predictor#

PyMoose provides several predictor classes to translate an ONNX model into a PyMoose DSL program.

It currently supports the following types of model predictors:

  • linear_regressor.LinearRegressor: for models such as linear regression, ridge regression, etc.

  • linear_regressor.LinearClassifier: for models such as logistic regression, classifier using ridge regression, etc.

  • tree_ensemble.TreeEnsembleRegressor: for models such as XGBoost regressor, random forest regressor, etc.

  • tree_ensemble.TreeEnsembleClassifier: for models such as XGBoost classifier, random forest classifier, etc.

  • multilayer_perceptron_predictor.MLPRegressor: for multi-layer perceptron regressor models.

  • multilayer_perceptron_predictor.MLPClassifier: for multi-layer perceptron classifier models.

  • neural_network_predictor.NeuralNetwork: for feed-forward neural networks from PyTorch and TensorFlow.

Because the trained model is a logistic regression, we should use the class linear_regressor.LinearClassifier. The class has a method from_onnx which parses the ONNX model. More specifically, it extracts the model weights, the intercepts and the post transform (e.g. sigmoid or softmax). The returned object is callable. When called, it computes the forward pass of the logistic regression.

predictor = pm.predictors.LinearClassifier.from_onnx(onnx_proto)
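
To make the forward pass concrete, here is a minimal plaintext sketch of what a binary logistic regression computes from weights, an intercept and a sigmoid post transform. The function names and toy numbers are illustrative assumptions and are not part of PyMoose; the predictor evaluates the same kind of arithmetic, but on secret shared fixed-point values.

```python
import math


def sigmoid(z):
    """Logistic function, used here as the post transform."""
    return 1.0 / (1.0 + math.exp(-z))


def logistic_forward(x, weights, intercept):
    """Return [P(y=0), P(y=1)] for one record (plaintext illustration only)."""
    z = sum(w * xi for w, xi in zip(weights, x)) + intercept
    p1 = sigmoid(z)
    return [1.0 - p1, p1]


# Toy values, not taken from the trained model
probs = logistic_forward(x=[1.0, -2.0], weights=[0.5, 0.25], intercept=0.0)
print(probs)  # [0.5, 0.5] because the linear score is exactly 0
```

The two-column output mirrors the shape of predict_proba in Scikit-Learn, which is what the encrypted result is compared against later in this tutorial.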

The LinearClassifier object exposes a host_placements property. As the output below shows, three host placements are created automatically when the object is instantiated: alice, bob and carole. These three players are grouped under the replicated placement to perform the encrypted inference.

print("List of host placements:", predictor.host_placements)
List of host placements: (HostPlacementExpression(name='alice'), HostPlacementExpression(name='bob'), HostPlacementExpression(name='carole'))
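
To give some intuition for why three players are grouped together, the sketch below illustrates three-party replicated secret sharing on plain integers. It is a simplified illustration that assumes the usual textbook scheme, a 64-bit ring, and hypothetical helper names (share, reconstruct, add); it is not Moose's implementation.

```python
import secrets

RING = 2**64  # ring size assumed for this illustration


def share(value):
    """Split an integer into three additive shares that sum to value mod RING."""
    s0 = secrets.randbelow(RING)
    s1 = secrets.randbelow(RING)
    s2 = (value - s0 - s1) % RING
    shares = [s0, s1, s2]
    # Replicated sharing: party i holds the pair (share i, share i+1)
    return [(shares[i], shares[(i + 1) % 3]) for i in range(3)]


def reconstruct(pair_a, pair_b):
    """Two neighbouring parties together hold all three shares."""
    s_i, s_next = pair_a
    _, s_last = pair_b  # pair_b must belong to the party right after pair_a's
    return (s_i + s_next + s_last) % RING


def add(shared_x, shared_y):
    """Addition is local: each party adds the shares it already holds."""
    return [
        ((x0 + y0) % RING, (x1 + y1) % RING)
        for (x0, x1), (y0, y1) in zip(shared_x, shared_y)
    ]


alice_x, bob_x, carole_x = share(42)
assert reconstruct(alice_x, bob_x) == 42

total = add(share(42), share(100))
assert reconstruct(total[0], total[1]) == 142
```

A single party only ever sees two of the three random-looking shares, so on its own it learns nothing about the underlying value.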

Define Moose Computation#

For this example, alice will play the role of the hospital.

The Moose computation below performs the following steps:

  • Loads patient’s data in plaintext from alice’s (hospital) storage.

  • Secret shares (encrypts) the patients’ data.

  • Computes logistic regression on secret shared data.

  • Reveals the prediction only to alice (hospital) and saves it into its storage.

@pm.computation
def moose_predictor_computation():
    # Alice (the hospital in our use case) loads the patients' data in plaintext.
    # Then the data gets converted from float to fixed-point.
    with predictor.alice:
        x = pm.load("x", dtype=pm.float64)
        x_fixed = pm.cast(x, dtype=pm.predictors.predictor_utils.DEFAULT_FIXED_DTYPE)
    # The patients' data gets secret shared when moving from host placement
    # to replicated placement.
    # Then compute the logistic regression on secret shared data
    with predictor.replicated:
        y_pred = predictor(x_fixed, pm.predictors.predictor_utils.DEFAULT_FIXED_DTYPE)

    # The predictions get revealed only to Alice (the hospital).
    # Convert the data from fixed-point to floats and save the data in the storage.
    with predictor.alice:
        y_pred = pm.cast(y_pred, dtype=pm.float64)
        y_pred = pm.save("y_pred", y_pred)
    return y_pred
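
The cast from pm.float64 to a fixed-point dtype in the computation above is there because secret sharing works on integers rather than floating-point numbers. A rough sketch of the idea follows, assuming 40 fractional bits purely for illustration (the actual precision is whatever DEFAULT_FIXED_DTYPE specifies) and using hypothetical helper names:

```python
FRACTIONAL_BITS = 40  # assumed for illustration only
SCALE = 2**FRACTIONAL_BITS


def encode(value):
    """Float -> fixed-point integer (rounds to the nearest representable value)."""
    return round(value * SCALE)


def decode(fixed):
    """Fixed-point integer -> float."""
    return fixed / SCALE


def fixed_mul(a, b):
    """The raw product carries twice the scale, so divide once to rescale."""
    return (a * b) // SCALE


x = encode(0.75)
w = encode(-1.5)
print(decode(fixed_mul(x, w)))  # -1.125
```

Values such as 0.1 cannot be represented exactly, so encoding introduces a small rounding error. This is one reason the comparison with the plaintext result later in the tutorial uses a tolerance rather than exact equality.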

Evaluate the computation#

For simplicity, we will use LocalMooseRuntime to locally simulate this computation running across hosts. To do so, we need to provide: the Moose computation, the list of host identities to simulate, and a mapping of the data stored by each simulated host.

Since the hospital is represented by alice, we will place the patients’ data in alice’s storage.

Once you have instantiated the LocalMooseRuntime with the identities and the storage mapping, and set the runtime as default, you can simply call the Moose computation to evaluate it. If you prefer, you can also evaluate the computation with runtime.evaluate_computation(computation=moose_predictor_computation, arguments={}). We can also provide arguments to the computation if needed, but we don’t have any in this example. Note that the output of evaluate_computation is an empty dictionary, since this function’s output operation pm.save returns the Unit type.

executive_storage = {"alice": {"x": X_test}, "bob": {}, "carole": {}}
identities = [plc.name for plc in predictor.host_placements]

runtime = pm.LocalMooseRuntime(identities, storage_mapping=executive_storage)
runtime.set_default()

_ = moose_predictor_computation()

Results#

Once the computation is done, we can extract the results. The predictions have been stored in alice’s storage. We can extract the value from the storage with read_value_from_storage.

y_pred = runtime.read_value_from_storage("alice", "y_pred")

In this simulated setting, we can validate that the results on encrypted data match the computation on plaintext data. To do so, we compute the logistic regression prediction with Scikit-Learn.

expected_result = lg.predict_proba(X_test)
np.testing.assert_almost_equal(y_pred, expected_result, decimal=2)
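
The decimal=2 argument means the check tolerates small differences, which is expected because the encrypted computation runs in fixed-point arithmetic. NumPy documents the criterion for assert_almost_equal as abs(desired - actual) < 1.5 * 10**(-decimal). The helper below is a hypothetical pure-Python sketch of that criterion with made-up numbers, not a replacement for the NumPy function:

```python
def almost_equal(actual, desired, decimal=2):
    """Element-wise tolerance check mirroring NumPy's documented criterion."""
    tolerance = 1.5 * 10 ** (-decimal)
    return all(abs(d - a) < tolerance for a, d in zip(actual, desired))


# Made-up probabilities, for illustration only
plaintext = [0.9132, 0.0868]
encrypted = [0.9127, 0.0873]

print(almost_equal(encrypted, plaintext, decimal=2))  # True
print(almost_equal(encrypted, plaintext, decimal=4))  # False
```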

Nice! You were able to compute the inference on patients’ data while keeping the data encrypted during the entire process.

Run Computation over the Network with gRPC#

To run the same computation over the network, you need to launch a gRPC worker at the right endpoint for each party (alice, bob and carole). You can launch the three workers as follows:

cargo run --bin comet -- --identity localhost:50000 --port 50000
cargo run --bin comet -- --identity localhost:50001 --port 50001
cargo run --bin comet -- --identity localhost:50002 --port 50002
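
In this setup, each worker's --identity is the same address that the client maps to the corresponding party when creating the gRPC runtime. One way to keep the two in sync is to derive the launch commands from a single mapping. The helper below is a hypothetical convenience, not part of PyMoose; it only formats the same commands shown above.

```python
endpoints = {
    "alice": "localhost:50000",
    "bob": "localhost:50001",
    "carole": "localhost:50002",
}


def worker_command(endpoint):
    """Format the launch command for the worker serving the given endpoint."""
    port = endpoint.rsplit(":", 1)[1]
    return f"cargo run --bin comet -- --identity {endpoint} --port {port}"


for party, endpoint in endpoints.items():
    print(f"# {party}")
    print(worker_command(endpoint))
```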

For the data, we will use the NumPy file saved in the data folder of this tutorial, which contains the patients’ data (x_test.npy).

For the Moose computation, we will use exactly the same computation as the one used with LocalMooseRuntime, except that the key of the load operation is now the actual file path.

_DATA_DIR = pathlib.Path().parent / "data"
x_path = str((_DATA_DIR / "x_test.npy").resolve())


@pm.computation
def moose_predictor_computation():
    # Alice (the hospital in our use case) loads the patients' data in plaintext.
    # Then the data gets converted from float to fixed-point.
    with predictor.alice:
        x = pm.load(x_path, dtype=pm.float64)
        x_fixed = pm.cast(x, dtype=pm.predictors.predictor_utils.DEFAULT_FIXED_DTYPE)
    # The patients' data gets secret shared when moving from host placement
    # to replicated placement.
    # Then compute the logistic regression on secret shared data
    with predictor.replicated:
        y_pred = predictor(x_fixed, pm.predictors.predictor_utils.DEFAULT_FIXED_DTYPE)

    # The predictions get revealed only to Alice (the hospital).
    # Convert the data from fixed-point to floats and return the prediction
    # to Alice who's launching the computation. You could also save the result
    # to her storage with `pm.save`.
    with predictor.alice:
        y_pred = pm.cast(y_pred, dtype=pm.float64)

    return y_pred

For the runtime, we will use pm.GrpcMooseRuntime this time. As an argument, we need to provide a mapping between the players’ identities and the gRPC host addresses. Once you set the runtime as default, you can call the Moose computation to compute the encrypted prediction. You could also evaluate the computation with runtime.evaluate_computation(computation=moose_predictor_computation, arguments={}) if you prefer.

role_map = {
    predictor.alice: "localhost:50000",
    predictor.bob: "localhost:50001",
    predictor.carole: "localhost:50002",
}

runtime = pm.GrpcMooseRuntime(role_map)
runtime.set_default()

grpc_y_pred = moose_predictor_computation()

We can finally confirm that we get the same predictions when running this computation with gRPC 🎉!

np.testing.assert_almost_equal(grpc_y_pred[0]["output_0"], y_pred, decimal=2)