
Concrete-ML is a Privacy-Preserving Machine Learning (PPML) open-source set of tools which aims to simplify the use of fully homomorphic encryption (FHE) for data scientists. Particular care was given to the simplicity of our Python package in order to make it usable by any data scientist, even those without prior cryptography knowledge.


NielsPichon/concrete-ml

 
 


Concrete-ML is a Privacy-Preserving Machine Learning (PPML) open-source set of tools built on top of The Concrete Framework by Zama. It aims to simplify the use of fully homomorphic encryption (FHE) for data scientists to help them automatically turn machine learning models into their homomorphic equivalent. Concrete-ML was designed with ease-of-use in mind, so that data scientists can use it without knowledge of cryptography. Notably, the Concrete-ML model classes are similar to those in scikit-learn and it is also possible to convert PyTorch models to FHE.

Main features.

Data scientists can use models with APIs which are close to the frameworks they use, with additional options to run inferences in FHE.

Concrete-ML features:

  • built-in models, which are ready-to-use FHE-friendly models with a user interface that is equivalent to their scikit-learn and XGBoost counterparts
  • support for custom models that can use quantization-aware training. These are developed by the user with PyTorch or Keras/TensorFlow and are imported into Concrete-ML through ONNX
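Quantization, mentioned in the feature list above, maps floating-point values to a small set of integers so that they can be processed in FHE. The sketch below illustrates the idea with a plain min-max uniform scheme. It is an assumption-laden illustration only: the helper names `quantize` and `dequantize` are hypothetical, and this is not necessarily the scheme Concrete-ML uses internally.

```python
def quantize(values, n_bits):
    """Uniformly map floats to integers in [0, 2**n_bits - 1] (illustrative only)."""
    lo, hi = min(values), max(values)
    levels = 2 ** n_bits - 1
    # Guard against a constant input, where the value range would be zero.
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = [round((v - lo) / scale) for v in values]
    return q, scale, lo


def dequantize(q, scale, zero):
    """Map quantized integers back to approximate float values."""
    return [zero + scale * qi for qi in q]


values = [-1.0, -0.2, 0.1, 0.6, 2.0]
q, scale, zero = quantize(values, n_bits=2)
approx = dequantize(q, scale, zero)

print("Quantized  :", q)       # integers between 0 and 3 for n_bits=2
print("Dequantized:", approx)  # coarse approximation of the inputs
```

With 2 bits, only four distinct levels are available, so nearby inputs collapse onto the same integer. More bits give a finer approximation at a higher computational cost.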

Installation.

Depending on your OS, Concrete-ML may be installed with Docker or with pip:

| OS / HW | Available on Docker | Available on pip |
| --- | --- | --- |
| Linux | Yes | Yes |
| Windows | Yes | Coming soon |
| Windows Subsystem for Linux | Yes | Yes |
| macOS (Intel) | Yes | Yes |
| macOS (Apple Silicon, i.e. M1, M2, etc.) | Yes | Coming soon |

Note: Concrete-ML only supports Python 3.7 (Linux only), 3.8 and 3.9.
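Before installing, it can help to check the interpreter against the note above. The following is a minimal sketch, not part of Concrete-ML: the helper `is_supported_python` is hypothetical, and it encodes only the Python versions listed in the note, not the pip availability table.

```python
import platform
import sys

# Versions taken from the note above; 3.7 is listed as Linux only.
SUPPORTED_EVERYWHERE = {(3, 8), (3, 9)}
SUPPORTED_LINUX_ONLY = {(3, 7)}


def is_supported_python(version, system):
    """Return True if the (major, minor) version on `system` matches the note."""
    major_minor = tuple(version[:2])
    if major_minor in SUPPORTED_EVERYWHERE:
        return True
    return major_minor in SUPPORTED_LINUX_ONLY and system == "Linux"


# platform.system() returns "Linux", "Darwin" or "Windows" on common platforms.
print("Supported interpreter:", is_supported_python(sys.version_info, platform.system()))
```

Under Windows Subsystem for Linux, `platform.system()` reports `"Linux"`, so that case is treated like Linux here.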

Docker.

To install with Docker, pull the concrete-ml image as follows:

docker pull zamafhe/concrete-ml:latest

Pip.

To install Concrete-ML from PyPI, run the following:

pip install -U pip wheel setuptools
pip install concrete-ml

You can find more detailed installation instructions in this part of the documentation.

A simple Concrete-ML example with scikit-learn.

The following logistic regression example stays very close to the scikit-learn API:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

from concrete.ml.sklearn import LogisticRegression

# Create the data for classification
x, y = make_classification(n_samples=100, class_sep=2, n_features=4, random_state=42)

# Retrieve train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    x, y, test_size=10, random_state=42
)

# Fix the number of bits used for quantization
model = LogisticRegression(n_bits=2)

# Fit the model
model.fit(X_train, y_train)

# Run the predictions on non-encrypted data as a reference
y_pred_clear = model.predict(X_test, execute_in_fhe=False)

# Compile into a FHE model
model.compile(x)

# Run the inference in FHE
y_pred_fhe = model.predict(X_test, execute_in_fhe=True)

print("In clear  :", y_pred_clear)
print("In FHE    :", y_pred_fhe)
print(f"Comparison: {int((y_pred_fhe == y_pred_clear).sum()/len(y_pred_fhe)*100)}% similar")

# Output:
#  In clear  : [0 0 0 1 0 1 0 1 1 1]
#  In FHE    : [0 0 0 1 0 1 0 1 1 1]
#  Comparison: 100% similar

This example is explained in more detail in the linear model documentation. Concrete-ML built-in models have APIs that are almost identical to their scikit-learn counterparts. It is also possible to convert PyTorch networks to FHE with the Concrete-ML conversion APIs. Please refer to the linear models, tree-based models and neural networks documentation for more examples, showing the scikit-learn-like API of the built-in models.
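The comparison printed at the end of the example above is the percentage of test samples for which the clear and FHE predictions agree. As a rough sketch, the same figure can be computed with a small standalone helper. The name `agreement_percent` is hypothetical and not part of Concrete-ML, and the prediction lists are copied from the example output.

```python
def agreement_percent(pred_a, pred_b):
    """Percentage of positions at which two prediction sequences are equal."""
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction sequences must have the same length")
    if len(pred_a) == 0:
        raise ValueError("prediction sequences must not be empty")
    matches = sum(a == b for a, b in zip(pred_a, pred_b))
    return 100.0 * matches / len(pred_a)


# Predictions copied from the example output above.
y_pred_clear = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
y_pred_fhe = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

print(f"Comparison: {agreement_percent(y_pred_fhe, y_pred_clear):.0f}% similar")
```

Note that this measures agreement between the two execution modes, not accuracy against the true labels `y_test`.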

Documentation.

Full, comprehensive documentation is available here: https://docs.zama.ai/concrete-ml.

Online demos and tutorials.

Various tutorials are available for the built-in models and for deep learning. In addition, several complete use cases are explored:

  • MNIST: a Python script and notebook showing quantization-aware training following FHE constraints. The model is implemented with Brevitas and is converted to FHE with Concrete-ML.

  • Titanic: a notebook which gives a solution to the Kaggle Titanic competition, built with XGBoost from Concrete-ML. It is a companion to the Kaggle notebook and was the subject of a blog post on KDnuggets.

  • Encrypted sentiment analysis: a Gradio demo which predicts whether a tweet or short message is positive, negative or neutral, in FHE of course! The live interactive demo is available on Hugging Face, and the official blog post explains how it works.

More generally, if you have built awesome projects using Concrete-ML, feel free to let us know and we'll link to them!

Citing Concrete-ML

To cite Concrete-ML, notably in academic papers, please use the following entry, which lists authors in order of first commit:

@Misc{ConcreteML,
  title={Concrete-{ML}: a Privacy-Preserving Machine Learning Library using Fully Homomorphic Encryption for Data Scientists},
  author={Arthur Meyre and Benoit Chevallier-Mames and Jordan Frery and Andrei Stoian and Roman Bredehoft and Luis Montero and Celia Kherfallah},
  year={2022-*},
  note={\url{https://github.com/zama-ai/concrete-ml}},
}


License.

This software is distributed under the BSD-3-Clause-Clear license. If you have any questions, please contact us at hello@zama.ai.
