Introducing Vision Models in Azure Machine Learning Model Catalog
Published Sep 12, 2023

We are thrilled to announce a brand-new model catalog on the Azure Machine Learning platform. The model catalog is your starting point for exploring collections of foundation models. Launched in May 2023, it offers a wide range of state-of-the-art open-source models curated by AzureML, a Hugging Face Hub Community Partner collection, a collection of Meta’s Llama 2 large language models (LLMs), and a collection of Azure OpenAI Service models. With today’s announcement, the catalog is enriched with a diverse array of curated, advanced open-source vision models, readily available to the AI community on our platform. We have curated a set of image classification models from Hugging Face transformers, and object detection and image segmentation models from MMDetection. These AzureML-curated vision models are thoroughly tested, span a variety of architectures, and come packaged with default hyperparameters that deliver good out-of-the-box performance across an array of datasets.

 

 

[Animation: vision_models.gif]

 

Image Classification Models

 

As a result of the partnership between Microsoft and Hugging Face, we are delighted to offer a diverse range of open-source image classification models. These models have been trained and fine-tuned on large datasets, delivering strong accuracy and performance across various use cases. Like the rest of the catalog, these AzureML-curated image classification models are thoroughly tested and come packaged with default hyperparameters that deliver good out-of-the-box performance across an array of datasets. The image classification models are as follows:

  • facebook-deit-base-patch16-224
  • google-vit-base-patch16-224
  • microsoft-swinv2-base-patch4-window12-192-22k
  • microsoft-beit-base-patch16-224-pt22k-ft22k

Finetuning the above AzureML-curated image classification models is accelerated using the latest technologies, ONNX Runtime Training and DeepSpeed. Together, they reduce training time and GPU hours by 10%-40%, depending on the model being finetuned, by applying memory and compute optimizations that maximize batch size and use GPU memory more efficiently. A minimal sketch of enabling these optimizations follows.
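As a minimal sketch, this acceleration is surfaced as inputs on the finetune component. The parameter names below (apply_ort, apply_deepspeed) are assumptions patterned on the finetuning example notebooks; verify them against the component’s documented inputs in the azureml registry.

```python
# Assumed finetune-component inputs; check the exact names in the azureml registry.
acceleration_settings = {
    "apply_ort": True,         # train with the ONNX Runtime Training backend
    "apply_deepspeed": True,   # enable DeepSpeed memory/compute optimizations
    "number_of_epochs": 15,
    "training_batch_size": 4,  # the optimizations make larger effective batches feasible
}
```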

 

In addition to the AzureML-curated models above, you can use any image classification model from the Hugging Face transformers library. To learn how, refer to the “How to finetune, evaluate, and deploy these models” section below; a minimal sketch of referencing either kind of model follows.
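For example, here is a minimal sketch of referencing a curated model from the azureml system registry versus an arbitrary Hugging Face model. Passing a raw Hugging Face hub id as the model-import component’s model_name input is an assumption patterned on the example notebooks.

```python
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Connect to the public "azureml" registry that hosts the curated models.
registry_ml_client = MLClient(credential=DefaultAzureCredential(), registry_name="azureml")

# Option 1: a curated image classification model from the azureml system registry.
curated_model = registry_ml_client.models.get(
    "microsoft-beit-base-patch16-224-pt22k-ft22k", label="latest"
)
print(curated_model.id)

# Option 2 (assumption): any Hugging Face image classification model, referenced
# by its hub id and passed to the model-import component as `model_name`.
hf_model_name = "google/vit-base-patch32-384"
```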

 


 

Object Detection and Instance Segmentation Models

 

In addition to image classification, we are also introducing a suite of powerful instance segmentation and object detection models. These models have been trained by the maintainers of the MMDetection GitHub repository, making them an essential resource for anyone working on object detection and segmentation tasks. Like the earlier image classification models, these AzureML-curated object detection and instance segmentation models are thoroughly tested and come packaged with default hyperparameters that deliver good out-of-the-box performance across an array of datasets. The object detection and image segmentation models now available are as follows:

  • sparse_rcnn_r50_fpn_300_proposals_crop_mstrain_480-800_3x_coco
  • sparse_rcnn_r101_fpn_300_proposals_crop_mstrain_480-800_3x_coco
  • vfnet_r50_fpn_mdconv_c3-c5_mstrain_2x_coco
  • vfnet_x101_64x4d_fpn_mdconv_c3-c5_mstrain_2x_coco
  • deformable_detr_twostage_refine_r50_16x2_50e_coco
  • yolof_r50_c5_8x8_1x_coco
  • mask_rcnn_swin-t-p4-w7_fpn_1x_coco

In addition to the curated models above, you can use any object detection or instance segmentation model from the MMDetection library. To learn how, refer to the “How to finetune, evaluate, and deploy these models” section below.

 

[Animation: is_and_od_2.gif]

 

Our Approach

 

We use a pipeline-based, modular approach that lets users import, finetune, and evaluate these vision models seamlessly for their specific use cases and datasets. The pipeline connects components, each performing a dedicated function: model import, finetuning, and model evaluation. Figure 1 depicts the end-to-end flow from model import to model deployment, and a minimal SDK sketch follows the component list below.

 

  • Model Import component: This component imports the chosen model, either an AzureML-curated model from the azureml system registry or a model pulled directly from the Hugging Face hub or MMDetection, depending on the task type.
  • Finetune component: This component finetunes the imported model on a custom or readily available dataset.
  • Evaluate component: Once finetuning is complete, the best model can be evaluated on a test dataset that you provide, allowing you to track its performance. The component can also compare models/checkpoints across multiple datasets.
  • Deploy: Once you are happy with a model’s performance, you can deploy it to a real-time or batch endpoint. Once deployed, you can test the model or access the REST API for scoring requests from your application(s).
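To make the flow concrete, here is a minimal sketch of composing and submitting such a pipeline with the azure-ai-ml SDK. The pipeline component name and its input/output names (model_name, task_name, compute_model_import, mlflow_model_folder) are assumptions patterned on the example notebooks; check the component’s page in the azureml registry for the exact signature.

```python
from azure.ai.ml import Input, MLClient
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.dsl import pipeline
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()

# Workspace client submits jobs; registry client fetches curated assets.
workspace_ml_client = MLClient(
    credential,
    subscription_id="<SUBSCRIPTION_ID>",
    resource_group_name="<RESOURCE_GROUP>",
    workspace_name="<WORKSPACE_NAME>",
)
registry_ml_client = MLClient(credential, registry_name="azureml")

# Assumed component name; look up the exact import/finetune/evaluate pipeline
# component for your task type in the azureml registry before running this.
finetune_component = registry_ml_client.components.get(
    name="mmdetection_image_objectdetection_instancesegmentation_pipeline",
    label="latest",
)

@pipeline
def vision_finetune(training_data, validation_data):
    finetune_job = finetune_component(
        model_name="yolof_r50_c5_8x8_1x_coco",  # curated model to import and finetune
        task_name="image-object-detection",
        training_data=training_data,
        validation_data=validation_data,
        compute_model_import="gpu-cluster",     # assumed compute cluster names
        compute_finetune="gpu-cluster",
    )
    return {"trained_model": finetune_job.outputs.mlflow_model_folder}

pipeline_job = vision_finetune(
    training_data=Input(type=AssetTypes.MLTABLE, path="<training-mltable-folder>"),
    validation_data=Input(type=AssetTypes.MLTABLE, path="<validation-mltable-folder>"),
)
submitted_job = workspace_ml_client.jobs.create_or_update(
    pipeline_job, experiment_name="vision-models-finetune"
)
workspace_ml_client.jobs.stream(submitted_job.name)
```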
 

[Figure: E2E.png]

 

Figure 1. End-to-end flow from model import to model deployment.

 


How to finetune, evaluate, and deploy these models?

 

To finetune, evaluate, and deploy these cutting-edge AzureML-curated vision models in Azure Machine Learning Studio, follow the steps below.

 

1. Explore the vision models in AzureML Studio Model Catalog

 

In AzureML Studio, navigate to the Model Catalog section. To explore the newly available vision models, look for “image classification”, “image segmentation”, or “object detection” in the task filters.

 

[Animation: three_task_types.gif]

 

 

2. Review the Model Card

 

Once you find a model that suits your needs, review the detailed information on the model card about its training approach, limitations, and potential biases. In this blog, we use the “yolof_r50_c5_8x8_1x_coco” model for an object detection task.

 

[Animation: yolof_model_Card.gif]

 

 

 

3. Finetune the selected model

 

In the model card, scroll down to the “Finetune Samples” section. You can use either the SDK or the CLI example to finetune the model; in this blog, we demonstrate finetuning with the SDK example. Either clone the azureml-examples GitHub repository to your machine and browse to the specific Jupyter notebook, or download that notebook directly. Then run it cell by cell, providing user-specific details as needed.

 

[Animation: finetune_sdk_example_v19.gif]

 

 

Once the finetune job is complete, you can check the metrics in AzureML Studio by following the link to the job; you can also fetch them programmatically, as sketched below.
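Here is a minimal sketch of pulling the metrics with MLflow against the workspace tracking store (this requires the azureml-mlflow package; the job-name placeholder is yours to fill in):

```python
import mlflow
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<SUBSCRIPTION_ID>",
    resource_group_name="<RESOURCE_GROUP>",
    workspace_name="<WORKSPACE_NAME>",
)

# Point MLflow at the workspace tracking store.
workspace = ml_client.workspaces.get(ml_client.workspace_name)
mlflow.set_tracking_uri(workspace.mlflow_tracking_uri)

# An AzureML job name doubles as an MLflow run id; finetuning metrics are
# typically logged on the finetune child run (assumption).
run = mlflow.get_run(run_id="<finetune-job-name>")
print(run.data.metrics)  # e.g. mAP for detection, accuracy for classification
```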

 

 

[Animation: fin_job_metrics.gif]

 

 

Now that the model is finetuned, you can follow the cells to register the model, deploy it to a real-time endpoint, and test it. The deployment process typically takes 10-15 minutes. Once your model is deployed, you can test it or access the REST API for scoring requests from your applications. When you are done testing, don’t forget to delete the real-time endpoint. A minimal SDK sketch of these register/deploy/test/delete steps follows.
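The corresponding SDK cells boil down to something like the following minimal sketch. The endpoint, deployment, and model names are placeholders, and the instance SKU is an assumption, so pick one available in your region; ml_client is reused from the metrics sketch above.

```python
from azure.ai.ml.entities import ManagedOnlineDeployment, ManagedOnlineEndpoint

# Create a real-time (managed online) endpoint.
endpoint = ManagedOnlineEndpoint(name="yolof-od-endpoint", auth_mode="key")
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

# Deploy the registered, finetuned model behind the endpoint.
deployment = ManagedOnlineDeployment(
    name="demo",
    endpoint_name=endpoint.name,
    model="<registered-finetuned-model-id>",
    instance_type="Standard_NC6s_v3",  # assumed GPU SKU; choose one available to you
    instance_count=1,
)
ml_client.online_deployments.begin_create_or_update(deployment).result()

# Score a sample request, then clean up when you are done testing.
response = ml_client.online_endpoints.invoke(
    endpoint_name=endpoint.name, deployment_name="demo", request_file="sample_request.json"
)
print(response)
ml_client.online_endpoints.begin_delete(name=endpoint.name)
```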

 

 

4. Evaluate the selected model

 

If your test data is similar to the dataset used for training the base model, you can also run evaluation directly on the base model. In the model card, scroll to the “Model Evaluation” section. Either clone the azureml-examples GitHub repository to your machine and browse to the specific SDK example, or download the SDK example notebook directly. Then run it, providing user-specific details as needed.

 

If your test dataset is not similar to the dataset used for training the base model, we highly recommend finetuning the base model on your dataset and running evaluation on the finetuned model. As illustrated in the example notebook, you can use the model evaluation component to compare the performance of various models across various datasets. A minimal sketch of invoking the evaluation component follows.
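As a minimal sketch, invoking the evaluation component could look like this. The component name “model_evaluation_pipeline” and its input/output names are assumptions patterned on the foundation-model evaluation notebooks; verify them in the azureml registry. The registry and workspace clients are reused from the finetuning sketch above.

```python
from azure.ai.ml import Input
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.dsl import pipeline

# Assumed component name; verify in the azureml registry.
evaluation_component = registry_ml_client.components.get(
    name="model_evaluation_pipeline", label="latest"
)

@pipeline
def evaluate_model(test_data):
    eval_job = evaluation_component(
        task="image-object-detection",
        mlflow_model=Input(type=AssetTypes.MLFLOW_MODEL, path="<finetuned-model-id-or-path>"),
        test_data=test_data,
        compute_name="gpu-cluster",  # assumed compute cluster name
    )
    return {"evaluation_result": eval_job.outputs.evaluation_result}

eval_pipeline_job = workspace_ml_client.jobs.create_or_update(
    evaluate_model(test_data=Input(type=AssetTypes.MLTABLE, path="<test-mltable-folder>")),
    experiment_name="vision-models-evaluation",
)
```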

 

[Animation: eval_sdk_example_v9.gif]

 

 

 

5. Deploy the selected model and Infer

 

If your test data is similar to the data used for training the base model, you can deploy the base model from the UI, SDK, or CLI to either a real-time or a batch endpoint, depending on your needs. Below we demonstrate the single-click, no-code deployment of the base model to a real-time endpoint from the UI. You can also deploy using the SDK or CLI examples by scrolling to the “Inference samples” section in the model card: either clone the azureml-examples GitHub repository to your machine and browse to the specific SDK or CLI example, or download it directly, then run it, providing user-specific details as needed. A minimal scoring-request sketch follows the notes below.

  • If your test samples are not similar to the dataset used for training the base model, we highly recommend that you first finetune the base model on your dataset (as shown in the previous steps), then register and deploy the finetuned model.
  • If you are unsure about choosing between real-time and batch endpoints, learn more in the Online and Batch endpoints documentation.
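As a minimal sketch, a scoring request against the real-time endpoint could be built as follows. The payload schema (an “image” column carrying a base64-encoded string) is an assumption based on the convention used by these MLflow image models; confirm it against the “Inference samples” in the model card. ml_client and the endpoint names are reused from the deployment sketch above.

```python
import base64
import json

# Encode a local test image and wrap it in the request schema.
with open("sample_image.jpg", "rb") as f:
    encoded_image = base64.b64encode(f.read()).decode("utf-8")

request = {"input_data": {"columns": ["image"], "data": [encoded_image]}}
with open("sample_request.json", "w") as f:
    json.dump(request, f)

# Score against the deployed real-time endpoint.
response = ml_client.online_endpoints.invoke(
    endpoint_name="yolof-od-endpoint",
    deployment_name="demo",
    request_file="sample_request.json",
)
print(response)  # boxes, labels, and scores for object detection
```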
 

[Animation: deploy_test_ep.gif]

 

 

Responsible AI is at the heart of Microsoft’s approach to AI and how we partner. For years we’ve invested heavily in making Azure the place for responsible, cutting-edge AI innovation, whether customers are building their own models or using pre-built and customizable models from Microsoft, Meta, OpenAI and the open-source ecosystem.

We are thrilled to empower our users with cutting-edge vision technology while promoting responsible AI practices that can be applied to any of these models. We cannot wait to witness the incredible applications and solutions our users will create using these state-of-the-art vision models.

 

 

Explore the SDK and CLI examples for foundation models in the azureml-examples GitHub repo!

Learn more!

 

 
