Azure/azureml-sdk-for-r

Azure Machine Learning SDK for R (preview)

**The Azure Machine Learning SDK for R was deprecated at the end of 2021 to make way for an improved R training and deployment experience using Azure Machine Learning CLI 2.0. See the samples repository to get started with the 2.0 CLI.**

Data scientists and AI developers use the Azure Machine Learning SDK for R to build and run machine learning workflows with Azure Machine Learning.

Azure Machine Learning SDK for R uses the reticulate package to bind to Azure Machine Learning's Python SDK. By binding directly to Python, the SDK for R gives you access to the core objects and methods implemented in the Python SDK from any R environment you choose.

Main capabilities of the SDK include:

  • Manage cloud resources for monitoring, logging, and organizing your machine learning experiments.
  • Train models using cloud resources, including GPU-accelerated model training.
  • Deploy your models as webservices on Azure Container Instances (ACI) and Azure Kubernetes Service (AKS).

For complete documentation, see the package website: https://azure.github.io/azureml-sdk-for-r

Key Features and Roadmap

✔️ feature available 🔄 in progress 📋 planned

| Feature | Description | Status |
| --- | --- | --- |
| Workspace | The Workspace class is a foundational resource in the cloud that you use to experiment, train, and deploy machine learning models. | ✔️ |
| Compute | Cloud resources where you can train your machine learning models. | ✔️ |
| Data Plane Resources | Datastore, which stores connection information to an Azure storage service, and DataReference, which describes how and where data should be made available in a run. | ✔️ |
| Experiment | A foundational cloud resource that represents a collection of trials (individual model runs). | ✔️ |
| Run | A Run object represents a single trial of an experiment, and is the object that you use to monitor the asynchronous execution of a trial, store the output of the trial, analyze results, and access generated artifacts. You use Run inside your experimentation code to log metrics and artifacts to the Run History service. | ✔️ |
| Estimator | A generic estimator to train data using any supplied training script. | ✔️ |
| HyperDrive | HyperDrive automates the process of running hyperparameter sweeps for an Experiment. | ✔️ |
| Model | Cloud representations of machine learning models that help you transfer models between local development environments and the Workspace object in the cloud. | ✔️ |
| Webservice | Models can be packaged into container images that include the runtime environment and dependencies. Models must be built into an image before you deploy them as a web service. Webservice is the abstract parent class for creating and deploying web services for your models. | ✔️ |
| Dataset | An Azure Machine Learning Dataset allows you to explore, transform, and manage your data for various scenarios such as model training and pipeline creation. When you are ready to use the data for training, you can save the Dataset to your Azure ML workspace to get versioning and reproducibility capabilities. TabularDataset support is experimental. | ✔️ |
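To show how Workspace, Compute, Estimator, Experiment, and Run fit together, here is a minimal sketch of a remote training job. It assumes a workspace object `ws` is already available; the helper name `train_on_cluster`, the cluster name, VM size, script name, and experiment name are illustrative assumptions rather than values from this README.

```r
# Sketch only: cluster name, VM size, script name, and experiment name
# are hypothetical defaults chosen for illustration.
train_on_cluster <- function(ws,
                             cluster_name = "r-cluster",
                             entry_script = "train.R",
                             experiment_name = "my-experiment") {
  # Compute: reuse the cluster if it already exists, otherwise provision it.
  compute <- azuremlsdk::get_compute(ws, cluster_name = cluster_name)
  if (is.null(compute)) {
    compute <- azuremlsdk::create_aml_compute(workspace = ws,
                                              cluster_name = cluster_name,
                                              vm_size = "STANDARD_D2_V2",
                                              max_nodes = 1)
    azuremlsdk::wait_for_provisioning_completion(compute, show_output = TRUE)
  }

  # Estimator: wraps the training script found in the current directory.
  est <- azuremlsdk::estimator(source_directory = ".",
                               entry_script = entry_script,
                               compute_target = compute)

  # Experiment and Run: submit the estimator and block until the run ends.
  exp <- azuremlsdk::experiment(ws, experiment_name)
  run <- azuremlsdk::submit_experiment(exp, est)
  azuremlsdk::wait_for_run_completion(run, show_output = TRUE)

  # Return whatever metrics the training script logged to the run.
  azuremlsdk::get_run_metrics(run)
}
```

Inside the training script itself, metrics can be recorded with `log_metric_to_run()`, so that they appear in the result of `get_run_metrics()`.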

Installation

Install Conda if it is not already installed, choosing Python 3.5 or later.

# Install Azure ML SDK from CRAN
install.packages("azuremlsdk")

# Or the development version from GitHub
install.packages("remotes")
remotes::install_github('https://github.com/Azure/azureml-sdk-for-r', build_vignettes = TRUE)

# Then, use `install_azureml()` to install the compiled code from the AzureML Python SDK.
azuremlsdk::install_azureml()

Now, you're ready to get started!

For a more detailed walk-through of the installation process, advanced options, and troubleshooting, see our Installation Guide.
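As a quick sanity check after installation, the sketch below reports whether the R package and the underlying Python SDK are both visible from the current R session. The helper name `check_azureml_install` is hypothetical, and the check assumes the Python SDK was installed into the environment that reticulate discovers by default.

```r
# Hypothetical helper: reports whether the R package and the Python SDK
# module ("azureml.core") can both be found from this R session.
check_azureml_install <- function() {
  r_pkg <- requireNamespace("azuremlsdk", quietly = TRUE)

  # py_module_available() initializes Python; treat any failure as "not found".
  py_sdk <- requireNamespace("reticulate", quietly = TRUE) &&
    isTRUE(tryCatch(reticulate::py_module_available("azureml.core"),
                    error = function(e) FALSE))

  c(r_package = r_pkg, python_sdk = py_sdk)
}

print(check_azureml_install())
```

If `python_sdk` is `FALSE` while `r_package` is `TRUE`, rerunning `azuremlsdk::install_azureml()` is a reasonable first step.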

Getting Started

To begin running experiments with Azure Machine Learning, you must establish a connection to your Azure Machine Learning workspace.

  1. If you don't already have a workspace, create one:

    # If you haven't already set up a resource group, set `create_resource_group = TRUE`
    # and set `resource_group` to your desired resource group name to create the
    # resource group in the same step.
    new_ws <- create_workspace(name = "<workspace_name>",
                               subscription_id = "<subscription_id>",
                               resource_group = "<resource_group_name>",
                               location = "<location>",
                               create_resource_group = FALSE)

    After the workspace is created, you can save its details to a configuration file on the local machine.

    write_workspace_config(new_ws)

  2. If you have an existing workspace associated with your subscription, retrieve it from the server:

    existing_ws <- get_workspace(name = "<workspace_name>",
                                 subscription_id = "<subscription_id>",
                                 resource_group = "<resource_group_name>")

    Or, if you have the workspace config.json file on your local machine, load the workspace from it:

    loaded_ws <- load_workspace_from_config()

Once you've accessed your workspace, you can begin running and tracking your own experiments with Azure Machine Learning SDK for R.
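The two retrieval options above can be combined into one helper that prefers the local config.json and falls back to querying Azure. This is a minimal sketch: the function name `connect_workspace` is hypothetical, and it assumes `load_workspace_from_config()` signals an R error when no usable config file is found.

```r
# Hypothetical helper: load the workspace from a local config.json if
# possible; otherwise fetch it from Azure and cache the config locally.
connect_workspace <- function(name, subscription_id, resource_group) {
  tryCatch(
    azuremlsdk::load_workspace_from_config(),
    error = function(e) {
      message("No usable config.json found; retrieving the workspace from Azure.")
      ws <- azuremlsdk::get_workspace(name = name,
                                      subscription_id = subscription_id,
                                      resource_group = resource_group)
      # Save the config so the next call can load it locally.
      azuremlsdk::write_workspace_config(ws)
      ws
    }
  )
}
```

A call such as `ws <- connect_workspace("<workspace_name>", "<subscription_id>", "<resource_group_name>")` then works whether or not the config file is already present.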

Take a look at our code samples and end-to-end vignettes for examples of what's possible with the SDK!

Resources

Contribute

We welcome contributions from the community. If you would like to contribute to the repository, please refer to the contribution guide.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.