Introducing JAIS: Arabic-centric Large Language Model on Azure
Published May 21 2024 08:30 AM 2,883 Views
Microsoft

Arabic is the sixth most spoken language in the world, with over 400 million Arabic speakers¹. It is a language of rich cultural and linguistic diversity, with many dialects and variations that reflect the history and identity of its speakers. Given this importance, we are thrilled to announce the launch of JAIS on Azure, a new Arabic-centric foundation and instruction-tuned open generative large language model (LLM) developed by Core42, a G42 company and leading UAE-based artificial intelligence (AI) technology holding company. This aligns with our recent $1.5 billion investment in G42 to enhance global AI and cloud services, with a focus on safety, security, and skilling initiatives.  

 

AnjaleePatel_0-1715810833276.png

 

JAIS 30B Chat Models-as-a-Service is now available on Microsoft’s Azure AI model catalog, giving developers and businesses access to cutting-edge capabilities in both Arabic and English. This is one step closer to making AI and LLMs more accessible to speakers across the world. By leveraging a custom-built vocabulary, a massive pretraining corpus, and several architectural innovations, JAIS demonstrates great performance and efficiency in a variety of tasks, such as summarization, translation, text generation, and information retrieval. 

 

JAIS 30B Chat on Azure allows for increased flexibility in model choice and addresses the multilingual needs of companies. Azure AI provides a secure, scalable, and compliant environment for deploying and managing AI solutions, as well as a rich set of tools and services for developing, fine-tuning, and evaluating AI models.  

 

"We’ve seen JAIS used in powerful use cases to transform access and experience in industries such as healthcare and finance. Now with availability on Azure and our expanded reach across the globe, I am excited about the opportunities we see to further empower individuals and organizations to leverage AI for diverse applications." - Andrew Jackson, Chief AI Officer at Core42 

 

What is JAIS 30B Chat? 

 

JAIS is an auto-regressive bilingual LLM for Arabic and English based on the GPT-3 decoder-only architecture, which uses a transformer-based neural network to generate text from a given input. The model is pretrained on a mixture of Arabic and English texts, including source code in various programming languages, totaling 1.63 trillion tokens. Of these, 475 billion tokens come from Arabic documents. The model now available in Azure is JAIS 30B Chat, an instruction-tuned variant of the foundation JAIS model, with 30 billion parameters, that can handle a wide variety of use cases across industries as outlined by Core42: 

 

  • Government and Public Sector: Enhances communication, improves public service delivery, and digitizes government archives and documents. 
  • Education: Supports educational technology platforms by providing language learning, automated tutoring, and translation tools. 
  • Banking and Finance: Helps financial institutions with customer service automation, document analysis, and compliance monitoring. 
  • Healthcare: Assists in patient management systems by interpreting, automating, and translating patient and medical data. 
  • Media and Entertainment: Generates, summarizes, and creates interactive and engaging media content for Arabic and English-speaking audiences. 

JAIS 30B Chat uses a dense transformer-based causal large language model that incorporates several recent architectural advancements, such as ALiBi positional embeddings and SwiGLU non-linearity, enhancing its ability to handle longer inputs at inference time than training and improve training efficiency. Based on information from Core42, the model also uses a custom-made tokenizer that splits words into smaller units, reducing the number of tokens needed to process Arabic text by 3-4 times compared to other models. This results in higher quality, more fluent text generation, and faster and more efficient inference.  

 

Customers can not only get JAIS 30B Chat Inferencing API access, but also access to Azure’s cutting-edge AI infrastructure and development environment where you can build end-to-end, evaluate, and deploy AI-powered solutions.   

 

How does JAIS 30B Chat perform? 

 

JAIS 30B Chat has been extensively evaluated by Core42 on a wide range of benchmarks and has shown remarkable results. In Arabic, JAIS 30B Chat outperforms all existing open Arabic and multilingual models by a sizable margin, demonstrating better knowledge and reasoning capabilities. To learn more about the model's performance, check out the whitepaper.  

 

How to get started with JAIS on Azure 

 

Developers and enterprises worldwide can discover the possibilities of JAIS 30B Chat on the Azure AI model catalog. Learn more about the new model on the Core42 page here. To access JAIS 30B Chat on Azure, you will need an Azure subscription and an API key. You can use the Azure Portal, the Azure CLI, or the Azure SDKs to create and manage your resources. You can use the model in Azure AI Studio to start building for your use cases, leveraging Azure's powerful compute and storage options. For more details, please refer to the documentation 

 

We welcome you to explore our offerings and are excited to see what you will create with JAIS on Azure. 

 

FAQs  

  • What does it cost to use the JAIS 30B Chat model on Azure? 
    • You are billed based on the number of prompt and completions tokens. You can review the pricing in the Marketplace offer details tab when deploying the model. You can also find the pricing on the Azure Marketplace. 
    • Paygo-inference-input tokens are $0.0032 per 1k tokens, and paygo-inference-output-tokens are $0.00971 per 1k tokens 
  • Do I need GPU capacity in my Azure subscription to use JAIS 30B Chat? 
    • No, you do not need GPU capacity. The JAIS 30B Chat is offered as an API through Models as a Service.  
  •  Is JAIS 30B Chat available in Azure Machine Learning Studio? 
    • Yes, JAIS 30B Chat is available in the model catalog in both Azure AI Studio and Azure Machine Learning Studio. 
  • JAIS 30B Chat is listed on the Azure Marketplace. Can I purchase and use JAIS 30B Chat directly from Azure Marketplace? 
    • Azure Marketplace is our foundation for commercial transactions for models built on or built for Azure. The Azure Marketplace enables the purchasing and billing of JAIS 30B Chat. However, model discoverability occurs in both Azure Marketplace and the Azure AI model catalog. Meaning you can search and find JAIS 30B Chat in both the Azure Marketplace and Azure AI model catalog.
      • If you search for JAIS 30B Chat in Azure Marketplace, you can subscribe to the offer before being redirected to the Azure AI model catalog in Azure AI Studio where you can complete subscribing and can deploy the model.
      • If you search for JAIS 30B Chat in the Azure AI model catalog, you can subscribe and deploy the model from the Azure AI model catalog without starting from the Azure Marketplace. The Azure Marketplace still tracks the underlying commerce flow.
  • Given that JAIS 30B Chat is billed through the Azure Marketplace, does it retire my Azure consumption commitment (aka MACC)? 
  • Is my inference data shared with Core42? 
  • Are there rate limits for the JAIS 30B Chat model on Azure? 
    • Yes, there are rate limits for the JAIS 30B Chat model on Azure. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. Contact Azure customer support if you have additional questions.  
  • Is the JAIS 30B Chat model region specific? 
    • JAIS 30B Chat model API endpoints can be created in AI Studio projects to Azure Machine Learning workspaces in EastUS2 and Sweden Central. If you want to use the model in prompt flow in project or workspaces in other regions, you can use the API and key as a connection to prompt flow manually. Essentially, you can use the API from any Azure region once you create it in EastUS2 or Sweden Central. 
  • Can I fine-tune the JAIS 30B Chat model on Azure? 
    • You cannot currently fine-tune the model through Azure AI Studio, stay tuned for more updates. 
  • Can I use MaaS models in any Azure subscription types?
    • Customers can use MaaS models in all Azure subsection types with a valid payment method, except for the CSP (Cloud Solution Provider) program. Free or trial Azure subscriptions are not supported.

¹ Meet Jais, The World’s Most Advanced Arabic LLM (g42.ai) and The most spoken languages worldwide 2023 | Statista

Version history
Last update:
‎Jun 17 2024 11:34 AM
Updated by: