Infinite loading issue when uploading training data to Fine Tune ChatGPT 4

Freddie 0

I am trying to fine-tune ChatGPT 4 for live chat queries using Azure AI Studio. Upon uploading training data to the Fine Tuning (preview) section, I click on the "+Fine-tune model" button to submit the data. However, the blue loading bar appears and continues indefinitely without submitting my data. This is preventing me from fine-tuning my model. Attached is the format of my training data and its size is about 28MB.

Vinodh247 12,831 Reputation points

2024-07-07T10:50:23.5933333+00:00
First, I hope you have a stable internet connection.

Fine-tuning is essential for adapting ChatGPT 4 to specific tasks or domains. It improves performance and contextually tailors the model. Prepare a high-quality dataset relevant to your target task or domain.

Try splitting the file into smaller chunks and submitting them separately.
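One way to try the splitting suggestion is sketched below in Python. It assumes the training file is JSONL (one conversation per line) and cuts only on line boundaries, so no conversation is broken in half. The function name `split_jsonl`, the part-file naming, and the default of 1,000 lines per part are illustrative choices, not anything Azure AI Studio requires.

```python
from pathlib import Path


def split_jsonl(src: str, lines_per_chunk: int = 1000) -> list[Path]:
    """Split a JSONL file into numbered part files without breaking any line."""
    src_path = Path(src)
    parts: list[Path] = []
    chunk: list[str] = []

    def flush() -> None:
        # Write the buffered lines to the next numbered part file.
        part = src_path.with_name(f"{src_path.stem}.part{len(parts) + 1}.jsonl")
        part.write_text("".join(chunk), encoding="utf-8")
        parts.append(part)
        chunk.clear()

    # "utf-8-sig" reads UTF-8 with or without a leading byte-order mark.
    with src_path.open(encoding="utf-8-sig") as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            chunk.append(line if line.endswith("\n") else line + "\n")
            if len(chunk) >= lines_per_chunk:
                flush()
    if chunk:
        flush()  # write the final, possibly shorter, part
    return parts
```

Uploading the parts one at a time can help show whether the hang is tied to the overall file size or to specific lines in the data.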

Open the browser’s developer tools (usually F12) and check the console for any error messages when you attempt to submit the training data. This may give clues about what is going wrong.
Freddie 0 Reputation points

2024-07-07T12:17:00.9466667+00:00

I do have a stable internet connection.
My file is about 28MB, which as far as I can tell shouldn't be a problem, but I tried uploading a smaller file anyway and got the same result.
I've checked the browser console and it didn't show any error messages.

I'm really confused as to what the solution could be, so if there is any information that I haven't given that you need, please just ask.
romungi-MSFT 43,681 Reputation points Microsoft Employee

2024-07-08T08:28:25.5533333+00:00

@Freddie Do you have the Cognitive Services OpenAI Contributor role assigned? This is one of the prerequisites for fine-tuning a model, as per the documentation.

1 answer

mikelydick 76 Reputation points

2024-07-07T12:36:00.4133333+00:00
Ensure that your training data is correctly formatted and within the size limits specified by Azure AI Studio. The data should typically be in JSONL format and validated using tools like the OpenAI CLI data preparation tool. Given that your data is 28MB, it should be within acceptable limits, but double-check the formatting.

Verify that the region you are using supports fine-tuning for the specific model you are working with. Some regions may have capacity constraints or may not support fine-tuning at all times. You can check the supported regions and models in the Azure AI Studio documentation.

Someone else who ran into a region issue:
https://www.reddit.com/r/AZURE/comments/13y6e0g/issues_uploading_training_file_on_azure_ai_studio/

See the regional capacity table:
https://learn.microsoft.com/en-us/azure/ai-studio/concepts/fine-tuning-overview

For fine-tuning ChatGPT models in Azure AI Studio, the training data generally needs to adhere to the following format requirements:

File Format:

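The file should be in JSONL (JSON Lines) format, where each line is a complete, valid JSON object. This matches the format mentioned above and is what the Azure OpenAI fine-tuning documentation asks for.

A single JSON document (.json extension), such as one large array of conversations, is generally not accepted as training data.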

Data Structure:

Each entry in the file should represent a single conversation or example.

The structure usually follows a pattern of an optional "system" message followed by alternating "user" and "assistant" messages.

Required Fields:

Each entry typically needs to include:

A "messages" array containing the conversation

Each message in the array should have a "role" (system, user, or assistant) and "content"

Example Structure:

{ "messages": [ {"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "The capital of France is Paris."} ] }

Multiple Conversations:

If you're including multiple conversations, each should be a separate JSON object.

In JSONL format, each line would be a complete conversation.

Special Characters:

Ensure that any special characters are properly escaped according to JSON rules.

Be particularly careful with quotation marks and backslashes.

UTF-8 Encoding:

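The file should be saved with UTF-8 encoding to properly handle all characters. The Azure OpenAI fine-tuning documentation additionally asks for a byte-order mark (BOM), so check whether your editor or export script writes one.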

File Size:

While you mentioned your file is about 28MB, make sure it doesn't exceed the maximum file size limit set by Azure AI Studio (documented as 512 MB at the time of writing, but this can vary, so check the current documentation).

Consistency:

Ensure all entries in your dataset follow the same structure consistently.

Validation:

Use a JSON validator tool to check your file for any syntax errors before uploading.
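The checks listed above can be sketched as a small Python script. This is a minimal example, not an official validator: the function name `validate_jsonl` and the set of allowed roles are assumptions based on the chat format described in this answer, and the service may enforce rules that are not checked here.

```python
import json

# Roles described in this answer; treat the exact set as an assumption.
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_jsonl(path: str) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems: list[str] = []
    try:
        # "utf-8-sig" accepts UTF-8 with or without a byte-order mark.
        with open(path, encoding="utf-8-sig") as f:
            lines = f.readlines()
    except UnicodeDecodeError as exc:
        return [f"file is not valid UTF-8: {exc}"]

    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            problems.append(f"line {lineno}: blank line")
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {lineno}: invalid JSON ({exc.msg})")
            continue
        messages = entry.get("messages") if isinstance(entry, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append(f"line {lineno}: missing or empty 'messages' array")
            continue
        for i, msg in enumerate(messages):
            if not isinstance(msg, dict):
                problems.append(f"line {lineno}: message {i} is not an object")
                continue
            role = msg.get("role")
            if not isinstance(role, str) or role not in ALLOWED_ROLES:
                problems.append(f"line {lineno}: message {i} has unexpected role {role!r}")
            if not isinstance(msg.get("content"), str):
                problems.append(f"line {lineno}: message {i} has no string 'content'")
    return problems
```

Running `validate_jsonl("training.jsonl")` (a hypothetical file name) and printing the returned list gives line numbers to inspect before attempting another upload.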

If your data doesn't meet these requirements, the fine-tuning process may fail to start or you might encounter the indefinite loading issue you described. It's worth double-checking your data against these criteria and perhaps sharing a small, anonymized sample of your data structure (without any sensitive information) to get more specific guidance.
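If the data needs to be regenerated, letting a JSON library do the serialization avoids most escaping and encoding mistakes. The sketch below assumes the conversations are already in memory as lists of message dictionaries. The function name `write_jsonl` and the `bom` parameter are hypothetical, and writing a byte-order mark by default reflects my reading of the Azure OpenAI documentation, so confirm it against the current requirements.

```python
import json


def write_jsonl(conversations: list[list[dict]], path: str, bom: bool = True) -> None:
    """Write one {"messages": [...]} object per line, with escaping handled by json.dumps."""
    # "utf-8-sig" prepends a byte-order mark; plain "utf-8" does not.
    encoding = "utf-8-sig" if bom else "utf-8"
    with open(path, "w", encoding=encoding, newline="\n") as f:
        for messages in conversations:
            # json.dumps escapes quotation marks, backslashes, and newlines,
            # so each conversation stays on a single line.
            f.write(json.dumps({"messages": messages}, ensure_ascii=False) + "\n")


if __name__ == "__main__":
    example = [
        [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What's the capital of France?"},
            {"role": "assistant", "content": "The capital of France is Paris."},
        ]
    ]
    write_jsonl(example, "training_sample.jsonl")
```

Building the file this way also keeps every entry in the same structure, which covers the consistency point above.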