
Uncaught exception input sequence_length is >= max_length #646

Closed
hubertwang opened this issue Jun 26, 2024 · 7 comments


Hi everyone,

I recently tried the Phi-3 example (onnxruntime-inference-example/mobile/examples/phi-3) on iPhone.
Sometimes the output of Phi-3 is longer than my max_length, and my app crashes because I am not able to catch the exception.

libc++abi: terminating due to uncaught exception of type std::runtime_error: input sequence_length (12) is >= max_length (10)

I tried both Objective-C and C++ style try-catch, but neither caught this exception.
Has anyone had the same issue?

Thanks!

natke (Contributor) commented Jun 27, 2024

Hi @hubertwang, can you please share your prompt and your max_length value?

hubertwang (Author) commented Jun 27, 2024

> Hi @hubertwang, can you please share your prompt and your max_length value?

Hi @natke,

Yes, I tried two relatively extreme conditions.

First, I input a privacy policy extract from the App Store and asked the prompt to analyze it. With max_length set to 200 (the default from the example), the output is around 800~900 tokens and the exception is thrown.

After I hit this exception, I tried another prompt that I expected to produce a short answer:

"How are you?" with max_length 10

I expected something like "I am good" or "good", but it still threw the exception: output (11, 12) > (10)

Then I tried to further constrain the answer:

"How are you, answer good or no good"

It still threw the same exception: output (11, 12) > (10)

Note: the question is wrapped in the fine-tuned prompt format mentioned in the paper, with the ## prefix.

natke (Contributor) commented Jul 1, 2024

To clarify: max_length includes the prompt length plus the answer. Try setting it to 200 and running your prompts again.
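As a back-of-the-envelope illustration in plain C++ (the token counts and helper names below are made up; in the real API the prompt length is the length of the sequence produced by tokenizer->Encode()):

```cpp
#include <cstddef>
#include <stdexcept>

// Sketch only: max_length covers the prompt tokens PLUS the generated
// tokens, so it should be budgeted from the encoded prompt length.
std::size_t budget_max_length(std::size_t prompt_tokens,
                              std::size_t answer_budget) {
  return prompt_tokens + answer_budget;
}

// Mirrors the runtime check behind the crash reported in this issue:
// generation requires the prompt's sequence_length < max_length.
void check_budget(std::size_t prompt_tokens, std::size_t max_length) {
  if (prompt_tokens >= max_length)
    throw std::runtime_error("input sequence_length is >= max_length");
}
```

So a 12-token prompt with max_length 10 fails before a single token is generated, which matches the "(12) is >= (10)" message in the original report.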

natke self-assigned this Jul 1, 2024
hubertwang (Author) commented

> To clarify: the max_length includes the prompt length + the answer. Try setting it to 200 and run your prompts again

Hi @natke,

Thanks for your reply. We'll keep that in mind and adjust the parameter.

Is it possible to catch this exception? It looks like the app just crashes for now, with no chance to catch the exception, and it's hard to estimate how much output a prompt may produce.

BTW, we also observed excessive memory usage when the prompt is longer: a longer prompt seems to consume more memory. I need an iPhone 15 Pro Max, currently the iPhone with the most memory, to run certain prompts.

Is this expected behavior? Is it possible to control memory usage through a search option?

Thank you.

natke (Contributor) commented Jul 1, 2024

Can you please add details of the exception you are seeing?

hubertwang (Author) commented Jul 2, 2024

Hi @natke, yes, I have added my sample code and a screenshot taken when the exception is thrown.

I used try-catch and @try-@catch, but neither caught the exception.
I can, however, set a breakpoint that stops when it is thrown.
Weird...

```objc
- (nullable NSString *)generate:(nonnull NSString*)input_user_question maxLength:(nonnull NSNumber*)max_length
{
  __weak __typeof__(self) weakSelf = self;
  NSMutableString *result = [NSMutableString string];

  @try {
    NSString* llmPath = [[NSBundle mainBundle] resourcePath];
    const char* modelPath = [llmPath cStringUsingEncoding:NSUTF8StringEncoding];

    auto model = OgaModel::Create(modelPath);
    auto tokenizer = OgaTokenizer::Create(*model);

    NSString* promptString = [NSString stringWithFormat:@"<|user|>\n%@<|end|>\n<|assistant|>", input_user_question];
    const char* prompt = [promptString UTF8String];

    auto sequences = OgaSequences::Create();
    tokenizer->Encode(prompt, *sequences);

    auto params = OgaGeneratorParams::Create(*model);
    params->SetSearchOption("max_length", max_length.intValue);
    params->SetInputSequences(*sequences);

    // Streaming output: generate token by token
    auto tokenizer_stream = OgaTokenizerStream::Create(*tokenizer);

    auto generator = OgaGenerator::Create(*model, *params);

    while (!generator->IsDone()) {
      generator->ComputeLogits();
      generator->GenerateNextToken();

      const int32_t* seq = generator->GetSequenceData(0);
      size_t seq_len = generator->GetSequenceCount(0);
      const char* decode_tokens = tokenizer_stream->Decode(seq[seq_len - 1]);
      //NSLog(@"Decoded tokens: %s", decode_tokens);

      // Add decoded token to SharedTokenUpdater
      NSString* decodedTokenString = [NSString stringWithUTF8String:decode_tokens];
      if (hasListeners) { // Only send events if anyone is listening
        [weakSelf sendEventWithName:RCTOnnxEventGenTextTokenUpdate body:decodedTokenString];
      }
      //NSLog(@"[Phi-3] %@", decodedTokenString);
      [result appendString:decodedTokenString];
    }
  } @catch (id exception) {
    NSLog(@"Exception: %@", exception);
  }
  //NSLog(@"[Phi-3] Result: %@", result);
  return result;
}
```

Exception:

libc++abi: terminating due to uncaught exception of type std::runtime_error: input sequence_length (11) is >= max_length (10)
(screenshot of the crash attached)

hubertwang (Author) commented

Hi @natke,

I managed to catch the C++ exception.
What I did was add another layer of C++-style try-catch inside the Objective-C-style @try-@catch.
Not sure why the @try alone didn't catch it; thanks for your responses!
I'll close this issue.

```cpp
try {
  // The sample code
} catch (const std::exception &e) {
  NSLog(@"Caught C++ exception: %s", e.what());
}
```
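For anyone hitting the same crash, a minimal standalone C++ sketch of why the nested catch works: an Objective-C @catch (id) clause only matches Objective-C exception objects, while the generator throws a std::runtime_error, which needs a C++ catch clause. The helper names below are made up for illustration:

```cpp
#include <stdexcept>
#include <string>

// Stand-in for the generate loop (illustrative names): the real code
// throws std::runtime_error when sequence_length >= max_length.
void generate_or_throw(std::size_t seq_len, std::size_t max_length) {
  if (seq_len >= max_length)
    throw std::runtime_error(
        "input sequence_length (" + std::to_string(seq_len) +
        ") is >= max_length (" + std::to_string(max_length) + ")");
}

// The C++-style try/catch layer; in the app this sits inside the
// Objective-C @try/@catch so C++ exceptions are handled first.
std::string run_safely(std::size_t seq_len, std::size_t max_length) {
  try {
    generate_or_throw(seq_len, max_length);
    return "ok";
  } catch (const std::exception &e) {
    return std::string("caught: ") + e.what();
  }
}
```

With this layering, a prompt that overruns the budget returns an error string instead of terminating the process.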
