
Uncaught exception input sequence_length is >= max_length #646

Closed
hubertwang opened this issue Jun 26, 2024 · 7 comments


Hi everyone,

I recently tried the Phi-3 example (onnxruntime-inference-example/mobile/examples/phi-3) on iPhone.
Sometimes the output of Phi-3 is longer than my max_length, and my app crashes because I am not able to catch the exception.

libc++abi: terminating due to uncaught exception of type std::runtime_error: input sequence_length (12) is >= max_length (10)

I tried both Objective-C and C++ style try-catch, but neither caught this exception.
Has anyone had the same issue?

Thanks!

natke (Contributor) commented Jun 27, 2024

Hi @hubertwang, can you please share your prompt and your max_length value?

hubertwang (Author) commented Jun 27, 2024

> Hi @hubertwang, can you please share your prompt and your max_length value?

Hi @natke,

Yes, I tried two relatively extreme conditions.

First, I input a privacy policy extract from the App Store and asked the prompt to analyze it. With max_length set to 200 (the default from the example), the output is around 800~900 tokens and the exception is thrown.

After I hit this exception, I tried another prompt that I expected to produce a short answer:

"How are you?" with max_length 10

I expected something like "I am good" or "good", but it still threw the exception: output (11, 12) > (10)

Then I tried to further constrain the answer:

"How are you, answer good or no good"

It still threw the same exception: output (11, 12) > (10)

Note: the question is wrapped in the fine-tuned prompt format mentioned in the paper, with the ## prefix.

natke (Contributor) commented Jul 1, 2024

To clarify: max_length includes the prompt length plus the answer. Try setting it to 200 and running your prompts again.
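As a back-of-the-envelope illustration in plain C++ (the token counts and helper names below are made up; in the real API the prompt length is the length of the sequence produced by tokenizer->Encode()):

```cpp
#include <cstddef>
#include <stdexcept>

// Sketch only: max_length covers the prompt tokens PLUS the generated
// tokens, so it should be budgeted from the encoded prompt length.
std::size_t budget_max_length(std::size_t prompt_tokens,
                              std::size_t answer_budget) {
  return prompt_tokens + answer_budget;
}

// Mirrors the runtime check behind the crash reported in this issue:
// generation requires the prompt's sequence_length < max_length.
void check_budget(std::size_t prompt_tokens, std::size_t max_length) {
  if (prompt_tokens >= max_length)
    throw std::runtime_error("input sequence_length is >= max_length");
}
```

So a 12-token prompt with max_length 10 fails before a single token is generated, which matches the "(12) is >= (10)" message in the original report.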

natke self-assigned this Jul 1, 2024
hubertwang (Author) commented

> To clarify: the max_length includes the prompt length + the answer. Try setting it to 200 and run your prompts again

Hi @natke,

Thanks for your reply. We'll keep that in mind and adjust the parameter.

Is it possible to catch this exception? It looks like the app just crashes for now, with no chance to catch the exception, and it's hard to estimate how much output a prompt may produce.

BTW, we also observed excessive memory usage when the prompt is longer: a longer prompt seems to consume more memory. I need an iPhone 15 Pro Max, currently the iPhone with the most memory, to run certain prompts.

Is this expected behavior? Is it possible to control memory usage through a search option?

Thank you.

natke (Contributor) commented Jul 1, 2024

Can you please add details of the exception you are seeing?

hubertwang (Author) commented Jul 2, 2024

Hi @natke, yes, I have added my sample code and a screenshot taken when the exception is thrown.

I used try-catch and @try-@catch, but neither caught the exception.
I can, however, set a breakpoint that stops when it is thrown.
Weird...

```objc
- (nullable NSString *)generate:(nonnull NSString*)input_user_question maxLength:(nonnull NSNumber*)max_length
{
  __weak __typeof__(self) weakSelf = self;
  NSMutableString *result = [NSMutableString string];

  @try {
    NSString* llmPath = [[NSBundle mainBundle] resourcePath];
    const char* modelPath = [llmPath cStringUsingEncoding:NSUTF8StringEncoding];

    auto model = OgaModel::Create(modelPath);
    auto tokenizer = OgaTokenizer::Create(*model);

    NSString* promptString = [NSString stringWithFormat:@"<|user|>\n%@<|end|>\n<|assistant|>", input_user_question];
    const char* prompt = [promptString UTF8String];

    auto sequences = OgaSequences::Create();
    tokenizer->Encode(prompt, *sequences);

    auto params = OgaGeneratorParams::Create(*model);
    params->SetSearchOption("max_length", max_length.intValue);
    params->SetInputSequences(*sequences);

    // Streaming output: generate token by token
    auto tokenizer_stream = OgaTokenizerStream::Create(*tokenizer);

    auto generator = OgaGenerator::Create(*model, *params);

    while (!generator->IsDone()) {
      generator->ComputeLogits();
      generator->GenerateNextToken();

      const int32_t* seq = generator->GetSequenceData(0);
      size_t seq_len = generator->GetSequenceCount(0);
      const char* decode_tokens = tokenizer_stream->Decode(seq[seq_len - 1]);
      //NSLog(@"Decoded tokens: %s", decode_tokens);

      // Add decoded token to SharedTokenUpdater
      NSString* decodedTokenString = [NSString stringWithUTF8String:decode_tokens];
      if (hasListeners) { // Only send events if anyone is listening
        [weakSelf sendEventWithName:RCTOnnxEventGenTextTokenUpdate body:decodedTokenString];
      }
      //NSLog(@"[Phi-3] %@", decodedTokenString);
      [result appendString:decodedTokenString];
    }
  } @catch (id exception) {
    NSLog(@"Exception: %@", exception);
  }
  //NSLog(@"[Phi-3] Result: %@", result);
  return result;
}
```

Exception:

libc++abi: terminating due to uncaught exception of type std::runtime_error: input sequence_length (11) is >= max_length (10)
(screenshot of the crash attached)

hubertwang (Author) commented

Hi @natke,

I managed to catch the C++ exception.
What I did was add another layer of C++-style try-catch inside the Objective-C-style @try-@catch.
Not sure why the @try alone didn't catch it; thanks for your responses!
I'll close this issue.

```cpp
try {
  // The sample code
} catch (const std::exception &e) {
  NSLog(@"Caught C++ exception: %s", e.what());
}
```
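For anyone hitting the same crash, a minimal standalone C++ sketch of why the nested catch works: an Objective-C @catch (id) clause only matches Objective-C exception objects, while the generator throws a std::runtime_error, which needs a C++ catch clause. The helper names below are made up for illustration:

```cpp
#include <stdexcept>
#include <string>

// Stand-in for the generate loop (illustrative names): the real code
// throws std::runtime_error when sequence_length >= max_length.
void generate_or_throw(std::size_t seq_len, std::size_t max_length) {
  if (seq_len >= max_length)
    throw std::runtime_error(
        "input sequence_length (" + std::to_string(seq_len) +
        ") is >= max_length (" + std::to_string(max_length) + ")");
}

// The C++-style try/catch layer; in the app this sits inside the
// Objective-C @try/@catch so C++ exceptions are handled first.
std::string run_safely(std::size_t seq_len, std::size_t max_length) {
  try {
    generate_or_throw(seq_len, max_length);
    return "ok";
  } catch (const std::exception &e) {
    return std::string("caught: ") + e.what();
  }
}
```

With this layering, a prompt that overruns the budget returns an error string instead of terminating the process.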
