
Heavy hallucination and non-terminating output when fine-tuning Phi 3 Vision to convert charts to JSON

#47
by ar9av - opened

Phi 3 Vision works quite well straight out of the box, but after fine-tuning it starts hallucinating numerical values [2020,2021....], extrapolating sequences such as dates and numbers, and it keeps on generating instead of stopping.
I checked, and I am using the eos_tag correctly, so that is not the issue. Where else am I going wrong?

I have noticed this behaviour with other vision-based Transformers as well. Any idea how to fix it?

Microsoft org

Thank you for your interest in the Phi-3 Vision model.
You could try the Phi-3 CookBook fine-tuning recipe here: https://github.com/microsoft/Phi-3CookBook/blob/main/md/04.Fine-tuning/FineTuning_Vision.md
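
Independently of that recipe, one thing that may be worth ruling out when a fine-tuned model never stops generating is whether the EOS token is actually supervised in the training labels. A common pitfall is masking every padding position in the labels when the pad token id equals the EOS token id, which also masks the real EOS, so the model never learns to stop. The snippet below is only a minimal sketch of that check in plain Python; the token ids and the helper names (`build_example`, `mask_by_pad_id`, `mask_by_length`, `eos_is_supervised`) are hypothetical and not taken from the CookBook or from the Phi-3 Vision code.

```python
# Label value that PyTorch's cross-entropy loss skips by default (ignore_index=-100).
IGNORE_INDEX = -100


def build_example(prompt_ids, answer_ids, eos_id):
    """Concatenate prompt + answer + EOS; supervise only the answer and the EOS."""
    input_ids = list(prompt_ids) + list(answer_ids) + [eos_id]
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(answer_ids) + [eos_id]
    return input_ids, labels


def mask_by_pad_id(input_ids, labels, pad_id):
    """Risky masking: hides every position whose input token equals pad_id.

    If pad_id == eos_id, this also hides the real EOS label.
    """
    return [IGNORE_INDEX if tok == pad_id else lab
            for tok, lab in zip(input_ids, labels)]


def mask_by_length(labels, real_length):
    """Safer masking: hides only positions beyond the unpadded length."""
    return [lab if i < real_length else IGNORE_INDEX
            for i, lab in enumerate(labels)]


def eos_is_supervised(labels, eos_id):
    """True if the last label that contributes to the loss is the EOS token."""
    kept = [lab for lab in labels if lab != IGNORE_INDEX]
    return bool(kept) and kept[-1] == eos_id


# Hypothetical token ids, for illustration only.
EOS_ID = 2
PAD_ID = 2  # pad token reused as EOS, a frequent configuration
prompt, answer = [11, 12, 13], [21, 22]

input_ids, labels = build_example(prompt, answer, EOS_ID)
real_length = len(input_ids)

# Right-pad to a fixed length of 8.
padded_inputs = input_ids + [PAD_ID] * (8 - real_length)
padded_labels = labels + [PAD_ID] * (8 - real_length)

risky = mask_by_pad_id(padded_inputs, padded_labels, PAD_ID)
safe = mask_by_length(padded_labels, real_length)

print("EOS supervised with pad-id masking:", eos_is_supervised(risky, EOS_ID))
print("EOS supervised with length masking:", eos_is_supervised(safe, EOS_ID))
```

If the first check comes back false on a real batch from your own collator, the EOS label is being dropped before the loss is computed. Capping generation length at inference time would hide the symptom but not fix the training data.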
