alisonreboud/screenplay_summarization

Zero-Shot classification for summarization of screenplays

This repository contains the code for reproducing the results on the CSI corpus reported in the paper "Stories of love and violence: Zero-Shot interesting events classification for unsupervised TV series summarization".

Please cite the following if you use this code.

@article{reboud2021stories,
  title={Stories of love and violence: Zero-Shot interesting events classification for unsupervised TV series summarization},
  author={Reboud, Alison and Harrando, Ismail and Lisena, Pasquale and Troncy, Rapha{\"e}l},
  journal={Multimedia Systems},
  year={2022},
  publisher={Springer}
}

A Method based on Zero-shot classification

In this paper, we propose an unsupervised approach to generating TV series summaries from screenplays, which are composed of dialogue and scenic textual descriptions. In recent years, large language models have made Zero-Shot text classification effective under some conditions. We use Entail (BART MNLI) to explore whether and how such models can be used for TV series summarization, by conducting experiments with varying text inputs.
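The classification step described above can be sketched as follows. This is a minimal illustration, not the notebook's code: it assumes the Hugging Face `transformers` zero-shot pipeline and the `facebook/bart-large-mnli` checkpoint as the BART MNLI model, and the candidate labels, the `score_scene` helper, and the max-based `interestingness` aggregation are hypothetical names and choices made for this example.

```python
from typing import Callable, Dict, List

# Illustrative candidate labels only; the actual label set is defined in the notebook.
CANDIDATE_LABELS = ["murder", "love", "violence"]


def load_bart_mnli():
    """Build a zero-shot pipeline (assumed checkpoint: facebook/bart-large-mnli)."""
    # Imported lazily because transformers is a heavy, optional dependency.
    from transformers import pipeline
    return pipeline("zero-shot-classification", model="facebook/bart-large-mnli")


def score_scene(scene_text: str, labels: List[str], classifier: Callable) -> Dict[str, float]:
    """Return one entailment-based score per candidate label for a scene."""
    # multi_label=True scores each label independently rather than
    # normalising the scores across all labels.
    result = classifier(scene_text, candidate_labels=labels, multi_label=True)
    return dict(zip(result["labels"], result["scores"]))


def interestingness(scores: Dict[str, float]) -> float:
    """Hypothetical aggregation: take the score of the best-matching label."""
    return max(scores.values())
```

To try it on a scene, one could call `score_scene(scene_text, CANDIDATE_LABELS, load_bart_mnli())`; the model is downloaded on first use. Passing the classifier as an argument keeps the scoring logic easy to test with a stub.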

The model

bart

An example of a scene classified with ZeSTE and Entail (BART)

example

Follow the notebook Zero_Shot_Pipeline.ipynb to reproduce the experiments that use named events for screenplay summarization with BART MNLI. The notebook covers the preprocessing of the dataset text and the classification with Entail. It also contains the code for the experiment that combines zero-shot classification scores with centrality scores from SUMMER. The CSI corpus can be found here and should be placed under /coref/csi-corpus.

The SUMMER centrality scores can be found in centralities.npy.
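One possible way to combine per-scene zero-shot scores with centrality scores is sketched below. The combination rule is an assumption made for illustration (min-max normalisation followed by a weighted sum, then top-k selection); the notebook may combine the two signals differently. The function names and the `weight` parameter are hypothetical.

```python
from typing import List, Sequence


def min_max(values: Sequence[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combine_scores(zero_shot: Sequence[float],
                   centrality: Sequence[float],
                   weight: float = 0.5) -> List[float]:
    """Weighted sum of normalised scores (assumed rule, not the paper's)."""
    if len(zero_shot) != len(centrality):
        raise ValueError("expected one score of each kind per scene")
    zs, ce = min_max(zero_shot), min_max(centrality)
    return [weight * z + (1.0 - weight) * c for z, c in zip(zs, ce)]


def select_scenes(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k best-scoring scenes, in episode order."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)
```

In practice the centrality values would be read from centralities.npy (for example with `numpy.load`) and converted to one list of scores per episode before being passed to `combine_scores`; how the array is laid out is not specified here, so that step is left out.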

Results

The results obtained with Entail (BART MNLI) reveal the potential of zero-shot classification for unsupervised summarization of crime series and soap operas, using events representative of the genre as candidate labels. When provided with a screenplay, the Entail model performs best when given only the visual information. We think this approach helps improve interpretability: contrary to modelling interestingness without proxies, it makes it possible to justify the choice of summary scenes by their closeness to non-subjective labels. Another major strength of this approach is its flexibility.

Model architecture
