Software to construct and visualize Word Sense Disambiguation (WSD) models based on JoBimText models. This project implements the method described in the following paper; please cite it if you use this software in a research project:
- Panchenko A., Marten F., Ruppert E., Faralli S., Ustalov D., Ponzetto S.P., Biemann C. Unsupervised, Knowledge-Free, and Interpretable Word Sense Disambiguation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2017). Copenhagen, Denmark. Association for Computational Linguistics.
@inproceedings{Panchenko:17:emnlp,
author = {Panchenko, Alexander and Marten, Fide and Ruppert, Eugen and Faralli, Stefano and Ustalov, Dmitry and Ponzetto, Simone Paolo and Biemann, Chris},
title = {{Unsupervised, Knowledge-Free, and Interpretable Word Sense Disambiguation}},
booktitle = {Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2017)},
year = {2017},
address = {Copenhagen, Denmark},
publisher = {Association for Computational Linguistics},
language = {english}
}
- Java 1.8
- Docker Engine (1.13.0+), see Docker installation guide
- Docker Compose (1.10.0+), see Compose installation guide
- Apache Spark 2.0+ (only required to build your own model)
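To verify the prerequisites are in place, you can check the installed versions (the exact output format varies between releases):
java -version            # should report 1.8.x
docker --version         # should report 1.13.0 or newer
docker-compose --version # should report 1.10.0 or newer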
We provide a ready-to-use database and a dump of pictures for all senses in the database. Note that downloading and unpacking these artifacts requires 300 GB of free disk space! To download and prepare the project with them, use the following command:
./wsd model:download
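Tip: you can check the free space on the filesystem of your working directory beforehand:
df -h .   # the "Avail" column should show at least 300 GB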
Note: for instructions on how to rebuild the database with the model, see Build your own DB below.
To start the application:
./wsd web-app:start
The web application runs with Docker Compose. To customize your installation, adjust docker-compose.override.yml; see the official Docker Compose documentation for general information on this file. To get further information on the running containers, you can use any of the Docker Compose commands, such as docker-compose ps and docker-compose logs.
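As an illustration, the following sketch writes an override file that remaps the published port of the web interface. The service name web, the ports, and the Compose file version are assumptions made for this example; check docker-compose.yml for the actual values before using it.
cat > docker-compose.override.yml <<'EOF'
version: '3'        # assumption: match the version used in docker-compose.yml
services:
  web:              # assumption: check docker-compose.yml for real service names
    ports:
      - "8080:80"   # publish the container's port 80 on host port 8080
EOF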
First set the $SPARK_HOME environment variable or make spark-submit available on your PATH. By modifying the script scripts/spark_submit_jar.sh you can adjust the amount of memory used by Spark (consider changing --conf 'spark.driver.memory=4g' and --conf 'spark.executor.memory=1g').
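For example, assuming Spark was unpacked to /opt/spark (an assumption; adjust the path to your actual installation directory):
export SPARK_HOME=/opt/spark
export PATH="$SPARK_HOME/bin:$PATH"   # makes spark-submit available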
We recommend first building a toy model from the toy training data set, which completes within a few minutes.
./wsd model:build-toy
This model provides senses only for the word "Python", but it is fully functional.
Building the full model takes nearly 11 hours on an eight-core machine with 30 GB of memory and needs around 300 GB of free disk space. It will also download 4 GB of training data.
./wsd model:build-full
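Since the full build runs for many hours, you may want to start it detached from your terminal, for example:
nohup ./wsd model:build-full > build-full.log 2>&1 &
tail -f build-full.log   # follow the build progress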
For an overview of all available commands, run:
./wsd --help