bhushank/kge-rl

This code implements state-of-the-art knowledge graph embedding algorithms such as RESCAL, TransE, DistMult, and ComplEx, together with various negative sampling techniques for learning the embeddings. Computational graphs are implemented using PyTorch.

Installation

Data

  • Create a data directory and download the FB15K dataset from here.
  • Rename freebase_mtr100_mte100-train.txt, freebase_mtr100_mte100-valid.txt, and freebase_mtr100_mte100-test.txt to train, dev, and test, respectively (see the sketch after this list). These files should reside inside the data directory.
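A minimal Python sketch of this renaming step, assuming the three FB15K files have already been downloaded into ./data/ (adjust data_dir if your layout differs):

    # Rename the downloaded FB15K splits to the names expected by the code.
    # Assumes the files already sit in ./data/.
    from pathlib import Path

    data_dir = Path("./data")
    renames = {
        "freebase_mtr100_mte100-train.txt": "train",
        "freebase_mtr100_mte100-valid.txt": "dev",
        "freebase_mtr100_mte100-test.txt": "test",
    }
    for src, dst in renames.items():
        (data_dir / src).rename(data_dir / dst)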

Configuration

  • Create the directory ./data/experiment_specs/
  • Experiment configurations are specified as JSON files inside the experiment_specs directory.
  • An example configuration file (complex.json) can be found in the repository.

Specifications

  • "batch_size": Train Batch Size,
  • "entity_dim": Embedding Dimension (must be specified),
  • "exp_name": Experiment Name,
  • "is_dev": True if you want to test on validation data (must be specified),
  • "is_typed": True if you want to use Type Regularizer (default False),
  • "l2": Strength of L2 regularizer (must be specified),
  • “lr”: Learning rate for ADAM SGD optimizer
  • "model": Model (rescal, transE, distmult, complex) (must be specified),
  • "num_epochs": Max number of epochs (default 100),
  • “neg_sampler”: Negative Sampling (random, corrupt, relational, typed, nn, adversarial). Must be specified
  • “num_negs”: Number of negative samples (must be specified).
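Putting these keys together, a hypothetical spec for ComplEx could look like the following. This is a sketch only: the key names come from the list above, but the values are illustrative assumptions, not tuned settings. It is written here as a small Python script that creates ./data/experiment_specs/complex.json:

    # Write an example experiment spec; all values are illustrative assumptions.
    import json
    import os

    spec = {
        "exp_name": "complex",
        "model": "complex",        # rescal, transE, distmult, or complex
        "entity_dim": 100,
        "batch_size": 512,
        "lr": 0.001,
        "l2": 0.001,
        "num_epochs": 100,
        "neg_sampler": "random",   # random, corrupt, relational, typed, nn, adversarial
        "num_negs": 10,
        "is_dev": True,            # evaluate on validation data
        "is_typed": False,         # type regularizer off
    }

    os.makedirs("./data/experiment_specs", exist_ok=True)
    with open("./data/experiment_specs/complex.json", "w") as f:
        json.dump(spec, f, indent=2)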

Additional specifications (early stopping, save and report times, etc.) can be changed by modifying constants.py, which also contains the default values.

Usage

To train and test, simply run
python experiment_runner.py "data_dir" "experiment_name"
where experiment_name is the name of the JSON file without the .json extension. For example:
python experiment_runner.py ./data/ freebase_bilinear

  • Additional options: -r to resume training, -v to save embeddings, and -t for fine-tuning.
  • Note that nearest-neighbor (nn) and adversarial sampling require a trained RESCAL model to serve as the negative sampler. To use nn or adversarial, change the options in the JSON file and add -t to the command line. For example, to fine-tune a pretrained ComplEx model, execute the following command. Note that the complex directory must contain the pretrained parameters in PyTorch format.

python experiment_runner.py ./data/ complex -t
