
tteofili/ellmer


ELLMER

Code for ELLMER: Explain Large Language Models for Entity Resolution (ER).

Installation

To install ELLMER locally, run the following from the repository root:

pip install .

Usage

To replicate the experiments, first download the DeepMatcher datasets to a directory on your local disk, then run the Python evaluation script (scripts/eval.py).

Select the LLM with the --model_type option:

  • OpenAI models deployed on Azure with --model_type azure_openai
  • a local Llama2-13B model with --model_type llama2
  • a local Falcon model with --model_type falcon

Use --samples to set how many samples the evaluation covers, and --granularity to set the explanation granularity (accepted values: token and attribute).

Use --datasets to select one or more datasets for the evaluation. Each dataset is identified by the name of its directory inside base_dir.

python scripts/eval.py --base_dir path/to/deepmatcher_datasets --model_type azure_openai --datasets beers --samples 5 --granularity token

Other optional parameters are defined in the script itself (scripts/eval.py).
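
The invocation above can also be assembled programmatically, which helps catch a misspelled dataset name or option value before a long run starts. The following is a minimal sketch, not part of ELLMER: the helper name build_eval_commands is hypothetical, and it only uses the flags and values documented in this README. Because the README does not specify how several names are passed to --datasets, the sketch assumes one dataset per invocation.

```python
from pathlib import Path

# Values documented in this README for --model_type and --granularity.
MODEL_TYPES = {"azure_openai", "llama2", "falcon"}
GRANULARITIES = {"token", "attribute"}


def build_eval_commands(base_dir, datasets, model_type="azure_openai",
                        samples=5, granularity="token"):
    """Return one scripts/eval.py argument list per dataset (hypothetical helper).

    Assumption: one dataset per invocation, so no multi-dataset syntax
    for --datasets has to be guessed.
    """
    if model_type not in MODEL_TYPES:
        raise ValueError(f"unknown model_type: {model_type!r}")
    if granularity not in GRANULARITIES:
        raise ValueError(f"unknown granularity: {granularity!r}")
    if samples < 1:
        raise ValueError("samples must be a positive integer")

    base = Path(base_dir)
    commands = []
    for name in datasets:
        # Each dataset must be a directory directly inside base_dir.
        if not (base / name).is_dir():
            raise FileNotFoundError(f"dataset directory not found: {base / name}")
        commands.append([
            "python", "scripts/eval.py",
            "--base_dir", str(base),
            "--model_type", model_type,
            "--datasets", name,
            "--samples", str(samples),
            "--granularity", granularity,
        ])
    return commands
```

Each returned list can be passed to subprocess.run(cmd, check=True) from the repository root. Validating the directory names up front is a design choice of this sketch; the evaluation script may perform its own checks.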

Notebooks

Citing ELLMER

If you extend or use this work, please cite:

@article{teofili2023ellmer,
  title={ELLMER: Explain Large Language Models for Entity Resolution},
  author={Teofili, Tommaso and Firmani, Donatella and Koudas, Nick and Merialdo, Paolo and Srivastava, Divesh},
  year={2023}
}