
Image search with 🤗 datasets


Materials for a workshop on image search, with a focus on heritage data. The workshop is based on the blog post Image search with 🤗 datasets but goes into more detail.

Contents

  • The Slides introduce 🤗 datasets, sentence-transformers, and CLIP, give a broader conceptual overview of image search and embeddings, and conclude with a discussion of ethical considerations around deployment.
  • Notebook 1 (Open in Colab) gives a rapid overview of how sentence-transformers can be used to 'encode' text and images for tasks like image search.
  • Notebook 2 (Open in Colab) allows you to explore the outputs of a CLIP model. It is intended to help people begin interrogating the strengths, weaknesses, and issues of using CLIP with heritage material.
  • Notebook 3 (Open in Colab) is the original notebook that accompanied the blog post. It gives an overview of the steps involved from start to finish.
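At the core of the notebooks is a simple mechanic: a CLIP model embeds both images and a text query into a shared vector space, and images are ranked by cosine similarity to the query. Below is a minimal sketch of that ranking step; random vectors stand in for real CLIP embeddings (in the notebooks these would come from a sentence-transformers model such as clip-ViT-B-32), so the snippet runs without downloading a model:

```python
import numpy as np

# Stand-ins for CLIP embeddings: in the workshop these are produced by a
# sentence-transformers CLIP model encoding images and a text query.
rng = np.random.default_rng(0)
image_embeddings = rng.normal(size=(100, 512))  # one 512-d vector per image
query_embedding = rng.normal(size=512)          # the encoded text query

# Normalise to unit length so a dot product equals cosine similarity.
image_embeddings /= np.linalg.norm(image_embeddings, axis=1, keepdims=True)
query_embedding /= np.linalg.norm(query_embedding)

# Score every image against the query and keep the 5 best matches.
scores = image_embeddings @ query_embedding
top_5 = np.argsort(-scores)[:5]
print(top_5, scores[top_5])
```

In practice the image embeddings are computed once and stored (for example as a column in a 🤗 dataset), so each new text query only costs one encode plus this cheap similarity search.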

This work is licensed under a Creative Commons Attribution 4.0 International License.


Acknowledgment

This work was supported by Living with Machines. This project, funded by the UK Research and Innovation (UKRI) Strategic Priority Fund, is a multidisciplinary collaboration delivered by the Arts and Humanities Research Council (AHRC), with The Alan Turing Institute, the British Library and the Universities of Cambridge, East Anglia, Exeter, and Queen Mary University of London.