Materials for a workshop on image search with a focus on heritage data. The workshop is based on the blog post *Image search with 🤗 datasets* but goes into more detail.
- The slides introduce 🤗 `datasets`, `sentence-transformers`, and CLIP, give a broader conceptual overview of image search and embeddings, and conclude with a discussion of ethical considerations around deployment.
- Notebook 1 gives a rapid overview of how `sentence-transformers` can be used to 'encode' text and images for tasks like image search.
- Notebook 2 allows for exploration of the outputs of a CLIP model. This is intended to help people begin interrogating the strengths, weaknesses and issues of using CLIP with heritage material.
- Notebook 3 is the original notebook that accompanied the blog post. It gives an overview of the steps involved from start to finish.
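At its core, the image-search approach covered in the notebooks reduces to comparing embeddings by cosine similarity: a model such as CLIP (via `sentence-transformers`' `encode` method) maps both images and text queries into the same vector space, and the closest image vectors to a query vector are the search results. A minimal sketch of that retrieval step, using made-up vectors in place of real model outputs:

```python
import numpy as np

# Toy embeddings standing in for CLIP outputs. In the workshop these would
# come from a sentence-transformers model, e.g. model.encode(images);
# here they are small made-up vectors purely to illustrate the ranking step.
image_embeddings = np.array([
    [0.9, 0.1, 0.0],  # image 0
    [0.1, 0.9, 0.0],  # image 1
    [0.0, 0.1, 0.9],  # image 2
])
query_embedding = np.array([0.8, 0.2, 0.0])  # stand-in for an encoded text query


def normalise(x):
    """L2-normalise vectors so the dot product equals cosine similarity."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


# Cosine similarity between the query and every image embedding
scores = normalise(image_embeddings) @ normalise(query_embedding)

# Rank images by similarity; the top hit is the best match for the query
best = int(np.argmax(scores))
print(best)  # → 0 (image 0 points in nearly the same direction as the query)
```

The same pattern scales to thousands of images: encode the whole collection once, then each text query only needs one `encode` call and one matrix multiplication.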
This work is licensed under a Creative Commons Attribution 4.0 International License.
This work was supported by Living with Machines. This project, funded by the UK Research and Innovation (UKRI) Strategic Priority Fund, is a multidisciplinary collaboration delivered by the Arts and Humanities Research Council (AHRC), with The Alan Turing Institute, the British Library and the Universities of Cambridge, East Anglia, Exeter, and Queen Mary University of London.