@cisnlp

Deep NLP @ CIS - LMU

Deep Natural Language Processing Group at Center for Language and Information Processing, University of Munich (LMU)

Popular repositories

  1. simalign Public

    Obtain Word Alignments using Pretrained Language Models (e.g., mBERT)

    Python, 342 stars, 46 forks

  2. Glot500 Public

    Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages (ACL 2023)

    Python, 96 stars, 3 forks

  3. GlotLID Public

    GlotLID: Language Identification with Support for More Than 2000 Labels (EMNLP 2023).

    Python, 69 stars, 6 forks

  4. semi-markov-crf Public

    Code for paper "Neural Semi-Markov Conditional Random Fields for Robust Character-Based Part-of-Speech Tagging"

    Python, 17 stars, 4 forks

  5. parcoure Public

    ParCourE - Parallel Corpus Explorer

    Python, 12 stars

  6. GlotScript Public

    GlotScript: A Resource and Tool for Low Resource Writing System Identification (LREC 2024).

    Python, 12 stars, 1 fork

Repositories

Showing 10 of 22 repositories
  • TransliCo Public

    TransliCo: A Contrastive Learning Framework to Address the Script Barrier in Multilingual Pretrained Language Models

    Python, 4 stars, 0 forks, 0 open issues, 0 open pull requests. Updated May 23, 2024
  • JavaScript, 1 star, 0 forks, 0 open issues, 0 open pull requests. Updated May 23, 2024
  • TransMI Public

    TransMI: A Framework to Create Strong Baselines from Multilingual Pretrained Language Models for Transliterated Data

    Python, 3 stars, 0 forks, 0 open issues, 0 open pull requests. Updated May 17, 2024
  • JavaScript, 0 stars, 0 forks, 0 open issues, 0 open pull requests. Updated May 15, 2024
  • GlotLID Public

    GlotLID: Language Identification with Support for More Than 2000 Labels (EMNLP 2023).

    Python, 69 stars, Apache-2.0 license, 6 forks, 1 open issue, 0 open pull requests. Updated May 12, 2024
  • GlotWeb Public

    GlotWeb: Web Indexing for Low-Resource Languages -- under construction.

    Python, 5 stars, CC0-1.0 license, 0 forks, 0 open issues, 0 open pull requests. Updated May 10, 2024
  • XAMPLER Public

    XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples

    Python, 3 stars, 0 forks, 0 open issues, 0 open pull requests. Updated May 9, 2024
  • Glot500 Public

    Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages (ACL 2023)

    Python, 96 stars, 3 forks, 2 open issues, 0 open pull requests. Updated Apr 20, 2024
  • GlotSparse Public

    GlotSparse: Building Corpora in Under-Resourced Languages

    0 stars, CC0-1.0 license, 0 forks, 0 open issues, 0 open pull requests. Updated Apr 18, 2024
  • GlotScript Public

    GlotScript: A Resource and Tool for Low Resource Writing System Identification (LREC 2024).

    Python, 12 stars, MIT license, 1 fork, 0 open issues, 0 open pull requests. Updated Apr 18, 2024
