YieldLang/yieldlang

License: Apache License, Version 2.0

YieldLang is a meta-language for generating structured text (ST). It can produce corpora for large language models (LLMs) or guide LLMs as they generate ST. It is currently provided as a Python package.

  • 🧠 Based on a coroutine generator and sampler architecture
  • 🤖 Streams characters out and parses the preceding context into a syntax tree
  • 🦾 Build formal grammars with classes, methods, and combinators

This project is a work in progress.

Simple Usage

pip install yieldlang

Import the TextGenerator class and define a generator. The top method always serves as the entry point for the generator. You can treat the generator as an iterator and use a for loop to iterate over the generated text. For example:

from yieldlang import TextGenerator

class G(TextGenerator):
    def top(self):
        yield "Hello, World!"

for text in G():
    print(text)
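
To picture how iterating over a generator can start from `top`, here is a standard-library-only sketch. It is not YieldLang's implementation: `MiniGenerator` and `_flatten` are hypothetical names, and the flattening rules are an assumption made only to illustrate the idea of an entry-point method that yields strings and other rules.

```python
from collections.abc import Iterator


class MiniGenerator:
    """Hypothetical stand-in for a text generator; not the yieldlang API."""

    def top(self):
        raise NotImplementedError

    def _flatten(self, symbol) -> Iterator[str]:
        if isinstance(symbol, str):
            # Plain strings are emitted unchanged.
            yield symbol
        elif callable(symbol):
            # A rule method is called, and whatever it returns is walked.
            yield from self._flatten(symbol())
        else:
            # Anything else is assumed to be an iterable of symbols,
            # such as the generator object returned by a rule method.
            for part in symbol:
                yield from self._flatten(part)

    def __iter__(self) -> Iterator[str]:
        # Iteration always starts from the `top` rule.
        return self._flatten(self.top)


class Greeting(MiniGenerator):
    def top(self):
        yield "Hello, "
        yield self.name
        yield "!"

    def name(self):
        yield "World"


print("".join(Greeting()))
```

The sketch shows why a `for` loop over the instance is enough: `__iter__` hides the walk over nested rules, so the caller only sees a stream of text pieces.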

Pass a different sampler to the generator (the default is random sampling). For example, to use a large language model sampler:

# MyLLMSampler is a placeholder for your own sampler class
sampler = MyLLMSampler()
print(list(G(sampler)))

Use combinators (e.g., select, repeat, join) to define grammar rules in the TextGenerator. For example, for JSON values:

def value(self):
    yield select(
        self.object,
        self.array,
        self.string,
        self.number,
        self.boolean,
        self.null
    )

This is equivalent to the EBNF form:

value = object 
      | array
      | string
      | number
      | boolean
      | null
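
The relationship between an alternation and a pluggable sampler can be sketched with the standard library alone. Everything below is illustrative and assumed: the `Sampler` type, `random_sampler`, `first_sampler`, and this standalone `select` function are hypothetical and do not reflect the signatures of the yieldlang package.

```python
import random
from collections.abc import Callable

# In this sketch, a sampler is a function that picks one index out of n choices.
Sampler = Callable[[int], int]


def random_sampler(n: int) -> int:
    # Uniform random choice, standing in for a default random sampler.
    return random.randrange(n)


def first_sampler(n: int) -> int:
    # Deterministic choice, handy for tests.
    return 0


def select(sampler: Sampler, *alternatives: str) -> str:
    """Pick one alternative, delegating the decision to the sampler."""
    return alternatives[sampler(len(alternatives))]


def value(sampler: Sampler) -> str:
    # Mirrors: value = object | array | string | number | boolean | null
    return select(sampler, "{}", "[]", '""', "0", "true", "null")


print(value(random_sampler))
print(value(first_sampler))
```

Keeping the choice inside the sampler means the grammar stays the same whether the decision comes from a random number generator or from a model that scores each alternative.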

Write a sequence of symbols as a tuple. For example:

def array(self):
    yield select(
        ('[', self.ws, ']'),
        ('[', self.elements, ']')
    )

A yield expression evaluates to the string that was just generated, so you can add branches, loops, and other control structures to the generation rules. For example:

def diagram(self):
    match (yield self.diagram_type):
        case "flowchart":
            yield self.flowchart
        case "gantt":
            yield self.gantt
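
The mechanism that lets a rule read back what was generated is Python's generator `send()`. The sketch below uses only the standard library; `drive` and `choose` are hypothetical names, and the convention that a yielded tuple means "pick one of these" is an assumption made for this example, not YieldLang's behavior.

```python
def diagram():
    # The yield expression evaluates to whatever the driver sends back,
    # so the rule can branch on the text that was just generated.
    kind = yield ("flowchart", "gantt")
    if kind == "flowchart":
        yield "<nodes and edges>"
    elif kind == "gantt":
        yield "<tasks and dates>"


def drive(gen, choose):
    """Run a generator, answering each tuple of options with one choice."""
    out = []
    try:
        item = next(gen)
        while True:
            if isinstance(item, tuple):
                picked = choose(item)
                out.append(picked)
                # Feed the chosen text back into the paused rule.
                item = gen.send(picked)
            else:
                out.append(item)
                item = next(gen)
    except StopIteration:
        pass
    return out


print(drive(diagram(), lambda options: options[0]))
print(drive(diagram(), lambda options: options[1]))
```

Because the driver owns the loop, the same rule can be steered by any selection strategy without changing the rule itself.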

Use a loop statement in the generator. For example:

def repeat4(self, s):
    results: list[str] = []
    for _ in range(4):
        results.append((yield s))
    self.do_my_own_thing(results)

Print the generated context tree (convertible to an abstract syntax tree):

def print_context_tree():
    ctx = yield from G()
    print(ctx)

For more documentation, please visit docs.yieldlang.com.

Development

For more information, please refer to CONTRIBUTING.md.

Clone

On Windows, run the clone command as administrator so that git can create symbolic links correctly (Linux users can ignore this):

git clone -c core.symlinks=true https://github.com/YieldLang/yieldlang.git

Install

Install the package in editable mode with the development dependencies:

pip install -e ".[dev]"

Make

make run-checks # Run all checks and tests
make build      # Build the package
make docs       # Build and watch the docs

Release

To release the YieldLang package, see RELEASE_PROCESS.md.

Publications

Acknowledgements