The Gemma AI Toolkit offers a user-friendly way to leverage Google's latest open-source model, Gemma, for text generation and processing tasks such as conversation, summarization, and question answering.
This toolkit provides both a Python wrapper and a command-line interface, so no deep technical knowledge is required. It supports various model versions, including special instruction-tuned variants, and allows for offline use once the models are downloaded to your system.
- Conversational AI: Create interactive, real-time chat experiences (chatbots) or AI assistants.
- Text Generation: Produce coherent and contextually relevant text and answers from simple prompts.
- Offline Capability: Utilize models offline once downloaded, reducing dependency on an internet connection.
- Highly Customizable: Tailor settings like model version, maximum tokens, and more to suit your specific requirements.
- Lightweight Integration: Efficiently designed with minimal dependencies, requiring only the `torch` and `transformers` packages for core functionality.
- Python 3.6 or newer.
- Internet connection for downloading model weights and dependencies (only required for download, not use).
- An API key from Hugging Face (if accessing models that are not cached locally).
The following Python packages are required:
- `torch`: Also known as PyTorch, this package provides the underlying framework for tensor operations and neural network layers, enabling the loading and execution of the Gemma models.
- `transformers`: Developed by Hugging Face, this library is used for easy access to the Gemma models and their tokenizers, facilitating tasks like model downloading, loading, and inference with just a few lines of code.
The following Python packages are optional:
- `python-dotenv`: For managing API keys and other environment variables.
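Before running the toolkit, it can be handy to confirm that the required packages are importable. The snippet below is an illustrative sketch, not part of the toolkit itself; `check_dependencies` is a hypothetical helper:

```python
import importlib.util

def check_dependencies(required=("torch", "transformers"), optional=("dotenv",)):
    """Return (missing_required, missing_optional) package lists.

    Note: "dotenv" is the import name of the python-dotenv package.
    """
    missing_required = [pkg for pkg in required if importlib.util.find_spec(pkg) is None]
    missing_optional = [pkg for pkg in optional if importlib.util.find_spec(pkg) is None]
    return missing_required, missing_optional
```

Running this before first use gives a clearer error than a mid-run `ModuleNotFoundError`.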
To use the Gemma AI Toolkit, clone the repository to your local machine and install the required Python packages.
- Clone the repository:

  ```shell
  git clone https://github.com/RMNCLDYO/gemma-ai-toolkit.git
  ```

- Navigate to the repository's folder:

  ```shell
  cd gemma-ai-toolkit
  ```

- Install the required dependencies:

  ```shell
  pip install -r requirements.txt
  ```
This wrapper and command-line interface support gated models like Google's `gemma-2b-it` and `gemma-7b-it` instruct models. To gain access to these models:
- Visit Google AI and sign in with your Google account.
- Click on "Get Started" to be redirected to the official Google 'Kaggle' page here.
- Authorize Hugging Face to access the gated model by following the prompts on Kaggle.
Once granted access, you can use your Hugging Face API key with this wrapper to download the Gemma models and automatically set up your cache. Once the weights and tokenizer are downloaded to your system and cached, neither the Hugging Face API key nor an internet connection is required.
- If not using locally cached models, obtain an API key (token) from Hugging Face.
- Create or rename the `.env` file in the project's root directory and add your API key:

  ```
  API_KEY=your_api_key
  ```
To start a chat with the default Gemma model:
```shell
python gemma_cli.py chat
```
To ask a question with the default Gemma model:
```shell
python gemma_cli.py text --prompt "Your prompt here"
```
For additional options and help:
```shell
python gemma_cli.py --help
```
To initiate a conversation using the default Gemma model with the wrapper:
```python
from gemma_chat import ChatAPI

ChatAPI().chat()
```
To ask the model a question using the default Gemma model with the wrapper:
```python
from gemma_text import TextAPI

TextAPI(prompt="Your question goes here.").text()
```
The tool allows for advanced configurations including specifying the model version, adjusting the maximum number of tokens, and utilizing different computational precisions for optimizing performance on specific hardware.
- `--model`: Specify the Gemma model version.
- `--api_key`: Your Hugging Face API key.
- `--max_tokens`: The maximum number of tokens to generate.
- `--prompt`: The question you would like to ask the model.
- `--model`: Specify the Gemma model version.
- `--api_key`: Your Hugging Face API key.
- `--max_tokens`: The maximum number of tokens to generate.
To exit the program at any time, type `exit` or `quit`. This works the same whether you're interacting via the CLI or the Python wrapper, so you can safely conclude your session with the Gemma AI Toolkit without resorting to interrupt signals or forcibly closing the terminal or command prompt.
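Under the hood, this kind of exit handling is typically a simple check inside the input loop. A minimal sketch follows (hypothetical names; the toolkit's actual implementation may differ):

```python
EXIT_COMMANDS = {"exit", "quit"}

def should_exit(user_input: str) -> bool:
    """True when the user asked to end the session (case-insensitive)."""
    return user_input.strip().lower() in EXIT_COMMANDS

def chat_loop(read_input, respond):
    """Minimal REPL skeleton: read prompts until an exit command is seen."""
    while True:
        prompt = read_input()
        if should_exit(prompt):
            break
        respond(prompt)
```

Because the check runs before each model call, the session ends cleanly without interrupting generation mid-stream.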
When using the Python wrapper, these configurations can be adjusted by passing parameters to the `ChatAPI` class:

```python
ChatAPI(model="google/gemma-2b-it", api_key="your_api_key", max_tokens=150)
```
When using the Python wrapper, these configurations can be adjusted by passing parameters to the `TextAPI` class:

```python
TextAPI(prompt="Your question goes here.", model="google/gemma-2b-it", api_key="your_api_key", max_tokens=150)
```
Once the Gemma model weights are downloaded to your system, they are cached locally, allowing for offline access thereafter. This means subsequent uses do not require an internet connection or the API key, provided you're using the same system and the model weights remain in the cache.
- Performance: Running large models like Gemma locally can be resource-intensive. Users with limited CPU capabilities may experience slower response times.
- Offline Use: For offline execution, ensure that the model and tokenizer files are correctly cached. All necessary dependencies must be installed while online.
- Hardware Requirements: Performance can vary significantly based on the hardware configuration. Users are encouraged to adjust model parameters or use optimized versions of the model for better performance on constrained devices.
Contributions are welcome!
Please refer to CONTRIBUTING.md for detailed guidelines on how to contribute to this project.
Encountered a bug? We'd love to hear about it. Please follow these steps to report any issues:
- Check if the issue has already been reported.
- Use the Bug Report template to create a detailed report.
- Submit the report here.
Your report will help us make the project better for everyone.
Got an idea for a new feature? Feel free to suggest it. Here's how:
- Check if the feature has already been suggested or implemented.
- Use the Feature Request template to create a detailed request.
- Submit the request here.
Your suggestions for improvements are always welcome.
Stay up-to-date with the latest changes and improvements in each version:
- CHANGELOG.md provides detailed descriptions of each release.
Your security is important to us. If you discover a security vulnerability, please follow our responsible disclosure guidelines found in SECURITY.md. Please refrain from disclosing any vulnerabilities publicly until they have been reported and addressed.
Licensed under the MIT License. See LICENSE for details.