Skip to content

NimbleBox Apprenticeship ML Engineer Task - 1. This project demonstrates the implementation of a Language Model Server using FastAPI and gRPC. It leverages a large language model to generate coherent text based on user input.

Notifications You must be signed in to change notification settings

visheshc14/LLM-FastAPI

Repository files navigation

LLM-FastAPI

NimbleBox Apprenticeship ML Engineer Task - 1 This project demonstrates the implementation of a Language Model Server using FastAPI and gRPC. It leverages a large language model to generate coherent text based on user input.

Getting Started To set up and run the project, follow the steps below:

Install the required Python packages by running bash pip install -r requirements.txt.

Train the language model using trainer.py. Provide the dataset file (--fp argument) and other training arguments as needed. The trained model weights will be saved in a specified location.

Start the language model server by running uvicorn server:app --host 0.0.0.0 --port 8000. The server will listen on http://localhost:8000 and accept text generation requests.

Use the provided APIs or client.py to generate text by sending requests to the server. Example curl command:

curl -X POST -H "Content-Type: application/json" -d '{"text": "Hello"}' http://localhost:8000/generate

Optionally, use test.py to stress test the server's performance and evaluate its response time under load.

Define the Protobuf service and message types in text_generator.proto:

syntax = "proto3";

package textgenerator;

service TextGenerator {
  rpc GenerateText(TextRequest) returns (TextResponse) {}
}

message TextRequest {
  string text = 1;
}

message TextResponse {
  string generated_text = 1;
}

Generate the gRPC code using the protoc compiler, you need to install the protobuf and grpcio-tools packages:

pip install protobuf grpcio-tools
python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. text_generator.proto

After running the command, you will see two new files generated in the current directory:

  • text_generator_pb2.py: Contains the generated code for the Protobuf messages.
  • text_generator_pb2_grpc.py: Contains the generated code for the gRPC service.

Crux: ML Engineer

Bonus points:

  • if the filepath can be a GitHub gist (eg. this gist)
  • if everything can be run via single shell file
  • if LLM can give coherent reply
  • a file test.py that can:
    • stress test the server using multithreading
    • provide a CLI for using the model fast

Ultra bonus points:

  • you use gRPC over HTTP/REST
  • you use something other than python (but not C++, Javascript FFS)

Train a language model and serve it over a FastAPI.

  • create a github repository
  • create a file called trainer.py which can be accessed via CLI to train an LLM (protip: take a look at python-fire). It should take in following arguments:
    • fp the file to finetune the model on
    • some training arguments as well (protip: don't use huggingface try karpathy/minGPT)
    • the result of this should be the model weights saved in some location
  • create a file called server.py that serves the LLM over a HTTP/REST over some APIs (protip: use pydantic for models)
  • A curl command to call the model and get response
  • an ipython notebook that contains steps to run this

About

NimbleBox Apprenticeship ML Engineer Task - 1. This project demonstrates the implementation of a Language Model Server using FastAPI and gRPC. It leverages a large language model to generate coherent text based on user input.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published