
LangChain doesn't work when running src/api_demo.py with Meta-Llama-3-8B-Instruct, but calling chat.completions.create works fine. #3421

hzgdeerHo opened this issue Apr 24, 2024 · 2 comments
hzgdeerHo commented Apr 24, 2024

Reminder

  • I have read the README and searched the existing issues.

Reproduction

Run a server like this:

 CUDA_VISIBLE_DEVICES=0 API_PORT=8090 python src/api_demo.py \
    --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct \
    --template llama3 \
    --infer_backend vllm \
    --vllm_enforce_eager
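
For context, a quick way to confirm the server came up is to hit the OpenAI-compatible /v1/models endpoint (a sketch, assuming the API_PORT=8090 set above and the dummy key "0" used later in this thread):

# Sketch: confirm the OpenAI-compatible server is reachable before testing chat.
from openai import OpenAI

client = OpenAI(api_key="0", base_url="http://localhost:8090/v1")
for model in client.models.list():
    print(model.id)  # the served model should appear here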

Run this on the client:

llm = ChatOpenAI(
    model_name=Model.SQL,
    temperature=0.01,
    streaming=True,
    # stop=stop_str_list,
    # max_tokens=4000,
    base_url=Model.base_url,
    api_key=Model.api_key,
    callbacks=[callback_handler],
)

# Run the chain.
QA_CHAIN_PROMPT = PromptTemplate.from_template(template)
qa = RetrievalQA.from_chain_type(
    llm,
    retriever=retriever,
    chain_type="stuff",
    # return_intermediate_steps=True,
    return_source_documents=True,
    chain_type_kwargs={"prompt": QA_CHAIN_PROMPT},
)
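
The snippet above only builds the chain; a typical invocation (a sketch, with `question` standing in for the actual user query) would be:

# Sketch: invoke the chain built above; `question` is a placeholder.
result = qa.invoke({"query": question})
print(result["result"])            # the generated answer
print(result["source_documents"])  # included because return_source_documents=True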

I got no output!

Expected behavior

No response

System Info

No response

Others

No response

hiyouga added the pending label (This problem is yet to be addressed.) on Apr 24, 2024
codemayq (Collaborator) commented Apr 25, 2024

Please first use a simple script like the one below to check that your API server is running OK, then use it in LangChain.

Bless.

import os
from openai import OpenAI
from transformers.utils.versions import require_version


require_version("openai>=1.5.0", "To fix: pip install openai>=1.5.0")


if __name__ == '__main__':
    client = OpenAI(
        api_key="0",
        base_url="http://localhost:{}/v1".format(os.environ.get("API_PORT", 8000)),
    )
    messages = []
    messages.append({"role": "user", "content": "hello, where is USA"})
    result = client.chat.completions.create(messages=messages, model="test")
    print(result.choices[0].message)
    
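Since the LangChain client in the report sets streaming=True, streaming may also be worth checking with the raw client (a sketch extending the script above):

# Sketch: extend the script above to verify streamed completions also work,
# since the failing LangChain client uses streaming=True.
stream = client.chat.completions.create(messages=messages, model="test", stream=True)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)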

hzgdeerHo (Author) commented
I am sure it is running OK when calling the API like this:

client = OpenAI(
    api_key="0",
    base_url="http://localhost:{}/v1".format(os.environ.get("API_PORT", 8000)),
)
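
One way to narrow this down is to call ChatOpenAI directly, bypassing RetrievalQA, to tell a client/server issue apart from a chain-construction issue (a sketch; the import path assumes the langchain-openai package, and port 8090 matches the server command above):

# Sketch: test the LangChain client alone, without the retrieval chain.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model_name="test",  # "test" as in the script above; the demo server appears to ignore it
    api_key="0",
    base_url="http://localhost:8090/v1",
    streaming=True,
)
print(llm.invoke("hello, where is USA").content)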
