Is streaming output supported? And are there any benchmark results for time to first token and time per output token? Thanks.
I implemented this by copying the `def stream_chat(self, ...)` method from `modeling_internlm2.py` (https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5/blob/main/modeling_internlm2.py) into `modeling_internvl_chat.py` and making very small changes, and I verified that it works.
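For anyone adapting this: streaming chat methods like the one above typically run generation in a background thread and hand decoded tokens to the caller through a queue (this is the pattern behind Hugging Face's `TextIteratorStreamer`). The sketch below illustrates that producer/consumer mechanism with a toy `TokenStreamer` and a `fake_generate` stand-in for `model.generate(..., streamer=...)`; both names are illustrative, not part of the InternVL code.

```python
import threading
import queue

class TokenStreamer:
    """Minimal stand-in for a streaming interface like transformers'
    TextIteratorStreamer: the generation thread pushes tokens into a
    queue, and the caller iterates over them as they arrive."""
    _END = object()  # sentinel marking end of generation

    def __init__(self):
        self._q = queue.Queue()

    def put(self, token):
        # Called from the generation thread for each new token.
        self._q.put(token)

    def end(self):
        # Called once when generation finishes.
        self._q.put(self._END)

    def __iter__(self):
        while True:
            item = self._q.get()
            if item is self._END:
                return
            yield item

def fake_generate(streamer, tokens):
    # Stand-in for model.generate(..., streamer=streamer):
    # emits one token at a time, then signals completion.
    for t in tokens:
        streamer.put(t)
    streamer.end()

streamer = TokenStreamer()
thread = threading.Thread(
    target=fake_generate,
    args=(streamer, ["Hello", ",", " world"]),
)
thread.start()
pieces = [tok for tok in streamer]  # consume tokens as they arrive
thread.join()
print("".join(pieces))
```

With a real model, the consuming loop is where you would print tokens incrementally or measure time to first token (time until the first item arrives) versus time per output token (spacing between subsequent items).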