{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":634081686,"defaultBranch":"main","name":"mlc-llm","ownerLogin":"mlc-ai","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2023-04-29T01:59:25.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/106173866?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1716155690.0","currentOid":""},"activityList":{"items":[{"before":"d2ae3083dc835e03ab79d9055bc1bcdad5d4ab24","after":"a15831154d11d05991630a0ea39374f9352a13fb","ref":"refs/heads/gh-pages","pushedAt":"2024-05-20T17:59:07.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Mon May 20 17:59:06 UTC 2024","shortMessageHtmlLink":"Build at Mon May 20 17:59:06 UTC 2024"}},{"before":"9998076153d5309ec87dc32c373e1759813ee84e","after":"beb126cc9ba712478eeed386de796f12be23e9a5","ref":"refs/heads/main","pushedAt":"2024-05-20T17:53:45.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"tqchen","name":"Tianqi Chen","path":"/tqchen","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2577440?s=80&v=4"},"commit":{"message":"[Fix][Serving] Fix prefill chunk in interactive mode (#2363)\n\nThis PR fixes a bug of prefill chunking in the interactive mode.\r\nThe bug counts requests with remaining inputs as running requests\r\nwhich turns out disabling the prefill of the remaining inputs.\r\n\r\nThis PR fixes by no longer counting requests with unfinished inputs\r\nas running requests for decode.","shortMessageHtmlLink":"[Fix][Serving] Fix prefill chunk in interactive mode (#2363)"}},{"before":null,"after":"21fac34bc3cad86708441b5d56bd85a6afb23023","ref":"refs/heads/cli","pushedAt":"2024-05-19T21:54:50.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"tqchen","name":"Tianqi Chen","path":"/tqchen","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2577440?s=80&v=4"},"commit":{"message":"[CLI] Migrate CLI to use the new Engine\n\nThis PR migrates the CLI to the new JSON FFI Engine.\nThe resulting generation will be faster, we still need to ensure\nwe can enable sliding window support when needed.","shortMessageHtmlLink":"[CLI] Migrate CLI to use the new Engine"}},{"before":"1528e6d5bf2457c529656045d30b0eb21f259047","after":"e2f7d70135280a58571a993d6046002089091d57","ref":"refs/heads/ios-engine","pushedAt":"2024-05-19T20:49:52.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"tqchen","name":"Tianqi Chen","path":"/tqchen","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2577440?s=80&v=4"},"commit":{"message":"Minor refactor","shortMessageHtmlLink":"Minor refactor"}},{"before":"d7c81261b5b9b85c9c7ee59479c3bbd89689d23f","after":"d2ae3083dc835e03ab79d9055bc1bcdad5d4ab24","ref":"refs/heads/gh-pages","pushedAt":"2024-05-19T16:54:54.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Sun May 19 16:54:54 UTC 2024","shortMessageHtmlLink":"Build at Sun May 19 16:54:54 UTC 2024"}},{"before":"0e3d53698742861bedddeeb4029a14ade4002a57","after":"9998076153d5309ec87dc32c373e1759813ee84e","ref":"refs/heads/main","pushedAt":"2024-05-19T16:49:45.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"tqchen","name":"Tianqi Chen","path":"/tqchen","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2577440?s=80&v=4"},"commit":{"message":"[JSONFFI] Fix JSONFFI conv template. Add unit tests (#2360)","shortMessageHtmlLink":"[JSONFFI] Fix JSONFFI conv template. Add unit tests (#2360)"}},{"before":"5fbbac05e033e5dbfbd27ebda19afcf1c257b88f","after":"1528e6d5bf2457c529656045d30b0eb21f259047","ref":"refs/heads/ios-engine","pushedAt":"2024-05-19T00:44:47.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"tqchen","name":"Tianqi Chen","path":"/tqchen","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2577440?s=80&v=4"},"commit":{"message":"[iOS] Switch MLC Chat to use MLCEngine\n\nThis PR switchs MLC Chat to use MLC Engine","shortMessageHtmlLink":"[iOS] Switch MLC Chat to use MLCEngine"}},{"before":"29de10aa0975ebf9faefc9d03d5636441b0414a5","after":"d7c81261b5b9b85c9c7ee59479c3bbd89689d23f","ref":"refs/heads/gh-pages","pushedAt":"2024-05-18T23:50:45.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Sat May 18 23:50:45 UTC 2024","shortMessageHtmlLink":"Build at Sat May 18 23:50:45 UTC 2024"}},{"before":"96fc28994a30c35939c86f28f86a0d7a552a435f","after":"0e3d53698742861bedddeeb4029a14ade4002a57","ref":"refs/heads/main","pushedAt":"2024-05-18T23:44:45.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"tqchen","name":"Tianqi Chen","path":"/tqchen","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2577440?s=80&v=4"},"commit":{"message":"[iOS] Update MLCEngine API to latest JSON FFI convention (#2359)\n\nThis PR updates the MLCEngine API to latest JSON FFI convention.","shortMessageHtmlLink":"[iOS] Update MLCEngine API to latest JSON FFI convention (#2359)"}},{"before":"e8f63a7f3e80734a65832f5c661009ef173e9dba","after":"5fbbac05e033e5dbfbd27ebda19afcf1c257b88f","ref":"refs/heads/ios-engine","pushedAt":"2024-05-18T23:42:36.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"tqchen","name":"Tianqi Chen","path":"/tqchen","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2577440?s=80&v=4"},"commit":{"message":"[iOS] Update MLCEngine API to latest JSON FFI convention\n\nThis PR updates the MLCEngine API to latest JSON FFI convention.","shortMessageHtmlLink":"[iOS] Update MLCEngine API to latest JSON FFI convention"}},{"before":"1fc7a8db2dd06aa186b1217def201be976dd6558","after":"e8f63a7f3e80734a65832f5c661009ef173e9dba","ref":"refs/heads/ios-engine","pushedAt":"2024-05-18T23:41:09.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"tqchen","name":"Tianqi Chen","path":"/tqchen","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2577440?s=80&v=4"},"commit":{"message":"[iOS] Update MLCEngine API to latest JSON FFI convention\n\nThis PR updates the MLCEngine API to latest JSON FFI convention.","shortMessageHtmlLink":"[iOS] Update MLCEngine API to latest JSON FFI convention"}},{"before":"67db3f7b5dc1c0a68029454f822412525a331081","after":"29de10aa0975ebf9faefc9d03d5636441b0414a5","ref":"refs/heads/gh-pages","pushedAt":"2024-05-18T13:48:40.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Sat May 18 13:48:40 UTC 2024","shortMessageHtmlLink":"Build at Sat May 18 13:48:40 UTC 2024"}},{"before":"ac1cd51b14501cf046203f370a31c4b27ea63c00","after":"96fc28994a30c35939c86f28f86a0d7a552a435f","ref":"refs/heads/main","pushedAt":"2024-05-18T13:43:50.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"tqchen","name":"Tianqi Chen","path":"/tqchen","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2577440?s=80&v=4"},"commit":{"message":"[JSON FFI] Example Android Application using JSON FFI Engine (#2322)\n\n* pass str to callback and not List[str]\r\n\r\nadd json ffif android example\r\n\r\nfix lint\r\n\r\nRefactor MLCEngineExample and MLCEngine.kt\r\n\r\nUse ChatCompletionMessageContent class\r\n\r\nChatCompletionMessageContent: text and parts\r\n\r\n* JSONFFIEngine: Cast request_stream_callback argument to std::string. Decode in Android as List\r\n\r\n---------\r\n\r\nCo-authored-by: Animesh Bohara ","shortMessageHtmlLink":"[JSON FFI] Example Android Application using JSON FFI Engine (#2322)"}},{"before":"aa680103de43ba6969fa44284a2fda7ce0eb01f9","after":"67db3f7b5dc1c0a68029454f822412525a331081","ref":"refs/heads/gh-pages","pushedAt":"2024-05-18T03:44:59.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Sat May 18 03:44:58 UTC 2024","shortMessageHtmlLink":"Build at Sat May 18 03:44:58 UTC 2024"}},{"before":"152ecc43cf20158ff9cd89a9d2398142f6a61067","after":"ac1cd51b14501cf046203f370a31c4b27ea63c00","ref":"refs/heads/main","pushedAt":"2024-05-18T03:38:55.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"tqchen","name":"Tianqi Chen","path":"/tqchen","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2577440?s=80&v=4"},"commit":{"message":"[Bugfix] Make sequence_length dtype int64 in EngineConfig. Fix Mistral engine serving issue (#2358)\n\n* [Bugfix] Make sequence_length dtype int64 in EngineConfig. Fix Mistral engine serving issue","shortMessageHtmlLink":"[Bugfix] Make sequence_length dtype int64 in EngineConfig. Fix Mistra…"}},{"before":"13efd17af930c9129f2c0b2e5d25885b67458532","after":"aa680103de43ba6969fa44284a2fda7ce0eb01f9","ref":"refs/heads/gh-pages","pushedAt":"2024-05-16T13:02:34.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Thu May 16 13:01:02 UTC 2024","shortMessageHtmlLink":"Build at Thu May 16 13:01:02 UTC 2024"}},{"before":"56ea1560a02a3672f6b7802853447236e777cd60","after":"152ecc43cf20158ff9cd89a9d2398142f6a61067","ref":"refs/heads/main","pushedAt":"2024-05-16T12:55:12.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"tqchen","name":"Tianqi Chen","path":"/tqchen","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2577440?s=80&v=4"},"commit":{"message":"[Serving] Add reset_engine in debug_entrypoints (#2347)","shortMessageHtmlLink":"[Serving] Add reset_engine in debug_entrypoints (#2347)"}},{"before":"fd1c48780cd1fb86616f594cb6be7919680a68e8","after":"13efd17af930c9129f2c0b2e5d25885b67458532","ref":"refs/heads/gh-pages","pushedAt":"2024-05-15T16:24:30.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Wed May 15 16:24:29 UTC 2024","shortMessageHtmlLink":"Build at Wed May 15 16:24:29 UTC 2024"}},{"before":"9b89e048a5bcd84b68f9df3675d1599e502884df","after":"56ea1560a02a3672f6b7802853447236e777cd60","ref":"refs/heads/main","pushedAt":"2024-05-15T16:18:39.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"tqchen","name":"Tianqi Chen","path":"/tqchen","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2577440?s=80&v=4"},"commit":{"message":"[JSONFFIEngine] Refactor device argument and request_stream_callback argument (#2334)\n\n* 1. Refactor init_background_engine in JSONFFIEngine to use device_type and device_id arguments.\r\n2. request_stream_callback is called on each string of the array of strings.\r\n\r\n* Calling callback on string of list of JSON dicts instead of each string of JSON dict multiple times\r\n\r\n---------\r\n\r\nCo-authored-by: Animesh Bohara ","shortMessageHtmlLink":"[JSONFFIEngine] Refactor device argument and request_stream_callback …"}},{"before":"227dbb87260b2e14d030a6d880c4f69d475c7022","after":"9b89e048a5bcd84b68f9df3675d1599e502884df","ref":"refs/heads/main","pushedAt":"2024-05-15T05:50:16.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Auto updated submodule references","shortMessageHtmlLink":"Auto updated submodule references"}},{"before":"4565bc40896f1386a5931113363d6a846859459e","after":"fd1c48780cd1fb86616f594cb6be7919680a68e8","ref":"refs/heads/gh-pages","pushedAt":"2024-05-15T05:33:56.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Wed May 15 05:33:55 UTC 2024","shortMessageHtmlLink":"Build at Wed May 15 05:33:55 UTC 2024"}},{"before":"2bbbd52cde62aeed2d0a6f7975c5af81ba84da4a","after":"227dbb87260b2e14d030a6d880c4f69d475c7022","ref":"refs/heads/main","pushedAt":"2024-05-15T05:28:33.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"tqchen","name":"Tianqi Chen","path":"/tqchen","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2577440?s=80&v=4"},"commit":{"message":"Add false for arg worker0_only in disco.empty (#2344)","shortMessageHtmlLink":"Add false for arg worker0_only in disco.empty (#2344)"}},{"before":"83aa8055d965d9905c61c791cc36359040f4df3c","after":"4565bc40896f1386a5931113363d6a846859459e","ref":"refs/heads/gh-pages","pushedAt":"2024-05-15T03:36:18.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Wed May 15 03:36:18 UTC 2024","shortMessageHtmlLink":"Build at Wed May 15 03:36:18 UTC 2024"}},{"before":"b247f8d2c733c71924c4afc2abc427f6c8d0ab91","after":"2bbbd52cde62aeed2d0a6f7975c5af81ba84da4a","ref":"refs/heads/main","pushedAt":"2024-05-15T03:30:45.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"vinx13","name":"Wuwei Lin","path":"/vinx13","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/7195739?s=80&v=4"},"commit":{"message":"Fix cublas offloading (#2343)","shortMessageHtmlLink":"Fix cublas offloading (#2343)"}},{"before":"1fd3c1d812e0c9fda0ebff2c32d41e3da2abb3a2","after":"83aa8055d965d9905c61c791cc36359040f4df3c","ref":"refs/heads/gh-pages","pushedAt":"2024-05-14T22:54:18.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Tue May 14 22:54:17 UTC 2024","shortMessageHtmlLink":"Build at Tue May 14 22:54:17 UTC 2024"}},{"before":"7a281f41921f617d6c9b041e64b1a395a405ea51","after":"1fd3c1d812e0c9fda0ebff2c32d41e3da2abb3a2","ref":"refs/heads/gh-pages","pushedAt":"2024-05-14T22:52:56.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Tue May 14 22:52:56 UTC 2024","shortMessageHtmlLink":"Build at Tue May 14 22:52:56 UTC 2024"}},{"before":"0c03537e284e92bc7b27832ba86cc1dea224b9a5","after":"b247f8d2c733c71924c4afc2abc427f6c8d0ab91","ref":"refs/heads/main","pushedAt":"2024-05-14T22:48:56.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"tqchen","name":"Tianqi Chen","path":"/tqchen","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2577440?s=80&v=4"},"commit":{"message":"[Serving] Add Medusa speculative decoding (#2337)\n\n\r\n* [Serving] Add Medusa speculative decoding","shortMessageHtmlLink":"[Serving] Add Medusa speculative decoding (#2337)"}},{"before":"bc6e3eddbd0979d365d8f8586c2c88d480bc1699","after":"0c03537e284e92bc7b27832ba86cc1dea224b9a5","ref":"refs/heads/main","pushedAt":"2024-05-14T22:47:42.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"tqchen","name":"Tianqi Chen","path":"/tqchen","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2577440?s=80&v=4"},"commit":{"message":"[DebugChat] Fix DebugChat softmax function and save logits to debug folder (#2342)\n\n* [DebugChat] Fix DebugChat softmax function and save logits to debug folder\r\n\r\n* Fix lint","shortMessageHtmlLink":"[DebugChat] Fix DebugChat softmax function and save logits to debug f…"}},{"before":"4c39404c6d01790ed199d0b2a93b1f0e76bc1f68","after":"7a281f41921f617d6c9b041e64b1a395a405ea51","ref":"refs/heads/gh-pages","pushedAt":"2024-05-14T21:24:33.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Tue May 14 21:24:32 UTC 2024","shortMessageHtmlLink":"Build at Tue May 14 21:24:32 UTC 2024"}},{"before":"1c66bfa27eea56295b404916114becccbc03cf1d","after":"bc6e3eddbd0979d365d8f8586c2c88d480bc1699","ref":"refs/heads/main","pushedAt":"2024-05-14T21:18:30.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"rickzx","name":"Rick Zhou","path":"/rickzx","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/22135348?s=80&v=4"},"commit":{"message":"[Serving][Grammar] Refactor GrammarStateMatcher and support LLaMA-3 (#2335)\n\nThis PR refactors GrammarStateMatcher and support the LLaMA-3 tokenizer.\r\n\r\nCommon tokenizers, including Phi-2, Gemma, LLaMA-2, etc. are also\r\nsupported.\r\n\r\nThe performance is optimized for LLaMA-3 tokenizer since its token table\r\nhas size 128k, much larger than LLaMA-2 tokenizer.\r\n\r\nThese changes are introduced to the grammar library:\r\n\r\nThese changes are introduced to the grammar library:\r\n1. Introduce ByteString rule expression and simplify CharacterClass\r\n and CharacterClassStar\r\n2. Refactor BNFGrammarVisitor and BNFGrammarMutator for visiting and\r\n mutating grammar rules\r\n3. Now GrammarStateMatcherBase, the internally impl of the\r\n GrammarStateMatcher, accepts char by char, instead of codepoint by\r\n codepoint. So it supports any valid UTF-8 string, even if the token\r\n is not a complete codepoint.\r\n4. Support lookahead assertion for rules to specify the rule must be\r\n followed by a sequence. This can eliminate some uncertain tokens\r\n in preprocessing.\r\n\r\nMinor changes:\r\n1. Introduce template hash function HashCombine\r\n2. Update the UTF8 encoding handling functions\r\n\r\nPerformance:\r\n1. For JSON, finding mask requires <30us on 5900X with single thread.\r\n The uncertain tokens is <30 in most cases.\r\n2. For JSON schema, finding mask requires <30us on 5900X with single\r\n thread. The uncertain tokens is <30 in most cases.","shortMessageHtmlLink":"[Serving][Grammar] Refactor GrammarStateMatcher and support LLaMA-3 (#…"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"djE6ks8AAAAETw6u5wA","startCursor":null,"endCursor":null}},"title":"Activity · mlc-ai/mlc-llm"}