{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":547806116,"defaultBranch":"main","name":"text-generation-inference","ownerLogin":"huggingface","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2022-10-08T10:26:28.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/25720743?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1717231641.0","currentOid":""},"activityList":{"items":[{"before":null,"after":"799a193b109662743bed1b18a09af1fdcd508c8b","ref":"refs/heads/fix_phi3","pushedAt":"2024-06-01T08:47:21.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Fixing Phi3.","shortMessageHtmlLink":"Fixing Phi3."}},{"before":"08b3eac2ce54e25bec12088fd7e69ee3c07adaf5","after":"799a193b109662743bed1b18a09af1fdcd508c8b","ref":"refs/heads/main","pushedAt":"2024-06-01T08:47:05.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Fixing Phi3.","shortMessageHtmlLink":"Fixing Phi3."}},{"before":"8d99b8269d6fc4eba817ca728868eaacd2bdb6af","after":"be3f84831bb8abcbd3a4b698f16c7261f143fce9","ref":"refs/heads/flashdecoding","pushedAt":"2024-05-31T22:56:40.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Fix non decoding paths.","shortMessageHtmlLink":"Fix non decoding paths."}},{"before":"4b4a8e886139a1c730906c03ba0b6c7c4e61ddb0","after":"8d99b8269d6fc4eba817ca728868eaacd2bdb6af","ref":"refs/heads/flashdecoding","pushedAt":"2024-05-31T22:54:48.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Fix Cohere.","shortMessageHtmlLink":"Fix Cohere."}},{"before":"2c6430d48b119ef2dc5749c1031169fe833a7012","after":"4b4a8e886139a1c730906c03ba0b6c7c4e61ddb0","ref":"refs/heads/flashdecoding","pushedAt":"2024-05-31T21:42:00.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Making it work on non flash decoding.","shortMessageHtmlLink":"Making it work on non flash decoding."}},{"before":"f2813ee081a2b4538ed0572a997e95f6925975b8","after":"2c6430d48b119ef2dc5749c1031169fe833a7012","ref":"refs/heads/flashdecoding","pushedAt":"2024-05-31T16:56:21.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Using flash decoding\n\nConditional flashdecoding.\n\nFix max_q.\n\nWorking kvcache\n\nWorking version with flash decoding.\n\nMake it work for mistral.\n\nFix after rebase..\n\nLess intrusive.\n\nREvert changes in modeling.\n\nSpeedup flashdecoding.\n\nHHachweew\nHack to make other models work.\n\nFixing non flash decoding llama path.\n\nRouter logic knows about page size.\n\nMissing 2 models.\n\nMissing cohere.\n\nFixing cohere flash decoding.\n\nRevamped all this architecture.\n\nFix cohere.\n\nFixing falcon.\n\nEnabling custom block size schedule.\n\nUpdate router/src/infer.rs\n\nNot sending preallocated output.","shortMessageHtmlLink":"Using flash decoding"}},{"before":"64a4b887661021071892e1f3062c16c66e884887","after":null,"ref":"refs/heads/minor-docs-fix","pushedAt":"2024-05-31T16:42:15.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"}},{"before":"5ab4cef67ef6326429a0e4e3d44b9710d9f26c53","after":"08b3eac2ce54e25bec12088fd7e69ee3c07adaf5","ref":"refs/heads/main","pushedAt":"2024-05-31T16:42:14.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"single char ` addition for docs (#1989)\n\n# What does this PR do?\r\n\r\nI think this will fix the docs from being weirdly formatted. All the\r\nsections after MAX_TOP_N_TOKENS don't show up in the bar on the right\r\n(https://huggingface.co/docs/text-generation-inference/basic_tutorials/launcher#maxtopntokens)\r\n\r\n\r\n## Before submitting\r\n- [x] This PR fixes a typo or improves the docs (you can dismiss the\r\nother checks if that's the case).\r\n- [ ] Did you read the [contributor\r\nguideline](https://github.com/huggingface/transformers/blob/main/CONTRIBUTING.md#start-contributing-pull-requests),\r\n Pull Request section?\r\n- [ ] Was this discussed/approved via a Github issue or the\r\n[forum](https://discuss.huggingface.co/)? Please add a link\r\n to it if that's the case.\r\n- [ ] Did you make sure to update the documentation with your changes?\r\nHere are the\r\n[documentation\r\nguidelines](https://github.com/huggingface/transformers/tree/main/docs),\r\nand\r\n[here are tips on formatting\r\ndocstrings](https://github.com/huggingface/transformers/tree/main/docs#writing-source-documentation).\r\n- [ ] Did you write any new necessary tests?\r\n\r\n\r\n## Who can review?\r\n\r\n@merveenoyan\r\n\r\n---------\r\n\r\nCo-authored-by: Nicolas Patry ","shortMessageHtmlLink":"single char ` addition for docs (#1989)"}},{"before":null,"after":"c79cf825bd81a42a8da3f05aad2670de70789e41","ref":"refs/heads/maintenance/merge-vlm-input-prep","pushedAt":"2024-05-31T16:14:58.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"danieldk","name":"Daniël de Kok","path":"/danieldk","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49398?s=80&v=4"},"commit":{"message":"WIP","shortMessageHtmlLink":"WIP"}},{"before":"37f955dd14857be9f4f04807a05bdc6eacbdb081","after":"64a4b887661021071892e1f3062c16c66e884887","ref":"refs/heads/minor-docs-fix","pushedAt":"2024-05-31T16:07:02.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Fixing the CLI.","shortMessageHtmlLink":"Fixing the CLI."}},{"before":"5b58262fea014e21b89133c04d6a749570a50f70","after":null,"ref":"refs/heads/fix_exl2_scratch","pushedAt":"2024-05-31T16:01:44.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"}},{"before":"06edde94910594eef86988934cbbc43d775eb965","after":"5ab4cef67ef6326429a0e4e3d44b9710d9f26c53","ref":"refs/heads/main","pushedAt":"2024-05-31T16:01:43.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Fixing exl2 scratch buffer. (#1990)\n\n# What does this PR do?\r\n\r\n\r\n\r\n\r\n\r\nFixes # (issue)\r\n\r\n\r\n## Before submitting\r\n- [ ] This PR fixes a typo or improves the docs (you can dismiss the\r\nother checks if that's the case).\r\n- [ ] Did you read the [contributor\r\nguideline](https://github.com/huggingface/transformers/blob/main/CONTRIBUTING.md#start-contributing-pull-requests),\r\n Pull Request section?\r\n- [ ] Was this discussed/approved via a Github issue or the\r\n[forum](https://discuss.huggingface.co/)? Please add a link\r\n to it if that's the case.\r\n- [ ] Did you make sure to update the documentation with your changes?\r\nHere are the\r\n[documentation\r\nguidelines](https://github.com/huggingface/transformers/tree/main/docs),\r\nand\r\n[here are tips on formatting\r\ndocstrings](https://github.com/huggingface/transformers/tree/main/docs#writing-source-documentation).\r\n- [ ] Did you write any new necessary tests?\r\n\r\n\r\n## Who can review?\r\n\r\nAnyone in the community is free to review the PR once the tests have\r\npassed. Feel free to tag\r\nmembers/contributors who may be interested in your PR.\r\n\r\n","shortMessageHtmlLink":"Fixing exl2 scratch buffer. (#1990)"}},{"before":"be87c840b80577c53bb5c4ec8ac8a19e0376473a","after":"f2813ee081a2b4538ed0572a997e95f6925975b8","ref":"refs/heads/flashdecoding","pushedAt":"2024-05-31T16:00:04.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Not sending preallocated output.","shortMessageHtmlLink":"Not sending preallocated output."}},{"before":"659bd67fec0a874e325fc2a2afd0c2ed2af692f0","after":"06edde94910594eef86988934cbbc43d775eb965","ref":"refs/heads/main","pushedAt":"2024-05-31T15:57:02.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Purely refactors paged/attention into `layers/attention` and make hardware differences more obvious with 1 file per hardware. (#1986)\n\n# What does this PR do?\r\n\r\n\r\n\r\n\r\n\r\nFixes # (issue)\r\n\r\n\r\n## Before submitting\r\n- [ ] This PR fixes a typo or improves the docs (you can dismiss the\r\nother checks if that's the case).\r\n- [ ] Did you read the [contributor\r\nguideline](https://github.com/huggingface/transformers/blob/main/CONTRIBUTING.md#start-contributing-pull-requests),\r\n Pull Request section?\r\n- [ ] Was this discussed/approved via a Github issue or the\r\n[forum](https://discuss.huggingface.co/)? Please add a link\r\n to it if that's the case.\r\n- [ ] Did you make sure to update the documentation with your changes?\r\nHere are the\r\n[documentation\r\nguidelines](https://github.com/huggingface/transformers/tree/main/docs),\r\nand\r\n[here are tips on formatting\r\ndocstrings](https://github.com/huggingface/transformers/tree/main/docs#writing-source-documentation).\r\n- [ ] Did you write any new necessary tests?\r\n\r\n\r\n## Who can review?\r\n\r\nAnyone in the community is free to review the PR once the tests have\r\npassed. Feel free to tag\r\nmembers/contributors who may be interested in your PR.\r\n\r\n","shortMessageHtmlLink":"Purely refactors paged/attention into layers/attention and make har…"}},{"before":"b0c168d249c22c1e18f84a78cede0e4f307034e9","after":null,"ref":"refs/heads/rearchitecture_attention_code","pushedAt":"2024-05-31T15:57:02.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"}},{"before":"d44688b6ac83584f6a196fd1ce7705636c016269","after":"b0c168d249c22c1e18f84a78cede0e4f307034e9","ref":"refs/heads/rearchitecture_attention_code","pushedAt":"2024-05-31T15:56:09.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Update server/text_generation_server/layers/attention/xpu.py","shortMessageHtmlLink":"Update server/text_generation_server/layers/attention/xpu.py"}},{"before":null,"after":"5b58262fea014e21b89133c04d6a749570a50f70","ref":"refs/heads/fix_exl2_scratch","pushedAt":"2024-05-31T15:18:55.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Fixing exl2 scratch buffer.","shortMessageHtmlLink":"Fixing exl2 scratch buffer."}},{"before":"659bd67fec0a874e325fc2a2afd0c2ed2af692f0","after":"37f955dd14857be9f4f04807a05bdc6eacbdb081","ref":"refs/heads/minor-docs-fix","pushedAt":"2024-05-31T15:08:59.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"nbroad1881","name":"Nicholas Broad","path":"/nbroad1881","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/24982805?s=80&v=4"},"commit":{"message":"single char ` addition","shortMessageHtmlLink":"single char ` addition"}},{"before":null,"after":"659bd67fec0a874e325fc2a2afd0c2ed2af692f0","ref":"refs/heads/minor-docs-fix","pushedAt":"2024-05-31T15:07:53.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"nbroad1881","name":"Nicholas Broad","path":"/nbroad1881","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/24982805?s=80&v=4"},"commit":{"message":"Update documentation version to 2.0.4 (#1980)\n\nAs per title\r\n\r\ncc @Narsil","shortMessageHtmlLink":"Update documentation version to 2.0.4 (#1980)"}},{"before":"967ced2ff4565a5358d45a1372d32fbab113700b","after":"659bd67fec0a874e325fc2a2afd0c2ed2af692f0","ref":"refs/heads/main","pushedAt":"2024-05-31T14:03:24.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Update documentation version to 2.0.4 (#1980)\n\nAs per title\r\n\r\ncc @Narsil","shortMessageHtmlLink":"Update documentation version to 2.0.4 (#1980)"}},{"before":"99d4c9e2133bfc0d00860851da9fa0e7b66bb7c1","after":"6c5598eb92c10e6308af4710e00bb027f184ebb0","ref":"refs/heads/feature/chunked-input","pushedAt":"2024-05-31T13:04:33.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"danieldk","name":"Daniël de Kok","path":"/danieldk","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49398?s=80&v=4"},"commit":{"message":"router: send the input as chunks to the backend\n\nBefore this change, the generation input was sent to the backend as a\nsingle string, encoding images as Base64 and packing them in\nMarkdown-style links.\n\nThis change adds a new chunked input representation that separates text\nchunks from images chunks. Image chunks contain binary data (for smaller\nmessage sizes) and the image's MIME type.\n\nThe stringly-typed inputs are still sent to support backends that do not\nsupport chunked inputs yet.","shortMessageHtmlLink":"router: send the input as chunks to the backend"}},{"before":"c67539fbcc5c1336fcca8b12a6ff59cdeeac514b","after":"d44688b6ac83584f6a196fd1ce7705636c016269","ref":"refs/heads/rearchitecture_attention_code","pushedAt":"2024-05-31T12:43:33.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Adress comments + fix 2nd path in falcon.","shortMessageHtmlLink":"Adress comments + fix 2nd path in falcon."}},{"before":"9ff4f18130e0d0a77937614fe45d54e8212eeb07","after":"a9beab04b516c718c4874bf48dfa99389afc07f6","ref":"refs/heads/feat/move_cache_manager_rust","pushedAt":"2024-05-31T12:37:59.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"OlivierDehaene","name":null,"path":"/OlivierDehaene","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/23298448?s=80&v=4"},"commit":{"message":"rebased and added other profile","shortMessageHtmlLink":"rebased and added other profile"}},{"before":"544ddf5c290445408caeaf30ae4012e0a83a9e9d","after":"64bea2427fe8e466693ca8b82f37b4cdff28b403","ref":"refs/heads/feature/server-chunks","pushedAt":"2024-05-31T12:00:15.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"danieldk","name":"Daniël de Kok","path":"/danieldk","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49398?s=80&v=4"},"commit":{"message":"server: use chunked inputs\n\nThe router will now send the input as chunks besides as a single\nstring. This change modifies the server to process chunked input\nrather than strings. This also allows us to remove the image\nextraction code from the server.","shortMessageHtmlLink":"server: use chunked inputs"}},{"before":"91f55ea2b5edf5d83d9fdaa9ed85681f7831cb27","after":"c67539fbcc5c1336fcca8b12a6ff59cdeeac514b","ref":"refs/heads/rearchitecture_attention_code","pushedAt":"2024-05-31T10:51:35.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Update server/text_generation_server/utils/import_utils.py","shortMessageHtmlLink":"Update server/text_generation_server/utils/import_utils.py"}},{"before":null,"after":"91f55ea2b5edf5d83d9fdaa9ed85681f7831cb27","ref":"refs/heads/rearchitecture_attention_code","pushedAt":"2024-05-31T10:16:55.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"Narsil","name":"Nicolas Patry","path":"/Narsil","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/204321?s=80&v=4"},"commit":{"message":"Removing flash decoding part so it gets merged.","shortMessageHtmlLink":"Removing flash decoding part so it gets merged."}},{"before":"474d66bc651d50bcf31726a6a6512a1d171c9640","after":"544ddf5c290445408caeaf30ae4012e0a83a9e9d","ref":"refs/heads/feature/server-chunks","pushedAt":"2024-05-31T09:53:24.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"danieldk","name":"Daniël de Kok","path":"/danieldk","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49398?s=80&v=4"},"commit":{"message":"server: use chunked inputs\n\nThe router will now send the input as chunks besides as a single\nstring. This change modifies the server to process chunked input\nrather than strings. This also allows us to remove the image\nextraction code from the server.","shortMessageHtmlLink":"server: use chunked inputs"}},{"before":"863d2ea035ecda5dcf6e8b97d49a8ccbd592f9d4","after":"474d66bc651d50bcf31726a6a6512a1d171c9640","ref":"refs/heads/feature/server-chunks","pushedAt":"2024-05-31T08:56:52.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"danieldk","name":"Daniël de Kok","path":"/danieldk","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49398?s=80&v=4"},"commit":{"message":"server: use chunked inputs\n\nThe router will now send the input as chunks besides as a single\nstring. This change modifies the server to process chunked input\nrather than strings. This also allows us to remove the image\nextraction code from the server.","shortMessageHtmlLink":"server: use chunked inputs"}},{"before":"c60b2b017cf3f451325f9892661ed10961f52fa9","after":"863d2ea035ecda5dcf6e8b97d49a8ccbd592f9d4","ref":"refs/heads/feature/server-chunks","pushedAt":"2024-05-31T08:30:35.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"danieldk","name":"Daniël de Kok","path":"/danieldk","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49398?s=80&v=4"},"commit":{"message":"Support chunks in non-VLM models","shortMessageHtmlLink":"Support chunks in non-VLM models"}},{"before":null,"after":"b4e0e6f59aaa1022a8b9f94e13cae0ba73cb5284","ref":"refs/heads/lora-internal","pushedAt":"2024-05-30T19:16:24.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"drbh","name":"drbh","path":"/drbh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/9896130?s=80&v=4"},"commit":{"message":"feat: first draft load multiple lora","shortMessageHtmlLink":"feat: first draft load multiple lora"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"djE6ks8AAAAEWaOoMwA","startCursor":null,"endCursor":null}},"title":"Activity · huggingface/text-generation-inference"}