🗣️ feat: STT & TTS #1603

Closed
wants to merge 191 commits into from
Changes from 101 commits
191 commits
1af6751
Update TextChat.jsx
bsu3338 Aug 4, 2023
b3636ab
Update SubmitButton.jsx
bsu3338 Aug 4, 2023
4401d0d
Update TextChat.jsx
bsu3338 Aug 4, 2023
07b2af1
Merge branch 'danny-avila:main' into Speech-to-Text
bsu3338 Aug 5, 2023
5a67874
Update SubmitButton.jsx
bsu3338 Aug 5, 2023
14f4d66
Create ListeningIcon.tsx
bsu3338 Aug 5, 2023
65a7b2b
Update index.ts
bsu3338 Aug 5, 2023
31441ed
Update SubmitButton.jsx
bsu3338 Aug 5, 2023
74fa8d1
Update TextChat.jsx
bsu3338 Aug 5, 2023
37c0f5b
Update ListeningIcon.tsx
bsu3338 Aug 5, 2023
46c53d1
Update ListeningIcon.tsx
bsu3338 Aug 5, 2023
2ffb5be
Create SpeechRecognition.tsx
bsu3338 Aug 5, 2023
49a9dae
Update TextChat.jsx
bsu3338 Aug 5, 2023
eb842c6
Update TextChat.jsx
bsu3338 Aug 5, 2023
8982ec1
Update SpeechRecognition.tsx
bsu3338 Aug 5, 2023
ca3f064
Update TextChat.jsx
bsu3338 Aug 5, 2023
2522d76
Update SpeechRecognition.tsx
bsu3338 Aug 5, 2023
d9a4d2f
Update SpeechRecognition.tsx
bsu3338 Aug 5, 2023
42aadd2
Update SpeechRecognition.tsx
bsu3338 Aug 5, 2023
5ad9927
Merge branch 'danny-avila:main' into Speech-to-Text
bsu3338 Aug 5, 2023
5d76082
Update SpeechRecognition.tsx
bsu3338 Aug 5, 2023
93ceae6
Merge branch 'danny-avila:main' into Speech-to-Text
bsu3338 Aug 5, 2023
b49024f
Update SubmitButton.jsx
bsu3338 Aug 6, 2023
28a00a5
Update TextChat.jsx
bsu3338 Aug 6, 2023
69ff48d
Update SpeechRecognition.tsx
bsu3338 Aug 6, 2023
cfe6325
Merge branch 'main' into Speech-to-Text
bsu3338 Aug 6, 2023
fd23679
Merge branch 'main' into Speech-to-Text
bsu3338 Aug 6, 2023
148a71b
Merge branch 'main' into Speech-to-Text
bsu3338 Aug 7, 2023
252325d
Merge branch 'main' into Speech-to-Text
bsu3338 Aug 9, 2023
f9ed2ad
Create SpeechSynthesis.tsx
bsu3338 Aug 9, 2023
09c68d1
Update index.jsx
bsu3338 Aug 9, 2023
e7d7d73
Update SpeechSynthesis.tsx
bsu3338 Aug 9, 2023
e313637
Update SpeechRecognition.tsx
bsu3338 Aug 9, 2023
78278b5
Merge branch 'main' into Speech-to-Text
bsu3338 Aug 9, 2023
776daa1
Update TextChat.jsx
bsu3338 Aug 9, 2023
c02d43b
Update SpeechRecognition.tsx
bsu3338 Aug 9, 2023
c7ffb25
Update SpeechRecognition.tsx
bsu3338 Aug 9, 2023
10e3be5
Update SpeechRecognition.tsx
bsu3338 Aug 9, 2023
78a8106
Update TextChat.jsx
bsu3338 Aug 11, 2023
7e8bae2
Merge branch 'danny-avila:main' into Speech-to-Text
bsu3338 Aug 11, 2023
ed4b25b
Squashed commit of the following:
bsu3338 Sep 3, 2023
ad3c78f
Merge branch 'danny-avila:main' into Speech-September
bsu3338 Sep 3, 2023
863af2c
Create VolumeMuteIcon.tsx
bsu3338 Sep 3, 2023
b03001d
Create VolumeIcon.tsx
bsu3338 Sep 3, 2023
29a5b55
Update index.ts
bsu3338 Sep 3, 2023
8d5114b
Update SubmitButton.jsx
bsu3338 Sep 3, 2023
6583877
Update SubmitButton.jsx
bsu3338 Sep 3, 2023
9a3e67f
Update TextChat.jsx
bsu3338 Sep 3, 2023
6033eb3
Update TextChat.jsx
bsu3338 Sep 4, 2023
d405454
Update SpeechRecognition.tsx
bsu3338 Sep 4, 2023
7f101bd
Update SpeechRecognition.tsx
bsu3338 Sep 4, 2023
9a27e56
Update TextChat.jsx
bsu3338 Sep 4, 2023
5542f8e
Update SpeechRecognition.tsx
bsu3338 Sep 4, 2023
c7eea96
Update TextChat.jsx
bsu3338 Sep 4, 2023
3e36c16
Update HoverButtons.tsx
bsu3338 Sep 4, 2023
c041c32
Update useServerStream.ts
bsu3338 Sep 4, 2023
4b30c13
Update useServerStream.ts
bsu3338 Sep 4, 2023
8ed04e4
Update HoverButtons.tsx
bsu3338 Sep 4, 2023
875ce4b
Update useServerStream.ts
bsu3338 Sep 4, 2023
609d1df
Update useServerStream.ts
bsu3338 Sep 4, 2023
4b4afcd
Update HoverButtons.tsx
bsu3338 Sep 4, 2023
6133531
Update VolumeIcon.tsx
bsu3338 Sep 4, 2023
37c828d
Update VolumeMuteIcon.tsx
bsu3338 Sep 4, 2023
95cf300
Update HoverButtons.tsx
bsu3338 Sep 4, 2023
e9882de
Update SpeechSynthesis.tsx
bsu3338 Sep 4, 2023
7ae0e7e
Update HoverButtons.tsx
bsu3338 Sep 4, 2023
c794f07
Update HoverButtons.tsx
bsu3338 Sep 4, 2023
d95fa19
Update SpeechSynthesis.tsx
bsu3338 Sep 4, 2023
c5ce576
Update SpeechSynthesis.tsx
bsu3338 Sep 4, 2023
4c6d067
Update HoverButtons.tsx
bsu3338 Sep 4, 2023
39e84ef
Update SpeechSynthesis.tsx
bsu3338 Sep 4, 2023
5b80ddf
Update package.json
bsu3338 Sep 4, 2023
6686126
Update SpeechRecognition.tsx
bsu3338 Sep 4, 2023
0b35dbe
Update SpeechRecognition.tsx
bsu3338 Sep 4, 2023
67f111c
Update SpeechRecognition.tsx
bsu3338 Sep 4, 2023
1019529
Update SpeechRecognition.tsx
bsu3338 Sep 4, 2023
c1087ac
Squashed commit of the following:
bsu3338 Sep 4, 2023
f3b7b3e
Merge branch 'Speech-September' into Speech-to-Text
bsu3338 Sep 4, 2023
3217b40
Update package-lock.json
bsu3338 Sep 4, 2023
86bffc8
Merge remote-tracking branch 'upstream/main' into Speech-to-Text
bsu3338 Sep 22, 2023
ae1ba09
Merge branch 'danny-avila:main' into Speech-to-Text
bsu3338 Sep 22, 2023
17bf603
Update SubmitButton.tsx
bsu3338 Sep 23, 2023
04720a0
Update SpeechRecognition.tsx
bsu3338 Sep 23, 2023
c1a38ad
fix: typescript error
berry-13 Nov 3, 2023
4679ba2
Merge branch 'main' into Speech-to-Text
berry-13 Jan 14, 2024
3f0de68
style: moved to new UI
berry-13 Jan 15, 2024
3767123
fix:(SpeechRecognition) lint error
berry-13 Jan 19, 2024
e5bf3af
Merge branch 'main' into Speech-to-Text
berry-13 Jan 19, 2024
7c0af5e
moved everything to hooks
berry-13 Jan 19, 2024
15004ae
feat: support stt external
berry-13 Jan 20, 2024
af5f6a8
Merge branch 'main' into Speech-to-Text
berry-13 Jan 20, 2024
c62053a
fix(useExternalSpeechRecognition): recording the audio
berry-13 Jan 20, 2024
2995686
feat: whisper api support
berry-13 Jan 23, 2024
6e928e2
refactor(SpeechReecognition); fix(HoverButtons): set isSpeakling corr…
berry-13 Jan 23, 2024
a5c3461
fix: spelling errors
berry-13 Jan 25, 2024
c3bec3a
fix: renamed files
berry-13 Jan 25, 2024
4164159
BIG FIX
berry-13 Jan 28, 2024
146b5a8
feat: whisper support
berry-13 Jan 28, 2024
10b0622
fixed some ChatForm bugs and added the tts route
berry-13 Jan 28, 2024
7a4e854
handling more errors
berry-13 Jan 28, 2024
88b7b37
Fix audio stream initialization and cleanup in useSpeechToTextExternal
berry-13 Jan 28, 2024
7b69cf3
feat: Elevenlabs TTS
berry-13 Feb 1, 2024
8931c53
fixed some req issues
berry-13 Feb 4, 2024
3a2fdf8
Merge branch 'main' into Speech-to-Text
berry-13 Feb 7, 2024
3f5bb8c
fix: stt not activating on Mac
berry-13 Feb 7, 2024
3d0d942
Merge branch 'main' into Speech-to-Text
berry-13 Feb 12, 2024
27af0df
Merge branch 'main' into Speech-to-Text
berry-13 Feb 14, 2024
3ee6992
fix: send audio blob to frontend
berry-13 Feb 22, 2024
b8de3cf
fix(ChatForm): startupConfig var
berry-13 Mar 1, 2024
aa29a85
Update text-to-speech and speech-to-text services
berry-13 Mar 1, 2024
853e9ea
handle more errors correctly
berry-13 Mar 2, 2024
be4da8a
Remove console.log statements
berry-13 Mar 2, 2024
0ef4cdf
Merge branch 'main' into Speech-to-Text
berry-13 Mar 7, 2024
d815b69
feat: added manual trigger with button
berry-13 Mar 9, 2024
374cad9
fix: SpeechToText and SpeechToTextExernal + AudioRecorder
berry-13 Mar 9, 2024
881b90d
refactor: TTS component
berry-13 Mar 10, 2024
5651f90
chore: removed unused variable
berry-13 Mar 10, 2024
ac6acce
feat: azure stt
berry-13 Mar 15, 2024
e736e21
feat: dedicated speech panel
berry-13 Mar 16, 2024
dd3a886
feat: STT button switch: fix: TextArea pr value adapted
berry-13 Mar 16, 2024
fec7e1f
Merge branch 'main' into Speech-to-Text
berry-13 Mar 16, 2024
5502b7d
refactor: textToSpeech function and useTextToSpeechMutation
berry-13 Mar 16, 2024
774cfc3
Merge branch 'main' into Speech-to-Text
berry-13 Mar 22, 2024
e95be15
fix: typo data-service
berry-13 Mar 22, 2024
9116fd5
fix: blob backend to frontend
berry-13 Mar 22, 2024
664b7de
feat: TTS button for external
berry-13 Mar 22, 2024
e7e38df
feat: librechat.yaml
berry-13 Mar 23, 2024
1c37ebe
style: spinner when loading TTS
berry-13 Mar 23, 2024
6fca8d4
feat: hold click to download file
berry-13 Mar 24, 2024
c747867
style: disabled when apiKey not provided
berry-13 Mar 24, 2024
eccf7bf
fix: typo startupConfig?.speechToTextExternal
berry-13 Mar 24, 2024
b6c2857
style: update icons
berry-13 Mar 24, 2024
e56d860
fix(useTextToSpeech): set isSpeaking when audio finish
berry-13 Mar 24, 2024
8f01ba4
fix: small issues with local TTS
berry-13 Mar 24, 2024
bb713b2
style: update settings dark theme
berry-13 Mar 24, 2024
1e22721
Merge branch 'main' into Speech-to-Text
berry-13 Mar 25, 2024
6423b38
docs: STT & TTS
berry-13 Mar 25, 2024
85b3168
Merge branch 'main' into Speech-to-Text
berry-13 Mar 26, 2024
d47b7ed
WIP: chat audio automatic; docs(custom_config): update to new .yaml v…
berry-13 Mar 27, 2024
5a58a62
fix: send button disabled
berry-13 Mar 27, 2024
8e98620
fix: interval update
berry-13 Mar 28, 2024
4572ebf
localization
berry-13 Mar 28, 2024
4417864
removed unused test code
berry-13 Mar 28, 2024
b7985f8
revert interval update to 100
berry-13 Mar 28, 2024
c701758
feat: auto-send message
berry-13 Mar 28, 2024
c0c9477
Merge branch 'main' into Speech-to-Text
berry-13 Mar 29, 2024
8a90f93
fix: chat audio automatic, default false
berry-13 Mar 29, 2024
d970158
Merge branch 'Speech-to-Text' of https://github.com/danny-avila/libre…
berry-13 Mar 29, 2024
65249f4
refactor: moved all logic to hooks
berry-13 Apr 1, 2024
18a9cc7
Merge branch 'main' into Speech-to-Text
Apr 4, 2024
4eb6841
Merge branch 'main' into Speech-to-Text
berry-13 Apr 15, 2024
79a6901
Merge branch 'main' into Speech-to-Text
berry-13 Apr 23, 2024
c235b38
chore: renamed ChatAudio to conversationMode
berry-13 Apr 23, 2024
151be34
refactor: organized Speech panel
berry-13 Apr 23, 2024
03db6ef
feat: autoSendText switch
berry-13 Apr 23, 2024
26e0df1
feat: moved chataudio to conversationMode and improved error handling…
berry-13 Apr 23, 2024
bc8121d
refactor: Auto transcribe audio
berry-13 Apr 23, 2024
8807431
test: AutoSendTextSwitch, AutoTranscribeAudioSwitch and ConversationM…
berry-13 Apr 24, 2024
01abb65
fix: various speechTab fixes
berry-13 Apr 24, 2024
63fe703
refactor(useSpeechToTextBrowser):: handle more errors
berry-13 Apr 24, 2024
7f69f3f
feat: engine select
berry-13 Apr 25, 2024
78bda40
Merge branch 'main' into Speech-to-Text
berry-13 Apr 25, 2024
2acc9a9
feat: advanced mode
berry-13 Apr 27, 2024
de1dd10
chore: converted hooks to TS
berry-13 Apr 27, 2024
f595225
feat: cache TTS
berry-13 Apr 27, 2024
a38ba05
feat: delete cache; fix: cache issues
berry-13 Apr 27, 2024
619d336
refactor(useTextToSpeechExternal): removed unused import
berry-13 Apr 27, 2024
8d4bea9
feat: cache switch; refactor: moved to dir STT/TTS
berry-13 Apr 27, 2024
59861b9
tests: CacheTTS, TextToSpeech, SpeechToText
berry-13 Apr 27, 2024
3e40ad0
feat: custom elevenlabs compatibility
berry-13 Apr 27, 2024
7f48031
fix(useTextToSpeechExternal): cache switch not working
berry-13 Apr 27, 2024
0875fe5
Merge branch 'main' into Speech-to-Text
berry-13 May 2, 2024
e39d0eb
feat: animation for STT
berry-13 May 3, 2024
db4fc17
Merge branch 'main' into Speech-to-Text
berry-13 May 5, 2024
9f07c80
Merge branch 'main' into Speech-to-Text
berry-13 May 6, 2024
415a869
Merge branch 'main' into Speech-to-Text
berry-13 May 7, 2024
e06a13b
Merge branch 'main' into Speech-to-Text
berry-13 May 10, 2024
ca12731
fix: settings var not working
berry-13 May 8, 2024
f3b78cf
chore: remove unused var
berry-13 May 8, 2024
486740a
feat: voice dropdown; refactor: yaml changes
berry-13 May 11, 2024
d3f5878
fix(textToSpeech): remove undefined properties
berry-13 May 11, 2024
8647cc3
refactor: Remove console logs and unused variable
berry-13 May 11, 2024
cc35f77
Merge branch 'main' into Speech-to-Text
berry-13 May 13, 2024
b619b80
fix: TTS; feat: support coqui and piper
berry-13 May 13, 2024
6c1f7df
fix: some STT issues
berry-13 May 13, 2024
ece8f89
fix: stt test
berry-13 May 14, 2024
24ad1d9
fix: STT backend sending wrong data
berry-13 May 14, 2024
74a8ef5
BREAKING: switch to react-speech-recognition, add regenerator-runtime…
berry-13 May 16, 2024
80b6689
feat: websocket backend
berry-13 May 17, 2024
e27f59e
Merge branch 'main' into Speech-to-Text
berry-13 May 17, 2024
edc5c8e
foundations for websocket
berry-13 May 17, 2024
8 changes: 8 additions & 0 deletions .env.example
@@ -192,6 +192,14 @@ MEILI_NO_ANALYTICS=true
MEILI_HOST=http://0.0.0.0:7700
MEILI_MASTER_KEY=DrhYf7zENyR6AlUCKmnz0eYASOQdl6zxH7s7MKFSfFCt


#==================================================#
# Speech to Text & Text to Speech #
#==================================================#

WHISPER_API_KEY=
ELEVENLABS_API_KEY=

#===================================================#
# User System #
#===================================================#
1 change: 1 addition & 0 deletions api/server/routes/config.js
@@ -32,6 +32,7 @@ router.get('/', async function (req, res) {
!!process.env.EMAIL_PASSWORD &&
!!process.env.EMAIL_FROM,
checkBalance: isEnabled(process.env.CHECK_BALANCE),
speechToTextExternal: !!process.env.WHISPER_API_KEY,
};

if (typeof process.env.CUSTOM_FOOTER === 'string') {
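Review note: the `speechToTextExternal` flag is a plain truthiness check on an environment variable, so an empty `WHISPER_API_KEY=` (as shipped in `.env.example`) leaves it off. A minimal sketch of that pattern follows. The helper name `buildSpeechFlags` is hypothetical, and the `textToSpeechExternal` flag for `ELEVENLABS_API_KEY` is an assumption for illustration; this hunk does not define it.

```javascript
// Hypothetical helper (not in the PR): derive boolean feature flags from env vars.
function buildSpeechFlags(env) {
  return {
    // Mirrors the PR: any non-empty WHISPER_API_KEY enables external STT.
    speechToTextExternal: !!env.WHISPER_API_KEY,
    // Assumption: an analogous flag for ElevenLabs; not defined in this hunk.
    textToSpeechExternal: !!env.ELEVENLABS_API_KEY,
  };
}

console.log(buildSpeechFlags({ WHISPER_API_KEY: 'sk-example', ELEVENLABS_API_KEY: '' }));
// prints { speechToTextExternal: true, textToSpeechExternal: false }
```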
4 changes: 4 additions & 0 deletions api/server/routes/files/index.js
@@ -12,12 +12,16 @@ const {
const files = require('./files');
const images = require('./images');
const avatar = require('./avatar');
const stt = require('./stt');
const tts = require('./tts');

router.use(requireJwtAuth);
router.use(checkBan);
router.use(uaParser);

router.use('/', files);
router.use('/stt', stt);
router.use('/tts', tts);
router.use('/images', images);
router.use('/images/avatar', avatar);

17 changes: 17 additions & 0 deletions api/server/routes/files/stt.js
@@ -0,0 +1,17 @@
const express = require('express');
const router = express.Router();
const { requireJwtAuth } = require('~/server/middleware/');
const multer = require('multer');
const { speechToTextLocal, speechToTextWhisper } = require('~/server/services/Files/Audio');

const upload = multer();

router.post('/', requireJwtAuth, upload.single('audio'), async (req, res) => {
if (process.env.WHISPER_LOCAL === 'true') {
await speechToTextLocal(req, res);
} else {
await speechToTextWhisper(req, res);
}
});

module.exports = router;
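Review note: a client reaches this route with a multipart upload whose file field is named `audio`, matching `upload.single('audio')`. A minimal sketch, assuming Node 18+ or a browser (global `FormData` and `Blob`). The helper name `buildSttForm` is hypothetical, and the `/api/files/stt` path in the usage comment is an assumption, since the mount point of the files router is not shown in this diff.

```javascript
// Hypothetical client helper: wrap recorded audio bytes in a multipart form.
function buildSttForm(audioBytes, mimeType = 'audio/wav') {
  const form = new FormData();
  const blob = new Blob([audioBytes], { type: mimeType });
  // The field name must be 'audio' to match upload.single('audio') on the server.
  form.append('audio', blob, 'audio.wav');
  return form;
}

// Usage (path is an assumption; adjust to wherever the files router is mounted):
// const res = await fetch('/api/files/stt', { method: 'POST', body: buildSttForm(bytes) });
// const { text } = await res.json();
```

Because the router is behind `requireJwtAuth`, a real request would also need the user's auth token.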
13 changes: 13 additions & 0 deletions api/server/routes/files/tts.js
@@ -0,0 +1,13 @@
const express = require('express');
const router = express.Router();
const { requireJwtAuth } = require('~/server/middleware/');
const { textToSpeechLocal } = require('~/server/services/Files/Audio');

router.post('/', requireJwtAuth, async (req, res) => {
console.log('Received FormData');

const audioBuffer = await textToSpeechLocal(req, res);
res.send(audioBuffer);
});

module.exports = router;
9 changes: 9 additions & 0 deletions api/server/services/Files/Audio/index.js
@@ -0,0 +1,9 @@
const speechToTextLocal = require('./speechToTextLocal');
const textToSpeechLocal = require('./textToSpeechLocal');
const speechToTextWhisper = require('./speechToTextWhisper');

module.exports = {
speechToTextLocal,
textToSpeechLocal,
speechToTextWhisper,
};
39 changes: 39 additions & 0 deletions api/server/services/Files/Audio/speechToTextLocal.js
@@ -0,0 +1,39 @@
const axios = require('axios');
const FormData = require('form-data');
const { Readable } = require('stream');

async function speechToTextLocal(req, res) {
if (!req.file || !req.file.buffer) {
console.error('No audio file provided in the FormData');
return res.status(400).json({ message: 'No audio file provided in the FormData' });
}

const audioBuffer = req.file.buffer;

// Create a readable stream from the audio buffer
const audioReadStream = Readable.from(audioBuffer);
// Set the filename for mimeType detection
audioReadStream.path = 'audio.wav';

const formData = new FormData();
formData.append('file', audioReadStream, { filename: 'audio.wav', contentType: 'audio/wav' });
formData.append('model', 'whisper');

try {
// Make the POST request using axios
const response = await axios.post('http://localhost:8080/v1/audio/transcriptions', formData, {
headers: formData.getHeaders(),
});

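if (response && response.status && response.data && response.data.text) {
const text = response.data.text.trim();
return res.json({ text });
}
// Always answer the request, even when no transcription came back, so it does not hang
return res.status(500).json({ message: 'No transcription text returned' });
} catch (error) {
console.error(error);
// error.response is undefined for network errors such as a refused connection
console.error('Server response:', error.response?.data);
res.status(500).json({ message: 'An error occurred while processing the audio' });
}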
}

module.exports = speechToTextLocal;
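Review note: axios sets `error.response` only when the server actually replied. On a network failure (for example a refused connection to the local Whisper server on port 8080) it is undefined, so error handling should not assume it exists. A minimal sketch of a helper that summarizes either case; `describeRequestError` is a hypothetical name, not part of the PR.

```javascript
// Hypothetical helper: summarize an axios-style error without assuming a response exists.
function describeRequestError(error) {
  if (error && error.response) {
    // The server replied with a non-2xx status.
    return { kind: 'http', status: error.response.status, data: error.response.data };
  }
  // No reply at all: DNS failure, refused connection, timeout, and so on.
  return {
    kind: 'network',
    code: error?.code ?? null,
    message: error?.message ?? String(error),
  };
}
```

Logging the returned object instead of `error.response.data` keeps the `catch` block itself from throwing.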
37 changes: 37 additions & 0 deletions api/server/services/Files/Audio/speechToTextWhisper.js
@@ -0,0 +1,37 @@
const axios = require('axios');
const FormData = require('form-data');

async function speechToTextWhisper(req, res) {
if (!req.file || !req.file.buffer) {
console.error('No audio file provided in the FormData');
return res.status(400).json({ message: 'No audio file provided in the FormData' });
}

const audioBuffer = req.file.buffer;

const formData = new FormData();
// The form-data package takes Buffers and streams, not Blobs; pass filename and MIME type explicitly
formData.append('file', audioBuffer, {
filename: req.file.originalname || 'audio.wav',
contentType: req.file.mimetype,
});
formData.append('model', 'whisper-1');

try {
// Make the POST request using axios
const response = await axios.post('https://api.openai.com/v1/audio/transcriptions', formData, {
headers: {
// getHeaders() supplies the multipart Content-Type including its boundary
...formData.getHeaders(),
Authorization: `Bearer ${process.env.WHISPER_API_KEY}`,
},
});

if (response && response.status && response.data && response.data.text) {
const text = response.data.text.trim();
return res.json({ text });
}
// Always answer the request, even when no transcription came back, so it does not hang
return res.status(500).json({ message: 'No transcription text returned' });
} catch (error) {
console.error(error);
// error.response is undefined for network errors
console.error('Server response:', error.response?.data);
res.status(500).json({ message: 'An error occurred while processing the audio' });
}
}

module.exports = speechToTextWhisper;
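Review note: when forwarding an upload, the multipart part generally needs a filename whose extension matches the audio format, since a browser recording may arrive as WebM or Ogg rather than WAV. A minimal sketch, assuming a hypothetical helper `filenameForMime`. The MIME-to-extension table is illustrative only; it is not taken from the PR or from the OpenAI documentation.

```javascript
// Illustrative mapping (assumption): common audio MIME types to file extensions.
const EXTENSION_BY_MIME = {
  'audio/wav': 'wav',
  'audio/x-wav': 'wav',
  'audio/webm': 'webm',
  'audio/ogg': 'ogg',
  'audio/mpeg': 'mp3',
  'audio/mp4': 'm4a',
};

// Hypothetical helper: pick an upload filename for a given MIME type.
function filenameForMime(mimeType) {
  // Strip parameters such as "; codecs=opus" before the lookup.
  const base = String(mimeType || '').split(';')[0].trim().toLowerCase();
  // Fall back to WAV, mirroring the 'audio.wav' name used for the local server.
  const ext = EXTENSION_BY_MIME[base] || 'wav';
  return `audio.${ext}`;
}
```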
15 changes: 15 additions & 0 deletions api/server/services/Files/Audio/textToSpeechLocal.js
@@ -0,0 +1,15 @@
// const { Buffer } = require('buffer');

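async function textToSpeechLocal(req, res) {
// Placeholder until local TTS is implemented
const response = 'Test response';

// The /tts route sends the returned value, so this function must not call res.send() itself;
// sending twice would throw ERR_HTTP_HEADERS_SENT.

// const mp3Buffer = Buffer.from(await response.arrayBuffer());

return response;
}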

module.exports = textToSpeechLocal;
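Review note: `textToSpeechLocal` is still a stub. Once it produces audio, the route would need to send binary data with an audio content type rather than a string. A minimal sketch, assuming MP3 output; `sendAudio` is a hypothetical helper and `res` is an Express-style response object.

```javascript
// Hypothetical helper: send an audio Buffer through an Express-style response.
function sendAudio(res, audioBuffer, contentType = 'audio/mpeg') {
  res.set('Content-Type', contentType);
  res.set('Content-Length', String(audioBuffer.length));
  return res.send(audioBuffer);
}
```

Keeping the single `send` call in one place avoids writing the response twice.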
1 change: 1 addition & 0 deletions client/package.json
@@ -52,6 +52,7 @@
"downloadjs": "^1.4.7",
"export-from-json": "^1.7.2",
"filenamify": "^6.0.0",
"hotkeys-js": "^3.12.0",
"html-to-image": "^1.11.11",
"image-blob-reduce": "^4.1.0",
"librechat-data-provider": "*",
3 changes: 2 additions & 1 deletion client/src/components/Auth/LoginForm.tsx
@@ -107,7 +107,8 @@ function LoginForm({ onSubmit }: TLoginFormProps) {
  aria-label="Sign in"
  data-testid="login-button"
  type="submit"
- className="w-full transform rounded-md bg-green-500 px-4 py-3 tracking-wide text-white transition-colors duration-200 hover:bg-green-600 focus:bg-green-600 focus:outline-none">
+ className="w-full transform rounded-md bg-green-500 px-4 py-3 tracking-wide text-white transition-colors duration-200 hover:bg-green-600 focus:bg-green-600 focus:outline-none"
+ >
  {localize('com_auth_continue')}
  </button>
  </div>
33 changes: 32 additions & 1 deletion client/src/components/Chat/Input/ChatForm.tsx
@@ -1,4 +1,5 @@
import { useRecoilState } from 'recoil';
import { useEffect } from 'react';
import type { ChangeEvent } from 'react';
import { useChatContext } from '~/Providers';
import { useRequiresKey } from '~/hooks';
@@ -8,6 +9,8 @@ import SendButton from './SendButton';
import Images from './Files/Images';
import Textarea from './Textarea';
import store from '~/store';
import { useSpeechToText, useSpeechToTextExternal } from '~/hooks';
import { useGetStartupConfig } from 'librechat-data-provider/react-query';

export default function ChatForm({ index = 0 }) {
const [text, setText] = useRecoilState(store.textByIndex(index));
@@ -32,6 +35,29 @@ export default function ChatForm({ index = 0 }) {
const { requiresKey } = useRequiresKey();
const { endpoint: _endpoint, endpointType } = conversation ?? { endpoint: null };
const endpoint = endpointType ?? _endpoint;
const { data: startupConfig } = useGetStartupConfig();
const useExternalSpeech = startupConfig?.speechToTextExternal;

const {
isListening: speechIsListening,
isLoading: speechIsLoading,
text: speechText,
} = useSpeechToText();

const {
isListening: externalIsListening,
isLoading: externalIsLoading,
text: externalSpeechText,
} = useSpeechToTextExternal();

const isListening = useExternalSpeech ? externalIsListening : speechIsListening;
const isLoading = useExternalSpeech ? externalIsLoading : speechIsLoading;
const speechTextForm = useExternalSpeech ? externalSpeechText : speechText;
const finalText = speechText || externalSpeechText ? speechTextForm : text;

useEffect(() => {
// Return nothing here: an effect's return value is reserved for a cleanup function
setText(finalText);
}, [finalText, setText]);

return (
<form
@@ -60,7 +86,12 @@ export default function ChatForm({ index = 0 }) {
  <StopButton stop={handleStopGenerating} setShowStopButton={setShowStopButton} />
  ) : (
  endpoint && (
- <SendButton text={text} disabled={filesLoading || isSubmitting || requiresKey} />
+ <SendButton
+ text={text}
+ disabled={filesLoading || isSubmitting || requiresKey}
+ isListening={isListening}
+ isLoading={isLoading}
+ />
  )
  )}
  </div>
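Review note: the text-selection logic in `ChatForm` can be read as a small pure function. Speech output, when present, replaces the typed text, and the engine is chosen by the `speechToTextExternal` startup flag. A minimal sketch, assuming a hypothetical function `resolveInputText`; the PR inlines this logic in the component.

```javascript
// Hypothetical pure version of the finalText expression in ChatForm.
function resolveInputText({ useExternalSpeech, speechText, externalSpeechText, typedText }) {
  const hasSpeech = Boolean(speechText || externalSpeechText);
  if (!hasSpeech) {
    // Neither engine has produced text: keep what the user typed.
    return typedText;
  }
  // Some engine produced text: use the output of the configured engine.
  return useExternalSpeech ? externalSpeechText : speechText;
}
```

One consequence of this expression: if only the engine that is not selected has produced text, the result is the selected engine's (possibly empty) text, not the typed text.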
49 changes: 36 additions & 13 deletions client/src/components/Chat/Input/SendButton.tsx
@@ -1,19 +1,42 @@
- import { SendIcon } from '~/components/svg';
+ import { SendIcon, ListeningIcon, Spinner } from '~/components/svg';
  import { cn } from '~/utils';

- export default function SendButton({ text, disabled }) {
+ export default function SendButton({ text, disabled, isListening, isLoading }) {
  return (
- <button
- disabled={!text || disabled}
- className={cn(
- 'absolute bottom-1.5 right-2 rounded-lg border border-black p-0.5 text-white transition-colors enabled:bg-black disabled:bg-black disabled:text-gray-400 disabled:opacity-10 dark:border-white dark:bg-white dark:hover:bg-gray-900 dark:disabled:bg-white dark:disabled:hover:bg-transparent md:bottom-3 md:right-3',
+ <>
+ {isListening ? (
+ <button
+ className="group absolute bottom-0 right-0 z-[101] flex h-[100%] w-[50px] items-center justify-center bg-transparent p-1 text-gray-500"
+ disabled={true}
+ >
+ <span className="" data-state="closed">
+ <ListeningIcon />
+ </span>
+ </button>
+ ) : isLoading ? (
+ <button
+ className="group absolute bottom-0 right-0 z-[101] flex h-[100%] w-[50px] items-center justify-center bg-transparent p-1 text-gray-500"
+ disabled={true}
+ >
+ <span className="" data-state="closed">
+ <Spinner className="icon-sm m-auto text-white" />
+ </span>
+ </button>
+ ) : (
+ <button
+ disabled={!text || disabled}
+ className={cn(
+ 'absolute rounded-lg rounded-md border border-black p-0.5 p-1 text-white transition-colors enabled:bg-black disabled:bg-black disabled:text-gray-400 disabled:opacity-10 dark:border-white dark:bg-white enabled:dark:bg-white dark:disabled:bg-white ',
+ 'bottom-1.5 right-1.5 md:bottom-2.5 md:right-3 md:p-[2px]',
+ )}
+ data-testid="send-button"
+ type="submit"
+ >
+ <span className="" data-state="closed">
+ <SendIcon size={24} />
+ </span>
+ </button>
  )}
- data-testid="send-button"
- type="submit"
- >
- <span className="" data-state="closed">
- <SendIcon size={24} />
- </span>
- </button>
+ </>
  );
  }
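Review note: the new `SendButton` renders one of three buttons, with listening taking precedence over loading, and loading over the normal send state. A minimal sketch of that precedence as a pure function; `sendButtonState` and the `icon` labels are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: which button variant SendButton shows, and whether it is disabled.
function sendButtonState({ isListening, isLoading, text, disabled }) {
  if (isListening) {
    return { icon: 'listening', disabled: true };
  }
  if (isLoading) {
    return { icon: 'spinner', disabled: true };
  }
  // Normal state: disabled when there is no text or the parent disables it.
  return { icon: 'send', disabled: !text || Boolean(disabled) };
}
```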
38 changes: 36 additions & 2 deletions client/src/components/Chat/Messages/HoverButtons.tsx
@@ -1,8 +1,18 @@
  import { useState } from 'react';
+ import { useRecoilState } from 'recoil';
  import type { TConversation, TMessage } from 'librechat-data-provider';
- import { Clipboard, CheckMark, EditIcon, RegenerateIcon, ContinueIcon } from '~/components/svg';
- import { useGenerationsByLatest, useLocalize } from '~/hooks';
+ import {
+ Clipboard,
+ CheckMark,
+ EditIcon,
+ RegenerateIcon,
+ ContinueIcon,
+ VolumeIcon,
+ VolumeMuteIcon,
+ } from '~/components/svg';
+ import { useGenerationsByLatest, useLocalize, useTextToSpeech } from '~/hooks';
  import { cn } from '~/utils';
+ import store from '~/store';

  type THoverButtons = {
  isEditing: boolean;
@@ -31,6 +41,10 @@ export default function HoverButtons({
const { endpoint: _endpoint, endpointType } = conversation ?? {};
const endpoint = endpointType ?? _endpoint;
const [isCopied, setIsCopied] = useState(false);
const [isSpeaking, setIsSpeaking] = useState(false);
const { synthesizeSpeech, cancelSpeech } = useTextToSpeech();
const [TextToSpeech] = useRecoilState<boolean>(store.TextToSpeech);

const { hideEditButton, regenerateEnabled, continueSupported } = useGenerationsByLatest({
isEditing,
isSubmitting,
@@ -51,8 +65,28 @@ export default function HoverButtons({
enterEdit();
};

const toggleSpeech = () => {
if (isSpeaking) {
cancelSpeech();
setIsSpeaking(false);
} else {
// Set the flag first so a completion callback that fires immediately is not overwritten
setIsSpeaking(true);
synthesizeSpeech(message?.text ?? '', () => setIsSpeaking(false));
}
};

return (
<div className="visible mt-0 flex justify-center gap-1 self-end text-gray-400 lg:justify-start">
{TextToSpeech && (
<button
className="hover-button rounded-md p-1 pl-0 text-gray-400 hover:text-gray-950 dark:text-gray-400/70 dark:hover:text-gray-200 disabled:dark:hover:text-gray-400 md:group-hover:visible md:group-[.final-completion]:visible"
onClick={toggleSpeech}
type="button"
title={isSpeaking ? localize('com_ui_stop_speaking') : localize('com_ui_speak')}
>
{isSpeaking ? <VolumeMuteIcon /> : <VolumeIcon />}
</button>
)}
<button
className={cn(
'hover-button rounded-md p-1 pl-0 text-gray-400 hover:text-gray-950 dark:text-gray-400/70 dark:hover:text-gray-200 disabled:dark:hover:text-gray-400 md:group-hover:visible md:group-[.final-completion]:visible',
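Review note: the `useTextToSpeech` hook itself is not shown in this part of the diff. A minimal sketch of what `synthesizeSpeech` and `cancelSpeech` might do on top of the browser Web Speech API, with the dependencies injected so the logic can run outside a browser. `createSpeechController` is a hypothetical name; in a browser you would pass `window.speechSynthesis` and `SpeechSynthesisUtterance`.

```javascript
// Hypothetical factory: wraps a speechSynthesis-like object and an utterance constructor.
function createSpeechController(synth, Utterance) {
  return {
    synthesizeSpeech(text, onEnd) {
      const utterance = new Utterance(text);
      // Notify the caller when playback finishes so it can reset its "speaking" state.
      utterance.onend = () => {
        if (onEnd) {
          onEnd();
        }
      };
      synth.speak(utterance);
    },
    cancelSpeech() {
      // Stops the current utterance and clears anything queued.
      synth.cancel();
    },
  };
}

// Browser usage (assumption):
// const tts = createSpeechController(window.speechSynthesis, SpeechSynthesisUtterance);
// tts.synthesizeSpeech('Hello', () => console.log('done'));
```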
2 changes: 1 addition & 1 deletion client/src/components/Input/Generations/Regenerate.tsx
@@ -9,7 +9,7 @@ export default function Regenerate({ onClick }: TGenButtonProps) {
  return (
  <Button onClick={onClick}>
  <RegenerateIcon className="h-3 w-3 flex-shrink-0 text-gray-600/90 dark:text-gray-400" />
- {localize('com_ui_regenerate')}
+ {localize('com_ui_regenerate')}
  </Button>
  );
  }
2 changes: 1 addition & 1 deletion client/src/components/Input/Generations/Stop.tsx
@@ -9,7 +9,7 @@ export default function Stop({ onClick }: TGenButtonProps) {
  return (
  <Button type="stop" onClick={onClick}>
  <StopGeneratingIcon className="text-gray-600/90 dark:text-gray-400 " />
- {localize('com_ui_stop')}
+ {localize('com_ui_stop')}
  </Button>
  );
  }