📥 feat: Import Conversations from LibreChat, ChatGPT, Chatbot UI #2355

Merged
merged 57 commits into from
May 2, 2024
57 commits
231e70a
Basic implementation of ChatGPT conversation import
DenisPalnitsky Apr 8, 2024
f3dceec
remove debug code
DenisPalnitsky Apr 8, 2024
5a971b4
Handle citations
DenisPalnitsky Apr 8, 2024
3b7b160
Fix updatedAt in import
DenisPalnitsky Apr 8, 2024
6df3a8b
update default model
DenisPalnitsky Apr 9, 2024
9ce357d
Use job scheduler to handle import requests
DenisPalnitsky Apr 9, 2024
02548a1
import job status endpoint
DenisPalnitsky Apr 9, 2024
68eacf6
Add wrapper around Agenda
DenisPalnitsky Apr 10, 2024
5e94867
Rate limits for import endpoint
DenisPalnitsky Apr 10, 2024
f067c90
rename import api path
DenisPalnitsky Apr 10, 2024
d9f7a30
Batch save import to mongo
DenisPalnitsky Apr 10, 2024
78c48d9
Improve naming
DenisPalnitsky Apr 10, 2024
ad9da4e
Add documenting comments
DenisPalnitsky Apr 11, 2024
304a419
Test for importers
DenisPalnitsky Apr 12, 2024
97b5ae1
Change button for importing conversations
jakubmieszczak Apr 11, 2024
45a2010
Frontend changes
jakubmieszczak Apr 11, 2024
2f3fc19
Import job status endpoint
DenisPalnitsky Apr 11, 2024
fd5e798
Import endpoint response
DenisPalnitsky Apr 11, 2024
7a34d41
Add translations to new phrases
jakubmieszczak Apr 12, 2024
5d30dcc
Fix conversations refreshing
jakubmieszczak Apr 12, 2024
ae26ccc
cleanup unused functions
DenisPalnitsky Apr 12, 2024
ff559c8
set timeout for import job status polling
DenisPalnitsky Apr 12, 2024
3f2eb58
Add documentation
DenisPalnitsky Apr 12, 2024
7fe0384
get extra spaces back
DenisPalnitsky Apr 12, 2024
e00a9be
Improve error message
DenisPalnitsky Apr 12, 2024
60239b6
Fix translation files after merge
DenisPalnitsky Apr 15, 2024
45279ae
fix translation files 2
DenisPalnitsky Apr 15, 2024
491967c
Add zh translation for import functionality
DenisPalnitsky Apr 15, 2024
fe2ed86
Sync mailisearch index after import
DenisPalnitsky Apr 15, 2024
7116daf
chore: add dummy uri for jest tests, as MONGO_URI should only be real…
danny-avila Apr 24, 2024
008c81c
Merge branch 'main' into feature/chats-import
DenisPalnitsky Apr 27, 2024
fe40a5f
Merge remote-tracking branch 'librechat/main' into feature/chats-import
danny-avila May 1, 2024
1fdb6cf
docs: fix links
danny-avila May 1, 2024
61ac76c
docs: fix conversationsImport section
danny-avila May 1, 2024
84518da
fix: user role issue for librechat imports
danny-avila May 1, 2024
1ce8e8e
refactor: import conversations from json
danny-avila May 1, 2024
5ddfa53
fix: undefined metadata edge case and replace ChatGtp -> ChatGpt
danny-avila May 1, 2024
9ff802b
Refactor importChatGptConvo function to handle undefined metadata edg…
danny-avila May 1, 2024
73edea9
fix: chatgpt importer
danny-avila May 1, 2024
a66a4d1
feat: maintain tree relationship for librechat messages
danny-avila May 1, 2024
3a2e33d
chore: use enum
danny-avila May 1, 2024
63e1e0d
refactor: saveMessage to use single object arg, replace console logs,…
danny-avila May 1, 2024
0fbe3d2
chore: additional comment
danny-avila May 1, 2024
327506c
chore: multer edge case
danny-avila May 1, 2024
cd9bd1e
feat: first pass, maintain tree relationship
danny-avila May 1, 2024
7951b55
chore: organize
danny-avila May 1, 2024
f9e161d
chore: remove log
danny-avila May 1, 2024
f406aaa
ci: add heirarchy test for chatgpt
danny-avila May 2, 2024
dfc8895
ci: test maintaining of heirarchy for librechat
danny-avila May 2, 2024
00c54e1
wip: allow non-text content type messages
danny-avila May 2, 2024
8be339a
refactor: import content part object json string
danny-avila May 2, 2024
ade6971
refactor: more content types to format
danny-avila May 2, 2024
8e7f458
chore: consolidate messageText formatting
danny-avila May 2, 2024
38c9db5
docs: update on changes, bump data-provider/config versions, update r…
danny-avila May 2, 2024
edef308
refactor(indexSync): singleton pattern for MeiliSearchClient
danny-avila May 2, 2024
137c7b3
refactor: debug log after batch is done
danny-avila May 2, 2024
2a486a6
chore: add back indexSync error handling
danny-avila May 2, 2024
1 change: 1 addition & 0 deletions .gitignore
@@ -76,6 +76,7 @@ config.local.ts
**/storageState.json
junit.xml
**/.venv/
**/venv/

# docker override file
docker-compose.override.yaml
2 changes: 2 additions & 0 deletions README.md
@@ -52,7 +52,9 @@
- Русский, 日本語, Svenska, 한국어, Tiếng Việt, 繁體中文, العربية, Türkçe, Nederlands, עברית
- 🤖 AI model selection: OpenAI, Azure OpenAI, BingAI, ChatGPT, Google Vertex AI, Anthropic (Claude), Plugins, Assistants API (including Azure Assistants)
- 💾 Create, Save, & Share Custom Presets
- 🎨 Customizable Dropdown & Interface: Adapts to both power users and newcomers.
- 🔄 Edit, Resubmit, and Continue messages with conversation branching
- 📥 Import Conversations from LibreChat, ChatGPT, Chatbot UI
- 📤 Export conversations as screenshots, markdown, text, json.
- 🔍 Search all messages/conversations
- 🔌 Plugins, including web access, image generation with DALL-E-3 and more
42 changes: 22 additions & 20 deletions api/lib/db/indexSync.js
@@ -1,32 +1,39 @@
const { MeiliSearch } = require('meilisearch');
const Message = require('~/models/schema/messageSchema');
const Conversation = require('~/models/schema/convoSchema');
const Message = require('~/models/schema/messageSchema');
const { logger } = require('~/config');

const searchEnabled = process.env?.SEARCH?.toLowerCase() === 'true';
let currentTimeout = null;

class MeiliSearchClient {
static instance = null;

static getInstance() {
if (!MeiliSearchClient.instance) {
if (!process.env.MEILI_HOST || !process.env.MEILI_MASTER_KEY) {
throw new Error('Meilisearch configuration is missing.');
}
MeiliSearchClient.instance = new MeiliSearch({
host: process.env.MEILI_HOST,
apiKey: process.env.MEILI_MASTER_KEY,
});
}
return MeiliSearchClient.instance;
}
}

// eslint-disable-next-line no-unused-vars
async function indexSync(req, res, next) {
if (!searchEnabled) {
return;
}

try {
if (!process.env.MEILI_HOST || !process.env.MEILI_MASTER_KEY || !searchEnabled) {
throw new Error('Meilisearch not configured, search will be disabled.');
}

const client = new MeiliSearch({
host: process.env.MEILI_HOST,
apiKey: process.env.MEILI_MASTER_KEY,
});
const client = MeiliSearchClient.getInstance();

const { status } = await client.health();
// logger.debug(`[indexSync] Meilisearch: ${status}`);
const result = status === 'available' && !!process.env.SEARCH;

if (!result) {
if (status !== 'available' || !process.env.SEARCH) {
throw new Error('Meilisearch not available');
}

@@ -37,12 +44,8 @@ async function indexSync(req, res, next) {
const messagesIndexed = messages.numberOfDocuments;
const convosIndexed = convos.numberOfDocuments;

logger.debug(
`[indexSync] There are ${messageCount} messages in the database, ${messagesIndexed} indexed`,
);
logger.debug(
`[indexSync] There are ${convoCount} convos in the database, ${convosIndexed} indexed`,
);
logger.debug(`[indexSync] There are ${messageCount} messages and ${messagesIndexed} indexed`);
logger.debug(`[indexSync] There are ${convoCount} convos and ${convosIndexed} indexed`);

if (messageCount !== messagesIndexed) {
logger.debug('[indexSync] Messages out of sync, indexing');
@@ -54,7 +57,6 @@ async function indexSync(req, res, next) {
Conversation.syncWithMeili();
}
} catch (err) {
// logger.debug('[indexSync] in index sync');
if (err.message.includes('not found')) {
logger.debug('[indexSync] Creating indices...');
currentTimeout = setTimeout(async () => {
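The `MeiliSearchClient` change above swaps a per-call `new MeiliSearch(...)` for a lazily created shared instance. A minimal sketch of that pattern follows, assuming a hypothetical `FakeSearchClient` in place of the real client and an injectable `env` argument (neither is part of the PR) so the behavior can be checked without a running Meilisearch server:

```javascript
// Hypothetical stand-in for a real search client such as MeiliSearch.
class FakeSearchClient {
  constructor({ host, apiKey }) {
    this.host = host;
    this.apiKey = apiKey;
  }
}

class SearchClientSingleton {
  static instance = null;

  // `env` defaults to process.env; passing it in is an assumption made
  // here for testability and is not how the PR's class is written.
  static getInstance(env = process.env) {
    if (!SearchClientSingleton.instance) {
      if (!env.MEILI_HOST || !env.MEILI_MASTER_KEY) {
        throw new Error('Meilisearch configuration is missing.');
      }
      // Created once; later calls reuse the same object.
      SearchClientSingleton.instance = new FakeSearchClient({
        host: env.MEILI_HOST,
        apiKey: env.MEILI_MASTER_KEY,
      });
    }
    return SearchClientSingleton.instance;
  }
}
```

One consequence of this design is that configuration is only read on the first successful call, so later changes to the environment are not picked up by the cached client.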
18 changes: 18 additions & 0 deletions api/models/Conversation.js
@@ -30,6 +30,24 @@ module.exports = {
return { message: 'Error saving conversation' };
}
},
bulkSaveConvos: async (conversations) => {
try {
const bulkOps = conversations.map((convo) => ({
updateOne: {
filter: { conversationId: convo.conversationId, user: convo.user },
update: convo,
upsert: true,
timestamps: false,
},
}));

const result = await Conversation.bulkWrite(bulkOps);
return result;
} catch (error) {
logger.error('[bulkSaveConvos] Error saving conversations in bulk', error);
throw new Error('Failed to save conversations in bulk.');
}
},
getConvosByPage: async (user, pageNumber = 1, pageSize = 25) => {
try {
const totalConvos = (await Conversation.countDocuments({ user })) || 1;
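The `bulkSaveConvos` helper above builds one upsert per conversation before handing the array to `bulkWrite`. The mapping step can be sketched on its own as a pure function; `buildConvoUpsertOps` is a hypothetical name used only for illustration, and the comment on `timestamps` is an assumption about intent (imported dates should probably survive the write), not something the diff states:

```javascript
// Build one upsert operation per conversation, in the shape that
// Mongoose's Model.bulkWrite() accepts. Field names mirror the diff.
function buildConvoUpsertOps(conversations) {
  return conversations.map((convo) => ({
    updateOne: {
      // Match on conversation and owner so re-importing updates in place.
      filter: { conversationId: convo.conversationId, user: convo.user },
      update: convo,
      upsert: true,
      // Assumption: disabled so imported createdAt/updatedAt are kept.
      timestamps: false,
    },
  }));
}
```

Usage would look like `await Conversation.bulkWrite(buildConvoUpsertOps(conversations))`, where `Conversation` is the Mongoose model from the surrounding file.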
19 changes: 19 additions & 0 deletions api/models/Message.js
@@ -74,6 +74,25 @@ module.exports = {
throw new Error('Failed to save message.');
}
},

async bulkSaveMessages(messages) {
try {
const bulkOps = messages.map((message) => ({
updateOne: {
filter: { messageId: message.messageId },
update: message,
upsert: true,
},
}));

const result = await Message.bulkWrite(bulkOps);
return result;
} catch (err) {
logger.error('Error saving messages in bulk:', err);
throw new Error('Failed to save messages in bulk.');
}
},

/**
* Records a message in the database.
*
69 changes: 69 additions & 0 deletions api/server/middleware/importLimiters.js
@@ -0,0 +1,69 @@
const rateLimit = require('express-rate-limit');
const { ViolationTypes } = require('librechat-data-provider');
const logViolation = require('~/cache/logViolation');

const getEnvironmentVariables = () => {
const IMPORT_IP_MAX = parseInt(process.env.IMPORT_IP_MAX, 10) || 100;
const IMPORT_IP_WINDOW = parseInt(process.env.IMPORT_IP_WINDOW, 10) || 15;
const IMPORT_USER_MAX = parseInt(process.env.IMPORT_USER_MAX, 10) || 50;
const IMPORT_USER_WINDOW = parseInt(process.env.IMPORT_USER_WINDOW, 10) || 15;

const importIpWindowMs = IMPORT_IP_WINDOW * 60 * 1000;
const importIpMax = IMPORT_IP_MAX;
const importIpWindowInMinutes = importIpWindowMs / 60000;

const importUserWindowMs = IMPORT_USER_WINDOW * 60 * 1000;
const importUserMax = IMPORT_USER_MAX;
const importUserWindowInMinutes = importUserWindowMs / 60000;

return {
importIpWindowMs,
importIpMax,
importIpWindowInMinutes,
importUserWindowMs,
importUserMax,
importUserWindowInMinutes,
};
};

const createImportHandler = (ip = true) => {
const { importIpMax, importIpWindowInMinutes, importUserMax, importUserWindowInMinutes } =
getEnvironmentVariables();

return async (req, res) => {
const type = ViolationTypes.FILE_UPLOAD_LIMIT;
const errorMessage = {
type,
max: ip ? importIpMax : importUserMax,
limiter: ip ? 'ip' : 'user',
windowInMinutes: ip ? importIpWindowInMinutes : importUserWindowInMinutes,
};

await logViolation(req, res, type, errorMessage);
res.status(429).json({ message: 'Too many conversation import requests. Try again later' });
};
};

const createImportLimiters = () => {
const { importIpWindowMs, importIpMax, importUserWindowMs, importUserMax } =
getEnvironmentVariables();

const importIpLimiter = rateLimit({
windowMs: importIpWindowMs,
max: importIpMax,
handler: createImportHandler(),
});

const importUserLimiter = rateLimit({
windowMs: importUserWindowMs,
max: importUserMax,
handler: createImportHandler(false),
keyGenerator: function (req) {
return req.user?.id; // Use the user ID, or undefined if no user is attached to the request
},
});

return { importIpLimiter, importUserLimiter };
};

module.exports = { createImportLimiters };
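The limiter settings above come from environment variables with numeric fallbacks, and the window values are given in minutes but passed to the limiter in milliseconds. A minimal sketch of that conversion, assuming a hypothetical `readImportLimits` helper that takes the environment as an argument (the PR reads `process.env` directly):

```javascript
// Read import rate-limit settings, falling back to the defaults used in
// the diff (100 per IP, 50 per user, 15-minute windows).
function readImportLimits(env = process.env) {
  // parseInt returns NaN for missing or non-numeric input; `||` then
  // selects the fallback. Note that a literal 0 also falls back.
  const toInt = (value, fallback) => parseInt(value, 10) || fallback;

  const ipWindowMinutes = toInt(env.IMPORT_IP_WINDOW, 15);
  const userWindowMinutes = toInt(env.IMPORT_USER_WINDOW, 15);

  return {
    importIpMax: toInt(env.IMPORT_IP_MAX, 100),
    importIpWindowMs: ipWindowMinutes * 60 * 1000,
    importUserMax: toInt(env.IMPORT_USER_MAX, 50),
    importUserWindowMs: userWindowMinutes * 60 * 1000,
  };
}
```

With this shape, a non-numeric value such as `'initialMax'` does not raise an error; it silently resolves to the default.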
2 changes: 2 additions & 0 deletions api/server/middleware/index.js
@@ -18,6 +18,7 @@ const validateRegistration = require('./validateRegistration');
const validateImageRequest = require('./validateImageRequest');
const moderateText = require('./moderateText');
const noIndex = require('./noIndex');
const importLimiters = require('./importLimiters');

module.exports = {
...uploadLimiters,
@@ -39,5 +40,6 @@ module.exports = {
validateModel,
moderateText,
noIndex,
...importLimiters,
checkDomainAllowed,
};
52 changes: 52 additions & 0 deletions api/server/routes/convos.js
@@ -1,8 +1,13 @@
const multer = require('multer');
const express = require('express');
const { CacheKeys } = require('librechat-data-provider');
const { initializeClient } = require('~/server/services/Endpoints/assistants');
const { getConvosByPage, deleteConvos, getConvo, saveConvo } = require('~/models/Conversation');
const { IMPORT_CONVERSATION_JOB_NAME } = require('~/server/utils/import/jobDefinition');
const { storage, importFileFilter } = require('~/server/routes/files/multer');
const requireJwtAuth = require('~/server/middleware/requireJwtAuth');
const { createImportLimiters } = require('~/server/middleware');
const jobScheduler = require('~/server/utils/jobScheduler');
const getLogStores = require('~/cache/getLogStores');
const { sleep } = require('~/server/utils');
const { logger } = require('~/config');
@@ -99,4 +104,51 @@ router.post('/update', async (req, res) => {
}
});

const { importIpLimiter, importUserLimiter } = createImportLimiters();
const upload = multer({ storage: storage, fileFilter: importFileFilter });

/**
* Imports a conversation from a JSON file and saves it to the database.
* @route POST /import
* @param {Express.Multer.File} req.file - The JSON file to import.
* @returns {object} 201 - success response - application/json
*/
router.post(
'/import',
importIpLimiter,
importUserLimiter,
upload.single('file'),
async (req, res) => {
try {
const filepath = req.file.path;
const job = await jobScheduler.now(IMPORT_CONVERSATION_JOB_NAME, filepath, req.user.id);

res.status(201).json({ message: 'Import started', jobId: job.id });
} catch (error) {
logger.error('Error processing file', error);
res.status(500).send('Error processing file');
}
},
);

// Get the status of an import job for polling
router.get('/import/jobs/:jobId', async (req, res) => {
try {
const { jobId } = req.params;
const job = await jobScheduler.getJobStatus(jobId);
// Check for a missing job before destructuring: a rest object is always truthy.
if (!job) {
return res.status(404).json({ message: 'Job not found.' });
}

const { userId, ...jobStatus } = job;

if (userId !== req.user.id) {
return res.status(403).json({ message: 'Unauthorized' });
}

res.json(jobStatus);
} catch (error) {
logger.error('Error getting job details', error);
res.status(500).send('Error getting job details');
}
});

module.exports = router;
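The job status route above makes three decisions: the job may not exist, it may belong to another user, or it can be returned. A minimal sketch of that decision order as a pure function follows; `resolveJobStatusResponse` and the sample job fields are hypothetical, and the sketch assumes the scheduler yields `null` or `undefined` for an unknown job:

```javascript
// Decide the HTTP status and body for a job status request.
// `job` is whatever the scheduler returned; `requesterId` is the caller.
function resolveJobStatusResponse(job, requesterId) {
  // The existence check must come before destructuring, because a rest
  // object such as `jobStatus` is truthy even when it is empty.
  if (!job) {
    return { status: 404, body: { message: 'Job not found.' } };
  }

  const { userId, ...jobStatus } = job;
  if (userId !== requesterId) {
    return { status: 403, body: { message: 'Unauthorized' } };
  }

  // The owner's id is stripped from the payload sent to the client.
  return { status: 200, body: jobStatus };
}
```

Keeping the checks in a pure function like this makes the ordering easy to test without an Express app or a job store.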
2 changes: 1 addition & 1 deletion api/server/routes/files/index.js
@@ -1,6 +1,6 @@
const express = require('express');
const createMulterInstance = require('./multer');
const { uaParser, checkBan, requireJwtAuth, createFileLimiters } = require('~/server/middleware');
const { createMulterInstance } = require('./multer');

const files = require('./files');
const images = require('./images');
12 changes: 11 additions & 1 deletion api/server/routes/files/multer.js
@@ -20,6 +20,16 @@ const storage = multer.diskStorage({
},
});

const importFileFilter = (req, file, cb) => {
if (file.mimetype === 'application/json') {
cb(null, true);
} else if (path.extname(file.originalname).toLowerCase() === '.json') {
cb(null, true);
} else {
cb(new Error('Only JSON files are allowed'), false);
}
};

const fileFilter = (req, file, cb) => {
if (!file) {
return cb(new Error('No file provided'), false);
@@ -42,4 +52,4 @@ const createMulterInstance = async () => {
});
};

module.exports = createMulterInstance;
module.exports = { createMulterInstance, storage, importFileFilter };
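The `importFileFilter` added above accepts an upload when either the MIME type or the file extension indicates JSON. A self-contained sketch of the same check, written with the two accepting branches merged, is shown here under the hypothetical name `jsonOnlyFilter`; it follows multer's `fileFilter(req, file, cb)` callback convention:

```javascript
const path = require('path');

// Accept a file if its MIME type or its extension says JSON.
function jsonOnlyFilter(req, file, cb) {
  const isJsonMime = file.mimetype === 'application/json';
  // Some browsers report a generic MIME type for .json files, so the
  // extension is checked as a second signal (case-insensitive).
  const isJsonExt = path.extname(file.originalname).toLowerCase() === '.json';

  if (isJsonMime || isJsonExt) {
    cb(null, true);
  } else {
    cb(new Error('Only JSON files are allowed'), false);
  }
}
```

Neither signal proves the content is valid JSON, so the import job still has to handle parse failures.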
63 changes: 63 additions & 0 deletions api/server/services/AppService.spec.js
@@ -347,6 +347,69 @@ describe('AppService', () => {
expect(process.env.FILE_UPLOAD_USER_MAX).toEqual('initialUserMax');
expect(process.env.FILE_UPLOAD_USER_WINDOW).toEqual('initialUserWindow');
});

it('should not modify IMPORT environment variables without rate limits', async () => {
// Setup initial environment variables
process.env.IMPORT_IP_MAX = '10';
process.env.IMPORT_IP_WINDOW = '15';
process.env.IMPORT_USER_MAX = '5';
process.env.IMPORT_USER_WINDOW = '20';

const initialEnv = { ...process.env };

await AppService(app);

// Expect environment variables to remain unchanged
expect(process.env.IMPORT_IP_MAX).toEqual(initialEnv.IMPORT_IP_MAX);
expect(process.env.IMPORT_IP_WINDOW).toEqual(initialEnv.IMPORT_IP_WINDOW);
expect(process.env.IMPORT_USER_MAX).toEqual(initialEnv.IMPORT_USER_MAX);
expect(process.env.IMPORT_USER_WINDOW).toEqual(initialEnv.IMPORT_USER_WINDOW);
});

it('should correctly set IMPORT environment variables based on rate limits', async () => {
// Define and mock a custom configuration with rate limits
const importLimitsConfig = {
rateLimits: {
conversationsImport: {
ipMax: '150',
ipWindowInMinutes: '60',
userMax: '50',
userWindowInMinutes: '30',
},
},
};

require('./Config/loadCustomConfig').mockImplementationOnce(() =>
Promise.resolve(importLimitsConfig),
);

await AppService(app);

// Verify that process.env has been updated according to the rate limits config
expect(process.env.IMPORT_IP_MAX).toEqual('150');
expect(process.env.IMPORT_IP_WINDOW).toEqual('60');
expect(process.env.IMPORT_USER_MAX).toEqual('50');
expect(process.env.IMPORT_USER_WINDOW).toEqual('30');
});

it('should fall back to existing IMPORT environment variables when rate limits are unspecified', async () => {
// Setup initial environment variables to non-default values
process.env.IMPORT_IP_MAX = 'initialMax';
process.env.IMPORT_IP_WINDOW = 'initialWindow';
process.env.IMPORT_USER_MAX = 'initialUserMax';
process.env.IMPORT_USER_WINDOW = 'initialUserWindow';

// Mock a custom configuration without specific rate limits
require('./Config/loadCustomConfig').mockImplementationOnce(() => Promise.resolve({}));

await AppService(app);

// Verify that process.env falls back to the initial values
expect(process.env.IMPORT_IP_MAX).toEqual('initialMax');
expect(process.env.IMPORT_IP_WINDOW).toEqual('initialWindow');
expect(process.env.IMPORT_USER_MAX).toEqual('initialUserMax');
expect(process.env.IMPORT_USER_WINDOW).toEqual('initialUserWindow');
});
});

describe('AppService updating app.locals and issuing warnings', () => {