Add LangChain YoutubeLoader #154

Open
andrewnguonly opened this issue Apr 3, 2024 · 4 comments

Comments

@andrewnguonly
Owner

https://js.langchain.com/docs/integrations/document_loaders/web_loaders/youtube

@Draculabo
Contributor

I will finish it in a few days.

@Draculabo
Contributor

Draculabo commented Apr 10, 2024

js.langchain.com/docs/integrations/document_loaders/web_loaders/youtube

https://github.com/langchain-ai/langchainjs/blob/d6e25af137873493d30bdf5732d46b842e421ffa/langchain/src/document_loaders/web/youtube.ts
I encountered some issues while developing YoutubeLoader.

  1. YouTube recently changed the response fields of its API, which broke the original youtubei.js library. The details are at the links below, and I have applied the fix described there. However, some videos may still not return subtitles.
    Github
    GitHub
  2. When a Chrome extension sends a fetch request, the request automatically includes the current origin, which cannot be modified. This may result in our requests being intercepted. See the following links:
    javascript - Overridding XMLHttpRequest Prototype For Chrome Extension - Stack Overflow
    javascript - Chrome Extension: how to change origin in AJAX request header? - Stack Overflow
    Perhaps we could skip YoutubeLoader and use a search engine API instead, such as Serper API (Serper - The World's Fastest and Cheapest Google Search API).
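
On the caveat in item 1 that some videos may not return subtitles: a minimal sketch of how a caller could treat "no subtitles" as an ordinary outcome rather than a crash. The names `TranscriptDoc`, `TranscriptLoader`, and `loadTranscriptSafely` are hypothetical and not part of LangChain or this project; `TranscriptLoader` stands in for whatever wraps the actual loader.

```typescript
// Hypothetical document shape, loosely modeled on a loaded document
// with text content and metadata.
interface TranscriptDoc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical loader signature: any async function that resolves to
// documents for a video URL (for example, a thin wrapper around a loader).
type TranscriptLoader = (videoUrl: string) => Promise<TranscriptDoc[]>;

type TranscriptResult =
  | { ok: true; docs: TranscriptDoc[] }
  | { ok: false; reason: string };

async function loadTranscriptSafely(
  load: TranscriptLoader,
  videoUrl: string,
): Promise<TranscriptResult> {
  try {
    const docs = await load(videoUrl);
    // Treat an empty result or blank text as "no subtitles available".
    const nonEmpty = docs.filter((d) => d.pageContent.trim().length > 0);
    if (nonEmpty.length === 0) {
      return { ok: false, reason: "no subtitles returned" };
    }
    return { ok: true, docs: nonEmpty };
  } catch (err) {
    // Surface loader failures as a result value instead of throwing.
    const reason = err instanceof Error ? err.message : String(err);
    return { ok: false, reason };
  }
}
```

The caller can then branch on `result.ok` and show a message such as "no transcript available for this video" instead of failing the whole load.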

@andrewnguonly
Owner Author

Let's pause this feature for now. It looks like the issue from item 1 was also reported in LangChainJS's repo: langchain-ai/langchainjs#4994. Maybe we can push a fix to LangChainJS. It looks like several other people have implemented workarounds/solutions.

Regarding item 2, were you running the document loader from the background script or from the extension popup?

@Draculabo
Contributor

> Let's pause this feature for now. It looks like the issue from item 1 was also reported in LangChainJS's repo: langchain-ai/langchainjs#4994. Maybe we can push a fix to LangChainJS. It looks like several other people have implemented workarounds/solutions.
>
> Regarding item 2, were you running the document loader from the background script or from the extension popup?

I run the document loader from the extension popup.
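
For comparison, a minimal sketch of the other option from the question: running the load in the background script and having the popup request it through message passing. Whether this changes the origin behaviour described in item 2 is not established in this thread. The message shape, `isLoadTranscriptRequest`, `handleLoadTranscript`, and `fetchTranscript` are hypothetical names used only for illustration; the `chrome.runtime` wiring is shown in comments because it only runs inside an extension.

```typescript
// Hypothetical message shapes; not part of this project or of LangChain.
interface LoadTranscriptRequest {
  type: "load-youtube-transcript";
  videoUrl: string;
}

type LoadTranscriptResponse =
  | { ok: true; text: string }
  | { ok: false; error: string };

// Type guard so the background script ignores unrelated messages.
function isLoadTranscriptRequest(msg: unknown): msg is LoadTranscriptRequest {
  if (typeof msg !== "object" || msg === null) return false;
  const m = msg as Record<string, unknown>;
  return m.type === "load-youtube-transcript" && typeof m.videoUrl === "string";
}

// Intended to run in the background script. `fetchTranscript` is a
// placeholder for whatever actually performs the load.
async function handleLoadTranscript(
  req: LoadTranscriptRequest,
  fetchTranscript: (videoUrl: string) => Promise<string>,
): Promise<LoadTranscriptResponse> {
  try {
    return { ok: true, text: await fetchTranscript(req.videoUrl) };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}

// Wiring inside the extension (not executed here):
//
// background script:
//   chrome.runtime.onMessage.addListener((msg, _sender, sendResponse) => {
//     if (!isLoadTranscriptRequest(msg)) return false;
//     handleLoadTranscript(msg, fetchTranscript).then(sendResponse);
//     return true; // keep the message channel open for the async response
//   });
//
// popup:
//   const res = await chrome.runtime.sendMessage({
//     type: "load-youtube-transcript",
//     videoUrl,
//   });
```

Keeping the handler free of `chrome` calls means it can be exercised outside the extension, which makes it easier to compare the popup and background code paths.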
