Enable deep searching of specific URLs #92

Closed
miurla opened this issue Apr 28, 2024 · 13 comments · Fixed by #155
Labels: enhancement (New feature or request)

Comments

@miurla
Owner

miurla commented Apr 28, 2024

When a specific URL is included in the query, retrieve the data from that URL instead of searching.
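A minimal sketch of the routing this request describes, assuming the query arrives as plain text. The names `extractUrls`, `planQuery`, and `Plan` are hypothetical and do not come from the project; the regex is a rough heuristic, not a full URL parser.

```typescript
// Hypothetical helper: pull http(s) URLs out of a free-text query.
function extractUrls(query: string): string[] {
  const candidates = query.match(/https?:\/\/[^\s<>"']+/g) ?? [];
  return candidates
    // Drop trailing punctuation that usually belongs to the sentence, not the URL.
    // Caveat: this also strips a closing ")" that is genuinely part of a URL.
    .map((raw) => raw.replace(/[.,;:!?)\]]+$/, ""))
    .filter((url) => url.length > 0);
}

// Either retrieve the named pages directly or fall back to a normal search.
type Plan =
  | { kind: "retrieve"; urls: string[] }
  | { kind: "search"; query: string };

function planQuery(query: string): Plan {
  const urls = extractUrls(query);
  return urls.length > 0
    ? { kind: "retrieve", urls }
    : { kind: "search", query };
}
```

With this shape, the caller only has to branch on `kind`: a query that names a URL skips the search step entirely, and everything else goes through search as before.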

miurla added the enhancement label on Apr 28, 2024
@miurla
Owner Author

miurla commented Apr 29, 2024

Try this: https://github.com/mendableai/firecrawl

@albertdbio
Contributor

Firecrawl was pretty slow when I tested it.
Is that fine? I'm sure users would understand as long as we show a spinner.

@albertdbio
Contributor

Ahh, never mind: the crawl API, which scrapes multiple pages, is slow, but scraping a single page is decent.

@miurla
Owner Author

miurla commented Apr 30, 2024

It's a good try!
I got the following advice on our Discord. I'm going to try this.

You don't need Firecrawl for this, IMHO. You can just use Playwright/fetch and then use Turndown to convert the HTML to Markdown.
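The advice above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: `pageToMarkdown`, `Fetcher`, and `HtmlToMarkdown` are hypothetical names, and the fetcher and converter are passed in so the sketch has no dependencies of its own.

```typescript
// Hypothetical types: the minimal response surface this sketch relies on.
type PageResponse = { ok: boolean; status: number; text(): Promise<string> };
type Fetcher = (url: string) => Promise<PageResponse>;
type HtmlToMarkdown = (html: string) => string;

// Fetch one page and hand its HTML to a converter.
async function pageToMarkdown(
  url: string,
  fetcher: Fetcher,
  toMarkdown: HtmlToMarkdown
): Promise<string> {
  const res = await fetcher(url);
  if (!res.ok) {
    throw new Error(`Fetching ${url} failed with status ${res.status}`);
  }
  const html = await res.text();
  return toMarkdown(html).trim();
}

// Possible wiring (assumes the turndown package is installed and the
// runtime provides a global fetch):
//   import TurndownService from "turndown";
//   const turndown = new TurndownService();
//   const md = await pageToMarkdown(url, fetch, (html) => turndown.turndown(html));
```

A plain fetch only sees the server-rendered HTML, so pages that build their content with client-side JavaScript would still need something like Playwright as the fetcher.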

@albertdbio
Contributor

albertdbio commented Apr 30, 2024

Firecrawl uses playwright + turndown 😆
We can also use this one https://jina.ai/reader

I'll give it a shot. I already got the researcher able to refer to the previous sources it pulled; it's just that the types are a bit tricky.
I just need to integrate those APIs.

I'll work on this one first and I'll see if I can get back to video section next week, it's been busy at work

@albertdbio
Contributor

Jina seems a tad faster to me, btw, if you want to try it out.
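For anyone trying the Jina option, a rough sketch. It assumes Jina Reader returns an LLM-friendly text version of a page when the target URL is appended to the `https://r.jina.ai/` prefix; check the Reader docs for current behavior and limits. `jinaReaderUrl`, `readViaJina`, and `TextFetcher` are hypothetical names.

```typescript
// Assumption: Reader is addressed by prefixing the target URL.
const JINA_READER_PREFIX = "https://r.jina.ai/";

// Build the Reader URL for a target page; reject anything that is not http(s).
function jinaReaderUrl(target: string): string {
  if (!/^https?:\/\//i.test(target)) {
    throw new Error(`Expected an http(s) URL, got: ${target}`);
  }
  return JINA_READER_PREFIX + target;
}

// Hypothetical fetcher type so the sketch stays dependency-free;
// the global fetch can be passed in directly.
type TextFetcher = (
  url: string
) => Promise<{ ok: boolean; status: number; text(): Promise<string> }>;

async function readViaJina(target: string, fetcher: TextFetcher): Promise<string> {
  const res = await fetcher(jinaReaderUrl(target));
  if (!res.ok) {
    throw new Error(`Reader request failed with status ${res.status}`);
  }
  return res.text();
}
```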

@miurla
Owner Author

miurla commented Apr 30, 2024

Thank you. I was stuck with the chat history feature. I'll be able to start working on a different task soon.

@albertdbio
Contributor

Hey @miurla, I fixed the types related to enabling follow-up conversations and also found a way to enable function calling with Groq, so now I'm moving forward with adding Firecrawl. Planning to push a PR by Sunday!

@miurla
Owner Author

miurla commented May 2, 2024

@albertdbio Wow, that's great. Is the function calling an implementation that uses groq's sdk to replace the researcher's model? Looking forward to the PR.

@miurla
Owner Author

miurla commented May 3, 2024

@albertdbio I've merged significant changes, so it's best to develop from main.

@albertdbio
Contributor

albertdbio commented May 3, 2024

@miurla I'm not using the Groq SDK. It turns out the Vercel AI SDK needed to be upgraded to a newer minor version. That minor version sets content as optional in the OpenAI zod schema. Groq doesn't return content when doing function calls, so validation would throw an invalid response error. I'm going to test it with the inquire agent to verify that it's working.
Sounds good, I'll use main!
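To illustrate why an optional `content` matters here, a plain TypeScript stand-in for the schema check. This is not the Vercel AI SDK's actual zod schema; `AssistantMessage` is a simplified, hypothetical shape, and both validators are made up for the example.

```typescript
// Hypothetical, simplified shape of an assistant message in a chat response.
interface AssistantMessage {
  role: "assistant";
  content?: string | null;
  tool_calls?: { name: string; arguments: string }[];
}

// Strict check: mirrors a schema where `content` is required.
function isValidStrict(msg: AssistantMessage): boolean {
  return typeof msg.content === "string";
}

// Relaxed check: `content` may be absent as long as the message carries
// either text or at least one tool call.
function isValidRelaxed(msg: AssistantMessage): boolean {
  const hasText = typeof msg.content === "string";
  const hasToolCalls = (msg.tool_calls?.length ?? 0) > 0;
  return hasText || hasToolCalls;
}

// A function-call response with no `content`, as described in the comment above.
const toolCallOnly: AssistantMessage = {
  role: "assistant",
  tool_calls: [{ name: "search", arguments: '{"query":"morphic"}' }],
};
```

Under the strict check, `toolCallOnly` is rejected even though it is a perfectly usable function call; the relaxed check accepts it.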

@albertdbio
Contributor

Got caught up with work over the weekend. Will continue chipping away at it this week. First I'll rebase onto main.

@miurla
Owner Author

miurla commented May 18, 2024

I'm working on this feature.
