
I don't know how to use autocomplete or it is not working for me. #20

Open
AtmanActive opened this issue Feb 17, 2024 · 9 comments
Labels: question (Further information is requested)

Comments

@AtmanActive

Hi,

Thank you for this great software.

Unfortunately, I can't make autocomplete work on my computer.

This is on Windows 10 Pro x64, VSCodium v1.85.1, Release 23348, Privy v0.2.7.

The code explanation panel is working fine; at least that one showed errors, so I could fix the problems and get it working.

The autocomplete function, on the other hand, I have no idea how to make work, and I don't understand why it isn't working.
As soon as I try to turn it on, VSCodium's own autocomplete stops working altogether.

Ollama is working fine, Privy looks like it is connecting fine, the LLM model seems to be matched correctly, and Privy's output panel in VSCodium looks like it is working, sifting through my source lines, but autocomplete suggestions are nowhere to be found.

Please let me know if I can provide some more debug info.

Thank you for your help.

AtmanActive added the question label on Feb 17, 2024
@srikanth235
Owner

Hi,

Thank you for trying the extension.

You can troubleshoot this by using the manual mode available for the autocomplete feature. Once you enable the manual mode by choosing "manual" in the settings section, you can trigger the autocomplete feature manually by using the default keyboard shortcut of Alt + \.

[Screenshot: Privy's autocomplete mode setting]

You can check whether autocomplete requests are being triggered by looking at the output panel. If it works, you should see autocomplete prompts like the one below.

[Screenshot: an autocomplete prompt in Privy's output panel]

Please let me know if this helps you further.

Thanks,
Srikanth.

@srikanth235
Owner

One of our users faced the same issue. It happened because the wrong model variant was picked for autocompletion. Please use the deepseek-coder:{1.3b, 6.7b, 33b}-base or codellama:{7b, 13b, 34b}-code family of models for autocompletion.
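
For example, assuming you already have Ollama installed, pulling one of these completion-friendly variants would look something like this (pick a size your machine can handle):

```sh
# base/code variants are tuned for plain code continuation, which is what autocomplete needs
ollama pull deepseek-coder:1.3b-base
# or, from the codellama family
ollama pull codellama:7b-code
```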

@AtmanActive
Author

Thank you for your help.

I just started it again with the intention of recording a video for you, and now, all of a sudden, it seems to be working.
This time I started deepseek-coder:latest and made sure to put the same value in Privy's Autocomplete: Model setting in VSCodium.

There is some gray text that shows up which is not really an autocomplete suggestion, but it does look similar to the screenshot in your README.md.

So, instead of getting autocomplete suggestions, I'm actually getting this: "It seems like you've posted a piece of code that is not complete or well-formatted. Could you please provide more details about what exactly the problematic section does? Are there any errors in it, and if so, which ones are they? Also note whether this part should be placed inside an Apache module (if yes), standalone script(s) like a Perl one line or as per your requirement."

I guess this is just LLM being stupid, nothing to do with your extension.

I believe my previous inability to make this work was due to me not fully understanding how to work with Ollama on Windows. It seems to me now that each and every time I type ollama run XXX, Ollama is adding that LLM engine on top, instead of replacing the old one, as I thought it would. I can see this by the CPU/GPU/RAM usage going up and up and up. So, silly me, after testing/starting several LLM models with Ollama, I actually had them all running at once on my machine, making Ollama grind to a halt. Only when I right-click my Ollama tray icon and choose Quit Ollama do I see all of the resources freed up again. So, in my previous tests, it seemed like it was working, but it was actually so slow that it looked frozen, even though Privy's output log in VSCodium was showing that something was happening.
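
In case it helps others, here is a rough sketch (assuming a reasonably recent Ollama; the keep_alive parameter comes from the Ollama HTTP API) of how to see what has been pulled and ask the server to unload a model without quitting the tray app:

```sh
# list the models that have been downloaded locally
ollama list

# ask the Ollama server to unload a model from memory by setting keep_alive to 0
curl http://localhost:11434/api/generate -d '{"model": "deepseek-coder:6.7b-base", "keep_alive": 0}'
```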

By the way, my advice would be to add timestamps to your log lines, since right now I can't tell when a given log line was generated. Was it generated just now, or am I looking at something from a few minutes back? I can't tell. If each log line started with a Unix timestamp, or a simple clock like XX:YY:ZZ, I would be able to follow the logs.

Since it is now kind of working, I will be running it for a few days to see how it behaves. Will report back here.

Thanks.

@srikanth235
Owner

srikanth235 commented Feb 21, 2024

Thanks a lot for the detailed write-up. This helps us assist other users too!

As for autocomplete suggestions, please use only base models like deepseek-coder:1.3b-base, deepseek-coder:6.7b-base, etc. The deepseek-coder:latest model you mentioned is an instruct-tuned model and hence only suited for chat.
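
If you want to double-check a model outside the editor, a quick sanity test (assuming Ollama is listening on its default port 11434) is to ask for a raw continuation; a base model should simply keep writing code instead of replying in chat style:

```sh
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-coder:1.3b-base",
  "prompt": "def fibonacci(n):",
  "raw": true,
  "stream": false
}'
```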

Hope this helps!

@AtmanActive
Author

Ah yes, thank you for the clarification.

It is so easy to get lost in these Ollama models.
The very fact that there is deepseek-coder:6.7b and then there is deepseek-coder:6.7b-base is very confusing.
Unless someone gives the exact copy-paste commands, people can easily end up running the wrong model.

OK, so now I have the situation that I originally reported: everything is running, but Privy's autocomplete suggestions are either non-existent (while also blocking VSCodium's autocomplete box) or, very rarely, shown on screen but complete nonsense.

Let me explain where I am coming from: I'm a professional full-stack developer and I work on a project with ~30000 source code files, some Perl, some SQL, some HTML, some JavaScript and some CSS. For the last several years I've been using the Atom IDE with TabNineLocal, a local CPU-bound autocomplete AI engine which, honestly, felt like a miracle from day one. TabNine could easily guess my whole lines, so my coding would just be: type a few letters, hit TAB a few times, and the line is finished. TabNine could never write whole code snippets for me, that's true, but it supercharged autocomplete so much that my coding became 10x faster.

I read people saying the same thing about GitHub Copilot, but I don't want to run anything that can't work locally on my machine. I was hoping Privy would be the same or better.

Unfortunately, right now, for reasons I am not entirely sure about, Privy is more or less not working at all for me. My VSCodium's built-in autocomplete works decently enough when Privy is disabled, but when Privy is enabled with these base models, not only do I not get any useful autocomplete suggestions from Privy itself, but VSCodium's own autocomplete suggestions box doesn't show up at all. So, right now, using Privy I am down to 1x coding speed. 😃 When I disable Privy, I am at least at some 3x coding speed thanks to VSCodium's built-in autocomplete, which is not very smart (it is limited to one word only) but at least helps somewhat.

Running deepseek-coder:6.7b-base, I can see my RTX 4070 Ti GPU load at 90% and Privy's log output filling up, but this LLM model is apparently either crashing with nothing useful to say or, rarely, saying total nonsense.

Running deepseek-coder:1.3b-base, I can see my RTX 4070 Ti GPU load at 52% and Privy's log output filling up, but this LLM model is apparently either crashing with nothing useful to say or, rarely, saying total nonsense.

I guess these LLM models can't really digest huge Perl files, and also, I guess, Privy hangs while waiting for the LLM, thus blocking VSCodium's autocomplete suggestions box.

These are my early experiences with Privy+Ollama.

Please let me know if I can provide any debug info.

Thanks.

@srikanth235
Owner

> I guess these LLM models can't really digest huge Perl files, and also, I guess, Privy hangs while waiting for the LLM, thus blocking VSCodium's autocomplete suggestions box.

This is a mistake on our side: we didn't add support for Perl earlier. The new release, v0.2.8, should fix this. We have added timestamps to the log lines as part of this release too. Please give it a try!

> Unfortunately, right now, for reasons I am not entirely sure about, Privy is more or less not working at all for me. My VSCodium's built-in autocomplete works decently enough when Privy is disabled, but when Privy is enabled with these base models, not only do I not get any useful autocomplete suggestions from Privy itself, but VSCodium's own autocomplete suggestions box doesn't show up at all. So, right now, using Privy I am down to 1x coding speed. 😃 When I disable Privy, I am at least at some 3x coding speed thanks to VSCodium's built-in autocomplete, which is not very smart (it is limited to one word only) but at least helps somewhat.

There is a manual mode setting for Privy's autocomplete feature. You can use it in tandem with your VSCodium editor's autocomplete suggestions. I do agree with the overall sentiment of your observation: if the tool doesn't help you become more productive, there is no point in using it. The space of coding LLMs is moving fast, and we're working towards integrating the relevant pieces into Privy. Hopefully, in the coming days, the output will match your expectations 😃.

@theashishmaurya

theashishmaurya commented Feb 23, 2024

Anyway, I am using the gemma:2b model for chat, but for some reason autocomplete does not work for me, and neither does it with codellama:7b.
Can we only use the base models?
In the log I am getting the response, but it does not populate the code.
[Screenshot: log output showing a response that never appears in the editor]

Sometimes it does produce a completion, but there is a lot of context involved. I understand you asked to use the base models for now, but a low-end Mac can't handle that much load when multiple servers and Docker containers are running.
Is there any way we can improve this? I will have a look at the code base and try to find a solution, but let me know if you have any ideas in mind.
[Screenshot: a completion that pulls in a lot of surrounding context]

@srikanth235
Owner

We ran benchmarks for the gemma-7b model against the existing ones, but the results weren't encouraging :(

[Screenshot: benchmark results for gemma-7b compared with the existing models]

For autocompletion, the recommended models are deepseek-coder:{1.3b, 6.7b, 33b}-base. I'm currently using the 1.3b variant (on a 16GB Mac M1 Pro) as it is lightweight and still gives reasonable output.

Please take a look at the recommended models section in our README.md file.

The benchmark image was generated using benchllama. You can use it to test different models on your system and choose accordingly.
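
If you'd like to run it yourself, benchllama is on PyPI; something along these lines should get you started (check the CLI help for the exact flags to point it at your local Ollama models):

```sh
pip install benchllama
benchllama --help
```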

@nonetrix

nonetrix commented Apr 14, 2024

It's working for me, but the results are really disappointing with deepseek-coder:6.7b-base. For example, I write check_for_profile() { and it outputs something like the screenshots below, or doesn't generate anything at all, or sometimes just copies and pastes my code. The chat interface, however, is quite useful. I think something is wrong, because it keeps giving me strange tokens like </pre>.
[Screenshot: a garbled completion containing </pre> tokens]
[Screenshot: another garbled completion]
Not sure if I am just expecting too much from the model, but this is really bad.
Also, here is another one: it completely fails at the first line and again adds the weird </pre> token.
[Screenshot: a completion that fails at the first line and emits </pre>]
