AutoDev Catalyser #30

Closed
2 of 5 tasks
phodal opened this issue May 14, 2024 · 0 comments
Labels
enhancement New feature or request

phodal commented May 14, 2024

Validations

  • I'm not able to find an open issue that requests the same enhancement

Problem

A catalyst for coding

Solution

No response

Todos

  • Code RAG
  • Code Extract
  • Symbol Search
  • Document RAG
@phodal phodal added the enhancement New feature or request label May 14, 2024
phodal added a commit that referenced this issue May 14, 2024
Add Catalyser class for semantic code search functionality.
phodal added a commit that referenced this issue May 14, 2024
Refactor the retrieveContextItems function in DefaultRetrieval.ts for clarity and consistency.
phodal added a commit that referenced this issue May 14, 2024
- Update method name from getHydeTemplate to renderHydeTemplate for clarity.
phodal added a commit that referenced this issue May 14, 2024
- Add evaluation step to prompt processing for keyword analysis.
phodal added a commit that referenced this issue May 14, 2024
Added console.log statements for search results and evaluation step in Catalyser.ts for better debugging and visibility.
phodal added a commit that referenced this issue May 14, 2024
Added functionality to request user input in the sidebar webview.

This commit enhances the extension by allowing the sidebar webview to request user input, improving user interaction.
phodal added a commit that referenced this issue May 14, 2024
- Update class and method names from RankedKeywords to QuestionKeywords
- Rename file from RankedKeywords.ts to QuestionKeywords.ts
- Adjust references and tests accordingly.
phodal added a commit that referenced this issue May 15, 2024
- Update method names in LlmProvider for clarity and consistency.
- Replace `instance()` with `codeCompletion()` and `chatCompletion()`.
phodal added a commit that referenced this issue May 15, 2024
Add enum for LLM strategy options and update code and chat completion methods to accept a strategy parameter. Default strategy set for each method.
phodal added a commit that referenced this issue May 15, 2024
Moved keyword context interfaces from the prompt management file to the search strategy file for better organization and cohesion.
phodal added a commit that referenced this issue May 15, 2024
- Refactor HydeKeywordsStrategy and Catalyser for optimization.
phodal added a commit that referenced this issue May 15, 2024
Added support for text ranges in code search to improve code retrieval accuracy and display.
phodal added a commit that referenced this issue May 15, 2024
- Update search strategy steps to include 'Retrieve' step in HydeKeywordsStrategy.ts and HydeStep.ts.
phodal added a commit that referenced this issue May 15, 2024
Refactor step handling in HydeKeywordsStrategy.ts for better clarity and maintainability.
phodal added a commit that referenced this issue May 15, 2024
- Added logging for the instruction parameter in the HydeKeywordsStrategy file to aid debugging and monitoring.
phodal added a commit that referenced this issue May 15, 2024
…#30

Refactor query term creation in HydeKeywordsStrategy to improve readability and maintainability.
phodal added a commit that referenced this issue May 15, 2024
The `HydeKeywordsStrategy` class is refactored to prioritize code text in a specific order: the current document, recent documents, and then all documents. This improves the generation of keywords from a query and the retrieval of similar code by symbols.
phodal added a commit that referenced this issue May 15, 2024
- Refactored code chunking and indexing logic for better performance and readability.
- Updated the `Chunker` classes to include the `language` property in the generated chunks.
- Added the `languageFromPath` function to determine the language based on the file extension.
- Optimized the `BasicChunker` and `ConstructCodeChunker` to pass the `language` parameter to the generated chunks.
- Updated the database schema and queries to include the `language` column.
- Updated the test cases to reflect the changes in the code chunking process.
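
A minimal sketch of what a `languageFromPath` helper could look like, assuming a simple extension lookup. The extension table, the return values, and the `ChunkWithLanguage` shape are assumptions for illustration; the commit's actual mapping is not shown here.

```typescript
// Hypothetical extension-to-language table; the real mapping is not shown in this issue.
const EXTENSION_TO_LANGUAGE = new Map<string, string>([
  ["ts", "typescript"],
  ["tsx", "typescript"],
  ["js", "javascript"],
  ["py", "python"],
  ["java", "java"],
  ["go", "go"],
  ["rs", "rust"],
]);

// Returns a language id for a file path, or undefined when the extension is unknown.
function languageFromPath(filePath: string): string | undefined {
  const fileName = filePath.split(/[\\/]/).pop() ?? "";
  const dotIndex = fileName.lastIndexOf(".");
  if (dotIndex <= 0) {
    return undefined; // no extension, or a dotfile such as ".gitignore"
  }
  const extension = fileName.slice(dotIndex + 1).toLowerCase();
  return EXTENSION_TO_LANGUAGE.get(extension);
}

// Assumed chunk shape: only the `language` property is named in the commit message.
interface ChunkWithLanguage {
  content: string;
  filepath: string;
  language?: string;
}

// Attaches the detected language to a chunk at creation time.
function toChunk(filepath: string, content: string): ChunkWithLanguage {
  return { content, filepath, language: languageFromPath(filepath) };
}
```

Storing the language on each chunk at indexing time is what would let later queries filter by language without re-reading file paths.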
phodal added a commit that referenced this issue May 15, 2024
Refactor the code for handling embeddings in the LanceDbIndex file. Instead of directly assigning the result of the embeddingsProvider.embed() method to the embeddings variable, wrap it in a try-catch block. This change allows for better error handling and logging when embedding chunks fails.
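
The change described above could be sketched roughly as follows. Only `embeddingsProvider.embed()` is named in the commit message; the `EmbeddingsProvider` interface, the `embedChunksSafely` name, and the choice to return an empty array on failure are assumptions.

```typescript
// Assumed provider shape; only the embed() method name comes from the commit message.
interface EmbeddingsProvider {
  embed(chunks: string[]): Promise<number[][]>;
}

// Hypothetical wrapper: logs the failure and returns an empty result instead of throwing,
// so one failed batch does not abort the whole indexing run.
async function embedChunksSafely(
  provider: EmbeddingsProvider,
  chunks: string[],
): Promise<number[][]> {
  try {
    return await provider.embed(chunks);
  } catch (error) {
    console.error(`Failed to embed ${chunks.length} chunk(s):`, error);
    return [];
  }
}
```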
phodal added a commit that referenced this issue May 15, 2024
This commit adds support for Tree-sitter language tags in the code search feature. It includes the addition of language-specific tag schemas for various languages such as QL, Go, Elm, C++, C, Elixir, Java, PHP, Python, Rust, Ruby, OCaml, and TypeScript. These schemas define the structure and naming conventions for different code elements like classes, functions, methods, modules, and more. Additionally, the commit also modifies the CodeSnippetsCodebaseIndex and LanceDbIndex files to import the necessary dependencies and update file paths.
phodal added a commit that referenced this issue May 16, 2024
…30

Add support for generating code snippets that can be returned as answers to code search engine queries. The snippets should be written in a programming or markup language likely given the query and should be between 5 and 10 lines long. The snippets are surrounded by triple backticks for proper formatting.
phodal added a commit that referenced this issue May 16, 2024


Add support for generating synthetic documents based on user input to improve semantic search recall. This includes the implementation of the `HydeCodeStrategy` class, which generates synthetic documents based on the query and extracts code. The strategy first tries semantic search and falls back to code search if there are no or few results. Additionally, the code includes the extraction of the Chunk with `NamedElement` and `extra_chunks` to a class or function.

Related files:
- `src/code-search/search-strategy/HydeCodeStrategy.ts`
- `src/test/codesearch/strategy/RankedKeywords.test.ts`
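
A rough sketch of the flow this commit describes: generate a synthetic (hypothetical) document from the query, try semantic search first, and fall back to code search when there are no or few results. The `HydeDeps` interface, the function names, and the `MIN_SEMANTIC_HITS` cutoff are all assumptions, not the actual `HydeCodeStrategy` API.

```typescript
interface SearchHit {
  filepath: string;
  content: string;
}

// Hypothetical dependencies standing in for the LLM call and the two search backends.
interface HydeDeps {
  generateHypotheticalCode(query: string): Promise<string>;
  semanticSearch(text: string, limit: number): Promise<SearchHit[]>;
  codeSearch(text: string, limit: number): Promise<SearchHit[]>;
}

// Assumed meaning of "few results"; the real cutoff is not stated in the commit.
const MIN_SEMANTIC_HITS = 3;

async function hydeCodeSearch(query: string, deps: HydeDeps, limit = 10): Promise<SearchHit[]> {
  // 1. Ask the model for a synthetic snippet that could answer the query.
  const hypotheticalCode = await deps.generateHypotheticalCode(query);

  // 2. Search with the synthetic snippet rather than the raw query.
  const semanticHits = await deps.semanticSearch(hypotheticalCode, limit);
  if (semanticHits.length >= MIN_SEMANTIC_HITS) {
    return semanticHits;
  }

  // 3. Too few semantic hits: fall back to code search and merge without duplicates.
  const codeHits = await deps.codeSearch(hypotheticalCode, limit);
  const seen = new Set<string>();
  const merged: SearchHit[] = [];
  semanticHits.concat(codeHits).forEach((hit) => {
    const key = `${hit.filepath}\u0000${hit.content}`;
    if (!seen.has(key)) {
      seen.add(key);
      merged.push(hit);
    }
  });
  return merged.slice(0, limit);
}
```

Searching with the synthetic snippet is the core idea of HyDE-style retrieval: generated code tends to sit closer to real code in embedding space than a natural-language question does.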
phodal added a commit that referenced this issue May 16, 2024
Add logging statements to print the keywords and chunks during the code search process.
phodal added a commit that referenced this issue May 16, 2024
- Added a new optional `language` parameter to the `retrieveChunks` and `retrieveContextItems` functions in `HydeKeywordsStrategy.ts` and `DefaultRetrieval.ts` respectively.
- The `language` parameter allows for filtering results based on the specified programming language.
- Modified the SQL query in `FullTextSearch.ts` to include the `language` parameter in the WHERE clause when retrieving chunks.
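
One way the optional language filter might be added to a query, shown as a sketch. The `chunks` table, its column names, and the `LIKE` match are hypothetical placeholders; the actual full-text query in `FullTextSearch.ts` is not shown in this issue.

```typescript
interface SqlQuery {
  sql: string;
  params: Array<string | number>;
}

// Builds a parameterized query; the language condition is appended only when provided.
// Note: LIKE wildcards in searchText are not escaped in this sketch.
function buildChunkQuery(searchText: string, limit: number, language?: string): SqlQuery {
  const conditions: string[] = ["content LIKE ?"];
  const params: Array<string | number> = [`%${searchText}%`];

  if (language !== undefined && language !== "") {
    conditions.push("language = ?");
    params.push(language);
  }

  params.push(limit);
  return {
    sql: `SELECT path, content, language FROM chunks WHERE ${conditions.join(" AND ")} LIMIT ?`,
    params,
  };
}
```

Passing the language as a bound parameter rather than interpolating it into the SQL string keeps the optional filter safe against injection.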
phodal added a commit that referenced this issue May 20, 2024
- Add support for creating a new session in the autoDev commands.
phodal added a commit that referenced this issue May 20, 2024
Ensure proper handling of empty code snippets in search strategies.

- Add checks for empty code chunks and provide appropriate feedback.
- Display a message when failing to embed a query or chunk.
phodal added a commit that referenced this issue May 20, 2024
Ensure proper handling when no ranges are available to avoid errors.
phodal added a commit that referenced this issue May 20, 2024
- Update the UUID generation method to use the `crypto.randomUUID()` function for better security and uniqueness.
phodal added a commit that referenced this issue May 20, 2024
Add a session check before initializing the LocalEmbeddingProvider to prevent redundant initialization.
phodal added a commit that referenced this issue May 21, 2024
Added methods to retrieve git history and changes by commit hash in the Retrieval and GitAction classes. This allows for indexing and searching relative to commit history.
phodal added a commit that referenced this issue May 21, 2024
Added an option to retrieve Git changes in the code search functionality. This includes changes to the `DefaultRetrieval.ts`, `IdeAction.ts`, and `Retrieval.ts` files. The new feature allows for the retrieval of Git changes based on commit messages using the TfIdfSemanticChunkSearch.
phodal added a commit that referenced this issue May 21, 2024
- Added `withGitChange` option to `HydeKeywordsStrategy` and `HydeCodeStrategy`.
- Changed return type of `search` method in `TfIdfSemanticChunkSearch` from `string[]` to `number[]`.
- Updated `retrieveGit` method in `Retrieval` to include a threshold parameter and improved commit retrieval logic.
- Reduced the default number of commits retrieved in `getHistoryCommits` method in `GitAction`.
- Updated `diffWith` method in `GitAction` to return a joined string of changes.
- Replaced `stopwords` with `ourStopwords` in `Tfidf` to allow for custom stopwords.
phodal added a commit that referenced this issue May 21, 2024
…tions #30

Increased the maximum number of history commits from 50 to 500 in `getHistoryCommits` method. Added type annotations to `diffResult` and `diffs` variables in `getRepositoryChanges` and `getChangeByHash` methods respectively. Also, added a comment to fix `getChangeByHashInRepo` method in the future as the official API doesn't support this feature yet.
phodal added a commit that referenced this issue May 21, 2024
…ements #30

The GitAction class in the editor-api module has been significantly enhanced. A new method `exec` has been added to execute git commands. The method `_getRepo` has been refactored to use the new `gitApi` promise. The `constructor` has been updated to initialize `gitApi` and `gitExecutablePath`. Additionally, a copyright notice has been added at the beginning of the file.
phodal added a commit that referenced this issue May 21, 2024
…fIdfSemanticChunkSearch #30

Replaced 'withGitChange' option with 'withCommitMessageSearch' in HydeKeywordsStrategy, HydeCodeStrategy, and DefaultRetrieval. Renamed 'TfIdfSemanticChunkSearch' to 'TfIdfChunkSearch' across multiple files. Added error handling for git retrieval in DefaultRetrieval. Updated test cases to reflect these changes.
phodal added a commit that referenced this issue May 21, 2024
The commit message search feature has been enabled in HydeKeywordsStrategy. The retrieval process in DefaultRetrieval has been updated to accommodate this change. The number of retrievals is reduced when searching commit messages due to their length. A TODO has been added in GitAction to split show by multiple content with git changes.
phodal added a commit that referenced this issue May 21, 2024
Added a new GitParser module to the editor-api that can parse Git logs into a structured format. This includes the status of the file (added, modified, deleted) and the changes made. Also updated the GitAction module to use this new parser when retrieving changes by hash. Added corresponding tests to verify the correct functionality of the parser.
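
As an illustration of parsing Git output into a structured form, the sketch below handles the tab-separated lines that `git show --name-status` prints (`A`, `M`, or `D` followed by a path). The function name and the field names are assumptions; the real `GitParser` may read a different log format and also captures the changes themselves.

```typescript
type FileStatus = "added" | "modified" | "deleted";

// Assumed shape for illustration; the real interface may carry more fields.
interface ParsedFileChange {
  status: FileStatus;
  filename: string;
}

function statusFromLetter(letter: string): FileStatus | undefined {
  switch (letter) {
    case "A":
      return "added";
    case "M":
      return "modified";
    case "D":
      return "deleted";
    default:
      return undefined;
  }
}

// Parses lines such as "M\tsrc/Catalyser.ts"; other lines (blank, renames, copies) are skipped.
function parseNameStatus(output: string): ParsedFileChange[] {
  const changes: ParsedFileChange[] = [];
  output.split(/\r?\n/).forEach((line) => {
    const match = /^([AMD])\t(.+)$/.exec(line);
    if (!match) {
      return;
    }
    const status = statusFromLetter(match[1]);
    if (status !== undefined) {
      changes.push({ status, filename: match[2] });
    }
  });
  return changes;
}
```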
phodal added a commit that referenced this issue May 21, 2024
Add LlmReranker prompt support to PromptManager and templates.
phodal added a commit that referenced this issue May 21, 2024
- Update LLMReranker constructor to use OpenAICompletion.
- Refactor complete method in OpenAICompletion for better abstraction.
phodal added a commit that referenced this issue May 21, 2024
- Update embedding provider instantiation to use non-null assertion operator.

- Export `OllamaEmbeddingsProvider` class instead of default export.

- Add a todo comment to load configuration for creating an embedding provider.
phodal added a commit that referenced this issue May 21, 2024
Refactor the embedding provider management by introducing a new `EmbeddingsProviderManager` namespace. This change includes renaming `LocalEmbeddingProvider` to `LocalEmbeddingsProvider`, updating references, and centralizing provider creation and initialization logic.
phodal added a commit that referenced this issue May 21, 2024
- Add support for multiple provider types including Local, OpenAI, and Ollama.
phodal added a commit that referenced this issue May 21, 2024
- Update ParsedFileChange interface to include content field.
- Refactor GitParser and GitAction to handle content changes.
- Return ParsedFileChange array instead of concatenated string.
phodal added a commit that referenced this issue May 21, 2024
A new method, `pathSimilarity`, has been added to the `JaccardSimilarity` class. This method calculates the similarity score between a path and a set of strings. A corresponding test case has also been added to ensure the method works correctly.
phodal added a commit that referenced this issue May 21, 2024
The commit introduces the Jaccard similarity algorithm to improve the search results in the code search feature. It filters out results with a similarity score less than 0.5 to ensure more relevant search results.
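
For reference, Jaccard similarity is the size of the intersection of two sets divided by the size of their union. The sketch below applies it to path tokens and keeps only results scoring at least 0.5, as the commit describes. The tokenization rule and the function names are assumptions; only `pathSimilarity` and the 0.5 threshold come from the commit messages.

```typescript
// |A ∩ B| / |A ∪ B|; two empty sets are treated as having similarity 0 (a convention chosen here).
function jaccardSimilarity(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) {
    return 0;
  }
  let intersection = 0;
  a.forEach((item) => {
    if (b.has(item)) {
      intersection++;
    }
  });
  const union = a.size + b.size - intersection;
  return intersection / union;
}

// Assumed tokenization: split on non-alphanumerics and camelCase boundaries, then lowercase.
function tokenizePath(path: string): Set<string> {
  const tokens = path
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2")
    .split(/[^A-Za-z0-9]+/)
    .filter((token) => token.length > 0)
    .map((token) => token.toLowerCase());
  return new Set(tokens);
}

function pathSimilarity(path: string, terms: string[]): number {
  return jaccardSimilarity(tokenizePath(path), new Set(terms.map((t) => t.toLowerCase())));
}

const SIMILARITY_THRESHOLD = 0.5;

// Drops paths whose score is below the threshold.
function filterBySimilarity(paths: string[], terms: string[]): string[] {
  return paths.filter((path) => pathSimilarity(path, terms) >= SIMILARITY_THRESHOLD);
}
```

With this tokenization, `src/HydeStep.ts` yields the tokens `src`, `hyde`, `step`, `ts`; against the terms `hyde` and `step` the intersection is 2 and the union is 4, so the score is exactly 0.5 and the path is kept.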
phodal added a commit that referenced this issue May 21, 2024
Refactor Jaccard similarity calculation method for efficiency and clarity.
phodal added a commit that referenced this issue May 21, 2024
Renamed DomainTerm to TeamTerm and updated related services and files.
phodal added a commit that referenced this issue May 21, 2024
- Added support for custom team terms with a new CSV file and updated service to handle custom prompts directory.
- Updated TeamTerm interface to include clearer documentation for term and language fields.
- Refactored SettingService to handle legacy config migration for custom prompts directory.
phodal added a commit that referenced this issue May 21, 2024
Added new terms to the team terms CSV file and improved the TeamTermService by adding a singleton pattern and fetch method for retrieving terms.
phodal added a commit that referenced this issue May 21, 2024
- Added QueryExpansion class to handle query expansion by replacing terms with their localized versions along with term identifiers.
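
One possible reading of this query expansion step, sketched under assumptions: each whitespace-separated word that matches a known team term is replaced by its localized form followed by the term identifier. The `TeamTermEntry` fields, the `expandQuery` name, and the output format are hypothetical; the actual `QueryExpansion` class may behave differently.

```typescript
// Hypothetical term record; the real TeamTerm interface documents `term` and `language` fields.
interface TeamTermEntry {
  id: string;
  term: string;
  localized: string;
}

// Replaces whole words that match a team term (case-insensitive) with "localized (id)".
function expandQuery(query: string, terms: TeamTermEntry[]): string {
  const byTerm = new Map<string, TeamTermEntry>();
  terms.forEach((entry) => byTerm.set(entry.term.toLowerCase(), entry));

  return query
    .split(/\s+/)
    .filter((word) => word.length > 0)
    .map((word) => {
      const match = byTerm.get(word.toLowerCase());
      return match ? `${match.localized} (${match.id})` : word;
    })
    .join(" ");
}
```

Matching whole words avoids rewriting text that an earlier replacement already produced, at the cost of missing multi-word terms.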
phodal added a commit that referenced this issue May 21, 2024
- Refactored query expansion logic to enhance performance and accuracy.
@phodal phodal closed this as completed May 22, 2024
phodal added a commit that referenced this issue May 22, 2024
Refactored TreeSitterFileManager to improve cache handling and update mechanism. Removed unnecessary imports and adjusted the usage of TreeSitterFileManager in AutoDevCodeLensProvider. Also, updated the pub-sub mechanism in TreeSitterFile to support one-time listeners.
phodal added a commit that referenced this issue May 22, 2024
The Map used for caching in TreeSitterFileManager has been replaced with an LRUCache. This change should improve the performance and efficiency of the cache management.
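
To illustrate why an LRU cache differs from a plain `Map`, here is a small self-contained sketch that evicts the least recently used entry once a capacity is reached. It is not the `LRUCache` implementation used in the commit; the class name and capacity handling are assumptions.

```typescript
// Minimal LRU cache built on Map insertion order: the first key is always the least recently used.
class SimpleLruCache<K, V> {
  private readonly entries = new Map<K, V>();

  constructor(private readonly capacity: number) {
    if (capacity < 1) {
      throw new Error("capacity must be at least 1");
    }
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) {
      return undefined;
    }
    const value = this.entries.get(key) as V;
    // Re-insert to mark the key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.entries.has(key)) {
      this.entries.delete(key);
    } else if (this.entries.size >= this.capacity) {
      // Evict the least recently used entry (the oldest key in insertion order).
      const oldest = this.entries.keys().next();
      if (!oldest.done) {
        this.entries.delete(oldest.value);
      }
    }
    this.entries.set(key, value);
  }

  has(key: K): boolean {
    return this.entries.has(key);
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Unlike an unbounded `Map`, this keeps memory use capped, which matters when parsed trees for many files are cached.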