I'm not sure if this would add anything substantially different/better than the existing tree-sitter implementation, but I recently spent some time looking into stack graphs and related tooling, and I'm wondering whether this project might benefit from using tree-sitter-graph and/or the stack-graphs project (e.g. tree-sitter-stack-graphs-javascript), rather than just plain tree-sitter.
A few notes/links/references I recently collated RE: stack graphs + related libs:
Stack Graphs (an evolution of Scope Graphs) sound like they could be really interesting/useful with regard to code navigation, symbol mapping, etc. Perhaps we could use them for module identification, variable/function identifier naming stabilisation, or similar?
Precise code navigation is now available for all TypeScript repositories.
Precise code navigation gives more accurate results by only considering the set of classes, functions, and imported definitions that are visible at a given point in your code.
Precise code navigation is powered by stack graphs, a new open source framework we’ve created that lets you define the name binding rules for a programming language using a declarative, domain-specific language (DSL). With stack graphs, we can generate code navigation data for a repository without requiring any configuration from the repository owner, and without tapping into a build process or other CI job.
There's lots of interesting stuff in this post.
As part of developing stack graphs, we’ve added a new graph construction language to Tree-sitter, which lets you construct arbitrary graph structures (including but not limited to stack graphs) from parsed CSTs. You use stanzas to define the gadget of graph nodes and edges that should be created for each occurrence of a Tree-sitter query, and how the newly created nodes and edges should connect to graph content that you’ve already created elsewhere.
tree-sitter-graph
The tree-sitter-graph library defines a DSL for constructing arbitrary graph structures from source code that has been parsed using tree-sitter.
To dig even deeper and learn more, I encourage you to check out my Strange Loop talk and the stack-graphs crate: our open source Rust implementation of these ideas.
Stack graphs
The crates in this repository provide a Rust implementation of stack graphs, which allow you to define the name resolution rules for an arbitrary programming language in a way that is efficient, incremental, and does not need to tap into existing build or program analysis tools.
tree-sitter-stack-graphs definition for JavaScript
This project defines tree-sitter-stack-graphs rules for JavaScript using the tree-sitter-javascript grammar.
The command-line program for tree-sitter-stack-graphs-javascript lets you do stack graph based analysis and lookup from the command line.
tree-sitter-stack-graphs definition for TypeScript
This project defines tree-sitter-stack-graphs rules for TypeScript using the tree-sitter-typescript grammar.
The command-line program for tree-sitter-stack-graphs-typescript lets you do stack graph based analysis and lookup from the command line.
Incremental, zero-config Code Navigation using stack graphs.
In this talk I’ll describe stack graphs, which use a graphical notation to define the name binding rules for a programming language. They work equally well for dynamic languages like Python and JavaScript, and for static languages like Go and Java. Our solution is fast — processing most commits within seconds of us receiving your push. It does not require setting up a CI job, or tapping into a project-specific build process. And it is open-source, building on the tree-sitter project’s existing ecosystem of language tools.
Precise and search-based navigation
Certain languages supported by GitHub have access to precise code navigation, which uses an algorithm (based on the open source stack-graphs library) that resolves definitions and references based on the set of classes, functions, and imported definitions that are visible at any given point in your code. Other languages use search-based code navigation, which searches all definitions and references across a repository to find entities with a given name. Both strategies are effective at finding results and both make sure to avoid inappropriate results such as comments, but precise code navigation can give more accurate results, especially when a repository contains multiple methods or functions with the same name.
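To make the "search-based" side of that contrast concrete, here is a minimal sketch using only Python's standard `ast` module. It is not how GitHub implements code navigation; the `SOURCE` snippet and the `search_based_definitions` helper are hypothetical names made up for illustration. It shows why a name-only search returns several candidates when two classes define a method with the same name.

```python
import ast

# Hypothetical sample source: two classes that both define `speak`.
SOURCE = '''
class Cat:
    def speak(self):
        return "meow"

class Dog:
    def speak(self):
        return "woof"
'''


def search_based_definitions(source, name):
    """Return (qualified name, line) for every class/function definition called `name`.

    This mimics search-based navigation: it matches on the bare name only
    and ignores which definitions are actually visible at the call site.
    """
    hits = []

    def visit(node, prefix):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                qualified = prefix + child.name
                if child.name == name:
                    hits.append((qualified, child.lineno))
                visit(child, qualified + ".")
            else:
                visit(child, prefix)

    visit(ast.parse(source), "")
    return hits


results = search_based_definitions(SOURCE, "speak")
for qualified, line in results:
    print(f"{qualified} (line {line})")
```

Both `Cat.speak` and `Dog.speak` come back as candidates. A precise approach would use scoping information to narrow this down to the definition that is actually reachable from a given reference.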
Scope graphs provide a new approach to defining the name binding rules of programming languages. A scope graph represents the name binding facts of a program using the basic concepts of declarations and reference associated with scopes that are connected by edges. Name resolution is defined by searching for paths from references to declarations in a scope graph. Scope graph diagrams provide an illuminating visual notation for explaining the bindings in programs.
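The "search for paths from references to declarations" idea can be sketched as a toy in Python. This is a heavily simplified illustration, not the scope graph formalism from the paper and not the stack-graphs crate's API: the `Scope` class and `resolve` function are hypothetical, edges are unlabelled parent links, and there is no handling of imports, visibility rules, or the symbol stacks that stack graphs add.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Scope:
    """A toy scope: local declarations plus edges to enclosing (parent) scopes."""
    name: str
    declarations: dict = field(default_factory=dict)  # identifier -> declaration label
    parents: list = field(default_factory=list)       # edges to enclosing scopes


def resolve(reference, start):
    """Breadth-first search from the reference's scope along parent edges.

    Returns the label of the nearest declaration of `reference`, or None
    if no path from the starting scope reaches a matching declaration.
    """
    queue = deque([start])
    seen = {id(start)}
    while queue:
        scope = queue.popleft()
        if reference in scope.declarations:
            return scope.declarations[reference]
        for parent in scope.parents:
            if id(parent) not in seen:
                seen.add(id(parent))
                queue.append(parent)
    return None


# A module scope with two declarations, and a function scope nested inside it
# that shadows `x`.
module = Scope("module", {"x": "module.x", "helper": "module.helper"})
func = Scope("func", {"x": "func.x"}, [module])

print(resolve("x", func))        # nearest declaration wins: func.x
print(resolve("helper", func))   # found by following the edge to the module scope
print(resolve("missing", func))  # no path to a declaration: None
```

Because resolution follows edges outward from the reference's own scope, the shadowing declaration in `func` is found before the one in `module`, which is the kind of result a name-only search cannot provide.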
Improving GPT-4's codebase understanding with ctags
I understand this not being a current priority, but discounting the concept entirely (particularly without reasoning beyond what seems to be personal opinion) seems counterproductive to getting the best agent/outcome here.
Further to this, aider just set a new SOTA and topped the SWE-bench Lite leaderboard with 26.3%. While that performance gain can't all be attributed to their smart code search/repo map, I would happily bet that it helped:
Aider scored 26.3% on the SWE Bench Lite benchmark, achieving a state-of-the-art result. The current top leaderboard entry is 20.3% from Amazon Q Developer Agent. The best result reported elsewhere seems to be 25% from OpenDevin.
It would be interesting to see how aider's existing repo map compares with stack graphs or similar approaches, and whether using them would improve performance on SWE-bench Lite even further.