Feature/tree sitter #3306

Draft
sdottaka wants to merge 104 commits into master from feature/tree-sitter

@sdottaka sdottaka commented Apr 6, 2026

Member

No description provided.

Thorium and others added 9 commits March 26, 2026 19:49
Integrate tree-sitter as an optional syntax highlighting engine that
supplements the existing keyword-based CrystalEdit parsers. When a
grammar DLL and highlight query (.scm) are present in the
TreeSitterGrammars directory, tree-sitter provides full AST-based
highlighting; otherwise the existing parser runs unchanged.

Core components:
- TreeSitterParser.h/.cpp: CTreeSitterParser, CTreeSitterColorMap,
  CTreeSitterLanguage, and TreeSitterRegistry classes
- ParseLine virtual override in CMergeEditView for tree-sitter results
- Incremental parsing via ts_tree_edit() on each edit operation
- Lazy reparse with dirty flag (fires once per paint cycle)
- Status bar indicator showing [TS:language] in encoding pane
- Post-build step to copy grammar DLLs from Release to Debug/Test
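
The lazy-reparse bullet above can be pictured with a small sketch. This is not the PR's `CTreeSitterParser`; the class and method names (`LazyParser`, `NotifyEdit`, `EnsureParsed`) are hypothetical, and the real parse is replaced by a counter so the example compiles without tree-sitter.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a parser that reparses lazily.
// Names are illustrative and are not taken from the PR.
class LazyParser
{
public:
    // Called on every edit: only records that the cached result is stale.
    void NotifyEdit(const std::string& newText)
    {
        m_text = newText;
        m_dirty = true;
    }

    // Called from the paint path: reparses at most once per batch of edits.
    void EnsureParsed()
    {
        if (!m_dirty)
            return;
        ++m_parseCount;                 // placeholder for the real parse
        m_parsedLength = m_text.size();
        m_dirty = false;
    }

    int ParseCount() const { return m_parseCount; }

    // Only meaningful once EnsureParsed() has run for the latest edit.
    std::size_t ParsedLength() const
    {
        assert(!m_dirty);
        return m_parsedLength;
    }

private:
    std::string m_text;
    std::size_t m_parsedLength = 0;
    int m_parseCount = 0;
    bool m_dirty = false;
};
```

With this shape, three quick edits followed by two paint calls cost one parse, which is presumably the point of the "fires once per paint cycle" wording.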

Supported languages: bash, c, c-sharp, cpp, css, dtd, flow, fsharp,
fsharp_signature, go, html, java, javascript, json, php, php_only,
python, ruby, rust, tsx, typescript, xml.

Grammar DLLs are built separately via build-grammars.ps1.
- build-grammars.ps1: downloads and compiles grammar DLLs from GitHub
  releases using MSVC cl.exe/link.exe
- grammars.json: defines 17 grammar repos and release tags
- fsharp-highlights.scm: F# syntax highlight queries for tree-sitter
Wire in scope-aware highlighting (locals.scm) and language injection
(injections.scm) alongside the existing highlights.scm support.

- CTreeSitterLanguage: add LoadQuery() helper, load all three .scm files
- CTreeSitterParser: add RunLocalsQuery() for scope/def/ref tracking,
  RunInjectionQuery() for embedded language highlighting, GetSetProperty()
  for #set! predicate parsing; RunHighlightQuery() cross-references locals
- TreeSitterRegistry: add GetLanguageForName() for injection language lookup
- build-grammars.ps1: resolve and copy locals.scm and injections.scm files
- Fix type mismatch (RefInfo vs PendingRef) and remove dead code
- Add tree-sitter shared items to solution and projects
- Update SampleStatic project to include tree-sitter
- Fix build-grammars.ps1 to use Git Bash explicitly
- Add missing <algorithm> include
- Minor solution cleanup and add Italian translation
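
As an illustration of the scope/def/ref tracking that `locals.scm` enables, here is a minimal sketch of resolving a reference to its nearest enclosing definition. It assumes scopes and definitions have already been collected from the `@local.scope` and `@local.definition` captures; the types and function names (`Scope`, `Definition`, `InnermostScope`, `Resolve`) are hypothetical and do not reflect how `RunLocalsQuery()` is written.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical records collected from a locals query.
struct Scope { unsigned startByte; unsigned endByte; int parent; }; // parent == -1 for the root
struct Definition { std::string name; int scope; unsigned byte; };

// Index of the smallest scope containing `byte`, or -1 if none does.
inline int InnermostScope(const std::vector<Scope>& scopes, unsigned byte)
{
    int best = -1;
    for (std::size_t i = 0; i < scopes.size(); ++i)
    {
        const Scope& s = scopes[i];
        if (byte < s.startByte || byte >= s.endByte)
            continue;
        if (best < 0 ||
            (s.endByte - s.startByte) < (scopes[best].endByte - scopes[best].startByte))
            best = static_cast<int>(i);
    }
    return best;
}

// Walk outward from the innermost scope until a definition with that name is found.
inline const Definition* Resolve(const std::vector<Scope>& scopes,
                                 const std::vector<Definition>& defs,
                                 const std::string& name, unsigned refByte)
{
    for (int s = InnermostScope(scopes, refByte); s >= 0; s = scopes[s].parent)
    {
        assert(static_cast<std::size_t>(s) < scopes.size());
        for (const Definition& d : defs)
            if (d.scope == s && d.name == name)
                return &d;
    }
    return nullptr;
}
```

A reference inside a nested scope resolves to the inner definition first and falls back to outer scopes only when the inner scope has no match.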
* fix: bundle inherited tree-sitter queries for grammars

Agent-Logs-Url: https://github.com/Thorium/winmerge/sessions/234ce03d-a145-4b8c-b4c2-37eed3e33cf0

Co-authored-by: Thorium <229355+Thorium@users.noreply.github.com>

* refine tree-sitter query bundling helpers

Agent-Logs-Url: https://github.com/Thorium/winmerge/sessions/234ce03d-a145-4b8c-b4c2-37eed3e33cf0

Co-authored-by: Thorium <229355+Thorium@users.noreply.github.com>

* polish tree-sitter query bundle handling

Agent-Logs-Url: https://github.com/Thorium/winmerge/sessions/234ce03d-a145-4b8c-b4c2-37eed3e33cf0

Co-authored-by: Thorium <229355+Thorium@users.noreply.github.com>

* Earlier CoPilot feedback addressed.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Thorium <229355+Thorium@users.noreply.github.com>
Comment thread Src/TreeSitterParser.cpp Fixed
Comment thread Externals/tree-sitter/lib/include/tree_sitter/api.h Fixed
sdottaka and others added 9 commits April 12, 2026 07:03
* Doc - Italian language - Updated (#3319)

* Update Italian.po

* Fix issue #3321: [BUG] Incorrect string used with beta releases

* Show error message when entering path in header bar (#3322)

* Prioritize explicitly selected plugins over archive detection (#3324)

* Prioritize explicitly selected plugins over archive detection

* Update Src/7zCommon.cpp

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update Src/7zCommon.cpp

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Use 7-Zip IsArc API for archive detection and refactor format guessing logic (#3323)

* Use 7-Zip IsArc API for archive detection and refactor format guessing logic

* Update ArchiveSupport/Merge7z/Merge7zCommon.cpp

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Restore extension-only fallback in GuessFormatEx and handle NEED_MORE result

Agent-Logs-Url: https://github.com/WinMerge/winmerge/sessions/47af4d0f-fc0a-4e33-ab81-8ec95c0f599e

Co-authored-by: sdottaka <98126+sdottaka@users.noreply.github.com>

* Use 7-Zip IsArc API for archive detection and refactor format guessing logic (2)

* Use 7-Zip IsArc API for archive detection and refactor format guessing logic (3)

* Prioritize explicitly selected plugins over archive detection

* Update Src/7zCommon.cpp

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update Src/7zCommon.cpp

* Update Merge7zCommon.cpp

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: sdottaka <98126+sdottaka@users.noreply.github.com>

* Merge7z: Bump revision to 2600.1

* Merge7z: Bump revision to 2600.1 (2)

* Update French Manual (#3325)

* Refactor: unify open parameters and move recurse to OpenFolderParams (#3326)

* Update Manual/French.po

* Refactor: unify open parameters and move recurse to OpenFolderParams (#3326) (2)

(cherry picked from commit 83af229)

* Add Folder comparison mode with archive extraction support (#3320)

* Update Manual/French.po

* Update Brazilian.po (#3328)

Added translation for "Add Folder comparison mode with archive extraction support (#3320)"

* Update German.po (#3329)

* update zh-cn translation (#3331)

* Update Turkish.po (#3333)

New string entries

* Update Korean (#3334)

* Code review fixes for 5 oldest source files#3327 #1

* Code review fixes for 5 oldest source files#3327 #2

* Update Turkish.po

* Update TranslationsStatus

* Update ChangeLog&ReleaseNotes

* Italian language (#3335)

* Stabilize tree-sitter highlight precedence

Make overlapping captures resolve deterministically so syntax colors stay consistent across panes and languages. Also accept local.* capture prefixes so newer query conventions keep local symbol highlighting working.

* Unify tree-sitter block ordering

Use one parser-wide block order counter so injected-language highlights cannot collide with primary highlight ordering when the final precedence tie-breaker runs.
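
The two commits above describe a precedence rule with a final tie-breaker. A minimal sketch of that idea follows, assuming a block record with the `nCharPos`, `nColorIndex`, `nPriority`, and `nOrder` fields that appear in the snippet quoted later in this PR. The `Block` layout, the `BlockOrder` class, and the exact comparison (position, then higher priority, then earlier order) are assumptions, not the PR's code.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical block record; field names follow the quoted snippet,
// the layout itself is assumed.
struct Block
{
    int nCharPos;
    int nColorIndex;
    int nPriority;
    unsigned nOrder;
};

// One counter shared by primary and injected highlights, so nOrder is unique.
class BlockOrder
{
public:
    unsigned Next()
    {
        assert(m_next != std::numeric_limits<unsigned>::max());
        return m_next++;
    }
private:
    unsigned m_next = 0;
};

// Sort by position, then priority (higher first), then insertion order.
// Because nOrder is unique, no two blocks compare equal, so the result
// does not depend on whether the sort is stable.
inline void SortBlocks(std::vector<Block>& blocks)
{
    std::sort(blocks.begin(), blocks.end(), [](const Block& a, const Block& b) {
        if (a.nCharPos != b.nCharPos) return a.nCharPos < b.nCharPos;
        if (a.nPriority != b.nPriority) return a.nPriority > b.nPriority;
        return a.nOrder < b.nOrder;
    });
}
```

If primary and injected highlights each kept their own counter, two blocks could share the same `nOrder` and the last comparison would no longer separate them, which is the collision the second commit message describes.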

---------

Co-authored-by: bovirus <1262554+bovirus@users.noreply.github.com>
Co-authored-by: Takashi Sawanaka <sdottaka@users.sourceforge.net>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: sdottaka <98126+sdottaka@users.noreply.github.com>
Co-authored-by: t3chnob0y <t3chnob0y@users.noreply.github.com>
Co-authored-by: Marcellomco <70959309+Marcellomco@users.noreply.github.com>
Co-authored-by: René T. Nicolaus <12006431+Havoc7891@users.noreply.github.com>
Co-authored-by: YG <1246410+yingang@users.noreply.github.com>
Co-authored-by: bilimiyorum <131397022+bilimiyorum@users.noreply.github.com>
Co-authored-by: VenusGirl❤ <venusgirl@outlook.com>
* Finish tree-sitter runtime integration for compare views

Wire the runtime grammar bundle, compare-view UI, and same-file navigation together so tree-sitter features are actually available in built binaries. This also updates the F# grammar bundle to include tags and disables Go to Definition when the current caret position cannot resolve.

* Fix tree-sitter follow-up packaging issues

Guard the WiX grammar component reference when harvested files are absent, and remove the redundant TreeSitterWrapper include to avoid the _T macro redefinition warning.
# Conflicts:
#	ArchiveSupport/Merge7z/BuildArc.cmd
#	Docs/Users/ChangeLog.html
#	Docs/Users/ChangeLog.md
#	Docs/Users/ReleaseNotes.html
#	Docs/Users/ReleaseNotes.md
#	DownloadDeps.cmd
#	Src/FilepathEdit.cpp
#	Src/Merge.vcxproj.filters
#	Src/res/new_folder.bmp
#	Translations/TranslationsStatus.md
#	Translations/WinMerge/Arabic.po
#	Translations/WinMerge/Basque.po
#	Translations/WinMerge/Brazilian.po
#	Translations/WinMerge/Bulgarian.po
#	Translations/WinMerge/Catalan.po
#	Translations/WinMerge/ChineseSimplified.po
#	Translations/WinMerge/ChineseTraditional.po
#	Translations/WinMerge/Corsican.po
#	Translations/WinMerge/Croatian.po
#	Translations/WinMerge/Czech.po
#	Translations/WinMerge/Danish.po
#	Translations/WinMerge/Dutch.po
#	Translations/WinMerge/English.pot
#	Translations/WinMerge/Finnish.po
#	Translations/WinMerge/French.po
#	Translations/WinMerge/Galician.po
#	Translations/WinMerge/German.po
#	Translations/WinMerge/Greek.po
#	Translations/WinMerge/Hebrew.po
#	Translations/WinMerge/Hungarian.po
#	Translations/WinMerge/Italian.po
#	Translations/WinMerge/Japanese.po
#	Translations/WinMerge/Korean.po
#	Translations/WinMerge/Lithuanian.po
#	Translations/WinMerge/Norwegian.po
#	Translations/WinMerge/Persian.po
#	Translations/WinMerge/Polish.po
#	Translations/WinMerge/Portuguese.po
#	Translations/WinMerge/Romanian.po
#	Translations/WinMerge/Russian.po
#	Translations/WinMerge/Serbian.po
#	Translations/WinMerge/Sinhala.po
#	Translations/WinMerge/Slovak.po
#	Translations/WinMerge/Slovenian.po
#	Translations/WinMerge/Spanish.po
#	Translations/WinMerge/Swedish.po
#	Translations/WinMerge/Tamil.po
#	Translations/WinMerge/Turkish.po
#	Translations/WinMerge/Ukrainian.po
#	Translations/WinMerge/Vietnamese.po
…s and FolderCompare projects are not yet buildable. MFC dependencies still need to be removed from TreeSitterParser.
* Fix tree-sitter go to definition from context menus

Update right-click navigation to resolve the symbol under the mouse and prefer tagged type definitions when the position-based lookup stays on the current line.

* Update tree-sitter context-menu definition handling
# Conflicts:
#	Src/Merge.vcxproj
#	Src/MergeDoc.cpp
#	Src/MergeDoc.h

@github-advanced-security github-advanced-security AI left a comment

Contributor

CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

sdottaka added 10 commits May 31, 2026 05:41
Replace ITextBuffer* parameter in NotifyEdit with TextEdit struct.
Move notification to buffer layer (AddUndoRecord) for consistency.
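
To show why a self-contained edit description is enough for incremental parsing, here is a sketch that turns an edit record into the byte and point ranges that tree-sitter's `ts_tree_edit()` expects in a `TSInputEdit`. The `InputEdit` and `Point` structs are local mirrors of `TSInputEdit` and `TSPoint` so the example builds without the library. `TextEditSketch` is hypothetical (the fields of the PR's `TextEdit` struct are not shown on this page), and the sketch assumes a single-line edit on UTF-16 text.

```cpp
#include <cassert>
#include <cstdint>

// Local mirrors of tree-sitter's TSPoint and TSInputEdit (same field names).
struct Point { uint32_t row; uint32_t column; };
struct InputEdit
{
    uint32_t start_byte, old_end_byte, new_end_byte;
    Point start_point, old_end_point, new_end_point;
};

// Hypothetical edit record: replace oldLength UTF-16 units at (row, col)
// with newLength units, all on one line. lineStartUnit is the offset of the
// line's first unit from the start of the document.
struct TextEditSketch
{
    uint32_t row;
    uint32_t col;
    uint32_t lineStartUnit;
    uint32_t oldLength;
    uint32_t newLength;
};

inline InputEdit ToInputEdit(const TextEditSketch& e)
{
    const uint32_t kBytesPerUnit = 2; // assumption: UTF-16 input
    InputEdit r;
    r.start_byte = (e.lineStartUnit + e.col) * kBytesPerUnit;
    r.old_end_byte = r.start_byte + e.oldLength * kBytesPerUnit;
    r.new_end_byte = r.start_byte + e.newLength * kBytesPerUnit;
    r.start_point = {e.row, e.col * kBytesPerUnit};
    r.old_end_point = {e.row, (e.col + e.oldLength) * kBytesPerUnit};
    r.new_end_point = {e.row, (e.col + e.newLength) * kBytesPerUnit};
    assert(r.start_byte <= r.old_end_byte && r.start_byte <= r.new_end_byte);
    return r;
}
```

Because the record carries everything needed to compute the ranges, the notification no longer needs to reach back into the buffer, which fits the move to the buffer layer described above.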
- Move TreeSitterParser and TreeSitterWrapper from Externals/crystaledit/editlib to Src/
- Move tree-sitter library from Externals/crystaledit/editlib/ to Externals/ (top-level)
- Remove TreeSitter references from editlibparsers.vcxitems (CrystalEdit shared items)
- Update include paths in WinMerge source files to reference local TreeSitter headers
- Update project files and solution configuration

This decouples tree-sitter from CrystalEdit, making CrystalEdit a pure text editor
library while keeping tree-sitter as a WinMerge-specific feature.
…esign

Remove stored buffer reference from CTreeSitterParser and pass ITextBuffer*
explicitly to methods that need it. This eliminates hidden state and makes
buffer dependencies explicit at call sites.

Changes:
- Remove m_pBuffer, SetBuffer(), and GetBuffer() from CTreeSitterParser
- Add ITextBuffer* parameter to FindDefinition() and TryGetTagDefinitionByNameAt()
- Introduce TreeSitterParseContext struct to hold both parser and buffer references
- Update MergeDoc to create and own TreeSitterParseContext instances
- Update ParseLineTreeSitter() to use context for lazy reparse with explicit buffer
- Update all call sites in MergeEditView to pass buffer parameter
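
The refactoring listed above can be sketched as follows. All types here are simplified, hypothetical stand-ins (`TextBufferSketch`, `ParserSketch`, `ParseContextSketch`) for `ITextBuffer`, `CTreeSitterParser`, and `TreeSitterParseContext`, and the "definition lookup" is a plain substring search used only to show the calling convention.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical minimal buffer; the PR's ITextBuffer is richer.
struct TextBufferSketch
{
    std::vector<std::string> lines;
};

// Holds no buffer pointer: every call names the buffer it reads.
class ParserSketch
{
public:
    // Returns the zero-based index of the first line containing `name`, or -1.
    int FindDefinition(const TextBufferSketch* pBuffer, const std::string& name) const
    {
        assert(pBuffer != nullptr);
        for (std::size_t i = 0; i < pBuffer->lines.size(); ++i)
            if (pBuffer->lines[i].find(name) != std::string::npos)
                return static_cast<int>(i);
        return -1;
    }
};

// Owned by the document; pairs a parser with the buffer it should read.
struct ParseContextSketch
{
    ParserSketch* pParser;
    const TextBufferSketch* pBuffer;
};
```

The call site then reads `ctx.pParser->FindDefinition(ctx.pBuffer, name)`, so the buffer a lookup uses is visible where the call is made rather than hidden in parser state.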
Keep only the highest priority highlight when multiple captures match
the same token range, preventing conflicting color indices.
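
One way to keep a single highlight per token range is sketched below. The `Capture` struct and `KeepHighestPriority` function are hypothetical, and the rule that ties keep the first capture seen is an assumption, not something this page states.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical capture record: a byte range, a color, and a priority.
struct Capture
{
    unsigned startByte;
    unsigned endByte;
    int colorIndex;
    int priority;
};

// Keep one capture per [startByte, endByte) range: the highest priority wins.
// Ties keep the first capture seen (assumption). Output is ordered by range.
inline std::vector<Capture> KeepHighestPriority(const std::vector<Capture>& captures)
{
    std::map<std::pair<unsigned, unsigned>, Capture> best;
    for (const Capture& c : captures)
    {
        assert(c.startByte <= c.endByte);
        const auto key = std::make_pair(c.startByte, c.endByte);
        auto it = best.find(key);
        if (it == best.end())
            best.emplace(key, c);
        else if (c.priority > it->second.priority)
            it->second = c;
    }

    std::vector<Capture> result;
    result.reserve(best.size());
    for (const auto& entry : best)
        result.push_back(entry.second);
    return result;
}
```

Each range ends up with exactly one color index, so the painter never has to choose between conflicting blocks for the same token.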

Copilot AI left a comment

Contributor

Pull request overview

This PR integrates tree-sitter as an optional/alternative syntax parsing backend in WinMerge (alongside the existing CrystalEdit line-based parsers), adds runtime-loaded grammar/query support, introduces a Tree-sitter mode option in Editor settings, and wires up Go to Definition (F12) using tree-sitter tags/locals.

Changes:

  • Add a new CTreeSitterParser implementation with registry/grammar DLL loading, highlight caching, locals/tags resolution, and optional injection highlighting.
  • Add UI/config plumbing for selecting Tree-sitter preference order and register parser factories accordingly.
  • Package TreeSitter grammar/query assets in build scripts/installer and add new language IDs / color scheme strings for additional formats.

Reviewed changes

Copilot reviewed 39 out of 40 changed files in this pull request and generated 18 comments.

Show a summary per file
| File | Description |
| --- | --- |
| WinMerge.vs2017.sln | Adds tree-sitter shared-items project and SharedMSBuildProjectFiles entries (currently with a path mismatch). |
| WinMerge.sln | Adds tree-sitter shared-items project and SharedMSBuildProjectFiles entries. |
| Translations/WinMerge/StringBlacklist.txt | Adds newly supported language names to translation blacklist. |
| Testing/GoogleTest/UnitTests/UnitTests.vcxproj.filters | Includes Src\TreeSitterParser.cpp in UnitTests project filters. |
| Testing/GoogleTest/UnitTests/UnitTests.vcxproj | Imports tree-sitter shared items and compiles TreeSitterParser.cpp in UnitTests project. |
| Testing/FolderCompare/FolderCompare.vcxproj.filters | Includes Src\TreeSitterParser.cpp in FolderCompare test project filters. |
| Testing/FolderCompare/FolderCompare.vcxproj | Imports tree-sitter shared items and compiles TreeSitterParser.cpp in FolderCompare test project. |
| Src/TreeSitterParser.h | Introduces public API for tree-sitter language loading, color mapping, parser, and registry/factory. |
| Src/TreeSitterParser.cpp | Implements tree-sitter parsing/highlighting, locals/tags, injections, caching, and registry lazy-loading. |
| Src/resource.h | Adds Tree-sitter UI control/menu IDs and expands color-scheme IDs for new languages. |
| Src/PropEditor.h | Adds m_nTreeSitterMode option field to editor options panel. |
| Src/PropEditor.cpp | Binds OPT_TREE_SITTER_MODE and populates Tree-sitter mode combo (contains a “Built-in” typo). |
| Src/OptionsInit.cpp | Initializes default value for OPT_TREE_SITTER_MODE. |
| Src/OptionsDef.h | Defines OPT_TREE_SITTER_MODE. |
| Src/MergeEditView.h | Adds tree-sitter parser member + Go to Definition hooks. |
| Src/MergeEditView.cpp | Wires Go to Definition command + incremental edit notifications for the on-demand parser. |
| Src/Merge.vcxproj.filters | Adds TreeSitterParser source/header to Merge project filters. |
| Src/Merge.vcxproj | Imports tree-sitter shared items and compiles TreeSitterParser. |
| Src/Merge.rc | Adds Go to Definition menu item + accelerator, editor option UI controls, and new strings (contains a “Built-in” typo). |
| Src/Merge.cpp | Registers parser factories in preferred order based on OPT_TREE_SITTER_MODE (missing default case). |
| Installer/InnoSetup/WinMergeX86.iss | Installs TreeSitterGrammars directory into {app}. |
| Installer/InnoSetup/WinMergeX64NonAdmin.iss | Installs TreeSitterGrammars directory into {app}. |
| Installer/InnoSetup/WinMergeX64.iss | Installs TreeSitterGrammars directory into {app}. |
| Installer/InnoSetup/WinMergeX64.is6.iss | Installs TreeSitterGrammars directory into {app}. |
| Installer/InnoSetup/WinMergeARM64.is6.iss | Installs TreeSitterGrammars directory into {app}. |
| Externals/versions.txt | Documents tree-sitter and included grammar versions. |
| Externals/tree-sitter.vcxitems | Adds shared-items build of tree-sitter lib.c + include paths. |
| Externals/crystaledit/editlib/TextDefinition.h | Adds new LanguageId entries (F# signature, Markdown, TSX, TypeScript, YAML). |
| Externals/crystaledit/editlib/TextDefinition.cpp | Adds new TextDefinitions and adjusts JS/TS extension mapping. |
| Externals/crystaledit/editlib/editlib.vcxitems.filters | Normalizes formatting (line-numbered diff) without functional changes. |
| DownloadDeps.cmd | Downloads tree-sitter grammar packs and stages TreeSitterGrammars into Build output. |
| Docs/Users/Contributors.txt | Adds tree-sitter and grammar projects to external components list. |
| BuildArc.cmd | Packages TreeSitterGrammars into distribution ZIP layout. |
| ALL.vs2017.sln | Adds tree-sitter shared-items project + SharedMSBuildProjectFiles entries (currently with a path mismatch). |
| ALL.sln | Adds tree-sitter shared-items project + SharedMSBuildProjectFiles entries. |
| .gitmodules | Adds tree-sitter and tree-sitter-grammars submodules. |
| .github/workflows/main.yml | Enables core.longpaths before recursive submodule checkout. |
| .github/workflows/codeql-analysis.yml | Enables core.longpaths before recursive submodule checkout. |
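
The summary for Src/Merge.cpp flags a missing default case when registering parser factories from `OPT_TREE_SITTER_MODE`. A minimal sketch of a switch that handles unexpected values is shown here. The enum values and the `FactoryOrder` function are hypothetical, since the actual option values and registration calls are not shown on this page; strings stand in for the factories.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical mode values; the PR's real values are not shown on this page.
enum class TreeSitterMode { Disabled = 0, PreferTreeSitter = 1, PreferBuiltIn = 2 };

// Returns the parser factories to register, in preference order.
inline std::vector<std::string> FactoryOrder(int mode)
{
    std::vector<std::string> order;
    switch (static_cast<TreeSitterMode>(mode))
    {
    case TreeSitterMode::PreferTreeSitter:
        order = {"tree-sitter", "built-in"};
        break;
    case TreeSitterMode::PreferBuiltIn:
        order = {"built-in", "tree-sitter"};
        break;
    case TreeSitterMode::Disabled:
        order = {"built-in"};
        break;
    default:
        // Unknown or corrupted setting: fall back to the built-in parser only.
        order = {"built-in"};
        break;
    }
    assert(!order.empty()); // every path registers at least one factory
    return order;
}
```

With the default branch in place, a stale or hand-edited option value still leaves the editor with a working parser instead of registering nothing.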

Comment thread Src/TreeSitterParser.cpp
Comment thread Src/TreeSitterParser.cpp Outdated
Comment thread Src/TreeSitterParser.cpp Outdated
Comment on lines +1336 to +1357
uint32_t parentStartRow = inj.startPoint.row + capStart.row;
uint32_t parentEndRow = inj.startPoint.row + capEnd.row;
uint32_t parentStartCol = (capStart.row == 0)
    ? inj.startPoint.column + capStart.column
    : capStart.column;

// Add to parent's line blocks
for (uint32_t row = parentStartRow;
     row <= parentEndRow && row < static_cast<uint32_t>(m_nLineCount);
     row++)
{
    uint32_t byteCol = (row == parentStartRow) ? parentStartCol : 0;
    int charPos = byteCol / sizeof(wchar_t);

    TreeSitterLineBlock block;
    block.nCharPos = charPos;
    block.nColorIndex = colorIndex;
    block.nPriority = MakeCapturePriority(sCapName,
        ts_node_start_byte(capNode), ts_node_end_byte(capNode));
    block.nOrder = NextBlockOrder();
    m_lineBlocks[row].push_back(block);
}
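
The coordinate arithmetic in the quoted lines can be isolated into two small helpers, sketched here. `Point` is a local mirror of tree-sitter's `TSPoint`; the helper names are hypothetical. Two assumptions are made: capture positions are reported relative to the start of the injected text (which the quoted addition of `inj.startPoint` suggests), and the text is fed to the parser as UTF-16, so a byte column divides evenly by two.

```cpp
#include <cassert>
#include <cstdint>

// Local mirror of tree-sitter's TSPoint: a row and a byte column.
struct Point { uint32_t row; uint32_t column; };

// Translate a point relative to an injected range into parent coordinates.
// Only the first row of the injection is shifted by the start column; later
// rows begin at column 0 of the parent line.
inline Point ToParentPoint(Point injStart, Point local)
{
    Point p;
    p.row = injStart.row + local.row;
    p.column = (local.row == 0) ? injStart.column + local.column : local.column;
    return p;
}

// Convert a byte column to a UTF-16 code-unit index (assumes UTF-16 input).
inline int ByteColumnToUtf16Index(uint32_t byteColumn)
{
    const uint32_t kBytesPerUnit = 2; // equals sizeof(wchar_t) on Windows
    assert(byteColumn % kBytesPerUnit == 0);
    return static_cast<int>(byteColumn / kBytesPerUnit);
}
```

The quoted loop computes only a start column per row. Whether an end column is also needed for multi-line captures is not visible from the lines shown here.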
Comment thread Src/TreeSitterParser.cpp Outdated
Comment thread Src/Merge.cpp
Comment thread WinMerge.vs2017.sln
Comment thread ALL.vs2017.sln
Comment thread ALL.vs2017.sln
Comment thread ALL.vs2017.sln
Comment thread ALL.vs2017.sln

4 participants