
Add interactive segmentation, stereo, and multi-polygon support #1582

Merged
BryonLewis merged 44 commits into main from dev/add-interactive-seg-and-stereo on Jun 27, 2026

@mattdawkins mattdawkins commented Jan 26, 2026

Member

Interactive segmentation (desktop)

  • New segmentationpointclick.ts recipe: point-click SAM-style masks, multi-frame support, video frame-time seek, confirm/reset/undo
  • New SegmentationPointsLayer for fg/bg prompt dots
  • Desktop native backend: unified interactive.ts Python subprocess for seg + stereo; IPC in ipcService.ts / api.ts
  • Gated to desktop runtime; loading indicator + instructions in UI
  • Editor menu Segment button (s hotkey), loading state, type-specific edit icons

Interactive stereo (desktop)

  • Stereo line transfer, length measurement, dense disparity via same interactive service
  • Warped lines become normal editable line annotations; human edits locked from auto re-warp
  • Head/tail auto-lines keyed under HeadTailLineKey
  • Two settings in TrackSettingsPanel: auto-compute on draw, auto-update lengths on edit
  • Bottom sidebar controls for interactive stereo
  • Line box aspect ratio capped at 6:1

Multicam

  • One-click cross-camera detection select/edit
  • New detection creation on any camera without breaking flow
  • LayerManager / ViewerLoader wiring for stereo track linking and warping

Annotation lifecycle

  • Finalize in-progress shape when starting new detection
  • Skip recipe confirm when nothing pending
  • Remove empty detections when nothing drawn
  • Restore prior polygon on segmentation reset
  • No new detection on background segmentation clicks; continuous mode spawns per click
  • Fix stereo track ID race; re-arm recipe on Point re-edit

Polygon / editing

  • Multi-polygon support with holes in polygonbase.ts / PolygonLayer
  • EditAnnotationLayer expanded for segmentation mask display and editing
  • Custom image cursor during annotation (AnnotatorImageCursor, useAnnotatorImageCursor)

Serialization / server

  • VIAME CSV read/write for stereo measurements and segmentation metadata (viame.ts client + server)
  • New tests in test_serialize_viame_csv.py; updated viame.spec.json

Dev / build

  • dev:electron support for seg + stereo in development
  • build:electron:dir npm script for unpacked desktop build
  • Removed NativeVideoAnnotator and text-query button (moved to separate branch)
  • apispec.ts types for seg/stereo requests

Documentation updates

  • New guide: docs/Interactive-Annotation.md — desktop interactive segmentation and stereo (setup, usage, troubleshooting)
  • User docs updated: feature table, shortcuts, quickstart, editing bar, multicam, desktop, data formats, FAQ, UI overview, screenshots, and mkdocs.yml nav
  • Developer docs updated: client/README.md, client/platform/desktop/README.md, and architecture doc for interactive service, recipes, and dev/build scripts
  • New screenshots: SegmentationMode.png, StereoSettings.png, CrossCameraEdit.png (UI mockups; swap for real captures when you can)
  • Main topics covered: point-click segmentation (s), stereo auto-warp/length settings, cross-camera editing, VIAME CSV polygons/holes/length export

@mattdawkins mattdawkins force-pushed the dev/add-interactive-seg-and-stereo branch from ca7e45e to 974ecab Compare April 29, 2026 10:03
@mattdawkins mattdawkins force-pushed the dev/add-interactive-seg-and-stereo branch 2 times, most recently from 2c52f00 to b55bd20 Compare April 29, 2026 10:39
@mattdawkins mattdawkins force-pushed the dev/add-interactive-seg-and-stereo branch from b55bd20 to 7858f2f Compare June 21, 2026 16:31
@mattdawkins mattdawkins changed the title Add interactive segmentation and text query features Add interactive segmentation, stereo, and multi-polygon support Jun 21, 2026
Brings in the SAM2/SAM3-based interactive segmentation feature, the
SAM3 text-query workflow, and the desktop interactive stereo mode.
Web-girder paths are intentionally untouched for now — web support
will come in a follow-up.

- New segmentation point-click recipe + EditorMenu wiring; SAM2/SAM3
  models loaded via VIAME install configs.
- Desktop backend: viame_segmentation_service-backed IPC handlers and
  matching frontend API for segmentationInitialize/Predict/SetImage/
  ClearImage/Shutdown/IsReady, textQuery/refineDetections/
  runTextQueryPipeline, and stereoEnable/Disable/SetFrame/GetStatus/
  TransferLine/TransferPoints/SetCalibration/IsEnabled, plus disparity
  ready/error event hooks.
- EditAnnotationLayer: track shift-key state and right-click for Point
  mode, propagate background flag for negative SAM points.
- Sidebar / ViewerLoader / Viewer: stereo annotation mode UI, error
  dialog when seg or text-query model fails to load,
  dot-only-on-source-frame fix.
- useModeManager / EditAnnotationLayer / recipes: keep existing geometry
  type when current editing mode already matches; right-click in Point
  creation finalises and deselects.
A track-frame's polygon now expands to a list of polygons each with
their own keys, and each polygon supports holes.

- Server CSV (de)serializer: emit polygon-key column per polygon, support
  holes in the geoJSON FeatureCollection; auto_key path to append a new
  polygon to an existing track frame.
- Client recipes / useModeManager: handleAddHole / handleAddPolygon /
  handleCancelCreation; PolygonLayer emits polygon-clicked.
- Hole drawing reuses the polygon edit pipeline (left-click places a
  hole vertex without exiting creation mode).
- Test fixtures cover multi-polygon and polygons-with-holes round-trip.
(cherry picked from commit c2f3cd0)
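
A minimal sketch of how one keyed polygon with holes could be represented as a GeoJSON feature. GeoJSON itself defines a Polygon as a list of closed rings where the first ring is the exterior and any further rings are holes; the `properties.key` field and the helper names `closeRing` and `makePolygonFeature` are assumptions for illustration, not the identifiers used in polygonbase.ts.

```typescript
type Position = [number, number];

interface PolygonFeature {
  type: 'Feature';
  properties: { key: string }; // assumed location of the per-polygon key
  geometry: { type: 'Polygon'; coordinates: Position[][] };
}

// GeoJSON rings must be closed: the last position repeats the first.
function closeRing(ring: Position[]): Position[] {
  if (ring.length === 0) return ring;
  const [fx, fy] = ring[0];
  const [lx, ly] = ring[ring.length - 1];
  if (fx === lx && fy === ly) return ring;
  const first: Position = [fx, fy];
  return [...ring, first];
}

// First ring is the exterior; every following ring is a hole.
function makePolygonFeature(
  key: string,
  exterior: Position[],
  holes: Position[][] = [],
): PolygonFeature {
  return {
    type: 'Feature',
    properties: { key },
    geometry: {
      type: 'Polygon',
      coordinates: [closeRing(exterior), ...holes.map(closeRing)],
    },
  };
}
```

A track frame with several polygons would then carry one such feature per key in its FeatureCollection.
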
…tton)

Strip the SAM3 text-query button, dialog, API, and IPC handlers from the interactive editor, keeping segmentation and stereo intact. The full text-query feature lives on the follow-up branch dev/text-query-annot-button.
Strip the no-transcode NativeVideoAnnotator path that should not be on the
segmentation/stereo branch: removes the residual Viewer.vue async-component,
nativeVideoPath plumbing, and template branch, plus the stale settings field.
@mattdawkins mattdawkins force-pushed the dev/add-interactive-seg-and-stereo branch from 7858f2f to 20303f0 Compare June 22, 2026 16:08
BryonLewis and others added 8 commits June 24, 2026 08:33
In continuous Detection mode, each interactive-segmentation point click now finalizes its own detection and immediately starts a fresh one, instead of refining a single detection. Non-continuous mode is unchanged: clicks still accumulate to refine one detection until confirmed. Frame-navigation preview restores are excluded so they don't spuriously create detections.
Capture the completed track ID before newTrackSettingsAfterLogic, which
in continuous detection mode spawns a new track and changes
selectedTrackId, so stereo annotation-complete events attach to the
correct detection. Also re-activate the segmentation recipe when
re-editing a finalized Point detection so clicks resume predicting.

Ported from viame/master (2cad9aa, 4008a1f).
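
The ordering fix described above can be sketched as follows. Only `newTrackSettingsAfterLogic` and `selectedTrackId` are names from the commit message; the `state` object, `finishAnnotation`, and the stand-in body of `newTrackSettingsAfterLogic` are hypothetical and exist only to make the race visible.

```typescript
const state = { selectedTrackId: 1 };

// Stand-in: in continuous mode the real function spawns a new track,
// which changes the selected track id.
function newTrackSettingsAfterLogic(continuous: boolean): void {
  if (continuous) state.selectedTrackId += 1;
}

function finishAnnotation(
  continuous: boolean,
  onStereoComplete: (trackId: number) => void,
): void {
  // Capture the id first; reading it after the call below would
  // return the freshly spawned track in continuous mode.
  const completedTrackId = state.selectedTrackId;
  newTrackSettingsAfterLogic(continuous);
  onStereoComplete(completedTrackId);
}
```
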
Add a _removeIfEmpty helper and call it from selectTrack when leaving
edit mode, so a detection created but never drawn (e.g. clicked away or
right-click deselect) doesn't linger as an empty track. handleEscapeMode
now reuses the same helper.

Ported from viame/master (4139f78).
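
A rough sketch of what an empty-detection cleanup like `_removeIfEmpty` might look like. The `Detection` shape, the `hasGeometry` flag, and the `Map`-based store are assumptions for illustration; the real helper operates on the app's own track store.

```typescript
interface Detection {
  id: number;
  features: { hasGeometry: boolean }[]; // hypothetical per-frame summary
}

// Deletes the detection if nothing was ever drawn on it.
// Returns true when the detection was removed.
function removeIfEmpty(tracks: Map<number, Detection>, id: number): boolean {
  const track = tracks.get(id);
  if (!track) return false;
  const empty = track.features.every((f) => !f.hasGeometry);
  if (empty) tracks.delete(id);
  return empty;
}
```

Calling the same helper from both the deselect path and the escape path keeps the two exits from edit mode consistent.
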
handleConfirmRecipe now returns early unless an active segmentation
recipe has a pending prediction or was explicitly reset, so the
contextmenu event from a right-click that enters Point edit mode no
longer immediately deselects/deactivates before any points are placed.
Adds a wasReset flag on the recipe to allow finalizing after a reset.

Ported from viame/master (c4d149c, 69bc0f8).
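
The guard described above reduces to a small predicate. This is a sketch under assumptions: `wasReset` is the flag named in the commit, while `SegRecipeState`, `hasPendingPrediction`, and `shouldConfirm` are hypothetical names.

```typescript
interface SegRecipeState {
  active: boolean;
  hasPendingPrediction: boolean; // hypothetical name
  wasReset: boolean;
}

// Confirm only when an active segmentation recipe has something to
// finalize: a pending prediction, or an explicit reset.
function shouldConfirm(recipe: SegRecipeState | undefined): boolean {
  if (!recipe || !recipe.active) return false;
  return recipe.hasPendingPrediction || recipe.wasReset;
}
```

With this check, the contextmenu event that arrives right after entering Point edit mode sees no pending prediction and no reset, so the confirm handler can return before deselecting.
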
Wire the edit layer's finalizeInProgress through a handler callback that
handleAddTrackOrDetection invokes, so pressing 'n' or starting a new
detection commits a valid in-progress polygon (or discards it) instead
of leaving it dangling. Also commit a pending segmentation prediction
before the track switch so a reset-on-deselect doesn't leave an empty
detection.

Ported from viame/master (842a20c, dea9653).
In continuous mode a background (negative) click is a refinement of the
current mask, not a new object, so it should no longer commit and start a
fresh detection.
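
Taken together, the continuous-mode rules from the commits above (each foreground click commits, background clicks refine, frame-navigation preview restores never create detections) could be expressed as one decision function. All names here are hypothetical; the sketch only encodes the rules as stated.

```typescript
type PredictionSource = 'click' | 'preview-restore';
type Action = 'commit-and-start-new' | 'accumulate';

function actionForPrediction(opts: {
  continuous: boolean;
  source: PredictionSource;
  background: boolean; // true for a negative (background) prompt point
}): Action {
  const commits = opts.continuous
    && opts.source === 'click'
    && !opts.background;
  return commits ? 'commit-and-start-new' : 'accumulate';
}
```
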
The reset button only restored a default-key ('') polygon, so resetting a
detection whose existing polygon was segmentation-keyed removed it
entirely. Capture the pre-existing polygon keys in the snapshot and
restore all original polygon geometry, removing only segmentation-added
polygons.

Ported from viame/master.
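
To illustrate the difference between the old and new reset behaviour, here is a small sketch. The `KeyedPolygon` shape, the function names, and the `'seg'` key used in the usage note are assumptions; only the default key `''` comes from the commit message.

```typescript
interface KeyedPolygon {
  key: string;
  coordinates: number[][][];
}

function copyPolygon(p: KeyedPolygon): KeyedPolygon {
  return {
    key: p.key,
    coordinates: p.coordinates.map((ring) => ring.map((pt) => [...pt])),
  };
}

// Taken before segmentation starts: every pre-existing polygon, any key.
function takeSnapshot(polygons: KeyedPolygon[]): KeyedPolygon[] {
  return polygons.map(copyPolygon);
}

// Old behaviour: only the default-key polygon survived a reset.
function resetDefaultKeyOnly(snapshot: KeyedPolygon[]): KeyedPolygon[] {
  return snapshot.filter((p) => p.key === '').map(copyPolygon);
}

// New behaviour: restore everything that existed before; polygons added
// by segmentation are absent from the snapshot and therefore dropped.
function resetToSnapshot(snapshot: KeyedPolygon[]): KeyedPolygon[] {
  return snapshot.map(copyPolygon);
}
```

For a detection whose only prior polygon is keyed `'seg'`, `resetDefaultKeyOnly` returns an empty list (the reported deletion), while `resetToSnapshot` returns the original polygon.
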
@mattdawkins

Member Author

Bryon — pushed 6 commits porting interactive-seg/stereo fixes that exist on viame/master but weren't on this branch (the seg/stereo work is further along there). Each commit is attributed to its master origin. Built and smoke-tested locally.

Added

  • Stereo track-ID race: capture the completed track id before newTrackSettingsAfterLogic (continuous mode swaps selectedTrackId), so stereo annotation-complete events attach to the right detection (2cad9aa1)
  • _removeIfEmpty: drop detections created but never drawn (4139f78e)
  • wasReset flag + skip-confirm guard in handleConfirmRecipe (c4d149c0, 69bc0f8c)
  • Re-arm the seg recipe when re-editing a Point detection (4008a1ff)
  • Finalize the in-progress shape on new detection / n — wires up the existing but unused finalizeInProgress (842a20c2, dea96533)
  • Continuous mode: don't spawn a new detection on background/negative clicks
  • Reset now restores the prior polygon (incl. segmentation-keyed), removing only segmentation-added polygons — the branch version restored only the default-key ('') polygon, so resetting a re-segmented detection deleted it (ported from master; kept your fishLength/attributes/stereo-line restore)

Please review — these touch your integration work:

  1. handleConfirmRecipe now returns early when no active seg recipe has a pending prediction or wasReset. That gates your onStereoSegmentationFinalize?.() and the deselect. Given your right-click→confirm-annotation change (00e42438), please confirm there's no case where you want the stereo finalize/deselect to fire on an empty confirm.
  2. The Point re-edit re-arm sets segRecipe.active.value = true directly, bypassing your async initializeServiceFn / segInitialized path. If the SAM service isn't initialized when re-editing, this skips your loading/error handling — worth a look.

Happy to adjust either if they don't match the intent.

mattdawkins and others added 9 commits June 25, 2026 00:41
ViewerLoader already builds a getFrameTime (frame/fps) and the backend
already forwards frame_time to the service, but the recipe ignored it and
never set frameTime on the predict request, so interactive segmentation
on video datasets couldn't seek to the current frame. Accept getFrameTime
in the recipe and include frameTime in the request.

Ported from viame/master (23ccb25, c363531).
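
A sketch of the plumbing this commit describes: a `getFrameTime` built from the frame rate (frame / fps) and a predict request that carries `frameTime`. Those two names come from the commit message; `PredictRequest`'s other fields, `makeGetFrameTime`, and `buildPredictRequest` are hypothetical, and the 1 = foreground / 0 = background label convention is assumed from SAM-style prompts.

```typescript
type GetFrameTime = (frame: number) => number;

// Seconds into the video for a given frame index.
function makeGetFrameTime(fps: number): GetFrameTime {
  return (frame) => frame / fps;
}

interface PredictRequest {
  points: [number, number][];
  pointLabels: number[]; // assumed: 1 = foreground, 0 = background
  frameTime?: number; // only set for video datasets
}

function buildPredictRequest(
  points: [number, number][],
  pointLabels: number[],
  frame: number,
  getFrameTime?: GetFrameTime,
): PredictRequest {
  const request: PredictRequest = { points, pointLabels };
  // Without this line the backend cannot seek to the current frame.
  if (getFrameTime) request.frameTime = getFrameTime(frame);
  return request;
}
```
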
- Await set_frame (ensureStereoFrame) before transfer in the draw handler,
  so drawing on the frame stereo was enabled on no longer stalls in the
  backend's 120s deferred-disparity wait. Factor the duplicated set-frame
  logic (enable kickoff + frame watcher) into the one helper.
- Use renderer-safe path helpers instead of npath.* (node 'path' is
  externalized under contextIsolation -> "npath is not defined").
- Declare the missing stereoCameraFps ref ("stereoCameraFps is not defined").
Port the warped-line fixes from viame/master so the line transferred to
the second camera is a normal line-mode-editable annotation:

- Preserve the source line's key through the transfer (key: params.key
  instead of '') and thread it through StereoAnnotationCompleteParams.
- Emit head/tail Point markers alongside the LineString so endpoint
  handles render and can be dragged.
- Expand the warped bounds by 10% to match the source side (headtail.ts).
- Preserve editing mode when left-clicking onto a camera that already has
  the selected track (the warped annotation), so it can be adjusted
  immediately.
In interactive stereo, only the user may modify a line a human authored:

- Mark a camera's line human-authored when the user draws/edits it (the
  stereo warp writes geometry directly and never fires this event, so the
  event firing always means a human edit).
- Warp source -> other only when the other side is absent or still
  machine-generated. Once the other side has been hand-edited it is frozen;
  further edits on the first camera no longer overwrite it (and vice versa).
- Length keeps tracking the shifting geometry, except when length_method is
  'user_set' (a new detection attribute the user can set to lock a length);
  the stereo update then leaves that length alone while still refreshing
  range/midpoint. Auto-computed lengths record length_method = 'stereo'.
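
The two rules above can be sketched as a pair of small functions. `length_method` and its values `'user_set'` and `'stereo'` come from the commit message; the `CameraLine` and `Measurement` shapes and both function names are assumptions for illustration.

```typescript
interface CameraLine {
  authoredBy: 'human' | 'machine'; // hypothetical marker
}

// Warp onto the other camera only if nothing is there yet, or what is
// there was machine-generated. A hand-edited line stays frozen.
function shouldWarpToOther(other: CameraLine | undefined): boolean {
  return other === undefined || other.authoredBy === 'machine';
}

interface Measurement {
  length?: number;
  range?: number;
  length_method?: 'stereo' | 'user_set';
}

// Range is always refreshed; length is left alone once the user locks it.
function applyStereoUpdate(
  current: Measurement,
  computed: { length: number; range: number },
): Measurement {
  if (current.length_method === 'user_set') {
    return { ...current, range: computed.range };
  }
  return {
    ...current,
    length: computed.length,
    range: computed.range,
    length_method: 'stereo',
  };
}
```
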
A near-horizontal/vertical line otherwise produced a razor-thin box. After
the usual 10% expansion, grow the shorter side about its center until the
longer:shorter ratio is at most 6:1. Applied to both the drawn box
(headtail.ts tightBoundsExpanded) and the stereo-warped box (ViewerLoader).
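
The box adjustment described above can be sketched like this. The `[x1, y1, x2, y2]` bounds layout, the function names, and the interpretation of "10% expansion" as growing each dimension by 10% about its center are assumptions; the actual `tightBoundsExpanded` in headtail.ts may differ in detail.

```typescript
type Bounds = [number, number, number, number]; // [x1, y1, x2, y2]

// Grow width and height by `fraction` in total, keeping the center fixed.
function expandBounds(b: Bounds, fraction = 0.1): Bounds {
  const dx = ((b[2] - b[0]) * fraction) / 2;
  const dy = ((b[3] - b[1]) * fraction) / 2;
  return [b[0] - dx, b[1] - dy, b[2] + dx, b[3] + dy];
}

// Grow the shorter side about its center until longer:shorter <= maxAspect.
function capAspectRatio(b: Bounds, maxAspect = 6): Bounds {
  const w = b[2] - b[0];
  const h = b[3] - b[1];
  const minShort = Math.max(w, h) / maxAspect;
  if (Math.min(w, h) >= minShort) return b;
  if (w < h) {
    const cx = (b[0] + b[2]) / 2;
    return [cx - minShort / 2, b[1], cx + minShort / 2, b[3]];
  }
  const cy = (b[1] + b[3]) / 2;
  return [b[0], cy - minShort / 2, b[2], cy + minShort / 2];
}
```

For a perfectly horizontal line spanning x = 0 to 120 at y = 0, `capAspectRatio([0, 0, 120, 0])` yields a box 120 wide and 20 tall, centered on the line.
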
Clicking a detection in a camera that isn't selected used to be ignored
(LayerManager Clicked early-returned), so it took one click to switch
cameras and another to act on the detection. Now that click switches to the
clicked camera and acts on the detection in the same click: left-click
selects it, right-click edits it. Select-then-edit keeps the result
deterministic, and it bails if a mode (e.g. linking) blocked the switch.
When creating a new detection, the creation cursor is now live on every
camera (not just the selected one), and a draw is routed to whichever camera
it lands on (switch + materialize the new track there). Works for all
creation types.

- LayerManager: enable the edit layer in creation mode on non-selected
  cameras (isCreatingNewDetection); route the drawn shape to the drawn-on
  camera in the update:geojson handler; suppress the select that rides the
  click which finalizes a shape, and the first-corner click in the
  cross-camera branch, so an overlapping detection isn't grabbed instead.
- Viewer: don't intercept camera-view mousedown mid-creation (it would
  preventDefault the rectangle drag); let the draw land and route.

Known limitation: a line's 2nd vertex landing inside an existing detection
still selects that detection (event-ordering quirk); accepted for now.
@mattdawkins

Member Author

I'm seeing one outstanding issue on this: when I make a new line in interactive stereo mode and it is created on the other camera, I can edit the line endpoints on the other camera but not on the first camera, where I originally drew the line. In line edit mode I can move the vertices, but the edits are ignored, and as soon as I click off the detection (left or right click) it reverts to the older line. While editing, the vertices themselves move but the drawn line stays in its original position (and after leaving the detection the vertices move back to their original location).

applyStereoLine stored the stereo/segmentation-created LineString under an
empty key, so it wasn't recognized by the HeadTail recipe and edit-layer like
hand-drawn lines. Store it under HeadTailLineKey ('HeadTails') instead, matching
hand-drawn head/tail lines.

(The controller-init guard from the source commit is already covered here by the
getViewerFrame() helper, so only the keying fix is ported.)
@mattdawkins mattdawkins force-pushed the dev/add-interactive-seg-and-stereo branch from a975061 to 438fb10 Compare June 26, 2026 19:35
mattdawkins and others added 5 commits June 26, 2026 15:58
Replace the single 'Interactive Mode' stereo toggle with two independent Stereo
Settings controls: 'Update lengths when modified' (on by default) recomputes the
stereo measurement when a linked line is modified, and 'Auto-compute location on
other camera' (off by default) warps an annotation to the other camera when it
has no detection there yet. The backend service starts whenever either feature
is on, and the load-time auto-enable degrades silently on failure.

@BryonLewis BryonLewis left a comment

Collaborator


Outstanding:

  • There are still some questions regarding the user experience for editing tracks in interactive mode after the initial drawing. It is essentially not meant to update, but it can occasionally update until the user selects another track. There is also some weirdness in drawing on the right camera. I don't think this should be a blocker, given all of the other features and benefits this PR provides.

I've updated the PR description to be a high level summary of all of our updates and what has changed for this PR.

@BryonLewis BryonLewis merged commit 555a31f into main Jun 27, 2026
3 checks passed
@BryonLewis BryonLewis deleted the dev/add-interactive-seg-and-stereo branch June 27, 2026 16:07
2 participants