Add interactive segmentation, stereo, and multi-polygon support #1582
ca7e45e to 974ecab
2c52f00 to b55bd20
b55bd20 to 7858f2f
Brings in the SAM2/SAM3-based interactive segmentation feature, the SAM3 text-query workflow, and the desktop interactive stereo mode. Web-girder paths are intentionally untouched for now; web support will come in a follow-up.
- New segmentation point-click recipe + EditorMenu wiring; SAM2/SAM3 models loaded via VIAME install configs.
- Desktop backend: viame_segmentation_service-backed IPC handlers and matching frontend API for segmentationInitialize/Predict/SetImage/ClearImage/Shutdown/IsReady, textQuery/refineDetections/runTextQueryPipeline, and stereoEnable/Disable/SetFrame/GetStatus/TransferLine/TransferPoints/SetCalibration/IsEnabled, plus disparity ready/error event hooks.
- EditAnnotationLayer: track shift-key state and right-click for Point mode, propagate background flag for negative SAM points.
- Sidebar / ViewerLoader / Viewer: stereo annotation mode UI, error dialog when seg or text-query model fails to load, dot-only-on-source-frame fix.
- useModeManager / EditAnnotationLayer / recipes: keep existing geometry type when current editing mode already matches; right-click in Point creation finalises and deselects.
A track-frame's polygon now expands to a list of polygons each with their own keys, and each polygon supports holes.
- Server CSV (de)serializer: emit polygon-key column per polygon, support holes in the geoJSON FeatureCollection; auto_key path to append a new polygon to an existing track frame.
- Client recipes / useModeManager: handleAddHole / handleAddPolygon / handleCancelCreation; PolygonLayer emits polygon-clicked.
- Hole drawing reuses the polygon edit pipeline (left-click places a hole vertex without exiting creation mode).
- Test fixtures cover multi-polygon and polygons-with-holes round-trip.
(cherry picked from commit c2f3cd0)
(cherry picked from commit 3db1995)
(cherry picked from commit b7d4fa3)
(cherry picked from commit f5d015a)
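For reviewers, the multi-polygon layout described above can be pictured with a minimal sketch. The `KeyedPolygon` shape, the `key` property name, and both helper functions are hypothetical and are not taken from this PR; the only firm part is the GeoJSON convention that a Polygon's first ring is the exterior and any later rings are holes.

```typescript
// Hypothetical shapes for illustration; not the PR's actual types.
type Position = [number, number];

interface KeyedPolygon {
  key: string;          // per-polygon key (assumed property name)
  exterior: Position[]; // outer boundary
  holes: Position[][];  // zero or more inner rings
}

// GeoJSON linear rings must be closed: the first and last positions match.
function closeRing(ring: Position[]): Position[] {
  if (ring.length === 0) return [];
  const first = ring[0];
  const last = ring[ring.length - 1];
  const closed = first[0] === last[0] && first[1] === last[1];
  return closed ? ring.slice() : ring.concat([first]);
}

// One Feature per polygon; ring 0 is the exterior, later rings are holes.
function toFeatureCollection(polygons: KeyedPolygon[]) {
  return {
    type: 'FeatureCollection' as const,
    features: polygons.map((p) => ({
      type: 'Feature' as const,
      properties: { key: p.key },
      geometry: {
        type: 'Polygon' as const,
        coordinates: [closeRing(p.exterior)].concat(p.holes.map(closeRing)),
      },
    })),
  };
}
```

Keeping one Feature per polygon means each polygon's key travels with its own rings, which is what a per-polygon key column on the CSV side would need.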
…tton) Strip the SAM3 text-query button, dialog, API, and IPC handlers from the interactive editor, keeping segmentation and stereo intact. The full text-query feature lives on the follow-up branch dev/text-query-annot-button.
Strip the no-transcode NativeVideoAnnotator path that should not be on the segmentation/stereo branch: removes the residual Viewer.vue async-component, nativeVideoPath plumbing, and template branch, plus the stale settings field.
7858f2f to 20303f0
The rebase onto main dropped the opening <template> tag, so vite parsed the root <div> as a custom block and the electron build failed. Restore it.
In continuous Detection mode, each interactive-segmentation point click now finalizes its own detection and immediately starts a fresh one, instead of refining a single detection. Non-continuous mode is unchanged: clicks still accumulate to refine one detection until confirmed. Frame-navigation preview restores are excluded so they don't spuriously create detections.
Capture the completed track ID before newTrackSettingsAfterLogic, which in continuous detection mode spawns a new track and changes selectedTrackId, so stereo annotation-complete events attach to the correct detection. Also re-activate the segmentation recipe when re-editing a finalized Point detection so clicks resume predicting. Ported from viame/master (2cad9aa, 4008a1f).
Add a _removeIfEmpty helper and call it from selectTrack when leaving edit mode, so a detection created but never drawn (e.g. clicked away or right-click deselect) doesn't linger as an empty track. handleEscapeMode now reuses the same helper. Ported from viame/master (4139f78).
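A rough sketch of the empty-detection cleanup described above, for anyone reading along. The `Track` and `FrameFeature` shapes and the array-based store are simplifying assumptions; the real `_removeIfEmpty` works against the actual track store and may use a different emptiness test.

```typescript
// Hypothetical, simplified shapes; the real track model is richer.
interface FrameFeature {
  frame: number;
  bounds?: [number, number, number, number];
  geometry?: unknown;
}

interface Track {
  id: number;
  features: FrameFeature[];
}

// A track counts as empty when no frame carries bounds or geometry.
function isEmptyTrack(track: Track): boolean {
  return !track.features.some(
    (f) => f.bounds !== undefined || f.geometry !== undefined,
  );
}

// Returns the list without the given track if that track is empty;
// any other track, and any non-empty track, is kept.
function removeIfEmpty(tracks: Track[], trackId: number | null): Track[] {
  if (trackId === null) return tracks;
  return tracks.filter((t) => t.id !== trackId || !isEmptyTrack(t));
}
```

Calling the same helper from both the deselect path and the escape path keeps the two exits from edit mode consistent.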
handleConfirmRecipe now returns early unless an active segmentation recipe has a pending prediction or was explicitly reset, so the contextmenu event from a right-click that enters Point edit mode no longer immediately deselects/deactivates before any points are placed. Adds a wasReset flag on the recipe to allow finalizing after a reset. Ported from viame/master (c4d149c, 69bc0f8).
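The early-return condition above can be read as a small predicate. This is only a sketch: `SegmentationRecipeState` and `hasPendingPrediction` are assumed names, and only `wasReset` comes from the commit message.

```typescript
// Hypothetical view of the recipe state that the guard inspects.
interface SegmentationRecipeState {
  active: boolean;
  hasPendingPrediction: boolean;
  wasReset: boolean; // flag named in the commit message
}

// True only when confirming should actually finalize something.
function shouldConfirmRecipe(recipe: SegmentationRecipeState | undefined): boolean {
  if (!recipe || !recipe.active) return false;
  return recipe.hasPendingPrediction || recipe.wasReset;
}
```

With this guard, the contextmenu event that merely enters Point edit mode finds nothing pending and nothing reset, so it falls through without deselecting.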
Wire the edit layer's finalizeInProgress through a handler callback that handleAddTrackOrDetection invokes, so pressing 'n' or starting a new detection commits a valid in-progress polygon (or discards it) instead of leaving it dangling. Also commit a pending segmentation prediction before the track switch so a reset-on-deselect doesn't leave an empty detection. Ported from viame/master (842a20c, dea9653).
In continuous mode a background (negative) click is a refinement of the current mask, not a new object, so it should no longer commit and start a fresh detection.
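Putting the continuous-mode rules from the commits above together, the per-click decision might look like the sketch below. All names here (`ClickOutcome`, `PredictionContext`, `outcomeForPrediction`) are made up for illustration and do not appear in the PR.

```typescript
// Hypothetical outcome labels for one interactive-segmentation prediction.
type ClickOutcome = 'preview-only' | 'refine' | 'finalize-and-start-new';

interface PredictionContext {
  continuousMode: boolean;          // continuous Detection mode is on
  backgroundClick: boolean;         // negative (background) point
  fromFramePreviewRestore: boolean; // prediction restored on frame navigation
}

function outcomeForPrediction(ctx: PredictionContext): ClickOutcome {
  // Preview restores never create or commit detections.
  if (ctx.fromFramePreviewRestore) return 'preview-only';
  // Non-continuous mode: clicks accumulate on one detection until confirmed.
  if (!ctx.continuousMode) return 'refine';
  // Continuous mode: a background click refines the current mask.
  if (ctx.backgroundClick) return 'refine';
  // Continuous mode, foreground click: commit and start a fresh detection.
  return 'finalize-and-start-new';
}
```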
The reset button only restored a default-key ('') polygon, so resetting a
detection whose existing polygon was segmentation-keyed removed it
entirely. Capture the pre-existing polygon keys in the snapshot and
restore all original polygon geometry, removing only segmentation-added
polygons.
Ported from viame/master.
Bryon — pushed 6 commits porting interactive-seg/stereo fixes that exist on Added
Please review — these touch your integration work:
Happy to adjust either if they don't match the intent.
ViewerLoader already builds a getFrameTime (frame/fps) and the backend already forwards frame_time to the service, but the recipe ignored it and never set frameTime on the predict request, so interactive segmentation on video datasets couldn't seek to the current frame. Accept getFrameTime in the recipe and include frameTime in the request. Ported from viame/master (23ccb25, c363531).
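A minimal sketch of the frameTime plumbing described above. The request shape, `pointLabels`, and both function names are assumptions for illustration; only `getFrameTime`, `frameTime`, and the frame/fps relationship come from the commit message.

```typescript
// Hypothetical request shape; only frameTime is named in the commit message.
interface PredictRequest {
  points: [number, number][];
  pointLabels: number[]; // assumed convention: 1 = foreground, 0 = background
  frameTime?: number;    // seconds into the video, when known
}

type GetFrameTime = (frame: number) => number;

// Mirrors the frame/fps conversion mentioned for ViewerLoader.
function makeGetFrameTime(fps: number): GetFrameTime {
  return (frame) => frame / fps;
}

function buildPredictRequest(
  points: [number, number][],
  pointLabels: number[],
  frame: number,
  getFrameTime?: GetFrameTime,
): PredictRequest {
  const request: PredictRequest = { points, pointLabels };
  // Image-sequence datasets may not supply getFrameTime; omit the field then.
  if (getFrameTime) request.frameTime = getFrameTime(frame);
  return request;
}
```

Leaving `frameTime` unset when no `getFrameTime` is passed keeps the request unchanged for datasets that are not videos.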
- Await set_frame (ensureStereoFrame) before transfer in the draw handler,
so drawing on the frame stereo was enabled on no longer stalls in the
backend's 120s deferred-disparity wait. Factor the duplicated set-frame
logic (enable kickoff + frame watcher) into the one helper.
- Use renderer-safe path helpers instead of npath.* (node 'path' is
externalized under contextIsolation -> "npath is not defined").
- Declare the missing stereoCameraFps ref ("stereoCameraFps is not defined").
Port the warped-line fixes from viame/master so the line transferred to the second camera is a normal line-mode-editable annotation:
- Preserve the source line's key through the transfer (key: params.key instead of '') and thread it through StereoAnnotationCompleteParams.
- Emit head/tail Point markers alongside the LineString so endpoint handles render and can be dragged.
- Expand the warped bounds by 10% to match the source side (headtail.ts).
- Preserve editing mode when left-clicking onto a camera that already has the selected track (the warped annotation), so it can be adjusted immediately.
In interactive stereo, only the user may modify a line a human authored:
- Mark a camera's line human-authored when the user draws/edits it (the stereo warp writes geometry directly and never fires this event, so the event firing always means a human edit).
- Warp source -> other only when the other side is absent or still machine-generated. Once the other side has been hand-edited it is frozen; further edits on the first camera no longer overwrite it (and vice versa).
- Length keeps tracking the shifting geometry, except when length_method is 'user_set' (a new detection attribute the user can set to lock a length); the stereo update then leaves that length alone while still refreshing range/midpoint. Auto-computed lengths record length_method = 'stereo'.
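The freeze and length-lock rules above can be sketched as two small functions. The `authorship` field and both function names are hypothetical; the `length_method` values 'stereo' and 'user_set' are the ones named in the commit message.

```typescript
// Hypothetical bookkeeping for who last wrote a camera's line.
type Authorship = 'human' | 'machine';
type LengthMethod = 'stereo' | 'user_set';

interface StereoLine {
  authorship: Authorship;
}

interface LengthAttributes {
  length?: number;
  length_method?: LengthMethod;
}

// Warp onto the other camera only if nothing is there yet or the existing
// line is still machine-generated; a hand-edited line is frozen.
function shouldWarpOnto(other: StereoLine | undefined): boolean {
  return other === undefined || other.authorship === 'machine';
}

// Keep a user-locked length; otherwise record the stereo-computed one.
function updatedLength(
  attrs: LengthAttributes,
  computedLength: number,
): LengthAttributes {
  if (attrs.length_method === 'user_set') return attrs;
  return { length: computedLength, length_method: 'stereo' };
}
```

Range and midpoint are left out of the sketch because, per the commit message, they refresh regardless of `length_method`.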
A near-horizontal/vertical line otherwise produced a razor-thin box. After the usual 10% expansion, grow the shorter side about its center until the longer:shorter ratio is at most 6:1. Applied to both the drawn box (headtail.ts tightBoundsExpanded) and the stereo-warped box (ViewerLoader).
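As a worked illustration of the aspect clamp above, here is a sketch that assumes bounds are `[x1, y1, x2, y2]` and that the 10% expansion scales width and height about the box center. The function name and the exact expansion convention are assumptions; `tightBoundsExpanded` in headtail.ts may define them differently.

```typescript
// Assumed layout: [x1, y1, x2, y2] in image pixels.
type Bounds = [number, number, number, number];

function expandWithMaxAspect(
  bounds: Bounds,
  expandFraction = 0.1, // the usual 10% expansion
  maxAspect = 6,        // longer:shorter is at most 6:1
): Bounds {
  const [x1, y1, x2, y2] = bounds;
  const cx = (x1 + x2) / 2;
  const cy = (y1 + y2) / 2;
  // Expand both dimensions about the center.
  let w = (x2 - x1) * (1 + expandFraction);
  let h = (y2 - y1) * (1 + expandFraction);
  // Grow only the shorter side until the ratio limit is met.
  if (w > h * maxAspect) {
    h = w / maxAspect;
  } else if (h > w * maxAspect) {
    w = h / maxAspect;
  }
  return [cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2];
}
```

Under these assumptions, a perfectly horizontal 120 px line becomes a box about 132 px wide and 22 px tall instead of a zero-height one, and boxes already within 6:1 only receive the 10% expansion.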
Clicking a detection in a camera that isn't selected used to be ignored (LayerManager Clicked early-returned), so it took one click to switch cameras and another to act on the detection. Now that click switches to the clicked camera and acts on the detection in the same click: left-click selects it, right-click edits it. Select-then-edit keeps the result deterministic, and it bails if a mode (e.g. linking) blocked the switch.
When creating a new detection, the creation cursor is now live on every camera (not just the selected one), and a draw is routed to whichever camera it lands on (switch + materialize the new track there). Works for all creation types.
- LayerManager: enable the edit layer in creation mode on non-selected cameras (isCreatingNewDetection); route the drawn shape to the drawn-on camera in the update:geojson handler; suppress the select that rides the click which finalizes a shape, and the first-corner click in the cross-camera branch, so an overlapping detection isn't grabbed instead.
- Viewer: don't intercept camera-view mousedown mid-creation (it would preventDefault the rectangle drag); let the draw land and route.
Known limitation: a line's 2nd vertex landing inside an existing detection still selects that detection (event-ordering quirk); accepted for now.
I'm seeing one outstanding issue on this: when I make a new line in interactive stereo mode and it creates the line on the other camera, I can edit the line endpoints on the other camera but not on the first camera, where I originally drew the line. In line edit mode I can move the vertices, but the edit is ignored, and as soon as I click off the detection (left or right click) it reverts to the older line. While editing, the vertices themselves move but the drawn line stays in its original position (and after clicking off the detection the vertices move back to their original locations).
applyStereoLine stored the stereo/segmentation-created LineString under an
empty key, so it wasn't recognized by the HeadTail recipe and edit-layer like
hand-drawn lines. Store it under HeadTailLineKey ('HeadTails') instead, matching
hand-drawn head/tail lines.
(The controller-init guard from the source commit is already covered here by the
getViewerFrame() helper, so only the keying fix is ported.)
a975061 to 438fb10
Replace the single 'Interactive Mode' stereo toggle with two independent Stereo Settings controls: 'Update lengths when modified' (on by default) recomputes the stereo measurement when a linked line is modified, and 'Auto-compute location on other camera' (off by default) warps an annotation to the other camera when it has no detection there yet. The backend service starts whenever either feature is on, and the load-time auto-enable degrades silently on failure.
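A small sketch of how the two settings above could gate the backend service. The field and function names are invented for illustration, and the start call is kept synchronous for brevity even though the real enable path is presumably asynchronous.

```typescript
// Hypothetical names for the two Stereo Settings controls.
interface StereoSettings {
  updateLengthsOnModify: boolean;  // 'Update lengths when modified'
  autoComputeOtherCamera: boolean; // 'Auto-compute location on other camera'
}

// Defaults as described: length updates on, auto-compute off.
const defaultStereoSettings: StereoSettings = {
  updateLengthsOnModify: true,
  autoComputeOtherCamera: false,
};

// The backend service is needed whenever either feature is on.
function stereoServiceNeeded(settings: StereoSettings): boolean {
  return settings.updateLengthsOnModify || settings.autoComputeOtherCamera;
}

// Load-time auto-enable: a failure is swallowed so loading continues.
// Returns whether the service ended up running.
function autoEnableOnLoad(
  settings: StereoSettings,
  startService: () => void, // stand-in for the real enable call
): boolean {
  if (!stereoServiceNeeded(settings)) return false;
  try {
    startService();
    return true;
  } catch (_err) {
    return false;
  }
}
```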
…tion functionality
BryonLewis
left a comment
Outstanding:
- There are still some questions regarding the user experience for editing tracks in interactive mode after the initial drawing. It is essentially not meant to update, but it can occasionally update until the user selects another track. There is also some weirdness in drawing on the right camera. I don't think this should be a blocker because of all of the other features/benefits this PR provides.
I've updated the PR description to be a high level summary of all of our updates and what has changed for this PR.
Interactive segmentation (desktop)
Interactive stereo (desktop)
Multicam
Annotation lifecycle
Polygon / editing
Serialization / server
Dev / build
Documentation updates