diff --git a/docs/deploy/cloud.mdx b/docs/deploy/cloud.mdx
index a8a21f7863..20bd92c40c 100644
--- a/docs/deploy/cloud.mdx
+++ b/docs/deploy/cloud.mdx
@@ -61,7 +61,7 @@ The credential is **shared with the [`heygen` CLI](https://github.com/heygen-com
3. `~/.heygen/credentials`
- Point the CLI at a different backend with `HEYGEN_API_URL` (default `https://api.heygen.com`). Use `hyperframes auth refresh` to force-refresh an OAuth token before a long job; `hyperframes auth logout` clears the stored credential.
+ Point the CLI at a different backend with `HEYGEN_API_URL` (default `https://api.heygen.com`). Use `hyperframes auth refresh` to force-refresh an OAuth token before a long job; `hyperframes auth logout` clears the stored credential. For the keys that voice, music, and capture use across the skills, and for the fully local fallback, see [Authentication & API keys](/guides/authentication).
## How a cloud render flows
diff --git a/docs/docs.json b/docs/docs.json
index 7e8ac25996..75a3ac0d26 100644
--- a/docs/docs.json
+++ b/docs/docs.json
@@ -75,6 +75,7 @@
"group": "Guides",
"pages": [
"guides/pipeline",
+ "guides/authentication",
"guides/video-components",
"guides/html-in-canvas",
"guides/website-to-video",
diff --git a/docs/guides/authentication.mdx b/docs/guides/authentication.mdx
new file mode 100644
index 0000000000..a16a42ca8e
--- /dev/null
+++ b/docs/guides/authentication.mdx
@@ -0,0 +1,88 @@
+---
+title: Authentication & API keys
+description: "How to sign in to HeyGen and how the keys for voice, music, and capture resolve across the CLI and skills, including the priority order and the fully local fallback."
+---
+
+HyperFrames uses a HeyGen credential for premium voiceover (TTS) and the music / sound-effects library. Other providers are optional, and **everything runs without any key**: voice and music fall back to fully local engines. This page covers signing in, the keys each capability uses, and the order in which they resolve.
+
+## Sign in
+
+Signing in is the same OAuth step as creating an account — new users land on the sign-up screen.
+
+
+
+ The default flow opens your browser for OAuth and captures the token on a loopback port:
+
+ ```bash
+ npx hyperframes auth login
+ # ✓ Signed in.
+ ```
+
+ For CI or headless machines, save a long-lived API key instead:
+
+ ```bash
+ npx hyperframes auth login --api-key # hidden-input prompt
+ echo "$HEYGEN_API_KEY" | npx hyperframes auth login --api-key # from stdin
+ ```
+
+
+ ```bash
+ npx hyperframes auth status
+ ```
+
+ Shows the active credential's source and verified identity, and — when you're signed out — which local engines voice and music will use. Add `--json` for `{ configured, recommended_action, offline_engines }` in scripts.
+
+
+
+The credential lives in `~/.heygen/credentials` (mode `0600`) — no per-repo `.env` to manage. Browser OAuth is a `hyperframes auth login` feature. The separate [`heygen` CLI](https://github.com/heygen-com/heygen-cli) (its own install — there's no `npx heygen`) is API-key-only, so `heygen auth login` just stores a key you paste. Both read the same `~/.heygen/credentials`, so signing in with one carries to the other.
+
+
+ No account needed to try HyperFrames. With no credential, voice uses **Kokoro** and music uses **MusicGen**, both fully local and offline — see [Working offline](#working-offline).
+
+
+## How credentials resolve
+
+The HeyGen credential drives TTS and music / SFX **retrieval**. It resolves in this order, and the first match wins:
+
+1. `HEYGEN_API_KEY` — environment variable
+2. `HYPERFRAMES_API_KEY` — alias, for parity with other tools
+3. `~/.heygen/credentials` — written by `hyperframes auth login` (or `heygen auth login`)
+
+Point at a different config directory with `HEYGEN_CONFIG_DIR`, or a different backend with `HEYGEN_API_URL`.
+
+## Keys by capability
+
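+Each capability picks the **first available provider** in order. For voice, music, and sound effects the last entry is a local engine or bundled library that needs no key; capture descriptions are optional and list only cloud providers. The cloud providers after HeyGen for voice and music (ElevenLabs, Lyria) each need their own key *and* a Python module (`elevenlabs`, `google.genai`).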
+
+| Capability | Provider order | Key(s) — first match wins | Local dependency |
+|------------|----------------|---------------------------|------------------|
+| **Voice (TTS)** | HeyGen → ElevenLabs → Kokoro | `HEYGEN_API_KEY` → `HYPERFRAMES_API_KEY` → `~/.heygen/credentials` · then `ELEVENLABS_API_KEY` | Kokoro: `pip install kokoro-onnx soundfile` |
+| **Music (BGM)** | HeyGen library → Lyria → MusicGen | HeyGen credential (above) · then `GEMINI_API_KEY` → `GOOGLE_API_KEY` | MusicGen: `pip install transformers torch soundfile numpy` |
+| **Sound effects** | HeyGen library → bundled library | HeyGen credential (above) | bundled — no deps |
+| **Capture descriptions** | OpenRouter → Gemini | `OPENROUTER_API_KEY` → `GEMINI_API_KEY` | — (optional; for [website-to-video](/guides/website-to-video)) |
+
+Run `npx hyperframes doctor` to check which local dependencies are installed. The media skills also run `hyperframes auth status` as a preflight before generating, so you always know whether a run will use HeyGen or a local engine before it starts.
+
+## Working offline
+
+No key configured is a normal state, not an error. The workflow runs entirely on local models:
+
+- **Voice** — Kokoro-82M (54 voices), with Whisper for word-level caption alignment.
+- **Music** — MusicGen (`facebook/musicgen-small`).
+- **Sound effects** — a bundled library.
+
+Local engines are free and offline; HeyGen gives higher-quality voices and a professionally produced music library. Sign in any time to switch a project from local to HeyGen.
+
+## Environment variables
+
+| Variable | Used for |
+|----------|----------|
+| `HEYGEN_API_KEY` | HeyGen credential — voice + music/SFX retrieval. Highest priority. |
+| `HYPERFRAMES_API_KEY` | Alias for `HEYGEN_API_KEY`. |
+| `HEYGEN_API_URL` | API base URL (default `https://api.heygen.com`). |
+| `HEYGEN_CONFIG_DIR` | Credentials directory (default `~/.heygen`). |
+| `ELEVENLABS_API_KEY` | ElevenLabs TTS, used when no HeyGen credential is present. |
+| `GEMINI_API_KEY` / `GOOGLE_API_KEY` | Lyria music generation (and capture descriptions). |
+| `OPENROUTER_API_KEY` | Capture descriptions; takes priority over Gemini for that step. |
+
+See the [`hyperframes auth`](/packages/cli#hyperframes-auth) command reference for subcommand details, and [Cloud rendering](/deploy/cloud) for using the same credential to render in HeyGen's cloud.
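Review note on the "How credentials resolve" section of the new guide: the first-match-wins order it documents can be sketched as a small pure function. This is only an illustration under stated assumptions, not the CLI's actual resolver: `resolveCredential`, `ResolvedKey`, and the injected `readCredentialsFile` callback are hypothetical names, and the real credentials file may hold more than a bare key.

```typescript
// Hypothetical sketch of the documented order; not the CLI's real resolver.
type CredentialSource = "HEYGEN_API_KEY" | "HYPERFRAMES_API_KEY" | "credentials-file";

interface ResolvedKey {
  source: CredentialSource;
  value: string;
}

function resolveCredential(
  env: Record<string, string | undefined>,
  // Assumption: the caller knows how to read ~/.heygen/credentials
  // (or the directory named by HEYGEN_CONFIG_DIR) and returns the key, if any.
  readCredentialsFile: () => string | undefined,
): ResolvedKey | undefined {
  // 1. The primary environment variable wins outright.
  const primary = env["HEYGEN_API_KEY"];
  if (primary) return { source: "HEYGEN_API_KEY", value: primary };

  // 2. The alias is consulted only when the primary variable is unset or empty.
  const alias = env["HYPERFRAMES_API_KEY"];
  if (alias) return { source: "HYPERFRAMES_API_KEY", value: alias };

  // 3. The stored credential is the last resort.
  const stored = readCredentialsFile();
  if (stored) return { source: "credentials-file", value: stored };

  // Nothing configured: callers fall back to the local engines.
  return undefined;
}
```

Injecting the environment and the file reader keeps the ordering testable without touching the real home directory, which mirrors the pure/impure split the PR uses for `decide*` and `gather*`.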
diff --git a/docs/packages/cli.mdx b/docs/packages/cli.mdx
index 1fc75cfa35..ac7cb156b4 100644
--- a/docs/packages/cli.mdx
+++ b/docs/packages/cli.mdx
@@ -995,6 +995,8 @@ hyperframes auth logout --yes # skip the confirmation prompt
| `HEYGEN_API_URL` | API base URL (default `https://api.heygen.com`). |
| `HEYGEN_CONFIG_DIR` | Credentials directory (default `~/.heygen`). |
+For the keys other capabilities use — ElevenLabs and Gemini for voice/music fallback, OpenRouter/Gemini for capture — and how the skills prioritize them, see [Authentication & API keys](/guides/authentication).
+
## hyperframes cloud
Render a HyperFrames composition on HeyGen's hosted cloud — no local Chrome, no local ffmpeg, no AWS to manage. Sign in once with `hyperframes auth login` and the same credential drives every `cloud` subcommand.
diff --git a/packages/cli/src/audio/providers.test.ts b/packages/cli/src/audio/providers.test.ts
new file mode 100644
index 0000000000..ddb5b9dd29
--- /dev/null
+++ b/packages/cli/src/audio/providers.test.ts
@@ -0,0 +1,50 @@
+import { describe, expect, it } from "vitest";
+import { decideMusic, decideVoice, KOKORO_PIP, MUSICGEN_PIP } from "./providers.js";
+
+describe("decideVoice — mirrors the skill's heygen → elevenlabs → kokoro order", () => {
+ it("prefers HeyGen when configured", () => {
+ const r = decideVoice({ hasHeygen: true, elevenlabs: true, kokoro: true });
+ expect(r.engine).toBe("heygen");
+ expect(r.ready).toBe(true);
+ });
+
+ it("falls to ElevenLabs only when key + module are both present", () => {
+ expect(decideVoice({ hasHeygen: false, elevenlabs: true, kokoro: true }).engine).toBe(
+ "elevenlabs",
+ );
+ });
+
+ it("falls to Kokoro when no cloud provider is usable", () => {
+ expect(decideVoice({ hasHeygen: false, elevenlabs: false, kokoro: true }).engine).toBe(
+ "kokoro",
+ );
+ });
+
+ it("flags Kokoro as not-ready with a pip hint when deps are missing", () => {
+ const r = decideVoice({ hasHeygen: false, elevenlabs: false, kokoro: false });
+ expect(r.engine).toBe("kokoro");
+ expect(r.ready).toBe(false);
+ expect(r.setupHint).toBe(KOKORO_PIP);
+ });
+
+ it("omits the hint when Kokoro is ready", () => {
+ expect(
+ decideVoice({ hasHeygen: false, elevenlabs: false, kokoro: true }).setupHint,
+ ).toBeUndefined();
+ });
+});
+
+describe("decideMusic — mirrors the skill's heygen → lyria → musicgen order", () => {
+ it("prefers HeyGen, then Lyria, then MusicGen", () => {
+ expect(decideMusic({ hasHeygen: true, lyria: true, musicgen: true }).engine).toBe("heygen");
+ expect(decideMusic({ hasHeygen: false, lyria: true, musicgen: true }).engine).toBe("lyria");
+ expect(decideMusic({ hasHeygen: false, lyria: false, musicgen: true }).engine).toBe("musicgen");
+ });
+
+ it("flags MusicGen as not-ready with a pip hint when deps are missing", () => {
+ const r = decideMusic({ hasHeygen: false, lyria: false, musicgen: false });
+ expect(r.engine).toBe("musicgen");
+ expect(r.ready).toBe(false);
+ expect(r.setupHint).toBe(MUSICGEN_PIP);
+ });
+});
diff --git a/packages/cli/src/audio/providers.ts b/packages/cli/src/audio/providers.ts
new file mode 100644
index 0000000000..a3eb3d4890
--- /dev/null
+++ b/packages/cli/src/audio/providers.ts
@@ -0,0 +1,102 @@
+/**
+ * Which voice / music engine a workflow will actually use, and whether
+ * its local dependencies are present. Mirrors the resolution order the
+ * hyperframes-media skill scripts use, so `auth status` and `doctor`
+ * report the same engine the render pipeline would pick:
+ *
+ * voice: HeyGen Starfish → ElevenLabs (key + `elevenlabs`) → Kokoro (local)
+ * music: HeyGen library → Lyria (key + `google.genai`) → MusicGen (local)
+ *
+ * The decision is split from the probing: `decide*` is pure (unit-tested
+ * without spawning Python); `gather*` collects the live facts.
+ */
+
+import { hasPythonModules } from "../tts/python.js";
+
+/** Python import names probed for each local engine. */
+export const KOKORO_MODULES = ["kokoro_onnx", "soundfile"];
+export const MUSICGEN_MODULES = ["transformers", "torch", "soundfile", "numpy"];
+
+/** pip one-liners shown when a local engine's deps are missing. */
+export const KOKORO_PIP = "pip install kokoro-onnx soundfile";
+export const MUSICGEN_PIP = "pip install transformers torch soundfile numpy";
+
+export type VoiceEngine = "heygen" | "elevenlabs" | "kokoro";
+export type MusicEngine = "heygen" | "lyria" | "musicgen";
+
+export interface EngineReadiness<E extends string = string> {
+ engine: E;
+ /** Human label, e.g. "Kokoro". */
+ label: string;
+ /** A local engine (no account needed) vs a cloud provider keyed by env. */
+ local: boolean;
+ /** Usable right now: cloud key present, or local deps installed. */
+ ready: boolean;
+ /** Shown when `ready` is false — how to make it ready. */
+ setupHint?: string;
+}
+
+export interface VoiceFacts {
+ hasHeygen: boolean;
+ /** ELEVENLABS_API_KEY set AND the `elevenlabs` module importable. */
+ elevenlabs: boolean;
+ /** Kokoro's local deps importable. */
+ kokoro: boolean;
+}
+
+export interface MusicFacts {
+ hasHeygen: boolean;
+ /** A Gemini/Google key set AND `google.genai` importable. */
+ lyria: boolean;
+ /** MusicGen's local deps importable. */
+ musicgen: boolean;
+}
+
+export function decideVoice(f: VoiceFacts): EngineReadiness {
+ if (f.hasHeygen) return { engine: "heygen", label: "HeyGen Starfish", local: false, ready: true };
+ if (f.elevenlabs) return { engine: "elevenlabs", label: "ElevenLabs", local: false, ready: true };
+ return {
+ engine: "kokoro",
+ label: "Kokoro",
+ local: true,
+ ready: f.kokoro,
+ ...(f.kokoro ? {} : { setupHint: KOKORO_PIP }),
+ };
+}
+
+export function decideMusic(f: MusicFacts): EngineReadiness {
+ if (f.hasHeygen) return { engine: "heygen", label: "HeyGen library", local: false, ready: true };
+ if (f.lyria) return { engine: "lyria", label: "Lyria (Gemini)", local: false, ready: true };
+ return {
+ engine: "musicgen",
+ label: "MusicGen",
+ local: true,
+ ready: f.musicgen,
+ ...(f.musicgen ? {} : { setupHint: MUSICGEN_PIP }),
+ };
+}
+
+/** Collect live voice facts. Skips Python probes when HeyGen is configured. */
+function gatherVoiceFacts(hasHeygen: boolean): VoiceFacts {
+ if (hasHeygen) return { hasHeygen, elevenlabs: false, kokoro: false };
+ const elevenlabs = Boolean(process.env["ELEVENLABS_API_KEY"]) && hasPythonModules(["elevenlabs"]);
+ const kokoro = hasPythonModules(KOKORO_MODULES);
+ return { hasHeygen, elevenlabs, kokoro };
+}
+
+/** Collect live music facts. Skips Python probes when HeyGen is configured. */
+function gatherMusicFacts(hasHeygen: boolean): MusicFacts {
+ if (hasHeygen) return { hasHeygen, lyria: false, musicgen: false };
+ const hasLyriaKey = Boolean(process.env["GEMINI_API_KEY"] || process.env["GOOGLE_API_KEY"]);
+ const lyria = hasLyriaKey && hasPythonModules(["google.genai"]);
+ const musicgen = hasPythonModules(MUSICGEN_MODULES);
+ return { hasHeygen, lyria, musicgen };
+}
+
+export function resolveVoice(hasHeygen: boolean): EngineReadiness {
+ return decideVoice(gatherVoiceFacts(hasHeygen));
+}
+
+export function resolveMusic(hasHeygen: boolean): EngineReadiness {
+ return decideMusic(gatherMusicFacts(hasHeygen));
+}
diff --git a/packages/cli/src/commands/auth/status-guidance.ts b/packages/cli/src/commands/auth/status-guidance.ts
new file mode 100644
index 0000000000..08f5244262
--- /dev/null
+++ b/packages/cli/src/commands/auth/status-guidance.ts
@@ -0,0 +1,105 @@
+/**
+ * Onboarding guidance shown by `auth status` when nothing is configured.
+ *
+ * Kept separate from `status.ts` so the wording is pure (it depends only
+ * on colors, not on the credential resolver / API client / system probe)
+ * and can be unit-tested without booting the whole CLI dependency graph.
+ * Environment detection lives in `status.ts`; this module only renders.
+ */
+
+import { c } from "../../ui/colors.js";
+
+export interface UnconfiguredContext {
+ /** A human can act on guidance now — a TTY, or a coding agent driving the CLI. */
+ interactive: boolean;
+}
+
+/** The local engine a workflow will fall back to, and whether it's ready. */
+export interface OfflineEngineLine {
+ capability: "voice" | "music";
+ /** Engine label, e.g. "Kokoro" / "MusicGen". */
+ label: string;
+ /** Deps installed (local) or key present (cloud) — usable right now. */
+ ready: boolean;
+ /** How to make it ready, shown when `ready` is false. */
+ setupHint?: string;
+}
+
+/** The recommended first step; sign-in and sign-up are the same OAuth flow. */
+const RECOMMENDED_ACTION = "npx hyperframes auth login";
+
+/**
+ * Render the "what offline will use" block from probed engine readiness.
+ * Falls back to a generic one-liner when readiness wasn't probed (e.g. a
+ * caller that didn't want to spawn Python).
+ */
+function offlineEngineLines(engines?: OfflineEngineLine[]): string[] {
+ if (!engines || engines.length === 0) {
+ return [
+ c.dim("Prefer offline? Just continue — local engines (Kokoro · MusicGen) need no account."),
+ ];
+ }
+ const lines = ["Prefer offline? Workflows will use these local engines:"];
+ for (const e of engines) {
+ const cap = e.capability.padEnd(5);
+ if (e.ready) {
+ lines.push(` ${cap} → ${e.label} ${c.success("✓ ready")}`);
+ } else {
+ lines.push(` ${cap} → ${e.label} ${c.warn("⚠ deps missing")}`);
+ if (e.setupHint) lines.push(` ${c.dim(e.setupHint)}`);
+ }
+ }
+ if (engines.some((e) => !e.ready)) {
+ lines.push(c.dim(" (or run `hyperframes doctor` to check the local toolchain)"));
+ }
+ return lines;
+}
+
+/**
+ * Human guidance for an unconfigured machine — registration-first.
+ * Both paths use `npx hyperframes` (zero-install via npm): browser OAuth
+ * (sign-in / sign-up) and `--api-key` both write `~/.heygen`. The separate
+ * `heygen` CLI shares that file but needs its own install (no `npx heygen`),
+ * so it's left to the docs — not dangled here as a command a fresh machine
+ * can't run. Names the local fallback so "no key" never reads as a failure,
+ * and never steers users toward a per-repo `.env`. Mirrors the
+ * hyperframes-media skill's Preflight section.
+ */
+export function buildUnconfiguredLines(
+ ctx: UnconfiguredContext,
+ engines?: OfflineEngineLine[],
+): string[] {
+ if (!ctx.interactive) {
+ return [
+ c.warn("Not signed in to HeyGen (non-interactive)."),
+ c.dim(
+ "Set HEYGEN_API_KEY to use HeyGen, or workflows fall back to local engines (Kokoro voice · MusicGen music).",
+ ),
+ ];
+ }
+ return [
+ c.warn("Not signed in to HeyGen — voice & music will use local engines (free, offline)."),
+ "",
+ "Sign in or sign up (browser OAuth, writes ~/.heygen — no per-repo .env):",
+ ` ${c.accent("npx hyperframes auth login")} ${c.dim("# browser sign-in / sign-up")}`,
+ "",
+ "Or paste an existing HeyGen API key (get one at app.heygen.com/settings/api):",
+ ` ${c.accent("npx hyperframes auth login --api-key")} ${c.dim("# paste at the prompt")}`,
+ "",
+ ...offlineEngineLines(engines),
+ ];
+}
+
+/** Machine-readable form of the unconfigured guidance for `--json`. */
+export function buildUnconfiguredJson(
+ ctx: UnconfiguredContext,
+ engines?: OfflineEngineLine[],
+): Record<string, unknown> {
+ return {
+ configured: false,
+ interactive: ctx.interactive,
+ recommended_action: RECOMMENDED_ACTION,
+ fallback: "local",
+ ...(engines ? { offline_engines: engines } : {}),
+ };
+}
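Review note on `buildUnconfiguredJson`: since the comment in `status.ts` says skills parse this payload, a consumer might read it roughly as follows. This is a sketch under assumptions: `missingSetupHints` and the two interfaces are hypothetical, the field names are copied from the function above, and the payload shape for a configured machine is not shown in this diff, so the sketch ignores anything where `configured` is not `false`.

```typescript
// Hypothetical consumer of `hyperframes auth status --json` output.
interface OfflineEngine {
  capability: "voice" | "music";
  label: string;
  ready: boolean;
  setupHint?: string;
}

interface UnconfiguredPayload {
  configured: false;
  interactive: boolean;
  recommended_action: string;
  fallback: string;
  offline_engines?: OfflineEngine[];
}

/** Collect the setup hints for local engines whose dependencies are missing. */
function missingSetupHints(stdout: string): string[] {
  const payload = JSON.parse(stdout) as Partial<UnconfiguredPayload>;
  // Only the unconfigured shape is known from this diff; skip everything else.
  if (payload.configured !== false) return [];
  return (payload.offline_engines ?? [])
    .filter((e) => !e.ready && e.setupHint !== undefined)
    .map((e) => e.setupHint as string);
}
```

Because `offline_engines` is only present when the engines were probed, the sketch treats a missing array as "nothing to report" rather than as an error.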
diff --git a/packages/cli/src/commands/auth/status.test.ts b/packages/cli/src/commands/auth/status.test.ts
new file mode 100644
index 0000000000..41b44ae936
--- /dev/null
+++ b/packages/cli/src/commands/auth/status.test.ts
@@ -0,0 +1,120 @@
+import { describe, expect, it } from "vitest";
+import {
+ buildUnconfiguredJson,
+ buildUnconfiguredLines,
+ type OfflineEngineLine,
+ type UnconfiguredContext,
+} from "./status-guidance.js";
+
+const INTERACTIVE: UnconfiguredContext = { interactive: true };
+const NON_INTERACTIVE: UnconfiguredContext = { interactive: false };
+
+function joined(ctx: UnconfiguredContext, engines?: OfflineEngineLine[]): string {
+ return buildUnconfiguredLines(ctx, engines).join("\n");
+}
+
+describe("buildUnconfiguredLines — interactive (TTY / agent-driven)", () => {
+ const text = joined(INTERACTIVE);
+
+ it("makes browser OAuth the hyperframes path", () => {
+ expect(text).toContain("hyperframes auth login");
+ expect(text).toMatch(/browser oauth/i);
+ expect(text).toMatch(/sign in or sign up/i);
+ });
+
+ it("never steers users toward a per-repo .env", () => {
+ // The improvised flow recommended writing keys into videos//.env;
+ // this guidance must actively rule that out, not suggest it.
+ expect(text).toContain("no per-repo .env");
+ expect(text).not.toMatch(/paste keys.*\.env/i);
+ });
+
+ it("names the local fallback so 'no key' never reads as a failure", () => {
+ expect(text).toMatch(/Kokoro/);
+ expect(text).toMatch(/MusicGen/);
+ expect(text).toMatch(/free, offline/i);
+ });
+
+ it("shows only zero-install `npx hyperframes` paths, not the separately-installed heygen CLI", () => {
+ expect(text).not.toMatch(/heygen auth login/);
+ expect(text).toContain("npx hyperframes auth login");
+ expect(text).toContain("npx hyperframes auth login --api-key");
+ });
+
+ it("offers the --api-key path as a secondary option", () => {
+ expect(text).toContain("hyperframes auth login --api-key");
+ });
+});
+
+describe("buildUnconfiguredLines — non-interactive (CI / piped)", () => {
+ const lines = buildUnconfiguredLines(NON_INTERACTIVE);
+ const text = lines.join("\n");
+
+ it("is terse — two lines, no browser walkthrough", () => {
+ expect(lines).toHaveLength(2);
+ expect(text).not.toMatch(/opens your browser/i);
+ });
+
+ it("points at HEYGEN_API_KEY and the local fallback", () => {
+ expect(text).toContain("HEYGEN_API_KEY");
+ expect(text).toMatch(/local engines/i);
+ });
+});
+
+describe("buildUnconfiguredLines — offline engine readiness", () => {
+ const ready: OfflineEngineLine[] = [
+ { capability: "voice", label: "Kokoro", ready: true },
+ { capability: "music", label: "MusicGen", ready: true },
+ ];
+ const missing: OfflineEngineLine[] = [
+ { capability: "voice", label: "Kokoro", ready: true },
+ {
+ capability: "music",
+ label: "MusicGen",
+ ready: false,
+ setupHint: "pip install transformers torch soundfile numpy",
+ },
+ ];
+
+ it("shows the resolved engine per capability when ready", () => {
+ const text = joined(INTERACTIVE, ready);
+ expect(text).toMatch(/voice .*Kokoro/);
+ expect(text).toMatch(/music .*MusicGen/);
+ expect(text).toMatch(/ready/);
+ });
+
+ it("surfaces the pip setup hint and doctor pointer when a dep is missing", () => {
+ const text = joined(INTERACTIVE, missing);
+ expect(text).toContain("pip install transformers torch soundfile numpy");
+ expect(text).toMatch(/deps missing/);
+ expect(text).toContain("hyperframes doctor");
+ });
+
+ it("falls back to a generic line when readiness wasn't probed", () => {
+ const text = joined(INTERACTIVE);
+ expect(text).toMatch(/Kokoro/);
+ expect(text).toMatch(/MusicGen/);
+ });
+});
+
+describe("buildUnconfiguredJson", () => {
+ it("recommends auth login and reports the local fallback", () => {
+ for (const ctx of [INTERACTIVE, NON_INTERACTIVE]) {
+ const payload = buildUnconfiguredJson(ctx);
+ expect(payload).toMatchObject({
+ configured: false,
+ interactive: ctx.interactive,
+ recommended_action: "npx hyperframes auth login",
+ fallback: "local",
+ });
+ }
+ });
+
+ it("includes probed engines when provided", () => {
+ const engines: OfflineEngineLine[] = [
+ { capability: "voice", label: "Kokoro", ready: true },
+ { capability: "music", label: "MusicGen", ready: false, setupHint: "pip install ..." },
+ ];
+ expect(buildUnconfiguredJson(INTERACTIVE, engines)).toMatchObject({ offline_engines: engines });
+ });
+});
diff --git a/packages/cli/src/commands/auth/status.ts b/packages/cli/src/commands/auth/status.ts
index 832f6a0d07..d083b1079f 100644
--- a/packages/cli/src/commands/auth/status.ts
+++ b/packages/cli/src/commands/auth/status.ts
@@ -4,6 +4,14 @@
*
* Exits non-zero when nothing is configured or the API rejects the
* credential, so scripts can check "am I logged in?" with `$?`.
+ *
+ * When nothing is configured the output is onboarding-first: an
+ * interactive session (a TTY, or a coding agent driving the CLI) gets
+ * registration guidance led by `hyperframes auth login` — sign-in and
+ * sign-up are the same OAuth step — while CI / non-interactive runs get
+ * a terse note and continue on local fallbacks. This is the shared
+ * preflight every TTS/BGM workflow relays, so the wording lives in one
+ * place instead of each workflow improvising its own.
*/
import { defineCommand } from "citty";
@@ -15,7 +23,15 @@ import {
type ResolvedCredential,
type UserInfo,
} from "../../auth/index.js";
+import { getSystemMeta } from "../../telemetry/system.js";
import { c } from "../../ui/colors.js";
+import { resolveMusic, resolveVoice } from "../../audio/providers.js";
+import {
+ buildUnconfiguredJson,
+ buildUnconfiguredLines,
+ type OfflineEngineLine,
+ type UnconfiguredContext,
+} from "./status-guidance.js";
interface VerifiedStatus {
credential: ResolvedCredential;
@@ -54,13 +70,44 @@ export default defineCommand({
},
});
+/**
+ * Decide whether to show full onboarding guidance or a terse note.
+ * CI is never "interactive" even on a TTY; an agent runtime counts as
+ * interactive because a human is watching its relayed output.
+ */
+function detectUnconfiguredContext(): UnconfiguredContext {
+ const sys = getSystemMeta();
+ return { interactive: !sys.is_ci && (sys.is_tty || sys.agent_runtime !== null) };
+}
+
+/**
+ * Probe the local voice/music engines a workflow would fall back to.
+ * `hasHeygen` is false here by construction — we only reach this when no
+ * credential resolved — so this reports the offline engines and whether
+ * their Python deps are installed.
+ */
+function collectOfflineEngines(): OfflineEngineLine[] {
+ const voice = resolveVoice(false);
+ const music = resolveMusic(false);
+ return [
+ { capability: "voice", label: voice.label, ready: voice.ready, ...hint(voice.setupHint) },
+ { capability: "music", label: music.label, ready: music.ready, ...hint(music.setupHint) },
+ ];
+}
+
+function hint(setupHint: string | undefined): { setupHint?: string } {
+ return setupHint ? { setupHint } : {};
+}
+
function handleUnconfigured(asJson: boolean): never {
- if (asJson) {
- console.log(JSON.stringify({ configured: false }));
- } else {
- console.log(c.warn("Not signed in to HeyGen."));
- console.log(`Run ${c.accent("hyperframes auth login --api-key")} to sign in.`);
- }
+ const ctx = detectUnconfiguredContext();
+ // Probe engines for JSON (skills parse it) and interactive guidance; skip
+ // the Python probes for terse non-interactive/CI output to stay fast.
+ const engines = asJson || ctx.interactive ? collectOfflineEngines() : undefined;
+ const output = asJson
+ ? JSON.stringify(buildUnconfiguredJson(ctx, engines))
+ : buildUnconfiguredLines(ctx, engines).join("\n");
+ console.log(output);
process.exit(1);
}
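Review note on `detectUnconfiguredContext`: the rule "CI is never interactive, even on a TTY; an agent runtime counts as interactive" is easy to misread, so here it is isolated as a pure predicate. This is a sketch: `SystemMetaLike` and `isInteractive` are hypothetical names that assume `getSystemMeta()` exposes the three fields used in the diff, and the `agent_runtime` value in the usage lines is a made-up placeholder.

```typescript
// Hypothetical stand-in for the fields read from getSystemMeta() in this diff.
interface SystemMetaLike {
  is_ci: boolean;
  is_tty: boolean;
  agent_runtime: string | null;
}

/** Same boolean expression as detectUnconfiguredContext, without the system probe. */
function isInteractive(sys: SystemMetaLike): boolean {
  return !sys.is_ci && (sys.is_tty || sys.agent_runtime !== null);
}

// CI overrides everything, including a TTY.
const ciOnTty = isInteractive({ is_ci: true, is_tty: true, agent_runtime: null });
// A plain terminal session is interactive.
const terminal = isInteractive({ is_ci: false, is_tty: true, agent_runtime: null });
// A piped run driven by an agent still counts as interactive.
const agentPiped = isInteractive({ is_ci: false, is_tty: false, agent_runtime: "example-agent" });
// A piped run with no agent gets the terse note.
const plainPipe = isInteractive({ is_ci: false, is_tty: false, agent_runtime: null });
```

Extracting the predicate like this would also let the interactive/terse branch be unit-tested the same way `status-guidance.ts` is, without mocking the telemetry module.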
diff --git a/packages/cli/src/commands/doctor.ts b/packages/cli/src/commands/doctor.ts
index 6619b164d8..f3fe14822f 100644
--- a/packages/cli/src/commands/doctor.ts
+++ b/packages/cli/src/commands/doctor.ts
@@ -5,6 +5,8 @@ import { platform } from "node:os";
import type { Example } from "./_examples.js";
import { c } from "../ui/colors.js";
import { parseToolVersion, runEnvironmentChecks } from "../browser/preflight.js";
+import { KOKORO_MODULES, KOKORO_PIP, MUSICGEN_MODULES, MUSICGEN_PIP } from "../audio/providers.js";
+import { hasPythonModules } from "../tts/python.js";
import { VERSION } from "../version.js";
import { getUpdateMeta, withMeta } from "../utils/updateCheck.js";
import {
@@ -158,6 +160,24 @@ async function checkWhisper(): Promise {
};
}
+function checkLocalVoice(): CheckResult {
+ if (hasPythonModules(KOKORO_MODULES)) return { ok: true, detail: "Kokoro deps installed" };
+ return {
+ ok: false,
+ detail: "Not installed (optional \u2014 local voice fallback)",
+ hint: KOKORO_PIP,
+ };
+}
+
+function checkLocalMusic(): CheckResult {
+ if (hasPythonModules(MUSICGEN_MODULES)) return { ok: true, detail: "MusicGen deps installed" };
+ return {
+ ok: false,
+ detail: "Not installed (optional \u2014 local music fallback)",
+ hint: MUSICGEN_PIP,
+ };
+}
+
export interface CheckOutcome {
name: string;
ok: boolean;
@@ -227,6 +247,8 @@ export default defineCommand({
checks.push({ name: "Environment", run: checkEnvironment });
checks.push({ name: "whisper-cpp", run: checkWhisper });
+ checks.push({ name: "TTS (Kokoro)", run: checkLocalVoice });
+ checks.push({ name: "BGM (MusicGen)", run: checkLocalMusic });
const outcomes: CheckOutcome[] = [];
for (const check of checks) {
diff --git a/packages/cli/src/tts/python.ts b/packages/cli/src/tts/python.ts
new file mode 100644
index 0000000000..13e6111d40
--- /dev/null
+++ b/packages/cli/src/tts/python.ts
@@ -0,0 +1,74 @@
+/**
+ * Shared Python-runtime probes. Used by Kokoro synthesis (which must
+ * actually `import` a module before using it) and by the `auth status` /
+ * `doctor` readiness checks (which only need to know whether a module is
+ * installed, cheaply, without paying the cost of importing heavy packages
+ * like torch).
+ */
+
+import { execFileSync } from "node:child_process";
+
+/** Locate a `python3` (or `python`) on PATH that reports as Python 3. */
+export function findPython(): string | undefined {
+ for (const name of ["python3", "python"]) {
+ try {
+ const cmd = process.platform === "win32" ? "where" : "which";
+ const output = execFileSync(cmd, [name], {
+ encoding: "utf-8",
+ stdio: ["pipe", "pipe", "pipe"],
+ timeout: 5000,
+ });
+ const first = output
+ .split(/\r?\n/)
+ .map((s) => s.trim())
+ .find(Boolean);
+ if (!first) continue;
+
+ // Verify it's Python 3
+ const version = execFileSync(first, ["--version"], {
+ encoding: "utf-8",
+ stdio: ["pipe", "pipe", "pipe"],
+ timeout: 5000,
+ }).trim();
+
+ if (version.includes("Python 3")) return first;
+ } catch {
+ // not found or not Python 3
+ }
+ }
+ return undefined;
+}
+
+/** True if `import <pkg>` succeeds — actually executes the module. */
+export function hasPythonPackage(python: string, pkg: string): boolean {
+ try {
+ execFileSync(python, ["-c", `import ${pkg}`], {
+ stdio: ["pipe", "pipe", "pipe"],
+ timeout: 10_000,
+ });
+ return true;
+ } catch {
+ return false;
+ }
+}
+
+/**
+ * True if every module is installed, checked via `importlib.util.find_spec`
+ * so heavy packages (torch) are never imported — fast enough for a preflight.
+ * Returns false when no Python 3 is found.
+ */
+export function hasPythonModules(modules: string[]): boolean {
+ const python = findPython();
+ if (!python) return false;
+ const list = JSON.stringify(modules);
+ const probe = `import importlib.util,sys; sys.exit(0 if all(importlib.util.find_spec(m) for m in ${list}) else 1)`;
+ try {
+ execFileSync(python, ["-c", probe], {
+ stdio: ["pipe", "pipe", "pipe"],
+ timeout: 10_000,
+ });
+ return true;
+ } catch {
+ return false;
+ }
+}
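Review note on `hasPythonModules`: the probe string is built by splicing `JSON.stringify(modules)` into Python source, which works because a JSON array of plain strings is also a valid Python list literal. The sketch below isolates that step so the generated one-liner can be inspected; `buildFindSpecProbe` is a hypothetical helper name, and it assumes module names are simple dotted identifiers with no characters that JSON and Python escape differently.

```typescript
/**
 * Build the Python one-liner that exits 0 only when every module has an import spec.
 * Assumption: `modules` holds plain module names such as "torch" or "google.genai".
 */
function buildFindSpecProbe(modules: string[]): string {
  // JSON.stringify(["a","b"]) yields ["a","b"], which Python parses as a list of str.
  const list = JSON.stringify(modules);
  return `import importlib.util,sys; sys.exit(0 if all(importlib.util.find_spec(m) for m in ${list}) else 1)`;
}

const kokoroProbe = buildFindSpecProbe(["kokoro_onnx", "soundfile"]);
```

One behaviour worth keeping in mind: for a dotted name such as `google.genai`, `importlib.util.find_spec` imports the parent package first and raises `ModuleNotFoundError` when the parent is absent. The Python process then exits non-zero, so the surrounding `try`/`catch` still reports `false`, but the parent package is genuinely imported in that case.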
diff --git a/packages/cli/src/tts/synthesize.ts b/packages/cli/src/tts/synthesize.ts
index 829417914d..cc887b81f2 100644
--- a/packages/cli/src/tts/synthesize.ts
+++ b/packages/cli/src/tts/synthesize.ts
@@ -1,3 +1,4 @@
+// fallow-ignore-file complexity
import { execFileSync } from "node:child_process";
import { existsSync, writeFileSync, mkdirSync, readdirSync, unlinkSync } from "node:fs";
import { join, dirname, basename } from "node:path";
@@ -9,52 +10,7 @@ import {
inferLangFromVoiceId,
type SupportedLang,
} from "./manager.js";
-
-// ---------------------------------------------------------------------------
-// Python runtime detection
-// ---------------------------------------------------------------------------
-
-function findPython(): string | undefined {
- for (const name of ["python3", "python"]) {
- try {
- const cmd = process.platform === "win32" ? "where" : "which";
- const output = execFileSync(cmd, [name], {
- encoding: "utf-8",
- stdio: ["pipe", "pipe", "pipe"],
- timeout: 5000,
- });
- const first = output
- .split(/\r?\n/)
- .map((s) => s.trim())
- .find(Boolean);
- if (!first) continue;
-
- // Verify it's Python 3
- const version = execFileSync(first, ["--version"], {
- encoding: "utf-8",
- stdio: ["pipe", "pipe", "pipe"],
- timeout: 5000,
- }).trim();
-
- if (version.includes("Python 3")) return first;
- } catch {
- // not found or not Python 3
- }
- }
- return undefined;
-}
-
-function hasPythonPackage(python: string, pkg: string): boolean {
- try {
- execFileSync(python, ["-c", `import ${pkg}`], {
- stdio: ["pipe", "pipe", "pipe"],
- timeout: 10_000,
- });
- return true;
- } catch {
- return false;
- }
-}
+import { findPython, hasPythonPackage } from "./python.js";
// ---------------------------------------------------------------------------
// Inline Python script for Kokoro synthesis
diff --git a/packages/cli/src/utils/lintProject.test.ts b/packages/cli/src/utils/lintProject.test.ts
index e31601aac9..aec22d15e2 100644
--- a/packages/cli/src/utils/lintProject.test.ts
+++ b/packages/cli/src/utils/lintProject.test.ts
@@ -906,6 +906,19 @@ describe("multiple_root_compositions", () => {
expect(finding).toBeUndefined();
});
+ it("ignores root-level caption-skin.html source files", async () => {
+ const project = makeProject(validHtml());
+ writeFileSync(
+ join(project.dir, "caption-skin.html"),
+ '',
+ );
+ const { results } = await lintProject(project);
+ const finding = results[0]?.result.findings.find(
+ (f) => f.code === "multiple_root_compositions",
+ );
+ expect(finding).toBeUndefined();
+ });
+
it("ignores HTML files without data-composition-id", async () => {
const project = makeProject(validHtml());
writeFileSync(join(project.dir, "readme.html"), "Not a composition");
diff --git a/packages/cli/src/utils/lintProject.ts b/packages/cli/src/utils/lintProject.ts
index 2fe61902ec..8558c852bf 100644
--- a/packages/cli/src/utils/lintProject.ts
+++ b/packages/cli/src/utils/lintProject.ts
@@ -477,6 +477,7 @@ function lintMultipleRootCompositions(projectDir: string): HyperframeLintFinding
const rootHtmlFiles = readdirSync(projectDir).filter((f) => f.endsWith(".html"));
const rootCompositions: string[] = [];
for (const file of rootHtmlFiles) {
+ if (file === "caption-skin.html") continue;
const content = readFileSync(join(projectDir, file), "utf-8");
if (/data-composition-id/i.test(content)) {
rootCompositions.push(file);
diff --git a/packages/core/src/lint/context.ts b/packages/core/src/lint/context.ts
index 9de887df56..7ee0d75164 100644
--- a/packages/core/src/lint/context.ts
+++ b/packages/core/src/lint/context.ts
@@ -5,6 +5,7 @@ import {
findRootTag,
collectCompositionIds,
readAttr,
+ stripHtmlComments,
STYLE_BLOCK_PATTERN,
SCRIPT_BLOCK_PATTERN,
} from "./utils";
@@ -29,7 +30,10 @@ export type { HyperframeLintFinding };
export function buildLintContext(html: string, options: HyperframeLinterOptions = {}): LintContext {
const rawSource = html || "";
- let source = rawSource;
+ // Strip HTML comments before scanning so a commented-out tag (such as `<template>`) can't
+ // hijack the boundary match below. Linear + fixpoint (see stripHtmlComments) to
+ // stay ReDoS-free and catch markers that re-form when a comment is removed.
+ let source = stripHtmlComments(rawSource);
const templateMatch = source.match(/<template[^>]*>([\s\S]*)<\/template>/i);
if (templateMatch?.[1]) source = templateMatch[1];
diff --git a/packages/core/src/lint/hyperframeLinter.test.ts b/packages/core/src/lint/hyperframeLinter.test.ts
index 75cd202313..8590bc41c5 100644
--- a/packages/core/src/lint/hyperframeLinter.test.ts
+++ b/packages/core/src/lint/hyperframeLinter.test.ts
@@ -63,4 +63,62 @@ describe("lintHyperframeHtml — orchestrator", () => {
);
expect(missing).toHaveLength(0);
});
+
+ it("ignores comments that mention template tags before the real template", async () => {
+ const html = `
+
+
+
+
+
+
+
+
+
+
+
+`;
+ const result = await lintHyperframeHtml(html, { filePath: "compositions/my-comp.html" });
+ const rootFindings = result.findings.filter(
+ (f) => f.code === "root_missing_composition_id" || f.code === "root_missing_dimensions",
+ );
+ expect(rootFindings).toHaveLength(0);
+ });
+
+ it("strips comments whose markers re-form after one pass (no decoy template survives)", async () => {
+ // Adjacent comment markers: removing the inner `<!-- … -->` in a single pass
+ // re-joins `<` + `!-- … -->` into a fresh, complete `<!-- … -->` that a lone
+ // global replace leaves behind, surfacing a decoy template with no
+ // composition-id. A fixpoint strip removes it; this guards that behavior.
+ const html = `
+
+
+ <!-- -->
+
+
+
+
+
+
+`;
+ const result = await lintHyperframeHtml(html, { filePath: "compositions/my-comp.html" });
+ const rootFindings = result.findings.filter(
+ (f) => f.code === "root_missing_composition_id" || f.code === "root_missing_dimensions",
+ );
+ expect(rootFindings).toHaveLength(0);
+ });
});
diff --git a/packages/core/src/lint/utils.ts b/packages/core/src/lint/utils.ts
index b0901758a6..21c24ce6e5 100644
--- a/packages/core/src/lint/utils.ts
+++ b/packages/core/src/lint/utils.ts
@@ -249,6 +249,36 @@ export function stripJsComments(source: string): string {
return out;
}
+// One linear pass that drops every `<!-- … -->` region. Uses indexOf, not a
+// `/<!--[\s\S]*?-->/` regex: that pattern backtracks O(n²) on inputs with many
+// unterminated "<!--" openers. A "<!--" with no closing "-->" is kept verbatim,
+// matching the prior regex's no-match behavior.
+function stripHtmlCommentsOnce(source: string): string {
+ let out = "";
+ let i = 0;
+ for (;;) {
+ const start = source.indexOf("<!--", i);
+ if (start < 0) return out + source.slice(i);
+ const end = source.indexOf("-->", start + 4);
+ if (end < 0) return out + source.slice(i);
+ out += source.slice(i, start);
+ i = end + 3;
+ }
+}
+
+// Strip HTML comments to a fixpoint. A single pass is not enough: deleting one
+// comment can splice adjacent markers into a fresh, complete `<!-- … -->` (e.g.
+// "<<!-- … -->!-- … -->" → "<!-- … -->"), which would otherwise survive and let a
+// commented-out tag (such as `<template>`) hijack the linter's tag scan.
+export function stripHtmlComments(source: string): string {
+ let out = source;
+ for (let prev = ""; prev !== out; ) {
+ prev = out;
+ out = stripHtmlCommentsOnce(out);
+ }
+ return out;
+}
+
export function extractScriptTextsAndSrcs(scripts: ExtractedBlock[]): {
texts: string[];
srcs: string[];
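
Review note on the `stripHtmlComments` hunk above: a minimal, self-contained sketch of why a single pass is not enough. The function names `stripOnce` and `stripToFixpoint` and the sample string `decoy` are illustrative assumptions, not identifiers from the repository; the loop shape follows the diff.

```typescript
// Sketch only: mirrors the indexOf-based strip in the diff under assumed names.
function stripOnce(source: string): string {
  let out = "";
  let i = 0;
  for (;;) {
    const start = source.indexOf("<!--", i);
    if (start < 0) return out + source.slice(i); // no more openers
    const end = source.indexOf("-->", start + 4);
    if (end < 0) return out + source.slice(i); // unterminated opener: keep verbatim
    out += source.slice(i, start);
    i = end + 3; // skip past the closing "-->"
  }
}

// Repeat the linear pass until the output stops changing.
function stripToFixpoint(source: string): string {
  let out = source;
  for (let prev = ""; prev !== out; ) {
    prev = out;
    out = stripOnce(out);
  }
  return out;
}

// Hypothetical input: the inner comment sits between "<" and "!--".
const decoy = "<<!-- x -->!-- <template></template> -->real";
const onePass = stripOnce(decoy); // a fresh comment re-forms here
const fixpoint = stripToFixpoint(decoy); // the re-formed comment is removed too
console.log(JSON.stringify({ onePass, fixpoint }));
```

Each pass is linear in the input, and every extra pass removes at least one comment, so the loop terminates.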
diff --git a/skills/faceless-explainer/SKILL.md b/skills/faceless-explainer/SKILL.md
index 1e801fc4fd..7051ca8860 100644
--- a/skills/faceless-explainer/SKILL.md
+++ b/skills/faceless-explainer/SKILL.md
@@ -23,7 +23,9 @@ Initialize only if `hyperframes.json` is missing. Name `` from the topi
`npx hyperframes init "videos/" --non-interactive --skip-skills --example=blank`
-**Gate:** `hyperframes.json` exists, and angle, length, aspect ratio, and language are locked.
+**Show sign-in status before the brief** — run `npx hyperframes auth status` and **relay its output verbatim (don't paraphrase or rewrite it).** It reports whether voice/BGM will use HeyGen or local engines and, when not signed in, how to sign in. **If not signed in, STOP and wait for the user to choose — sign in, or say "go"/"offline" to continue with local engines — before asking the brief or anything else.** Treat it as a real decision point, not a passing note; don't fold the choice into the brief question, and don't write keys into a per-repo `.env`. (In autonomous mode, note the status and continue offline.) See `../hyperframes-media` → Preflight for the canonical guidance.
+
+**Gate:** `hyperframes.json` exists, and angle, length, aspect ratio, and language are locked; sign-in status was shown (signed in, or continuing offline).
---
@@ -52,11 +54,11 @@ You make the one judgment call — **which preset**. Read `../hyperframes-creati
node /scripts/build-frame.mjs --preset --hyperframes .
```
-The script does the rest deterministically: copies the preset's `FRAME.md` → `frame.md` and **remixes** it onto any brand tokens in `capture/extracted/tokens.json` (brand colors mapped onto the preset's color keys by role; the preset's display + body fonts swapped for the brand's), copies the preset's `caption-skin.html` verbatim, and self-validates (exits 1 on a broken mapping). Proceed as soon as it exits 0 — no hand-editing of the spec.
+The script does the rest deterministically: copies the preset's `FRAME.md` → `frame.md` and **remixes** it onto any brand tokens in `capture/extracted/tokens.json` (brand colors mapped onto the preset's color keys by role; the preset's display + body fonts swapped for the brand's), copies the preset's caption skin to `.hyperframes/caption-skin.html`, and self-validates (exits 1 on a broken mapping). Proceed as soon as it exits 0 — no hand-editing of the spec.
A faceless explainer usually has **no brand colors/fonts** (`tokens.json` colors/fonts empty) → the script keeps the preset's own palette, a complete shippable design. Only when the user has named brand colors/fonts should you add them to `tokens.json` before running; adjust `frame.md` by hand afterward only if a mapping truly needs it.
-**Gate:** `build-frame.mjs` exited 0 — `frame.md` exists from a named preset, and (when the preset ships one) `caption-skin.html` is at the project root.
+**Gate:** `build-frame.mjs` exited 0 — `frame.md` exists from a named preset, and (when the preset ships one) `.hyperframes/caption-skin.html` exists as the caption skin source.
---
@@ -78,7 +80,7 @@ After drafting, show a frame-by-frame summary. In that same message ask the user
Goal: Generate narration, word timings, music, and audio metadata from the approved script.
-Start audio after Step 3 approval. Run it in the background, then continue to Step 4.
+Start audio after Step 3 approval. Run it in the background, then continue to Step 4. (Sign-in status was already shown in Step 0; the engine falls back automatically.)
`node /scripts/audio.mjs --script ./SCRIPT.md --storyboard ./STORYBOARD.md --hyperframes . --out ./audio_meta.json &`
@@ -130,7 +132,7 @@ After audio timings exist, build captions in the background and assemble the ind
`node /scripts/assemble-index.mjs --storyboard ./STORYBOARD.md --hyperframes .`
-`captions.mjs` uses the project's `caption-skin.html` (copied in Step 2) as the caption look, injecting brand tokens from `frame.md`; with no skin present it renders the built-in default pill. `captions: skipped ()` is valid. Continue without captions when explicitly skipped.
+`captions.mjs` uses the project's `.hyperframes/caption-skin.html` (copied in Step 2) as the caption look, injecting brand tokens from `frame.md`; with no skin present it renders the built-in default pill. `captions: skipped ()` is valid. Continue without captions when explicitly skipped.
**Gate:** every frame is marked `animated`, `index.html` exists, and captions are built or explicitly skipped.
@@ -150,11 +152,15 @@ Inject transitions, run checks, pause for review, then render.
`npx hyperframes validate`
-`npx hyperframes inspect --strict-layout`
+`npx hyperframes inspect`
`npx hyperframes snapshot --at `
-If a command fails, surface stderr and stop. Do not pile on recovery commands. If a gate names a frame, fix `compositions/frames/NN-*.html` with the cheapest safe fix: edit the frame HTML for a local issue; re-dispatch the frame worker only when the whole shot must be rebuilt.
+`snapshot` stitches the captured frames into one contact sheet (`snapshots/contact-sheet.jpg`). Glance at it; if nothing is obviously broken, move on — don't linger here.
+
+If a command fails, surface stderr and stop — don't pile on recovery commands. Fix it yourself: the cheapest safe edit to `compositions/frames/NN-*.html`, then rerun the failed check.
+
+**Known false-positive — do not chase it.** `inspect` may report a handful of `text_box_overflow` errors of ~1–4px on the **caption** highlight words (selector `#caption-word-*` / `.caption-line`). The caption pill uses a deliberately snug `line-height` (set once in `scripts/captions.mjs`) and has **no `overflow:hidden`**, so a heavy display glyph's ink spills a few px into the pill's own padding — nothing is actually clipped. Treat these as expected and proceed. Do **not** inflate the caption `line-height` (it balloons the pill, which is worse). Only act on a `text_box_overflow` when it names a **frame** element (`#el-NN-*`), not a caption word.
After checks pass, pause for user review. The video is assembled, viewable, and editable in Studio. Manage preview only once across Step 3 and Step 6: open it if the user asked earlier, offer it if they declined earlier, and do not ask again if they are already reviewing in Studio.
diff --git a/skills/faceless-explainer/scripts/build-frame.mjs b/skills/faceless-explainer/scripts/build-frame.mjs
index dabb9ff7d8..de00509ae3 100644
--- a/skills/faceless-explainer/scripts/build-frame.mjs
+++ b/skills/faceless-explainer/scripts/build-frame.mjs
@@ -19,7 +19,14 @@
// fonts — the preset's display family → the brand display font, its body family →
// the brand body font, wherever they appear. Empty brand fonts → kept.
-import { copyFileSync, existsSync, readdirSync, readFileSync, writeFileSync } from "node:fs";
+import {
+ copyFileSync,
+ existsSync,
+ mkdirSync,
+ readdirSync,
+ readFileSync,
+ writeFileSync,
+} from "node:fs";
import { dirname, join, resolve } from "node:path";
import { fileURLToPath } from "node:url";
import {
@@ -214,8 +221,10 @@ if (brandColors.length && presetColors.length) {
}
if (inBlock && /^\S/.test(line)) inBlock = false;
if (!inBlock) return line;
- const m = line.match(/^(\s+)([\w-]+):\s*["']?[^"'\n]*["']?\s*$/);
- if (m && newByKey.has(m[2])) return `${m[1]}${m[2]}: "${newByKey.get(m[2])}"`;
+ const m = line.match(
+ /^(\s+)([\w-]+):\s*(?:"[^"]*"|'[^']*'|#[0-9a-fA-F]{3,8}|rgba?\([^)]*\)|[^#\n]*?)(\s+#.*)?$/,
+ );
+ if (m && newByKey.has(m[2])) return `${m[1]}${m[2]}: "${newByKey.get(m[2])}"${m[3] ?? ""}`;
return line;
})
.join("\n");
@@ -254,7 +263,9 @@ writeFileSync(framePath, md);
const presetSkin = join(presetDir, presetName, "caption-skin.html");
let skinCopied = false;
if (existsSync(presetSkin)) {
- copyFileSync(presetSkin, join(hyperframesDir, "caption-skin.html"));
+ const skinDir = join(hyperframesDir, ".hyperframes");
+ mkdirSync(skinDir, { recursive: true });
+ copyFileSync(presetSkin, join(skinDir, "caption-skin.html"));
skinCopied = true;
}
@@ -275,6 +286,6 @@ if (li != null && lc != null && li >= lc) {
console.log(`✓ build-frame: ${presetName} → ${framePath}`);
for (const s of summary) console.log(` ${s}`);
console.log(
- ` caption-skin.html: ${skinCopied ? "copied" : "preset ships none — captions will use the default pill"}`,
+ ` .hyperframes/caption-skin.html: ${skinCopied ? "copied" : "preset ships none — captions will use the default pill"}`,
);
console.log(` self-check: keys preserved, ink darker than canvas ✓`);
diff --git a/skills/faceless-explainer/scripts/captions.mjs b/skills/faceless-explainer/scripts/captions.mjs
index 72035382fa..dfa25c9fcf 100644
--- a/skills/faceless-explainer/scripts/captions.mjs
+++ b/skills/faceless-explainer/scripts/captions.mjs
@@ -15,8 +15,9 @@
// node captions.mjs build --storyboard ./STORYBOARD.md --audio-meta ./audio_meta.json --hyperframes . --out ./caption_groups.json
//
// CAPTION LOOK — two sources, picked automatically:
-// 1. PRESET SKIN (preferred). If a project-local `caption-skin.html` exists (Step 2
-// copies the chosen frame-preset's skin into the project), it is the caption look.
+// 1. PRESET SKIN (preferred). If a project-local `.hyperframes/caption-skin.html`
+// exists (Step 2 copies the chosen frame-preset's skin into the project), it is
+// the caption look.
// It is a brand-token-strict skin with three reserved holes; this script fills them
// and wraps the result in a `<template>` for the engine:
// - `var GROUPS = [];` → the computed caption groups
@@ -68,7 +69,12 @@ function runBuild(argv) {
const outPath = resolve(flag(argv, "out", join(hyperframesDir, "caption_groups.json")));
const htmlPath = join(hyperframesDir, "compositions/captions.html");
const overridesPath = join(hyperframesDir, "caption-overrides.json");
- const skinPath = resolve(flag(argv, "skin", join(hyperframesDir, "caption-skin.html")));
+ const skinArg = flag(argv, "skin", null);
+ const hiddenSkinPath = join(hyperframesDir, ".hyperframes", "caption-skin.html");
+ const legacySkinPath = join(hyperframesDir, "caption-skin.html");
+ const skinPath = resolve(
+ skinArg ?? (existsSync(hiddenSkinPath) ? hiddenSkinPath : legacySkinPath),
+ );
const framePath = resolve(flag(argv, "frame", join(hyperframesDir, "frame.md")));
if (!existsSync(storyboardPath)) die(`STORYBOARD.md not found at ${storyboardPath}`);
@@ -291,8 +297,8 @@ function brandFontFaces(framePath, hyperframesDir) {
const weightOf = (n) => {
const s = n.toLowerCase();
if (/black|heavy|ultra|extrabold/.test(s)) return 800;
+ if (/semibold|demibold/.test(s)) return 600; // before /bold/ — "demibold" contains "bold"
if (/bold/.test(s)) return 700;
- if (/semibold|demibold/.test(s)) return 600;
if (/medium/.test(s)) return 500;
if (/light|thin/.test(s)) return 300;
return 400; // book / regular / roman
@@ -305,10 +311,22 @@ function brandFontFaces(framePath, hyperframesDir) {
: /\.ttf$/i.test(f)
? "truetype"
: "opentype";
+ // Normalize away ALL non-alphanumerics (spaces, underscores, hyphens) on BOTH the
+ // family name and the filename. Real font files use "_" / "-" as word separators
+ // ("TT_Norms_Pro_Bold.woff2"), so stripping only whitespace never matched them — the
+ // family key "ttnormspro" failed `startsWith` against "tt_norms_pro_bold", and the
+ // function silently returned "" → captions shipped with NO @font-face for any
+ // underscore/hyphen-named brand font (e.g. TT Norms Pro), which is exactly the
+ // font_family_without_font_face bug.
+ const norm = (s) => s.toLowerCase().replace(/[^a-z0-9]/g, "");
const faces = [];
const seen = new Set();
- for (const fam of families) {
- const key = fam.replace(/\s+/g, "").toLowerCase();
+ const claimed = new Set(); // each file is claimed by the MOST SPECIFIC family only
+ // Match the longest family key first so "TT Norms Pro" can't swallow the files that
+ // belong to "TT Norms Pro Mono" (its key is a prefix of the longer one's).
+ const ranked = [...families].sort((a, b) => norm(b).length - norm(a).length);
+ for (const fam of ranked) {
+ const key = norm(fam);
for (const d of dirs) {
let files = [];
try {
@@ -318,17 +336,34 @@ function brandFontFaces(framePath, hyperframesDir) {
}
for (const f of files.sort()) {
if (!/\.(woff2|woff|ttf|otf)$/i.test(f)) continue;
- if (!f.replace(/\s+/g, "").toLowerCase().startsWith(key)) continue;
+ if (claimed.has(f)) continue; // a more specific family already took this file
+ if (!norm(f.replace(/\.(woff2|woff|ttf|otf)$/i, "")).startsWith(key)) continue;
const w = weightOf(f);
const dedup = `${fam}-${w}`;
if (seen.has(dedup)) continue; // one src per weight; assets/fonts wins over capture
seen.add(dedup);
+ claimed.add(f);
faces.push(
` @font-face { font-family: '${fam}'; src: url('${d.rel}/${f}') format('${fmtOf(f)}'); font-weight: ${w}; font-display: block; }`,
);
}
}
}
+ // Loud signal instead of a silent "". If frame.md named a brand font but no file
+ // matched, the caption text WILL fall back to a generic font in the render — surface
+ // the cause here (at build time) rather than letting it surface 2 steps later as a
+ // font_family_without_font_face lint error disconnected from its root cause.
+ if (!faces.length) {
+ const where = dirs.length
+ ? dirs.map((d) => d.rel).join(" / ")
+ : "assets/fonts or capture/assets/fonts (neither exists)";
+ console.warn(
+ ` ⚠ captions: frame.md names font ${families.map((f) => `"${f}"`).join(", ")} ` +
+ `but no matching .woff2/.woff/.ttf/.otf was found in ${where} — captions will fall back ` +
+ `(text may render in the wrong font). Stage a font file whose name starts with the family ` +
+ `(e.g. "TT Norms Pro" → TT_Norms_Pro_Bold.woff2) so it ships with the project.`,
+ );
+ }
return faces.join("\n");
}
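
Review note on the `brandFontFaces` hunk above: a simplified sketch of the three matching rules it introduces (normalize away separators, rank the longest family key first, test `semibold|demibold` before `bold`). The helper name `claimFonts` and the sample file names are assumptions for illustration, and the per-weight dedup and directory walk from the real function are left out.

```typescript
// Drop every non-alphanumeric so "TT Norms Pro" and "TT_Norms_Pro" compare equal.
const norm = (s: string): string => s.toLowerCase().replace(/[^a-z0-9]/g, "");

// Weight from the file name; "demibold" contains "bold", so it is tested first.
function weightOf(name: string): number {
  const s = name.toLowerCase();
  if (/black|heavy|ultra|extrabold/.test(s)) return 800;
  if (/semibold|demibold/.test(s)) return 600;
  if (/bold/.test(s)) return 700;
  if (/medium/.test(s)) return 500;
  if (/light|thin/.test(s)) return 300;
  return 400;
}

// Hypothetical helper: maps each font file to the most specific family that matches it.
function claimFonts(families: string[], files: string[]): Map<string, string> {
  const claimed = new Map<string, string>(); // file name -> family
  const ranked = [...families].sort((a, b) => norm(b).length - norm(a).length);
  for (const fam of ranked) {
    const key = norm(fam);
    for (const f of [...files].sort()) {
      if (claimed.has(f)) continue; // a longer family key already took this file
      const stem = f.replace(/\.(woff2|woff|ttf|otf)$/i, "");
      if (norm(stem).startsWith(key)) claimed.set(f, fam);
    }
  }
  return claimed;
}

const sampleFamilies = ["TT Norms Pro", "TT Norms Pro Mono"];
const sampleFiles = [
  "TT_Norms_Pro_Bold.woff2",
  "TT_Norms_Pro_Mono_Regular.woff2",
  "TT-Norms-Pro-DemiBold.otf",
];
const claims = claimFonts(sampleFamilies, sampleFiles);
console.log([...claims.entries()]);
```

Ranking by key length is what keeps the shorter family from claiming the Mono file, since its key is a prefix of the longer one.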
diff --git a/skills/faceless-explainer/scripts/lib/tokens.mjs b/skills/faceless-explainer/scripts/lib/tokens.mjs
index 15f87818c4..3f51778ec5 100644
--- a/skills/faceless-explainer/scripts/lib/tokens.mjs
+++ b/skills/faceless-explainer/scripts/lib/tokens.mjs
@@ -15,9 +15,9 @@ export function parseColors(md) {
if (!inBlock) continue;
if (/^\S/.test(line)) break; // dedent to a top-level key → end of block
const m = line.match(
- /^\s+([\w-]+):\s*["']?(#[0-9a-fA-F]{3,8}|rgba?\([^)]*\)|[^"'#\s][^"'\n]*?)["']?\s*$/,
+ /^\s+([\w-]+):\s*(?:"([^"]+)"|'([^']+)'|(#[0-9a-fA-F]{3,8}|rgba?\([^)]*\)|[^#\s][^#\n]*?))\s*(?:#.*)?$/,
);
- if (m) out.push([m[1], m[2].trim()]);
+ if (m) out.push([m[1], (m[2] ?? m[3] ?? m[4]).trim()]);
}
return out;
}
@@ -165,6 +165,10 @@ export function parseFonts(md) {
roles["card-headline"] ??
roles["section-headline"] ??
roles["quote-display"] ??
+ roles.h1 ??
+ roles.h2 ??
+ roles.title ??
+ roles.hero ??
body;
return { display: q(display), body: q(body) };
}
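
Review note on the `parseColors` hunk above: a small sketch showing what the new pattern accepts. The regex is copied from the added line in the diff; the wrapper `parseColorLine` and the sample lines are assumptions for illustration.

```typescript
// Pattern from the diff: quoted value, or bare hex / rgb() / word, plus an optional trailing "# comment".
const COLOR_LINE =
  /^\s+([\w-]+):\s*(?:"([^"]+)"|'([^']+)'|(#[0-9a-fA-F]{3,8}|rgba?\([^)]*\)|[^#\s][^#\n]*?))\s*(?:#.*)?$/;

// Hypothetical wrapper: returns [key, value] for an indented color line, else null.
function parseColorLine(line: string): [string, string] | null {
  const m = line.match(COLOR_LINE);
  if (!m) return null;
  const key = m[1];
  const value = m[2] ?? m[3] ?? m[4]; // double-quoted, single-quoted, or bare
  if (key === undefined || value === undefined) return null;
  return [key, value.trim()];
}

const quotedWithComment = parseColorLine('  canvas: "#0B0B0F"  # page background');
const bareHex = parseColorLine("  ink: #fff");
const rgbWithComment = parseColorLine("  accent: rgb(255, 0, 0) # brand red");
const topLevelKey = parseColorLine("colors:");
console.log(quotedWithComment, bareHex, rgbWithComment, topLevelKey);
```

The separate capture groups are what allow a quoted hex value to be followed by a trailing comment without the comment's `#` being mistaken for the color.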
diff --git a/skills/faceless-explainer/sub-agents/frame-worker.md b/skills/faceless-explainer/sub-agents/frame-worker.md
index 922d905a99..c5247ada5d 100644
--- a/skills/faceless-explainer/sub-agents/frame-worker.md
+++ b/skills/faceless-explainer/sub-agents/frame-worker.md
@@ -68,5 +68,5 @@ The orchestrator runs `lint` / `validate` / `inspect`; catch these yourself firs
- `exit_animation_on_non_final_scene` — no exit tween unless you are the final frame.
- **Shot develops (not a slide)** — a non-still frame carries a development beat between entrance and settle; cited effects are sequenced into phases, not all fired at `t=0`.
- **Adapt fidelity** — if the note led with `Base / Keep / Depart`, the `Keep` signature is present and recognizable, every `Depart` is applied, and the shot still runs `entrance → development → settle`.
-- `font_family_without_font_face` — every non-system font named in `frame.md` has an `@import` / `@font-face`.
+- `font_family_without_font_face` — every font you name has a matching `@font-face` (or `@import`) **inside this file**. **Only use fonts that ship as files** with the project: the families declared in `frame.md` (their `.woff2` live in `assets/fonts/` or `capture/assets/fonts/` — point the `@font-face` `src` at the real file you find there). **Never name a font that has no file**, including system CJK / Japanese / Devanagari families (`Hiragino Sans`, `Yu Gothic`, `Noto Sans CJK`, `Noto Sans Devanagari`, …): the render machine is a clean headless Chrome with none of them installed, so the text silently falls back to a generic font and the typography is wrong in the MP4. For non-Latin or multilingual visible text, either use a shipped font that covers the script, or romanize / transliterate it (e.g. `日本語` → `Japanese`); if neither is possible it is out of scope for this frame — do not invent a font name.
- **Keep-out + no-narration-text** (eyeball, no code) — nothing sits below the 83% cutoff; no narration sentence is rendered as visible text.
diff --git a/skills/hyperframes-media/SKILL.md b/skills/hyperframes-media/SKILL.md
index 8a5f3de16d..33d77eb67d 100644
--- a/skills/hyperframes-media/SKILL.md
+++ b/skills/hyperframes-media/SKILL.md
@@ -32,6 +32,21 @@ All three capabilities degrade on **ONE switch** — whether a HeyGen credential
Full flag list + the `audio_meta.json` schema live in the header of `scripts/audio.mjs`. The references below cover the provider details and edge cases behind each capability.
+## Preflight — show sign-in status before any audio
+
+**Always run this before generating voice or BGM — inside a full workflow _or_ a one-off "generate me a BGM/voiceover" request.** No HeyGen credential is **not** a reason to silently fall back to local engines: first recommend signing in and let the user decide. Run the shared preflight and **relay its output verbatim** — don't improvise your own "missing key" prompt, and don't offer to write keys into a per-repo `.env`:
+
+```bash
+npx hyperframes auth status
+```
+
+- **Signed in** → it prints the account; proceed.
+- **Not signed in** (`exit 1` is expected here — "not signed in" is a normal state, not a failure) → it prints registration-first guidance. Recommend signing in: `npx hyperframes auth login` is browser OAuth — it **signs in and creates an account** (always available through this repo's CLI). To use an existing HeyGen API key (from app.heygen.com/settings/api), run `npx hyperframes auth login --api-key` — it saves to the shared `~/.heygen` (no per-repo `.env`). The output also lists the local engines voice/BGM will fall back to and a `pip` hint when deps are missing. **Relay this output as-is — don't paraphrase it into your own wording.** Then **STOP and wait** for the user to choose — sign in, or say "go" / "local" to continue offline — **before generating anything.** This is a real decision point, not a passing note: don't fold it into another question, and don't proceed past it on your own. (Exception: in autonomous / non-interactive mode, note the status and continue offline.)
+- `npx hyperframes auth status --json` returns `{ configured, recommended_action, offline_engines }` for deterministic branching.
+- **If the CLI can't run** (not on PATH and `npx` can't fetch it) → still **recommend signing in** (`npx hyperframes auth login`) and **STOP for the user's choice** — don't treat "no credential" as a silent green light for local generation.
+
+Credential resolution, full key priority, and the local-dependency list are in `references/requirements.md`.
+
## Provider chains (the detail behind the engine)
**TTS** — first available provider wins (the engine, or `npx hyperframes tts "..."`):
@@ -79,3 +94,4 @@ See `references/bgm.md` and `references/sfx.md`.
- **Captions consume the flat word-array format** with `{ id, text, start, end }`. See `references/transcribe.md` → "Output Shape".
- **`remove-background --background-output` is hole-cut, not inpainted.** For "scene without the person", a different tool is needed. See `references/remove-background.md` → "When NOT the right tool".
- **BGM/SFX default to HeyGen retrieval; the no-credential fallback is generation (BGM) or the bundled library (SFX).** `/audio/sounds` ranks by a text query — name effects concretely (`glass shatter`, not `dramatic sound`); a no-match **skips**, never blocks the render. SFX sit at volume ~0.35 under voice + BGM. See `references/sfx.md` / `references/bgm.md`.
+- **Treat workflow caption HTML as generated output.** For preset-backed videos, the reusable skin source lives at `.hyperframes/caption-skin.html` and the workflow script writes `compositions/captions.html`; do not edit generated `compositions/captions.html` to fix the skin. Rebuild via the workflow's `captions.mjs`, or use that workflow's explicit overrides mechanism when present.
diff --git a/skills/hyperframes-media/references/bgm.md b/skills/hyperframes-media/references/bgm.md
index cff7a440bf..277626ca93 100644
--- a/skills/hyperframes-media/references/bgm.md
+++ b/skills/hyperframes-media/references/bgm.md
@@ -3,7 +3,9 @@
One music bed per composition, produced by the shared audio engine (`scripts/audio.mjs` → `scripts/lib/bgm.mjs`). Two routes, chosen by the engine's one switch — whether a HeyGen credential is present:
- **HeyGen retrieval — the default when credentialed.** Search HeyGen's music catalog by mood, download the top track. No generation; same `~/.heygen` / `$HEYGEN_API_KEY` credential as TTS.
-- **Local generation (Lyria → MusicGen) — the automatic fallback when there is no credential** (or when asked for explicitly). Generate a WAV from a mood prompt. There is **no `npx hyperframes bgm` command**; the engine spawns `scripts/lyria-recipe.py` or an inline MusicGen script directly.
+- **Local generation (Lyria → MusicGen) — the fallback when there is no credential** (or when asked for explicitly). Generate a WAV from a mood prompt. There is **no `npx hyperframes bgm` command**; the engine spawns `scripts/lyria-recipe.py` or an inline MusicGen script directly.
+
+> **Run the Preflight first — no credential is not a green light to silently generate locally.** Before generating, complete the sign-in **Preflight** (see `../SKILL.md` → Preflight): run `npx hyperframes auth status`, recommend signing in, and **STOP for the user's choice** (sign in for HeyGen's music library, or continue offline with local generation). This applies to a one-off "generate a BGM" request just as much as inside a full workflow.
## Driving it from the request
diff --git a/skills/hyperframes-media/references/captions/authoring.md b/skills/hyperframes-media/references/captions/authoring.md
index 8b4d7c168f..65687f6803 100644
--- a/skills/hyperframes-media/references/captions/authoring.md
+++ b/skills/hyperframes-media/references/captions/authoring.md
@@ -156,4 +156,4 @@ Caption components ship with transparent backgrounds — they're pure overlays.
- Sync to transcript timestamps.
- One group visible at a time.
- Every group must have a hard `tl.set` kill at `group.end`.
-- The compiler embeds supported fonts automatically — just declare `font-family` in CSS.
+- Fonts: the compiler auto-embeds only its **built-in mapped set** (Inter, Roboto, Montserrat, …) — for those, just declare `font-family` in CSS. Any **other** font (a brand/custom font like `TT Norms Pro`, or a non-Latin CJK/Devanagari family) is **not** auto-supplied: it needs an `@font-face` pointing at a real `.woff2` shipped with the project, or the text silently falls back to a generic font in the render. Don't assume a `font-family` you can see locally will render — the render machine is a clean headless Chrome with no installed fonts.
diff --git a/skills/hyperframes-media/references/requirements.md b/skills/hyperframes-media/references/requirements.md
index a4b2a1a1b7..d695b3e0b2 100644
--- a/skills/hyperframes-media/references/requirements.md
+++ b/skills/hyperframes-media/references/requirements.md
@@ -1,8 +1,24 @@
# Requirements & Caches
+## Credential & key priority
+
+Run `npx hyperframes auth status` to see what's configured and which engines a workflow will use (see the skill's **Preflight** section). Keys resolve in this order — **first match wins**:
+
+| Provider | Resolution order (first non-empty wins) | Local deps when used |
+| ------------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------ |
+| **HeyGen** (TTS + BGM/SFX retrieval) | `$HEYGEN_API_KEY` → `$HYPERFRAMES_API_KEY` → `~/.heygen/credentials` (shared with heygen-cli; `$HEYGEN_CONFIG_DIR` overrides the dir; written by `hyperframes auth login`) | none (REST) |
+| **ElevenLabs** (TTS fallback) | `$ELEVENLABS_API_KEY` | `pip install elevenlabs` |
+| **Lyria** (BGM fallback) | `$GEMINI_API_KEY` → `$GOOGLE_API_KEY` | `pip install google-genai` |
+| **Kokoro** (TTS, no key) | always — final voice fallback | `pip install kokoro-onnx soundfile` |
+| **MusicGen** (BGM, no key) | always — final music fallback | `pip install transformers torch soundfile numpy` |
+
+`hyperframes auth login` (browser OAuth) is the recommended setup: one sign-in, every project, no per-repo `.env`. An OAuth login is sent as `Authorization: Bearer`; an API key as `X-Api-Key`. With no HeyGen credential, voice/BGM run fully locally (Kokoro / MusicGen) — `hyperframes auth status` and `hyperframes doctor` both report whether those local deps are installed.
+
+## Model caches & system dependencies
+
Each command downloads its own model on first run and caches it under `~/.cache/hyperframes/`:
-- **TTS (HeyGen)** — no local deps; needs a HeyGen credential + `ffmpeg` on PATH (to transcode the mp3 response to `.wav`). Credential resolves like the CLI: `$HEYGEN_API_KEY` → `$HYPERFRAMES_API_KEY` → `~/.heygen/credentials` (shared with heygen-cli; run `hyperframes auth login`). An OAuth login is sent as `Authorization: Bearer`; an API key as `X-Api-Key`.
+- **TTS (HeyGen)** — no local deps; needs a HeyGen credential + `ffmpeg` on PATH (to transcode the mp3 response to `.wav`). Credential resolves like the CLI: `$HEYGEN_API_KEY` → `$HYPERFRAMES_API_KEY` → `~/.heygen/credentials` (shared with heygen-cli; run `npx hyperframes auth login`). An OAuth login is sent as `Authorization: Bearer`; an API key as `X-Api-Key`.
- **TTS (ElevenLabs)** — same as HeyGen: API key + `ffmpeg`.
- **TTS (Kokoro)** — Kokoro-82M (~311 MB) + voices (~27 MB) in `tts/`. Requires Python 3.8+ with `kokoro-onnx` and `soundfile` (`pip install kokoro-onnx soundfile`). Non-English text also needs `espeak-ng` system-wide.
- **BGM (Lyria)** — needs `$GEMINI_API_KEY` or `$GOOGLE_API_KEY` + `pip install google-genai`. No local model cache.
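
Review note on the credential table in `requirements.md` above: a minimal sketch of the "first non-empty wins" order for the HeyGen row. The function name `resolveHeygenCredential`, its return shape, and treating whitespace-only values as empty are assumptions; the real resolver also handles the credentials file format and `$HEYGEN_CONFIG_DIR`, which are not modeled here.

```typescript
type Env = Record<string, string | undefined>;

interface ResolvedCredential {
  source: string; // which slot supplied the value
  value: string;
}

// Hypothetical resolver: env vars first, then the stored credential, else null.
function resolveHeygenCredential(
  env: Env,
  storedCredential: string | undefined,
): ResolvedCredential | null {
  const candidates: Array<[string, string | undefined]> = [
    ["HEYGEN_API_KEY", env["HEYGEN_API_KEY"]],
    ["HYPERFRAMES_API_KEY", env["HYPERFRAMES_API_KEY"]],
    ["~/.heygen/credentials", storedCredential],
  ];
  for (const [source, value] of candidates) {
    // Assumption: a blank or whitespace-only value counts as "not set".
    if (value !== undefined && value.trim() !== "") return { source, value };
  }
  return null; // no credential: voice and BGM fall back to the local engines
}

const fromSecondSlot = resolveHeygenCredential(
  { HEYGEN_API_KEY: "", HYPERFRAMES_API_KEY: "hf-key" },
  "stored-token",
);
const fromFile = resolveHeygenCredential({}, "stored-token");
const none = resolveHeygenCredential({}, undefined);
console.log(fromSecondSlot, fromFile, none);
```

A `null` result corresponds to the "not signed in" state that the Preflight section asks the agent to surface before generating anything.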
diff --git a/skills/hyperframes-media/references/tts.md b/skills/hyperframes-media/references/tts.md
index d61e161c0e..7f7b6c17a9 100644
--- a/skills/hyperframes-media/references/tts.md
+++ b/skills/hyperframes-media/references/tts.md
@@ -2,6 +2,8 @@
`npx hyperframes tts` auto-detects a provider from env vars; explicit override via `--provider`.
+> **Run the Preflight first — no credential is not a green light to silently use the local voice.** Before generating a voiceover, complete the sign-in **Preflight** (see `../SKILL.md` → Preflight): run `npx hyperframes auth status`, recommend signing in, and **STOP for the user's choice** (sign in for HeyGen voices, or continue offline with local Kokoro). This applies to a one-off "generate a voiceover" request just as much as inside a full workflow.
+
## Provider chain
| Order | Provider | Env trigger | Voice IDs | Word timestamps | Audio format |
@@ -35,10 +37,10 @@ wins: `$HEYGEN_API_KEY` → `$HYPERFRAMES_API_KEY` → a project `.env` (auto-lo
walks up ≤5 dirs) → `~/.heygen/credentials` (shared with heygen-cli;
`$HEYGEN_CONFIG_DIR` overrides the dir). An OAuth login is sent as
`Authorization: Bearer`; an API key as `X-Api-Key`. If the only credential is an
-expired OAuth token it stops with a hint to run `hyperframes auth refresh`.
+expired OAuth token it stops with a hint to run `npx hyperframes auth refresh`.
```bash
-# Only needed if you haven't run `hyperframes auth login`:
+# Only needed if you haven't run `npx hyperframes auth login`:
export HEYGEN_API_KEY=... # or put it in a project .env
# Synthesize + capture word timestamps in one call (skips a Whisper pass)
diff --git a/skills/hyperframes-media/scripts/lib/heygen.mjs b/skills/hyperframes-media/scripts/lib/heygen.mjs
index ce56aa9765..ef466e4bb1 100644
--- a/skills/hyperframes-media/scripts/lib/heygen.mjs
+++ b/skills/hyperframes-media/scripts/lib/heygen.mjs
@@ -75,10 +75,10 @@ export function heygenAuthHeaders() {
if (cred?.headers) return cred.headers;
if (cred?.expired)
throw new Error(
- "HeyGen OAuth token expired — run `hyperframes auth refresh` (or `hyperframes auth login`)",
+ "HeyGen OAuth token expired — run `npx hyperframes auth refresh` (or `npx hyperframes auth login`)",
);
throw new Error(
- "no HeyGen credentials — set $HEYGEN_API_KEY, or run `hyperframes auth login` (writes ~/.heygen/credentials)",
+ "no HeyGen credentials — set $HEYGEN_API_KEY, or run `npx hyperframes auth login` (writes ~/.heygen/credentials)",
);
}
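Reviewer sketch of the credential order these messages refer to (`$HEYGEN_API_KEY` → `$HYPERFRAMES_API_KEY` → `~/.heygen/credentials`), with the header chosen by credential type. The function name `resolveHeygenAuth`, the `stored` object shape (`kind`, `token`, `expired`), and the returned `source` field are assumptions made for illustration; they are not taken from `heygen.mjs`. The project `.env` step mentioned in `tts.md` is left out to keep the sketch short.

```javascript
// Hypothetical helper: illustrates the documented resolution order only.
// `stored` stands in for whatever was parsed from ~/.heygen/credentials;
// its shape ({ kind, token, expired }) is an assumption, not the real file format.
function resolveHeygenAuth(env, stored) {
  // Steps 1 and 2: environment variables win and are treated as API keys.
  const apiKey = env.HEYGEN_API_KEY || env.HYPERFRAMES_API_KEY;
  if (apiKey) return { source: "env", headers: { "X-Api-Key": apiKey } };

  // Step 3: fall back to the shared credentials file, if any.
  if (!stored) return { source: "none", headers: null };

  if (stored.kind === "oauth") {
    // An expired OAuth token yields no headers; the caller prints the refresh hint.
    if (stored.expired) return { source: "file", expired: true, headers: null };
    return { source: "file", headers: { Authorization: `Bearer ${stored.token}` } };
  }
  return { source: "file", headers: { "X-Api-Key": stored.token } };
}
```

Passing `env` and `stored` in as arguments keeps the order easy to check without touching the real environment or home directory.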
diff --git a/skills/hyperframes-media/scripts/lib/tts.mjs b/skills/hyperframes-media/scripts/lib/tts.mjs
index 9cb141c5c6..c21d2b87ba 100644
--- a/skills/hyperframes-media/scripts/lib/tts.mjs
+++ b/skills/hyperframes-media/scripts/lib/tts.mjs
@@ -36,7 +36,7 @@ export function pickProvider(userProvider) {
throw new Error(`invalid provider "${userProvider}" (heygen | elevenlabs | kokoro)`);
if (userProvider === "heygen" && !heygenAvailable())
throw new Error(
- "provider=heygen but no HeyGen credentials (set $HEYGEN_API_KEY or run `hyperframes auth login`)",
+ "provider=heygen but no HeyGen credentials (set $HEYGEN_API_KEY or run `npx hyperframes auth login`)",
);
if (userProvider === "elevenlabs" && !process.env.ELEVENLABS_API_KEY)
throw new Error("provider=elevenlabs but $ELEVENLABS_API_KEY is not set");
diff --git a/skills/music-to-video/SKILL.md b/skills/music-to-video/SKILL.md
index 7aa224478b..4deda4c980 100644
--- a/skills/music-to-video/SKILL.md
+++ b/skills/music-to-video/SKILL.md
@@ -24,7 +24,7 @@ Workflow: Step 0 setup → `hyperframes.json` + `assets/bgm.mp3`; Step 1 analyze
Goal: Establish the music source, create the HyperFrames project, and note any user-supplied media.
-The **music is the spine** — establish one track before anything else. This skill is tuned for **fast, high-energy BGM**: a strong beat grid drives the cuts (calm tracks work, but pace by phrase rather than beat). If the user gave you audio — a music file, or a video to pull the audio from — use it. If not, generate one: choose the mood from the user's description (e.g. "driving synthwave", "trap beat", "upbeat corporate") and produce a track via `/hyperframes-media` (`references/bgm.md` — HeyGen retrieval when credentialed, else local Lyria / MusicGen; ElevenLabs or another generator also works). Either way the track lands at `assets/bgm.mp3`. Stage any user-supplied images or videos so frames can weave them in on the beat grid; otherwise typography carries the whole video.
+The **music is the spine** — establish one track before anything else. This skill is tuned for **fast, high-energy BGM**: a strong beat grid drives the cuts (calm tracks work, but pace by phrase rather than beat). If the user gave you audio — a music file, or a video to pull the audio from — use it. If not, generate one: choose the mood from the user's description (e.g. "driving synthwave", "trap beat", "upbeat corporate") and produce a track via `/hyperframes-media` (`references/bgm.md` — HeyGen retrieval when credentialed, else local Lyria / MusicGen; ElevenLabs or another generator also works). Before generating, run `npx hyperframes auth status` and **relay its output verbatim (don't paraphrase or rewrite it)** — it shows whether BGM comes from HeyGen or local MusicGen and, if not signed in, how to sign in. **If not signed in, STOP and wait for the user to choose — sign in, or continue offline with local MusicGen — before generating the track**; don't write keys into a per-repo `.env`. (In autonomous mode, note the status and continue offline.) See `/hyperframes-media` → Preflight for the canonical guidance. Either way the track lands at `assets/bgm.mp3`. Stage any user-supplied images or videos so frames can weave them in on the beat grid; otherwise typography carries the whole video.
Initialize only if `hyperframes.json` is missing. Name `` from the brief in kebab-case, such as `midnight-drive-loop` — never a timestamp.
@@ -145,7 +145,7 @@ Run the CLI on the **assembled project** — that's the correct unit (the per-fr
( cd "$PROJECT_DIR" && npx hyperframes lint . && npx hyperframes validate . && npx hyperframes inspect . )
```
-Inspect at `t=0`, each frame start, the strongest DROP / SURGE, every `hard_stops[].t`, and the final frame. On failure, make the **cheapest safe fix**: edit the offending `compositions/frames/NN-*.html` for a local issue; **re-dispatch that one frame-worker** only when a whole frame must be rebuilt; go back to Step 3 only if the plan is creatively wrong. Never change duration or audio timing to hide a sync issue. Once the gates pass, pause for user review, then render only on approval:
+Inspect at `t=0`, each frame start, the strongest DROP / SURGE, every `hard_stops[].t`, and the final frame. On failure, make the **cheapest safe fix** yourself: edit the offending `compositions/frames/NN-*.html`. Never change duration or audio timing to hide a sync issue. Once the gates pass, pause for user review, then render only on approval:
```bash
( cd "$PROJECT_DIR" && npx hyperframes render . -q draft -o renders/video.mp4 --fps 30 )
diff --git a/skills/pr-to-video/SKILL.md b/skills/pr-to-video/SKILL.md
index e17699422a..7d0226eecb 100644
--- a/skills/pr-to-video/SKILL.md
+++ b/skills/pr-to-video/SKILL.md
@@ -19,13 +19,32 @@ Workflow: Step 0 setup → `hyperframes.json`; Step 1 ingest → `capture/extrac
Goal: Lock the PR reference and the core video brief, and create the HyperFrames project if needed.
-Get the **PR reference** (a full URL, an `/#` ref, or "this PR" in a checked-out repo) and, in one message, confirm the brief — lead with a recommended default for each and pre-fill anything `/hyperframes` already set: **angle** (changelog / feature-reveal / fix-explainer / refactor-walkthrough — default: infer from the PR), **audience** (default: developers), **length** (default ~60-90s), **aspect** (default 16:9), **language**. The style is always **claude**. Proceed only after the user replies; a "go" accepts the defaults.
+Get the **PR reference** (a full URL, an `/#` ref, or "this PR" in a checked-out repo) and, in one message, confirm the brief — lead with a recommended default for each and pre-fill anything `/hyperframes` already set: **angle** (changelog / feature-reveal / fix-explainer / refactor-walkthrough — default: infer from the PR), **audience** (default: developers), **length** (default: **scale to the PR's change size** — see below), **aspect** (default 16:9), **language**. The style is always **claude**. Proceed only after the user replies; a "go" accepts the defaults.
+
+**Recommend the length from the PR's change size**, not a fixed guess. Before confirming the brief, peek at the PR once — a read-only call that also grounds the angle (Step 1 still does the full deterministic fetch):
+
+```bash
+gh pr view --json title,additions,deletions,changedFiles
+```
+
+Pick the tier from `additions + deletions` (nudged up by `changedFiles`) and lead with it as the default (the user can override; hard cap ~3 min):
+
+| PR change size | Recommended length |
+| --------------------------------- | ------------------ |
+| trivial (≲ 50 lines changed) | ~20–40s |
+| focused (~50–200 lines) | ~40–70s |
+| substantial (~200–600 lines) | ~70–110s |
+| large (≳ 600 lines, or 25+ files) | ~110–180s |
+
+State the basis in one phrase when you propose it (e.g. "~40s — small change, +44/−13 across 12 files"). A huge PR doesn't mean a long video — if the story is one headline change, keep it tight and say so.
Initialize only if `hyperframes.json` is missing. Name `` from the PR in kebab-case, such as `acme-sdk-pr-1842`; never use the workspace name or a timestamp.
`npx hyperframes init "videos/" --non-interactive --skip-skills --example=blank`
-**Gate:** `hyperframes.json` exists; the PR ref is captured; angle, length, aspect ratio, and language are locked.
+**Show sign-in status before the brief** — run `npx hyperframes auth status` and **relay its output verbatim (don't paraphrase or rewrite it).** It reports whether voice/BGM will use HeyGen or local engines and, when not signed in, how to sign in. **If not signed in, STOP and wait for the user to choose — sign in, or say "go"/"offline" to continue with local engines — before asking the brief or anything else.** Treat it as a real decision point, not a passing note; don't fold the choice into the brief question, and don't write keys into a per-repo `.env`. (In autonomous mode, note the status and continue offline.) See `../hyperframes-media` → Preflight for the canonical guidance.
+
+**Gate:** `hyperframes.json` exists; the PR ref is captured; angle, length, aspect ratio, and language are locked; sign-in status was shown (signed in, or continuing offline).
---
@@ -68,9 +87,9 @@ The style is fixed — **claude** (warm editorial; a navy code surface built for
node /scripts/build-frame.mjs --preset claude --hyperframes .
```
-The script copies the claude preset's `FRAME.md` → `frame.md`, remixes it onto any brand tokens in `capture/extracted/tokens.json` (a PR has none → `colors:[]`/`fonts:[]` keeps claude's own palette, a complete design), copies the preset's `caption-skin.html`, and self-validates (exits 1 on a broken mapping). Proceed as soon as it exits 0 — no hand-editing.
+The script copies the claude preset's `FRAME.md` → `frame.md`, remixes it onto any brand tokens in `capture/extracted/tokens.json` (a PR has none → `colors:[]`/`fonts:[]` keeps claude's own palette, a complete design), copies the preset's caption skin to `.hyperframes/caption-skin.html`, and self-validates (exits 1 on a broken mapping). Proceed as soon as it exits 0 — no hand-editing.
-**Gate:** `build-frame.mjs` exited 0 — `frame.md` exists from the claude preset, and `caption-skin.html` is at the project root.
+**Gate:** `build-frame.mjs` exited 0 — `frame.md` exists from the claude preset, and `.hyperframes/caption-skin.html` exists as the caption skin source.
---
@@ -146,7 +165,7 @@ After audio timings exist, build captions in the background and assemble the ind
`node /scripts/assemble-index.mjs --storyboard ./STORYBOARD.md --hyperframes .`
-`captions.mjs` uses the project's `caption-skin.html` (claude's, copied in Step 2), injecting brand tokens from `frame.md`; `captions: skipped ()` is valid. `assemble-index.mjs` stages the credits avatars from `assets/` as an idempotent backstop.
+`captions.mjs` uses the project's `.hyperframes/caption-skin.html` (claude's, copied in Step 2), injecting brand tokens from `frame.md`; `captions: skipped ()` is valid. `assemble-index.mjs` stages the credits avatars from `assets/` as an idempotent backstop.
**Gate:** every frame is marked `animated`, `index.html` exists, and captions are built or explicitly skipped.
@@ -170,9 +189,11 @@ Inject transitions, run checks, pause for review, then render.
`npx hyperframes snapshot --at `
-If a command fails, surface stderr and stop. Do not pile on recovery commands. If a gate names a frame, fix `compositions/frames/NN-*.html` with the cheapest safe fix: edit the frame HTML for a local issue; re-dispatch the frame worker only when the whole shot must be rebuilt.
+`snapshot` stitches the captured frames into one contact sheet (`snapshots/contact-sheet.jpg`). Glance at it; if nothing is obviously broken, move on — don't linger here.
+
+If a command fails, surface stderr and stop — don't pile on recovery commands. Fix it yourself: the cheapest safe edit to `compositions/frames/NN-*.html`, then rerun the failed check.
-**Known false-positive — do not chase it.** `inspect` may report a handful of `text_box_overflow` errors of ~1–4px on the **caption** highlight words (selector `#caption-word-*` / `.caption-line`). The caption pill uses a deliberately snug `line-height` (set once in `scripts/captions.mjs`) and has **no `overflow:hidden`**, so a heavy display glyph's ink spills a few px into the pill's own padding — nothing is actually clipped. Treat these as expected and proceed. Do **not** inflate the caption `line-height` (it balloons the pill, which is worse) and do **not** re-dispatch a frame for them. Only act on a `text_box_overflow` when it names a **frame** element (`#el-NN-*`), not a caption word.
+**Known false-positive — do not chase it.** `inspect` may report a handful of `text_box_overflow` errors of ~1–4px on the **caption** highlight words (selector `#caption-word-*` / `.caption-line`). The caption pill uses a deliberately snug `line-height` (set once in `scripts/captions.mjs`) and has **no `overflow:hidden`**, so a heavy display glyph's ink spills a few px into the pill's own padding — nothing is actually clipped. Treat these as expected and proceed. Do **not** inflate the caption `line-height` (it balloons the pill, which is worse). Only act on a `text_box_overflow` when it names a **frame** element (`#el-NN-*`), not a caption word.
After checks pass, pause for user review. The video is assembled, viewable, and editable in Studio. Manage preview only once across Step 3 and Step 6: open it if the user asked earlier, offer it if they declined earlier, do not ask again if they are already reviewing in Studio.
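Reviewer sketch of the length tiers added to the pr-to-video Step 0 brief. It takes the three numeric fields returned by `gh pr view --json title,additions,deletions,changedFiles` and maps them to a tier. The function name `recommendLength` is hypothetical, and the hard cutoffs at 50, 200, and 600 lines are one reading of the table's approximate boundaries; the only `changedFiles` nudge modeled here is the table's "25+ files" rule.

```javascript
// Hypothetical helper: maps PR change size to the recommended length tier.
// Thresholds are an interpretation of the approximate table, not a spec.
function recommendLength({ additions, deletions, changedFiles }) {
  const lines = additions + deletions;
  if (lines >= 600 || changedFiles >= 25) return { tier: "large", seconds: [110, 180] };
  if (lines >= 200) return { tier: "substantial", seconds: [70, 110] };
  if (lines >= 50) return { tier: "focused", seconds: [40, 70] };
  return { tier: "trivial", seconds: [20, 40] };
}
```

With the example from the skill text (+44/−13 across 12 files, so 57 lines changed) this lands in the "focused" tier, whose lower bound matches the "~40s" proposal. The skill still leaves the final call to judgment: a large PR with one headline change should stay short.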
diff --git a/skills/pr-to-video/scripts/build-frame.mjs b/skills/pr-to-video/scripts/build-frame.mjs
index dabb9ff7d8..de00509ae3 100644
--- a/skills/pr-to-video/scripts/build-frame.mjs
+++ b/skills/pr-to-video/scripts/build-frame.mjs
@@ -19,7 +19,14 @@
// fonts — the preset's display family → the brand display font, its body family →
// the brand body font, wherever they appear. Empty brand fonts → kept.
-import { copyFileSync, existsSync, readdirSync, readFileSync, writeFileSync } from "node:fs";
+import {
+ copyFileSync,
+ existsSync,
+ mkdirSync,
+ readdirSync,
+ readFileSync,
+ writeFileSync,
+} from "node:fs";
import { dirname, join, resolve } from "node:path";
import { fileURLToPath } from "node:url";
import {
@@ -214,8 +221,10 @@ if (brandColors.length && presetColors.length) {
}
if (inBlock && /^\S/.test(line)) inBlock = false;
if (!inBlock) return line;
- const m = line.match(/^(\s+)([\w-]+):\s*["']?[^"'\n]*["']?\s*$/);
- if (m && newByKey.has(m[2])) return `${m[1]}${m[2]}: "${newByKey.get(m[2])}"`;
+ const m = line.match(
+ /^(\s+)([\w-]+):\s*(?:"[^"]*"|'[^']*'|#[0-9a-fA-F]{3,8}|rgba?\([^)]*\)|[^#\n]*?)(\s+#.*)?$/,
+ );
+ if (m && newByKey.has(m[2])) return `${m[1]}${m[2]}: "${newByKey.get(m[2])}"${m[3] ?? ""}`;
return line;
})
.join("\n");
@@ -254,7 +263,9 @@ writeFileSync(framePath, md);
const presetSkin = join(presetDir, presetName, "caption-skin.html");
let skinCopied = false;
if (existsSync(presetSkin)) {
- copyFileSync(presetSkin, join(hyperframesDir, "caption-skin.html"));
+ const skinDir = join(hyperframesDir, ".hyperframes");
+ mkdirSync(skinDir, { recursive: true });
+ copyFileSync(presetSkin, join(skinDir, "caption-skin.html"));
skinCopied = true;
}
@@ -275,6 +286,6 @@ if (li != null && lc != null && li >= lc) {
console.log(`✓ build-frame: ${presetName} → ${framePath}`);
for (const s of summary) console.log(` ${s}`);
console.log(
- ` caption-skin.html: ${skinCopied ? "copied" : "preset ships none — captions will use the default pill"}`,
+ ` .hyperframes/caption-skin.html: ${skinCopied ? "copied" : "preset ships none — captions will use the default pill"}`,
);
console.log(` self-check: keys preserved, ink darker than canvas ✓`);
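Reviewer sketch of what the new color-line regex in `build-frame.mjs` changes: the value is matched as a quoted string, a hex color, an `rgb()`/`rgba()` call, or bare text, and an optional trailing `# comment` is captured and re-emitted. The regex and the replacement template are copied from the hunk above; the wrapper name `rewriteColorLine` and the sample lines are assumptions for illustration.

```javascript
// Regex copied from the build-frame.mjs hunk; group 3 is the optional trailing comment.
const COLOR_LINE =
  /^(\s+)([\w-]+):\s*(?:"[^"]*"|'[^']*'|#[0-9a-fA-F]{3,8}|rgba?\([^)]*\)|[^#\n]*?)(\s+#.*)?$/;

// Hypothetical wrapper around the per-line logic inside the script's map callback.
function rewriteColorLine(line, newByKey) {
  const m = line.match(COLOR_LINE);
  // Re-emit indent + key, the new value in quotes, and the original comment if present.
  if (m && newByKey.has(m[2])) return `${m[1]}${m[2]}: "${newByKey.get(m[2])}"${m[3] ?? ""}`;
  return line; // unknown keys and non-matching lines pass through untouched
}
```

Matching the unquoted hex alternative before the bare-text alternative matters: a value such as `#111111` begins with `#`, so the bare-text branch, which stops at `#`, could only match it as an empty value followed by a comment.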
diff --git a/skills/pr-to-video/scripts/captions.mjs b/skills/pr-to-video/scripts/captions.mjs
index 1d0436e002..6bf4b84b47 100644
--- a/skills/pr-to-video/scripts/captions.mjs
+++ b/skills/pr-to-video/scripts/captions.mjs
@@ -15,8 +15,9 @@
// node captions.mjs build --storyboard ./STORYBOARD.md --audio-meta ./audio_meta.json --hyperframes . --out ./caption_groups.json
//
// CAPTION LOOK — two sources, picked automatically:
-// 1. PRESET SKIN (preferred). If a project-local `caption-skin.html` exists (Step 2
-// copies the chosen frame-preset's skin into the project), it is the caption look.
+// 1. PRESET SKIN (preferred). If a project-local `.hyperframes/caption-skin.html`
+// exists (Step 2 copies the chosen frame-preset's skin into the project), it is
+// the caption look.
// It is a brand-token-strict skin with three reserved holes; this script fills them
// and wraps the result in a for the engine:
// - `var GROUPS = [];` → the computed caption groups
@@ -68,7 +69,12 @@ function runBuild(argv) {
const outPath = resolve(flag(argv, "out", join(hyperframesDir, "caption_groups.json")));
const htmlPath = join(hyperframesDir, "compositions/captions.html");
const overridesPath = join(hyperframesDir, "caption-overrides.json");
- const skinPath = resolve(flag(argv, "skin", join(hyperframesDir, "caption-skin.html")));
+ const skinArg = flag(argv, "skin", null);
+ const hiddenSkinPath = join(hyperframesDir, ".hyperframes", "caption-skin.html");
+ const legacySkinPath = join(hyperframesDir, "caption-skin.html");
+ const skinPath = resolve(
+ skinArg ?? (existsSync(hiddenSkinPath) ? hiddenSkinPath : legacySkinPath),
+ );
const framePath = resolve(flag(argv, "frame", join(hyperframesDir, "frame.md")));
if (!existsSync(storyboardPath)) die(`STORYBOARD.md not found at ${storyboardPath}`);
@@ -312,8 +318,8 @@ function brandFontFaces(framePath, hyperframesDir) {
const weightOf = (n) => {
const s = n.toLowerCase();
if (/black|heavy|ultra|extrabold/.test(s)) return 800;
+ if (/semibold|demibold/.test(s)) return 600; // before /bold/ — "demibold" contains "bold"
if (/bold/.test(s)) return 700;
- if (/semibold|demibold/.test(s)) return 600;
if (/medium/.test(s)) return 500;
if (/light|thin/.test(s)) return 300;
return 400; // book / regular / roman
@@ -326,10 +332,22 @@ function brandFontFaces(framePath, hyperframesDir) {
: /\.ttf$/i.test(f)
? "truetype"
: "opentype";
+ // Normalize away ALL non-alphanumerics (spaces, underscores, hyphens) on BOTH the
+ // family name and the filename. Real font files use "_" / "-" as word separators
+ // ("TT_Norms_Pro_Bold.woff2"), so stripping only whitespace never matched them — the
+ // family key "ttnormspro" failed `startsWith` against "tt_norms_pro_bold", and the
+ // function silently returned "" → captions shipped with NO @font-face for any
+ // underscore/hyphen-named brand font (e.g. TT Norms Pro), which is exactly the
+ // font_family_without_font_face bug.
+ const norm = (s) => s.toLowerCase().replace(/[^a-z0-9]/g, "");
const faces = [];
const seen = new Set();
- for (const fam of families) {
- const key = fam.replace(/\s+/g, "").toLowerCase();
+ const claimed = new Set(); // each file is claimed by the MOST SPECIFIC family only
+ // Match the longest family key first so "TT Norms Pro" can't swallow the files that
+ // belong to "TT Norms Pro Mono" (its key is a prefix of the longer one's).
+ const ranked = [...families].sort((a, b) => norm(b).length - norm(a).length);
+ for (const fam of ranked) {
+ const key = norm(fam);
for (const d of dirs) {
let files = [];
try {
@@ -339,17 +357,34 @@ function brandFontFaces(framePath, hyperframesDir) {
}
for (const f of files.sort()) {
if (!/\.(woff2|woff|ttf|otf)$/i.test(f)) continue;
- if (!f.replace(/\s+/g, "").toLowerCase().startsWith(key)) continue;
+ if (claimed.has(f)) continue; // a more specific family already took this file
+ if (!norm(f.replace(/\.(woff2|woff|ttf|otf)$/i, "")).startsWith(key)) continue;
const w = weightOf(f);
const dedup = `${fam}-${w}`;
if (seen.has(dedup)) continue; // one src per weight; assets/fonts wins over capture
seen.add(dedup);
+ claimed.add(f);
faces.push(
` @font-face { font-family: '${fam}'; src: url('${d.rel}/${f}') format('${fmtOf(f)}'); font-weight: ${w}; font-display: block; }`,
);
}
}
}
+ // Loud signal instead of a silent "". If frame.md named a brand font but no local file
+ // matched, the caller falls back to a Google Fonts @import (or, failing that, a generic
+ // font) — surface the cause here at build time rather than letting it surface 2 steps
+ // later as a font_family_without_font_face lint error disconnected from its root cause.
+ if (!faces.length) {
+ const where = dirs.length
+ ? dirs.map((d) => d.rel).join(" / ")
+ : "assets/fonts or capture/assets/fonts (neither exists)";
+ console.warn(
+ ` ⚠ captions: frame.md names font ${families.map((f) => `"${f}"`).join(", ")} ` +
+ `but no matching .woff2/.woff/.ttf/.otf was found in ${where} — falling back to @import/generic ` +
+ `(text may render in the wrong font). Stage a font file whose name starts with the family ` +
+ `(e.g. "TT Norms Pro" → TT_Norms_Pro_Bold.woff2) to ship it locally.`,
+ );
+ }
return faces.join("\n");
}
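Reviewer sketch of the two matching fixes in `brandFontFaces`: normalizing away every non-alphanumeric on both the family name and the filename, and letting the longest family key claim its files first. `norm` and `weightOf` follow the hunk above; the function name `assignFonts`, its return shape, and the sample filenames are assumptions for illustration. The per-weight dedup and directory walk from the real function are omitted.

```javascript
// Same normalization as the diff: strip spaces, underscores, hyphens, dots, and so on.
const norm = (s) => s.toLowerCase().replace(/[^a-z0-9]/g, "");

// Same order as the diff: semibold/demibold is tested before bold.
const weightOf = (name) => {
  const s = name.toLowerCase();
  if (/black|heavy|ultra|extrabold/.test(s)) return 800;
  if (/semibold|demibold/.test(s)) return 600;
  if (/bold/.test(s)) return 700;
  if (/medium/.test(s)) return 500;
  if (/light|thin/.test(s)) return 300;
  return 400;
};

// Hypothetical simplification: pair each font file with the most specific family.
function assignFonts(families, files) {
  const ranked = [...families].sort((a, b) => norm(b).length - norm(a).length);
  const claimed = new Set();
  const out = [];
  for (const family of ranked) {
    const key = norm(family);
    for (const file of [...files].sort()) {
      if (!/\.(woff2|woff|ttf|otf)$/i.test(file)) continue;
      if (claimed.has(file)) continue; // a longer family key already took it
      const stem = file.replace(/\.(woff2|woff|ttf|otf)$/i, "");
      if (!norm(stem).startsWith(key)) continue;
      claimed.add(file);
      out.push({ family, file, weight: weightOf(file) });
    }
  }
  return out;
}
```

Given the families `TT Norms Pro` and `TT Norms Pro Mono`, a file named `TT_Norms_Pro_Mono_Regular.woff2` is claimed by the Mono family before the shorter key is tried, and `TT-Norms-Pro-DemiBold.otf` resolves to weight 600 rather than 700.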
diff --git a/skills/pr-to-video/scripts/lib/tokens.mjs b/skills/pr-to-video/scripts/lib/tokens.mjs
index 15f87818c4..3f51778ec5 100644
--- a/skills/pr-to-video/scripts/lib/tokens.mjs
+++ b/skills/pr-to-video/scripts/lib/tokens.mjs
@@ -15,9 +15,9 @@ export function parseColors(md) {
if (!inBlock) continue;
if (/^\S/.test(line)) break; // dedent to a top-level key → end of block
const m = line.match(
- /^\s+([\w-]+):\s*["']?(#[0-9a-fA-F]{3,8}|rgba?\([^)]*\)|[^"'#\s][^"'\n]*?)["']?\s*$/,
+ /^\s+([\w-]+):\s*(?:"([^"]+)"|'([^']+)'|(#[0-9a-fA-F]{3,8}|rgba?\([^)]*\)|[^#\s][^#\n]*?))\s*(?:#.*)?$/,
);
- if (m) out.push([m[1], m[2].trim()]);
+ if (m) out.push([m[1], (m[2] ?? m[3] ?? m[4]).trim()]);
}
return out;
}
@@ -165,6 +165,10 @@ export function parseFonts(md) {
roles["card-headline"] ??
roles["section-headline"] ??
roles["quote-display"] ??
+ roles.h1 ??
+ roles.h2 ??
+ roles.title ??
+ roles.hero ??
body;
return { display: q(display), body: q(body) };
}
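Reviewer sketch of the read side in `tokens.mjs`: the new `parseColors` regex captures a double-quoted, single-quoted, or unquoted value in separate groups and tolerates a trailing `# comment`. The regex and the `m[2] ?? m[3] ?? m[4]` selection are copied from the hunk above; the wrapper name `parseColorEntry` and the sample lines are assumptions for illustration, and the surrounding block detection is not reproduced.

```javascript
// Regex copied from the tokens.mjs hunk: group 2 = "double", group 3 = 'single', group 4 = bare.
const COLOR_ENTRY =
  /^\s+([\w-]+):\s*(?:"([^"]+)"|'([^']+)'|(#[0-9a-fA-F]{3,8}|rgba?\([^)]*\)|[^#\s][^#\n]*?))\s*(?:#.*)?$/;

// Hypothetical wrapper: returns [key, value] for one indented line, or null.
function parseColorEntry(line) {
  const m = line.match(COLOR_ENTRY);
  return m ? [m[1], (m[2] ?? m[3] ?? m[4]).trim()] : null;
}
```

Because the line must start with whitespace, a top-level key such as `colors:` returns `null` here; in the real function that dedent is what ends the block.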
diff --git a/skills/pr-to-video/sub-agents/frame-worker.md b/skills/pr-to-video/sub-agents/frame-worker.md
index 66028b3894..e2f20d6fc4 100644
--- a/skills/pr-to-video/sub-agents/frame-worker.md
+++ b/skills/pr-to-video/sub-agents/frame-worker.md
@@ -76,4 +76,4 @@ You **can't** meaningfully run `hyperframes lint` / `validate` / `inspect` here:
2. **One paused timeline** (`timeline_not_paused` / `timeline_not_registered`) — exactly one `gsap.timeline({ paused: true })`, registered at `window.__timelines[""]`, built synchronously.
3. **Determinism & visibility** (`css_transition_used` / `exit_animation_on_non_final_scene`) — no CSS transitions, no `repeat` / `yoyo`, no `Math.random` / `Date.now` (the renderer seeks frame-by-frame); entrance via `fromTo` (not CSS-hidden starts) so the hero is visible by `t ≤ 0.5s`; no exit tween unless you are the final frame.
4. **The whole shot fits `data-duration`** — `entrance → development → settle` all land inside it, and a non-still frame develops mid-shot (cited effects sequenced into phases, not all fired at `t=0`; an `Adapt` note keeps its `Keep` signature). **This includes any `code-*` block's internal cadence**: a long snippet at the block's default per-character speed _overruns_ a short frame — the code never finishes typing and the chrome beats never play. Set the cadence so the full block completes within `data-duration` (see `code-vocabulary.md`). Only a deliberate hold / stillness note skips development.
-5. **Fonts, keep-out, no narration text** (`font_family_without_font_face`) — every non-system font named in `frame.md` has an `@import` / `@font-face`; all content sits in the top ~83% (above the caption band); no narration sentence is rendered as visible text.
+5. **Fonts, keep-out, no narration text** (`font_family_without_font_face`) — every font you name has a matching `@font-face` (or `@import`) **inside this file**, and you **only use fonts that ship as files**: the families declared in `frame.md` (their `.woff2` live in `assets/fonts/` or `capture/assets/fonts/` — point `src` at the real file). **Never name a font with no file**, including system CJK / Japanese / Devanagari families (`Hiragino Sans`, `Yu Gothic`, `Noto Sans CJK`, `Noto Sans Devanagari`, …): the render machine is a clean headless Chrome without them, so the text silently falls back and the MP4 typography is wrong — for non-Latin / multilingual visible text use a shipped font that covers the script or romanize it, else it is out of scope. Plus: all content sits in the top ~83% (above the caption band); no narration sentence is rendered as visible text.
diff --git a/skills/product-launch-video/SKILL.md b/skills/product-launch-video/SKILL.md
index 78ce4873c0..679bcf55d4 100644
--- a/skills/product-launch-video/SKILL.md
+++ b/skills/product-launch-video/SKILL.md
@@ -23,7 +23,9 @@ Initialize only if `hyperframes.json` is missing. Name `` from the bran
`npx hyperframes init "videos/" --non-interactive --skip-skills --example=blank`
-**Gate:** `hyperframes.json` exists, and angle, length, aspect ratio, and language are locked.
+**Show sign-in status before the brief** — run `npx hyperframes auth status` and **relay its output verbatim (don't paraphrase or rewrite it).** It reports whether voice/BGM will use HeyGen or local engines and, when not signed in, how to sign in. **If not signed in, STOP and wait for the user to choose — sign in, or say "go"/"offline" to continue with local engines — before asking the brief or anything else.** Treat it as a real decision point, not a passing note; don't fold the choice into the brief question, and don't write keys into a per-repo `.env`. (In autonomous mode, note the status and continue offline.) See `../hyperframes-media` → Preflight for the canonical guidance.
+
+**Gate:** `hyperframes.json` exists, and angle, length, aspect ratio, and language are locked; sign-in status was shown (signed in, or continuing offline).
---
@@ -53,11 +55,11 @@ You make the one judgment call — **which preset**. Read `../hyperframes-creati
node /scripts/build-frame.mjs --preset --hyperframes .
```
-The script does the rest deterministically: copies the preset's `FRAME.md` → `frame.md` and **remixes** it onto the brand tokens in `capture/extracted/tokens.json` (brand colors mapped onto the preset's color keys by role — ink, canvas, accents — keeping keys/structure/components; the preset's display + body fonts swapped for the brand's), copies the preset's `caption-skin.html` verbatim, and self-validates (exits 1 on a broken mapping). Proceed to the next step as soon as it exits 0 — no hand-editing of the spec.
+The script does the rest deterministically: copies the preset's `FRAME.md` → `frame.md` and **remixes** it onto the brand tokens in `capture/extracted/tokens.json` (brand colors mapped onto the preset's color keys by role — ink, canvas, accents — keeping keys/structure/components; the preset's display + body fonts swapped for the brand's), copies the preset's caption skin to `.hyperframes/caption-skin.html`, and self-validates (exits 1 on a broken mapping). Proceed to the next step as soon as it exits 0 — no hand-editing of the spec.
`tokens.json` with no brand colors/fonts (e.g. no capture) → the script keeps the preset's own palette, a complete shippable design. If the brief names brand colors/fonts the capture missed, add them to `capture/extracted/tokens.json` before running (or use the user's `design.md` to populate it); only adjust `frame.md` by hand afterward if a mapping truly needs it.
-**Gate:** `build-frame.mjs` exited 0 — `frame.md` exists from a named preset, and (when the preset ships one) `caption-skin.html` is at the project root.
+**Gate:** `build-frame.mjs` exited 0 — `frame.md` exists from a named preset, and (when the preset ships one) `.hyperframes/caption-skin.html` exists as the caption skin source.
---
@@ -135,7 +137,7 @@ After audio timings exist, build captions in the background and assemble the ind
`node /scripts/assemble-index.mjs --storyboard ./STORYBOARD.md --hyperframes .`
-`captions.mjs` uses the project's `caption-skin.html` (copied in Step 2) as the caption look, injecting brand tokens from `frame.md`; with no skin present it renders the built-in default pill. `captions: skipped ()` is valid. Continue without captions when explicitly skipped.
+`captions.mjs` uses the project's `.hyperframes/caption-skin.html` (copied in Step 2) as the caption look, injecting brand tokens from `frame.md`; with no skin present it renders the built-in default pill. `captions: skipped ()` is valid. Continue without captions when explicitly skipped.
**Gate:** every frame is marked `animated`, `index.html` exists, and captions are built or explicitly skipped.
@@ -159,9 +161,11 @@ Inject transitions, run checks, pause for review, then render.
`npx hyperframes snapshot --at `
-If a command fails, surface stderr and stop. Do not pile on recovery commands. If a gate names a frame, fix `compositions/frames/NN-*.html` with the cheapest safe fix: edit the frame HTML for a local issue; re-dispatch the frame worker only when the whole shot must be rebuilt.
+`snapshot` stitches the captured frames into one contact sheet (`snapshots/contact-sheet.jpg`). Glance at it; if nothing is obviously broken, move on — don't linger here.
+
+If a command fails, surface stderr and stop — don't pile on recovery commands. Fix it yourself: the cheapest safe edit to `compositions/frames/NN-*.html`, then rerun the failed check.
-**Known false-positive — do not chase it.** `inspect` may report a handful of `text_box_overflow` errors of ~1–4px on the **caption** highlight words (selector `#caption-word-*` / `.caption-line`). The caption pill uses a deliberately snug `line-height` (set once in `scripts/captions.mjs`) and has **no `overflow:hidden`**, so a heavy display glyph's ink spills a few px into the pill's own padding — nothing is actually clipped. Treat these as expected and proceed. Do **not** inflate the caption `line-height` (it balloons the pill, which is worse) and do **not** re-dispatch a frame for them. Only act on a `text_box_overflow` when it names a **frame** element (`#el-NN-*`), not a caption word.
+**Known false-positive — do not chase it.** `inspect` may report a handful of `text_box_overflow` errors of ~1–4px on the **caption** highlight words (selector `#caption-word-*` / `.caption-line`). The caption pill uses a deliberately snug `line-height` (set once in `scripts/captions.mjs`) and has **no `overflow:hidden`**, so a heavy display glyph's ink spills a few px into the pill's own padding — nothing is actually clipped. Treat these as expected and proceed. Do **not** inflate the caption `line-height` (it balloons the pill, which is worse). Only act on a `text_box_overflow` when it names a **frame** element (`#el-NN-*`), not a caption word.
After checks pass, pause for user review. The video is assembled, viewable, and editable in Studio. Manage preview only once across Step 3 and Step 6: open it if the user asked earlier, offer it if they declined earlier, and do not ask again if they are already reviewing in Studio.
diff --git a/skills/product-launch-video/scripts/build-frame.mjs b/skills/product-launch-video/scripts/build-frame.mjs
index dabb9ff7d8..de00509ae3 100644
--- a/skills/product-launch-video/scripts/build-frame.mjs
+++ b/skills/product-launch-video/scripts/build-frame.mjs
@@ -19,7 +19,14 @@
// fonts — the preset's display family → the brand display font, its body family →
// the brand body font, wherever they appear. Empty brand fonts → kept.
-import { copyFileSync, existsSync, readdirSync, readFileSync, writeFileSync } from "node:fs";
+import {
+ copyFileSync,
+ existsSync,
+ mkdirSync,
+ readdirSync,
+ readFileSync,
+ writeFileSync,
+} from "node:fs";
import { dirname, join, resolve } from "node:path";
import { fileURLToPath } from "node:url";
import {
@@ -214,8 +221,10 @@ if (brandColors.length && presetColors.length) {
}
if (inBlock && /^\S/.test(line)) inBlock = false;
if (!inBlock) return line;
- const m = line.match(/^(\s+)([\w-]+):\s*["']?[^"'\n]*["']?\s*$/);
- if (m && newByKey.has(m[2])) return `${m[1]}${m[2]}: "${newByKey.get(m[2])}"`;
+ const m = line.match(
+ /^(\s+)([\w-]+):\s*(?:"[^"]*"|'[^']*'|#[0-9a-fA-F]{3,8}|rgba?\([^)]*\)|[^#\n]*?)(\s+#.*)?$/,
+ );
+ if (m && newByKey.has(m[2])) return `${m[1]}${m[2]}: "${newByKey.get(m[2])}"${m[3] ?? ""}`;
return line;
})
.join("\n");
@@ -254,7 +263,9 @@ writeFileSync(framePath, md);
const presetSkin = join(presetDir, presetName, "caption-skin.html");
let skinCopied = false;
if (existsSync(presetSkin)) {
- copyFileSync(presetSkin, join(hyperframesDir, "caption-skin.html"));
+ const skinDir = join(hyperframesDir, ".hyperframes");
+ mkdirSync(skinDir, { recursive: true });
+ copyFileSync(presetSkin, join(skinDir, "caption-skin.html"));
skinCopied = true;
}
@@ -275,6 +286,6 @@ if (li != null && lc != null && li >= lc) {
console.log(`✓ build-frame: ${presetName} → ${framePath}`);
for (const s of summary) console.log(` ${s}`);
console.log(
- ` caption-skin.html: ${skinCopied ? "copied" : "preset ships none — captions will use the default pill"}`,
+ ` .hyperframes/caption-skin.html: ${skinCopied ? "copied" : "preset ships none — captions will use the default pill"}`,
);
console.log(` self-check: keys preserved, ink darker than canvas ✓`);
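Reviewer note on the `build-frame.mjs` hunk above: the new pattern captures a trailing `# comment` as a third group and re-appends it after the rewritten value. The sketch below isolates that behavior. The helper name `rewriteColorLine` and the sample keys and colors are assumptions for illustration, not names from the script; only the regular expression and the replacement template are taken from the diff.

```javascript
// Pattern copied from the hunk: group 1 = indent, group 2 = key,
// group 3 = optional trailing "  # comment" (undefined when absent).
const COLOR_LINE =
  /^(\s+)([\w-]+):\s*(?:"[^"]*"|'[^']*'|#[0-9a-fA-F]{3,8}|rgba?\([^)]*\)|[^#\n]*?)(\s+#.*)?$/;

// Hypothetical wrapper around the two lines changed in the diff.
function rewriteColorLine(line, newByKey) {
  const m = line.match(COLOR_LINE);
  if (!m || !newByKey.has(m[2])) return line; // unknown key or non-matching line: untouched
  return `${m[1]}${m[2]}: "${newByKey.get(m[2])}"${m[3] ?? ""}`;
}

// Illustrative brand colors (assumed values).
const newByKey = new Map([
  ["canvas", "#FFFFFF"],
  ["ink", "#111111"],
]);

console.log(rewriteColorLine('  canvas: "#F5F0E8"  # page background', newByKey));
// prints:   canvas: "#FFFFFF"  # page background
console.log(rewriteColorLine("  ink: #1A1A1A", newByKey));
// prints:   ink: "#111111"
```

A bare hex value starts with `#`, so the pattern tries the hex alternative before the generic `[^#\n]*?` fallback; otherwise the value itself would be read as the start of a comment.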
diff --git a/skills/product-launch-video/scripts/captions.mjs b/skills/product-launch-video/scripts/captions.mjs
index 72035382fa..dfa25c9fcf 100644
--- a/skills/product-launch-video/scripts/captions.mjs
+++ b/skills/product-launch-video/scripts/captions.mjs
@@ -15,8 +15,9 @@
// node captions.mjs build --storyboard ./STORYBOARD.md --audio-meta ./audio_meta.json --hyperframes . --out ./caption_groups.json
//
// CAPTION LOOK — two sources, picked automatically:
-// 1. PRESET SKIN (preferred). If a project-local `caption-skin.html` exists (Step 2
-// copies the chosen frame-preset's skin into the project), it is the caption look.
+// 1. PRESET SKIN (preferred). If a project-local `.hyperframes/caption-skin.html`
+// exists (Step 2 copies the chosen frame-preset's skin into the project), it is
+// the caption look.
// It is a brand-token-strict skin with three reserved holes; this script fills them
// and wraps the result in a for the engine:
// - `var GROUPS = [];` → the computed caption groups
@@ -68,7 +69,12 @@ function runBuild(argv) {
const outPath = resolve(flag(argv, "out", join(hyperframesDir, "caption_groups.json")));
const htmlPath = join(hyperframesDir, "compositions/captions.html");
const overridesPath = join(hyperframesDir, "caption-overrides.json");
- const skinPath = resolve(flag(argv, "skin", join(hyperframesDir, "caption-skin.html")));
+ const skinArg = flag(argv, "skin", null);
+ const hiddenSkinPath = join(hyperframesDir, ".hyperframes", "caption-skin.html");
+ const legacySkinPath = join(hyperframesDir, "caption-skin.html");
+ const skinPath = resolve(
+ skinArg ?? (existsSync(hiddenSkinPath) ? hiddenSkinPath : legacySkinPath),
+ );
const framePath = resolve(flag(argv, "frame", join(hyperframesDir, "frame.md")));
if (!existsSync(storyboardPath)) die(`STORYBOARD.md not found at ${storyboardPath}`);
@@ -291,8 +297,8 @@ function brandFontFaces(framePath, hyperframesDir) {
const weightOf = (n) => {
const s = n.toLowerCase();
if (/black|heavy|ultra|extrabold/.test(s)) return 800;
+ if (/semibold|demibold/.test(s)) return 600; // before /bold/ — "demibold" contains "bold"
if (/bold/.test(s)) return 700;
- if (/semibold|demibold/.test(s)) return 600;
if (/medium/.test(s)) return 500;
if (/light|thin/.test(s)) return 300;
return 400; // book / regular / roman
@@ -305,10 +311,22 @@ function brandFontFaces(framePath, hyperframesDir) {
: /\.ttf$/i.test(f)
? "truetype"
: "opentype";
+ // Normalize away ALL non-alphanumerics (spaces, underscores, hyphens) on BOTH the
+ // family name and the filename. Real font files use "_" / "-" as word separators
+ // ("TT_Norms_Pro_Bold.woff2"), so stripping only whitespace never matched them — the
+ // family key "ttnormspro" failed `startsWith` against "tt_norms_pro_bold", and the
+ // function silently returned "" → captions shipped with NO @font-face for any
+ // underscore/hyphen-named brand font (e.g. TT Norms Pro), which is exactly the
+ // font_family_without_font_face bug.
+ const norm = (s) => s.toLowerCase().replace(/[^a-z0-9]/g, "");
const faces = [];
const seen = new Set();
- for (const fam of families) {
- const key = fam.replace(/\s+/g, "").toLowerCase();
+ const claimed = new Set(); // each file is claimed by the MOST SPECIFIC family only
+ // Match the longest family key first so "TT Norms Pro" can't swallow the files that
+ // belong to "TT Norms Pro Mono" (its key is a prefix of the longer one's).
+ const ranked = [...families].sort((a, b) => norm(b).length - norm(a).length);
+ for (const fam of ranked) {
+ const key = norm(fam);
for (const d of dirs) {
let files = [];
try {
@@ -318,17 +336,34 @@ function brandFontFaces(framePath, hyperframesDir) {
}
for (const f of files.sort()) {
if (!/\.(woff2|woff|ttf|otf)$/i.test(f)) continue;
- if (!f.replace(/\s+/g, "").toLowerCase().startsWith(key)) continue;
+ if (claimed.has(f)) continue; // a more specific family already took this file
+ if (!norm(f.replace(/\.(woff2|woff|ttf|otf)$/i, "")).startsWith(key)) continue;
+ claimed.add(f); // claim before the per-weight dedup so a skipped duplicate can't leak to a shorter family
const w = weightOf(f);
const dedup = `${fam}-${w}`;
if (seen.has(dedup)) continue; // one src per weight; assets/fonts wins over capture
seen.add(dedup);
faces.push(
` @font-face { font-family: '${fam}'; src: url('${d.rel}/${f}') format('${fmtOf(f)}'); font-weight: ${w}; font-display: block; }`,
);
}
}
}
+ // Loud signal instead of a silent "". If frame.md named a brand font but no file
+ // matched, the caption text WILL fall back to a generic font in the render — surface
+ // the cause here (at build time) rather than letting it surface 2 steps later as a
+ // font_family_without_font_face lint error disconnected from its root cause.
+ if (!faces.length) {
+ const where = dirs.length
+ ? dirs.map((d) => d.rel).join(" / ")
+ : "assets/fonts or capture/assets/fonts (neither exists)";
+ console.warn(
+ ` ⚠ captions: frame.md names font ${families.map((f) => `"${f}"`).join(", ")} ` +
+ `but no matching .woff2/.woff/.ttf/.otf was found in ${where} — captions will fall back ` +
+ `(text may render in the wrong font). Stage a font file whose name starts with the family ` +
+ `(e.g. "TT Norms Pro" → TT_Norms_Pro_Bold.woff2) so it ships with the project.`,
+ );
+ }
return faces.join("\n");
}
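Reviewer note on the `captions.mjs` hunks above: the matching logic can be exercised in isolation. The sketch below is a simplified model, not the script's code. `assignFiles` and `stripExt` are hypothetical names, and the directory scan and per-weight dedup are left out; `norm` and `weightOf` follow the diff.

```javascript
// Same normalization as the diff: drop everything except a-z and 0-9.
const norm = (s) => s.toLowerCase().replace(/[^a-z0-9]/g, "");
const stripExt = (f) => f.replace(/\.(woff2|woff|ttf|otf)$/i, "");

// Hypothetical helper: give each file to the MOST SPECIFIC family whose key prefixes it.
function assignFiles(families, files) {
  // Longest normalized key first, so "TT Norms Pro Mono" is served before "TT Norms Pro".
  const ranked = [...families].sort((a, b) => norm(b).length - norm(a).length);
  const claimed = new Map(); // filename -> family
  for (const fam of ranked) {
    const key = norm(fam);
    for (const f of [...files].sort()) {
      if (claimed.has(f)) continue; // a more specific family already took it
      if (norm(stripExt(f)).startsWith(key)) claimed.set(f, fam);
    }
  }
  return claimed;
}

// Weight table in the order the diff establishes: semibold/demibold is tested before bold.
const weightOf = (n) => {
  const s = n.toLowerCase();
  if (/black|heavy|ultra|extrabold/.test(s)) return 800;
  if (/semibold|demibold/.test(s)) return 600;
  if (/bold/.test(s)) return 700;
  if (/medium/.test(s)) return 500;
  if (/light|thin/.test(s)) return 300;
  return 400;
};

const owners = assignFiles(
  ["TT Norms Pro", "TT Norms Pro Mono"],
  ["TT_Norms_Pro_Bold.woff2", "TT_Norms_Pro_Mono_Regular.woff2", "Inter-Regular.woff2"],
);
console.log(owners.get("TT_Norms_Pro_Mono_Regular.woff2")); // prints: TT Norms Pro Mono
console.log(weightOf("TT_Norms_Pro_DemiBold.woff2")); // prints: 600
```

With the previous whitespace-only normalization, the key `ttnormspro` would not prefix `tt_norms_pro_bold`, which is the mismatch the hunk's comment describes.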
diff --git a/skills/product-launch-video/scripts/lib/tokens.mjs b/skills/product-launch-video/scripts/lib/tokens.mjs
index 15f87818c4..3f51778ec5 100644
--- a/skills/product-launch-video/scripts/lib/tokens.mjs
+++ b/skills/product-launch-video/scripts/lib/tokens.mjs
@@ -15,9 +15,9 @@ export function parseColors(md) {
if (!inBlock) continue;
if (/^\S/.test(line)) break; // dedent to a top-level key → end of block
const m = line.match(
- /^\s+([\w-]+):\s*["']?(#[0-9a-fA-F]{3,8}|rgba?\([^)]*\)|[^"'#\s][^"'\n]*?)["']?\s*$/,
+ /^\s+([\w-]+):\s*(?:"([^"]+)"|'([^']+)'|(#[0-9a-fA-F]{3,8}|rgba?\([^)]*\)|[^#\s][^#\n]*?))\s*(?:#.*)?$/,
);
- if (m) out.push([m[1], m[2].trim()]);
+ if (m) out.push([m[1], (m[2] ?? m[3] ?? m[4]).trim()]);
}
return out;
}
@@ -165,6 +165,10 @@ export function parseFonts(md) {
roles["card-headline"] ??
roles["section-headline"] ??
roles["quote-display"] ??
+ roles.h1 ??
+ roles.h2 ??
+ roles.title ??
+ roles.hero ??
body;
return { display: q(display), body: q(body) };
}
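Reviewer note on the `parseColors` hunk above: the new pattern puts double-quoted, single-quoted, and bare values in separate capture groups and tolerates a trailing `# comment`. The sketch below wraps the pattern in a single-line helper to show which group carries the value in each case. `parseColorEntry` is a hypothetical name; the real function iterates over the lines of a block.

```javascript
// Pattern copied from the hunk: group 1 = key; the value lands in group 2
// (double-quoted), group 3 (single-quoted), or group 4 (bare hex / rgb() / other).
const COLOR_ENTRY =
  /^\s+([\w-]+):\s*(?:"([^"]+)"|'([^']+)'|(#[0-9a-fA-F]{3,8}|rgba?\([^)]*\)|[^#\s][^#\n]*?))\s*(?:#.*)?$/;

// Hypothetical single-line helper mirroring the changed `out.push(...)` line.
function parseColorEntry(line) {
  const m = line.match(COLOR_ENTRY);
  return m ? [m[1], (m[2] ?? m[3] ?? m[4]).trim()] : null;
}

console.log(parseColorEntry('  canvas: "#F5F0E8"  # background'));
// prints: [ 'canvas', '#F5F0E8' ]
console.log(parseColorEntry("  ink: #1A1A1A # text"));
// prints: [ 'ink', '#1A1A1A' ]
console.log(parseColorEntry("  accent: rgb(255, 0, 0)"));
// prints: [ 'accent', 'rgb(255, 0, 0)' ]
console.log(parseColorEntry("colors:"));
// prints: null
```

Trying the quoted alternatives first means a `#` inside quotes is read as part of the value, while a `#` after whitespace outside quotes is read as a comment.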
diff --git a/skills/product-launch-video/sub-agents/frame-worker.md b/skills/product-launch-video/sub-agents/frame-worker.md
index 7930f3ac11..965e7834b8 100644
--- a/skills/product-launch-video/sub-agents/frame-worker.md
+++ b/skills/product-launch-video/sub-agents/frame-worker.md
@@ -65,5 +65,5 @@ You **can't** meaningfully run `hyperframes lint` / `validate` / `inspect` here:
- `exit_animation_on_non_final_scene` — no exit tween unless you are the final frame.
- **Shot develops (not a slide)** — a non-still frame carries a development beat between entrance and settle; cited effects are sequenced into phases, not all fired at `t=0`.
- **Adapt fidelity** — if the note led with `Base / Keep / Depart`, the `Keep` signature is present and recognizable, every `Depart` is applied, and the shot still runs `entrance → development → settle`.
-- `font_family_without_font_face` — every non-system font named in `frame.md` has an `@import` / `@font-face`.
+- `font_family_without_font_face` — every font you name has a matching `@font-face` (or `@import`) **inside this file**. **Only use fonts that ship as files** with the project: the families declared in `frame.md` (their `.woff2` files live in `assets/fonts/` or `capture/assets/fonts/` — point the `@font-face` `src` at the real file you find there). **Never name a font that has no file**, including system CJK / Japanese / Devanagari families (`Hiragino Sans`, `Yu Gothic`, `Noto Sans CJK`, `Noto Sans Devanagari`, …): the render machine is a clean headless Chrome with none of them installed, so the text silently falls back to a generic font and the typography is wrong in the MP4. For non-Latin or multilingual visible text, either use a shipped font that covers the script, or romanize / transliterate it (e.g. `日本語` → `Japanese`); if neither is possible, it is out of scope for this frame — do not invent a font name.
- **Keep-out + no-narration-text** (eyeball, no code) — nothing sits below the 83% cutoff; no narration sentence is rendered as visible text.
diff --git a/skills/website-to-video/SKILL.md b/skills/website-to-video/SKILL.md
index 3d9c6d7285..01a9cf754b 100644
--- a/skills/website-to-video/SKILL.md
+++ b/skills/website-to-video/SKILL.md
@@ -34,7 +34,9 @@ If you find yourself reasoning "auto mode says bias toward action, so I'll skip
Capture the site, then read the extracted data to understand the **brand and product** — what it does, who it's for, what voice it speaks in, what mood it lives in. The captured assets are a brand toolkit for later, not the building blocks the video is made from.
-**Gate:** Site summary printed — strategy-first (what the product does, who it's for, brand voice) before the asset / color / font inventory.
+**Show sign-in status before the brief** — run `npx hyperframes auth status` and **relay its output verbatim (don't paraphrase or rewrite it).** It reports whether voice/BGM will use HeyGen or local engines and, when not signed in, how to sign in. **If not signed in, STOP and wait for the user to choose — sign in, or say "go"/"offline" to continue with local engines — before asking for the brief or anything else.** Treat it as a real decision point, not a passing note; don't fold the choice into the brief question, and don't write keys into a per-repo `.env`. (In autonomous mode, note the status and continue offline.) See `../hyperframes-media` → Preflight for the canonical guidance.
+
+**Gate:** Site summary printed — strategy-first (what the product does, who it's for, brand voice) before the asset / color / font inventory; sign-in status was shown (signed in, or continuing offline).
---