diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 0000000000000..3a748a0bd3a41 --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,161 @@ +# CLAUDE.md + +This file provides guidance to coding agents when working with code in this repository. + +## Repository Overview + +Cube is a semantic layer for building data applications. This is a monorepo containing the complete Cube ecosystem including: +- Cube backend server and core components +- Client libraries for JavaScript/React/Vue/Angular +- Database drivers for various data sources +- Documentation site +- Rust components (CubeSQL, CubeStore) + +## Development Commands + +**Note: This project uses Yarn as the package manager.** + +### Core Build Commands +```bash +# Build all packages +yarn build + +# Run TypeScript compilation across all packages +yarn tsc + +# Watch mode for TypeScript compilation +yarn tsc:watch + +# Clean build artifacts +yarn clean + +# Run linting across all packages +yarn lint + +# Fix linting issues +yarn lint:fix + +# Lint package.json files +yarn lint:npm +``` + +### Testing Commands +```bash +# Run tests (most packages have individual test commands) +yarn test + +# Test individual packages +cd packages/cubejs-[package-name] +yarn test +``` + +### Documentation Development + +**IMPORTANT: `/docs-mintlify` is the active documentation site. `/docs` is the legacy +docs site and is deprecated — do NOT add or edit content there.** When asked to write or +update documentation, work in `/docs-mintlify` unless the user explicitly says otherwise. + +```bash +cd docs-mintlify +yarn dev # Start the Mintlify dev server +``` + +- Content is authored as `.mdx` under topic directories (e.g. `admin/ai/`, `docs/explore-analyze/`). +- Frontmatter uses `title` and `description` keys. +- Navigation is registered in `docs-mintlify/docs.json` (pages must be added to the + relevant `group` to appear in the sidebar). +- Use Mintlify components: ``, ``, ``, ``, ``/``, + ``/``. 
Internal links are root-relative (e.g. `/admin/ai/rules`). +- See `docs-mintlify/CLAUDE.md` for full conventions. + +## Architecture Overview + +### Monorepo Structure +- **`/packages`**: All JavaScript/TypeScript packages managed by Lerna + - Core packages: `cubejs-server-core`, `cubejs-schema-compiler`, `cubejs-query-orchestrator` + - Client libraries: `cubejs-client-core`, `cubejs-client-react`, etc. + - Database drivers: `cubejs-postgres-driver`, `cubejs-bigquery-driver`, etc. + - API layer: `cubejs-api-gateway` +- **`/rust`**: Rust components including CubeSQL (SQL interface) and CubeStore (distributed storage) +- **`/docs-mintlify`**: Mintlify documentation site — **the active docs site** (author docs here) +- **`/docs`**: Legacy Next.js/Nextra documentation site — **deprecated**, do not edit +- **`/examples`**: Example implementations and recipes + +### Key Components +1. **Schema Compiler**: Compiles data models into executable queries +2. **Query Orchestrator**: Manages query execution, caching, and pre-aggregations +3. **API Gateway**: Provides REST, GraphQL, and SQL APIs +4. **CubeSQL**: Postgres-compatible SQL interface (Rust) +5. **CubeStore**: Distributed OLAP storage engine (Rust) +6. 
**Tesseract**: Native SQL planner (Rust) located in `/rust/cube/cubesqlplanner` - enabled via `CUBESQL_SQL_PUSH_DOWN=true` environment variable + +### Package Management +- Uses Yarn workspaces with Lerna for package management +- TypeScript compilation is coordinated across packages +- Jest for unit testing with package-specific configurations + +## Testing Approach + +### Unit Tests +- Most packages have Jest-based unit tests in `/test` directories +- TypeScript packages use `jest.config.js` with TypeScript compilation +- Snapshot testing for SQL compilation and query planning + +### Integration Tests +- Driver-specific integration tests in `/packages/cubejs-testing-drivers` +- End-to-end tests in `/packages/cubejs-testing` +- Docker-based testing environments for database drivers + +### Test Commands +```bash +# Individual package testing +cd packages/[package-name] +yarn test + +# Driver integration tests (requires Docker) +cd packages/cubejs-testing-drivers +yarn test +``` + +## Development Workflow + +1. **Making Changes**: Work in individual packages, changes are coordinated via Lerna +2. **Building**: Use `yarn tsc` to compile TypeScript across all packages +3. **Testing**: Run relevant tests for modified packages +4. **Linting**: Ensure code passes `yarn lint` before committing + +## Git + +Use conventional commits with these prefixes: +- `feat:` — new features +- `fix:` — bug fixes +- `docs:` — documentation changes +- `refactor:` — code refactoring + +Include scope in parentheses when applicable, e.g., `fix(tesseract):` or `feat(databricks-jdbc-driver):`. 
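As a rough illustration of the commit convention above, the sketch below parses a commit subject line into its prefix, optional scope, and description. The helper name `parseCommitSubject`, the regular expression, and the decision to reject any prefix other than the four listed are assumptions made for this example; the repository may accept other prefixes or enforce the convention differently.

```typescript
// Hypothetical helper, not part of the repository: checks a commit subject
// against the prefixes listed above (feat, fix, docs, refactor).
const PREFIXES = ["feat", "fix", "docs", "refactor"] as const;
type Prefix = (typeof PREFIXES)[number];

interface ParsedSubject {
  type: Prefix;
  scope?: string; // present only when the subject has a "(scope)" part
  description: string;
}

function parseCommitSubject(subject: string): ParsedSubject | null {
  // Assumed shape: "<type>(<optional scope>): <description>"
  const match = /^([a-z]+)(?:\(([^()\s]+)\))?: (\S.*)$/.exec(subject);
  if (!match) {
    return null;
  }
  const type = match[1];
  // Reject prefixes outside the list named in the section above.
  if ((PREFIXES as readonly string[]).indexOf(type) === -1) {
    return null;
  }
  return { type: type as Prefix, scope: match[2], description: match[3] };
}
```

Under these assumptions, `parseCommitSubject("fix(tesseract): handle nulls")` yields the type `fix` with scope `tesseract`, while a subject with an unlisted prefix or a missing space after the colon returns `null`.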
+ +## Common File Patterns + +- `*.test.ts/js`: Jest unit tests +- `jest.config.js`: Jest configuration per package +- `tsconfig.json`: TypeScript configuration (inherits from root) +- `CHANGELOG.md`: Per-package changelogs maintained by Lerna +- `src/`: Source code directory +- `dist/`: Compiled output (not committed) + +## Important Notes + +- Documentation lives in `/docs-mintlify` (active, Mintlify). `/docs` is the legacy docs + site and is deprecated — do not add or edit content there. See `docs-mintlify/CLAUDE.md`. +- The main Cube application development happens in `/packages` +- For data model changes, focus on `cubejs-schema-compiler` package +- For query execution changes, focus on `cubejs-query-orchestrator` package +- Database connectivity is handled by individual driver packages + +## Key Dependencies + +- **Lerna**: Monorepo management and publishing +- **TypeScript**: Primary language for most packages +- **Jest**: Testing framework +- **Rollup**: Bundling for client libraries +- **Docker**: Testing environments for database drivers \ No newline at end of file diff --git a/CLAUDE.md b/CLAUDE.md index db985ff947ba3..43c994c2d3617 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -1,161 +1 @@ -# CLAUDE.md - -This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. - -## Repository Overview - -Cube is a semantic layer for building data applications. 
This is a monorepo containing the complete Cube ecosystem including: -- Cube backend server and core components -- Client libraries for JavaScript/React/Vue/Angular -- Database drivers for various data sources -- Documentation site -- Rust components (CubeSQL, CubeStore) - -## Development Commands - -**Note: This project uses Yarn as the package manager.** - -### Core Build Commands -```bash -# Build all packages -yarn build - -# Run TypeScript compilation across all packages -yarn tsc - -# Watch mode for TypeScript compilation -yarn tsc:watch - -# Clean build artifacts -yarn clean - -# Run linting across all packages -yarn lint - -# Fix linting issues -yarn lint:fix - -# Lint package.json files -yarn lint:npm -``` - -### Testing Commands -```bash -# Run tests (most packages have individual test commands) -yarn test - -# Test individual packages -cd packages/cubejs-[package-name] -yarn test -``` - -### Documentation Development - -**IMPORTANT: `/docs-mintlify` is the active documentation site. `/docs` is the legacy -docs site and is deprecated — do NOT add or edit content there.** When asked to write or -update documentation, work in `/docs-mintlify` unless the user explicitly says otherwise. - -```bash -cd docs-mintlify -yarn dev # Start the Mintlify dev server -``` - -- Content is authored as `.mdx` under topic directories (e.g. `admin/ai/`, `docs/explore-analyze/`). -- Frontmatter uses `title` and `description` keys. -- Navigation is registered in `docs-mintlify/docs.json` (pages must be added to the - relevant `group` to appear in the sidebar). -- Use Mintlify components: ``, ``, ``, ``, ``/``, - ``/``. Internal links are root-relative (e.g. `/admin/ai/rules`). -- See `docs-mintlify/CLAUDE.md` for full conventions. 
- -## Architecture Overview - -### Monorepo Structure -- **`/packages`**: All JavaScript/TypeScript packages managed by Lerna - - Core packages: `cubejs-server-core`, `cubejs-schema-compiler`, `cubejs-query-orchestrator` - - Client libraries: `cubejs-client-core`, `cubejs-client-react`, etc. - - Database drivers: `cubejs-postgres-driver`, `cubejs-bigquery-driver`, etc. - - API layer: `cubejs-api-gateway` -- **`/rust`**: Rust components including CubeSQL (SQL interface) and CubeStore (distributed storage) -- **`/docs-mintlify`**: Mintlify documentation site — **the active docs site** (author docs here) -- **`/docs`**: Legacy Next.js/Nextra documentation site — **deprecated**, do not edit -- **`/examples`**: Example implementations and recipes - -### Key Components -1. **Schema Compiler**: Compiles data models into executable queries -2. **Query Orchestrator**: Manages query execution, caching, and pre-aggregations -3. **API Gateway**: Provides REST, GraphQL, and SQL APIs -4. **CubeSQL**: Postgres-compatible SQL interface (Rust) -5. **CubeStore**: Distributed OLAP storage engine (Rust) -6. 
**Tesseract**: Native SQL planner (Rust) located in `/rust/cube/cubesqlplanner` - enabled via `CUBESQL_SQL_PUSH_DOWN=true` environment variable - -### Package Management -- Uses Yarn workspaces with Lerna for package management -- TypeScript compilation is coordinated across packages -- Jest for unit testing with package-specific configurations - -## Testing Approach - -### Unit Tests -- Most packages have Jest-based unit tests in `/test` directories -- TypeScript packages use `jest.config.js` with TypeScript compilation -- Snapshot testing for SQL compilation and query planning - -### Integration Tests -- Driver-specific integration tests in `/packages/cubejs-testing-drivers` -- End-to-end tests in `/packages/cubejs-testing` -- Docker-based testing environments for database drivers - -### Test Commands -```bash -# Individual package testing -cd packages/[package-name] -yarn test - -# Driver integration tests (requires Docker) -cd packages/cubejs-testing-drivers -yarn test -``` - -## Development Workflow - -1. **Making Changes**: Work in individual packages, changes are coordinated via Lerna -2. **Building**: Use `yarn tsc` to compile TypeScript across all packages -3. **Testing**: Run relevant tests for modified packages -4. **Linting**: Ensure code passes `yarn lint` before committing - -## Git - -Use conventional commits with these prefixes: -- `feat:` — new features -- `fix:` — bug fixes -- `docs:` — documentation changes -- `refactor:` — code refactoring - -Include scope in parentheses when applicable, e.g., `fix(tesseract):` or `feat(databricks-jdbc-driver):`. 
- -## Common File Patterns - -- `*.test.ts/js`: Jest unit tests -- `jest.config.js`: Jest configuration per package -- `tsconfig.json`: TypeScript configuration (inherits from root) -- `CHANGELOG.md`: Per-package changelogs maintained by Lerna -- `src/`: Source code directory -- `dist/`: Compiled output (not committed) - -## Important Notes - -- Documentation lives in `/docs-mintlify` (active, Mintlify). `/docs` is the legacy docs - site and is deprecated — do not add or edit content there. See `docs-mintlify/CLAUDE.md`. -- The main Cube application development happens in `/packages` -- For data model changes, focus on `cubejs-schema-compiler` package -- For query execution changes, focus on `cubejs-query-orchestrator` package -- Database connectivity is handled by individual driver packages - -## Key Dependencies - -- **Lerna**: Monorepo management and publishing -- **TypeScript**: Primary language for most packages -- **Jest**: Testing framework -- **Rollup**: Bundling for client libraries -- **Docker**: Testing environments for database drivers \ No newline at end of file +@AGENTS.md diff --git a/docs-mintlify/AGENTS.md b/docs-mintlify/AGENTS.md new file mode 100644 index 0000000000000..0bf4c484c5310 --- /dev/null +++ b/docs-mintlify/AGENTS.md @@ -0,0 +1,98 @@ +# Cube Documentation (Mintlify) + +This is the **active** Cube documentation site, built with [Mintlify](https://mintlify.com). +All documentation work should happen here. + +> The `/docs` directory at the repo root is the **legacy** Nextra docs site and is +> **deprecated** — do not add or edit content there. + +## Local development + +```bash +cd docs-mintlify +yarn dev # Start the Mintlify dev server +``` + +## Naming conventions + +Product naming conventions (product names, taxonomy, deployment types, plan +tiers, API names) are maintained in a shared rules file — follow it in all +docs content: + +@../.cursor/rules/namings-rule.mdc + +## Writing style + +- **Tone**: professional, direct, instructive. 
Address the reader as "you" (second person). +- **Headings**: one H1 is provided by the frontmatter `title` — start body sections at H2 (`##`). +- **Code**: always specify a language fence (` ```yaml`, ` ```markdown`, ` ```text`). Use + inline backticks for identifiers (`accessible_views`, `agents/rules/`). +- **Paragraphs**: keep them short; use `-` bullet lists for multiple items. + +## File and frontmatter conventions + +- Content is `.mdx`, organized by topic directory (e.g. `admin/ai/`, `docs/explore-analyze/`). +- The file path maps to the URL: `admin/ai/rules.mdx` → `/admin/ai/rules`. +- Every page starts with YAML frontmatter using `title` and `description`: + + ```mdx + --- + title: Rules + description: One-sentence summary used for SEO and navigation previews. + --- + ``` + +- **Do not** add an H1 in the body — the `title` is the page heading. + +## Navigation + +Navigation is defined in `docs-mintlify/docs.json`. A new page only appears in the sidebar +once its path (without the `.mdx` extension) is added to the appropriate `group` in +`docs.json`. After adding a page, update `docs.json` and verify it is still valid JSON. + +## Components + +Mintlify provides these components (used throughout the docs): + +- Callouts: ``, ``, ``, ``, `` +- `` with nested `` for sequential instructions +- `` with nested `` +- `` / ``, `` / ``, `` for images + +Content inside callouts and steps is plain MDX. Internal links are root-relative +(`/admin/ai/skills`), not file paths. + +## Preview features + +Every page documenting a feature that is in **preview** must open with a `` +callout — placed right after the frontmatter, before the body — saying the feature is +in preview and that the user should reach out to the Cube support team to activate it +for their account: + +```mdx + + + is currently in preview, and the user experience and file format may +still change. 
Reach out to the [Cube support team](/admin/account-billing/support) +to activate this feature for your account. + + +``` + +Adapt the "may still change" sentence per feature; the "in preview" + "reach out to +the Cube support team to activate it for your account" parts are required. Do not +expose internal feature-flag names in public docs. + +## Images and screenshots + +Wrap screenshots in `` and store assets under `images/`. When a screenshot is +needed but not yet available, leave an MDX comment placeholder: `{/* TODO: screenshot — ... */}`. + +## AI / agent docs structure + +The agent configuration (code-first, developer-facing) lives under `admin/ai/`: +`rules.mdx`, `certified-queries.mdx`, `skills.mdx`, `memory-isolation.mdx`, +`multi-agent.mdx`, `bring-your-own-model.mdx`. The end-user chat experience +(explorer/viewer-facing) lives under `docs/explore-analyze/` (e.g. `analytics-chat.mdx`, +`skills.mdx`). Keep authoring docs in `admin/ai/` and usage docs in `docs/explore-analyze/`, +and cross-link the two. diff --git a/docs-mintlify/CLAUDE.md b/docs-mintlify/CLAUDE.md index 0bf4c484c5310..43c994c2d3617 100644 --- a/docs-mintlify/CLAUDE.md +++ b/docs-mintlify/CLAUDE.md @@ -1,98 +1 @@ -# Cube Documentation (Mintlify) - -This is the **active** Cube documentation site, built with [Mintlify](https://mintlify.com). -All documentation work should happen here. - -> The `/docs` directory at the repo root is the **legacy** Nextra docs site and is -> **deprecated** — do not add or edit content there. - -## Local development - -```bash -cd docs-mintlify -yarn dev # Start the Mintlify dev server -``` - -## Naming conventions - -Product naming conventions (product names, taxonomy, deployment types, plan -tiers, API names) are maintained in a shared rules file — follow it in all -docs content: - -@../.cursor/rules/namings-rule.mdc - -## Writing style - -- **Tone**: professional, direct, instructive. Address the reader as "you" (second person). 
-- **Headings**: one H1 is provided by the frontmatter `title` — start body sections at H2 (`##`). -- **Code**: always specify a language fence (` ```yaml`, ` ```markdown`, ` ```text`). Use - inline backticks for identifiers (`accessible_views`, `agents/rules/`). -- **Paragraphs**: keep them short; use `-` bullet lists for multiple items. - -## File and frontmatter conventions - -- Content is `.mdx`, organized by topic directory (e.g. `admin/ai/`, `docs/explore-analyze/`). -- The file path maps to the URL: `admin/ai/rules.mdx` → `/admin/ai/rules`. -- Every page starts with YAML frontmatter using `title` and `description`: - - ```mdx - --- - title: Rules - description: One-sentence summary used for SEO and navigation previews. - --- - ``` - -- **Do not** add an H1 in the body — the `title` is the page heading. - -## Navigation - -Navigation is defined in `docs-mintlify/docs.json`. A new page only appears in the sidebar -once its path (without the `.mdx` extension) is added to the appropriate `group` in -`docs.json`. After adding a page, update `docs.json` and verify it is still valid JSON. - -## Components - -Mintlify provides these components (used throughout the docs): - -- Callouts: ``, ``, ``, ``, `` -- `` with nested `` for sequential instructions -- `` with nested `` -- `` / ``, `` / ``, `` for images - -Content inside callouts and steps is plain MDX. Internal links are root-relative -(`/admin/ai/skills`), not file paths. - -## Preview features - -Every page documenting a feature that is in **preview** must open with a `` -callout — placed right after the frontmatter, before the body — saying the feature is -in preview and that the user should reach out to the Cube support team to activate it -for their account: - -```mdx - - - is currently in preview, and the user experience and file format may -still change. Reach out to the [Cube support team](/admin/account-billing/support) -to activate this feature for your account. 
- - -``` - -Adapt the "may still change" sentence per feature; the "in preview" + "reach out to -the Cube support team to activate it for your account" parts are required. Do not -expose internal feature-flag names in public docs. - -## Images and screenshots - -Wrap screenshots in `` and store assets under `images/`. When a screenshot is -needed but not yet available, leave an MDX comment placeholder: `{/* TODO: screenshot — ... */}`. - -## AI / agent docs structure - -The agent configuration (code-first, developer-facing) lives under `admin/ai/`: -`rules.mdx`, `certified-queries.mdx`, `skills.mdx`, `memory-isolation.mdx`, -`multi-agent.mdx`, `bring-your-own-model.mdx`. The end-user chat experience -(explorer/viewer-facing) lives under `docs/explore-analyze/` (e.g. `analytics-chat.mdx`, -`skills.mdx`). Keep authoring docs in `admin/ai/` and usage docs in `docs/explore-analyze/`, -and cross-link the two. +@AGENTS.md diff --git a/docs/AGENTS.md b/docs/AGENTS.md new file mode 100644 index 0000000000000..2afc1406befe6 --- /dev/null +++ b/docs/AGENTS.md @@ -0,0 +1,210 @@ +# Cube Documentation (LEGACY — DEPRECATED) + +> **This `/docs` site is deprecated. Do not add or edit content here.** +> The active documentation site is `/docs-mintlify` (Mintlify). Write all new and updated +> docs there — see `docs-mintlify/CLAUDE.md` for conventions. The guidance below is kept +> only for reference to the legacy site. + +This file provides guidance to coding agents when working with the documentation site. + +## Writing Style + +**Tone**: Professional, direct, and instructive. Address the reader as "you" in second person. + +**Good**: "You can connect a Cube deployment to Metabase using the SQL API." +**Avoid**: "One can connect..." or "Users can connect..." 
+ +**Headings**: +- H1 (`#`) for page title only (one per page) +- H2 (`##`) for major sections +- H3 (`###`) for subsections +- H4 (`####`) rarely, only for deep nesting + +**Code**: +- Always specify language: ` ```python`, ` ```yaml`, ` ```javascript` +- Use `filename=` attribute when helpful: ` ```python filename="cube.py"` +- Inline code with backticks for identifiers: `driver_factory`, `security_context`, `pre_aggregations` + +**Links**: +- Define references at file bottom: + ``` + [ref-config]: /product/configuration + [ref-env-vars]: /product/configuration/reference/environment-variables + ``` +- Use reference syntax inline: `[configuration options][ref-config]` + +**Paragraphs**: Keep moderate length (3-4 sentences). Use bullet lists (with `-`) for multiple items. + +## Custom Components + +### Alert Boxes + +Use for callouts. Content should be on separate lines from the tags. + +**InfoBox** — informational notes: +```mdx + + +Scheduled refreshes are available on [Premium and Enterprise plans](https://cube.dev/pricing). + + +``` + +**WarningBox** — important warnings: +```mdx + + +Cube expects the context to be an object. If you don't provide an object as the +JWT payload, you will receive an error. + + +``` + +**SuccessBox** — availability or positive notes: +```mdx + + +Presentation tools are available in both Cube Cloud and Cube Core. + + +``` + +**ReferenceBox** — links to related documentation: +```mdx + + +See [Cube style guide][ref-style-guide] for more recommendations on syntax and structure. + + +``` + +### Code Tabs (for multi-language examples) + +````mdx + + +```python +from cube import config +``` + +```javascript +const config = {} +``` + + +```` + +### UI Navigation + +```mdx +Settings → Configuration +``` + +### Environment Variables + +```mdx +CUBEJS_DB_SSL +``` + +Auto-links to the environment variables reference. 
+ +### Images + +```mdx + + + +``` + +### Videos + +```mdx + + +``` + +### Grids (for navigation cards) + +```mdx + + + +``` + +### Community Drivers + +```mdx + +``` + +## Documentation Structure + +### File Organization + +- Content lives in `/content/product/` +- Each directory needs `_meta.js` for navigation +- Use `index.mdx` with `asIndexPage: true` frontmatter for section overviews + +### _meta.js Files + +Define navigation order and display names: + +```javascript +export default { + "introduction": "Introduction", + "getting-started": "Getting started", + "configuration": "Data Sources & Config" +} +``` + +Hide pages from navigation: + +```javascript +export default { + "visible-page": "Visible Page", + "hidden-page": { + title: "Hidden Page", + display: "hidden" + } +} +``` + +### index.mdx Files + +Create section landing pages: + +```mdx +--- +asIndexPage: true +--- + +# Section Title + +Overview content here... +``` + +### URL Mapping + +File paths map directly to URLs: +- `configuration/data-sources/postgres.mdx` → `/product/configuration/data-sources/postgres` + +## Redirects + +When moving or renaming pages, add redirects to `redirects.json`: + +```json +{ + "source": "/old/path", + "destination": "/new/path", + "permanent": true +} +``` + +Always use `"permanent": true` for documentation moves. diff --git a/docs/CLAUDE.md b/docs/CLAUDE.md index 26329abedf60c..43c994c2d3617 100644 --- a/docs/CLAUDE.md +++ b/docs/CLAUDE.md @@ -1,210 +1 @@ -# Cube Documentation (LEGACY — DEPRECATED) - -> **This `/docs` site is deprecated. Do not add or edit content here.** -> The active documentation site is `/docs-mintlify` (Mintlify). Write all new and updated -> docs there — see `docs-mintlify/CLAUDE.md` for conventions. The guidance below is kept -> only for reference to the legacy site. - -This file provides guidance to Claude Code when working with the documentation site. - -## Writing Style - -**Tone**: Professional, direct, and instructive. 
Address the reader as "you" in second person. - -**Good**: "You can connect a Cube deployment to Metabase using the SQL API." -**Avoid**: "One can connect..." or "Users can connect..." - -**Headings**: -- H1 (`#`) for page title only (one per page) -- H2 (`##`) for major sections -- H3 (`###`) for subsections -- H4 (`####`) rarely, only for deep nesting - -**Code**: -- Always specify language: ` ```python`, ` ```yaml`, ` ```javascript` -- Use `filename=` attribute when helpful: ` ```python filename="cube.py"` -- Inline code with backticks for identifiers: `driver_factory`, `security_context`, `pre_aggregations` - -**Links**: -- Define references at file bottom: - ``` - [ref-config]: /product/configuration - [ref-env-vars]: /product/configuration/reference/environment-variables - ``` -- Use reference syntax inline: `[configuration options][ref-config]` - -**Paragraphs**: Keep moderate length (3-4 sentences). Use bullet lists (with `-`) for multiple items. - -## Custom Components - -### Alert Boxes - -Use for callouts. Content should be on separate lines from the tags. - -**InfoBox** — informational notes: -```mdx - - -Scheduled refreshes are available on [Premium and Enterprise plans](https://cube.dev/pricing). - - -``` - -**WarningBox** — important warnings: -```mdx - - -Cube expects the context to be an object. If you don't provide an object as the -JWT payload, you will receive an error. - - -``` - -**SuccessBox** — availability or positive notes: -```mdx - - -Presentation tools are available in both Cube Cloud and Cube Core. - - -``` - -**ReferenceBox** — links to related documentation: -```mdx - - -See [Cube style guide][ref-style-guide] for more recommendations on syntax and structure. 
- - -``` - -### Code Tabs (for multi-language examples) - -````mdx - - -```python -from cube import config -``` - -```javascript -const config = {} -``` - - -```` - -### UI Navigation - -```mdx -Settings → Configuration -``` - -### Environment Variables - -```mdx -CUBEJS_DB_SSL -``` - -Auto-links to the environment variables reference. - -### Images - -```mdx - - - -``` - -### Videos - -```mdx - - -``` - -### Grids (for navigation cards) - -```mdx - - - -``` - -### Community Drivers - -```mdx - -``` - -## Documentation Structure - -### File Organization - -- Content lives in `/content/product/` -- Each directory needs `_meta.js` for navigation -- Use `index.mdx` with `asIndexPage: true` frontmatter for section overviews - -### _meta.js Files - -Define navigation order and display names: - -```javascript -export default { - "introduction": "Introduction", - "getting-started": "Getting started", - "configuration": "Data Sources & Config" -} -``` - -Hide pages from navigation: - -```javascript -export default { - "visible-page": "Visible Page", - "hidden-page": { - title: "Hidden Page", - display: "hidden" - } -} -``` - -### index.mdx Files - -Create section landing pages: - -```mdx ---- -asIndexPage: true ---- - -# Section Title - -Overview content here... -``` - -### URL Mapping - -File paths map directly to URLs: -- `configuration/data-sources/postgres.mdx` → `/product/configuration/data-sources/postgres` - -## Redirects - -When moving or renaming pages, add redirects to `redirects.json`: - -```json -{ - "source": "/old/path", - "destination": "/new/path", - "permanent": true -} -``` - -Always use `"permanent": true` for documentation moves. 
+@AGENTS.md diff --git a/packages/cubejs-backend-shared/AGENTS.md b/packages/cubejs-backend-shared/AGENTS.md new file mode 100644 index 0000000000000..e0736c94b9920 --- /dev/null +++ b/packages/cubejs-backend-shared/AGENTS.md @@ -0,0 +1,44 @@ +# CLAUDE.md + +This file provides guidance to coding agents when working with code in this repository. + +## Package Overview + +The `@cubejs-backend/shared` package contains shared utilities, types, and helper functions used across all Cube backend packages. This package provides core functionality like environment configuration, promise utilities, decorators, and common data types. + +## Development Commands + +**Note: This project uses Yarn as the package manager.** + +```bash +# Build the package +yarn build + +# Build with TypeScript compilation +yarn tsc + +# Watch mode for development +yarn watch + +# Run unit tests +yarn unit + +# Run linting +yarn lint + +# Fix linting issues +yarn lint:fix +``` + +## Architecture Overview + +### Core Components + +The shared package is organized into several key modules: + +1. **Environment Configuration** (`src/env.ts`): Centralized environment variable management with type safety and validation +2. **Promise Utilities** (`src/promises.ts`): Async helpers including debouncing, memoization, cancellation, and retry logic +3. **Decorators** (`src/decorators.ts`): Method decorators for cross-cutting concerns like async debouncing +4. **Type Helpers** (`src/type-helpers.ts`): Common TypeScript utility types used across packages +5. **Time Utilities** (`src/time.ts`): Date/time manipulation and formatting functions +6. 
**Process Utilities** (`src/process.ts`): Process management and platform detection
diff --git a/packages/cubejs-backend-shared/CLAUDE.md b/packages/cubejs-backend-shared/CLAUDE.md
index 8217b7870741d..43c994c2d3617 100644
--- a/packages/cubejs-backend-shared/CLAUDE.md
+++ b/packages/cubejs-backend-shared/CLAUDE.md
@@ -1,44 +1 @@
-# CLAUDE.md
-
-This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
-
-## Package Overview
-
-The `@cubejs-backend/shared` package contains shared utilities, types, and helper functions used across all Cube backend packages. This package provides core functionality like environment configuration, promise utilities, decorators, and common data types.
-
-## Development Commands
-
-**Note: This project uses Yarn as the package manager.**
-
-```bash
-# Build the package
-yarn build
-
-# Build with TypeScript compilation
-yarn tsc
-
-# Watch mode for development
-yarn watch
-
-# Run unit tests
-yarn unit
-
-# Run linting
-yarn lint
-
-# Fix linting issues
-yarn lint:fix
-```
-
-## Architecture Overview
-
-### Core Components
-
-The shared package is organized into several key modules:
-
-1. **Environment Configuration** (`src/env.ts`): Centralized environment variable management with type safety and validation
-2. **Promise Utilities** (`src/promises.ts`): Async helpers including debouncing, memoization, cancellation, and retry logic
-3. **Decorators** (`src/decorators.ts`): Method decorators for cross-cutting concerns like async debouncing
-4. **Type Helpers** (`src/type-helpers.ts`): Common TypeScript utility types used across packages
-5. **Time Utilities** (`src/time.ts`): Date/time manipulation and formatting functions
-6. **Process Utilities** (`src/process.ts`): Process management and platform detection
+@AGENTS.md
diff --git a/packages/cubejs-query-orchestrator/AGENTS.md b/packages/cubejs-query-orchestrator/AGENTS.md
new file mode 100644
index 0000000000000..f7bc1b007d5e5
--- /dev/null
+++ b/packages/cubejs-query-orchestrator/AGENTS.md
@@ -0,0 +1,131 @@
+# CLAUDE.md
+
+This file provides guidance to coding agents when working with code in this repository.
+
+## Package Overview
+
+The Query Orchestrator is a multi-stage querying engine that manages query execution, caching, and pre-aggregations in Cube. It receives pre-aggregation SQL queries and executes them in exact order, ensuring up-to-date data structure and freshness.
+
+## Development Commands
+
+**Note: This project uses Yarn as the package manager.**
+
+```bash
+# Build the package
+yarn build
+
+# Build with TypeScript compilation
+yarn tsc
+
+# Watch mode for development
+yarn watch
+
+# Run all tests (unit + integration)
+yarn test
+
+# Run only unit tests
+yarn unit
+
+# Run only integration tests
+yarn integration
+
+# Run CubeStore integration tests specifically
+yarn integration:cubestore
+
+# Run linting
+yarn lint
+
+# Fix linting issues
+yarn lint:fix
+```
+
+## Architecture Overview
+
+### Core Components
+
+The Query Orchestrator consists of several interconnected components:
+
+1. **QueryOrchestrator** (`src/orchestrator/QueryOrchestrator.ts`): Main orchestration class that coordinates query execution and manages drivers
+2. **QueryCache** (`src/orchestrator/QueryCache.ts`): Handles query result caching with configurable cache drivers
+3. **QueryQueue** (`src/orchestrator/QueryQueue.ts`): Manages query queuing and background processing
+4. **PreAggregations** (`src/orchestrator/PreAggregations.ts`): Manages pre-aggregation building and loading
+5. **DriverFactory** (`src/orchestrator/DriverFactory.ts`): Creates and manages database driver instances
+
+### Cache and Queue Driver Architecture
+
+The orchestrator supports multiple backend drivers:
+- **Memory**: In-memory caching and queuing (development)
+- **CubeStore**: Distributed storage engine (production)
+- **Redis**: External Redis-based caching (legacy, being phased out)
+
+Driver selection logic in `QueryOrchestrator.ts:detectQueueAndCacheDriver()`:
+- Explicit configuration via `cacheAndQueueDriver` option
+- Environment variables (`CUBEJS_CACHE_AND_QUEUE_DRIVER`)
+- Auto-detection: Redis if `CUBEJS_REDIS_URL` exists, CubeStore for production, Memory for development
+
+### Query Processing Flow
+
+1. **Query Submission**: Queries enter through QueryOrchestrator
+2. **Cache Check**: QueryCache checks for existing results
+3. **Queue Management**: QueryQueue handles background execution
+4. **Pre-aggregation Processing**: PreAggregations component manages rollup tables
+5. **Result Caching**: Results stored via cache driver for future requests
+
+### Pre-aggregation System
+
+The pre-aggregation system includes:
+- **PreAggregationLoader**: Loads pre-aggregation definitions
+- **PreAggregationPartitionRangeLoader**: Handles partition range loading
+- **PreAggregationLoadCache**: Manages loading cache for pre-aggregations
+
+## Testing Structure
+
+### Unit Tests (`test/unit/`)
+- `QueryCache.test.ts`: Query caching functionality
+- `QueryQueue.test.ts`: Queue management and processing
+- `QueryOrchestrator.test.js`: Main orchestrator logic
+- `PreAggregations.test.js`: Pre-aggregation management
+
+### Integration Tests (`test/integration/`)
+- `cubestore/`: CubeStore-specific integration tests
+- Tests real database interactions and queue processing
+
+### Test Abstractions
+- `QueryCache.abstract.ts`: Shared test suite for cache implementations
+- `QueryQueue.abstract.ts`: Shared test suite for queue implementations
+
+## Key Design Patterns
+
+### Queue Processing Architecture
+The DEVELOPMENT.md file contains detailed sequence diagrams showing:
+- Queue interaction with CubeStore via specific queue commands (`QUEUE ADD`, `QUEUE RETRIEVE`, etc.)
+- Background query processing with heartbeat management
+- Result handling and cleanup
+
+### Driver Factory Pattern
+- `DriverFactory` type enables pluggable database drivers
+- `DriverFactoryByDataSource` supports multi-tenant scenarios
+- Separation between external (user data) and internal (cache/queue) drivers
+
+### Error Handling
+- `ContinueWaitError`: Signals when queries should continue waiting
+- `TimeoutError`: Handles query timeout scenarios
+- Proper cleanup and resource management across all components
+
+## Configuration
+
+Key configuration options in `QueryOrchestratorOptions`:
+- `externalDriverFactory`: Database driver for user data
+- `cacheAndQueueDriver`: Backend for caching and queuing
+- `queryCacheOptions`: Cache-specific settings
+- `preAggregationsOptions`: Pre-aggregation configuration
+- `rollupOnlyMode`: When enabled, only serves pre-aggregated data
+- `continueWaitTimeout`: Timeout for waiting operations
+
+## Development Notes
+
+- Uses TypeScript with relaxed strict settings (`tsconfig.json`)
+- Inherits linting rules from `@cubejs-backend/linter`
+- Jest configuration extends base repository config
+- Docker Compose setup for integration testing
+- Coverage reports generated in `coverage/` directory
diff --git a/packages/cubejs-query-orchestrator/CLAUDE.md b/packages/cubejs-query-orchestrator/CLAUDE.md
index 2867ba7b7f237..43c994c2d3617 100644
--- a/packages/cubejs-query-orchestrator/CLAUDE.md
+++ b/packages/cubejs-query-orchestrator/CLAUDE.md
@@ -1,131 +1 @@
-# CLAUDE.md
-
-This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
-
-## Package Overview
-
-The Query Orchestrator is a multi-stage querying engine that manages query execution, caching, and pre-aggregations in Cube. It receives pre-aggregation SQL queries and executes them in exact order, ensuring up-to-date data structure and freshness.
-
-## Development Commands
-
-**Note: This project uses Yarn as the package manager.**
-
-```bash
-# Build the package
-yarn build
-
-# Build with TypeScript compilation
-yarn tsc
-
-# Watch mode for development
-yarn watch
-
-# Run all tests (unit + integration)
-yarn test
-
-# Run only unit tests
-yarn unit
-
-# Run only integration tests
-yarn integration
-
-# Run CubeStore integration tests specifically
-yarn integration:cubestore
-
-# Run linting
-yarn lint
-
-# Fix linting issues
-yarn lint:fix
-```
-
-## Architecture Overview
-
-### Core Components
-
-The Query Orchestrator consists of several interconnected components:
-
-1. **QueryOrchestrator** (`src/orchestrator/QueryOrchestrator.ts`): Main orchestration class that coordinates query execution and manages drivers
-2. **QueryCache** (`src/orchestrator/QueryCache.ts`): Handles query result caching with configurable cache drivers
-3. **QueryQueue** (`src/orchestrator/QueryQueue.ts`): Manages query queuing and background processing
-4. **PreAggregations** (`src/orchestrator/PreAggregations.ts`): Manages pre-aggregation building and loading
-5. **DriverFactory** (`src/orchestrator/DriverFactory.ts`): Creates and manages database driver instances
-
-### Cache and Queue Driver Architecture
-
-The orchestrator supports multiple backend drivers:
-- **Memory**: In-memory caching and queuing (development)
-- **CubeStore**: Distributed storage engine (production)
-- **Redis**: External Redis-based caching (legacy, being phased out)
-
-Driver selection logic in `QueryOrchestrator.ts:detectQueueAndCacheDriver()`:
-- Explicit configuration via `cacheAndQueueDriver` option
-- Environment variables (`CUBEJS_CACHE_AND_QUEUE_DRIVER`)
-- Auto-detection: Redis if `CUBEJS_REDIS_URL` exists, CubeStore for production, Memory for development
-
-### Query Processing Flow
-
-1. **Query Submission**: Queries enter through QueryOrchestrator
-2. **Cache Check**: QueryCache checks for existing results
-3. **Queue Management**: QueryQueue handles background execution
-4. **Pre-aggregation Processing**: PreAggregations component manages rollup tables
-5. **Result Caching**: Results stored via cache driver for future requests
-
-### Pre-aggregation System
-
-The pre-aggregation system includes:
-- **PreAggregationLoader**: Loads pre-aggregation definitions
-- **PreAggregationPartitionRangeLoader**: Handles partition range loading
-- **PreAggregationLoadCache**: Manages loading cache for pre-aggregations
-
-## Testing Structure
-
-### Unit Tests (`test/unit/`)
-- `QueryCache.test.ts`: Query caching functionality
-- `QueryQueue.test.ts`: Queue management and processing
-- `QueryOrchestrator.test.js`: Main orchestrator logic
-- `PreAggregations.test.js`: Pre-aggregation management
-
-### Integration Tests (`test/integration/`)
-- `cubestore/`: CubeStore-specific integration tests
-- Tests real database interactions and queue processing
-
-### Test Abstractions
-- `QueryCache.abstract.ts`: Shared test suite for cache implementations
-- `QueryQueue.abstract.ts`: Shared test suite for queue implementations
-
-## Key Design Patterns
-
-### Queue Processing Architecture
-The DEVELOPMENT.md file contains detailed sequence diagrams showing:
-- Queue interaction with CubeStore via specific queue commands (`QUEUE ADD`, `QUEUE RETRIEVE`, etc.)
-- Background query processing with heartbeat management
-- Result handling and cleanup
-
-### Driver Factory Pattern
-- `DriverFactory` type enables pluggable database drivers
-- `DriverFactoryByDataSource` supports multi-tenant scenarios
-- Separation between external (user data) and internal (cache/queue) drivers
-
-### Error Handling
-- `ContinueWaitError`: Signals when queries should continue waiting
-- `TimeoutError`: Handles query timeout scenarios
-- Proper cleanup and resource management across all components
-
-## Configuration
-
-Key configuration options in `QueryOrchestratorOptions`:
-- `externalDriverFactory`: Database driver for user data
-- `cacheAndQueueDriver`: Backend for caching and queuing
-- `queryCacheOptions`: Cache-specific settings
-- `preAggregationsOptions`: Pre-aggregation configuration
-- `rollupOnlyMode`: When enabled, only serves pre-aggregated data
-- `continueWaitTimeout`: Timeout for waiting operations
-
-## Development Notes
-
-- Uses TypeScript with relaxed strict settings (`tsconfig.json`)
-- Inherits linting rules from `@cubejs-backend/linter`
-- Jest configuration extends base repository config
-- Docker Compose setup for integration testing
-- Coverage reports generated in `coverage/` directory
+@AGENTS.md
diff --git a/rust/cubesql/AGENTS.md b/rust/cubesql/AGENTS.md
new file mode 100644
index 0000000000000..419843cca51eb
--- /dev/null
+++ b/rust/cubesql/AGENTS.md
@@ -0,0 +1,146 @@
+# CLAUDE.md
+
+This file provides guidance to coding agents when working with code in this repository.
+
+## Repository Overview
+
+CubeSQL is a SQL proxy server that enables SQL-based access to Cube.js semantic layer. It emulates the PostgreSQL wire protocol, allowing standard SQL clients and BI tools to query Cube.js deployments as if they were traditional databases. Note: MySQL protocol support has been deprecated and is no longer available.
+
+This is a Rust workspace containing three crates:
+- **cubesql**: Main SQL proxy server with query compilation and protocol emulation
+- **cubeclient**: Rust client library for Cube.js API communication
+- **pg-srv**: PostgreSQL wire protocol server implementation
+
+## Development Commands
+
+### Prerequisites
+```bash
+# Install required Rust toolchain (1.90.0)
+rustup update
+
+# Install snapshot testing tool
+cargo install cargo-insta
+```
+
+### Core Build Commands
+```bash
+# Build all workspace members
+cargo build
+
+# Build release version
+cargo build --release
+
+# Format code
+cargo fmt
+
+# Run linting (note: many clippy rules are disabled)
+cargo clippy
+```
+
+### Running CubeSQL Server
+```bash
+# Run with required environment variables
+CUBESQL_CUBE_URL=$CUBE_URL/cubejs-api \
+CUBESQL_CUBE_TOKEN=$CUBE_TOKEN \
+CUBESQL_LOG_LEVEL=debug \
+CUBESQL_BIND_ADDR=0.0.0.0:4444 \
+cargo run --bin cubesqld
+
+# Connect via PostgreSQL client
+psql -h 127.0.0.1 -p 4444 -U root
+```
+
+### Testing Commands
+```bash
+# Run all unit tests
+cargo test
+
+# Run specific test module
+cargo test test_introspection
+cargo test test_udfs
+
+# Run integration tests (requires Cube.js instance)
+cargo test --test e2e
+
+# Review snapshot test changes
+cargo insta review
+
+# Run benchmarks
+cargo bench
+```
+
+## Architecture Overview
+
+### Query Processing Pipeline
+1. **Protocol Layer**: Accepts PostgreSQL wire protocol connections
+2. **SQL Parser**: Modified sqlparser-rs parses incoming SQL queries
+3. **Query Rewriter**: egg-based rewrite engine transforms SQL to Cube.js queries
+4. **Compilation**: Generates Cube.js REST API calls or DataFusion execution plans
+5. **Execution**: DataFusion executes queries or proxies to Cube.js
+6. **Result Formatting**: Converts results back to wire protocol format
+
+### Key Components
+
+#### cubesql crate structure:
+- **`/compile`**: SQL compilation and query planning
+  - `/engine`: DataFusion integration and query execution
+  - `/rewrite`: egg-based query optimization rules
+- **`/sql`**: Database protocol implementations
+  - `/postgres`: PostgreSQL system catalog emulation
+  - `/database_variables`: Variable system for PostgreSQL protocol
+- **`/transport`**: Network transport and session management
+- **`/config`**: Configuration and service initialization
+
+#### Testing Approach:
+- **Unit Tests**: Inline tests in source files using `#[cfg(test)]`
+- **Integration Tests**: End-to-end tests in `/e2e` directory
+- **Snapshot Tests**: Extensive use of `insta` for SQL compilation snapshots
+- **BI Tool Tests**: Compatibility tests for Metabase, Tableau, PowerBI, etc.
+
+### Important Implementation Details
+
+1. **DataFusion Integration**: Uses forked Apache Arrow DataFusion for query execution
+2. **Rewrite Rules**: Complex SQL transformations using egg e-graph library
+3. **Protocol Emulation**: Implements enough of PostgreSQL protocol for BI tools
+4. **System Catalogs**: Emulates pg_catalog (PostgreSQL)
+5. **Variable Handling**: Supports SET/SHOW commands for protocol compatibility
+
+## Common Development Tasks
+
+### Adding New SQL Support
+1. Add parsing support in `/compile/parser`
+2. Create rewrite rules in `/compile/rewrite/rules`
+3. Add tests with snapshot expectations
+4. Update protocol-specific handling if needed
+
+### Debugging Query Compilation
+```bash
+# Enable detailed logging
+CUBESQL_LOG_LEVEL=trace cargo run --bin cubesqld
+
+# Check rewrite traces in logs
+# Look for "Rewrite" entries showing transformation steps
+```
+
+### Working with Snapshots
+```bash
+# After making changes that affect SQL compilation
+cargo test
+cargo insta review # Review and accept/reject changes
+```
+
+## Key Dependencies
+
+- **DataFusion**: Query execution engine (forked version with custom modifications)
+- **sqlparser-rs**: SQL parser (forked with CubeSQL-specific extensions)
+- **egg**: E-graph library for query optimization
+- **tokio**: Async runtime for network and I/O operations
+- **pgwire**: PostgreSQL wire protocol implementation
+
+## Important Notes
+
+- This codebase uses heavily modified forks of DataFusion and sqlparser-rs
+- Many clippy lints are disabled due to code generation and complex patterns
+- Integration tests require a running Cube.js instance
+- The rewrite engine is performance-critical and uses advanced optimization techniques
+- Protocol compatibility is paramount for BI tool support
\ No newline at end of file
diff --git a/rust/cubesql/CLAUDE.md b/rust/cubesql/CLAUDE.md
index 9b220d2adfa1f..43c994c2d3617 100644
--- a/rust/cubesql/CLAUDE.md
+++ b/rust/cubesql/CLAUDE.md
@@ -1,146 +1 @@
-# CLAUDE.md
-
-This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
-
-## Repository Overview
-
-CubeSQL is a SQL proxy server that enables SQL-based access to Cube.js semantic layer. It emulates the PostgreSQL wire protocol, allowing standard SQL clients and BI tools to query Cube.js deployments as if they were traditional databases. Note: MySQL protocol support has been deprecated and is no longer available.
-
-This is a Rust workspace containing three crates:
-- **cubesql**: Main SQL proxy server with query compilation and protocol emulation
-- **cubeclient**: Rust client library for Cube.js API communication
-- **pg-srv**: PostgreSQL wire protocol server implementation
-
-## Development Commands
-
-### Prerequisites
-```bash
-# Install required Rust toolchain (1.90.0)
-rustup update
-
-# Install snapshot testing tool
-cargo install cargo-insta
-```
-
-### Core Build Commands
-```bash
-# Build all workspace members
-cargo build
-
-# Build release version
-cargo build --release
-
-# Format code
-cargo fmt
-
-# Run linting (note: many clippy rules are disabled)
-cargo clippy
-```
-
-### Running CubeSQL Server
-```bash
-# Run with required environment variables
-CUBESQL_CUBE_URL=$CUBE_URL/cubejs-api \
-CUBESQL_CUBE_TOKEN=$CUBE_TOKEN \
-CUBESQL_LOG_LEVEL=debug \
-CUBESQL_BIND_ADDR=0.0.0.0:4444 \
-cargo run --bin cubesqld
-
-# Connect via PostgreSQL client
-psql -h 127.0.0.1 -p 4444 -U root
-```
-
-### Testing Commands
-```bash
-# Run all unit tests
-cargo test
-
-# Run specific test module
-cargo test test_introspection
-cargo test test_udfs
-
-# Run integration tests (requires Cube.js instance)
-cargo test --test e2e
-
-# Review snapshot test changes
-cargo insta review
-
-# Run benchmarks
-cargo bench
-```
-
-## Architecture Overview
-
-### Query Processing Pipeline
-1. **Protocol Layer**: Accepts PostgreSQL wire protocol connections
-2. **SQL Parser**: Modified sqlparser-rs parses incoming SQL queries
-3. **Query Rewriter**: egg-based rewrite engine transforms SQL to Cube.js queries
-4. **Compilation**: Generates Cube.js REST API calls or DataFusion execution plans
-5. **Execution**: DataFusion executes queries or proxies to Cube.js
-6. **Result Formatting**: Converts results back to wire protocol format
-
-### Key Components
-
-#### cubesql crate structure:
-- **`/compile`**: SQL compilation and query planning
-  - `/engine`: DataFusion integration and query execution
-  - `/rewrite`: egg-based query optimization rules
-- **`/sql`**: Database protocol implementations
-  - `/postgres`: PostgreSQL system catalog emulation
-  - `/database_variables`: Variable system for PostgreSQL protocol
-- **`/transport`**: Network transport and session management
-- **`/config`**: Configuration and service initialization
-
-#### Testing Approach:
-- **Unit Tests**: Inline tests in source files using `#[cfg(test)]`
-- **Integration Tests**: End-to-end tests in `/e2e` directory
-- **Snapshot Tests**: Extensive use of `insta` for SQL compilation snapshots
-- **BI Tool Tests**: Compatibility tests for Metabase, Tableau, PowerBI, etc.
-
-### Important Implementation Details
-
-1. **DataFusion Integration**: Uses forked Apache Arrow DataFusion for query execution
-2. **Rewrite Rules**: Complex SQL transformations using egg e-graph library
-3. **Protocol Emulation**: Implements enough of PostgreSQL protocol for BI tools
-4. **System Catalogs**: Emulates pg_catalog (PostgreSQL)
-5. **Variable Handling**: Supports SET/SHOW commands for protocol compatibility
-
-## Common Development Tasks
-
-### Adding New SQL Support
-1. Add parsing support in `/compile/parser`
-2. Create rewrite rules in `/compile/rewrite/rules`
-3. Add tests with snapshot expectations
-4. Update protocol-specific handling if needed
-
-### Debugging Query Compilation
-```bash
-# Enable detailed logging
-CUBESQL_LOG_LEVEL=trace cargo run --bin cubesqld
-
-# Check rewrite traces in logs
-# Look for "Rewrite" entries showing transformation steps
-```
-
-### Working with Snapshots
-```bash
-# After making changes that affect SQL compilation
-cargo test
-cargo insta review # Review and accept/reject changes
-```
-
-## Key Dependencies
-
-- **DataFusion**: Query execution engine (forked version with custom modifications)
-- **sqlparser-rs**: SQL parser (forked with CubeSQL-specific extensions)
-- **egg**: E-graph library for query optimization
-- **tokio**: Async runtime for network and I/O operations
-- **pgwire**: PostgreSQL wire protocol implementation
-
-## Important Notes
-
-- This codebase uses heavily modified forks of DataFusion and sqlparser-rs
-- Many clippy lints are disabled due to code generation and complex patterns
-- Integration tests require a running Cube.js instance
-- The rewrite engine is performance-critical and uses advanced optimization techniques
-- Protocol compatibility is paramount for BI tool support
\ No newline at end of file
+@AGENTS.md
diff --git a/rust/cubestore/AGENTS.md b/rust/cubestore/AGENTS.md
new file mode 100644
index 0000000000000..a10312152b1fa
--- /dev/null
+++ b/rust/cubestore/AGENTS.md
@@ -0,0 +1,159 @@
+# CLAUDE.md
+
+This file provides guidance to coding agents when working with code in this repository.
+
+## Repository Overview
+
+CubeStore is the Rust-based distributed OLAP storage engine for Cube.js, designed to store and serve pre-aggregations at scale. It's part of the larger Cube.js monorepo and serves as the materialized cache store for rollup tables.
+
+## Architecture Overview
+
+### Core Components
+
+The codebase is organized as a Rust workspace with multiple crates:
+
+- **`cubestore`**: Main CubeStore implementation with distributed storage, query execution, and API interfaces
+- **`cubestore-sql-tests`**: SQL compatibility test suite and benchmarks
+- **`cubehll`**: HyperLogLog implementation for approximate distinct counting
+- **`cubedatasketches`**: DataSketches integration for advanced approximate algorithms
+- **`cubezetasketch`**: Theta Sketch implementation for set operations
+- **`cuberpc`**: RPC layer for distributed communication
+- **`cuberockstore`**: RocksDB wrapper and storage abstraction
+
+### Key Modules in `cubestore/src/`
+
+- **`metastore/`**: Metadata management, table schemas, partitioning, and distributed coordination
+- **`queryplanner/`**: Query planning, optimization, and physical execution planning using DataFusion
+- **`store/`**: Core storage layer with compaction and data management
+- **`cluster/`**: Distributed cluster management, worker pools, and inter-node communication
+- **`table/`**: Table data handling, Parquet integration, and data redistribution
+- **`cachestore/`**: Caching layer with eviction policies and queue management
+- **`sql/`**: SQL parsing and execution layer
+- **`streaming/`**: Kafka streaming support and traffic handling
+- **`remotefs/`**: Cloud storage integration (S3, GCS, MinIO)
+- **`config/`**: Dependency injection and configuration management
+
+## Development Commands
+
+### Building
+
+```bash
+# Build all crates in release mode
+cargo build --release
+
+# Build all crates in debug mode
+cargo build
+
+# Build specific crate
+cargo build -p cubestore
+
+# Check code without building
+cargo check
+```
+
+### Testing
+
+```bash
+# Run all tests
+cargo test
+
+# Run tests for specific crate
+cargo test -p cubestore
+cargo test -p cubestore-sql-tests
+
+# Run single test
+cargo test test_name
+
+# Run tests with output
+cargo test -- --nocapture
+
+# Run integration tests
+cargo test --test '*'
+
+# Run benchmarks
+cargo bench
+```
+
+### Development
+
+```bash
+# Format code
+cargo fmt
+
+# Check formatting
+cargo fmt -- --check
+
+# Run clippy lints
+cargo clippy
+
+# Run with debug logging
+RUST_LOG=debug cargo run
+
+# Run specific binary
+cargo run --bin cubestore
+
+# Watch for changes (requires cargo-watch)
+cargo watch -x check -x test
+```
+
+### JavaScript Wrapper Commands
+
+```bash
+# Build TypeScript wrapper
+npm run build
+
+# Run JavaScript tests
+npm test
+
+# Lint JavaScript code
+npm run lint
+
+# Fix linting issues
+npm run lint:fix
+```
+
+## Key Dependencies and Technologies
+
+- **DataFusion**: Apache Arrow-based query engine (using Cube's fork)
+- **Apache Arrow/Parquet**: Columnar data format and processing
+- **RocksDB**: Embedded key-value store for metadata
+- **Tokio**: Async runtime for concurrent operations
+- **sqlparser-rs**: SQL parsing (using Cube's fork)
+
+## Configuration via Dependency Injection
+
+The codebase uses a custom dependency injection system defined in `config/injection.rs`. Services are configured through the `Injector` and use `Arc` patterns for abstraction.
+
+## Testing Approach
+
+- Unit tests are colocated with source files using `#[cfg(test)]` modules
+- Integration tests are in `cubestore-sql-tests/tests/`
+- SQL compatibility tests use fixtures in `cubestore-sql-tests/src/tests.rs`
+- Benchmarks are in `benches/` directories
+
+## Important Notes
+
+- **Rust Nightly**: Uses nightly-2025-08-01 (see `rust-toolchain.toml`)
+- Uses custom forks of Arrow/DataFusion and sqlparser-rs for Cube-specific features
+- Distributed mode involves router and worker nodes communicating via RPC
+- Heavy use of async/await patterns with Tokio runtime
+- Parquet files are the primary storage format for data
+
+## Docker Configuration
+
+The project includes Docker configurations for building and deploying CubeStore:
+
+- **`builder.Dockerfile`**: Defines the base build image with Rust nightly-2025-08-01, LLVM 18, and build dependencies
+- **`Dockerfile`**: Production Dockerfile that uses `cubejs/rust-builder:bookworm-llvm-18` base image and copies rust-toolchain.toml
+- **GitHub Actions**: Multiple CI/CD workflows use the same Rust version
+
+## Updating Rust Version
+
+When updating the Rust version, ensure ALL these files are kept in sync:
+
+1. **`rust-toolchain.toml`** - Primary source of truth for local development
+2. **`builder.Dockerfile`** - Update the rustup default command with the new nightly version
+3. **`Dockerfile`** - Copies rust-toolchain.toml (no manual update needed if builder image is updated)
+4. **GitHub Workflows** - Update all occurrences of the Rust nightly version in `.github/workflows/` directory
+
+**Note**: The `cubejs/rust-builder:bookworm-llvm-18` Docker image tag may also need updating if the builder.Dockerfile changes significantly.
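Reviewer sketch, not part of the patch: the "Updating Rust Version" checklist in `rust/cubestore/AGENTS.md` asks for several files to stay in sync, which a small shell check could back up. This is only a sketch under one assumption, namely that the pin appears literally as `nightly-YYYY-MM-DD` in every file; `nightly_tags` and `check_rust_pins` are hypothetical helper names, not scripts that exist in the repository.

```bash
# Print every distinct nightly tag mentioned in a file, one per line.
# Assumption: pins are written literally as nightly-YYYY-MM-DD.
nightly_tags() {
  { grep -oE 'nightly-[0-9]{4}-[0-9]{2}-[0-9]{2}' "$1" || true; } | sort -u
}

# Usage: check_rust_pins <toolchain-file> <other-file>...
# Treats the first file as the source of truth and reports any other
# file that mentions a different nightly. Returns 1 on a mismatch and
# 2 when the first file contains no nightly tag at all.
check_rust_pins() {
  local expected status=0 file tag
  expected=$(nightly_tags "$1" | head -n 1)
  if [ -z "$expected" ]; then
    echo "no nightly tag found in $1" >&2
    return 2
  fi
  shift
  for file in "$@"; do
    for tag in $(nightly_tags "$file"); do
      if [ "$tag" != "$expected" ]; then
        echo "$file: found $tag, expected $expected"
        status=1
      fi
    done
  done
  return $status
}
```

Run from `rust/cubestore`, a call might look like `check_rust_pins rust-toolchain.toml builder.Dockerfile ../../.github/workflows/*.yml`; the `.yml` extension of the workflow files is a guess.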
diff --git a/rust/cubestore/CLAUDE.md b/rust/cubestore/CLAUDE.md
index 8ed13d4deeaf0..43c994c2d3617 100644
--- a/rust/cubestore/CLAUDE.md
+++ b/rust/cubestore/CLAUDE.md
@@ -1,159 +1 @@
-# CLAUDE.md
-
-This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
-
-## Repository Overview
-
-CubeStore is the Rust-based distributed OLAP storage engine for Cube.js, designed to store and serve pre-aggregations at scale. It's part of the larger Cube.js monorepo and serves as the materialized cache store for rollup tables.
-
-## Architecture Overview
-
-### Core Components
-
-The codebase is organized as a Rust workspace with multiple crates:
-
-- **`cubestore`**: Main CubeStore implementation with distributed storage, query execution, and API interfaces
-- **`cubestore-sql-tests`**: SQL compatibility test suite and benchmarks
-- **`cubehll`**: HyperLogLog implementation for approximate distinct counting
-- **`cubedatasketches`**: DataSketches integration for advanced approximate algorithms
-- **`cubezetasketch`**: Theta Sketch implementation for set operations
-- **`cuberpc`**: RPC layer for distributed communication
-- **`cuberockstore`**: RocksDB wrapper and storage abstraction
-
-### Key Modules in `cubestore/src/`
-
-- **`metastore/`**: Metadata management, table schemas, partitioning, and distributed coordination
-- **`queryplanner/`**: Query planning, optimization, and physical execution planning using DataFusion
-- **`store/`**: Core storage layer with compaction and data management
-- **`cluster/`**: Distributed cluster management, worker pools, and inter-node communication
-- **`table/`**: Table data handling, Parquet integration, and data redistribution
-- **`cachestore/`**: Caching layer with eviction policies and queue management
-- **`sql/`**: SQL parsing and execution layer
-- **`streaming/`**: Kafka streaming support and traffic handling
-- **`remotefs/`**: Cloud storage integration (S3, GCS, MinIO)
-- **`config/`**: Dependency injection and configuration management
-
-## Development Commands
-
-### Building
-
-```bash
-# Build all crates in release mode
-cargo build --release
-
-# Build all crates in debug mode
-cargo build
-
-# Build specific crate
-cargo build -p cubestore
-
-# Check code without building
-cargo check
-```
-
-### Testing
-
-```bash
-# Run all tests
-cargo test
-
-# Run tests for specific crate
-cargo test -p cubestore
-cargo test -p cubestore-sql-tests
-
-# Run single test
-cargo test test_name
-
-# Run tests with output
-cargo test -- --nocapture
-
-# Run integration tests
-cargo test --test '*'
-
-# Run benchmarks
-cargo bench
-```
-
-### Development
-
-```bash
-# Format code
-cargo fmt
-
-# Check formatting
-cargo fmt -- --check
-
-# Run clippy lints
-cargo clippy
-
-# Run with debug logging
-RUST_LOG=debug cargo run
-
-# Run specific binary
-cargo run --bin cubestore
-
-# Watch for changes (requires cargo-watch)
-cargo watch -x check -x test
-```
-
-### JavaScript Wrapper Commands
-
-```bash
-# Build TypeScript wrapper
-npm run build
-
-# Run JavaScript tests
-npm test
-
-# Lint JavaScript code
-npm run lint
-
-# Fix linting issues
-npm run lint:fix
-```
-
-## Key Dependencies and Technologies
-
-- **DataFusion**: Apache Arrow-based query engine (using Cube's fork)
-- **Apache Arrow/Parquet**: Columnar data format and processing
-- **RocksDB**: Embedded key-value store for metadata
-- **Tokio**: Async runtime for concurrent operations
-- **sqlparser-rs**: SQL parsing (using Cube's fork)
-
-## Configuration via Dependency Injection
-
-The codebase uses a custom dependency injection system defined in `config/injection.rs`. Services are configured through the `Injector` and use `Arc` patterns for abstraction.
-
-## Testing Approach
-
-- Unit tests are colocated with source files using `#[cfg(test)]` modules
-- Integration tests are in `cubestore-sql-tests/tests/`
-- SQL compatibility tests use fixtures in `cubestore-sql-tests/src/tests.rs`
-- Benchmarks are in `benches/` directories
-
-## Important Notes
-
-- **Rust Nightly**: Uses nightly-2025-08-01 (see `rust-toolchain.toml`)
-- Uses custom forks of Arrow/DataFusion and sqlparser-rs for Cube-specific features
-- Distributed mode involves router and worker nodes communicating via RPC
-- Heavy use of async/await patterns with Tokio runtime
-- Parquet files are the primary storage format for data
-
-## Docker Configuration
-
-The project includes Docker configurations for building and deploying CubeStore:
-
-- **`builder.Dockerfile`**: Defines the base build image with Rust nightly-2025-08-01, LLVM 18, and build dependencies
-- **`Dockerfile`**: Production Dockerfile that uses `cubejs/rust-builder:bookworm-llvm-18` base image and copies rust-toolchain.toml
-- **GitHub Actions**: Multiple CI/CD workflows use the same Rust version
-
-## Updating Rust Version
-
-When updating the Rust version, ensure ALL these files are kept in sync:
-
-1. **`rust-toolchain.toml`** - Primary source of truth for local development
-2. **`builder.Dockerfile`** - Update the rustup default command with the new nightly version
-3. **`Dockerfile`** - Copies rust-toolchain.toml (no manual update needed if builder image is updated)
-4. **GitHub Workflows** - Update all occurrences of the Rust nightly version in `.github/workflows/` directory
-
-**Note**: The `cubejs/rust-builder:bookworm-llvm-18` Docker image tag may also need updating if the builder.Dockerfile changes significantly.
+@AGENTS.md
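Reviewer sketch, not part of the patch: each `CLAUDE.md` touched in this diff is reduced to the single line `@AGENTS.md`, which appears to act as a pointer to the sibling `AGENTS.md`. A check along the following lines could catch a stub that drifts or loses its target. `check_agent_stubs` is a hypothetical helper name, and the comparison ignores whitespace, so it is a rough guard rather than a strict validator.

```bash
# Usage: check_agent_stubs <dir>...
# For each directory, verify that AGENTS.md exists and that CLAUDE.md
# contains nothing but the "@AGENTS.md" pointer (whitespace ignored).
# Returns 1 if any directory fails the check.
check_agent_stubs() {
  local status=0 dir
  for dir in "$@"; do
    if [ ! -f "$dir/AGENTS.md" ]; then
      echo "$dir: AGENTS.md is missing"
      status=1
    elif [ ! -f "$dir/CLAUDE.md" ]; then
      echo "$dir: CLAUDE.md is missing"
      status=1
    elif [ "$(tr -d '[:space:]' < "$dir/CLAUDE.md")" != "@AGENTS.md" ]; then
      echo "$dir: CLAUDE.md is not a bare @AGENTS.md pointer"
      status=1
    fi
  done
  return $status
}
```

For the directories changed here, the call from the repository root would be `check_agent_stubs packages/cubejs-backend-shared packages/cubejs-query-orchestrator rust/cubesql rust/cubestore`.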