fix(streams): remove quadratic buffering in TextLineStream#7211

Open
tomas-zijdemans wants to merge 1 commit into
denoland:mainfrom
tomas-zijdemans:quadricTexLineStream
Why

TextLineStream prepends its entire pending buffer to every incoming
chunk and rescans the result from the start (chars = this.#currentLine + chars).
When a line spans many chunks, each fragment costs O(buffered bytes), making
the total cost O(n²) in the line length. With allowCR: true there is a
second issue: indexOf("\r") runs from position 0 on every loop iteration,
so chunks containing many lines are rescanned once per line.
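
To make the O(n²) claim concrete, here is a minimal sketch that only counts how many characters get searched for one unterminated line arriving in equal-sized fragments. Both function names are hypothetical and this is a cost model, not the library's code and not a benchmark.

```typescript
// Hypothetical helper (not part of @std/streams): total characters searched
// when the pending buffer is prepended to each fragment and rescanned.
function charsScannedByRescan(fragmentCount: number, fragmentSize: number): number {
  const fragment = "x".repeat(fragmentSize); // no "\n", so the line never ends
  let pending = "";
  let scanned = 0;
  for (let i = 0; i < fragmentCount; i++) {
    const chars = pending + fragment; // same shape as the pattern quoted above
    scanned += chars.length; // a failed indexOf("\n") walks the whole string
    if (chars.indexOf("\n") === -1) pending = chars;
  }
  return scanned;
}

// Hypothetical counterpart: fragments are kept in an array and only the newly
// arrived fragment is searched.
function charsScannedByFragments(fragmentCount: number, fragmentSize: number): number {
  const fragment = "x".repeat(fragmentSize);
  const parts: string[] = [];
  let scanned = 0;
  for (let i = 0; i < fragmentCount; i++) {
    scanned += fragment.length;
    if (fragment.indexOf("\n") === -1) parts.push(fragment);
  }
  return scanned;
}

console.log(charsScannedByRescan(4, 10)); // 10 + 20 + 30 + 40 = 100
console.log(charsScannedByFragments(4, 10)); // 10 + 10 + 10 + 10 = 40
```

With k fragments of size s, the first count is s·k(k+1)/2 and the second is s·k, which is where the quadratic versus linear difference comes from.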

Real-world impact

Lines shorter than one chunk (~64 KiB for files, ~8–16 KiB for fetch) are
unaffected. But multi-MB single-line records are common (JSONL/NDJSON with
embedded documents or base64 payloads, SSE-style streams), and they hit the
quadratic path directly. Measured on Deno 2.9.1 (Apple Silicon):

| scenario | before | after |
| --- | --- | --- |
| 8 MiB line, 1460 B chunks | 2,262 ms | 6.3 ms |
| 8 MiB line, 64 KiB chunks | 47 ms | 1 ms |
| 64 MiB JSONL file (8 × 8 MiB records), Deno.open → TextDecoderStream → TextLineStream | 353 ms | 33 ms |
| 16 MiB NDJSON over fetch (4 × 4 MiB lines, localhost) | 230 ms | 17 ms |
| common case (100k × 100 B lines) | within noise (0.9–1.1x) | |

There is also a robustness angle: input without newlines forces quadratic
CPU on the consumer, so a stream that never sends a terminator can block the
event loop for seconds.

How

Buffer incoming fragments in an array and join them only when a chunk
arrives containing a line terminator that completes the line; scan with
position-based indexOf(x, start) instead of repeated slicing. This is the
same approach that DelimiterStream in this package already uses. Output is byte-for-byte
identical to the previous implementation (verified with a 3,000-case
differential fuzz across both allowCR modes, random fragmentation
including splits inside \r\n). No API changes.
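
The buffering strategy described above can be sketched as follows. This is a simplified illustration for the default mode only ("\n" and "\r\n" terminators), not the code in this PR; the class name FragmentLineSplitter and its push/flush methods are hypothetical stand-ins for the stream's transform and flush callbacks.

```typescript
// Hypothetical sketch: buffers fragments of the unfinished line in an array
// and joins them only when a "\n" arrives that completes the line.
class FragmentLineSplitter {
  private parts: string[] = [];

  // Feed one chunk; returns the lines completed by it.
  push(chunk: string): string[] {
    const lines: string[] = [];
    let start = 0;
    while (true) {
      // Position-based search: resume where the previous line ended.
      const lf = chunk.indexOf("\n", start);
      if (lf === -1) break;
      let line = chunk.slice(start, lf);
      if (this.parts.length > 0) {
        // Only the first line of a chunk can continue buffered fragments.
        this.parts.push(line);
        line = this.parts.join("");
        this.parts = [];
      }
      // Checked after joining, so a "\r\n" split across chunks is handled.
      if (line.endsWith("\r")) line = line.slice(0, -1);
      lines.push(line);
      start = lf + 1;
    }
    // Keep the unterminated tail without rescanning it later.
    if (start < chunk.length) this.parts.push(chunk.slice(start));
    return lines;
  }

  // End of input: emit whatever is left, if anything.
  flush(): string[] {
    const rest = this.parts.join("");
    this.parts = [];
    return rest === "" ? [] : [rest];
  }
}

const splitter = new FragmentLineSplitter();
const output = splitter.push("ab")
  .concat(splitter.push("c\r"))
  .concat(splitter.push("\nde\nf"))
  .concat(splitter.flush());
console.log(output); // ["abc", "de", "f"]
```

Each character is searched once when its chunk arrives and copied once when its line is joined, so the work stays linear in the input size regardless of how a line is fragmented.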

Added tests: line spanning many chunks (both modes), empty chunks, and
chunk-final \r resolved by the next chunk with allowCR: true.
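
The chunk-final \r case is the subtle one: with allowCR: true, a "\r" at the very end of a chunk cannot be classified until the next non-empty chunk shows whether a "\n" follows. The sketch below illustrates that rule with a hypothetical pure function, splitLinesAllowCR, written for this example; it is not the PR's implementation, and the pendingCR flag is an assumption about one way to track the undecided terminator.

```typescript
// Hypothetical pure function (not the PR's code): splits pre-chunked text into
// lines, treating "\n", "\r\n" and a lone "\r" as terminators.
function splitLinesAllowCR(chunks: string[]): string[] {
  const lines: string[] = [];
  let parts: string[] = []; // fragments of the unfinished line
  let pendingCR = false; // previous chunk ended in "\r"; "\n" may still follow

  const emit = (segment: string) => {
    parts.push(segment);
    lines.push(parts.join(""));
    parts = [];
  };

  for (const chunk of chunks) {
    if (chunk.length === 0) continue; // empty chunks decide nothing
    let start = 0;
    if (pendingCR) {
      emit(""); // the "\r" ended a line either way
      if (chunk[0] === "\n") start = 1; // it was "\r\n": skip the "\n"
      pendingCR = false;
    }
    let lf = chunk.indexOf("\n", start);
    let cr = chunk.indexOf("\r", start);
    while (true) {
      if (cr !== -1 && (lf === -1 || cr < lf)) {
        if (cr === chunk.length - 1) {
          // Chunk-final "\r": buffer the text and defer the decision.
          parts.push(chunk.slice(start, cr));
          pendingCR = true;
          start = chunk.length;
          break;
        }
        emit(chunk.slice(start, cr));
        start = chunk[cr + 1] === "\n" ? cr + 2 : cr + 1;
      } else if (lf !== -1) {
        emit(chunk.slice(start, lf));
        start = lf + 1;
      } else {
        break;
      }
      // Re-search only the index that was consumed, never from position 0.
      if (lf !== -1 && lf < start) lf = chunk.indexOf("\n", start);
      if (cr !== -1 && cr < start) cr = chunk.indexOf("\r", start);
    }
    if (start < chunk.length) parts.push(chunk.slice(start));
  }
  const rest = parts.join("");
  if (pendingCR || rest !== "") lines.push(rest);
  return lines;
}

console.log(splitLinesAllowCR(["a\r", "\nb\rc", "\r"])); // ["a", "b", "c"]
```

In the usage line, the "\r" ending the first chunk is resolved as part of "\r\n" by the second chunk, and the final "\r" is resolved at end of input.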

I used Cursor (Claude) to help investigate and write this change.

@codecov

codecov Bot commented Jul 4, 2026

Codecov Report

❌ Patch coverage is 97.77778% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 94.84%. Comparing base (3b390d0) to head (a8e06f1).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| streams/text_line_stream.ts | 97.77% | 0 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #7211   +/-   ##
=======================================
  Coverage   94.84%   94.84%           
=======================================
  Files         617      618    +1     
  Lines       51674    51706   +32     
  Branches     9350     9367   +17     
=======================================
+ Hits        49008    49039   +31     
  Misses       2121     2121           
- Partials      545      546    +1     

