fix(streams): remove quadratic buffering in TextLineStream #7211
Open
tomas-zijdemans wants to merge 1 commit into
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #7211 +/- ##
=======================================
Coverage 94.84% 94.84%
=======================================
Files 617 618 +1
Lines 51674 51706 +32
Branches 9350 9367 +17
=======================================
+ Hits 49008 49039 +31
Misses 2121 2121
- Partials 545 546 +1
Why
TextLineStream concatenates its entire pending buffer into every incoming
chunk and rescans it from the start (chars = this.#currentLine + chars).
When a line spans many chunks, each fragment costs O(buffered bytes), making
the total cost O(n²) in the line length. With allowCR: true there is a
second issue: indexOf("\r") runs from position 0 on every loop iteration,
so chunks containing many lines are rescanned once per line.
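To make the quadratic term concrete, here is a minimal cost model of the rescanning described above. It is a sketch, not the std code: the function names scannedByConcat and scannedByFragments are hypothetical, and the model counts only characters examined while searching for a terminator in a single unterminated line that arrives as equal-sized chunks.

```typescript
// Hypothetical cost model (not part of @std/streams): counts how many
// characters are examined while searching for a line terminator when one
// unterminated line arrives as `chunkCount` chunks of `chunkSize` chars.

// Old strategy: prepend the whole pending buffer, then scan from index 0.
function scannedByConcat(chunkSize: number, chunkCount: number): number {
  let buffered = 0; // length of the pending line so far
  let scanned = 0; // total characters examined
  for (let i = 0; i < chunkCount; i++) {
    buffered += chunkSize; // chars = currentLine + chars
    scanned += buffered; // the full concatenation is rescanned
  }
  return scanned; // chunkSize * chunkCount * (chunkCount + 1) / 2
}

// Fragment strategy: scan only the new chunk, keep earlier pieces aside.
function scannedByFragments(chunkSize: number, chunkCount: number): number {
  return chunkSize * chunkCount; // every character is examined once
}

// A 1 MiB line delivered as sixteen 64 KiB chunks.
console.log(scannedByConcat(65536, 16), scannedByFragments(65536, 16));
```

Under this model the concatenating strategy examines (k + 1) / 2 times as many characters as the fragment strategy for k chunks, so the gap widens linearly as the line grows.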
Real-world impact
Lines shorter than one chunk (~64 KiB for files, ~8–16 KiB for fetch) are
unaffected. But multi-MB single-line records are common (JSONL/NDJSON with
embedded documents or base64 payloads, SSE-style streams), and they hit the
quadratic path directly. Measured on Deno 2.9.1 (Apple Silicon):
Deno.open → TextDecoderStream → TextLineStream

There is also a robustness angle: input without newlines forces quadratic
CPU on the consumer, so a stream that never sends a terminator can block the
event loop for seconds.
How
Buffer incoming fragments in an array and join only when a chunk arrives in
which a line terminator can complete; scan with position-based
indexOf(x, start) instead of repeated slicing. This is the same approach
that DelimiterStream in this package already uses. Output is byte-for-byte
identical to the previous implementation (verified with a 3,000-case
differential fuzz across both allowCR modes, random fragmentation
including splits inside \r\n). No API changes.

Added tests: line spanning many chunks (both modes), empty chunks, and
chunk-final \r resolved by the next chunk with allowCR: true.

I used Cursor (Claude) to help investigate and write this change.
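
The buffering strategy described under How can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: LineSplitter is a hypothetical synchronous helper rather than a TransformStream, and it handles only \n and \r\n terminators (the allowCR: true case is left out to keep the sketch short).

```typescript
// Hypothetical helper illustrating fragment buffering; not the std code.
class LineSplitter {
  // Pieces of the current unterminated line, joined only when it ends.
  private pending: string[] = [];

  // Feed one decoded chunk; returns every line completed by this chunk.
  push(chunk: string): string[] {
    const lines: string[] = [];
    let start = 0;
    while (true) {
      // Position-based search: resume where the previous line ended
      // instead of slicing the chunk and scanning from index 0 again.
      const nl = chunk.indexOf("\n", start);
      if (nl === -1) break;
      let line = chunk.slice(start, nl);
      if (this.pending.length > 0) {
        // Join buffered fragments once, at the moment the line completes.
        this.pending.push(line);
        line = this.pending.join("");
        this.pending = [];
      }
      // Treat \r\n as one terminator, even when split across chunks.
      if (line.charAt(line.length - 1) === "\r") line = line.slice(0, -1);
      lines.push(line);
      start = nl + 1;
    }
    // Keep the unterminated tail as a fragment; no concatenation here.
    if (start < chunk.length) this.pending.push(chunk.slice(start));
    return lines;
  }

  // Call at end of input; returns the final unterminated line, if any.
  flush(): string[] {
    if (this.pending.length === 0) return [];
    const rest = this.pending.join("");
    this.pending = [];
    return [rest];
  }
}

// Usage: "abc\r\n" is split across three chunks, including inside \r\n.
const splitter = new LineSplitter();
const demo = splitter
  .push("ab")
  .concat(splitter.push("c\r"), splitter.push("\nde\n"), splitter.flush());
console.log(demo); // logs the two lines "abc" and "de"
```

Each chunk is scanned once and each buffered fragment is copied once by the final join, so the work per line stays linear in its length regardless of how many chunks it spans.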