Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions src/utils/files/base.ts
Original file line number Diff line number Diff line change
Expand Up @@ -126,6 +126,9 @@ export interface FileMetadata {
/** For text files */
lineCount?: number;

/** Whether the read reached the end of the file */
reachedEOF?: boolean;

/** For PDF files */
isPdf?: boolean;
author?: string;
Expand Down
106 changes: 98 additions & 8 deletions src/utils/files/text.ts
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ import {
FileResult,
FileInfo
} from './base.js';
import { detectLineEnding, normalizeLineEndings, type LineEndingStyle } from '../lineEndingHandler.js';

// TODO: Centralize these constants with filesystem.ts to avoid silent drift
// These duplicate concepts from filesystem.ts and should be moved to a shared
Expand Down Expand Up @@ -55,8 +56,31 @@ export class TextFileHandler implements FileHandler {
const length = options?.length ?? 1000; // Default from config
const includeStatusMessage = options?.includeStatusMessage ?? true;

// Binary detection is done at factory level - just read as text
return this.readFileWithSmartPositioning(filePath, offset, length, 'text/plain', includeStatusMessage);
// Detect original line ending before reading.
// readline strips \r\n → \n; we need to restore the original style
// so that downstream consumers (e.g. write_file) preserve line endings.
const lineEnding = await this.detectFileLineEnding(filePath);
const endsWithNewline = await this.fileEndsWithNewline(filePath);

const result = await this.readFileWithSmartPositioning(filePath, offset, length, 'text/plain', includeStatusMessage);

// Restore original line endings if the file uses CRLF or CR
if (lineEnding !== '\n' && typeof result.content === 'string') {
result.content = normalizeLineEndings(result.content, lineEnding);
Comment on lines +68 to +69

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: This normalizes the entire returned payload to a single line-ending style, which rewrites files that contain mixed line endings. In a read→write roundtrip, mixed-ending files will be silently homogenized instead of preserved, causing unintended file diffs and content drift. Preserve per-line endings (or skip normalization when mixed endings are detected) instead of forcing one style. [logic error]

Severity Level: Major ⚠️
- ⚠️ read_file+write_file roundtrips rewrite mixed-ending text files.
- ⚠️ Version-controlled files gain noisy diffs from newline homogenization.
- ⚠️ Byte-stable guarantee in description fails for mixed endings.
Steps of Reproduction ✅
1. Create a text file on disk with mixed line endings, e.g. content `"Line one\r\nLine
two\nLine three\r\n"` saved as `mixed.txt`.

2. Call the MCP `read_file` tool via the filesystem handler `handleReadFile` in
`src/handlers/filesystem-handlers.ts:108-88`, which delegates to `readFile` in
`src/tools/filesystem.ts:95-103`, then to `readFileFromDisk` in
`src/tools/filesystem.ts:392-404`.

3. `readFileFromDisk` obtains a `TextFileHandler` from the factory
(`src/utils/files/factory.ts:38-40`) and calls `TextFileHandler.read()`
(`src/utils/files/text.ts:54-83`). Inside `read()`, `detectFileLineEnding()`
(`src/utils/files/text.ts:162-197`) returns the style of the *first* newline encountered
(e.g. `'\r\n'` if the first line uses CRLF). Then the existing code at
`src/utils/files/text.ts:68-69` runs `normalizeLineEndings(result.content, lineEnding)`,
which, per `normalizeLineEndings` in `src/utils/lineEndingHandler.ts:30-41`, rewrites
*all* line endings in `result.content` to that one style (e.g. all `\n` become `\r\n`).

4. If an AI client then performs a read→modify→write roundtrip like test 5 in
`test/test-crlf-preservation.js:103-146` but using `mixed.txt` (read via `handleReadFile`,
strip the status header by `indexOf('Line one')`, modify content, then write back via
`writeFile` in `src/tools/filesystem.ts:153-160`), the rewritten content written to disk
will now have uniform line endings (e.g. all CRLF) instead of the original mixed CRLF/LF
combination, causing silent normalization and file diffs.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** src/utils/files/text.ts
**Line:** 68:69
**Comment:**
	*Logic Error: This normalizes the entire returned payload to a single line-ending style, which rewrites files that contain mixed line endings. In a read→write roundtrip, mixed-ending files will be silently homogenized instead of preserved, causing unintended file diffs and content drift. Preserve per-line endings (or skip normalization when mixed endings are detected) instead of forcing one style.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

}

// Restore trailing newline stripped by readline.
// When we haven't reached EOF, the last line we read definitely had
// a newline after it (there's more content following in the file).
// When we have reached EOF, only restore if the original file ended with one.
if (typeof result.content === 'string' && !result.content.endsWith(lineEnding)) {
const reachedEOF = result.metadata?.reachedEOF !== false;
if (!reachedEOF || endsWithNewline) {
result.content += lineEnding;
}
}
Comment thread
coderabbitai[bot] marked this conversation as resolved.

return result;
}
Comment thread
coderabbitai[bot] marked this conversation as resolved.

async write(path: string, content: string, mode: 'rewrite' | 'append' = 'rewrite'): Promise<void> {
Expand Down Expand Up @@ -131,6 +155,67 @@ export class TextFileHandler implements FileHandler {
return undefined;
}

/**
* Detect line ending style by scanning file chunks until a newline is found.
* Reads in 8KB chunks up to 64KB to handle files whose first line exceeds 8KB.
*/
private async detectFileLineEnding(filePath: string): Promise<LineEndingStyle> {
const CHUNK_SIZE = 8192;
const MAX_SCAN = 65536; // 64KB cap
const fd = await fs.open(filePath, 'r');
try {
const buffer = Buffer.alloc(CHUNK_SIZE);
let position = 0;
let prevEndedWithCR = false;
while (position < MAX_SCAN) {
const { bytesRead } = await fd.read(buffer, 0, CHUNK_SIZE, position);
if (bytesRead === 0) {
return prevEndedWithCR ? '\r' : '\n';
}
const chunk = buffer.toString('utf-8', 0, bytesRead);
// Handle \r at previous chunk boundary followed by \n here
if (prevEndedWithCR) {
return chunk[0] === '\n' ? '\r\n' : '\r';
}
for (let i = 0; i < chunk.length; i++) {
if (chunk[i] === '\r') {
if (i + 1 < chunk.length) {
return chunk[i + 1] === '\n' ? '\r\n' : '\r';
}
// \r at end of chunk — check next chunk for \n
prevEndedWithCR = true;
break;
}
if (chunk[i] === '\n') {
return '\n';
}
}
position += bytesRead;
}
// No newline found within cap — default to LF
return '\n';
Comment thread
coderabbitai[bot] marked this conversation as resolved.
Comment on lines +164 to +196

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: The line-ending detector stops scanning after 64KB and defaults to LF if no newline is found in that window. For files whose first newline appears after that limit (for example, very long first lines), CRLF/CR files will be misdetected as LF, so returned content will still be normalized to LF and the original endings will be lost. Continue scanning until the first newline or EOF (or make the cap large/adaptive and fall back to a safer strategy) instead of forcing LF at 64KB. [logic error]

Severity Level: Major ⚠️
- ❌ readFile misidentifies CRLF style for long first lines.
- ⚠️ writeFile roundtrips collapse CRLF to LF for files.
- ⚠️ Git diffs show spurious line-ending changes after edits.
Steps of Reproduction ✅
1. Create a CRLF text file whose first newline appears after 64KB (e.g., a single >64KB
line followed by `\r\n` and more lines) inside an allowed directory (see
allowedDirectories setup pattern in `test/test-crlf-preservation.js:35-44`).

2. Call the public `readFile()` API in `src/tools/filesystem.ts:85-93` with that file path
(this routes to `readFileFromDisk()` which calls `getFileHandler()` and then
`handler.read()` at `filesystem.ts:467-477`).

3. In `src/utils/files/factory.ts:76-103`, `getFileHandler()` returns `TextFileHandler`
for this text file, and `TextFileHandler.read()` at `src/utils/files/text.ts:54-83`
invokes `detectFileLineEnding(filePath)` at `text.ts:162-196` before reading.

4. Inside `detectFileLineEnding()`, the loop scans in 8KB chunks while `position <
MAX_SCAN` (`text.ts:170-194`); because the first `\r`/`\n` occurs after 64KB, the loop
exits without finding any newline and returns `'\n'` at `text.ts:195-196`, causing
`lineEnding` to be misdetected as LF so the normalization branch at `text.ts:67-70` is
skipped and the CRLF file is read and later rewritten (via `writeFile()` at
`src/tools/filesystem.ts:143-165`) with LF-only line endings.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** src/utils/files/text.ts
**Line:** 164:196
**Comment:**
	*Logic Error: The line-ending detector stops scanning after 64KB and defaults to LF if no newline is found in that window. For files whose first newline appears after that limit (for example, very long first lines), CRLF/CR files will be misdetected as LF, so returned content will still be normalized to LF and the original endings will be lost. Continue scanning until the first newline or EOF (or make the cap large/adaptive and fall back to a safer strategy) instead of forcing LF at 64KB.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

} finally {
await fd.close();
}
}

/**
* Check whether the file ends with a newline character (\n or \r).
* Used to restore trailing newlines stripped by readline.
*/
private async fileEndsWithNewline(filePath: string): Promise<boolean> {
const stats = await fs.stat(filePath);
if (stats.size === 0) return false;
const fd = await fs.open(filePath, 'r');
try {
const buf = Buffer.alloc(1);
await fd.read(buf, 0, 1, stats.size - 1);
return buf[0] === 0x0A || buf[0] === 0x0D; // \n or \r
} finally {
await fd.close();
}
}
Comment thread
coderabbitai[bot] marked this conversation as resolved.

/**
* Generate enhanced status message
*/
Expand Down Expand Up @@ -283,7 +368,7 @@ export class TextFileHandler implements FileHandler {
? `${this.generateEnhancedStatusMessage(result.length, -n, fileTotalLines, true)}\n\n${result.join('\n')}`
: result.join('\n');

return { content, mimeType, metadata: {} };
return { content, mimeType, metadata: { reachedEOF: true } };
} finally {
await fd.close();
}
Expand Down Expand Up @@ -330,7 +415,7 @@ export class TextFileHandler implements FileHandler {
? `${this.generateEnhancedStatusMessage(result.length, -requestedLines, fileTotalLines, true)}\n\n${result.join('\n')}`
: result.join('\n');

return { content, mimeType, metadata: {} };
return { content, mimeType, metadata: { reachedEOF: true } };
}

/**
Expand All @@ -351,12 +436,15 @@ export class TextFileHandler implements FileHandler {

const result: string[] = [];
let lineNumber = 0;
let reachedEOF = true;

for await (const line of rl) {
if (lineNumber >= offset && result.length < length) {
result.push(line);
} else if (result.length >= length) {
reachedEOF = false;
break;
}
if (result.length >= length) break;
lineNumber++;
}

Expand All @@ -365,10 +453,10 @@ export class TextFileHandler implements FileHandler {
if (includeStatusMessage) {
const statusMessage = this.generateEnhancedStatusMessage(result.length, offset, fileTotalLines, false);
const content = `${statusMessage}\n\n${result.join('\n')}`;
return { content, mimeType, metadata: {} };
return { content, mimeType, metadata: { reachedEOF } };
} else {
const content = result.join('\n');
return { content, mimeType, metadata: {} };
return { content, mimeType, metadata: { reachedEOF } };
}
}

Expand Down Expand Up @@ -421,6 +509,7 @@ export class TextFileHandler implements FileHandler {

const result: string[] = [];
let firstLineSkipped = false;
let reachedEOF = true;

for await (const line of rl2) {
if (!firstLineSkipped && startPosition > 0) {
Expand All @@ -431,6 +520,7 @@ export class TextFileHandler implements FileHandler {
if (result.length < length) {
result.push(line);
} else {
reachedEOF = false;
break;
}
}
Expand All @@ -441,7 +531,7 @@ export class TextFileHandler implements FileHandler {
? `${this.generateEnhancedStatusMessage(result.length, offset, fileTotalLines, false)}\n\n${result.join('\n')}`
: result.join('\n');

return { content, mimeType, metadata: {} };
return { content, mimeType, metadata: { reachedEOF } };
} finally {
await fd.close();
}
Expand Down
Loading