Reject NaN/inf numeric results instead of returning them as a valid answer #1894

Open: aaravanmay wants to merge 1 commit into sinaptik-ai:main from aaravanmay:reject-nan-number

@aaravanmay

ResponseParser._validate_response accepts a number result via isinstance(result["value"], (int, float, np.int64)). But NaN and inf are floats, so they pass that check, get wrapped in a NumberResponse, and are returned as the answer — with no error.

This is the silent-wrong case: when generated code aggregates over an empty result — e.g. df["sales"].mean() after a filter that matched zero rows — pandas returns nan. The user (and any downstream charting code) then receives nan as a confident numeric answer with no indication anything went wrong.
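For illustration, a minimal repro sketch of the empty-aggregation case described above. The DataFrame, its column names, and the "north" filter value are hypothetical; only the isinstance tuple and np.isfinite come from the description.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the column names and values are illustrative only.
df = pd.DataFrame({"region": ["east", "west"], "sales": [100, 250]})

# A filter that matches zero rows leaves an empty Series to aggregate,
# and pandas reports the mean of that empty Series as NaN.
empty_mean = df[df["region"] == "north"]["sales"].mean()

# The type check quoted in the description, applied on its own.
passes_type_check = isinstance(empty_mean, (int, float, np.int64))

# The extra finiteness test that the fix adds.
is_finite = bool(np.isfinite(empty_mean))

print(empty_mean, passes_type_check, is_finite)
```

The value passes the type check but fails the finiteness test, which is the gap this PR closes.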

The fix adds a finite check after the existing numeric isinstance check:

if isinstance(result["value"], float) and not np.isfinite(result["value"]):
    raise InvalidOutputValueMismatch(
        "Invalid output: Numeric result is NaN or infinite (likely an aggregation over empty data)."
    )

np is already imported; np.isfinite rejects NaN as well as positive and negative inf, and np.float64 subclasses float, so it is covered too. Booleans and ints are unaffected because the new check applies only to float instances.
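The claims above can be checked in isolation. This is a small sketch, not code from the PR: the helper name is_rejected and the sample values are made up for illustration, and the condition inside it mirrors the one quoted in the fix.

```python
import numpy as np

# np.float64 inherits from Python's float, so isinstance(x, float) matches it.
float64_is_float = issubclass(np.float64, float)


def is_rejected(value):
    """Hypothetical helper mirroring the new condition: a float that is not finite."""
    return isinstance(value, float) and not np.isfinite(value)


# Illustrative sample values, keyed by a readable label.
samples = {
    "nan": float("nan"),
    "inf": float("inf"),
    "-inf": float("-inf"),
    "np.float64 nan": np.float64("nan"),
    "finite float": 3.5,
    "int": 7,
    "bool": True,
}

rejected = {name: bool(is_rejected(value)) for name, value in samples.items()}

for name, flag in rejected.items():
    print(f"{name}: {'rejected' if flag else 'accepted'}")
```

Ints and booleans short-circuit on the isinstance(value, float) test, so np.isfinite is never called for them.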

Includes a regression test (no LLM call) asserting NaN and inf both raise InvalidOutputValueMismatch while a normal number still parses.
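A self-contained sketch of what such a regression test could look like. It does not import pandas-ai: the exception class and validate_number below are local stand-ins (assumptions) for InvalidOutputValueMismatch and the 'number' branch of _validate_response, and the first error message is a placeholder. The test in the PR itself may be structured differently.

```python
import numpy as np


class InvalidOutputValueMismatch(Exception):
    """Local stand-in for the exception named in the PR (not the library's class)."""


def validate_number(result):
    """Hypothetical stand-in for the 'number' branch of _validate_response."""
    value = result["value"]
    # Existing type check, as quoted in the description.
    if not isinstance(value, (int, float, np.int64)):
        # Placeholder message; the library's actual wording is not shown here.
        raise InvalidOutputValueMismatch("Invalid output: expected a numeric value.")
    # New finiteness check from the fix.
    if isinstance(value, float) and not np.isfinite(value):
        raise InvalidOutputValueMismatch(
            "Invalid output: Numeric result is NaN or infinite "
            "(likely an aggregation over empty data)."
        )
    return True


def raises_mismatch(value):
    """Return True if validating the value raises the stand-in exception."""
    try:
        validate_number({"type": "number", "value": value})
    except InvalidOutputValueMismatch:
        return True
    return False


def test_non_finite_numbers_are_rejected():
    for bad in (float("nan"), float("inf"), float("-inf"), np.float64("nan")):
        assert raises_mismatch(bad)


def test_finite_numbers_still_parse():
    for good in (42, 3.14, np.int64(7), np.float64(2.5)):
        assert not raises_mismatch(good)


# Run directly so the sketch works without a test runner.
test_non_finite_numbers_are_rejected()
test_finite_numbers_still_parse()
```

Because nothing here calls a model, the checks are deterministic and cheap enough to run on every commit.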

Found while building a fault-injection tester for agent tools.

_validate_response checks isinstance(value, (int, float, np.int64)) for type
'number', but NaN and inf are floats and pass that check, so they are wrapped in
a NumberResponse and returned as the answer. These almost always come from an
aggregation over an empty result (e.g. df['sales'].mean() when a filter matched
zero rows) - a silent wrong answer.

This adds a finite-number check that raises InvalidOutputValueMismatch for NaN/inf.
Adds a regression test (no LLM) that fails on main and passes with the fix.
dosubot (bot) added the size:XS label (this PR changes 0-9 lines, ignoring generated files) on Jun 8, 2026
@aaravanmay

Author

This NaN-as-a-valid-answer bug is the exact failure class I built faultline (open source) to catch: a data agent confidently returning a number that's wrong or not-a-number, with no error raised. The numeric-finite invariant that catches it is ~3 lines, and it runs deterministically in CI — no LLM judge. I've drafted a small suite for pandas-ai covering this class (NaN/inf results, silently truncated dataframes). Want me to open it as a separate PR with a non-blocking GitHub Action so you can watch what it flags for a week before deciding? If it's noisy, close it — no hard feelings.
