Clearer parse error for identifiers with a '-' in the middle (#7742) #7744
base: master
Changes from 2 commits
aa23687
0001e2f
1853b6b
f70f713
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,12 @@ | ||
| ### Changed | ||
|
|
||
| - The UPLC/PLC/PIR textual parser now rejects unquoted identifiers that | ||
| contain a `-` anywhere other than as the terminal numeric unique-suffix | ||
| separator (e.g. `pubKeyHash-305478r71`, `foo-bar`, `foo-123-456`) with | ||
| a dedicated `InvalidIdentifier` diagnostic that points directly at the | ||
| offending name and shows the full bad text. Previously the same inputs | ||
| silently mis-parsed — the prefix was taken as a name plus unique-suffix | ||
| and the remainder was picked up as an adjacent term — which surfaced as | ||
| a confusing "unexpected '(' expecting ')'" message far from the real | ||
| site (see #7742). To use such a string as a name verbatim, wrap it in | ||
| backticks: `` `pubKeyHash-305478r71` ``. | ||
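The rule described in this changelog entry can be illustrated with a small standalone sketch. This is not the PR's parser code: the `IdentCheck` type and `classify` function are hypothetical names, and the sketch assumes the input is a single unquoted token, ignoring which characters a name may otherwise contain.

```haskell
import Data.Char (isDigit)

-- | Hypothetical result of checking one unquoted identifier token.
data IdentCheck
  = PlainName String              -- no '-' at all
  | NameWithUnique String Integer -- name, '-', then digits only
  | Invalid String                -- any other use of '-'; carries the full text
  deriving (Eq, Show)

-- | Classify a token by where its first '-' falls.
-- Assumption: everything after the first '-' must be a non-empty run of
-- digits for the token to count as name plus unique-suffix.
classify :: String -> IdentCheck
classify txt =
  case break (== '-') txt of
    (_, "") -> PlainName txt
    (name, _ : suffix)
      | not (null name), not (null suffix), all isDigit suffix ->
          NameWithUnique name (read suffix)
      | otherwise -> Invalid txt
```

Under these assumptions, `classify "foo-123"` yields a name with unique `123`, while `foo-bar`, `foo-123-456`, and `pubKeyHash-305478r71` are all classified as `Invalid` with the full offending text preserved.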
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -52,6 +52,12 @@ data ParserError | |
| = BuiltinTypeNotAStar !T.Text !SourcePos | ||
| | UnknownBuiltinFunction !T.Text !SourcePos ![T.Text] | ||
| | InvalidBuiltinConstant !T.Text !T.Text !SourcePos | ||
| | {-| An unquoted identifier that violates the grammar: a '-' appeared | ||
| anywhere other than as the separator of a terminal numeric unique-suffix | ||
| (e.g. @pubKeyHash-305478r71@, @foo-bar@, @foo-123-456@). The 'Text' | ||
| carries the full offending text as it appeared in the source, so the | ||
| user sees their own name back in the diagnostic. -} | ||
| InvalidIdentifier !T.Text !SourcePos | ||
|
Member
What's the current error thrown if there's an invalid character (like
Contributor
Author
This is the crux: there's an asymmetry in how identifier-related errors are handled.
The new error only fires for case 2, but the constructor name That said I see two possible ways forward:
I chose the local fix: it's an improvement that doesn't stop us from reworking the whole identifier parser later if we want (a much bigger scope).
Contributor
Author
In the end I have re-implemented the middle-ground solution:
Member
Do you mean this is how they are handled currently on master, or how you are handling it in your PR?
I don't know what this means. "Touched" in what sense? And to step back a bit - why do we need to allow parsing both names-with-unique and names-without-unique? In any given textual UPLC, don't all names either have uniques or don't? Btw, as a general rule, avoid posting Claude answers verbatim. It comes across to me as not being respectful of other people's time. They are often quite verbose and of questionable quality - inaccuracy is common. You are of course free to use Claude, but make sure to quality-check, shorten and rephrase, make it suitable for human conversation before asking others to read it. This applies to PR descriptions, code reviews, and other communication.
Contributor
Author
Yes, I meant master.
By "touched" I meant "changed" the suffix (numeric extension) parser.
I don't know, this decision was made before my time.
I agree that most likely it's either all with suffixes or all without; it depends on the machinery that produced the UPLC.
So far, when producing public texts (GH descriptions, comments, etc.), I have tried to write them the way I would want to receive them (the golden rule: treat others the way you want to be treated). I didn't mean to disrespect anyone. Since you criticized the original Claude-produced PR description, I simplified it significantly, and now it's in the shape in which I'd want others to create PRs for my review. I think it also serves your request to provide the Before and After examples. Other than the original PR description, I typed all comments myself without asking an LLM to write them. Anyway, I got your point:
|
||
| deriving stock (Eq, Ord, Generic) | ||
| deriving anyclass (NFData) | ||
|
|
||
|
|
@@ -192,6 +198,18 @@ instance Pretty ParserError where | |
| <+> squotes (pretty s) | ||
| <+> "at" | ||
| <+> pretty loc | ||
| pretty (InvalidIdentifier txt loc) = | ||
| "Invalid identifier" | ||
| <+> squotes (pretty txt) | ||
| <+> "at" | ||
| <+> pretty loc | ||
| <> "." | ||
| <> hardline | ||
| <> "A '-' inside a name is the numeric unique-suffix separator and must be" | ||
| <+> "followed only by digits and a word boundary." | ||
| <> hardline | ||
| <> "To use this text as a name verbatim, quote it with backticks:" | ||
| <+> pretty ("`" <> txt <> "`") | ||
|
|
||
| instance ShowErrorComponent ParserError where | ||
| showErrorComponent = show . pretty | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| test:1:21: | ||
| | | ||
| 1 | (program 1.1.0 (lam foo-123-456 foo-123-456)) | ||
| | ^ | ||
| Invalid identifier 'foo-123-456' at test:1:21. | ||
| A '-' inside a name is the numeric unique-suffix separator and must be followed only by digits and a word boundary. | ||
| To use this text as a name verbatim, quote it with backticks: `foo-123-456` |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| test:1:21: | ||
| | | ||
| 1 | (program 1.1.0 (lam pubKeyHash-305478r71 (lam x x))) | ||
| | ^ | ||
| Invalid identifier 'pubKeyHash-305478r71' at test:1:21. | ||
| A '-' inside a name is the numeric unique-suffix separator and must be followed only by digits and a word boundary. | ||
| To use this text as a name verbatim, quote it with backticks: `pubKeyHash-305478r71` |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| test:1:21: | ||
| | | ||
| 1 | (program 1.1.0 (lam foo-bar foo-bar)) | ||
| | ^ | ||
| Invalid identifier 'foo-bar' at test:1:21. | ||
| A '-' inside a name is the numeric unique-suffix separator and must be followed only by digits and a word boundary. | ||
| To use this text as a name verbatim, quote it with backticks: `foo-bar` |
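The three golden outputs above share one message shape. As a rough sketch of how that text is assembled, the following uses plain `String` concatenation instead of the pretty-printer combinators in the actual `Pretty` instance; `renderInvalidIdentifier` is a hypothetical helper, and the location is passed as already-rendered text.

```haskell
import Data.List (intercalate)

-- | Hypothetical plain-String rendering of the InvalidIdentifier message.
-- First argument: the offending identifier text; second: the rendered
-- source location (e.g. "test:1:21").
renderInvalidIdentifier :: String -> String -> String
renderInvalidIdentifier txt loc =
  intercalate
    "\n"
    [ "Invalid identifier '" ++ txt ++ "' at " ++ loc ++ "."
    , "A '-' inside a name is the numeric unique-suffix separator and must be"
        ++ " followed only by digits and a word boundary."
    , "To use this text as a name verbatim, quote it with backticks: `"
        ++ txt
        ++ "`"
    ]
```

For example, `putStrLn (renderInvalidIdentifier "foo-bar" "test:1:21")` prints the last three lines of the `foo-bar` golden file; the source excerpt and caret above them come from the parser's error reporting, not from this message.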