Add a ConstraintAnalysis pass #8853
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,224 @@ | ||
| /* | ||
| * Copyright 2026 WebAssembly Community Group participants | ||
| * | ||
| * Licensed under the Apache License, Version 2.0 (the "License"); | ||
| * you may not use this file except in compliance with the License. | ||
| * You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
|
|
||
| // | ||
| // Use mathematical constraint solving to optimize. For example: | ||
| // | ||
| // if (x == 10) { | ||
| // assert(x != 0); // redundant and can be removed. | ||
| // } | ||
| // | ||
|
|
||
| #include "cfg/cfg-traversal.h" | ||
| #include "ir/constraint.h" | ||
| #include "ir/drop.h" | ||
| #include "ir/literal-utils.h" | ||
| #include "ir/local-graph.h" | ||
| #include "ir/properties.h" | ||
| #include "pass.h" | ||
| #include "support/unique_deferring_queue.h" | ||
| #include "support/utilities.h" | ||
| #include "wasm-builder.h" | ||
| #include "wasm.h" | ||
|
|
||
| namespace wasm { | ||
|
|
||
| using namespace wasm::constraint; | ||
|
|
||
| namespace { | ||
|
|
||
| // In each basic block we will store the relevant operations, which are all | ||
| // local gets and sets, branches, and uses of them. | ||
| struct Info { | ||
| std::vector<Expression**> actions; | ||
|
|
||
| // For each local index, we track the constraints we know about it. We only do | ||
| // so at the end of each block, which is enough for the analysis below. | ||
| LocalConstraintMap endConstraints; | ||
|
Member
Why not keep the beginning constraints instead? Then when we get to the end of a block, we can merge the current constraints (and eventually any additional constraints due to the specific control flow edge) into the beginning constraints of each successor block and only process that successor block again if its starting constraints are different. In contrast, the current approach may reprocess successor blocks even if they don't learn anything new from the single predecessor that was updated. Storing the beginning constraints would also let us avoid re-merging all the predecessors for each block in the optimization phase. This would also help prove convergence. If we can show that the merge operation on constraint sets is monotonic on some partial order on constraint sets and converges after a bounded number of steps, then we will know the analysis will converge. In contrast, it's hard to say anything about how the end constraints will change over time because they are the result of non-monotonic meet operations over the course of the block.
Member
Author
Interesting, yeah, storing the start might have benefits. It does mean adding a bottom element though, so we can merge incrementally like that. However, about the very last point: storing the beginning or the end is NFC, so I don't see how it helps with convergence?
Member
Author
Instead of a bottom element it might be cleaner to use However - thinking more on this, I'm not sure it's right. We can't simply keep merging in content as it flows around. The input to a block is, effectively, So I think it is best to do this as it is written: merge the inputs in a loop, seeing them all at once.
Member
Sure,
I'm not sure what you mean here.
Member
Author
If we receive data from blocks X, Y, and Z, then we only have a valid state after seeing all of X, Y, and Z. That is, if our state starts at some null/bottom, and X arrives, we cannot flow X onward. Concretely, if X has We only find the valid state of inputs to the block after merging X, Y, and Z. Doing so at once is the simplest way to get the valid state.
Member
When an analysis uses a proper join-semilattice, it's fine to start out with incomplete information because we know that eventually we will reach a sound fixed point where we know everything we need to know. If you're right that it's not sound to start with incomplete information for one block, then it seems that the worklist algorithm for flowing information around would not be sound, since it necessarily starts with incomplete information at loop heads. But I think it should be fine to start with incomplete information. We don't have a join-semilattice because the OR operation does not go to a unique least upper bound, but it is at least monotonic. We know that the analysis will only ever learn more possibilities (i.e. loosen or drop constraints) as information flows around. So if we start out knowing X has
Member
Author
Let me put it this way: if we see this as an abstract interpretation situation, then we must calculate the entire transfer function at each node. And the entire transfer function is Or, here is the actual problem: again, if one predecessor X supplies Now, we could define the transfer function so that That is, what does
Member
Author
Ok, after offline discussion, I rewrote this to save the start of blocks, and now it handles loops as well, see last commits. This uncovered a bug in OR, which did not handle the empty set properly, also fixed. |
||
| }; | ||
|
|
||
| struct ConstraintAnalysis | ||
| : public WalkerPass< | ||
| CFGWalker<ConstraintAnalysis, Visitor<ConstraintAnalysis>, Info>> { | ||
| bool isFunctionParallel() override { return true; } | ||
|
|
||
| // Locals are not modified here. | ||
| bool requiresNonNullableLocalFixups() override { return false; } | ||
|
|
||
| std::unique_ptr<Pass> create() override { | ||
| return std::make_unique<ConstraintAnalysis>(); | ||
| } | ||
|
|
||
| // Branches outside of the function can be ignored, as we only look at local | ||
| // state in the function. | ||
| bool ignoreBranchesOutsideOfFunc = true; | ||
|
|
||
| // Store the actions we care about. | ||
| void addAction() { | ||
| if (currBasicBlock) { | ||
| currBasicBlock->contents.actions.push_back(getCurrentPointer()); | ||
| } | ||
| } | ||
|
|
||
| void visitLocalSet(LocalSet* curr) { addAction(); } | ||
| void visitUnary(Unary* curr) { addAction(); } | ||
| void visitBinary(Binary* curr) { addAction(); } | ||
| void visitRefEq(RefEq* curr) { addAction(); } | ||
| void visitRefIsNull(RefIsNull* curr) { addAction(); } | ||
|
|
||
| void visitFunction(Function* curr) { | ||
| // TODO: optimize for speed, find relevant locals etc. | ||
| flow(); | ||
| optimize(); | ||
| } | ||
|
|
||
| // Flow infos around until we have inferred all we can about the constraints | ||
| // in each location. | ||
| void flow() { | ||
| // Start from all the blocks, and keep going while we find something new. | ||
| UniqueDeferredQueue<BasicBlock*> work; | ||
| for (auto& block : basicBlocks) { | ||
| work.push(block.get()); | ||
| } | ||
| while (!work.empty()) { | ||
| auto* block = work.pop(); | ||
|
|
||
| // Merge incoming data to get the status at the start of the block. | ||
| LocalConstraintMap constraints = mergeIncoming(block); | ||
|
|
||
| // Go through the block, applying things. | ||
| for (auto** currp : block->contents.actions) { | ||
| applyToConstraints(*currp, constraints); | ||
| } | ||
|
|
||
| // We now know the values at the end of the block. If something changed, | ||
| // flow it onward. | ||
| if (constraints != block->contents.endConstraints) { | ||
| block->contents.endConstraints = std::move(constraints); | ||
|
Member
Can we prove this will actually converge? Can we create a pathological case where the analysis alternates between two different constraint sets forever?
Member
Author
It must converge now because we simply drop extra things in approximateAnd. If we did something more complex, we'd need to be careful and define a total order, I think.
Member
I'm worried that a sequence of ORs (control flow merges) and ANDs (from the contents of blocks) could change the order of constraints so that the one that is dropped is not stable.
Member
Author
Atm we don't drop, but just refrain from adding. So once a set saturates, it freezes, basically.
Member
Author
(Which is monotonic.) |
||
| for (auto* out : block->out) { | ||
| work.push(out); | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
| // After inferring all we can, apply it to optimize the code. | ||
| void optimize() { | ||
| for (auto& block : basicBlocks) { | ||
| // Follow the general shape of flow(): we need to see what the state is | ||
| // at each intermediate point inside the block. (Flowing between blocks is | ||
| // of course not needed at this stage.) | ||
| LocalConstraintMap constraints = mergeIncoming(block.get()); | ||
| for (auto** currp : block->contents.actions) { | ||
| applyToConstraints(*currp, constraints); | ||
| optimizeExpression(currp, constraints); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| // Given an expression and the constraints on it, optimize it. | ||
| void optimizeExpression(Expression** currp, | ||
| const LocalConstraintMap& constraints) { | ||
| auto* curr = *currp; | ||
| auto parsed = LocalConstraint::parse(curr); | ||
| if (!parsed) { | ||
| return; | ||
| } | ||
|
|
||
| auto iter = constraints.find(parsed->local); | ||
| if (iter == constraints.end()) { | ||
| return; | ||
| } | ||
| auto& localConstraints = iter->second; | ||
| Result result = localConstraints.proves(parsed->constraint); | ||
| if (result == Unknown) { | ||
| // If we parsed something using two locals, like x != y, we can also look | ||
| // for the flipped condition among y's constraints. (TODO) | ||
|
tlively marked this conversation as resolved.
|
||
| return; | ||
| } | ||
|
|
||
| // We know the result! | ||
| auto& wasm = *getModule(); | ||
| auto value = | ||
| LiteralUtils::makeFromInt32(result == True ? 1 : 0, curr->type, wasm); | ||
| *currp = getDroppedChildrenAndAppend( | ||
| curr, wasm, getPassOptions(), value, DropMode::IgnoreParentEffects); | ||
| } | ||
|
|
||
| // Merge incoming data to a block, by looking at the data arriving from each | ||
| // of the predecessor blocks. | ||
| LocalConstraintMap mergeIncoming(BasicBlock* block) { | ||
| LocalConstraintMap constraints; | ||
|
|
||
| // Merge all preds. | ||
| for (auto* pred : block->in) { | ||
| auto& predConstraints = getConstraintsFromPredToSucc(pred, block); | ||
| if (pred == *block->in.begin()) { | ||
| // This is the first. Just copy. | ||
| constraints = predConstraints; | ||
| } else { | ||
| // Merge in subsequent ones. | ||
| constraints.approximateOr(predConstraints); | ||
| } | ||
|
Member
If we had a
Member
Author
True, yeah. Maybe worth adding, though this might be the only place it helps? |
||
| } | ||
|
|
||
| // The entry block has incoming values - defaults - for each var. | ||
| if (block == entry) { | ||
| auto* func = getFunction(); | ||
| auto numLocals = func->getNumLocals(); | ||
| for (Index i = 0; i < numLocals; i++) { | ||
| if (!func->isVar(i)) { | ||
| continue; | ||
| } | ||
| auto type = func->getLocalType(i); | ||
| // TODO: support tuples | ||
| if (type.size() == 1 && LiteralUtils::canMakeZero(type)) { | ||
| auto value = Literal::makeZero(type); | ||
| constraints[i].approximateAnd(Constraint{Abstract::Eq, {value}}); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| return constraints; | ||
| } | ||
|
|
||
| // Given a source (predecessor) and a target (successor) block, find the | ||
| // constraints for locals as they arrive at the target from the predecessor. | ||
| const LocalConstraintMap& getConstraintsFromPredToSucc(BasicBlock* pred, | ||
| BasicBlock* block) { | ||
| // TODO: use conditional branching to send different values along branches | ||
| return pred->contents.endConstraints; | ||
| } | ||
|
|
||
| // Given an expression, apply it to the constraints. For example, a local.set | ||
| // sets the value for that local. | ||
| void applyToConstraints(Expression* curr, LocalConstraintMap& constraints) { | ||
| if (auto* set = curr->dynCast<LocalSet>()) { | ||
| auto& localConstraints = constraints[set->index]; | ||
| localConstraints.clear(); | ||
| if (Properties::isSingleConstantExpression(set->value)) { | ||
| auto value = Properties::getLiteral(set->value); | ||
| localConstraints.approximateAnd(Constraint{Abstract::Eq, {value}}); | ||
|
Member
Same comment. This should logically be an OR (or an assignment), and the current code is only correct because the default value is top rather than bottom.
Member
Author
Yes, this was the same issue. Fixed as above. |
||
| } | ||
| } | ||
| } | ||
| }; | ||
|
|
||
| } // anonymous namespace | ||
|
|
||
| Pass* createConstraintAnalysisPass() { return new ConstraintAnalysis(); } | ||
|
|
||
| } // namespace wasm | ||
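
For readers following the review threads, the scheme the discussion settles on (store each block's starting state, merge a predecessor's end state into each successor, and requeue a successor only when its start actually changed) can be sketched in isolation. This is a toy sketch under stated assumptions, not the code in this PR: `State`, `Block`, `mergeInto`, `transfer`, `flow`, `kMaxValues`, `zeroState`, `runDiamond` and `runLoop` are all hypothetical names, and the domain is a capped set of possible integer values for one local rather than `LocalConstraintMap`. The "refrain from adding once saturated" idea from the convergence thread is modeled here as collapsing to a top state once the set would exceed `kMaxValues`.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <optional>
#include <set>
#include <vector>

// Toy domain (an assumption, not Binaryen's LocalConstraintMap): the set of
// values one local may hold at a program point.
constexpr std::size_t kMaxValues = 4;

struct State {
  bool reached = false; // false: no predecessor has delivered data yet
  bool top = false;     // true: saturated, the local may hold anything
  std::set<int> values; // meaningful only if reached && !top
};

// Merge `incoming` into `target`; returns true only if `target` changed.
// The state only grows and saturates to top past kMaxValues, so a block's
// start state can change at most a bounded number of times.
bool mergeInto(State& target, const State& incoming) {
  if (!incoming.reached || target.top) {
    return false;
  }
  if (!target.reached) {
    target = incoming;
    return true;
  }
  std::size_t before = target.values.size();
  target.values.insert(incoming.values.begin(), incoming.values.end());
  if (incoming.top || target.values.size() > kMaxValues) {
    target.top = true;
    target.values.clear();
    return true;
  }
  return target.values.size() != before;
}

struct Block {
  std::optional<int> set;         // like a local.set of a constant
  int add = 0;                    // like x = x + add, used if `set` is empty
  std::vector<std::size_t> succs; // successor block indexes
  State start;                    // state on entry to the block
};

// Compute the state at the end of a block from its stored start state.
State transfer(const Block& block) {
  State end = block.start;
  if (block.set) {
    end.top = false;
    end.values = {*block.set};
  } else if (!end.top && block.add != 0) {
    std::set<int> shifted;
    for (int value : end.values) {
      shifted.insert(value + block.add);
    }
    end.values.swap(shifted);
  }
  return end;
}

void flow(std::vector<Block>& blocks, std::size_t entry, const State& init) {
  assert(entry < blocks.size());
  std::deque<std::size_t> work;
  if (mergeInto(blocks[entry].start, init)) {
    work.push_back(entry);
  }
  while (!work.empty()) {
    std::size_t index = work.front();
    work.pop_front();
    State end = transfer(blocks[index]);
    for (std::size_t succ : blocks[index].succs) {
      // Requeue a successor only if it learned something new.
      if (mergeInto(blocks[succ].start, end)) {
        work.push_back(succ);
      }
    }
  }
}

State zeroState() { // the local starts at zero on function entry
  State state;
  state.reached = true;
  state.values = {0};
  return state;
}

// Diamond: 0 -> {1, 2} -> 3, with x = 5 on one arm and x = 7 on the other.
State runDiamond() {
  std::vector<Block> blocks(4);
  blocks[0].succs = {1, 2};
  blocks[1].set = 5;
  blocks[1].succs = {3};
  blocks[2].set = 7;
  blocks[2].succs = {3};
  flow(blocks, 0, zeroState());
  return blocks[3].start;
}

// Loop: x = 0, then a block doing x = x + 1 that branches back to itself.
State runLoop() {
  std::vector<Block> blocks(3);
  blocks[0].set = 0;
  blocks[0].succs = {1};
  blocks[1].add = 1;
  blocks[1].succs = {1, 2};
  flow(blocks, 0, zeroState());
  return blocks[2].start;
}
```

In this sketch, `runDiamond()` returns a start state for the join block that holds both 5 and 7, while `runLoop()` saturates to top after a few trips around the back edge instead of iterating forever. The unreached flag plays the role of the bottom element discussed in the first thread: a block is never processed before at least one predecessor has delivered data, and later predecessors can only widen its start state.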