@khagankhan
Contributor

Stack fixup

Currently every instruction balances itself: for example, if an op leaves a struct on the stack, it immediately calls a function that consumes that struct, as we did for our generative fuzzer. For mutation-based fuzzers, however, this may introduce bias.

This PR removes that per-instruction balancing and instead fixes up the stack at the end. It tracks abstract stack types, checks them against the types each op requires, and then fixes up the actual stack.
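To make the idea concrete, here is a minimal sketch of a fixup pass over an abstract type stack. The `Op` and `StackType` enums, the `producer` helper, and the requirement that the sequence end with an empty stack are all assumptions made for illustration; they are not the PR's actual definitions.

```rust
/// Hypothetical abstract stack types; the PR's real `StackType` has more cases.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum StackType {
    ExternRef,
    Struct,
}

/// Hypothetical, heavily reduced op set used only for this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Op {
    NullExtern, // [] -> [externref]
    StructNew,  // [] -> [struct]
    TakeStruct, // [struct] -> []
    Drop,       // [any one value] -> []
}

impl Op {
    /// The specific type this op needs on top of the stack, if any.
    fn operand(self) -> Option<StackType> {
        match self {
            Op::TakeStruct => Some(StackType::Struct),
            _ => None,
        }
    }

    /// The type this op pushes, if any.
    fn result(self) -> Option<StackType> {
        match self {
            Op::NullExtern => Some(StackType::ExternRef),
            Op::StructNew => Some(StackType::Struct),
            _ => None,
        }
    }
}

/// An op that produces a value of the requested type (hypothetical helper).
fn producer(ty: StackType) -> Op {
    match ty {
        StackType::ExternRef => Op::NullExtern,
        StackType::Struct => Op::StructNew,
    }
}

/// Rewrites `ops` into a sequence that is valid for the abstract stack.
fn fixup(ops: &[Op]) -> Vec<Op> {
    let mut stack: Vec<StackType> = Vec::new();
    let mut out = Vec::with_capacity(ops.len());
    for &op in ops {
        if op == Op::Drop {
            // `Drop` accepts any value; skip the op if there is nothing to drop.
            if stack.pop().is_none() {
                continue;
            }
        } else if let Some(want) = op.operand() {
            if stack.last() == Some(&want) {
                stack.pop();
            } else {
                // Missing or mismatched operand: emit a producer right before
                // the op. The produced value is consumed immediately, so the
                // abstract stack is unchanged.
                out.push(producer(want));
            }
        }
        out.push(op);
        if let Some(ty) = op.result() {
            stack.push(ty);
        }
    }
    // Drop whatever is left so the sequence ends with an empty stack.
    out.extend(stack.iter().map(|_| Op::Drop));
    out
}
```

Note that this sketch only ever inserts or skips ops; it never rewrites the types already recorded on the abstract stack.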

+cc @fitzgen @eeide

@khagankhan khagankhan requested a review from a team as a code owner November 11, 2025 01:24
@khagankhan khagankhan requested review from alexcrichton and removed request for a team November 11, 2025 01:24
pub(crate) fn operand_types(&self) -> Vec<StackType> {
match self {
// special-cases
Self::TakeTypedStructCall(t) => vec![StackType::Struct(Some(*t))],
Contributor Author

I’d like to remove these “special cases” that require an index. The macro expansion doesn’t accept it and I’d rather not complicate the macro further. If you know a clean way to handle this, please share it with me. I’ll address it in the next PR anyway.

pub(crate) fn result_types(&self) -> Vec<StackType> {
match self {
// special-cases
Self::StructNew(t) => vec![StackType::Struct(Some(*t))],
Contributor Author

Same "special case" here

@khagankhan
Contributor Author

@fitzgen Ready for review!

@github-actions github-actions bot added the fuzzing Issues related to our fuzzing infrastructure label Nov 11, 2025
@github-actions

Subscribe to Label Action

cc @fitzgen

This issue or pull request has been labeled: "fuzzing"

Thus the following users have been cc'd because of the following labels:

  • fitzgen: fuzzing

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

@alexcrichton
Member

Thanks! Nick's out this week and I so far haven't had a chance to look at this. I may end up deferring this to Nick when he gets back as I'll otherwise have to boot back up on a lot of context here, but do you have other subsequent PRs ready to go which are built on this and so it'd be good to get this in sooner rather than later?

@khagankhan
Contributor Author

Hey Alex! I am mostly working on a repo on GitLab where I am ahead. The Wasmtime PRs tend to lag behind my current work because I address comments, failed tests etc. Since Nick and I meet weekly (except this week) and go over everything, I think it makes sense to defer this to him. Thank you for the comment!

@alexcrichton alexcrichton requested review from fitzgen and removed request for alexcrichton November 14, 2025 03:00
Member

@fitzgen fitzgen left a comment

I'm back now, thanks for your patience!

Comment on lines +272 to +276
match &mut op {
GcOp::StructNew(t) | GcOp::TakeStructCall(t) | GcOp::TakeTypedStructCall(t) => {
if num_types > 0 {
*t = *t % num_types;
}
Member

I think this fixing-up of ops should go in GcOp::fixup. We can add num_types as a parameter there. And if num_types == 0, we should probably just remove this op, no? How do we struct.new or call a function that takes a concrete struct reference if we don't define any types?
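A rough sketch of what that could look like, assuming type indices are `u32` and that `fixup` reports an unfixable op by returning `None`. The reduced `GcOp` enum, the `Gc` placeholder variant, and the `fixup_all` helper are hypothetical and only stand in for the real definitions.

```rust
/// Reduced, hypothetical version of the op enum for illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum GcOp {
    /// Placeholder for any op that carries no type index.
    Gc,
    StructNew(u32),
    TakeStructCall(u32),
    TakeTypedStructCall(u32),
}

impl GcOp {
    /// Clamps type indices into `0..num_types`. Returns `None` when the op
    /// needs a type but none are defined, so the caller can remove it.
    fn fixup(self, num_types: u32) -> Option<GcOp> {
        Some(match self {
            // `checked_rem` yields `None` when `num_types == 0`.
            GcOp::StructNew(t) => GcOp::StructNew(t.checked_rem(num_types)?),
            GcOp::TakeStructCall(t) => GcOp::TakeStructCall(t.checked_rem(num_types)?),
            GcOp::TakeTypedStructCall(t) => {
                GcOp::TakeTypedStructCall(t.checked_rem(num_types)?)
            }
            other => other,
        })
    }
}

/// Fixes up every op in place and removes the ones that cannot be made valid.
fn fixup_all(ops: &mut Vec<GcOp>, num_types: u32) {
    let fixed: Vec<GcOp> = ops.iter().filter_map(|op| op.fixup(num_types)).collect();
    *ops = fixed;
}
```

With this shape the `num_types == 0` case needs no separate branch: ops that require a type simply disappear from the sequence.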

/// The operations for the `gc` operations.
#[derive(Copy, Clone, Debug, Serialize, Deserialize)]
pub(crate) enum GcOp {
/// The operations that can be performed by the `gc` function.
Member

Stylistically, it is quite rare to have the doc comment below the #[...] attributes. Mind reverting this code motion?


pub fn results_len(&self) -> usize {
#[allow(unreachable_patterns, reason = "macro-generated code")]
pub(crate) fn operand_types(&self) -> Vec<StackType> {
Member

To avoid a ton of temporary allocations, and reuse a single allocation instead, let's have this method take a mutable vector as an out parameter:

Suggested change
- pub(crate) fn operand_types(&self) -> Vec<StackType> {
+ pub(crate) fn operand_types(&self, types: &mut Vec<StackType>) {
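For illustration, the out-parameter pattern might look like the sketch below. The reduced `GcOp` and `StackType` enums, the `TakeRefs` variant, the `max_operands` caller, and the choice to have the callee clear the vector are assumptions, not the PR's code.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum StackType {
    ExternRef,
    Struct(Option<u32>),
}

/// Reduced, hypothetical op set.
#[derive(Clone, Copy, Debug)]
enum GcOp {
    StructNew(u32),
    TakeRefs,
    TakeTypedStructCall(u32),
}

impl GcOp {
    /// Writes this op's operand types into `types`, replacing its contents.
    /// The vector's allocation is reused across calls.
    fn operand_types(&self, types: &mut Vec<StackType>) {
        types.clear();
        match self {
            GcOp::StructNew(_) => {}
            GcOp::TakeRefs => types.extend([StackType::ExternRef; 3]),
            GcOp::TakeTypedStructCall(t) => types.push(StackType::Struct(Some(*t))),
        }
    }
}

/// Example caller: one scratch vector serves the whole loop.
fn max_operands(ops: &[GcOp]) -> usize {
    let mut scratch = Vec::new();
    let mut max = 0;
    for op in ops {
        op.operand_types(&mut scratch);
        max = max.max(scratch.len());
    }
    max
}
```

Whether the callee clears the vector or appends to it is a design choice; clearing keeps call sites short, while appending lets a caller accumulate types from several ops.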

let new_stack = stack - $params + $results;
Ok((op, new_stack))
#[allow(unreachable_patterns, reason = "macro-generated code")]
pub(crate) fn result_types(&self) -> Vec<StackType> {
Member

And let's also do an out parameter here as well.

Comment on lines +414 to +418
fn $op(
_ctx: &mut mutatis::Context,
_limits: &GcOpsLimits,
stack: usize,
) -> mutatis::Result<(GcOp, usize)> {
Member

Should this still be taking a stack: usize and returning a new usize? Shouldn't it be taking a stack: &mut Vec<StackType> now instead?

Contributor Author

@khagankhan khagankhan Dec 1, 2025

I am back!!!

Does this also mean that when we generate a new op, we should also be checking type compatibility, not just maintaining stack depth?

@fitzgen

Member

> Does this also mean that when we generate a new op, we should also be checking type compatibility, not just maintaining stack depth?

Yes, we either have to do that (if we continue the existing approach) or else we switch to a new approach and instead just generate a random sequence of arbitrary ops and then rely on the fixup pass to make it valid. The latter might be easier long term, so that there is only one code path we have to maintain.

But also, backing up, we shouldn't really need to generate op sequences from scratch. The whole point of using a mutation-based paradigm is that we can generate arbitrary inputs via a series of mutations over time. So, instead of generating whole op sequences, we should be able to get away with something like

impl Generate<GcOps> for GcOpsMutator {
    fn generate(&mut self, ctx: &mut Context) -> mutatis::Result<GcOps> {
        let mut ops = GcOps::default();
        for _ in 0..N {
            self.mutate(ctx, &mut ops)?;
        }
        Ok(ops)
    }
}

(And we should probably build the generic version of this into mutatis directly eventually)
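A minimal, dependency-free sketch of such a generic helper follows. The `Mutate` trait here is a simplified stand-in and not the actual mutatis trait; `generate_via_mutation` and `AppendCounter` are hypothetical names used only to show the shape of the idea.

```rust
/// Simplified stand-in for a mutator trait (not the real mutatis API).
trait Mutate<T> {
    fn mutate(&mut self, value: &mut T) -> Result<(), String>;
}

/// Generates a value by starting from `T::default()` and applying the
/// mutator `rounds` times.
fn generate_via_mutation<T, M>(mutator: &mut M, rounds: usize) -> Result<T, String>
where
    T: Default,
    M: Mutate<T>,
{
    let mut value = T::default();
    for _ in 0..rounds {
        mutator.mutate(&mut value)?;
    }
    Ok(value)
}

/// Toy deterministic mutator: appends an increasing counter on each call.
struct AppendCounter {
    next: u32,
}

impl Mutate<Vec<u32>> for AppendCounter {
    fn mutate(&mut self, value: &mut Vec<u32>) -> Result<(), String> {
        value.push(self.next);
        self.next += 1;
        Ok(())
    }
}
```

Because generation is expressed purely in terms of mutation, any fixup logic only has to live on the mutation path.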

Contributor Author

Awesome! That was the answer I needed. I will push once I have addressed this.

Thanks a lot

Comment on lines +96 to +97
/// Matches any value. Used only for a requested operand, never for a type left on the stack (only for Drop and specially handled ops).
Anything,
Member

Let's align with the Wasm spec and call this AnyRef.

We will eventually want to differentiate between (ref struct) and (ref any) and (ref eq) as well.

Aside: we will need to track nullability for all our different ref types eventually as well.
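As a rough illustration of where this could lead, the sketch below models a small part of the `any` hierarchy together with nullability. The `HeapType` and `RefType` names and layout are assumptions, and declared subtyping between distinct concrete struct types is deliberately ignored.

```rust
/// Hypothetical subset of heap types in the `any` hierarchy.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum HeapType {
    Any,
    Eq,
    Struct,
    /// A concrete struct type, identified by its type index.
    ConcreteStruct(u32),
}

/// A reference type: a heap type plus nullability.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct RefType {
    nullable: bool,
    heap: HeapType,
}

impl HeapType {
    /// Concrete struct <: struct <: eq <: any.
    fn is_subtype_of(self, other: HeapType) -> bool {
        match (self, other) {
            (a, b) if a == b => true,
            // Every heap type in this subset lives under `any`.
            (_, HeapType::Any) => true,
            (HeapType::Struct | HeapType::ConcreteStruct(_), HeapType::Eq) => true,
            (HeapType::ConcreteStruct(_), HeapType::Struct) => true,
            _ => false,
        }
    }
}

impl RefType {
    /// A non-nullable reference can be used where a nullable one is
    /// expected, but not the other way around.
    fn is_subtype_of(self, other: RefType) -> bool {
        (!self.nullable || other.nullable) && self.heap.is_subtype_of(other.heap)
    }
}
```

A fixup pass could then ask whether the type on top of the stack is a subtype of the operand an op requires, rather than testing for exact equality.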

Comment on lines +114 to +120
// Anything can accept any type - just pop if available
// If stack is empty, synthesize null (anyref compatible)
if stack.pop().is_none() {
// Create a null externref
Self::emit(GcOp::Null(), stack, out, num_types);
stack.pop(); // consume just-synthesized externref
}
Member

(ref extern) is in a different type hierarchy from (ref any). That is, there is no single top type for everything, so there is no function that can take anything, and I want to make sure we don't build that assumption in here.

Backing up a bit: I am a bit confused about this function's purpose and why it is necessary. It seems like we should never be changing the stack types; we should only be doing the opposite: fixing up our ops given the types that are actually on the stack. The former doesn't make sense (we can't change what types are on the stack at this point without changing what instructions we emitted earlier).
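The point about separate hierarchies can be sketched as follows. The `HeapType` subset and the `top` and `common_top` helpers are hypothetical names for illustration; the sketch only shows that two types from different hierarchies share no supertype.

```rust
/// Hypothetical subset of Wasm heap types spanning three hierarchies.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum HeapType {
    Any,
    Eq,
    Struct,
    Extern,
    Func,
}

impl HeapType {
    /// The top type of the hierarchy this heap type belongs to.
    fn top(self) -> HeapType {
        match self {
            HeapType::Any | HeapType::Eq | HeapType::Struct => HeapType::Any,
            HeapType::Extern => HeapType::Extern,
            HeapType::Func => HeapType::Func,
        }
    }
}

/// Returns the shared top type, or `None` when the two types live in
/// different hierarchies and therefore have no common supertype.
fn common_top(a: HeapType, b: HeapType) -> Option<HeapType> {
    if a.top() == b.top() {
        Some(a.top())
    } else {
        None
    }
}
```

An operand slot that claims to accept "anything" would need `common_top` to succeed for every pair of types, which it cannot.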
