important question for anyone good at x86: can microcode cache the top of the stack in processor registers for sufficiently nearby pushes and pops, or do stack accesses always require a cache access no matter what?

@mothcompute Apparently so. I was thinking that 'store forwarding' would be the thing that lets this happen, but when I was hunting for a reference I came across 'Mirroring memory operands' (section 24.17, page 236, in the link below), which says 'It also works with PUSH and POP instructions.' Note, I doubt microcode gets involved: I think microcode is only used for big complex stuff, not anything fast.

agner.org/optimize/microarchit

@penguin42 that's *exactly* what i was looking for. thank you

@penguin42 it's very interesting that it mentions that it's present in zen 2 but not zen 3, because those are the two machines i usually write for. maybe i can try comparing performance per clock between them in these sections

@mothcompute Yeah, I guess all these forwarding mechanisms are really complex and interact with the rest of the out-of-order pipeline. It's possible they hit a bug in Zen 3 and decided to take it out or turn it off rather than fix it before release; difficult to know. I'm guessing some of the other parts of the forwarding from the store buffer might get you some of the performance anyway; shrug.
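
For the Zen 2 vs Zen 3 comparison floated above, a rough microbenchmark sketch might look like the following. It is not from the thread: the function names (`chain_through_memory`, `chain_in_register`, `ns_per_iter`) are made up for illustration, and it assumes GCC or Clang on a POSIX system. It uses a `volatile` stack slot as a stand-in for explicit PUSH/POP pairs, so it times a dependent store-then-reload chain rather than the exact instructions discussed, and the compiler decides how the slot is addressed.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

/* Dependent chain through memory: each iteration stores a value to a
   stack slot and immediately reloads it, so the next add has to wait
   for the reload (served by forwarding, mirroring, or the cache). */
uint64_t chain_through_memory(uint64_t iters) {
    volatile uint64_t slot = 0;  /* volatile forces a real store and load */
    uint64_t x = 0;
    for (uint64_t i = 0; i < iters; i++) {
        slot = x + 1;  /* store */
        x = slot;      /* dependent reload */
    }
    return x;
}

/* Baseline: the same chain of adds kept entirely in a register. */
uint64_t chain_in_register(uint64_t iters) {
    uint64_t x = 0;
    for (uint64_t i = 0; i < iters; i++) {
        x += 1;
        /* Empty asm (GCC/Clang extension) stops the compiler from
           collapsing the loop into a single addition. */
        __asm__ volatile("" : "+r"(x));
    }
    return x;
}

/* Wall-clock nanoseconds per iteration of fn(iters). */
double ns_per_iter(uint64_t (*fn)(uint64_t), uint64_t iters) {
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    volatile uint64_t sink = fn(iters);  /* keep the result alive */
    clock_gettime(CLOCK_MONOTONIC, &t1);
    (void)sink;
    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / (double)iters;
}
```

Calling `ns_per_iter(chain_through_memory, n)` and `ns_per_iter(chain_in_register, n)` with a large `n` on each machine, then multiplying by the core clock in GHz, gives an approximate cycles-per-iteration figure for each chain. Wall-clock timing is noisy and boost clocks vary, so pinning the thread and reading a cycle counter would be more trustworthy for a real per-clock comparison.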