Skip to content

Conversation

@JoeCitizen
Copy link

Add a new HLSL attribute for Compute, Amp and Mesh shaders: GroupSharedLimit.

This is used to limit the amount of group shared memory a shader is allowed to statically declare, and validation will fail if the limit is exceeded.

There is no upper limit on this attribute, and it is expected that shader writers set the limit as the lowest common denominator for their target hardware and software use case (typically 48k or 64k for modern GPUs).

If no attribute is declared the existing 32k limit is used to be compatible with existing shaders.

Extends the PSV structures to include the selected limit so that runtime validation can reject the shader if it exceeds the device support.

@github-actions
Copy link
Contributor

github-actions bot commented Nov 4, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Member

@damyanp damyanp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some superficial things that'll need to be addressed at some point, I'm afraid I can't really comment on the actual details of how this works.

I suspect that the whole thing needs to be run through clang-format as well.

}
}

void hlsl::SetShaderProps(PSVRuntimeInfo4 *pInfo4, const DxilModule &DM) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void hlsl::SetShaderProps(PSVRuntimeInfo4 *pInfo4, const DxilModule &DM) {
void hlsl::SetShaderProps(PSVRuntimeInfo4 *Info4, const DxilModule &DM) {

For new code we generally follow the LLVM Coding Standards. This includes no-p-prefix.

PSVRuntimeInfo1 *pInfo1, PSVRuntimeInfo2 *pInfo2,
PSVRuntimeInfo3 *pInfo3, uint8_t ShaderKind,
const char *EntryName, const char *Comment) {
PSVRuntimeInfo3 *pInfo3, PSVRuntimeInfo4 *pInfo4,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO in this case following the pInfoN pattern is appropriate.

Comment on lines 3930 to 3932
if (SpecifiedTGSMSize > 0) {
MaxSize = SpecifiedTGSMSize;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (SpecifiedTGSMSize > 0) {
MaxSize = SpecifiedTGSMSize;
}
if (SpecifiedTGSMSize > 0)
MaxSize = SpecifiedTGSMSize;

LLVM coding standards say to omit braces here.

Something's also up with the formatting. Did the format-checker spot it? Anyway, clang-format should fix this for you.

@JoeCitizen JoeCitizen marked this pull request as ready for review December 1, 2025 19:10
@damyanp damyanp linked an issue Jan 5, 2026 that may be closed by this pull request

case DxilMDHelper::kDxilGroupSharedLimitTag: {
DXASSERT(props.IsCS(), "else invalid shader kind");
props.groupSharedLimitBytes = ConstMDToUint32(MDO);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should set m_bExtraMetadata if shader model is not 6.10 or higher. This is how validation catches newer metadata used in prior shader models where it's unsupported.

      if (!m_pSM->IsSMAtLeast(6, 10))
        m_bExtraMetadata = true;

hlsl::SetShaderProps((PSVRuntimeInfo0 *)&DMPSV, DM);
hlsl::SetShaderProps((PSVRuntimeInfo1 *)&DMPSV, DM);
hlsl::SetShaderProps((PSVRuntimeInfo2 *)&DMPSV, DM);
hlsl::SetShaderProps((PSVRuntimeInfo4 *)&DMPSV, DM);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, part of the purpose of setting the DMPSV props is to check the props (with memcmp) below, which is how Mismatched is set. This sets the props to DMPSV but doesn't check that they match the props loaded from the container.

The function doesn't take PSV3 because it is checked in PSVContentVerifier::Verify (this one requires the string table indirection).

For PSV4, memcmp would be no good because of the PSV3 string index indirection, so it should probably just check the property directly, which requires adding PSVRuntimeInfo4 *PSV4 to PSVContentVerifier::VerifyEntryProperties parameters. I'll submit suggestions too.

void PSVContentVerifier::VerifyEntryProperties(const ShaderModel *SM,
PSVRuntimeInfo0 *PSV0,
PSVRuntimeInfo1 *PSV1,
PSVRuntimeInfo2 *PSV2) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should have PSV3 and PSV4 pointers as well.

Suggested change
PSVRuntimeInfo2 *PSV2) {
PSVRuntimeInfo2 *PSV2,
PSVRuntimeInfo4 *PSV4) {

else
Mismatched = memcmp(PSV0, &DMPSV, sizeof(PSVRuntimeInfo0)) != 0;

if (Mismatched) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (Mismatched) {
if (PSV4 && PSV4->GroupSharedLimit != DMPSV.GroupSharedLimit)
Mismatched = true;
if (Mismatched) {

// Only valid for SM6.10+
if (!SM->IsSM610Plus()) {
unsigned DiagID = Diags.getCustomDiagID(
DiagnosticsEngine::Error, "attribute GroupSharedLimit only valid for "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we prefer to place these diagnostics earlier, in SemaHLSL.cpp (probably under DiagnoseEntry and DiagnoseEntryAttrAllowedOnStage) (in spite of the fact that many existing diagnostics were placed here in the past). Note that this allows for diagnosing library entry points too.

Comment on lines +1670 to +1672
if (SM->IsMS()) { // Fallback to default limits
funcProps->groupSharedLimitBytes = DXIL::kMaxMSSMSize; // 28k For MS
} else if (SM->IsAS() || SM->IsCS()) {
Copy link
Contributor

@tex3d tex3d Jan 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think relying on SM->IsMS() will work for library targets. I think you want to use isMS (and so on) instead. That can cover the node case as well.

Suggested change
if (SM->IsMS()) { // Fallback to default limits
funcProps->groupSharedLimitBytes = DXIL::kMaxMSSMSize; // 28k For MS
} else if (SM->IsAS() || SM->IsCS()) {
if (isMS) { // Fallback to default limits
funcProps->groupSharedLimitBytes = DXIL::kMaxMSSMSize; // 28k For MS
} else if (isAS || isCS || isNode) {


// Check if the entry function has attribute to override TGSM size.
if (M.HasDxilEntryProps(M.GetEntryFunction())) {
DxilEntryProps &EntryProps = M.GetDxilEntryProps(M.GetEntryFunction());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this code only checks non-library shaders, it seems we are missing library TGSM size validation. So, the node case is completely unchecked.

However, we do calculate and store GroupSharedBytesUsed in RDAT shader info structs. This could be used for runtime checks against device capabilities instead of a maximum size.

It looks to me like we have a bit of a design gap here. Wouldn't it be better to report the maximum size used (like GroupSharedBytesUsed) to the runtime for the runtime check?

For the compilation, we can check against the max size attribute to prevent accidental over-allocation, and for validation, we can check against the size reported in PSV0 or RDAT, as well as a version-based check (SM 6.10 required larger than default max).

Then you don't need max size in PSV0 or RDAT, just reported usage. The attribute would just drive a compile-time override of default maximum.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I filed an issue on the proposal for this here: microsoft/hlsl-specs#761

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: New

Development

Successfully merging this pull request may close these issues.

[SM 6.10] Groupshared Memory Size preview

3 participants