[SPARK-57867][CORE] Driver should not reserve off-heap memory in non-local mode#56945
dongjoon-hyun wants to merge 1 commit into
cc @cloud-fan, @HyukjinKwon, @viirya
cloud-fan left a comment
0 blocking, 0 non-blocking, 0 nits.
Clean, well-scoped fix.
Verification
Verified independently: (1) both MEMORY_OFFHEAP_ENABLED=false and MEMORY_OFFHEAP_SIZE=0 are set on the local conf.clone — setting only the size while leaving enabled=true would trip MemoryManager.tungstenMemoryMode's require(size > 0), so setting both is necessary; (2) the disable is contained to the clone (a deep copy), so env.conf and the executor conf path (which sets off-heap size from the ResourceProfile) are unaffected; (3) Utils.isLocalMaster treats local-cluster as non-local, so the two tests exercise the intended branches — non-local asserts maxOffHeapStorageMemory === 0, local asserts > 0. The driver in non-local mode stores no off-heap blocks (broadcast uses on-heap MEMORY_AND_DISK).
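Points (1) and (2) above can be pictured with a small toy model. This is only a sketch: a plain dict stands in for SparkConf, and `tungsten_memory_mode` and `driver_memory_conf` are hypothetical helpers that mimic the behavior described in the review, not Spark's actual code.

```python
# Toy model, not Spark code: a plain dict stands in for SparkConf.
OFFHEAP_ENABLED = "spark.memory.offHeap.enabled"
OFFHEAP_SIZE = "spark.memory.offHeap.size"


def tungsten_memory_mode(conf):
    """Mimics the check described in the review: enabled requires size > 0."""
    if conf.get(OFFHEAP_ENABLED, False):
        if conf.get(OFFHEAP_SIZE, 0) <= 0:
            raise ValueError(OFFHEAP_SIZE + " must be > 0 when off-heap is enabled")
        return "OFF_HEAP"
    return "ON_HEAP"


def driver_memory_conf(conf, is_local):
    """Hypothetical helper: the conf the driver's memory manager would see."""
    local = dict(conf)  # work on a copy so the caller's conf is never mutated
    if not is_local:
        # Both keys are reset; zeroing only the size would fail the check above.
        local[OFFHEAP_ENABLED] = False
        local[OFFHEAP_SIZE] = 0
    return local


user_conf = {OFFHEAP_ENABLED: True, OFFHEAP_SIZE: 1024}
driver_conf = driver_memory_conf(user_conf, is_local=False)
print(tungsten_memory_mode(driver_conf))  # mode seen by the non-local driver
print(user_conf[OFFHEAP_SIZE])            # the original conf is unchanged
```

In this model, zeroing only the size while leaving the enabled flag set raises, which is the reason the review gives for setting both keys on the clone.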
Thank you, @cloud-fan!
Thank you for the fix, @dongjoon-hyun. The motivation makes sense to me: the driver's container memory is never sized for
1. This conflicts with SPARK-46947 (
(The core-module test job was still in progress when I looked, which is probably why CI hasn't flagged it yet.) Beyond the test itself, this is a behavior question we should decide explicitly: should a non-local driver ignore off-heap even when a
2. The cloned conf leaves
After this change, on a non-local driver
One small note: if the clone approach stays, a short comment on why both keys are set would help
Thank you for the review, @viirya. For the following, yes, because Spark didn't allocate the requested resources with that configuration. The driver JVM is already started. I revised the test case.
For the second question and the note, let me check more.
ee9592a to 6bd3b32
What changes were proposed in this pull request?
This PR proposes to stop reserving off-heap memory pools (spark.memory.offHeap.size) in the driver's MemoryManager in non-local deployments. SparkEnv.initializeMemoryManager takes a new offHeapAllowed parameter, and SparkContext passes offHeapAllowed = isLocal for the driver. The executor path and local mode are unchanged.
Why are the changes needed?
Off-heap memory is accounted for only in executor resource sizing (ResourceProfile.OFFHEAP_MEM, YARN executor container size, K8s BasicExecutorFeatureStep). The driver's container memory request never includes spark.memory.offHeap.size. So, we should not allow it.
However, with spark.memory.offHeap.enabled=true, the Executors UI and REST API show the driver with spark.memory.offHeap.size of Off Heap Storage Memory like the following, which is very misleading.
BEFORE
AFTER
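As a rough illustration of the offHeapAllowed gate and its effect on the reported driver value, here is a minimal sketch. It is a toy model, not Spark code: `is_local_master` and `driver_max_off_heap` are hypothetical helpers, a plain dict stands in for SparkConf, and the real pool sizing in MemoryManager is more involved than returning the configured size.

```python
def is_local_master(master):
    """Toy check: 'local' and 'local[...]' are local; 'local-cluster[...]' is not."""
    return master == "local" or master.startswith("local[")


def driver_max_off_heap(conf, master):
    """Hypothetical helper: off-heap bytes the driver's memory manager reserves."""
    off_heap_allowed = is_local_master(master)  # models offHeapAllowed = isLocal
    enabled = conf.get("spark.memory.offHeap.enabled", False)
    size = conf.get("spark.memory.offHeap.size", 0)
    # Simplification: the whole configured size is treated as the reserved pool.
    return size if (enabled and off_heap_allowed) else 0


conf = {"spark.memory.offHeap.enabled": True, "spark.memory.offHeap.size": 2048}
for master in ("local[2]", "local-cluster[2,1,1024]", "yarn"):
    print(master, driver_max_off_heap(conf, master))
```

Under these assumptions only the local master keeps a non-zero driver off-heap pool, which matches the two test branches described in the review (non-local asserts 0, local asserts > 0).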
Does this PR introduce any user-facing change?
No. The driver in non-local deployments never uses it: it runs no tasks, stores no off-heap blocks.
How was this patch tested?
Pass the CIs.
Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Fable 5