ARMHF ‐ aligning for running binaries
This document describes the proposed changes needed to make ARMHF binaries interchangeable between two repositories.
Abbreviations used:
- ADT - refers to AROS Development Team repository (https://github.com/aros-development-team/AROS)
- dd - refers to deadwood2 repository (https://github.com/deadwood2/AROS)
As an outcome, a binary built from the sources of the ADT repo should be able to run on a system built from the sources of the dd repo, and a binary from the dd repo should be able to run on a system built from the ADT repo.
State of builds:
- ADT - armhf is currently built in nightly builds and exposed to public. Changes to binary interfaces are (still allowed?/not allowed anymore?)
- dd - armhf is not built in nightly builds and is not officially supported or guaranteed stable. Changes to binary interface are still allowed.
The current calling convention for armhf is as follows. For regcall functions, the armhf SysV ABI is used to pass arguments, with the library base passed as an additional, last argument. For stackcall functions, the armhf SysV ABI is used to pass arguments, with the library base passed via the R12 register (a scratch register). Currently both repositories have the same calling convention.
Proposal: keep the convention unchanged OR, if changes are needed, align and implement the same changes in both repositories.
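To make the regcall half of the convention concrete, here is a minimal sketch in plain C. `struct DemoBase`, `Demo_Add` and `call_Demo_Add` are hypothetical names used only for illustration and are not taken from either repository. The stackcall case (base passed in R12) needs target-specific assembly and is deliberately not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a library base; NOT the real struct Library. */
struct DemoBase {
    long call_count;
};

/* "regcall"-style entry point: the regular arguments come first and the
 * library base is appended as the additional, last argument. */
long Demo_Add(long a, long b, struct DemoBase *base)
{
    assert(base != NULL);
    base->call_count++;          /* shows the base is reachable in the callee */
    return a + b;
}

/* Caller-side stub: the application sees (base, a, b) and the stub moves
 * the base to the end before entering the library function. */
long call_Demo_Add(struct DemoBase *base, long a, long b)
{
    return Demo_Add(a, b, base);
}
```

Because the base is the last argument, the positions of the regular arguments in registers or on the stack are the same as for an ordinary SysV call with one extra trailing parameter.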
There is a set of libraries in dd and ADT which have the same name and the same LVOs, but which have different expectations of the binary with regard to initialization of the C library. In ADT they require the binary to initialize the C library and share it, while in dd in most cases they require neither initialization nor sharing. Because of this, a binary from one repo using these libraries from the other repo might behave in an unpredictable way. Currently the following libraries are on this list: bz2, tiff, png, jpeg, z1, lzma, zstd, lcms2, regina, freetype2, glu.
Proposal: implement a dual-loading mechanism by which, when dos detects a binary built from the other repository, it serves that binary (via OpenLibrary) the above-mentioned libraries from that other repository's build. These libraries will have to be placed somewhere on disk, separate from the "native" LIBS:.
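The selection step of such a dual-loading mechanism could look roughly like the sketch below. Everything here is an assumption made for illustration: the `BuildOrigin` enum, the `library_dir_for` helper, the `SYS:ForeignLibs/` location and the shortened library list are hypothetical, and how dos would actually detect the origin of a binary is left open.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tag describing which repository produced a build. */
enum BuildOrigin { ORIGIN_ADT, ORIGIN_DD };

#define NATIVE_LIB_DIR  "LIBS:"
#define FOREIGN_LIB_DIR "SYS:ForeignLibs/"   /* hypothetical location */

/* Shortened, illustrative subset of the affected libraries. */
static const char *const dual_loaded_libs[] = {
    "bz2.library", "png.library", "freetype2.library"
};

/* Pick the directory a library should be loaded from for a given binary. */
const char *library_dir_for(enum BuildOrigin binary, enum BuildOrigin system,
                            const char *name)
{
    size_t i;

    assert(name != NULL);

    /* Binary and system come from the same repo: nothing special to do. */
    if (binary == system)
        return NATIVE_LIB_DIR;

    /* Foreign binary: only the listed libraries are served separately. */
    for (i = 0; i < sizeof(dual_loaded_libs) / sizeof(dual_loaded_libs[0]); i++) {
        if (strcmp(name, dual_loaded_libs[i]) == 0)
            return FOREIGN_LIB_DIR;
    }
    return NATIVE_LIB_DIR;
}
```

Libraries that are not on the list keep coming from the native location even for a foreign binary, which keeps the duplicated set on disk small.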
dd repo introduced a concept of private and public SDK, not publishing headers and linklibs for selected components (mostly kernel, oop and hidd drivers) in the builds. The reason for this was to allow a level of flexibility in changing internal and low-level components. dd repo does not guarantee backwards compatibility for interfaces which are not published in the build. A side effect is that the mentioned elements cannot be compiled outside of the AROS build system and have to be in some way integrated into the AROS repository. ADT repo, on the other hand, has all its headers public and published.
A problem may arise in the future when a breaking change is made in dd repo in the mentioned components. Such a change is allowed under dd repo guarantees, but since all elements are public in ADT, it would be considered a breaking change there.
Proposal1: Agree and document in ADT repo that SDK for selected components is not guaranteed to be backwards compatible
Proposal2: Port SDK solution from dd to ADT and hide part of SDK in the same way as in dd.
Background: dd repo snapshotted the state of LVOs in 2019.
- Exec
ADT repo moved 'per task storage' functions from exec.library to task.resource (2023). These changes were not reflected in dd repo.
Proposal: in ADT repo, add private functions to exec that will call back to task.resource. In dd repo, add functions to task.resource that will call back to exec.
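The ADT half of this proposal amounts to a thin forwarding stub. The sketch below shows the shape of such a stub. All names (`DemoTaskRes`, `DemoExecBase`, `DemoTaskRes_GetSlot`, `DemoExec_GetSlot`) are hypothetical and do not correspond to the real exec.library or task.resource functions or LVOs.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SLOTS 4

/* Hypothetical stand-in for the task.resource base holding the storage. */
struct DemoTaskRes {
    long slots[DEMO_SLOTS];
};

/* Hypothetical stand-in for the exec base, which only knows where the
 * resource lives. */
struct DemoExecBase {
    struct DemoTaskRes *task_res;
};

/* The real implementation lives in the resource. */
long DemoTaskRes_GetSlot(struct DemoTaskRes *res, int slot)
{
    assert(res != NULL);
    assert(slot >= 0 && slot < DEMO_SLOTS);
    return res->slots[slot];
}

/* Private exec entry point: keeps the old call site working and simply
 * forwards to the resource. */
long DemoExec_GetSlot(struct DemoExecBase *exec, int slot)
{
    assert(exec != NULL);
    return DemoTaskRes_GetSlot(exec->task_res, slot);
}
```

The dd half would be the mirror image, with the resource function forwarding to the implementation that stays in exec.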
- Freetype
There was a re-organization of LVOs in freetype2.library in 2021 in ADT repo. These changes were not reflected in dd repo.
Proposal1: return the LVOs in ADT repo to pre-2021 state
Proposal2: use "dual-loading of libraries" approach with freetype2.library (since it's already on the list of potentially impacted libraries)
- Png
In 2024, in dd repo, an existing function in png.library was modified (using same LVO) because it could not work correctly due to C library dependency (it was broken before modification). This change is not needed in ADT repository.
Proposal: use "dual-loading of libraries" approach with png.library (since it's already on the list of potentially impacted libraries)
- spinlock_t
The "2019" spinlock_t structure has an added alignment requirement of a 128 byte boundary. This made any structure containing spinlock_t grow significantly in size and caused problems for the Pascal compiler. I (deadwood) also personally believe that it cannot guarantee the spinlock is aligned to 128 bytes when it sits within a larger structure (struct MsgPort) which is allocated via AllocMem with 16 byte alignment. The only comment in the commit introducing this is: "according to intel docs, spinlock shall be best aligned to 128 byte boundary". dd repo removed the alignment, while ADT repo still has it.
Upon further discussion with Kalamatee, the real requirement was revealed. The need is for two spinlocks to be at least 128 bytes apart from each other in memory. This is because x86_64 CPUs have a 128 byte cache line, and when two spinlocks are in the same cache line it causes cache congestion and loss of performance. The requirement is therefore not that spinlocks are aligned on a 128 byte boundary, but that they are 128 bytes apart; growing the structure by using alignment was chosen as the solution. However, using 128 byte alignment has a serious side effect (documented below in the section about Object alignment).
Proposal: Understand if spacing out spinlocks is necessary for armhf. If it is, grow all the structures that use spinlocks to be at least 128 bytes in size by using padding. Don't use alignment on the spinlock itself, so that the alignment of other structures is not changed and follows standard ABI definitions. If it is not, keep the alignment only for x86_64.
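The difference between the two approaches can be checked with a small experiment. The sketch below assumes GCC or Clang (for `__attribute__((aligned))`) and C11 (for `_Alignof` and `_Static_assert`). The types `plain_spinlock_t`, `aligned_spinlock_t`, `PortA` and `PortB` are hypothetical, simplified stand-ins and do not reproduce the real spinlock_t or struct MsgPort layouts.

```c
#include <assert.h>
#include <stddef.h>

#define SPINLOCK_SPACING 128

/* Hypothetical minimal spinlock without any alignment attribute. */
typedef struct {
    volatile unsigned int lock;
} plain_spinlock_t;

/* Variant A: the alignment is attached to the spinlock type itself. */
typedef struct {
    volatile unsigned int lock;
} __attribute__((aligned(SPINLOCK_SPACING))) aligned_spinlock_t;

/* The 128 byte alignment leaks into every structure embedding the lock. */
struct PortA {
    void *node;
    aligned_spinlock_t lock;
};

/* Variant B: keep natural alignment and pad the container to 128 bytes. */
struct PortB {
    void *node;
    plain_spinlock_t lock;
    unsigned char pad[SPINLOCK_SPACING - sizeof(void *) - sizeof(plain_spinlock_t)];
};

_Static_assert(sizeof(struct PortB) >= SPINLOCK_SPACING,
               "padded container must be at least 128 bytes");

/* Distance in bytes between the locks of two neighbouring PortB objects. */
size_t adjacent_lock_distance(void)
{
    struct PortB ports[2];
    return (size_t)((char *)&ports[1].lock - (char *)&ports[0].lock);
}
```

With variant A the container's alignment requirement becomes 128, which an allocator with 16 byte alignment cannot honour. With variant B the container keeps its natural ABI alignment, and locks in two distinct objects are still at least 128 bytes apart because each object is at least that large.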
- struct MsgPort
In dd repo, a 2xIPTR space is added at the end of struct MsgPort for everything other than m68k. This space is used internally to hold a spinlock without exposing it externally. In ADT the spinlock is exposed based on AROSPLATFORM_SMP, which is defined for armhf targets.
Proposal: Keep AROSPLATFORM_SMP defined for armhf in ADT as obligatory. Adjust the size of private space for armhf in dd repo based on decision on 1) above and Other.1 below.
- struct Semaphore
In ADT, struct Semaphore has an embedded spinlock_t. In dd this spinlock was removed, as there was no place in the AROS code where it was used.
Proposal: decide if the spinlock in Semaphore will be needed in the future. If yes, add the spinlock to Semaphore in dd repo for everything other than m68k and x86_64. If not, remove the spinlock from Semaphore in ADT.
- structure size and alignment
Because of the differences described in 1), struct MsgPort and struct Semaphore have different sizes as well as different alignment requirements between ADT and dd. This then propagates to all structures which embed struct MsgPort and struct Semaphore and, recursively, to all structures that embed those structures, meaning that a large, unidentified portion of structures is not binary compatible between ADT and dd.
Proposal: Agreements on 1, 2 and 3 should automatically close this point.
- Alignment requirement for MUI subclass data
Subclass data in the build is created by allocating a contiguous block of memory the size of all class data in the class hierarchy and then casting an address within that block to a given subclass data structure. The address, however, cannot be arbitrarily aligned. It has to be aligned in such a way that the resulting data structure meets the requirements of the ABI. In other words, if the data structure has an 8 byte field, that field has to be on an 8 byte boundary.
As an effect of 1), a change was made in ADT in 2025 that now requires all subclass data to be aligned on a 16 byte boundary. Because struct Semaphore is included in one of the base MUI class data structures, that data structure has to be aligned on a 16 byte boundary to meet ABI requirements. GCC assumes that alignment and generates SSE vectorization code which fails if the alignment is not met.
Proposal: Agreement on 1 and 3 will define whether alignment has to be kept at 16 or can be lowered to 8
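The cost of the alignment choice can be illustrated by laying out per-class data offsets inside one block. This is a minimal sketch, not the actual MUI or BOOPSI code: `ClassDataDesc`, `align_up` and `layout_instance` are hypothetical names, and the block's base address is assumed to already satisfy the chosen alignment.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of align (a power of two). */
size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Hypothetical description of one class in the hierarchy. */
struct ClassDataDesc {
    size_t size;     /* size of this class's instance data   */
    size_t offset;   /* computed offset inside the one block */
};

/* Assign each class an aligned offset, root class first.
 * Returns the total block size to allocate. */
size_t layout_instance(struct ClassDataDesc *classes, size_t count, size_t align)
{
    size_t offset = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        offset = align_up(offset, align);
        classes[i].offset = offset;
        offset += classes[i].size;
    }
    return align_up(offset, align);
}
```

For class data sizes of 20, 4 and 40 bytes, a 16 byte alignment yields offsets 0, 32 and 48 and a 96 byte block, while an 8 byte alignment yields offsets 0, 24 and 32 and a 72 byte block.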
- MsgPorts created on stack
It's a common practice in Amiga code to create a message port on the stack and initialize it manually on the caller side. This however causes an issue when MsgPort hides a spinlock inside. On x86_64 I (deadwood) was able to get away with this by adding an initialization magic field inside the spinlock, after the union and before the Owner pointer, where there was free space due to alignment rules. For armhf this will not work, as there is no free space there.
Proposal: for armhf, add one more private field to struct MsgPort which will contain an initialization magic marker and will be used to perform lazy spinlock initialization.
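A lazy initialization guarded by a magic marker could be sketched as follows. This is only an illustration under stated assumptions: `DemoSpinlock`, `DemoMsgPort`, `PORT_LOCK_MAGIC` and `port_ensure_lock` are hypothetical, the field layout does not match the real struct MsgPort, and the check is not atomic, so a real implementation would need to handle concurrent first use and the small chance that stack garbage equals the marker.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker value; any unlikely bit pattern would do. */
#define PORT_LOCK_MAGIC 0x534C4B31u

/* Hypothetical, simplified spinlock. */
struct DemoSpinlock {
    volatile uint32_t state;
};

/* Hypothetical, simplified port: public fields first, private area last. */
struct DemoMsgPort {
    unsigned char flags;
    unsigned char sigbit;
    /* private area, not initialized by callers that build ports by hand */
    struct DemoSpinlock lock;
    uint32_t lock_magic;
};

/* Make sure the hidden lock is usable before the system touches it.
 * Returns 1 if this call initialized the lock, 0 if it already was. */
int port_ensure_lock(struct DemoMsgPort *port)
{
    assert(port != NULL);

    if (port->lock_magic == PORT_LOCK_MAGIC)
        return 0;

    port->lock.state = 0;               /* unlocked */
    port->lock_magic = PORT_LOCK_MAGIC; /* remember that init happened */
    return 1;
}
```

Every system function that takes the port's lock would call the check first, so a port filled in by hand on the caller's stack gets its private area set up on first use.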
- Semaphores created on stack
Depending on whether Semaphore will end up having a spinlock, we are possibly looking at same issues as 1) above.