Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A bug of Loader on ARM platform #732

Open
AnzhongHuang opened this issue Nov 1, 2021 · 2 comments
Open

A bug of Loader on ARM platform #732

AnzhongHuang opened this issue Nov 1, 2021 · 2 comments

Comments

@AnzhongHuang
Copy link

@AnzhongHuang AnzhongHuang commented Nov 1, 2021

A segment fault is observed on ARM platform when a Vulkan application using private Vulkan extensions. And this is a ARM specific bug which cannot be reproduced on X86_64 Windows or X86_64 Linux. This bug is related to the feature of Tail-Call which is used by Vulkan-Loader.

The behavior of Vulkan-Loader is that Vulkan-Loader must expose its own functions to application, since some objects are wrapped by Loader, and those object cannot be recognized by Driver or Layer until they are unwrapped, so the entry point of each Vulkan API must come from Vulkan-Loader, the call sequence could be: App -> Vulkan-Loader API -> Vulkan Layer1 API -->… --> Vulkan LayerN API --> Driver ICD API.
However, for some Driver ICD APIs, they are private extensions of Vulkan driver, which are not published, so Vulkan-Loader doesn’t know the actual parameters of them. Ideally, Vulkan-Loader can just expose the address of the Driver ICD API to application, but since some objects are wrapped by Loader, so Vulkan-Loader has to provide a wrapped function which has a single parameter, the single parameter could be a device, physical device, or instance, and it’s unwrapped by loader before calling Driver ICD API. The pseudo-code:
The Loader:
typedef void (VKAPI_PTR* vkGetDoSomethingAMD)(VkPhysicalDevice device, int arg1, int arg2, int arg3);
VkResult vkGetDoSomethingAMD(VkPhysicalDevice dev)
{
pfnvkGetDoSomethingAMD = GetIcdEntryPoint("vkGetDoSomethingAMD");
VkPhysicalDevice realDev = Unwrapped(dev);
pfnvkGetDoSomethingAMD(realDev);
}

Application:
vkGetDoSomethingAMD(device, arg1, arg2, arg3);

The call sequence is that:
vkGetDoSomethingAMD(device, arg1, arg2, arg3) --> Loader::vkGetDoSomethingAMD(device) --> ICD::vkGetDoSomethingAMD(unwrapped_device, arg1, arg2, arg3)

To support this functionality, the feature of Tail-call is used by the Vulkan-Loader, it’s enabled by GCC when optimization level O2 or O3 is used. However, we found a ARM GCC’s bug which cause a segment fault.

The bug of ARM GCC.
For example, when the application call vkGetDoSomethingAMD(device, 10, 20, 30); then the ICD Api is invoked by loader, ICD::vkGetDoSomethingAMD(unwrapped_device, 0xffffffffab8080, 20, 30), the value of second parameter should be 10, but actually it is the address of ICD::vkGetDoSomethingAMD, i.e. arg1 = ICD::vkGetDoSomethingAMD.

The following is the corresponding assembly code of calling ICD::vkGetDoSomethingAMD from Loader which is generated by ARM GCC:
.cfi_startproc
ldr x1, [x0]
ldr x1, [x1, 4288]
mov x16, x1
br x16
The second parameter is stored into x1 register, but when this piece of code is executed, X1 will be overwritten by the address of ICD::vkGetDoSomethingAMD. This issue can be temporarily work-around when we bypassed the ARM GCC compiler by modifying the assembly code of Vulkan-Loader directly, such as:
.LFB724:
.cfi_startproc
mov x17, x1
ldr x1, [x0]
ldr x1, [x1, 4288]
mov x16, x1
mov x1, x17
br x16
.cfi_endproc

Is it a known issue, can someone provide a official fix from Loader side? Such as write a universal assembly code for the ARM GCC.

@esullivan-nvidia
Copy link
Contributor

@esullivan-nvidia esullivan-nvidia commented Nov 1, 2021

I recently uploaded a PR that should fix this issue. Would you mind checking it out and seeing if it resolves your problem?

#731

@charles-lunarg
Copy link
Collaborator

@charles-lunarg charles-lunarg commented Dec 15, 2021

I'd really love to know if the merged arm assembly code fixes this, since then we know what the cause & fix was.

As in, if fixes the problem close the issue but if the need for cross compile support isn't done we should create a separate issue to track it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants