OpenCL tagged news

PoCL 3.1 provides compatibility with the LLVM/Clang 15.0 release and switches to lowercase device names for platform setup via the “POCL_DEVICES” environment variable. The release also brings a major rework of the custom device driver, much-improved SPIR-V support, continued work towards a Vulkan driver, and a basic implementation of the OpenCL cl_khr_command_buffer extension.
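As a rough illustration of the lowercase naming, a launcher script could normalize device names before starting an OpenCL application. This is only a sketch: the helper `pocl_env` is hypothetical, the device names in the usage lines are placeholders, and the space-separated list format is an assumption that should be checked against the PoCL documentation for the version in use.

```python
import os


def pocl_env(devices, base_env=None):
    """Return a copy of the environment with POCL_DEVICES set.

    Hypothetical helper: lowercases each device name and joins the names
    with spaces (assumed list format, verify against the PoCL docs).
    """
    env = dict(os.environ if base_env is None else base_env)
    env["POCL_DEVICES"] = " ".join(name.strip().lower() for name in devices)
    return env


# Placeholder device names, used for illustration only.
env = pocl_env(["CUDA", "Basic"], base_env={})
print(env["POCL_DEVICES"])
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run` when launching the application.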

The OpenCL Tooling Task Sub Group (TSG) is actively contributing to the LLVM compiler infrastructure project and is determined to bring first-class support for OpenCL and SPIR-V to LLVM. The latest release of Clang brought the long-awaited support for the OpenCL 3.0 standard, the C++ for OpenCL 2021 kernel language, and a SPIR-V generation interface that uses the external llvm-spirv tool from the SPIRV-LLVM-Translator repository. Meanwhile, work on the native GlobalISel-based SPIR-V backend continues at full speed. SPIR-V updates and many other exciting changes in the SPIR-V and OpenCL world will be discussed in depth at the upcoming 2022 LLVM Developers’ Meeting.

In this EE Times Europe article, Neil Trevett describes how the need for graphics and compute acceleration in embedded markets is growing. Cameras and sensor arrays are increasingly central to many use cases in diverse industries, ranging from automotive to industrial, and are generating increasingly rich data streams that require sophisticated processing. At the same time, advanced user interfaces are being developed using high-quality 3D graphics and even augmented-reality technology. However, the need to deploy accelerated processing, combined with the complexities of safety-critical certification, has created a confusing landscape of processors, accelerators, compilers, APIs, and libraries. That has driven up integration costs for embedded accelerators, which in turn has constrained innovation and time-to-market efficiencies.

Open standards have an important role in helping hardware and software vendors navigate this complex technology environment. Acceleration standards for the embedded market can enable cross-platform software reusability, decouple software and hardware development for easier deployment and integration of new components, provide cross-generation reusability, and facilitate field upgradability. Such standards reduce costs, shorten time to market, and lower the barriers to using advanced techniques such as inferencing and vision acceleration in compelling real-world products.

Khronos Group President Neil Trevett shares how open standards play an important role in mitigating the complexities of safety-critical certification amid a confusing landscape of processors, accelerators, compilers, APIs, and libraries. That landscape has driven up integration costs for embedded accelerators, which in turn has constrained innovation and time-to-market efficiencies.

PoCL is a portable open source (MIT-licensed) implementation of the OpenCL standard. It likely supports the minimal v3.0 feature set (an official conformance stamp has not yet been applied for). In addition to being an easily portable multi-device (truly heterogeneous) open-source OpenCL implementation, a major goal of this project is improving interoperability for a diversity of OpenCL-capable devices by integrating them into a single centrally orchestrated platform. Another key goal is to enhance performance portability of OpenCL programs across device types using runtime and compiler techniques.

Upstream PoCL currently supports various CPUs, NVIDIA GPUs via libcuda and ASIPs (experimental, see: http://openasip.org). It is also known to have multiple (private) adaptations in active production use.

With the release of Portable Computing Language (PoCL) 3.0-RC1, there is now initial support for OpenCL 3.0 running on CPUs with LLVM 14+. Other major new features are LLVM/Clang 14 support, improved tracing, and scripts for converting traces into the Chromium trace visualizer format.
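To give a feel for what converting traces to the Chromium trace visualizer format involves, here is a minimal sketch that maps timed records to "complete" events (`"ph": "X"`) in the Trace Event JSON format. The input record layout (`name`, `queue`, `start_ns`, `end_ns`) and the function `to_chrome_trace` are hypothetical and are not PoCL's actual trace format or scripts.

```python
import json


def to_chrome_trace(records, pid=1):
    """Convert hypothetical trace records to Chromium Trace Event JSON data.

    Each record is assumed to be a dict with 'name', 'queue', 'start_ns'
    and 'end_ns' keys. Timestamps are converted from nanoseconds to the
    microseconds that the Trace Event format expects.
    """
    events = []
    for rec in records:
        events.append({
            "name": rec["name"],
            "cat": "opencl",
            "ph": "X",                      # complete event: start + duration
            "ts": rec["start_ns"] / 1000.0,
            "dur": (rec["end_ns"] - rec["start_ns"]) / 1000.0,
            "pid": pid,
            "tid": rec["queue"],            # one track per command queue
        })
    return {"traceEvents": events}


# Made-up sample records, for illustration only.
sample = [
    {"name": "write_buffer", "queue": 0, "start_ns": 0, "end_ns": 2000},
    {"name": "ndrange_kernel", "queue": 0, "start_ns": 2000, "end_ns": 5000},
]
print(json.dumps(to_chrome_trace(sample), indent=2))
```

Saving the printed JSON to a file should produce something that Chromium's trace viewer can load.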

OpenCL 3.0.11 adds two new extensions and continues the regular release cadence for specification bug fixes and clarifications. The cl_khr_subgroup_rotate extension enables an OpenCL kernel to rotate values among work-items in a subgroup for increased data exchange efficiency in many algorithms. The cl_khr_work_group_uniform_arithmetic extension enables an OpenCL kernel to use new work-group scan and reduction operators that can boost the performance of many use cases, and it is ideal for accelerating C++ scan and reduction functions in SYCL 2020 implementations targeting OpenCL as a backend.
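The semantics of these operations can be modeled on the host with plain lists, where index i stands for work-item i. The sketch below is a simulation only: the Python function names are hypothetical and are not the OpenCL C built-ins, and the rotate model assumes a fully populated subgroup in which each work-item receives the value held by the work-item `delta` positions ahead, wrapping around. The extension specifications remain the authority on exact behavior.

```python
from functools import reduce
from itertools import accumulate
import operator


def subgroup_rotate(values, delta):
    """Work-item i receives the value of work-item (i + delta) mod size."""
    n = len(values)
    return [values[(i + delta) % n] for i in range(n)]


def group_reduce(values, op):
    """Every work-item would receive this single combined result."""
    return reduce(op, values)


def group_scan_inclusive(values, op):
    """Work-item i receives op applied over values[0..i]."""
    return list(accumulate(values, op))


def group_scan_exclusive(values, op, identity):
    """Work-item i receives op applied over values[0..i-1]."""
    if not values:
        return []
    return list(accumulate(values[:-1], op, initial=identity))


vals = [1, 2, 3, 4]
print(subgroup_rotate(vals, 1))                        # [2, 3, 4, 1]
print(group_reduce(vals, operator.mul))                # 24
print(group_scan_inclusive(vals, operator.mul))        # [1, 2, 6, 24]
print(group_scan_exclusive(vals, operator.mul, 1))     # [1, 1, 2, 6]
```

On a device, each of these would be a single collective call executed by all work-items in the subgroup or work-group, rather than a loop over a list.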

Join us to help drive the evolution of Machine Learning acceleration standards. ML developers lament the growing fragmentation in the ML ecosystem. Khronos knows that open and royalty-free standards can play an essential role in reducing fragmentation, reducing costs, and giving industry participants the opportunity to grow. Based on feedback from the previous summit and discussions, Khronos is creating a coalition of interested parties to meet the needs of the ML community for hardware acceleration.

The release of the OpenCL 3.0 specification was a significant milestone for this open standard for low-level heterogeneous parallel programming, creating a pervasive baseline that can be cleanly extended with new functionality requested by developers. But a strong open standard ecosystem is much more than just an API document and Khronos is making significant investments to improve the OpenCL developer experience. Read on to discover the latest updates to the OpenCL SDK and what is coming on the SDK roadmap!

AUTOSAR and The Khronos Group have signed a Memorandum of Understanding (MoU) and entered into a collaboration liaison to foster synergy between the two organizations and to encourage standardization in the field of Automotive and Future Intelligent Mobility. This joint technical collaboration between AUTOSAR and Khronos is intended to coordinate common requirements and developments with a focus on accelerated graphics and computing in safety-critical markets.

Basis Universal’s v1.16 release focuses on smaller code size, fewer third-party dependencies (just Zstd), OpenCL support, faster ETC1S encoding, and fully multithreaded/parallel processing.

  • ETC1S encoding is now approximately 30% faster. We added more optimizations to the encoder’s backend and more SSE optimizations to the frontend.
  • Optional OpenCL support has been added to the ETC1S encoder.