This is a guest post by blu about his experience with OpenCL on a MacchiatoBin board with a quad core Cortex A72 processor and an Intel based MacBook. He previously contributed several technical articles such as How ARM Nerfed NEON Permute Instructions in ARMv8 or OpenGL ES development on Ubuntu Touch. Qualcomm launched their long-awaited server ARM chip the other day, and we started getting the first benchmarks. Incidentally, I too managed to get some OpenCL ray-tracing code running on an ARM Cortex-A72 machine that same day (thanks to pocl – an LLVM-based open-source OCL multi-platform implementation), so my benchmarking curiosity got the better of me. The code in question is a (half-finished) OCL port of a graphics demo from 2014. Some remarks on what it does: For each frame: a single thread builds a sparse voxel octree from a dynamic voxel scene; the octree, along with the current camera settings, is passed to an […]
Imagination PowerVR “Furian” Series8XT GT8525 GPU Targets High-end Smartphones, Virtual Reality and Automotive Products
Imagination Technologies has unveiled their first GPU based on the PowerVR Furian architecture, the Series8XT GT8525 GPU, equipped with two clusters and designed for SoCs going into products such as high-end smartphones and tablets, mid-range dedicated VR and AR devices, and mid- to high-end automotive infotainment and ADAS systems. The Furian architecture is said to allow for improvements in performance density, GPU efficiency, and system efficiency, features a new 32-wide ALU cluster design, and can be manufactured using sub-14nm processes (e.g. 7nm once available). PowerVR GT8525 GPU supports compute APIs such as OpenCL 2.0, Vulkan 1.0 and OpenVX 1.1. Compared to the previous Series7XT GPU family, Series8XT GT8525 GPU delivers 80% higher fps in Trex benchmark, an extra 50% fps in GFXbench Manhattan benchmark, 50% higher fps in Antutu, doubles the fillrate throughput for GUI, and increases GFLOPs for compute applications by over 50%. GT8525 GPU is available for licensing […]
Open Source ARM Compute Library Released with NEON and OpenCL Accelerated Functions for Computer Vision, Machine Learning
GPU compute promises to deliver much better performance than CPU compute for applications such as computer vision and machine learning, but the problem is that many developers may not have the right skills or time to leverage APIs such as OpenCL. So ARM decided to write their own ARM Compute library and has now released it under an MIT license. The functions found in the library include: basic arithmetic, mathematical, and binary operator functions; color manipulation (conversion, channel extraction, and more); convolution filters (Sobel, Gaussian, and more); Canny Edge, Harris corners, optical flow, and more; pyramids (such as Laplacians); HOG (Histogram of Oriented Gradients); SVM (Support Vector Machines); H/SGEMM (Half and Single precision General Matrix Multiply); Convolutional Neural Networks building blocks (Activation, Convolution, Fully connected, Locally connected, Normalization, Pooling, Soft-max). The library works on Linux, Android or bare metal on armv7a (32bit) or arm64-v8a (64bit) architecture, and makes use […]
ARM Introduces Bifrost Mali-G51 GPU, and Mali-V61 4K H.265 & VP9 Video Processing Unit
Back in May of this year, ARM unveiled the Mali-G71 GPU for premium devices, the company's first GPU based on the Bifrost architecture. The company has now introduced its second Bifrost GPU, Mali-G51, targeting augmented & virtual reality and the higher resolution screens to be found in mainstream devices in 2018, as well as the Mali-V61 VPU with 4K H.265 & VP9 video decode and encode capabilities, previously known under the codename “Egil”. Mali-G51 GPU: ARM Mali-G51 will be 60% more energy efficient, and have 60% more performance density compared to Mali-T830 GPU, making the new GPU the most efficient ARM GPU to date. It will also be 30% smaller, and support 1080p to 4K displays. Under the hood, Mali-G51 includes an updated Bifrost low-level instruction set, a dual-pixel shader core per GPU core to deliver twice the texel and pixel rates, features the latest ARM Frame Buffer Compression […]
PowerVR GT7200 Plus and GT7400 Plus GPUs Support OpenCL 2.0, Better Computer Vision Features
Imagination Technologies introduced the PowerVR Series7XT GPU family with up to 512 cores at the end of 2014, and at CES 2016 they announced the Series7XT Plus family with GT7200 Plus and GT7400 Plus GPUs. These keep many of the features of the Series7XT family, and add OpenCL 2.0 API support and improvements for computer vision: a new Image Processing Data Master, and support for 8-bit and 16-bit integer data paths instead of just 32-bit in the previous generation, leading for example to up to 4 times more performance for applications, e.g. deep learning, leveraging the OpenVX computer vision API. GT7200 Plus GPU features 64 ALU cores in two clusters, and GT7400 Plus 128 ALU cores in a quad-cluster configuration. Besides OpenCL 2.0 and improvements for computer vision, they still support OpenGL ES 3.2, Vulkan, hardware virtualization, advanced security, and more. The company has also made some microarchitectural enhancements to improve performance […]
Fujitsu MB86S70 and MB86S73 ARM Cortex A15 & A7 Processors Run Linux for the Embedded Market
I like to check the ARM Linux kernel mailing list from time to time, as you may discover a few upcoming ARM processors. This week I found out Exynos 5433 and Exynos 7 are actually two different processors (thanks David!), and that AMD had submitted code for their 64-bit ARM Opteron A1100 SoC for servers. I also noticed a patchset for Fujitsu MB86S7X SoCs, and since I don’t often mention Japanese silicon vendors, probably because they now deal mostly with the embedded market that gets very little press, and most information is in Japanese, I decided to have a look. There seem to be four SoC parts in the MB86S7x family: the MB86S70 quad core processor with two ARM Cortex A15 and two ARM Cortex A7 cores in big.LITTLE configuration, the MB86S73 with two ARM Cortex A7 cores only, as well as the MB86S71/72 with 2x A15 and 2x A7, with […]
Imagination Technologies Introduces PowerVR Series7 GPUs with Up to 512 Cores, Virtualization Support
Imagination Technologies has announced a new PowerVR Series7 GPU architecture that will be used in their high-end PowerVR Series7XT GPUs, delivering up to 1.5 TFLOPS for mid-range and high-end mobile devices, set-top boxes, gaming consoles and even servers, as well as their low-power, low-cost PowerVR Series7XE GPUs for entry-level mobile devices, set-top boxes, and wearables. PowerVR Series7 GPUs, both Series7XT and Series7XE, can achieve up to a 60% performance improvement over PowerVR Series6XT/6XE GPUs for a given configuration. For example, a 64-core PowerVR Series7XT GPU should be up to 60% faster than a 64-core PowerVR Series6XT clocked at the same frequency, with all the extra performance due to a different and improved architecture. Some of the Series7 architectural enhancements include: instruction set enhancements including added co-issue capability, resulting in improved application performance and increased GPU efficiency; a new hierarchical layout structure that enables scalable polygon throughput and pixel fillrate […]
Adapteva Announces Three Parallella Fanless Boards for Microserver, Desktop, and Embedded Applications
Adapteva’s Parallella low cost open source hardware “supercomputer” is a board powered by a Xilinx Zynq-7010/7020 dual core Cortex A9 + FPGA SoC and the company’s Epiphany coprocessor. It had a successful Kickstarter campaign in 2012, with the 16-core version selling for just $99, is capable of handling applications such as image and video processing and ray-tracing, and also comes with an OpenCL SDK. The board was fairly difficult to source after the crowdfunding campaign, and one common complaint from backers was that the board had to be actively cooled by a fan. The company has fixed both issues by slightly increasing the price, and redesigning the board so that it can be passively cooled by a larger heatsink. There are now three versions of the Parallella board: Parallella Microserver ($119) – used as an Ethernet-connected headless server; Parallella Desktop ($149) – used as a personal computer; Parallella […]