Open Source ARM Compute Library Released with NEON and OpenCL Accelerated Functions for Computer Vision, Machine Learning

GPU compute promises to deliver much better performance than CPU compute for applications such as computer vision and machine learning, but the problem is that many developers may not have the right skills or time to leverage APIs such as OpenCL. So ARM wrote its own ARM Compute Library and has now released it under an MIT license. The functions found in the library include:

- Basic arithmetic, mathematical, and binary operator functions
- Color manipulation (conversion, channel extraction, and more)
- Convolution filters (Sobel, Gaussian, and more)
- Canny Edge, Harris corners, optical flow, and more
- Pyramids (such as Laplacians)
- HOG (Histogram of Oriented Gradients)
- SVM (Support Vector Machines)
- H/SGEMM (Half and Single precision General Matrix Multiply)
- Convolutional Neural Networks building blocks (Activation, Convolution, Fully connected, Locally connected, Normalization, Pooling, Soft-max)

The library works on Linux, Android or bare metal on armv7a (32bit) or arm64-v8a (64bit) architecture, and makes use […]
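To make the H/SGEMM entry in that list concrete, here is a minimal pure-Python sketch of what a general matrix multiply computes, C = alpha * A * B + beta * C. It is only a reference for the arithmetic: it is not the ARM Compute Library API, the function name `gemm` is made up for this illustration, and an accelerated NEON or OpenCL version would be expected to produce the same result far faster.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for matrices stored as lists of rows.

    Illustrative reference only; not the ARM Compute Library interface.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("columns of a must match rows of b")
    if len(c) != rows or any(len(row) != cols for row in c):
        raise ValueError("c must have the shape of a @ b")
    return [
        [
            # Dot product of row i of a with column j of b, then scale and add.
            alpha * sum(a[i][k] * b[k][j] for k in range(inner)) + beta * c[i][j]
            for j in range(cols)
        ]
        for i in range(rows)
    ]


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    print(gemm(1.0, a, b, 0.0, zeros))
```

With alpha set to 1 and beta set to 0 this reduces to a plain matrix product, which is the form used by fully connected and convolution layers in neural networks.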

NVIDIA Introduces Jetson TX2 Embedded Artificial Intelligence Computer

NVIDIA has just announced an upgrade to their Jetson TX1 module, with Jetson TX2 “Embedded AI Computer” with Tegra X2 Parker SoC that either doubles the performance of its predecessor, or runs at more than twice the power efficiency, while drawing less than 7.5 watts of power. The company provided a comparison showing the differences between TX1 and TX2 modules.

| | Jetson TX2 | Jetson TX1 |
|---|---|---|
| GPU | NVIDIA Pascal, 256 CUDA cores | NVIDIA Maxwell, 256 CUDA cores |
| CPU | HMP Dual Denver 2/2 MB L2 + Quad ARM® A57/2 MB L2 | Quad ARM® A57/2 MB L2 |
| Video | 4K x 2K 60 Hz Encode (HEVC); 4K x 2K 60 Hz Decode (12-Bit Support) | 4K x 2K 30 Hz Encode (HEVC); 4K x 2K 60 Hz Decode (10-Bit Support) |
| Memory | 8 GB 128 bit LPDDR4 58.3 GB/s | 4 GB 64 bit LPDDR4 25.6 GB/s |

Display: 2x DSI, 2x DP 1.2 / HDMI 2.0 / […]
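The memory figures in NVIDIA's comparison can be sanity-checked with a little arithmetic, since peak bandwidth is roughly the bus width in bytes multiplied by the number of transfers per second. The sketch below assumes that GB/s means 10^9 bytes per second and that the quoted numbers are theoretical peaks; the function name is made up for this illustration.

```python
def implied_transfer_rate_mtps(bandwidth_gb_s, bus_width_bits):
    """Return the transfer rate (in MT/s) implied by a peak bandwidth and bus width.

    Assumes 1 GB/s = 1e9 bytes/s and one bus-wide transfer per data beat.
    """
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = bandwidth_gb_s * 1e9 / bytes_per_transfer
    return transfers_per_second / 1e6


# Figures taken from the TX2 vs TX1 comparison above.
tx2_rate = implied_transfer_rate_mtps(58.3, 128)
tx1_rate = implied_transfer_rate_mtps(25.6, 64)

if __name__ == "__main__":
    print(f"Jetson TX2: about {tx2_rate:.0f} MT/s")
    print(f"Jetson TX1: about {tx1_rate:.0f} MT/s")
    print(f"Bandwidth ratio TX2/TX1: {58.3 / 25.6:.2f}x")
```

Under those assumptions the TX2 figures imply about 3644 MT/s and the TX1 figures 3200 MT/s, so most of the bandwidth gain comes from the doubled bus width rather than from faster memory.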

Linux 4.10 Release – Main Changes, ARM & MIPS Architectures

Linus Torvalds has just released Linux 4.10:

“So there it is, the final 4.10 release. It’s been quiet since rc8, but we did end up fixing several small issues, so the extra week was all good. On the whole, 4.10 didn’t end up as small as it initially looked. After the huge release that was 4.9, I expected things to be pretty quiet, but it ended up very much a fairly average release by modern kernel standards. So we have about 13,000 commits (not counting merges – that would be another 1200+ commits if you count those). The work is all over, obviously – the shortlog below is just the changes in the last week, since rc8. Go out and verify that it’s all good, and I’ll obviously start pulling stuff for 4.11 on Monday. Linus”

Linux 4.9 added Greybus staging support, improved security thanks to virtually mapped kernel stacks, […]

Self-hosted OpenGL ES Development on Ubuntu Touch

Blu wrote “BQ Aquaris M10 Ubuntu Edition review – from a developer’s perspective” last year, and is now back with a new post explaining how to develop and deploy OpenGL ES applications directly on the Ubuntu Touch tablet.

Ever since I started using a BQ M10 for console apps development on the go I’ve been wanting to get something, well, flashier going on that tablet. Since I’m a graphics developer by trade and by heart, GLES was the next step on the Ubuntu Touch for me. This article is about writing, building and deploying GLES code on Ubuntu Touch itself, sans a desktop PC. Keep that in mind if some procedure seems unrefined or straight primitive to you – for one, I’m a primitive person, but some tools available on the desktop are, in my opinion, impractical on the Touch itself. That means no QtCreator today, nor Qt, for […]

Imagination PowerVR G6230 is the First GPU To Pass Khronos OpenVX 1.1 Conformance

The Khronos Group is the non-profit consortium behind open standards and APIs for graphics, media and parallel computation such as OpenGL for 3D graphics, OpenCL for GPGPU, OpenVG for 2D vector graphics, etc… OpenVX is one of their most recent open, royalty-free standards, and targets power optimized acceleration of computer vision applications such as face, body and gesture tracking, smart video surveillance, advanced driver assistance systems (ADAS), object and scene reconstruction, augmented reality, visual inspection, robotics and more. The first revision of the standard was released in 2014, and the latest OpenVX 1.1 revision was just released in May 2016. We’ve already seen OpenVX 1.1 support in Nvidia Jetson TX1 module & board, but Khronos has a conformance program to test implementations, and if successful, allow companies to use the logo and name of the API. The first GPU to pass OpenVX 1.1 conformance is Imagination Technologies PowerVR […]

ARM Introduces Bifrost Mali-G51 GPU, and Mali-V61 4K H.265 & VP9 Video Processing Unit

Back in May of this year, ARM unveiled Mali-G71 GPU for premium devices, the first GPU of the company based on Bifrost architecture. The company has now introduced the second Bifrost GPU with Mali-G51 targeting augmented & virtual reality and higher resolution screens to be found in mainstream devices in 2018, as well as Mali-V61 VPU with 4K H.265 & VP9 video decode and encode capabilities, previously known under the codename “Egil”.

Mali-G51 GPU

ARM Mali-G51 will be 60% more energy efficient, and have 60% more performance density compared to Mali-T830 GPU, making the new GPU the most efficient ARM GPU to date. It will also be 30% smaller, and support 1080p to 4K displays. Under the hood, Mali-G51 includes an updated Bifrost low-level instruction set, a dual-pixel shader core per GPU core to deliver twice the texel and pixel rates, features the latest ARM Frame Buffer Compression […]

This Video Shows Vulkan API’s Higher Power Efficiency Compared to OpenGL ES API on ARM SoCs

Vulkan was introduced as the successor of OpenGL ES in March 2015, promising to use fewer CPU resources, and to support multiple command buffers that can be created in parallel and distributed over several cores. The cost is slightly more complex application programming, since less software work is done inside the GPU drivers themselves, with app developers needing to handle memory allocation and thread management. This was just a standard at the time, so implementing Vulkan took some time, and work is still in progress, but ARM showcased the power efficiency of Vulkan over OpenGL ES in the video embedded at the end of this post. The demo has the same graphics details and performance using both OpenGL ES and Vulkan, but since the load on the CPU in that demo can be distributed over several CPU cores with Vulkan against a single core for OpenGL ES, […]
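The idea of building several command buffers in parallel and then submitting them together can be sketched without any graphics API at all. The toy below is not Vulkan code: `record_commands` and `build_frame` are hypothetical names, the "commands" are plain strings, and Python threads only illustrate the structure, not the CPU savings a native multi-threaded renderer would see.

```python
from concurrent.futures import ThreadPoolExecutor


def record_commands(object_ids):
    """Hypothetical stand-in for recording one command buffer on one thread."""
    return [f"draw(object={oid})" for oid in object_ids]


def build_frame(object_ids, workers):
    """Record one buffer per chunk of objects in parallel, then submit in order."""
    # Ceiling division so every object lands in exactly one chunk.
    chunk_size = max(1, -(-len(object_ids) // workers))
    chunks = [
        object_ids[start:start + chunk_size]
        for start in range(0, len(object_ids), chunk_size)
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, so draw order is preserved.
        buffers = list(pool.map(record_commands, chunks))
    # "Submission": concatenate the independently recorded buffers.
    return [command for buffer in buffers for command in buffer]


if __name__ == "__main__":
    for command in build_frame(list(range(5)), workers=2):
        print(command)
```

The point of the structure is that each worker touches only its own buffer, so no locking is needed while recording; only the final submission step is serialized.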

HiSilicon Kirin 960 Octa Core Application Processor Features ARM Cortex A73 & A53 Cores, Mali G71 MP8 GPU

Following on Kirin 950 processor found in Huawei Mate 8, P9, P9 Max & Honor 8 smartphones, HiSilicon has now unveiled Kirin 960 octa-core processor with four ARM Cortex A73 cores, four Cortex A53 low power cores, a Mali G71 MP8 GPU, and an LTE Cat.12 modem. The table below from Anandtech compares features and specifications of Kirin 950 against the new Kirin 960 processor.

| SoC | Kirin 950 | Kirin 960 |
|---|---|---|
| CPU | 4x Cortex A72 (2.3 GHz); 4x Cortex A53 (1.8 GHz) | 4x Cortex A73 (2.4 GHz); 4x Cortex A53 (1.8 GHz) |
| Memory Controller | LPDDR3-933 or LPDDR4-1333 (hybrid controller) | LPDDR4-1800 |
| GPU | ARM Mali-T880MP4 @ 900 MHz | ARM Mali-G71MP8 @ 900 MHz |
| Interconnect | ARM CCI-400 | ARM CCI-550 |
| Encode/Decode | 1080p H.264 Decode & Encode; 2160p30 HEVC Decode | 2160p30 HEVC & H.264 Decode & Encode; 2160p60 HEVC Decode |
| Camera/ISP | Dual 14bit ISP 940MP/s | Improved Dual 14bit ISP |
| Sensor Hub | i5 | i6 |
| Storage | eMMC 5.0 | UFS 2.1 |

Integrated Modem […]
