OpenCL (Open Computing Language) is a multi-vendor open standard for general-purpose parallel programming of heterogeneous systems that include CPUs, GPUs and other processors. OpenCL provides a uniform programming environment for software developers to write efficient, portable code for high-performance compute servers, desktop computer systems and handheld devices. The OpenCL standard is defined and managed by the Khronos Group. The latest version (OpenCL 1.1) was ratified by the Khronos Group on the 14th of June 2010 and adds significant functionality for enhanced parallel programming flexibility and performance, including: host-thread safety, enabling OpenCL commands to be enqueued from multiple host threads; sub-buffer objects to distribute regions of a buffer across multiple OpenCL devices; user events to enable enqueued OpenCL commands to wait on external events; event callbacks that can be used to enqueue new OpenCL commands based on event state changes in a non-blocking manner; 3-component vector data types; global work-offset which […]
Faster JPEG decoding on ARM with libjpeg-turbo and NEON Instructions
libjpeg-turbo is based on libjpeg, but uses SIMD instructions (MMX, SSE2, etc.) to accelerate JPEG compression and decompression on x86 targets. On such systems, libjpeg-turbo is generally 2-4x as fast as the original libjpeg on the same hardware. ARM does not support MMX or SSE2 instructions, but it has its own SIMD instructions, processed by the NEON engine on ARM Cortex-A5, A8, A9 and A15 cores. ARM claims that “NEON technology can accelerate multimedia and signal processing algorithms such as video encode/decode, 2D/3D graphics, gaming, audio and speech processing, image processing, telephony, and sound synthesis by at least 3x the performance of ARMv5 and at least 2x the performance of ARMv6 SIMD.” Linaro worked on libjpeg-turbo and added NEON support to it. The code is available on Launchpad at https://code.launchpad.net/~tom-gall/linaro/libjpeg-turbo. Linaro has also provided benchmark results for libjpeg-turbo with a 12 Mpixel image on TI OMAP4 (Pandaboard) using the […]
AMD G-Series based Ximea Currera-G Smart Camera running Linux
Ximea, a company based in Germany, announced a new version of its Currera Smart Camera: the Currera-G, based on the AMD G-Series APU. Here’s an excerpt of the press release:
The CURRERA-G Smart Camera from XIMEA GmbH, manufacturer of industrial, smart camera, and scientific imaging equipment, sets a new standard for machine vision smart camera processing power. The CURRERA-G Smart Camera houses a single-board-computer built around AMD’s new Fusion accelerated processing unit (APU), which combines the power of both CPU and GPU cores on a single die. “AMD’s Fusion processor means the CURRERA-G can deliver 90Gflops of processing power to tackle the toughest machine vision system applications,” says Vasant Desai, XIMEA Co-Founder and Managing Director. “Combining GPU cores on the same die as the CPU enables the heterogeneous system to offload computation intensive pixel data processing from the CPU to the GPU. Released from this task, the CPU can serve I/O […]
ARM TechCon 2011: Software & System Design Schedule
ARM Technology Conference (TechCon) 2011 will be held in Santa Clara on 25-27 October 2011. There will be many events and classes related to Chip Design and Software & System Design. The Software & System Design events will take place on the 26th and 27th of October 2011. Here’s the schedule for Software & System Design events for the 26th of October:

Time | Class | Track
11 am | The 2012 Compute Subsystem | Creating Smarter Systems
11 am | Practical Cortex Debugging: Serial Wire Viewer and ETM Tracing | Developing/Debugging
11 am | Integrating a CMOS Imaging Sensor into an ARM-Based Embedded Application | Human Interface Design
11 am | Embedded IPv6 – Now is the time | Networking & Connectivity
11 am | RSA & AES Libraries protected against side-channel attacks | Safety & Security
11 am | Introduction to the ARM Architecture | The Fundamentals of ARM
12 pm | Optimizing SoC development through a common design foundation | Creating Smarter Systems
[…]
Picture Size Optimization for Embedded Systems
If you are developing an embedded system that requires a graphical user interface, you’ll likely have quite a few icons and/or images to store in the flash/ROM. If your hardware has limited space, you may have to optimize the size of your pictures so that they fit into your flash with no or minimal loss of quality. Reducing image size may also be of interest for mobile websites that are accessed by devices with lower hardware specs and relatively low network throughput (EDGE/3G). I’ll use GIMP 2.6 (the GNU Image Manipulation Program) to work on pictures in order to optimize their size.
Selecting the picture format
The most common picture file formats are bmp, jpg, png and gif. The BMP file format (aka Bitmap Image File or Device Independent Bitmap) cannot compress images except for 8-bit color depth, so it is not suitable for embedded systems. JPG File […]
What is Augmented Reality? How to develop Augmented Reality applications?
We are starting to hear that many smartphones and tablet PCs feature Augmented Reality (AR). But what is it exactly? Are there any applications right now? How do you develop applications that make use of Augmented Reality?
Augmented Reality Definition
Ronald Azuma’s definition says that Augmented Reality: combines real and virtual; is interactive in real time; is registered in 3D.
Augmented Reality Applications
Augmented Reality usually involves capturing a real-life background (image or video), which then goes through image recognition; finally, some data is overlaid on top of this background. Here are some of the applications that are available today (commercially or experimentally): PSV Eindhoven has created an application to track offside at football matches. Augmented Reality business cards can be scanned and recognized, and a video presentation of the person and/or the company on the business card can be played back. Augmented Reality can help you find your way (and your holes) […]
Digital Signage: Implementing a smooth scrolling text
Many digital signage devices feature scrolling text. However, in many cases the scrolling text is not smooth, sometimes shows tearing, or is very slow. This may depend on the performance of the hardware used, but also on the implementation of the software. One easy way to implement scrolling text is to redraw the text again and again at different positions. However, this is very slow and yields poor results, unless perhaps you have a TrueType accelerator or similar hardware font accelerator. The next step is then to convert the text into pixmaps. This can be done either in the digital signage manager software (Windows PC, Mac or Linux based) or in the digital signage player. Doing so in the latter makes it much more flexible. So you may create 2 pixmaps whose width and height match the region to be displayed, write the text on those 2 pixmaps, then simply move […]
How to do a framebuffer screenshot
I’ll explain how to do framebuffer screenshots on 16-bit and 32-bit framebuffers. For 16-bit, this is fully based on http://docs.blackfin.uclinux.org/doku.php?id=framebuffer
Capturing screenshots
Whatever the bit depth of your framebuffer, the first step is to capture the framebuffer raw data on the board:
    cat /dev/fb0 > screen.raw
Now you need to take the raw image and convert it to a standard image format. This step depends on the type of display.
Converting 16-bit Framebuffer screenshot (RGB565) into png
To convert the raw RGB data extracted from /dev/fb0, use the iraw2png Perl script:
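    #!/usr/bin/perl -w

    # Width and height in pixels; default to 240x320 if not given.
    $w = shift || 240;
    $h = shift || 320;
    $pixels = $w * $h;

    # The PPM data is piped to pnmtopng, which writes the PNG to stdout.
    open OUT, "|pnmtopng" or die "Can't pipe pnmtopng: $!\n";

    # Binary PPM header: magic number, width, height, maximum sample value.
    printf OUT "P6 %d %d\n255\n", $w, $h;

    # Read one 16-bit RGB565 pixel at a time ('S' = native-endian unsigned short).
    while ((read STDIN, $raw, 2) and $pixels--) {
        $short = unpack('S', $raw);
        print OUT pack("C3",
            ($short & 0xf800) >> 8,   # red:   5 bits
            ($short & 0x7e0) >> 3,    # green: 6 bits
            ($short & 0x1f) << 3);    # blue:  5 bits
    }

    close OUT;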
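If Perl or pnmtopng is not available on the host, the same RGB565 unpacking can be sketched in Python. This is only an illustration of the bit manipulation performed by the script above, not part of the Blackfin instructions: the function names `rgb565_to_rgb888` and `write_ppm` are made up for this example, the pixels are assumed to be stored little-endian, and the output is a PPM file rather than a PNG.

```python
import struct


def rgb565_to_rgb888(raw: bytes, width: int, height: int) -> bytes:
    """Convert RGB565 pixels (assumed little-endian) to packed 8-bit RGB."""
    count = width * height
    if len(raw) < count * 2:
        raise ValueError("raw data is shorter than width * height pixels")
    out = bytearray()
    # "<H" forces little-endian 16-bit values; the Perl script uses the
    # host's native byte order instead.
    for (pixel,) in struct.iter_unpack("<H", raw[:count * 2]):
        out += bytes((
            (pixel & 0xF800) >> 8,  # 5 red bits moved to the top of a byte
            (pixel & 0x07E0) >> 3,  # 6 green bits moved to the top of a byte
            (pixel & 0x001F) << 3,  # 5 blue bits moved to the top of a byte
        ))
    return bytes(out)


def write_ppm(path: str, rgb: bytes, width: int, height: int) -> None:
    """Write packed RGB data as a binary PPM (P6) file."""
    with open(path, "wb") as f:
        f.write(b"P6\n%d %d\n255\n" % (width, height))
        f.write(rgb)
```

For a 640x480 capture, one would read screen.raw in binary mode, pass its contents to `rgb565_to_rgb888(data, 640, 480)`, and hand the result to `write_ppm("screen.ppm", rgb, 640, 480)`. The resulting PPM file can then be converted to PNG with any image tool, for example GIMP.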
To do the conversion, type the following command on the host:
    ./iraw2png 640 480 < screen.raw > screen.png
where 640 and 480 are respectively the width and height of your framebuffer. This has been tried on a 16-bit framebuffer on the EM8620 series.
Converting 32-bit Framebuffer screenshot (ARGB, RGBA, BGRA…) into png
The solution proposed here is not as neat as the Blackfin solution for 16-bit framebuffers, […]