A Web Page on

Embedded Computer Vision

for practitioners and researchers from academia and industry

News

5 Sept 2007: The CVPR 2008 CfP is out, paper deadline is November 26, 2007. We'll hopefully see another Embedded CV Workshop!
30 April 2007: The following PhD position has been taken :)
27 April 2007: 3 year PhD position available in Austria: Job Offer
14 February 2007: The Embedded Systems Conference India will be held the first week of October 2007 in Bangalore, India. Paper deadline: April 10, 2007.
24 January 2007: The IEEE Intl. Conf. on Distributed Smart Cameras will be held Sept 25-28, 2007, in Vienna, Austria. Paper deadline: May 21, 2007.
10 October 2006: We are experimenting with a Wiki page to replace this page. Please modify or add content as desired! (offline for now)
18 July 2006: The Workshop on Distributed Smart Cameras will be held in Boulder, CO, USA, on October 31, 2006. Deadline extended: 15 August 2006.
5 July 2006: Slides for the CVPR Tutorial are online now.
27 June 2006: Created this web page and the mailing list.
Please send me information on your lab, your company, and software/hardware products that we forgot to include on this page!!

Overview

This web page was created to continue the momentum of the Tutorial on Embedded Computer Vision, held in conjunction with IEEE CVPR 2006 on 17 June 2006 at NYU in New York City. The intention of this page is to serve as a one-stop shop for people interested in performing image processing and computer vision on dedicated hardware, including right on the camera chip. With a target audience of researchers, graduate students, and practitioners of computer vision, it provides tips and tools for beginners as well as information for the experienced.

Mailing List

The main means of communication for this community is the embeddedCV mailing list, hosted as a Google group. Anybody can read and search the archive. To post messages, you first have to subscribe to the group (this reduces the amount of spam). You can subscribe either via Google's web interface or by sending an email to embeddedCV-subscribe@googlegroups.com. See the help page on how to customize your group email options.

Tutorial Aftermath

Here are the tutorial slides. Some parts of the tutorial notes are embedded into this page. The remaining parts are Ryan Kastner's slides on Reconfigurable Computing Architectures (the FPGA hardware section of the tutorial) and information on the FPGA security research that we briefly mentioned: the RCsec project on reconfigurable security and the more general ArchLab. The DFT part will be available shortly.

Languages and Operating Systems for DSPs

DSPs are usually programmed in C at first, followed by machine-code optimization of the critical parts. A DSP rarely just executes one tiny program on an endless stream of rather uniform data; instead, it occasionally has to perform some general tasks as well. Thus, it is usually controlled by a DSP operating system (OS). A large number of companies offer an even larger number of them, frequently classified as real-time OSs. Linux is a common choice due to its flexibility, particularly on Systems-on-Chip (SoCs). Following are some of your choices of operating systems for DSPs and/or SoCs:

One of the main characteristics of these OSs is their small footprint, typically only one to tens of MB.

Further resources that we found helpful: the "Pocket Guide to Processors for DSP" at http://www.bdti.com/pocket/pocket.htm, and "The Scientist and Engineer's Guide to Digital Signal Processing" by Steven W. Smith at http://www.dspguide.com/, in particular Chapter 4: DSP Software.

FPGAs

Determining the function of each of the logic elements on a Field-Programmable Gate Array (FPGA) as well as their interplay and connectivity is called the logic design or digital design, the realm of hardware design engineers. This section will introduce the design flow and the traditional tools available to that end. We will also present tools and techniques that allow FPGA "programming" with a traditional, sequential language such as C, bypassing the need for software engineers to learn a hardware description language.

The primary building block of an FPGA is called a logic element, which usually consists of a lookup table (LUT), a flip-flop, and a few logic gates. The FPGA needs to be told how to configure and connect these elements, via its own equivalent of machine code. The problem at hand needs to be mapped to logic elements; this is called technology mapping. The netlist is the actual file that gets mapped and routed to a specific FPGA. There are many ways to accomplish this; we will describe a fairly comprehensive one, of which most other ways are a subset.
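To make the idea of a logic element more concrete, here is a minimal software sketch of one, assuming the simplest possible structure: a LUT whose truth table plays the role of the configuration bits, followed by a single flip-flop. The names `LogicElement` and `lut_from_function` are made up for this illustration and do not correspond to any vendor's architecture or tool.

```python
class LogicElement:
    """Toy model of an FPGA logic element: a k-input LUT feeding a flip-flop."""

    def __init__(self, truth_table):
        # truth_table[i] is the output bit for the input pattern whose binary
        # encoding is i (input 0 is the least significant bit).
        n = len(truth_table)
        if n == 0 or n & (n - 1):
            raise ValueError("truth table length must be a power of two")
        self.truth_table = list(truth_table)
        self.num_inputs = n.bit_length() - 1
        self.register = 0  # state of the flip-flop

    def lookup(self, inputs):
        """Combinational output of the LUT for a tuple of input bits."""
        if len(inputs) != self.num_inputs:
            raise ValueError("wrong number of inputs")
        index = sum(bit << pos for pos, bit in enumerate(inputs))
        return self.truth_table[index]

    def clock(self, inputs):
        """On a clock edge, latch the current LUT output into the flip-flop."""
        self.register = self.lookup(inputs)
        return self.register


def lut_from_function(func, num_inputs):
    """Derive the truth table (the 'configuration bits') of a Boolean function."""
    table = []
    for index in range(2 ** num_inputs):
        bits = tuple((index >> pos) & 1 for pos in range(num_inputs))
        table.append(int(bool(func(*bits))))
    return table


# The same kind of element becomes a different circuit purely through its table.
xor4 = LogicElement(lut_from_function(lambda a, b, c, d: a ^ b ^ c ^ d, 4))
and2 = LogicElement(lut_from_function(lambda a, b: a & b, 2))
```

The point of the sketch is that "programming" the element means nothing more than filling in its table: a 4-input LUT can realize any Boolean function of four inputs with 16 configuration bits.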

The (logic) synthesis translates the design into a so-called netlist. This is usually a heavily optimizing synthesis that not only optimizes the HDL with traditional compiler means, but also makes use of hard macros, chip specifics, etc. The netlist is usually in the Electronic Design Interchange Format, or EDIF. This entire process is also called FPGA technology mapping. After creation, a netlist can be simulated for debugging purposes in order to detect things like clock skew and effects caused by gate delays.
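Conceptually, a netlist is a list of component instances together with the nets (wires) that connect them. The following sketch assumes a made-up in-memory representation and a three-cell library (`CELLS`, `FULL_ADDER`, and `simulate` are hypothetical names); it is not EDIF, and it performs only a functional simulation that ignores the timing effects mentioned above, such as gate delays and clock skew.

```python
# Hypothetical cell library: cell type -> Boolean function.
CELLS = {
    "AND2": lambda a, b: a & b,
    "OR2": lambda a, b: a | b,
    "XOR2": lambda a, b: a ^ b,
}

# Netlist of a 1-bit full adder.
# Each entry: (instance name, cell type, input nets, output net).
FULL_ADDER = [
    ("u1", "XOR2", ("a", "b"), "n1"),
    ("u2", "XOR2", ("n1", "cin"), "sum"),
    ("u3", "AND2", ("a", "b"), "n2"),
    ("u4", "AND2", ("n1", "cin"), "n3"),
    ("u5", "OR2", ("n2", "n3"), "cout"),
]


def simulate(netlist, primary_inputs):
    """Functionally evaluate a combinational netlist; returns all net values."""
    nets = dict(primary_inputs)
    pending = list(netlist)
    while pending:
        remaining = []
        for name, cell, inputs, output in pending:
            if all(net in nets for net in inputs):
                # All inputs are known, so this instance can be evaluated.
                nets[output] = CELLS[cell](*(nets[net] for net in inputs))
            else:
                remaining.append((name, cell, inputs, output))
        if len(remaining) == len(pending):
            raise ValueError("undriven net or combinational loop")
        pending = remaining
    return nets


result = simulate(FULL_ADDER, {"a": 1, "b": 1, "cin": 0})
```

Here `result["sum"]` and `result["cout"]` hold the adder outputs for the given inputs. A real netlist simulator would additionally model delays per cell and per connection.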

Next, the netlist needs to be fitted to the target FPGA. This process is called place-and-route. First, the netlist's component references (registers, logic, macros) are assigned to FPGA locations; then the connections between them are routed through the FPGA's interconnect. This is a manufacturer-dependent process, as FPGA layouts vary considerably. The result is an FPGA bitmap (more commonly called a bitstream), which is sent to the FPGA by one of several means, the most popular being JTAG.

For program-once type FPGAs (antifuse), this needs to - and can - be done only once. Actel Corp. offers these FPGAs, which stay programmed for a lifetime thereafter. Flash-type FPGAs are also non-volatile, but they can be erased and re-programmed many times. Non-volatile FPGAs also have a zero startup time, whereas an SRAM-type FPGA or a DSP needs to be initialized from ROM and/or flash memory, or possibly even dynamically load a program.

Hardware Description Languages

A Hardware Description Language, or HDL, describes the behavior of a logic module, in our case the behavior of an entire chip. Examples are VHDL, Verilog, PALASM, CUPL, and ABEL, with the first two being by far the most popular languages.

VHDL is an acronym which stands for VHSIC Hardware Description Language. VHSIC is yet another acronym which stands for Very High Speed Integrated Circuits. VHDL is the more strictly structured and strongly typed language, with a syntax derived from Ada. Verilog has a looser, more concise syntax that resembles C.

The FPGA programming process as described in the previous section requires a bewildering array of steps. To make things easier, many companies offer products that combine all necessary tools and a host of extra tools for FPGA programming into one software package. This is referred to as electronic design automation (EDA). As HDLs are very low-level languages, there are also a number of efforts and products that try to automate HDL creation from a (possibly sequential) program written in a higher-level language. The following is a list of EDA tools and higher-level languages:

One essential component that EDA tools provide is a set of hardware libraries or hardware macros, containing the building blocks for common functionalities like memory, signals, multipliers, and all the way up to entire DSP and GPP soft cores, which are to be created out of generic FPGA logic elements.

Depending on the level of sophistication of the higher-level tools, they might directly transform sequential C code into HDL, performing such optimizations as pipelining, parallelization, and loop unrolling.
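As a rough illustration of one of these optimizations, the sketch below shows what loop unrolling does to a simple multiply-accumulate loop. It is written in Python purely for readability (a real tool would start from C and emit HDL), and the function names are made up for this example. The unrolled version keeps four independent accumulators, so the four multiply-accumulates of one iteration do not depend on each other and could be mapped to four parallel hardware multipliers.

```python
def dot_rolled(xs, ws):
    """Sequential reference: one multiply-accumulate per loop iteration."""
    if len(xs) != len(ws):
        raise ValueError("inputs must have the same length")
    acc = 0
    for i in range(len(xs)):
        acc += xs[i] * ws[i]
    return acc


def dot_unrolled4(xs, ws):
    """Same computation, unrolled by four with independent accumulators."""
    if len(xs) != len(ws):
        raise ValueError("inputs must have the same length")
    n = len(xs)
    main = n - n % 4  # largest multiple of four not exceeding n
    acc0 = acc1 = acc2 = acc3 = 0
    for i in range(0, main, 4):
        # These four statements are mutually independent.
        acc0 += xs[i] * ws[i]
        acc1 += xs[i + 1] * ws[i + 1]
        acc2 += xs[i + 2] * ws[i + 2]
        acc3 += xs[i + 3] * ws[i + 3]
    tail = 0
    for i in range(main, n):  # leftover elements when n is not a multiple of 4
        tail += xs[i] * ws[i]
    # Combine the partial sums as a small adder tree.
    return (acc0 + acc1) + (acc2 + acc3) + tail


pixels = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
weights = [1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
```

For integer data, `dot_unrolled4(pixels, weights)` returns the same value as `dot_rolled(pixels, weights)`. With floating-point data the results can differ slightly, because the additions are performed in a different order.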

Image Processing and Computer Vision Tools

hardware:

software:

Suitability of Vision Algorithms

Some image processing and computer vision methods are better suited to execution on a DSP or FPGA than others, for different reasons. Following is a look at the theoretical, potential speedup of common algorithms when implemented on a DSP or FPGA. In general, these are properties that make an algorithm a good candidate to be executed in special hardware:

Single pixel operations perform the same calculation on every pixel. They are mutually independent and can thus be easily parallelized or fed into a pipeline. Single pixel operations on multiple input images are to be treated almost identically. However, memory access to multiple locations at the same time might require careful memory management of the source images. Examples of those functions are arithmetic and logic operations, including addition, multiplication, comparison, and pixel-level masking. Summing all pixel values, counting non-zero pixels, and finding the average, standard deviation, minimum, maximum, etc. are also single-pass operations, but in contrast to single pixel operations, their result depends on all pixels.
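The two classes of operations can be sketched as follows, assuming images are stored as plain lists of rows; `pixelwise` and `single_pass_stats` are names chosen for this example only. In the first function every output pixel depends only on the input pixels at the same position, which is what makes it trivially parallel. The second function shows how several global statistics can be accumulated during a single raster scan, using running sums for the mean and standard deviation.

```python
import math


def pixelwise(op, *images):
    """Apply op independently at every pixel position of one or more images."""
    if not images:
        raise ValueError("at least one image is required")
    height, width = len(images[0]), len(images[0][0])
    for image in images:
        if len(image) != height or any(len(row) != width for row in image):
            raise ValueError("all images must have the same size")
    return [
        [op(*(image[y][x] for image in images)) for x in range(width)]
        for y in range(height)
    ]


def single_pass_stats(image):
    """Accumulate global statistics in one scan; each result depends on all pixels."""
    count = nonzero = total = total_sq = 0
    lo = hi = None
    for row in image:
        for p in row:
            count += 1
            nonzero += 1 if p != 0 else 0
            total += p
            total_sq += p * p
            lo = p if lo is None or p < lo else lo
            hi = p if hi is None or p > hi else hi
    if count == 0:
        raise ValueError("image is empty")
    mean = total / count
    variance = total_sq / count - mean * mean  # population variance
    return {
        "sum": total,
        "nonzero": nonzero,
        "mean": mean,
        "std": math.sqrt(max(variance, 0.0)),
        "min": lo,
        "max": hi,
    }


a = [[1, 2], [3, 4]]
b = [[10, 0], [0, 10]]
added = pixelwise(lambda p, q: p + q, a, b)              # two-image arithmetic
masked = pixelwise(lambda p, m: p if m else 0, a, b)     # pixel-level masking
stats = single_pass_stats(a)
```

The running-sum formula for the variance is convenient for streaming hardware because it needs only two accumulators, but it can lose precision for large images with floating-point pixels.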

Methods to the left are generally more suitable to DSP and FPGA implementation than methods further to the right in this table.

Labs and Companies

Please send me your info!

Other Resources

This is a list of additional resources, some of which are not included in the text above.

People

Tutorial organizers: Branislav Kisacanin and Mathias Kölsch. Helped with tutorial preparation: Mark Belding, Andreas Doblander, Ryan Kastner, Vladimir Pavlovic, Bernhard Rinner, and Tim Sherwood.