
Elphel camera under the hood: from Verilog to PHP

Feb 6, 2009 — by Henry Kingman — from the LinuxDevices Archive

Foreword: This paper describes the recent imaging advances by Elphel, a supplier of open source (hardware and software) cameras to customers that include Google (for select Street View and book scanning projects). It should interest imaging engineers, fans of open source, and those curious about open source hardware.

The paper was written by Dr. Andrey Filippov, the Russian physicist who founded Elphel in 2001. Filippov has contributed many papers to LinuxDevices through the years, as regular readers will be aware.

This paper begins with some background on Elphel and Filippov's decision to build a company around open source. It then describes the evolving open source software stack used in Elphel's cameras, “from Verilog to PHP.”

Along the way, Filippov painstakingly explains recent work done to improve the data pathway in the camera. As a result of the improvements, the newest Elphel cameras are less prone to dropping frames when image parameters are adjusted, he says. And, though very low-powered, the cameras are said to capture 5Mpix images (2592×1936) at 15fps, or 2MPix images (1920×1088) at 30fps.

It gets better after that, as Filippov includes a fairly mind-boggling description of the image compression algorithms used in Elphel cameras. He provides instructions for calculating the number of electrons each of your camera sensor's pixels can hold before welling over, and then provides a JavaScript calculator that uses that “full well capacity” (FWC) figure, along with several other specifications, to determine the effective bit depth of your sensor. Based on the effective depth, an encoding strategy can be worked out.


Enjoy . . . !

Elphel camera under the hood: from Verilog to PHP

by Andrey N. Filippov (Feb. 6, 2009)


When I started Elphel in 2001, I was inspired by the effectiveness of the FOSS projects developed by my colleagues. I bought into the idea of reusing whatever others had already done before me, and only having to develop the new things. The same “Standing on the Shoulders of Giants” approach that was already proven in science is probably the most powerful instrument of the human mind.

I immediately wanted to apply this approach, which had already gained momentum in the software world, to the hardware devices that I had been developing through most of my career. I also considered it to be a good business opportunity. As a user of camera systems, I had often encountered the problem that off-the-shelf products come very close to one's requirements, but rarely match them exactly. And it is usually difficult, if not impossible, to make that last step to modify or improve an otherwise good product.

And yes, it did work even better than I expected. In just about three months, I had developed the first-generation model 303 camera running GNU/Linux. I borrowed functionality from the operating system and other free software included in the Axis Communications SDK. I also ported some applications to the CRIS architecture myself (the camera used an Axis ETRAX system-on-chip processor). Of course, some code still had to be written, but only the code specific to the brand new hardware that nobody but me had ever used before. This newborn camera already had more functionality than I could have built in a year of developing a ground-up, in-house system.

As a hardware manufacturer, we have an advantage over pure software projects, in that we can make a profit from selling tangible goods, but there is another side to that story. Hardware manufacturers also have a smaller developer community to call upon, because an active developer has to have actual hardware, which is more expensive to “duplicate” than software. We are also developing new products and providing support to existing customers. This, and the fact that the freedom (and therefore the flexibility and possibility to customize) of our designs is itself a value for our customers, allows us to keep all our hardware designs free. They are provided under either GNU GPL or GNU FDL licenses. Early in 2004, we started a project at SourceForge, and since then it has served as the Elphel software repository, both for distribution to our users and for team collaboration. All the Elphel software and FPGA code is released under GNU GPL v3.0, and the same code we use internally is available for download from the CVS repository for any GNU GPL compliant use.

From the very beginning, we were trying to make developer-friendly, customizable products, and we experimented with different technologies (see end of article for links to Elphel product stories.) Most of our experiments proved the viability of the technologies used. The camera project was growing incrementally, accommodating new sensors and other electronic components, adding more features to the FPGA and the software code. In parallel, we were gaining experience in designing network cameras that combined the power of free software running on a general purpose CPU with the performance and flexibility of the “hardware” processing implemented in the reconfigurable FPGA using the GPL-ed code written in Verilog HDL. As the project grew, I was tempted to redesign the core software, and last year we finally re-wrote the key parts of the FPGA code and released version 8.0 of the camera software.

Camera hardware architecture

The first camera (model 303) developed at Elphel in 2001 had a stable architecture, and was made more so when we moved to the Xilinx reconfigurable FPGA a year later. Designed to accommodate a variety of sensors and extension boards, the camera is based on a main system board that is essentially a universal computer running GNU/Linux, with the addition of the FPGA and a dedicated video buffer memory. The FPGA is used for image processing and compression (which it does much faster than could a CPU). It is also used for interfacing sensors and daughter boards, simplifying support of devices that were not yet available at the time when the system boards were manufactured. The block diagram below shows a typical Elphel camera, including camera processor board (10353), CMOS sensor front end (10338), and optional I/O extension board (10369) with adapter boards (103691,103692). Other configurations may include CCD sensor front ends, mechanical shutter control, and additional FPGA processing/multiplexer boards. Either a universal I/O extension board or smaller sync boards can be used to lock (synchronize) multiple cameras for stereo or panoramic applications.

Elphel model 353 camera: hardware and FPGA layers

The diagram above shows camera main hardware components, I/O ports, and internal connections between the boards. It also includes details about the major processing modules implemented in the FPGA — the intersection of the hardware and software domains.

Code layers in the camera

The camera's software spans multiple layers of the system hierarchy, and combines code developed at Elphel with other free software. We use the KDevelop IDE to navigate through all this code, as well as to compile and build drivers and applications, launch FPGA code simulation, and generate the camera flash image. Only the “silicon compilation” (the generation of the FPGA configuration file from the source code) is handled by non-open-source software, Xilinx WebPack. That software, however, is free to download.

  • Verilog HDL code in the FPGA performs most of the computationally intensive tasks in the camera, such as the image processing/compression operations that produce a compressed bitstream ready to be sent over the network or recorded to storage media. The compiled FPGA images are conveniently loaded by the CPU through its GPIO pins connected to the FPGA's JTAG port. This can be done repeatedly, making it easy to add 'printf' equivalents to the code while troubleshooting. Unlike CPU code, Verilog statements are compiled to different physical parts of the chip and are executed in parallel. As a result, there is virtually no performance penalty when adding more code, as long as there are some FPGA resources left. The FPGA code directly interfaces with the sensor front ends, usually through the I2C bus for sensor commands and a parallel data input for images, and also interfaces with any optional extension boards. A multichannel external DDR SDRAM controller is included for the dedicated video buffer memory. The FPGA shares the system bus with the CPU and communicates with the CPU software in PIO mode (to read/write control registers and write table data) as well as in DMA mode, for transferring compressed frames to system memory. Version 8.0 adds support for scheduling internal register writes as well as sensor I2C commands, so that they are activated for the specified frame.
  • Kernel drivers make up the next software layer in the camera, and the lowest layer of the CPU code. The kernel drivers supplement the standard network, IDE, and USB drivers, and provide the software interface to the FPGA modules for gamma correction, histograms, color converter, and compressor, as well as to external, FPGA-connected devices, such as the image sensor. Because the camera is designed to stream or record video that can run at high frame rates, the related drivers are designed to ease the real-time requirements placed on the application layer software. Such drivers rely on the interrupts generated by the FPGA code for each frame transferred from the sensor and/or compressed and stored in system memory. The software makes use of the “hardware” command queues (Multiframe I2C sequencer, Multiframe register write sequencer) implemented in the FPGA, as well as a large output buffer for the compressed video. Such command/data buffering facilitates camera control and video data handling by the CPU software and its synchronization with the FPGA's stream processing, making applications tolerant of pauses caused by I/O events (e.g. waiting for a mass storage device) or just long calculations.
  • The application layer consists of standard programs such as the BusyBox collection and web, FTP, telnet, and SSH servers. There are also camera-specific applications, such as:

    • imgsrv: a fast image server that avoids extra memory copying and serves JPEG (including multi-part) images directly from the circular video buffer in response to HTTP GET requests. It also implements buffer navigation (the buffer holds several seconds of full frame rate video) and browsing of Exif image metadata (including GPS and compass data if those devices are attached).
    • camogm: an application for recording video (and audio, if a USB audio adapter is available) to any of the attached mass storage devices, including CompactFlash cards or SATA (or 1.8″ ZIF) HDDs. It uses the system circular video buffer and makes sure that no frames are dropped, as might otherwise happen when closing and opening files in a sequence. The program runs as a daemon that accepts commands over a named pipe, and can be integrated into web applications such as camogmgui.
    • str: a unicast/multicast RTSP video server capable of streaming full sensor frame video. It is used because plain RTP has a limit of 2040 pixels for image width or height. The stream can be rendered with media players such as MPlayer or VLC, provided they use a current live555 library that includes the improvements that overcome earlier restrictions on frame size. The streamer can work in “nice” mode when used as a viewfinder while video is recorded with camogm, skipping frames to prevent recorded frame drops if CPU resources are insufficient. It is designed to tolerate short-term frame size changes, making it possible to reprogram the sensor and acquire a full resolution snapshot while simultaneously streaming lower resolution (but higher frame rate) video.
    • autoexposure: a daemon running in the camera that uses histograms calculated by the FPGA code inside a specified sub-window of the image frame to automatically adjust sensor exposure time and white balance. It has multiple programmable parameters to achieve flexible regulation suitable for a particular application. Most of these parameters can be adjusted through the camvc AJAX web GUI.

  • Web applications and scripts. As network devices, Elphel cameras have always relied on a web interface. An interface such as the one described in “AJAX, LAMP, and liveDVD for a Linux-based camera” became possible only after the last system upgrade, to the model 353. (The camera's timeline is available on Elphel Wiki.) That upgrade included a faster CPU (ETRAX FS) and 64MB of system memory. To use a scripting language as the top layer of the camera software, we replaced the Boa web server that was provided in the Axis software development kit (SDK) with Lighttpd, which is known to work nicely with FastCGI, which we needed to run PHP efficiently. This technology keeps several copies of the PHP interpreter running, ready to serve HTTP requests, so the interpreter (over 2MB in the camera) does not need to be restarted for each new request. Looking for a scripting language that was both easy to use and efficient, we decided on PHP (Tiobe Index).

    PHP is well documented on its website, which includes user-provided, real-life examples. PHP offers an abundance of dedicated functions, integrated with the program itself or offered as extensions, that handle many programming tasks in a single call. At Elphel, we also added our own camera-specific extension functions. We started our PHP journey with a nice online tutorial by Sara Golemon, “Extension Writing Part I: Introduction to PHP and Zend.” She also wrote a book, “Extending and Embedding PHP.”

    Both the hardware (FPGA Verilog code) and drivers may change in the future, so we gave particular consideration to synchronization between different software layers. Parameter definitions in the driver code are exported as PHP constants, and their symbolic names are available throughout the extension functions. In addition to the individual names of the 32-bit parameters defined in the driver header file, extensions handle composite names that include optional offset in the register file (i.e. hardware-specific CMOS sensor registers) and bit-field selection.

    There are multiple scripts installed that are accessible through the lighttpd web server (with FastCGI), including those designed to be part of AJAX applications. Others provide access to the camera hardware, and many are intended for development, such as creating control interfaces, fine-tuning driver parameters, monitoring internal data structures, and profiling the interrupt service routine. It is easy to transfer custom scripts to the camera, either to the camera RAM disk (tmpfs) for safe experimentation or to the camera flash memory.

    PHP can also be used in CLI mode, powering multiple hardware initialization scripts such as for reading I2C devices, discovery of the FPGA add-on boards via JTAG, preparation of the Exif templates for the images or video that match camera hardware capabilities, and optional peripherals, such as a GPS module. In this capacity, such scripts partially replace regular shell scripts, which are more difficult to write and modify for many developers. I realize that shell scripts can be smaller and more efficient than PHP code, but for me, it requires too much time reading manuals and Googling for examples.

New features of release 8.0 of the camera software

Over the years, the camera's software, including the FPGA code, has been modified to survive multiple upgrades of all of the major hardware components, including CPU, FPGA, memories, and Ethernet PHY. Last year, however, we implemented several major changes to the code running in the cameras. First, we separated the reading of the video circular buffer from the control of video frame acquisition into that buffer, which is subject to hardware-dependent conditions and frame latencies. That was the first step toward supporting pipelined operation of the camera modules. Several applications were created that made use of the new operation of the video buffer. Next, we redesigned the frame control/acquisition part, which, together with the additional modes of the color processing, made up the core of release 8.0.

  • Camera control tailored to the pipeline operation

    The driver in the earlier software supported two sets of acquisition parameters: imageParamsW[ ], the parameter values requested by applications, and imageParamsR[ ], an array of parameters currently used by the driver. When an application such as ccam.cgi (a CGI program called from the web server) wanted to change an acquisition parameter, it changed elements of the imageParamsW[ ] array and then called a driver function to program the sensor and/or compressor. Most parameters had several frames of latency, from the moment the parameter was changed until the influenced image was registered in system memory, so the function had to use interrupts, and the application had to wait for the requested image to become ready.

    Such a model worked well when everything was programmed once and only a few new images (or frames in a video stream) were acquired with the same settings. It was possible, for example, to change some parameters on the fly, such as exposure or analog gains that did not disturb the following pipeline of the camera data processing:

    sensor registers setup ->
    sensor video frame out ->
    FPGA image preprocessing ->
    video frame buffer ->
    FPGA compressor ->
    DMA ->
    system memory ->
    frame output

    Most other parameters, however, even the sensor window of interest (WOI) vertical pan, often caused the sensor to output a corrupted frame, while also interrupting the pipeline operation. The software could not handle the propagation of related changes through the pipeline, as it was only aware of the “requested” and “actual” values pair for each parameter. The real-life change of the parameters involved modification to the sensor register, FPGA pre-processor height, video buffer read- and write controllers, compressor number of tiles, and output JPEG image height (in the header). All these changes happened at different times, with different frame latencies. The software learned how to handle some specific cases, but often, the only way to change some of the acquisition parameters was to shut everything down, reset the FPGA memory controller and compressor, reprogram them, and re-start the pipeline operation. That caused multi-frame interruptions, and the output video buffer also had to be reset, which influenced all the applications that relied on that buffering.

    Release 8.0 solves these problems by supporting pipelined operation of the camera hardware and FPGA modules in order to eliminate the need for acquisition restart, as well as minimize the number of the lost frames. Instead of having only two values per parameter, release 8.0 maintains individual parameters for each frame. They are set by applications in advance, so the driver has time to resolve possible parameter dependencies, as well as compensate for the latencies in the various pipeline stages. Only eight parameter values are maintained by the driver — six frames in the future, and the current and previous frame. A subset of the parameters is copied to the deeper buffer, so that data is preserved longer after the particular frame acquisition.
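The per-frame bookkeeping described above can be pictured as a small ring of parameter sets. The following is only a minimal sketch, not the Elphel driver: the class name `FrameParams`, its methods, and the rule that a written value persists for later frames are all assumptions made for illustration. Only the window size (six future frames plus the current and previous one) comes from the text.

```python
FRAME_SLOTS = 8   # six future frames + current + previous, as in the text
MAX_AHEAD = 6     # how far into the future a frame may be addressed


class FrameParams:
    """Hypothetical per-frame parameter store (not the Elphel driver API)."""

    def __init__(self, defaults):
        self.current = 0
        self.slots = [dict(defaults) for _ in range(FRAME_SLOTS)]

    def _slot(self, frame):
        # Frames map onto the ring by their number modulo the ring size.
        return self.slots[frame % FRAME_SLOTS]

    def write(self, frame, **params):
        """Set parameters for `frame`; assumed to persist for later frames."""
        if not self.current <= frame <= self.current + MAX_AHEAD:
            raise ValueError("frame is outside the scheduling window")
        for f in range(frame, self.current + MAX_AHEAD + 1):
            self._slot(f).update(params)

    def read(self, frame):
        """Return a copy of the parameters stored for `frame`."""
        if not self.current - 1 <= frame <= self.current + MAX_AHEAD:
            raise ValueError("frame is no longer (or not yet) stored")
        return dict(self._slot(frame))

    def next_frame(self):
        """Advance one frame; the oldest slot is recycled for the newest frame."""
        self.current += 1
        newest = self.current + MAX_AHEAD
        self.slots[newest % FRAME_SLOTS] = dict(self._slot(newest - 1))
```

For example, writing `exposure=200` for frame 3 while frame 0 is current leaves frames 0 to 2 untouched, and the value is carried into each new slot as the window advances.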

    Applications now write acquisition parameters for the specified frame through the driver write() calls. Such calls are “atomic,” making sure that all the parameters written are applied to the same frame. Many of these parameter modifications involve scheduling actions that are evaluated by the driver when serving frame interrupts. The driver uses parameter action tables for such scheduling. These tables are compiled into the driver, but can also be edited at runtime through the web interface. This convenient feature makes it possible to adjust the driver's behavior to accommodate new hardware. Each action has a related latency: the number of frames between when the related commands are sent to the sensor (or applied to the FPGA modules) and when the first frame influenced by this action appears in the output circular video buffer.

    The action latency tables, which differ between free-running and externally synchronized sensors, are provided in the same way as the action tables: as compiled-in constants that can also be edited at runtime through web interface parameters.

    There are 32 actions implemented in the software, so a single 32-bit word can specify all actions needed for the frame. Each action includes an image sensor-agnostic function that deals only with the FPGA and software that is common for every sensor front end. Such a function is optionally followed by a sensor-specific one that is dynamically linked during sensor identification, so only the sensor-specific functions need to be added for each new sensor front end.
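As an illustration of how 32 actions fit into one word, here is a minimal sketch. The function names, the handler dictionaries, and the lowest-bit-first execution order are assumptions for the example; the text only states that there are 32 actions and that a common function is optionally followed by a sensor-specific one.

```python
NUM_ACTIONS = 32  # from the text: one bit per action in a 32-bit word


def make_mask(*indices):
    """Combine action indices (0..31) into a single 32-bit action word."""
    mask = 0
    for i in indices:
        if not 0 <= i < NUM_ACTIONS:
            raise ValueError("action index out of range")
        mask |= 1 << i
    return mask


def run_actions(mask, common, sensor_specific):
    """Call the common handler for every set bit, then the optional
    sensor-specific handler, scanning bits from lowest to highest
    (the ordering is an assumption of this sketch)."""
    results = []
    for i in range(NUM_ACTIONS):
        if mask & (1 << i):
            results.append(common[i]())
            if i in sensor_specific:
                results.append(sensor_specific[i]())
    return results


# Hypothetical handlers: every action has a common part, and only
# action 5 has a sensor-specific part registered for this sensor.
common = {i: (lambda i=i: f"common:{i}") for i in range(NUM_ACTIONS)}
sensor_specific = {5: lambda: "sensor:5"}
```

Supporting a new sensor front end in this picture means filling in only the `sensor_specific` dictionary; the common handlers stay the same.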

    Scheduled actions are processed by the driver during once-per-frame interrupts, at least the action's latency (in frames) before the target frame, that is, the frame that is to appear at the output with the new parameters applied. Additional programmable parameters specify how many frames ahead of the last possible frame the driver is allowed to execute the actions. Doing this ahead of time makes the driver tolerant of missed interrupts, which could happen at very high FPS with a small WOI and many parameter changes applied to the same frame. The driver relies on the hardware command queues for this. Each action involves validation of the input parameters and their modification if they are not compatible with the hardware capabilities or other imposed restrictions. In that case, the modified parameters can also trigger more actions. Parameter validation is normally followed by the calculation of the hardware register values that need to be sent to the sensor and FPGA modules, and these values are finally written to one of the hardware command queues.
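The timing rule above reduces to simple frame arithmetic. The sketch below is an assumption-laden illustration: `issue_window`, `should_execute`, and the `max_ahead` parameter name are invented for the example, and the real driver may decide differently.

```python
def issue_window(target_frame, latency, max_ahead):
    """Return (first, last) frame numbers during which an action may be issued.

    `last` is the latest frame that still lets the change reach the output
    by `target_frame`; `max_ahead` is how many frames earlier than that the
    driver is allowed to act (a hypothetical name for the programmable
    parameter mentioned in the text).
    """
    last = target_frame - latency
    first = last - max_ahead
    return first, last


def should_execute(current_frame, target_frame, latency, max_ahead):
    """True if the action may be executed during the current frame interrupt."""
    first, last = issue_window(target_frame, latency, max_ahead)
    return first <= current_frame <= last
```

With a latency of two frames and one frame of allowed advance, an action targeting frame 10 may be executed during frame 7 or frame 8, so a single missed interrupt does not make it late.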

    The command queues are implemented in the FPGA. (They are referred to as the Multiframe I2C sequencer and the Multiframe register write sequencer in the block diagram.) Each of the two sequencers can store up to 64 commands in each of seven individual queues, one for the current frame and one for each of the next six frames. The sequencers output the queued commands to the sensor and to the internal FPGA modules, respectively, directly after the beginning of the scheduled frame in the sensor. The commands for the current frame are sent out immediately.
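A software model can make the queue behavior easier to follow. This is a sketch only: the real sequencers are Verilog modules, and the `Sequencer` class, its method names, and the simplification that immediate commands bypass the depth limit are assumptions of the example. The queue count and depth come from the text.

```python
from collections import deque

QUEUES = 7   # current frame plus the next six frames (from the text)
DEPTH = 64   # commands per queue (from the text)


class Sequencer:
    """Toy model of one multiframe command sequencer."""

    def __init__(self):
        self.queues = deque([] for _ in range(QUEUES))
        self.sent = []  # commands already output to the sensor or FPGA modules

    def write(self, frames_ahead, command):
        """Queue `command` for the frame `frames_ahead` frames from now."""
        if not 0 <= frames_ahead < QUEUES:
            raise ValueError("can only schedule 0..6 frames ahead")
        if frames_ahead == 0:
            self.sent.append(command)  # current-frame commands go out at once
            return
        queue = self.queues[frames_ahead]
        if len(queue) >= DEPTH:
            raise OverflowError("queue for that frame is full")
        queue.append(command)

    def frame_start(self):
        """A new sensor frame begins: shift the queues and flush the new head."""
        self.queues.popleft()
        self.queues.append([])
        self.sent.extend(self.queues[0])
        self.queues[0].clear()
```

Commands written for later frames are held until their frame starts, while a command written with `frames_ahead=0` appears in `sent` immediately.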

    Read access to the driver parameters does not involve the actions required by write access, so it is implemented as a memory mapped array accessible from applications, including PHP. The PHP interpreter opens the related files and maps the data arrays during initialization. Because it is configured to work in FastCGI mode, many HTTP requests are served without re-initialization of the driver data access. Parameter symbolic names defined in the driver header file are exported to the PHP extension as PHP constants, and extension functions are able to process array arguments with the key names automatically derived from the driver parameter names. The PHP extension functions use driver write() to set the camera acquisition parameters, and direct mmap() access to read them. The driver also maintains an array of global parameters that are not linked to particular frames and do not cause any actions when written to. Those parameters are accessible with mmap() for both direct reading and writing; other extension functions use similar access to other driver structures, such as gamma tables used for pixel data conversion and image histograms calculated in the FPGA.
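The memory-mapped access pattern can be tried on any system with an ordinary file standing in for the driver. In the sketch below the temporary file, the little-endian layout, and the parameter index are all assumptions; only the idea of an array of 32-bit parameters read and written through mmap() comes from the text.

```python
import mmap
import struct
import tempfile

PARAM_FORMAT = "<I"  # one 32-bit parameter; little-endian is an assumption
PARAM_SIZE = struct.calcsize(PARAM_FORMAT)


def read_param(mapped, index):
    """Read parameter `index` straight from the mapped array (no driver call)."""
    return struct.unpack_from(PARAM_FORMAT, mapped, index * PARAM_SIZE)[0]


def write_param(mapped, index, value):
    """Write a global parameter in place; this triggers no driver action."""
    struct.pack_into(PARAM_FORMAT, mapped, index * PARAM_SIZE, value)


def demo():
    """Map a stand-in file holding four parameters and update one of them."""
    with tempfile.TemporaryFile() as f:
        f.write(struct.pack("<4I", 0, 0, 0, 0))
        f.flush()
        with mmap.mmap(f.fileno(), 0) as mapped:
            write_param(mapped, 2, 1936)  # hypothetical index and value
            return [read_param(mapped, i) for i in range(4)]
```

The mapping is set up once and then reused, which mirrors why a FastCGI process that stays alive between requests avoids repeating the initialization.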

  • Color processing in Elphel cameras

    The Elphel camera color processing modes offer additional options for balancing between image quality and compression ratio, and between real-time in-camera color processing compatible with standard viewers and post-processing using the host computer. The latter results in higher quality for images recorded with regular sensors that have color mosaic filters.

    Most color image sensors, including those used in Elphel cameras, rely on a Bayer pattern filter, so each pixel detects only one of the three RGB colors, as shown below:

    Image registration by the sensor through the Bayer color mosaic filters
    (Click for actual demo page with mouse-over effects)

    It may seem that a lot of information is lost from this process — 2/3 is discarded if only one of three color components is left in each pixel — but modern algorithms can restore missing color components with good precision. High quality algorithms use multi-pass calculations, and need information from the pixels far from the one where the colors are to be interpolated. Such processing is a challenge to implement in the FPGA, as the data transfer with the required external buffer memory can be a natural bottleneck. Elphel cameras use much simpler processing. The data for JPEG compression comes from the buffer memory, formatted as 20×20 overlapping tiles so that each tile corresponds to one 16×16 pixel macroblock (MCU) for JPEG compression as YCbCr 4:2:0.

    With this mode, for each 16×16 pixel block there will be four 8×8 pixel blocks to represent intensity (Y) and two additional 8×8 blocks to encode color. Color is represented as Cb (the blue-difference component, loosely the difference between blue and green) and Cr (the red-difference component, loosely the difference between red and green), each with half the spatial resolution in both the vertical and horizontal directions, compared to the original pixels (and the Y component).
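The block layout of one 4:2:0 macroblock can be sketched in a few lines. The function names are invented for the example, and averaging each 2×2 group is just one common way to halve chroma resolution; the text does not say how the camera does it.

```python
def split_luma(y):
    """Split a 16x16 luma macroblock into four 8x8 blocks
    (top-left, top-right, bottom-left, bottom-right)."""
    return [[row[c0:c0 + 8] for row in y[r0:r0 + 8]]
            for r0 in (0, 8) for c0 in (0, 8)]


def subsample_chroma(plane):
    """Halve a 16x16 chroma plane in both directions by averaging each
    2x2 group of samples (an assumed method, not necessarily Elphel's)."""
    return [[(plane[2 * r][2 * c] + plane[2 * r][2 * c + 1]
              + plane[2 * r + 1][2 * c] + plane[2 * r + 1][2 * c + 1]) / 4
             for c in range(8)]
            for r in range(8)]


def macroblock_blocks(y, cb, cr):
    """Return the six 8x8 blocks of one macroblock: Y0..Y3, Cb, Cr."""
    return split_luma(y) + [subsample_chroma(cb), subsample_chroma(cr)]
```

Six 8×8 blocks hold 384 samples for 256 pixels, that is, 1.5 samples per pixel instead of the 3 that full-resolution color would need.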

    Color processing to convert Bayer pixel data into color YCbCr 4:2:0 ready for JPEG or Ogg Theora compression
    (Click for actual demo page with mouse-over effects)

    How many bits are really needed in an image pixel?
    Well, that could depend on your particular sensor's full well capacity (FWC), sensor read-out noise, and step-to-noise ratio. For a handy calculator, as well as instructions for measuring your particular sensor's FWC (the number of electrons each pixel can hold) click here

    Additional pixels in the tiles (two rows from each of the four sides) are provided by the memory controller to interpolate pixels near the edges of the 16×16 macroblock, making it possible to use 5×5 pixel areas to calculate the missing color components of each macroblock pixel. The current code uses only simple 3×3 bilinear interpolation, disregarding the outermost pixels. This code combines Bayer pattern interpolation and RGB->YCbCr conversion in a single step, so the resulting values for Y, Cb and Cr are calculated directly from the input Bayer data, as illustrated by the image above.
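To make the 3×3 bilinear step concrete, here is a minimal software sketch that recovers RGB for the central 16×16 pixels of a 20×20 tile. It is not the FPGA code: it assumes an RG/GB layout starting at the tile corner, stops at RGB rather than folding in the YCbCr conversion, and uses invented function names.

```python
def bayer_color(row, col):
    """Color of a mosaic site, assuming an RG/GB layout (an assumption)."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"


def interpolate_tile(tile, margin=2, size=16):
    """Bilinear 3x3 interpolation of the central size x size pixels.

    For each output pixel, a color that the site itself measured is taken
    as is; each missing color is the average of the same-colored sites in
    the surrounding 3x3 area. Only one ring of the margin is used, so the
    outermost tile pixels are ignored.
    """
    out = []
    for r in range(margin, margin + size):
        out_row = []
        for c in range(margin, margin + size):
            rgb = []
            for color in "RGB":
                if bayer_color(r, c) == color:
                    rgb.append(float(tile[r][c]))
                    continue
                neighbors = [tile[r + dr][c + dc]
                             for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                             if bayer_color(r + dr, c + dc) == color]
                rgb.append(sum(neighbors) / len(neighbors))
            out_row.append(tuple(rgb))
        out.append(out_row)
    return out
```

On a uniformly colored object this reproduces the original color exactly; the artifacts mentioned in the text appear where neighboring sites straddle a sharp edge.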

    By clicking on that image (and similar images below) to enlarge it, and then hovering the mouse pointer over a pixel block, information will be displayed, including coordinates, values of the pixels, and how they are calculated. A PHP script used to generate those images can also be applied to the other samples.

    Such simple color processing is sufficient for many video monitoring applications, but produces unacceptable color artifacts when the object contains sharp gradients, as can happen during document scanning. Other video applications do not require real-time availability, but need the best image quality hardware can provide. In either case, it would be nice to have raw pixel data. However, that requires higher bandwidth than our 100Mbps Ethernet connection affords, and would use more space in the storage media if being recorded.

    We needed to provide raw Bayer data while maintaining an acceptable frame rate on our previous model 323 camera, where network speed was additionally limited by the on-chip MAC of the earlier Axis ETRAX 100LX processor. (That camera used two network connectors and two of the 10313 boards operating in parallel to increase the overall data rate.) I tried to modify the already implemented JPEG compression chain, reusing the developed code and making it suitable for Bayer data compression. It was possible to treat the Bayer pixels as if they were data from a monochrome sensor (the rightmost picture in the “Image registration by the sensor through the Bayer color mosaic filters” illustration above), but that would produce very inefficient JPEG compression. JPEG and other codecs are designed so that the high spatial frequency components are sacrificed first, as they carry the least perceivable information. But if the sensor is Bayer color instead of monochrome, a lot of high frequency components appear even if there are no sharp gradients in the image. For a colored object, this results in a repetitive pattern with maximal frequency, with odd and even pixels having different values.

    Preparation of the sensor raw Bayer data for efficient JPEG compression (JP4 mode)
    (Click for actual page with mouse effects)

    To increase compression efficiency in this case, it was only necessary to rearrange the pixels in each 16×16 macroblock so that the pixels of each color component move into a separate 8×8 pixel block. Pixel numbers in the second (rearranged) image show each pixel's location in the original macroblock. Opening the enlarged image shows they are consecutive hex numbers. This modification made possible efficient compression of the raw Bayer pixel data with otherwise standard JPEG encoding. In that original implementation, we even preserved dummy color components. Because they consist of all zero values, compression is very efficient and the file size does not increase much, but keeping them makes it possible to open the file with unmodified JPEG decoders. Post-processing is still required, but the picture does not “fall apart.” Here is a sample video made by Sebastian Pichelhofer that demonstrates application of the JP4 mode for video recording.
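The rearrangement is easy to express in software. This sketch uses invented function names and an assumed block order (the four Bayer positions taken row by row); the order actually used in JP4 files is not stated in the text.

```python
# Bayer positions inside each 2x2 cell, in the (assumed) output block order.
POSITIONS = ((0, 0), (0, 1), (1, 0), (1, 1))


def jp4_rearrange(macroblock):
    """Split a 16x16 Bayer macroblock into four 8x8 blocks, one for each
    position in the 2x2 Bayer cell, so every block holds a single color."""
    return [[[macroblock[2 * r + dr][2 * c + dc] for c in range(8)]
             for r in range(8)]
            for dr, dc in POSITIONS]


def jp4_restore(blocks):
    """Inverse operation, as done on the host before demosaicing."""
    macroblock = [[0] * 16 for _ in range(16)]
    for block, (dr, dc) in zip(blocks, POSITIONS):
        for r in range(8):
            for c in range(8):
                macroblock[2 * r + dr][2 * c + dc] = block[r][c]
    return macroblock
```

For a uniformly colored object, the alternating odd/even pattern of the mosaic turns into four flat blocks, which is exactly the kind of content that JPEG compresses well.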

    Modified JP4 mode (JP4D): the primary green color (G) is encoded as absolute data (Y1), and the three other color components are encoded as differences from G: Y0=R-G, Y1=G, Y2=G2-G, Y3=B-G

    The original JP4 mode, although compatible with the unmodified libjpeg on the host computer, still had some shortcomings that were targeted by the later modifications:

    • The major problem was the compressor frame rate penalty for the two dummy color blocks. The FPGA still needed time to process those zeros. When running at 160MHz, and using two clock cycles per pixel, it could compress 80MPix/sec, but processing of the dummy blocks reduced this speed to just 80*(4/6)~=53MPix/sec. That was enough when handling earlier sensors, with up to a 48MHz pixel clock, but newer Micron (now Aptina) sensors use a pixel clock of 96MHz, with an average pixel rate of about 75MPix/sec (lower than the pixel clock frequency because of the required horizontal blanking). When compressing the pixel data only, and omitting the dummy blocks, the compressor runs at 80MPix/sec. That is faster than the 75MPix/sec that the sensor can provide, so the compressor is once again not the factor limiting FPS, and the camera can run close to 15fps at 5MPix and above 30fps at 1920×1088.
    • Another improvement in compression ratio, with no additional loss of image quality, comes from how the DC component of each block is referenced. In JPEG, the DC component (the average value of an 8×8 block) is encoded differently from the rest of the components: it is stored as the difference from the DC of the previous block of the same color component. The earlier JP4 implementation was bound by compatibility with regular JPEG, and had to treat all four blocks as different blocks of the same Y (intensity) component, so each DC value was referenced to a block holding a different Bayer color.
    • The next compression improvement exploits the correlation between adjacent pixel values of different colors by replacing all but one color component (G, the first of the two green pixels in an RG/GB Bayer pattern) with differences from G. The encoded components are then R-G, G, G2-G, and B-G. (G2 is the second green in the Bayer pattern.) This modification also provides additional flexibility in fine tuning quality vs. image size by using a separate quantization table for the differential data, similar to the separate quantization tables for color components in standard JPEG.
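
    The three points in the list above can be illustrated with a short Python sketch. It reproduces the throughput arithmetic from the text, shows the JP4D component transform and its inverse, and compares DC differences with one shared predictor against one predictor per Bayer color. All function names and the sample DC values are made up for the example; the frame rate bound ignores vertical blanking and other overhead, and the per-color predictor is our reading of the description, not a copy of the FPGA code.

```python
def compressor_rate_mpix(clock_mhz=160.0, cycles_per_pixel=2,
                         data_blocks=4, dummy_blocks=0):
    """Useful pixel throughput in MPix/sec for one macroblock layout."""
    raw_rate = clock_mhz / cycles_per_pixel
    return raw_rate * data_blocks / (data_blocks + dummy_blocks)


def max_fps(pixel_rate_mpix, width, height):
    """Upper bound on frame rate; ignores vertical blanking and overhead."""
    return pixel_rate_mpix * 1e6 / (width * height)


def jp4d_components(r, g, g2, b):
    """JP4D transform: Y0=R-G, Y1=G, Y2=G2-G, Y3=B-G."""
    return (r - g, g, g2 - g, b - g)


def jp4d_restore(y0, y1, y2, y3):
    """Inverse transform, returning (R, G, G2, B)."""
    return (y0 + y1, y1, y2 + y1, y3 + y1)


def dc_differences(dc_values, colors, per_color):
    """Differentially encode block DC values.

    With per_color=False every block shares one predictor, as when all
    blocks are treated as the same Y component. With per_color=True each
    Bayer color keeps its own predictor (an assumption for illustration).
    """
    predictors = {}
    diffs = []
    for dc, color in zip(dc_values, colors):
        key = color if per_color else "Y"
        diffs.append(dc - predictors.get(key, 0))
        predictors[key] = dc
    return diffs


with_dummies = compressor_rate_mpix(dummy_blocks=2)   # about 53.3 MPix/sec
without_dummies = compressor_rate_mpix()              # 80.0 MPix/sec
sensor_rate = 75.0                                    # MPix/sec, from the text
limit = min(sensor_rate, without_dummies)
print(round(with_dummies, 1), without_dummies,
      round(max_fps(limit, 2592, 1936), 1))

# Made-up DC values for two consecutive macroblocks.
dc = [100, 200, 210, 60, 102, 203, 212, 61]
colors = ["R", "G", "G2", "B"] * 2
print(dc_differences(dc, colors, per_color=False))
print(dc_differences(dc, colors, per_color=True))
```

    In the second macroblock of this made-up example, the shared predictor produces differences as large as the color contrast itself, while the per-color predictors produce values close to zero, which take fewer bits to encode.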

  • Increasing effective dynamic range of the image sensor

    Most CMOS image sensors have all the analog circuitry and analog-to-digital converters (ADC) on-chip. This is the beauty of this technology compared to the older CCD one. It is possible to place photo detectors and the rest of the circuitry on the same chip, while CCD requires separate support components. But that comes at a price: you have to use whatever circuitry is implemented. The performance of modern sensors is well balanced. Our Micron/Aptina MT9P001/MT9P031 sensor provides 5 megapixels with 12-bit output. Such resolution is enough for most applications, but the pixels are capable of a little more. This is why this sensor (and most similar ones) has additional programmable gain amplifiers between the pixel array and the ADC. When measuring the noise performance of the sensor, we found that pixels have a full well capacity of around 8,500 electrons, with dark noise (at the same analog gain of 1.0) of around 4-5 electrons. Thus, the effective number of bits (ENoB) is ~11, at approximately two electrons per ADC count. When setting the analog gain to the maximal 15.75, the dark noise increased to ~10 ADC counts, but this time each electron in the pixel resulted in ~8 ADC counts, so the dark noise was only a little over one electron. If both analog gain settings were combined, the result would be close to 13 bits of pixel dynamic range, while a single gain setting provides only about 11 bits at the optimal gain of 1.0. In other words, the pixels themselves are 3-4 times better than what can be captured in a single frame without changing gains.
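
    The arithmetic behind these figures can be checked with a few lines of Python. The numbers are the measurements quoted above; the 4.5-electron dark noise is simply the midpoint of the quoted 4-5 electron range, and the variable names are ours.

```python
import math

FULL_WELL_E = 8500.0              # full well capacity, electrons
ADC_COUNTS = 4096                 # 12-bit ADC
DARK_NOISE_GAIN1_E = 4.5          # electrons at gain 1.0 (midpoint of 4-5)
HIGH_GAIN_COUNTS_PER_E = 8.0      # ADC counts per electron at gain 15.75
HIGH_GAIN_NOISE_COUNTS = 10.0     # dark noise at gain 15.75, ADC counts


def dynamic_range_bits(full_scale, noise):
    """Dynamic range as log2 of the full-scale signal over the noise floor."""
    return math.log2(full_scale / noise)


# About two electrons per ADC count at gain 1.0.
electrons_per_count = FULL_WELL_E / ADC_COUNTS

# A single gain setting of 1.0: roughly 11 bits.
single_gain_bits = dynamic_range_bits(FULL_WELL_E, DARK_NOISE_GAIN1_E)

# At high gain the noise floor, expressed in electrons, is much lower.
high_gain_noise_e = HIGH_GAIN_NOISE_COUNTS / HIGH_GAIN_COUNTS_PER_E

# Full well from gain 1.0 combined with the high-gain noise floor.
combined_bits = dynamic_range_bits(FULL_WELL_E, high_gain_noise_e)

# How much lower the noise floor is when both gains are combined.
improvement = DARK_NOISE_GAIN1_E / high_gain_noise_e

print(round(electrons_per_count, 2), round(single_gain_bits, 1),
      round(combined_bits, 1), round(improvement, 1))
```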

    Differential mode, adjusted for HDR applications (JP4DH). Both the primary green (G) and the second, high-gain green (G2) are encoded as absolute data (Y1, Y2). The two other color components use the difference from G: Y0=R-G, Y1=G, Y2=G2, Y3=B-G

    It is still possible to increase the overall intraframe dynamic range by using different gain settings for the two green pixels available in each 2×2 Bayer cell. If the first green, red, and blue pixels are set at or close to a gain of 1.0, and the second green is set to the high gain to achieve a dark noise close to a single electron, then sensitivity in the deep shadows increases for that second green pixel. In bright parts of the scene, this green will be saturated, so post-processing should rely on the data from the three other pixels of the 2×2 cell to interpolate the image. In the shadows, the high-gain green pixel will be the only one in the cell to provide a signal above the noise floor. Kodak sells special image sensors that have white pixels in the Bayer pattern, for the same purpose of increasing sensitivity and dynamic range. However, the trick with different gain settings for the two green pixels is applicable to most of the CMOS sensors available.
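
    One possible way for post-processing to combine the two greens of a cell is sketched below in Python. This is a deliberately simple illustration, not the processing used with the camera: the function name, the saturation threshold, and the hard switch between the two pixels are all assumptions, and black level offsets are ignored.

```python
def merge_greens(g, g2, gain_ratio=15.75, full_scale=4095, sat_fraction=0.95):
    """Estimate green in units of low-gain ADC counts.

    g is the green pixel at gain 1.0, g2 the green pixel at high gain.
    While g2 is below the assumed saturation threshold, it carries the
    lower-noise measurement and is scaled back by the gain ratio.
    Otherwise the low-gain pixel is used.
    """
    if g2 < full_scale * sat_fraction:
        return g2 / gain_ratio
    return float(g)


# Deep shadow: the high-gain pixel resolves a fraction of a low-gain count.
print(merge_greens(3, 40, gain_ratio=16.0))

# Bright area: the high-gain pixel is saturated, so the low-gain one is used.
print(merge_greens(1000, 4095))
```

    A practical implementation would more likely blend the two values near the threshold instead of switching abruptly, to avoid visible transitions.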


The latest Elphel firmware upgrade (8.0) is the first major redesign of the camera's software. The redesign was performed just before migrating to the new processor architecture, both to reduce the number of changes during the scheduled hardware upgrade and to simplify simultaneous support of the current and the next camera models, in order to unify their codebases.

The new firmware has the following immediate advantages:

  • Improved system stability achieved by removing duplicate code, eliminating the processing of multiple individual cases when dealing with the sensor, and lowering FPGA latencies
  • Simplified future development, thanks to cleaner core-camera code, by using new tools for internal parameters monitoring and modification, as well as profiling
  • Reduced entry barrier for new developers by integrating the project with KDevelop IDE. This integration also simplifies project maintenance
  • Reduced, and in some cases eliminated, wasted frames, and enabled HDR applications with alternating exposure durations by adding support for pipeline operation of the camera modules
  • Enabled precisely synchronous image acquisition for stereo and panoramic applications
  • Integrated additional FPGA functionality, including:

    • Enhanced JP4 modes that allow 15fps at the full sensor resolution of 2592×1936 pixels
    • Command queues for the sensor and FPGA internal modules
    • “Focus helper” module in the compressor that can be used to evaluate sharpness of the lens focus
    • Lens/sensor vignetting correction

About the Author
Andrey N. Filippov has over 25 years of experience in embedded systems design. After graduating from the Moscow Institute for Physics and Technology in 1978, he worked for the General Physics Institute (Moscow, Russia) in the areas of high-speed, high-resolution mixed-signal design, application of PLDs and FPGAs, and microprocessor-based embedded system hardware and software design. Andrey holds a PhD in Physics from the Moscow Institute for Physics and Technology. In 1995, Andrey moved to the United States, and after working for Cordin Company (Salt Lake City, Utah), in 2001 he started Elphel, Inc., dedicated to doing business in the emerging field of open systems based on free (GNU/GPL) software and open hardware. This photo of the author was made using a Model 303 High Speed Gated Intensified Camera.

This article was originally published on LinuxDevices and has been donated to the open source community by QuinStreet Inc. Please visit for up-to-date news and articles about Linux and open source.
