Back to the Future of Parallel Programming

    Twenty years ago it was relatively easy to write a program that ran efficiently on anything from a single microprocessor to an array of 50,000. Just ask anyone who developed systems using the INMOS® Transputer[1]. Unfortunately, designers of large-scale parallel processing systems today face a tougher choice: simple to develop OR efficient at using the available processing power.

    So why the change? The Transputer provided an extremely lightweight, low-latency process and communication model that let designers build enormous parallel systems out of large collections of relatively simple processes. Besides representing an application's inherent parallelism more naturally, this approach tended to produce large numbers of processes that could easily be distributed across multiple processors to meet specific performance goals.

    Post-Transputer parallel systems, by comparison, tend to use a smaller number of more complex processes, and to rely on different underlying techniques for internal and external process communication. In addition, they frequently feature heterogeneous combinations of microprocessors, DSPs, FPGAs, and full-custom silicon. They work, and they are fast enough, but they sure aren't as easy to develop! Nor do the resulting systems scale as well to higher performance levels.

    So, can we reclaim the option of having both easy development AND very efficient use of available resources?

    This web site is devoted to answering this question by updating the original INMOS vision for how massive quantities of processors should be managed, given the huge technological advances in mass-market microprocessors and communications systems since the Transputer era. More specifically, this web site hosts PCPUTER: the “secret sauce” needed to get your massively parallel application up and running in record time, using the minimum necessary amount of hardware and other associated resources (energy, finances, gray hair, etc.).

    PCPUTER consists of a suite of software tools and hardware techniques that streamlines the creation of very efficient, very high-performance distributed parallel processing systems. PCPUTER uses commodity hardware powered by high-performance microprocessors from Intel® and AMD®, and is based on a proven and successful application development suite originally crafted for the Transputer.

    So what makes PCPUTER different from other popular parallel processing “recipes”? PCPUTER provides an ultra-lightweight approach to process threading, synchronization, and local/global communication. Reducing process and communication overhead is the key step that lets you capture the parallelism inherent in your application more easily.

    And PCPUTER lets you harness this parallelism, and the associated system design clarity and scalability, whether you are using a single processor, a shared-memory/multicore system, or distinct processors connected by local or global serial communication links. This performance comes at a price, however: PCPUTER doesn't run under your favorite operating system (directly, anyway); it IS the operating system. Targeted applications are similar to those for which arrays of Transputers were once employed: large-scale data collection, data analysis, and simulation systems. Think “embedded” and “real time” writ large, rather than “data center”.

    So, does your application need PCPUTER?

[1] If you don't know what a Transputer was, see here for a bit of history.