1.8 The Sea Change: The Switch from Uniprocessors to Multiprocessors
The power limit has forced a dramatic change in the design of microprocessors. Figure 1.17 shows the improvement in response time of programs for desktop microprocessors over time. Since 2002, the rate of improvement has slowed from a factor of 1.5 per year to a factor of 1.2 per year.
Rather than continuing to decrease the response time of a single program running on a single processor, as of 2006 all desktop and server companies are shipping microprocessors with multiple processors per chip, where the benefit is often more on throughput than on response time. To reduce confusion between the words processor and microprocessor, companies refer to processors as "cores," and such microprocessors are generically called multicore microprocessors. Hence, a "quadcore" microprocessor is a chip that contains four processors or four cores.
In the past, programmers could rely on innovations in hardware, architecture, and compilers to double performance of their programs every 18 months without having to change a line of code. Today, for programmers to get significant improvement in response time, they need to rewrite their programs to take advantage of multiple processors. Moreover, to get the historic benefit of running faster on new microprocessors, programmers will have to continue to improve performance of their code as the number of cores increases.
To reinforce how the software and hardware systems work hand in hand, we use a special section, Hardware/Software Interface, throughout the book, with the first one appearing below. These elements summarize important insights at this critical interface.
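As a rough illustration of what rewriting a program to take advantage of multiple processors involves, the following Python sketch restructures a sequential computation into independent chunks handed to a pool of workers. The function names, the chunking scheme, and the choice of four workers are assumptions made for this example, not something taken from the text. It shows only the restructuring: in standard CPython builds, threads typically do not speed up CPU-bound code like this, so the sketch demonstrates the shape of the change rather than the performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def sequential_sum_of_squares(values):
    """Original, sequential version: one worker walks the whole list."""
    return sum(v * v for v in values)


def parallel_sum_of_squares(values, workers=4):
    """Explicitly parallel version (hypothetical sketch).

    The programmer must now decide how to split the work, hand each
    piece to a worker, and combine the partial results.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # Ceiling division so that every element lands in some chunk.
    chunk = max(1, -(-len(values) // workers))
    chunks = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each chunk is processed independently of the others.
        partials = list(pool.map(sequential_sum_of_squares, chunks))
    # Combining the partial results is a (small) sequential step.
    return sum(partials)


data = list(range(1000))
same_answer = parallel_sum_of_squares(data) == sequential_sum_of_squares(data)
```

Even in this small case the parallel version carries decisions the sequential one never had to make: how many pieces to create, how large each should be, and how to merge the results.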
Parallelism has always been critical to performance in computing, but it was often hidden. Chapter 4 will explain pipelining, an elegant technique that runs programs faster by overlapping the execution of instructions. This is one example of instruction-level parallelism, where the parallel nature of the hardware is abstracted away so the programmer and compiler can think of the hardware as executing instructions sequentially.
Forcing programmers to be aware of the parallel hardware and to explicitly rewrite their programs to be parallel had been the "third rail" of computer architecture, for companies in the past that depended on such a change in behavior failed (see Section 6.15). From this historical perspective, it's startling that the whole IT industry has bet its future that programmers will finally successfully switch to explicitly parallel programming.
Why has it been so hard for programmers to write explicitly parallel programs?
The first reason is that parallel programming is by definition performance programming. Not only does the program need to be correct, solve an important problem, and provide a useful interface to the people or other programs that invoke it, the program must also be fast.
The second reason is that fast for parallel hardware means that the programmer must divide an application so that each processor has roughly the same amount to do at the same time, and that the overhead of scheduling and coordination doesn't fritter away the potential performance benefits of parallelism.
As an analogy, suppose the task was to write a newspaper story. Eight reporters working on the same story could potentially write a story eight times faster. To achieve this increased speed, one would need to break up the task so that each reporter had something to do at the same time. Thus, we must schedule the sub-tasks. If anything went wrong and just one reporter took longer than the seven others did, then the benefits of having eight writers would be diminished. Thus, we must balance the load evenly to get the desired speedup. Another danger would be if reporters had to spend a lot of time talking to each other to write their sections. You would also fall short if one part of the story, such as the conclusion, couldn't be written until all of the other parts were completed. Thus, care must be taken to reduce communication and synchronization overhead. For both this analogy and parallel programming, the challenges include scheduling, load balancing, time for synchronization, and overhead for communication between the parties. As you might guess, the challenge is stiffer with more reporters for a newspaper story and more processors for parallel programming.
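The effect of load imbalance and coordination overhead in the reporter analogy can be made concrete with a small model. The Python sketch below is a simplification under stated assumptions: all reporters start at once, the story is done when the slowest reporter finishes, and communication and synchronization are lumped into a single added cost. The function name, the time units, and the overhead parameter are hypothetical.

```python
def parallel_speedup(task_times, overhead=0.0):
    """Speedup of a parallel run over one worker doing every task in turn.

    task_times: time each worker needs for its share (arbitrary units).
    overhead:   assumed lump-sum cost of communication and
                synchronization added to the parallel run.
    """
    if not task_times:
        raise ValueError("need at least one worker")
    sequential_time = sum(task_times)            # one worker does it all
    parallel_time = max(task_times) + overhead   # wait for the slowest
    return sequential_time / parallel_time


# Eight reporters with perfectly equal sections and no coordination cost.
balanced = parallel_speedup([1.0] * 8)                    # 8.0

# One reporter's section takes twice as long as the other seven.
unbalanced = parallel_speedup([1.0] * 7 + [2.0])          # 4.5

# Equal sections, but one extra time unit spent coordinating.
with_overhead = parallel_speedup([1.0] * 8, overhead=1.0) # 4.0
```

In this model a single slow reporter or one unit of coordination time cuts the eightfold speedup roughly in half, which is why balancing the load and limiting communication matter as much as dividing the work.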
To reflect this sea change in the industry, the next five chapters in this edition of the book each have a section on the implications of the parallel revolution to that chapter.