In this blog, you can read about the 'aha' moments in my rendezvous with computers, computing, and electronics; HOWTOs and whys that your textbook could not answer; and, once in a while, a strange mixture of global economics.

Friday, August 28, 2009

A simple question. Jack is a cobbler, working at DumbCobbler Inc. He stitches on average 20 shoes per day. DumbCobbler Inc has received some new orders, so they hire 9 new cobblers to work with Jack. Do the math. How many shoes can DumbCobbler make in a day? If the answer is 200,...

Wrong. Actually they are making 225 shoes a day. Where do the extra 25 shoes come from, if all the cobblers are equally qualified and in isolation each can make only 20 shoes per day? The extra 25 are the result of synergy. As a team of talented people work together, a kind of team spirit develops among them, and even without much process improvement they make more in a given time. Where there is synergy, the whole is greater than the mathematical sum of the parts.

Now let me change the problem a little. I have a processor, say a 32-bit MIPS processor. I have an image processing problem, and the processor takes 6 seconds to run the algorithm that solves it. Now I put two processors, both 32-bit MIPS, in a multicore environment. The same algorithm has to be run. How much time would it take now? If the answer is 3 seconds, ...

Wrong again. The answer would probably be somewhere between 4 and 5 seconds. It depends on two factors. The first is whether the algorithm can be parallelized at all; some algorithms are inherently sequential. For example, adding numbers in an array (am I sure?). The second is how skillful the programmer is at recognizing the parallelism present in the algorithm. Adding numbers in an array, for instance, can in fact be done in parallel, since you can divide and conquer. But even in a completely parallel implementation the lower bound remains intact. That is, if one processor takes 6 units of time, two processors can at best take 3 units of time to finish the job. A little more, perhaps, for synchronization, but never any less, unlike humans.

So will processors ever get synergy? If we take processors to be dumb compared to humans, would robots with AI have synergy? For that, someone would have to explain to me how synergy works in terms of a cerebral model. Has any psychologist tried it? No idea; I don't follow medical discoveries.

But here is an interesting fact. IBM's Cell multicore architecture contains a set of RISC processors called SPEs, which stands for Synergistic Processing Elements. They are SIMD processors suited for vector processing (as with any SIMD processor). An SPE does not do out-of-order execution; because it has 128 registers, register renaming can be done liberally at compile time, which obviates OOO hardware. Instead of a cache, the SPE uses something called a "local store". Like a cache, it sits inside the chip. SPEs act together when a set of them is chained for stream processing, a property of great use in a GPU, which requires fast video processing. More on the Cell architecture some other time.

Probably these processors carry the prefix "Synergistic" because they can be chained together. It still does not produce the effect of the whole being larger than the sum of the parts. That's something we will have to wait a long time for.

Wednesday, August 26, 2009

Fashion is cyclic: what goes out of fashion today will be fashionable tomorrow, and whatever was once the fashion is now slowly coming back. I am not talking about actual fashion news. I am talking about a silicon PCB, and a 3D chip with buses running through bare silicon between layers. Some startups are trying to build a PCB-like system on bare silicon, one in 3D and the other in 2D. With high ASIC prices and the need for compact logic, this may become an alternative for many struggling IC makers.

Ted Kennedy passes away. I cannot forget his very famous speech, "The Dream Shall Never Die". It is an inspiration for all youngsters who feel heartbroken by our current economic state; for college grads who do not have funding; for graduates who cannot find a job; for those hard workers who just lost their jobs through no fault of their own. A must-watch video.

Sunday, August 23, 2009

This news on the IEEE website is two months old, but I read it just now. And it's bad news. The unemployment rate of electrical and electronics engineers hit a record high of 8.6%. What's worse, the biggest slump came in the last quarter, in which the rate more than doubled.

The news for EEs was particularly bad as the jobless rate more than doubled from 4.1 percent in the first quarter to a record-high 8.6 percent in the second. The previous quarterly record was 7 percent, in the first quarter of 2003.

In the first quarter, the computer professionals' jobless rate was 5.6%, and it is still standing at 5.6%. Good that it has not gone up too.

This is two-months-old news. Now the economy is improving: GDP is picking up in the US and Europe, and home sales are rising. We are certainly on the path to recovery, so we might expect the unemployment claims for engineers to come down. Not so fast. Unfortunately, that's not how it works. The economy is currently undergoing a jobless recovery, like it did back in 2001. This means you can see a rise in GDP and the consumer price index, but the unemployment rate will remain stuck for a while. So the bad days are far from over. Let's hope we can get out of this hell soon.

UPDATED 23-AUG-09: Here is Paul Krugman affirming the jobless recovery.

At the beginning of this month, Richard Posner published a scathing criticism of Christina Romer's prediction about the stimulus package. The article even questioned the ethical responsibility of Romer and some prominent progressive economists (Stiglitz's name was missing!!).

But there was a basic mathematical mistake in the criticism: Posner compared annual GDP with quarterly spending. Although I noticed this, my inferiority complex did not allow me to publish it on my blog. After all, I am not a professional economist. I am an engineer who loves macroeconomics, and in fact anything that can be derived from reason.

Brad DeLong, on the other hand, had no such inhibition about spotting this flaw, and other mistakes, in Posner's article.

Posner is trying to get his readers to compare the number 5 (the percentage-point swing in the growth rate between the first and the second quarter of 2009) to the number 2/3 (the percentage share of second-quarter stimulus expenditures to annual GDP). He hopes that they will conclude that Christina Romer's claims are wrong because the effect is disproportionate to the cause: $1 of stimulus could not reasonably be expected to produce $7.5 of boost within the same quarter. But the stimulus money spent in the second quarter was spent in one quarter, so the right yardstick to use to evaluate it is not annual but rather quarterly GDP--stimulus spending in the second quarter was not 2/3 of one percent but 2.6 percent. And the level of production in the economy in the first quarter was not 6% but rather 1.5% below its level in the fourth quarter--the 6% number is not the decline from one quarter to the next but rather the rate of decline, how much the decline would be after a year were it to go on for four quarters. So the right comparison is 1.5% to 2.6%[1].

Posner is off by a factor of 16.

I am glad my observation was right: Posner was wrong to compare annual GDP with quarterly fiscal spending. Thanks, Dr. DeLong, for bringing it up.

Friday, August 21, 2009

Here is an interesting video on the making of the Intel Core i7. It starts off well, but slowly becomes more of a marketing video than a technical one. It gives an overview of the architecture, but I was looking for an in-depth analysis of the microarchitecture, with the design decisions and why they were made, so it disappointed me. Still, it is a good video for a high-level picture and worth watching.

Thursday, August 20, 2009

Here is some good news from last weekend. A challenge earlier considered impossible and impractical has received more than 400 proposals. I am talking about the Smart Grid project (part of Grid Vision 2030), through which the United States Department of Energy (DOE) is trying to modernize and improve the electric grid by introducing microprocessor- and microcontroller-based control systems.

Frankly, I have been wondering whether the DOE does not already use microprocessors! Processors have been around for decades; the devil is mostly in the details. Anyway, it's good news in terms of efficiency: reducing the power lost in transmission and distribution also means reducing global warming due to human activities. Losses during transmission and distribution account for 7.2% of the total power produced. The DOE deserves appreciation for its good job of conducting e-forums on the Smart Grid. Something this aggressive needs to be done in developing countries like India, which experience power shortages and huge transmission and distribution losses.

Tuesday, August 18, 2009

A quick and dirty post while drinking a tall non-fat latte in a coffee shop. A few months ago, a friend asked me whether C/C++ had a garbage collector, since there is no native garbage collection in C or C++. My answer was no. Apparently I was wrong: there is a garbage collector for C, and it is not just some crackpot code; it is hosted on a Hewlett-Packard site. Sweet. But do C or C++ need it? I have coded Java for fun as well as for food. From a coding perspective a garbage collector is very helpful and makes your job easy. Collection happens efficiently without your intervention, but it still uses up considerable CPU cycles. So there are times when the garbage collector becomes a performance nightmare; in those cases you need to tune it for efficiency, which takes back the time you saved while coding.

C and C++ are widely used in embedded systems because of their efficiency, reduced CPU cycles, and real-time performance. Although some have demonstrated Java programs reaching the speed of C programs, you cannot expect that consistently across the wide range of embedded systems programming. Garbage collection is a sensible convenience when the operating system takes care of the memory. But in embedded systems programming, you know how much memory you have and what to do to manage it; it's the same way the operating system itself manages memory. So you have to be disciplined enough in your programming task to allocate and free memory and to construct and destroy objects yourself. Leaving it to a garbage collector will slow down your embedded system.

Garbage collection is a great idea, and C or C++ can certainly use it when programming a general-purpose processor that runs an operating system. But it is not for embedded systems.

Wednesday, August 12, 2009

Gregory Clark has written an article in the Washington Post in which he claims that in the future we will not have jobs, as people will be replaced by machines.

I recently carried out a complicated phone transaction with United Airlines but never once spoke to a human; my mechanical interlocutor seemed no less capable than the Indian call-center operatives it replaced. Outsourcing to India and China may be only a brief historical interlude before the great outsourcing yet to come -- to machines. And as machines expand their domain, basic wages could easily fall so low that families cannot support themselves without public assistance.

Interesting. But this is not the first time I have heard this. As an engineer, I have been on the receiving side of this blame on several occasions, when I met old bank employees and typewriter mechanics who feared losing their jobs when computers were introduced.

So does innovation lead to crisis? If so, why were we not happy being cavemen, hunting rabbits and being hunted by saber-tooths? Is the evolution of mankind as an intellectual animal actually a negative thing?

I disagree with Gregory Clark, and this is my perspective. There is a strange feature of technological growth: it never stops. In a competitive environment, people always want to make their product better in the market and faster to reach the market, and the only way of doing that is through technology. So every technical innovation has a lifetime, which ends when something better is invented.

When cavemen invented the wheel, many people who used to push big blocks of stone might have lost their jobs. But their kids might have got jobs as wheel-makers. The important part is that even the son of the jobless caveman should have access to the school of wheel-making. Once that is assured, this cycle can continue endlessly. So unskilled workers just need pay enough to keep themselves up and to provide the education that turns their kids into skilled workers. Government intervention must ensure this happens through several means: minimum wage policy, inflation control, subsidized education, and so on.

Tuesday, August 04, 2009

Back in 2005, I listened to a recording of Intel President Paul Otellini saying this at the Intel Developer Forum, describing Intel's future direction:

We are designing all of our future development to multicore designs. We believe this is a key inflection point for the industry.

Following the diminishing returns from instruction-level parallelism in uniprocessors, the world of computer architecture decided that multicore processors and chip multiprocessors are the direction of the future.

I knew the importance of multicore processors even before they became famous in general-purpose computing: part of my undergraduate research thesis involved implementing digital beamforming on a quad-core SHARC processor. Now it is apparent that multicore processors are here to stay, and whether you like it or not, parallel programming is the future of computing. Web programs already run in parallel, managed by web application servers, and embedded systems programming is rapidly introducing parallelism wherever performance matters. Still, two issues make parallel programming difficult. The first is the availability of debugging tools, especially for the rather unique bugs like Heisenbugs: timing-dependent bugs that change or disappear when you try to observe them. Multicore vendors and compiler designers are coming up with parallel-programming debugger extensions to reveal Heisenbugs and ease programming, but the problem is clear, present, and painful at this stage.

The second issue is the lack of simulators for multicore processors. SimpleScalar is certainly an excellent processor simulator, but simulating a chip multiprocessor (CMP) with hundreds of cores is still an open problem for the computer architecture community. Recently, Monchiero et al. of Hewlett-Packard Laboratories came up with an idea for simulating large shared-memory CMPs, published in a recent SIGARCH transaction.

The best part of this paper is the simplicity of the underlying idea: translate the thread-level parallelism of the software into core-level parallelism in the simulated CMP. The first step is to use an existing full-system simulator to separate the instruction streams belonging to different threads. Then the instruction flow of each thread is mapped to a different core of the target CMP. The final step is simulating the synchronization between the cores. The framework described in this paper can take any multithreaded application running in a conventional system simulator and extend the evaluation to any homogeneous multicore processor. I believe this framework will be used in many CMP simulators in the future.

UPDATED ON 02/02/2010: This might be a viable multicore processor simulator.