Tigran Aivazian Says His SMP Contributions to Linux Kernel While at SCO Were Approved by his Boss

Friday, December 12 2003 @ 03:01 PM EST

Groklaw has reported before on contributions made to the Linux kernel by Christoph Hellwig while he was a Caldera employee. We have also offered some evidence of contributions by oldSCO employees. Alex Roston decided to do some more digging about the contributions of one kernel coder, Tigran Aivazian.

Tigran contributed code to the kernel, including to SMP, while working at oldSCO, and he informs us he did it with the approval of his superiors there at the company and his boss knew the code would be distributed under the GPL.

This paper is a group effort. Alex's research was shared with others in the Groklaw community, who honed, edited, and added further research. Then the final draft was sent to Tigran himself, so he could correct and/or amplify, which he has done.

***************************************

TIGRAN AIVAZIAN SAYS HIS SMP CONTRIBUTIONS TO LINUX KERNEL WHILE AT SCO WERE APPROVED BY HIS BOSS
~by Alex Roston

One of the most interesting bodies of evidence in the SCO case is the work of Tigran Aivazian, a truly excellent programmer who's made some fantastic contributions to Linux. Mr. Aivazian worked for Old SCO before the Caldera purchase and then went on to work at VERITAS. Mr. Aivazian has been a kernel maintainer for several years in two separate areas, the BFS filesystem and INTEL P6 microcode update support as noted here.

Note that the link refers to 1.1.1.2 of the kernel maintainers list. A later update of the same list (2.4.20 maintainers) shows him in charge of "INTEL IA32 MICROCODE UPDATE SUPPORT" (obviously a renaming of "INTEL P6 MICROCODE UPDATE SUPPORT"), and he's also still in charge of the BFS filesystem, though his email address has changed to tigran (at) veritas.com. Intermediate versions of the list and most of the URLs shown below list him as tigran (at) sco.com.

In other words, he's a longtime maintainer of two parts of the Linux kernel, both while at SCO and at VERITAS. BFS filesystem support isn't terribly important to most Linux users. According to the Filesystem HOWTO, the "UnixWare BFS filesystem type is a special-purpose filesystem. It was designed for loading and booting UnixWare kernel." In other words, you won't need BFS unless you're experimenting with some kind of UnixWare implementation. However, the appearance of the BFS filesystem in Linux is interesting because it's a clear-cut case of a SCO programmer transferring something that could be described as an enterprise enhancement from UnixWare to Linux. According to this note by Tigran, it's been part of the standard kernel since October 28th of 1999.

At this point, you may be wondering two things. First, is Mr. Aivazian some kind of rogue coder, a guy who couldn't keep his employer's trade secrets? Let me assure you that this is not the case. Take for example this email, where we discover that Tigran Aivazian, writing from his sco.com email address, is finally able to speak publicly about certain facts that previously he wasn't free to discuss:

"The stuff below is no longer a secret so I think I can share it with you. It does mention Linux quite a few times and may even answer someone's question as to whether SCO is interested at all in Linux."

The next question you likely have is, were his Linux contributions authorized by the company, and did the company realize they would be distributed under the GPL? As frequent Groklaw contributor Harlan put it upon reading a first draft of this piece, "These are the good guys. Chris Hellwig, Tigran Aivazian, Steve Pate, Jun Nakajima, and Niels Christiansen have each taken time out to write about what they are doing, and to explain or teach others. This is communal computing - exactly what Ken Thompson and Dennis Ritchie wanted to preserve when Bell Labs pulled their group out of the MULTICS Project."

We contacted Mr. Aivazian about this matter, and he wrote back as follows:

"Yes, my very humble contributions to the Linux kernel (BFS filesystem and IA32 microcode update driver) done during my work as an escalations (UnixWare kernel) engineer at SCO were approved by our then-director Wendy Jones (who now works for Sun I think) and by higher management as well (I have bad memory on names, so I can't remember exactly, I think it was on Doug's level or so).

"For example in the case of BFS filesystem the matter was as follows. I did NOT use any of the UnixWare (or other) proprietary code for the implementation, of course. However, despite this fact, I still (for courtesy and generally being cautious) requested permission from Wendy (Development director) before the release under GPL and she confirmed that SCO has no claims to this work whatsoever and has no objections to its release under GPL, because it is not connected to UnixWare source code in any way." [emphasis added]

Let's move along to the microcode update feature of the Linux kernel, which Mr. Aivazian also maintains. This particular kernel code is much more important than the BFS filesystem because it deals with installing updates from Intel into their CPUs. It's a part of the kernel you don't hear much about, but should someone discover a major Pentium bug, the phrase "IA32 microcode update support" will suddenly be on everyone's lips. Microcode update support was added as of kernel version 2.3.46, and it has also been backported to kernel 2.2.18.

In terms of the SCO vs. IBM case, Mr. Aivazian's work is important for two reasons. First, he has worked on SMP, though not all of his SMP work takes place in kernel space. Second, he's done an enormous amount of work toward making Linux into an "enterprise level" operating system. Paragraph 114 of SCO's Amended Complaint reads as follows:

"114. IBM has breached its obligation of confidentiality by contributing portions of the Software Product (including System V source code, derivative works and methods based thereon) to open-source development of Linux and by using UNIX development methods in making modifications to Linux 2.4.x and 2.5.x, which are in material part, unauthorized derivative works of the Software Product. These include, among others, (a) scalability improvements, (b) performance measurement and improvements, (c) serviceability and error logging improvements, (d) NUMA scheduler and other scheduler improvements, (e) Linux PPC 32- and 64-bit support, (f) AIX Journaling File System, (g) enterprise volume management system to other Linux components, (h) clusters and cluster installation, including distributed lock manager and other lock management technologies, (i) threading, (j) general systems management functions, and (k) others."

Keep reading and you'll see that Mr. Aivazian, who worked for SCO while he did most of the work we're about to discuss, made major contributions toward helping Linux "achieve," as SCO puts it in their complaint, "high-end enterprise functionality."

Let's start by looking at Mr. Aivazian's contributions to SMP. Here we discover him being thanked by the maintainer of the SMP HOWTO.

Here we see, in the file header to smpboot.c, that Mr. Aivazian is credited with fixing a minor problem called the "0.00 in /proc/uptime on SMP" bug. Here we see the letter where he made his suggestion for dealing with the problem:

Here Tigran talks about a bug in linux-smp. To be more exact, he's adding to the bug description. The bug was originally found in the "read(2)" section of the code and Aivazian takes things one step further, noting that the bug is also in the "write(2)" section of the code. You can see that he worked as part of SCO's Escalations Research Group.

This takes place on the 16th of October in 1998. In other words, three years after Caldera financed the purchase of a dual Pentium board so Alan Cox could develop SMP, a SCO coder was also making contributions to SMP and this was before the Caldera-oldSCO deal.

On this thread from 1999, long before IBM started working on Linux, he discusses "SMP safe operation."

Here's another SMP thread from 1999, with Tigran writing from his sco.com email address.

Neither of the URLs above discusses SMP in the kernel in any depth, but together they make clear that a SCO programmer was actively helping to make someone else's Linux code work properly under SMP. What's even more exciting about this discussion is that the programmers are talking about improving get_cycles(), which is a tool for counting CPU cycles. Improving this code provides Linux with improvements in performance testing and scheduling. Also, as you doubtless know, code that runs properly under SMP is, according to SCO, "enterprise capable" code.

In another thread, also from 1999, Tigran provides a bootlog showing an SMP error in the 2.3.32 kernel.

Linus accepts the bug report and replies that he's removing the bad code.

On January 4th of 2000, Aivazian talks about code that might be "...not only inconsistent but disastrous (on SMP if list is modified at teh (sic) same time)" and here's the reply from Manfred Spraul. He agrees that the code can cause SMP problems. Once again, we see the SCO programmer helping Linux coders make their code run with SMP.

Here's a patch to the 3c509 driver (3c509 is an ethernet chip), where Tigran and two other programmers discuss possible problems relating to SMP code. However, you'll also note that the patch wasn't accepted.

A couple of other references to the conversation can be found here, where Tigran is writing directly to Linus, and here.

In this post Tigran talks about making the piece of code under discussion SMP safe. This is a big deal for Aivazian, because as we'll see later, he prefers to use the vmalloc function (rather than the kmalloc function) for his microcode update work.

This is from kernel 2.3.35. The person who replies to him is Alan Cox, who implemented the first version of SMP for Linux. Cox dislikes Aivazian's suggestion.

Tigran replies, noting that he sent Alan a private e-mail on this subject, and suggesting a patch.

This idea is kicked around by Aivazian, who notes that he is looking for some code for testing this subsystem.

James Lokier, Manfred Spraul, and Bill Wendling discuss this for a little while. Meanwhile, on a different subthread, Tigran suggests that perhaps it's time to use a patch he submitted some time back.

Then he changes the name of the thread to "smp-safe vmalloc (was Re: [Patch] Polling on more than 16000 file descriptors)" and continues in this post.

Note that this is some very high-level discussion about SMP, though it appears, in the next mail from Bill Wendling, that this particular idea didn't work very well.

You'll notice that in addition to the issues with SMP, much of the discussion above centers around Aivazian's concerns with vmalloc and kmalloc. Aivazian is credited with making vmalloc.c SMP safe in May of 2000.

This listing is from kernel 2.5.69, but note the date of Aivazian's contributions.

Aivazian's concerns with vmalloc, both in the 2.5.69 kernel and in the kernels he was discussing in January of 2000, are significant for two reasons. First, the improvements to the vmalloc function are minor performance enhancements (the previous vmalloc interacted with SMP in a fairly clumsy manner; the new one is much more subtle in the way it interacts with multiple CPUs). Second, Mr. Aivazian prefers the use of vmalloc for the IA32 microcode update feature he maintains. To quote his Linux Magazine article on the microcode update:

"The microcode_write() routine performs the following steps...

"4. Allocates a kernel buffer (using vmalloc()) large enough to hold the user-supplied sequence of microcode chunks. If this request fails, we return to user space without freeing mc_applied, hoping that it may be needed later. The vmalloc() function is preferred over the kmalloc() function because the buffers may be very large (on the order of 100-200K), and we do not need a physically contiguous area but only a virtually contiguous one; so vmalloc() can suffice."

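The distinction drawn in that passage, between a physically contiguous buffer and a merely virtually contiguous one, can be illustrated with a toy model. The sketch below is ordinary Python, not kernel code, and the names alloc_contiguous and alloc_scattered are invented for illustration. It treats physical memory as a list of page frames and only searches for frames; it does not mark them as used. The first function needs a run of adjacent free frames (the kmalloc-style constraint); the second takes any free frames, on the assumption that a page table would then map them into one contiguous virtual range (the vmalloc-style approach).

```python
from typing import List, Optional


def alloc_contiguous(free: List[bool], npages: int) -> Optional[List[int]]:
    """Find npages ADJACENT free frames, or return None if no such run exists."""
    run = 0
    for i, is_free in enumerate(free):
        run = run + 1 if is_free else 0
        if run == npages:
            return list(range(i - npages + 1, i + 1))
    return None


def alloc_scattered(free: List[bool], npages: int) -> Optional[List[int]]:
    """Take ANY npages free frames; a page table would make them look contiguous."""
    frames = [i for i, is_free in enumerate(free) if is_free][:npages]
    return frames if len(frames) == npages else None


if __name__ == "__main__":
    # Fragmented memory: five free frames, but never more than two in a row.
    free = [True, False, True, True, False, True, False, True]
    print("contiguous, 4 pages:", alloc_contiguous(free, 4))
    print("scattered,  4 pages:", alloc_scattered(free, 4))
```

On fragmented memory a large contiguous request can fail even though enough free frames exist in total, which is one reason a buffer of 100-200K is a better fit for the scattered approach.
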
This, of course, brings us to the microcode update program. You should note that the microcode update feature was a work-in-progress for some time. He developed it while he worked at SCO but continued to maintain it while employed at VERITAS. He did not work alone. Roland Smith has pointed out that he believes the actual microcode binary data itself comes from Intel and that Mr. Aivazian also received help from Intel employees.

Now things get really interesting. In the Linux Magazine article above, Aivazian states, "The other Unix-like IA32 operating systems known to the author that support microcode update (on P6 family only) are SCO OpenServer 5.0.6 and SCO UnixWare 7.1.1. The Linux implementation was written from scratch in the author's spare time and was not based on any Unix or non-Unix version."

We should also note that, as Mr. Aivazian explained above, he uses a different approach to the microcode update feature than was used by SCO. As he tells us in the Linux Magazine article:

"Obviously, any device driver needs some user space program to interact with it, even if it is one of the existing programs like dd(1) or cat(1). It is therefore important to decide early on how much work is to be done in the kernel and how much in user space.

"For the Linux implementation of the driver, the author decided to perform all the work of selecting the appropriate microcode chunk, checking the revision, validating the checksum, and applying the microcode in kernel space. The only work left for user space is to convert the microcode from the format supplied by Intel to one that is easier to manipulate in the kernel and to control the kernel driver via the ioctl(2) system call.

"This design choice is not the only possible one. To give an example, I will mention a non-Linux implementation of the microcode update feature, namely that of SCO UnixWare 7.x. Since the UnixWare kernel allows the running process to bind itself to the current CPU so that it can safely operate on the data structures corresponding to that CPU (knowing that it will never be scheduled to run on a different CPU), it is possible to implement the microcode update feature almost entirely in user space. I say "almost" because the ability to execute privileged instructions, such as reading and writing MSRs, is still restricted to the code running in the kernel."

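The idea of a process binding itself to one CPU can be sketched with the affinity calls that Python exposes on Linux. This is only an analogy: the article does not say which interface UnixWare provided, and the helper name pin_to_one_cpu is invented here. The calls os.sched_getaffinity and os.sched_setaffinity are real, but are available only on some platforms, Linux among them.

```python
import os


def pin_to_one_cpu() -> int:
    """Restrict the calling process to a single CPU and return that CPU's number."""
    allowed = os.sched_getaffinity(0)   # 0 means "the calling process"
    target = min(allowed)               # arbitrary choice: lowest-numbered allowed CPU
    os.sched_setaffinity(0, {target})   # from now on the scheduler keeps us on it
    return target


if __name__ == "__main__":
    cpu = pin_to_one_cpu()
    print(f"now restricted to CPU {cpu}: {sorted(os.sched_getaffinity(0))}")
```

Once pinned, the process knows it will not be moved to another processor, which is the property the UnixWare design relies on. As the quote notes, the privileged MSR reads and writes still have to happen in the kernel.
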
Moving along, we find this URL, where Mr. Aivazian asks for comments on a patch to kernel 2.3.47. Note the reference to "lock_kernel" (locking is an important SMP concept), smp_lock.h, and the variable "smp_num_cpus."

So we can see that there's an enormous amount of SMP-related stuff in this file, which as I noted earlier, is now part of the Linux kernel. (The user portion of the microcode update feature is here.)

Let's put SMP issues aside for now (after all, every piece of kernel code needs to be SMP-safe) because this is an important point to make. To explain why, let's begin by defining microcode. According to Frank Wales:

"... the instruction set for the processor is broken down into a small set of elements that are designed to be activated in various combinations, a bit like the keys on a piano. Combine a few this way, and the processor adds the contents of that memory location to this register. Combine some others another way, and the processor jumps to this address in memory for its next program instruction. And so on.

"The specification for all these combinations of parts that make up all the CPU's instructions is what the microcode represents. It's basically the programming within the processor that makes it work as the CPU specification says it should..."

The purpose of microcode update is to update the microcode on an Intel CPU. If, for example, the eight Pentium IV chips in your big server are discovered to have a bug, you can do two things. You can wait until Intel brings out new chips (or hands out a BIOS upgrade) and replace the chips in your server, which could get expensive, or you could download the fix from Intel and update the microcode yourself by using the microcode update feature Tigran Aivazian put into the Linux kernel. Now each time the machine boots, the correct microcode will be loaded into the CPUs and your server will run correctly. (Use of the microcode update feature is not absolutely necessary - the kernel has other ways of detecting and avoiding CPU bugs, but it certainly is a very useful tool.)

So why all the SMP code? The answer is simple. Microcode update might be working with the internal structures of multiple CPUs. In order to load the new code to multiple CPUs correctly, the microcode update feature has a need for SMP capabilities that goes well beyond merely being "SMP-safe." As Mr. Aivazian tells us in the Linux Magazine article, "On an SMP system, we must follow the procedure for updating the microcode for each processor separately, using a different microcode update for "mixed-stepping" SMP systems."

In other words, if there's more than one CPU, the kernel's microcode update code has to know how to handle the problem. It has to separately check each of the CPUs and make sure they are capable of accepting a microcode update. Then it needs to get processor flags from the CPU it's currently working on and check the microcode which is to be applied to that CPU against those flags so it can make sure that it's updating the right type of CPU. After that, it checks the current revision of the CPU against the microcode it is attempting to load, so it can make sure that the CPU isn't already more advanced than the microcode you want to use. Finally, it makes sure that the microcode has been properly applied to the CPU and gives an error message if the update doesn't work for some reason. Then it goes on to the next CPU and does everything again.
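
The per-CPU procedure just described can be modelled in a few lines. This is a user-space simulation in Python, not the kernel driver: the Cpu and Chunk classes, their fields, and the message strings are all invented for illustration, and the assignment to cpu.revision merely stands in for the privileged write the real driver performs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cpu:
    """Toy model of one processor in an SMP system (all fields hypothetical)."""
    signature: int          # identifies the processor type and stepping
    flags: int              # processor flags bitmask
    revision: int           # microcode revision currently loaded
    updatable: bool = True  # whether this CPU accepts microcode updates at all


@dataclass
class Chunk:
    """Toy model of one microcode chunk from the user-supplied sequence."""
    signature: int
    flags: int              # processor flags this chunk is valid for
    revision: int


def find_chunk(cpu: Cpu, chunks: List[Chunk]) -> Optional[Chunk]:
    """Return the newest chunk matching this CPU's signature and flags."""
    matches = [c for c in chunks
               if c.signature == cpu.signature and (c.flags & cpu.flags)]
    return max(matches, key=lambda c: c.revision, default=None)


def update_all(cpus: List[Cpu], chunks: List[Chunk]) -> List[str]:
    """Handle each CPU separately and report what happened to it."""
    report = []
    for n, cpu in enumerate(cpus):
        if not cpu.updatable:
            report.append(f"CPU{n}: not capable of microcode update")
            continue
        chunk = find_chunk(cpu, chunks)
        if chunk is None:
            report.append(f"CPU{n}: no matching microcode")
            continue
        if chunk.revision <= cpu.revision:
            report.append(f"CPU{n}: already at revision {cpu.revision}")
            continue
        old = cpu.revision
        cpu.revision = chunk.revision        # stand-in for the privileged write
        if cpu.revision != chunk.revision:   # read back; always passes in this toy
            report.append(f"CPU{n}: update failed")
            continue
        report.append(f"CPU{n}: updated from revision {old} to {chunk.revision}")
    return report


if __name__ == "__main__":
    # A "mixed-stepping" system: two CPU types, each needing its own chunk.
    cpus = [Cpu(0x686, 0x1, 3), Cpu(0x683, 0x1, 7), Cpu(0x686, 0x1, 9)]
    chunks = [Chunk(0x686, 0x1, 8), Chunk(0x683, 0x1, 7)]
    for line in update_all(cpus, chunks):
        print(line)
```

The point of the loop is that nothing is assumed to be shared between processors: each one is matched against the chunk list on its own, which is what makes a mixed-stepping system work.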

Needless to say, SMP-aware microcode update is definitely an enterprise-level feature, and it interacts with other enterprise-level features, such as the Symmetric Multiprocessing in the main kernel. And who's responsible for this wonderful enterprise-level tool? A SCO programmer.

In addition to the multiple SMP calls, the microcode update feature also uses vmalloc. As you'll recall, Aivazian shows great concern for vmalloc in many of the messages we reviewed above, and he's also credited with making the kernel's vmalloc.c code SMP-safe. This doubtless led to improvements in microcode update reliability because any problems in vmalloc might be reflected in microcode update's performance. As you can see from the way these two issues intertwine, Mr. Aivazian obviously knows how to keep his ducks in a nice, neat row.

While the microcode update feature and SMP work are very important, in terms of the SCO vs. IBM case, the most interesting contributions Mr. Aivazian made were to the kernel debugger.

"kgdb is a source level debugger for linux kernel. It is used along with
gdb (the GNU debugger - Alex) to debug linux kernel. Kernel developers can debug a kernel similar to application programs with use of kgdb. It makes it possible to place breakpoints in kernel code, step through the code and observe variables.

"Two machines are required for using kgdb. One of these machines is a development machine and the other is a test machine. The machines are connected through a serial line, a null-modem cable which connects their serial ports. The kernel to be debugged runs on the test machine. gdb runs on the development machine. The serial line is used by gdb to communicate to the kernel being debugged.

"kgdb is a kernel patch..."

That's right. kgdb is not included in the main kernel. Linus has opposed the addition of a kernel debugger for years - but if you apply the kgdb patch and rebuild the kernel, kgdb will feed its output to the regular GNU debugger, gdb, which is located on a separate machine. Why is this valuable? Because even if the kernel you're working on contains some truly evil code which melts your CPU and obliterates your hard drive when it crashes, the debugging information, up to the point where your code crashed, is sent to another computer so you can use the standard GNU tools to discover exactly how the disaster happened.

You'll note on this page that the kgdb patch was initially available for kernel 2.2.5, which was released in March of 1999, about a year and a half before Caldera bought SCO.

According to the author of the piece, Martin Pool, Mr. Aivazian integrated "thread support" and "support for multiple processors" (that is, SMP) into the kgdb code that had been contributed by the "Lake Stevens Instrument Division." To quote Mr. Pool, this code is for "on-line debug support for multiprocessor machines, which is an enterprise feature inasmuch as that word has meaning."

This is significant because Paragraph 84 of SCO's Amended Complaint says that they believe Linux could not have succeeded without "...access to expensive and sophisticated design and testing equipment," and this idea is rehashed several times in the course of SCO's complaint. If the ability to use one machine to debug another isn't "sophisticated design," I don't know what is. It's certainly "expensive," particularly if you're debugging SMP code and need a multiprocessor motherboard, and it might even fall under SCO's definition of "testing equipment." According to Roland Smith the "testing equipment" portion of SCO's complaint probably refers to an In Circuit Emulator, such as the one pictured here.

According to Dave from the sales staff at American Arium, their version of an In Circuit Emulator for Pentium chips costs anywhere from ten thousand to forty thousand dollars.

Beyond being an enterprise-level enhancement, kernel debugging is also interesting because it is tightly tied to the kernel's SMP support, and we know that SMP is one of the things SCO claims could not have happened without IBM support. In fact, kgdb can be used to test the kernel's SMP modules.

In other words, at least one SCO programmer was putting important contributions of code into the Linux kernel since before SCO was bought by Caldera, and one of the things he contributed to is an important testing suite to help make SMP safe. One of the functions of a programmer is to debug code, so kgdb is an important general addition to the kernel.

To continue, here we see Mr. Aivazian responding to a letter from Andrea Arcangeli about which debugger Arcangeli should include in his own source tree.

Here's another patch for kgdb dating from October of 2003. In fact, this is meant to work with the 2.6pre6 kernel. Once again, Tigran Aivazian is mentioned twice for his earlier contributions and you'll also find considerable amounts of SMP.

The references above demonstrate that Mr. Aivazian tried to add debugging code to the official kernel and had influence on the development of kernel debugging for Linux. It should also be obvious that besides containing SMP code the kernel debugger is also very useful for writing good SMP code. This enterprise-level enhancement dates back to kernel 2.2.5 - once again, long before the Caldera-SCO purchase and long before IBM programmers began contributing to the Linux kernel.

It seems everything Mr. Aivazian does works together. If SMP or vmalloc.c doesn't work, the microcode update feature might fail, so he works on SMP and vmalloc.c. If either SMP or microcode update is crashing, he needs to know why, so he works on the debugger. The thing that makes this important to us is that SCO's complaint accuses IBM of illegally helping to develop enterprise level features in Linux. To quote paragraph 82 of SCO's Amended Complaint:

"Virtually none of these software developers and hobbyists had access to enterprise-scale equipment and testing facilities for Linux development. Without access to such equipment, facilities and knowledge of sophisticated development methods learned in many years of UNIX development it would be difficult, if not impossible, for the Linux development community to create a grade of Linux adequate for enterprise use."

But here we see the SCO programmer working on "enterprise-scale equipment and testing facilities" - and SCO's name is literally all over it. Also, from the way all Mr. Aivazian's work connects together, we can also conclude that he has made a very important contribution toward creating "a grade of Linux adequate for enterprise use."

Now let's go back to paragraph 114 of SCO's Amended Complaint. IBM is accused of giving Linux:

(b) performance measurement and improvements - Mr. Aivazian has done a great deal of work toward giving us a nice kernel debugger, which of course can help us measure and improve performance. As you'll recall, he also advised Andrea Arcangeli in his efforts to improve the get_cycles() function, which is the code for measuring CPU cycles that we discussed above. Lastly, his improvements to vmalloc, as I noted earlier, represent minor performance improvements.

Once again, these are all "enterprise level" features, and Mr. Aivazian has been doing this work since at least 1998. Further, he did almost all the work we've seen here both before SCO was purchased by Caldera and before IBM began working on Linux. And he did it with authorization from his superiors at SCO.

Thanks to everyone who read the previous version of this story, making it a true open-source effort, particularly bruzie, D, Doughnuts Lover, emebit, John Gabrial, gunnark, Harlan, jamesw, kevin, rsmith, Ruidh, snorpus, tazer, and Frank Wales. Special thanks, of course, go to Mr. Tigran Aivazian who was kind enough to read this manuscript and offer some final corrections. He wishes to make clear that the views he expresses are his own, not those of his employer.