The Developer DiariesOr, Random Thoughts on Life and Softwarehttps://blogs.oracle.com/lunchware/feed/entries/atom2012-01-16T04:16:17+00:00Apache Rollerhttps://blogs.oracle.com/lunchware/entry/solaris_10_containers_releasedSolaris 10 Containers Released on OpenSolarisJordan Vaughan 2009-10-23T20:10:33+00:002009-10-24T03:14:16+00:00<p>After roughly nine months of nonstop development, Jerry Jelinek <a href="http://opensolaris.org/os/community/on/flag-days/pages/2009102201">integrated the first phase of <em>solaris10</em>-branded zones</a> (a.k.a. Solaris 10 Containers) into <a href="http://www.opensolaris.com/get/index.jsp">OpenSolaris</a> build 127 yesterday. Such zones enable users to host environments from <a href="http://www.sun.com/software/solaris/get.jsp">Solaris 10 10/09</a> and later inside <a href="http://www.opensolaris.org/os/community/zones">OpenSolaris zones</a>. <a href="http://blogs.sun.com/lunchware/entry/solaris_10_containers_for_opensolaris">As mentioned in one of my earlier posts</a>, we're developing <em>solaris10</em>-branded zones so that users can consolidate their Solaris 10 production environments onto machines running OpenSolaris and take advantage of many innovative OpenSolaris technologies (such as <a href="http://opensolaris.org/os/project/crossbow">Crossbow</a>) within such environments.</p>
<p><a href="http://blogs.sun.com/jerrysblog/entry/solaris10_branded_zones_on_opensolaris">As Jerry mentioned in his blog</a>, this first phase delivers emulation for Solaris 10 10/09, physical-to-virtual (p2v) and virtual-to-virtual (v2v) capabilities to help users deploy Solaris 10 environments in <em>solaris10</em>-branded zones, and support for all three OpenSolaris-supported platforms (sun4u, sun4v, and x86). He also explained that there are some limitations that will be addressed in the second phase of the project. However, he didn't mention that users are unable to use <a href="http://docs.sun.com/app/docs/doc/816-5166/dtrace-1m">dtrace(1M)</a> and <a href="http://docs.sun.com/app/docs/doc/816-5165/mdb-1">mdb(1)</a> on processes running in <em>solaris10</em>-branded zones if dtrace(1M) and mdb(1) are executed in the global zone. This resulted from incompatible changes made to some of the debugging libraries between Solaris 10 and OpenSolaris and it will be addressed during the second development phase. In the meantime, users can use dtrace(1M) and mdb(1) inside <em>solaris10</em>-branded zones to examine processes running inside of the zones.</p>
<p>If you are an OpenSolaris or Solaris 10 kernel developer, then I urge you to read the <a href="http://www.opensolaris.org/os/community/zones/s10brand_dev_guide"><em>Solaris10</em>-Branded Zone Developer Guide</a>, which explains the purpose and implementation of <em>solaris10</em>-branded zones as well as what you'll need to do to avoid breaking such zones. It's every kernel developer's responsibility to ensure that <em>solaris10</em>-branded zones will work with his or her changes to the Solaris 10 and OpenSolaris user-kernel interfaces (syscalls, ioctls, kstats, etc.).</p>
<p>This project was full of surprises and challenges. One of my favorite bugs involved Solaris 10's libc's use of the x86 <code>%fs</code> segment register. Solaris 10's libc expected the x86 <code>%fs</code> register to contain a nonzero selector value in 64-bit processes (Solaris 10's <a href="http://src.opensolaris.org/source/search?q=&defs=__curthread">__curthread()</a> returns <code>NULL</code> if <code>%fs</code> is zero), which was problematic because OpenSolaris' kernel cleared <code>%fs</code>. Libc has always used <code>%fs</code> to locate the current thread's <a href="http://src.opensolaris.org/source/search?q=&defs=ulwp_t"><code>ulwp_t</code></a> structure on 64-bit x86 machines. Therefore, 64-bit x86 processes running inside <em>solaris10</em>-branded zones were unable to use <a href="http://docs.sun.com/app/docs/doc/816-5168/thr-main-3c"><code>thr_main(3C)</code></a> and other critical libc functions as well as several common libraries, such as libdoor.</p>
<p>The fix was somewhat complicated because it had to guarantee that all threads in all 64-bit x86 processes running in <em>solaris10</em>-branded zones would start with nonzero <code>%fs</code> registers. Fortunately, only two system calls modify <code>%fs</code> in Solaris 10 and OpenSolaris: <code>SYS_lwp_private</code> and <code>SYS_lwp_create</code>. <code>SYS_lwp_private</code> is a libc-private system call that's invoked once when libc initializes after a process execs (see OpenSolaris' implementation of <a href="http://src.opensolaris.org/source/search?q=&defs=libc_init"><code>libc_init()</code></a>) in order to configure the <code>FS</code> segment so that its base lies at the start of the single thread's <code>ulwp_t</code> structure. <code>SYS_lwp_create</code> takes a <a href="http://src.opensolaris.org/source/search?q=&defs=ucontext_t"><code>ucontext_t</code></a> structure and the address of a <code>ulwp_t</code> structure and creates a new thread for the calling process with the given thread context and an <code>FS</code> segment beginning at the start of the specified <code>ulwp_t</code> structure.</p>
<p>My initial fix did the following:</p>
<ol>
<li>The solaris10 brand's emulation library interposed on <code>SYS_lwp_private</code> in <code>s10_lwp_private()</code>. It handed the system call to the OpenSolaris kernel untouched and afterwards invoked <code>thr_main(3C)</code> to determine whether the Solaris 10 environment's libc worked after the kernel configured <code>%fs</code>. If <code>thr_main(3C)</code> returned <code>-1</code>, then the library invoked a special <code>SYS_brand</code> system call to set <code>%fs</code> to the old nonzero Solaris 10 selector value.</li>
<li>The brand's emulation library also interposed on <code>SYS_lwp_create</code> in <code>s10_lwp_create()</code> and tweaked the supplied <code>ucontext_t</code> structure so that the new thread started in <code>s10_lwp_create_entry_point()</code> rather than <a href="http://src.opensolaris.org/source/search?q=&defs=_thrp_setup"><code>_thrp_setup()</code></a>. Of course, new threads had to execute <code>_thrp_setup()</code> eventually, so <code>s10_lwp_create()</code> stored <code>_thrp_setup()</code>'s address in a predetermined location in the new thread's stack. <code>s10_lwp_create_entry_point()</code> invoked <code>thr_main(3C)</code> to determine whether the Solaris 10 environment's libc worked when <code>%fs</code> was zero. If <code>thr_main(3C)</code> returned <code>-1</code>, then the new thread invoked the same <code>SYS_brand</code> system call invoked by <code>s10_lwp_private()</code> in order to correct <code>%fs</code>. Afterwards, the new thread read its true entry point's address (i.e., <code>_thrp_setup()</code>'s address) from the predetermined location in its stack and jumped to the true entry point.</li>
<li>The <em>solaris10</em> brand's kernel module ensured that forked threads in <em>solaris10</em>-branded zones inherited their parents' <code>%fs</code> selector values. This ensured that forked threads whose parents needed <code>%fs</code> register adjustments started with correct <code>%fs</code> selector values.</li>
</ol>
<p>I committed the fix and was content until a test engineer working on <em>solaris10</em>-branded zones, Mengwei Jiao, reported a segfault of a 64-bit x86 test in a <em>solaris10</em>-branded zone. I immediately suspected my fix because the test was multithreaded, yet I was surprised because I had thoroughly tested my fix and had never encountered segfaults. Mengwei's test created and immediately canceled a thread using <a href="http://docs.sun.com/app/docs/doc/816-5168/pthread-create-3c">pthread_create(3C)</a> and <a href="http://docs.sun.com/app/docs/doc/816-5168/pthread-cancel-3c">pthread_cancel(3C)</a>. After spending hours debugging core dumps, I discovered that I had forgotten to consider signals while testing my fix.</p>
<p>The test segfaulted because its new thread read a junk address from its stack in <code>s10_lwp_create_entry_point()</code> and jumped to it. Something clobbered the thread's stack and overwrote its true entry point's address. I noticed that the thread didn't start until its parent finished executing <code>pthread_cancel(3C)</code>, so I suspected that the delivery of the <code>SIGCANCEL</code> signal clobbered the child's stack. It turned out that the child started in <code>s10_lwp_create_entry_point()</code> as expected but immediately jumped to <a href="http://src.opensolaris.org/source/search?q=&defs=sigacthandler"><code>sigacthandler()</code></a> in libc to process the <code>SIGCANCEL</code> signal. Such behavior might have been acceptable because the thread's true entry point's address was stored deep within the thread's stack (2KB from the top of the stack) and neither <code>sigacthandler()</code> nor any of the functions it invoked consumed much stack space, but <code>sigacthandler()</code> invoked <a href="http://docs.sun.com/app/docs/doc/816-5168/memcpy-3c"><code>memcpy(3C)</code></a> to copy a <a href="http://src.opensolaris.org/source/search?q=&defs=siginfo_t"><code>siginfo_t</code></a> structure and the dynamic linker hadn't yet loaded <code>memcpy(3C)</code> into the library's link map. Consequently, the thread executed <code>ld.so.1</code> routines in order to load <code>memcpy(3C)</code> and fill its associated PLT entry. Eventually the thread's stack grew large enough for <code>ld.so.1</code> to clobber the thread's true entry point's address, which produced the junk address that later led to the segfault.</p>
<p>My final solution eliminated the use of new threads' stacks and instead stored entry points in new threads' <code>%r14</code> registers. Libc doesn't store any special initial values in new threads' <code>%r14</code> registers, so I was free to use <code>%r14</code>. Additionally, any <a href="http://www.x86-64.org/documentation/abi.pdf">System V ABI</a>-conforming functions invoked by <code>s10_lwp_create_entry_point()</code> and <code>sigacthandler()</code> had to preserve <code>%r14</code> for <code>s10_lwp_create_entry_point()</code> (<code>%r14</code> is a <em>callee-saved register</em>), so it was impossible for such functions to clobber <code>%r14</code> as seen by <code>s10_lwp_create_entry_point()</code>.</p>
<p>I also renamed <code>s10_lwp_create()</code> to <a href="http://src.opensolaris.org/source/search?q=&defs=s10_lwp_create_correct_fs"><code>s10_lwp_create_correct_fs()</code></a> and used a trick that I call <em>sysent table patching</em> to ensure that the brand library only causes <code>SYS_lwp_create</code> to force new threads to start at <a href="http://src.opensolaris.org/source/search?q=&defs=s10_lwp_create_entry_point"><code>s10_lwp_create_entry_point()</code></a> after <a href="http://src.opensolaris.org/source/search?q=&defs=s10_lwp_private"><code>s10_lwp_private()</code></a> determines that the Solaris 10 environment's libc can't function properly when <code>%fs</code> is zero. The brand's emulation library accesses a global array called <a href="http://src.opensolaris.org/source/search?q=&defs=s10_sysent_table"><code>s10_sysent_table</code></a> to fetch system call handlers. An emulation function can change a system call's entry in the array in order to change the system call's handler. The emulation library invokes <a href="http://src.opensolaris.org/source/search?q=&defs=s10_lwp_create"><code>s10_lwp_create()</code></a> to emulate <code>SYS_lwp_create</code> by default, which simply hands the system call to the OpenSolaris kernel untouched. If <code>s10_lwp_private()</code> determines that new threads require nonzero <code>%fs</code> selector values, then it modifies <code>s10_sysent_table</code> so that <code>s10_lwp_create_correct_fs()</code> handles <code>SYS_lwp_create</code> system calls. <code>SYS_lwp_private</code> is only invoked while a process is single-threaded, so races between <code>s10_lwp_private()</code> and <code>SYS_lwp_create</code> are impossible.</p>
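<p>Here is a minimal user-space sketch of the sysent table patching idea, not the brand library's actual code. The enum values, the handler signature, the counters, and every function name below are hypothetical and exist only for illustration; the one idea taken from the description above is that one handler can swap another system call's entry in a global table of function pointers.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical system call numbers, used only by this sketch. */
enum { SYS_LWP_PRIVATE = 0, SYS_LWP_CREATE = 1, NSYSCALLS = 2 };

typedef int (*handler_t)(int arg);

int passthrough_creates;   /* calls seen by the default handler */
int corrected_creates;     /* calls seen by the patched-in handler */

/* Default handler: stands in for passing the call through untouched. */
int lwp_create(int arg)
{
    (void)arg;
    passthrough_creates++;
    return 0;
}

/* Replacement handler: stands in for redirecting the new thread's entry point. */
int lwp_create_correct_fs(int arg)
{
    (void)arg;
    corrected_creates++;
    return 0;
}

int lwp_private(int libc_needs_nonzero_fs);

/* Global dispatch table, indexed by system call number. */
handler_t sysent_table[NSYSCALLS] = {
    [SYS_LWP_PRIVATE] = lwp_private,
    [SYS_LWP_CREATE] = lwp_create,
};

/*
 * Runs while the process is still single-threaded, so patching the table
 * here cannot race with a concurrent lookup of the SYS_LWP_CREATE entry.
 */
int lwp_private(int libc_needs_nonzero_fs)
{
    if (libc_needs_nonzero_fs)
        sysent_table[SYS_LWP_CREATE] = lwp_create_correct_fs;
    return 0;
}

/* Look up the current handler for a system call and invoke it. */
int emulate_syscall(int num, int arg)
{
    assert(num >= 0 && num < NSYSCALLS);
    return sysent_table[num](arg);
}
```

<p>In this sketch, processes that never need the correction keep using the cheap pass-through handler, because the decision is made once in <code>lwp_private()</code> rather than on every <code>emulate_syscall()</code> call.</p>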
<p>I encourage you to <a href="http://www.opensolaris.com/get/index.jsp">download and install the latest version of OpenSolaris</a>, update it to build 127 or later (once the builds become available), and try <em>solaris10</em>-branded zones. Jerry and I would appreciate any feedback you might have, which you can send to us via the <a href="http://opensolaris.org/os/community/zones/discussions/">zones-discuss</a> discussion forum on opensolaris.org. Remember that <em>solaris10</em>-branded zones are capable of hosting production environments even though they are still being developed.</p>
<p>Enjoy!</p>https://blogs.oracle.com/lunchware/entry/svr4_packages_available_for_solarisSVR4 Packages Available for Solaris 10 ContainersJordan Vaughan 2009-07-15T15:33:56+00:002009-07-15T22:33:56+00:00<p><a href="http://blogs.sun.com/jerrysblog/">Jerry Jelinek</a> recently posted SPARC and x86/x64 SVr4 packages on the <a href="http://opensolaris.org/os/project/s10brand/">Solaris 10 Containers project page</a>. The packages contain the binaries that allow administrators to create and manage Solaris 10 Containers. (<a href="http://blogs.sun.com/lunchware/entry/solaris_10_containers_for_opensolaris">See this post</a> for information about Solaris 10 Containers.) As of this writing, the packages are synced to ONNV build 118 and should be able to manage Solaris 10 Containers running S10U7.</p>
<p>Please feel free to download and use the packages. However, please note that the technology behind Solaris 10 Containers is still in development. The binaries in the packages represent the technology as it currently stands.</p>
<p>Please send us any comments that you might have via the <a href="http://www.opensolaris.org/jive/forum.jspa?forumID=6">zones discussion forum on OpenSolaris.org</a>. Any feedback you can provide regarding bugs would be especially welcome because it would help us discover behaviors that will require emulation. :)</p>https://blogs.oracle.com/lunchware/entry/solaris_10_containers_for_opensolarisSolaris 10 Containers for OpenSolarisJordan Vaughan 2009-05-07T18:43:38+00:002009-05-08T04:24:06+00:00<p><em>Branded Zones/Containers</em> is a technology that allows Solaris system administrators to virtualize non-native operating system environments within <a href="http://www.sun.com/software/solaris/containers/">Solaris zones</a>, a lightweight OS-level (i.e., no hypervisor) virtualization technology that creates isolated application environments. (<a href="http://www.sun.com/bigadmin/content/zones/">Look here for more details.</a>) Brands exist for <a href="http://opensolaris.org/os/community/brandz/">Linux on OpenSolaris</a> and <a href="http://www.sun.com/software/solaris/containers/getit.jsp">Solaris 8 and 9 on Solaris 10</a>, but not Solaris 10 on OpenSolaris...until now.</p>
<p>On April 23, <a href="http://blogs.sun.com/jerrysblog/">Jerry Jelinek</a> announced the <a href="http://www.opensolaris.org/jive/thread.jspa?threadID=100967&tstart=0">development of Solaris 10 containers on OpenSolaris.org</a> and requested that the project be open-sourced as a part of ON (i.e., the OpenSolaris kernel). Solaris 10 Containers will allow administrators to adopt technologies found in the OpenSolaris kernel (e.g., Crossbow networking and ZFS enhancements) by maintaining Solaris 10 operating system environments on top of the OpenSolaris kernel. In other words, you will be able to run your Solaris 10 environments on top of the OpenSolaris kernel (provided that your Solaris 10 environments meet the standard Solaris zone requirements).</p>
<p>Both Jerry and I have been working on Solaris 10 containers for at least a month. We are currently able to archive and install Solaris 10 environments into Solaris 10 containers (i.e., p2v Solaris 10 systems) and boot the containers as shared-stack zones. Automounting NFS filesystems, examining processes with the proc tools, tracing process and thread behavior with <code>truss</code>, and listing installed Solaris 10 patches are a few of the many features that appear to run without problems within Solaris 10 containers as they currently are. I even managed to forward X connections over SSH and establish VNC sessions with my Solaris 10 containers on all three Solaris-supported architectures (x86, x64, and SPARC).</p>
<p>Jerry and I prepared screencast demos of archiving, installing, booting, and working within a Solaris 10 container for the upcoming <a href="http://developers.sun.com/events/communityone/2009/west/index.jsp">Community One West developer conference</a>. We couldn't decide whose narration was best suited for the demo, so we submitted two versions, one featuring my voice and the other featuring Jerry's voice. <a href="http://mediacast.sun.com/users/flippedbits/media/s10c-demo-jerry.swf/details">Take a look at Jerry's demo</a> if you want to see the results (though you might have to download the flash video file because it might not fit within the preview window). We are considering producing more videos or blog posts (or both) as the technology evolves.</p>
<p>For more information on Solaris 10 containers and zones/containers in general and how you can contribute to both, visit the <a href="http://opensolaris.org/os/community/zones/">OpenSolaris.org zones community page</a> and the <a href="http://opensolaris.org/os/project/s10brand/">Solaris 10 Brand/Containers project page at OpenSolaris.org</a>.</p>
https://blogs.oracle.com/lunchware/entry/zone_hostid_emulation_for_everyoneNon-Global Zone Hostid Emulation for Everyone!Jordan Vaughan 2009-03-05T13:45:48+00:002009-05-07T22:07:45+00:00<p>Suppose that you want to consolidate systems running legacy applications that are licensed based on your systems' hostids or migrate the applications to new machines. If your applications' publishers are unable or unwilling to relicense your applications (perhaps the publishers no longer exist), then you are stuck with your legacy applications running on their original, possibly outdated systems because the new hosts would most likely have different hostids. Moving your applications to new hosts would probably impact the applications and prevent them from functioning. How can you overcome this difficulty?<br /></p>
<p>A solution is now available via Solaris non-global zones (a.k.a. containers). With the putback of PSARC 2008/647 (Configurable Hostids for Non-Global
Zones), administrators can give each non-global zone in OpenSolaris a 32-bit hostid
via <code>zonecfg</code> starting in build 108. (The default non-global zone behavior is to use the global zone's [i.e., the physical host's] hostid.) All processes that execute within the zone will see the configured hostid instead of the physical host's hostid. This hostid emulation feature will be available in OpenSolaris 2009.06.</p>
<p>Here is an example of zone-emulated hostid in action. Suppose that you create and boot a non-global zone without using the hostid emulation feature. Here is what you might see:</p>
<blockquote>
<pre># zonecfg -z godel
godel: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:godel&gt; create
zonecfg:godel&gt; set zonepath=/zones/godel
zonecfg:godel&gt; info
zonename: godel
zonepath: /zones/godel
brand: native
autoboot: false
bootargs:
pool:
limitpriv:
scheduling-class:
ip-type: shared
hostid:
inherit-pkg-dir:
dir: /lib
inherit-pkg-dir:
dir: /platform
inherit-pkg-dir:
dir: /sbin
inherit-pkg-dir:
dir: /usr
zonecfg:godel&gt; exit
# zoneadm -z godel install
A ZFS file system has been created for this zone.
Preparing to install zone &lt;godel&gt;.
Creating list of files to copy from the global zone.
Copying &lt;9718&gt; files to the zone.
Initializing zone product registry.
Determining zone package initialization order.
Preparing to initialize &lt;1442&gt; packages on the zone.
Initialized &lt;1442&gt; packages on zone.
Zone &lt;godel&gt; is initialized.
The file &lt;/godel&gt; contains a log of the zone installation.
# zoneadm -z godel boot
# zlogin godel "zonename &amp;&amp; hostid"
godel
83405c0b
# zonename &amp;&amp; hostid
global
83405c0b
</pre>
</blockquote>
<p>The system's hostid is the same within the non-global zone and the global zone. Now specify a hostid for the zone via <code>zonecfg</code>, reboot the zone, and observe the results:</p>
<blockquote>
<pre># zonecfg -z godel set hostid=1337833f
# zoneadm -z godel reboot
# zlogin godel "zonename &amp;&amp; hostid"
godel
1337833f
# zonename &amp;&amp; hostid
global
83405c0b
</pre>
</blockquote>
<p>You can specify any 32-bit hexadecimal hostid for a non-global zone except <code>0xffffffff</code> and the hostid will take effect on subsequent boots of the non-global zone. ('Boots' means boots from the global zone via <code>zoneadm</code>, as in &quot;<code>zoneadm -z &lt;zone_name&gt; boot</code>&quot; or &quot;<code>zoneadm -z &lt;zone_name&gt; reboot</code>&quot;.) The hostid emulation feature is available for all native-based brands (<code>native</code>, <code>ipkg</code>, <code>cluster</code>, etc.); however, Linux containers (i.e., <code>lx</code>-branded zones) do not support hostid emulation via <code>zonecfg</code>. To emulate hostids in Linux containers, modify the <code>/etc/hostid</code> file within the container.</p>
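<p>Licensed applications typically read the hostid through the standard <code>gethostid()</code> call, which is what makes per-zone emulation transparent to them. The following is a minimal sketch of such a check; the helper name <code>format_hostid()</code> is hypothetical, and the assumption that a given legacy application uses <code>gethostid()</code> rather than some other mechanism would need to be verified for each application.</p>

```c
#define _XOPEN_SOURCE 600   /* ask for the gethostid() declaration */
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Write the hostid seen by the calling process into buf as eight
 * lowercase hex digits, the same form shown in the sessions above.
 * Returns the number of characters written (8 on success).
 */
int format_hostid(char *buf, size_t len)
{
    assert(buf != NULL && len >= 9);
    /* gethostid() returns a long; keep only the 32-bit identifier. */
    unsigned long id = (unsigned long)gethostid() & 0xffffffffUL;
    return snprintf(buf, len, "%08lx", id);
}
```

<p>A process calling <code>format_hostid()</code> inside a zone with a configured hostid should see the configured value, while the same code in the global zone should see the physical host's hostid.</p>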
<p>Migrating non-global zones with legacy hostid-bound licensed software across physical hosts is a cakewalk. Detach the zone, migrate it to the new system, attach it, specify the source system's hostid for the now-attached zone, boot the zone, and <i>voilà!</i>, the licensed software still thinks it's on the old system.</p>https://blogs.oracle.com/lunchware/entry/bug_wars_subtle_but_annoyingBug Wars: The Phantom Pool Bug MenaceJordan Vaughan 2008-12-01T19:03:28+00:002008-12-02T03:03:28+00:00<p>Most systems programmers like to swap tales about tackling tricky or annoying bugs. Now, after a month of pulling my hair out, I can share my first &quot;bug war&quot; story as a systems programmer.</p>
<p>A somewhat long time ago, in a Sun Microsystems office not too far away, I occasionally encountered system panics of the following form while running my bug fixes through the standard zones test suite in snv_96:</p>
<blockquote>
<pre>assertion failed: pool-&gt;pool_ref == 0, file: ../../common/os/pool.c, line: 454</pre>
</blockquote>
<p><a href="http://src.opensolaris.org">src.opensolaris.org</a> located the assertion in <code>pool_pool_destroy()</code>:</p>
<blockquote>
<pre>428 /*
429  * Destroy specified pool, and rebind all processes in it
430  * to the default pool.
431  */
432 static int
433 pool_pool_destroy(poolid_t poolid)
434 {
435     pool_t *pool;
436     int ret;
437
438     ASSERT(pool_lock_held());
439
440     if (poolid == POOL_DEFAULT)
441         return (EINVAL);
442     if ((pool = pool_lookup_pool_by_id(poolid)) == NULL)
443         return (ESRCH);
444     ret = pool_do_bind(pool_default, P_POOLID, poolid, POOL_BIND_ALL);
445     if (ret == 0) {
446         struct destroy_zone_arg dzarg;
447
448         dzarg.old = pool;
449         dzarg.new = pool_default;
450         mutex_enter(&amp;cpu_lock);
451         ret = zone_walk(pool_destroy_zone_cb, &amp;dzarg);
452         mutex_exit(&amp;cpu_lock);
453         ASSERT(ret == 0);
454         ASSERT(pool-&gt;pool_ref == 0);
455         (void) nvlist_free(pool-&gt;pool_props);
456         id_free(pool_ids, pool-&gt;pool_id);
457         pool-&gt;pool_pset-&gt;pset_npools--;
458         list_remove(&amp;pool_list, pool);
459         pool_count--;
460         pool_pool_mod = gethrtime();
461         kmem_free(pool, sizeof (pool_t));
462     }
463     return (ret);
464 }
</pre>
</blockquote>
<p>Line 454 caused the panic, so something was referring to the dying pool.
Looking at the source code further, I discerned that <code>pool_do_bind()</code> was supposed to rebind all processes within the pool specified by
<code>poolid</code> to the pool to which the first function argument referred (in
this case, the default pool). The zone callback invoked on line 451 only set a
zone's pool and processor set associations; it didn't rebind processes.
<code>pool_do_bind()</code> returned zero after completing successfully, so the
problem was that <code>pool_do_bind()</code> indicated that it successfully
rebound all processes from the dying pool to the default pool when in fact it sometimes did not.
</p>
<p>
I dug around the source tree and determined that a process' pool
association (as indicated by the <code>proc_t</code> structure's
<code>p_pool</code> field) only changed when a new system process spawned
(that is, a process without a parent spawned; see <code>newproc()</code> in
uts/common/os/fork.c), a process forked (<code>cfork()</code> in
uts/common/os/fork.c), a process exited (<code>proc_exit()</code> in
uts/common/os/exit.c), or a process was bound to a pool via
<code>pool_do_bind()</code>.
</p>
<p>A gentle introduction to pool rebinding is necessary before I proceed further. When a process forks or exits, it enters the <i>pool barrier</i>,
which encloses operations that are sensitive to changes in the
process' pool binding. (In other words, a process' pool binding should
not change while the process is within the pool barrier.) The pool
barrier is sandwiched between invocations of <code>pool_barrier_enter()</code> and <code>pool_barrier_exit()</code> (see uts/common/os/fork.c:211-224,229-236,299-309,525-527,668-672 and uts/common/os/exit.c:489-493,590-605).</p>
<p>When a pool (call it <i>P</i>) is destroyed, <code>pool_do_bind()</code> (uts/common/os/pool.c:1239-1647) is invoked to rebind all processes within <i>P</i> to the default pool. <code>pool_do_bind()</code> creates an array of <code>proc_t</code> pointers called <code>procs</code> that can hold twice the number of active processes. <code>procs</code> will hold pointers to all processes that will be rebound to the default pool. Once <code>procs</code> is allocated, <code>pool_do_bind()</code> grabs <code>pidlock</code> and enters what I will call the <i>first loop</i>, which adds all active processes bound to <i>P</i> to <code>procs</code> (see pool.c:1359-1432). These processes are also marked with the <code>PBWAIT</code> flag (pool.c:1408), which causes them to block in <code>pool_barrier_enter()</code> and <code>pool_barrier_exit()</code>, effectively stopping them from entering or exiting the pool barrier. Once the first loop is done, <code>pool_do_bind()</code> releases <code>pidlock</code> and waits for all processes in <code>procs</code> that were within the pool barrier when marked with <code>PBWAIT</code> to block at <code>pool_barrier_exit()</code>. This guarantees that pool rebinding won't occur while the targeted processes are sensitive to pool rebinding.</p>
<p>Once the thread in <code>pool_do_bind()</code> resumes execution, it enters what I will call the <i>second loop</i>, which checks if the children of the processes in <code>procs</code> should be added to <code>procs</code>. This loop catches any processes that were spawned via <code>cfork()</code> while the thread in <code>pool_do_bind()</code> waited for marked processes to block at <code>pool_barrier_exit()</code>.
(Note that a newly-spawned process' LWPs are not started until the
parent process exits the pool barrier.) Once the second loop completes,
<code>pool_do_bind()</code> rebinds the processes in <code>procs</code> to the default pool, adjusts <i>P</i>'s reference count, and wakes all processes in <code>procs</code> that are blocked within <code>pool_barrier_enter()</code> and <code>pool_barrier_exit()</code>. (Note that <code>cfork()</code> and <code>proc_exit()</code> also adjust pool reference counts when processes fork or exit.)</p>
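<p>The two-loop scheme described above can be modeled in a few dozen lines of user-space C. This is a toy sketch under heavy assumptions, not the kernel's <code>pool_do_bind()</code>: there are no locks or real barriers, and <code>struct proc</code>, <code>procs_tab</code>, <code>spawn()</code>, <code>rebind_all()</code>, and the <code>fork_during_wait()</code> hook are all hypothetical names. It does not reproduce the bug; it only shows how the first loop, the second loop, and the reference counts are meant to fit together.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAXPROCS 16

/* Toy stand-in for proc_t; every name in this sketch is hypothetical. */
struct proc {
    bool active;
    int parent;     /* index of the parent, or -1 for none */
    int pool;       /* index of the pool this process is bound to */
    bool pbwait;    /* marked to block at the pool barrier */
};

struct proc procs_tab[MAXPROCS];
int pool_ref[2];    /* reference counts; pool 0 is the default pool */

/* Create a process; a child inherits its parent's pool binding. */
int spawn(int parent, int pool)
{
    for (int i = 0; i < MAXPROCS; i++) {
        if (!procs_tab[i].active) {
            int p = parent >= 0 ? procs_tab[parent].pool : pool;
            procs_tab[i] = (struct proc){ true, parent, p, false };
            pool_ref[p]++;
            return i;
        }
    }
    return -1;
}

/* Example hook: fork one child of fork_parent while the rebinder waits. */
int fork_parent = -1;
void fork_during_wait(void)
{
    if (fork_parent >= 0)
        spawn(fork_parent, 0);
}

/*
 * Rebind every process in pool `from` to pool `to` and return how many
 * were rebound. `between_loops`, if non-NULL, runs at the point where the
 * real code waits for marked processes, so callers can simulate forks.
 */
int rebind_all(int from, int to, void (*between_loops)(void))
{
    int list[MAXPROCS], n = 0;

    /* First loop: collect and mark every process bound to `from`. */
    for (int i = 0; i < MAXPROCS; i++) {
        if (procs_tab[i].active && procs_tab[i].pool == from) {
            procs_tab[i].pbwait = true;
            list[n++] = i;
        }
    }

    if (between_loops != NULL)
        between_loops();

    /*
     * Second loop: pick up children forked during the wait. One pass is
     * enough in this model because such children cannot fork themselves
     * until their parents leave the barrier.
     */
    for (int i = 0; i < MAXPROCS; i++) {
        struct proc *p = &procs_tab[i];
        if (p->active && !p->pbwait && p->pool == from &&
            p->parent >= 0 && procs_tab[p->parent].pbwait) {
            p->pbwait = true;
            list[n++] = i;
        }
    }

    /* Rebind, move the references, and wake the marked processes. */
    for (int k = 0; k < n; k++) {
        struct proc *p = &procs_tab[list[k]];
        pool_ref[from]--;
        pool_ref[to]++;
        p->pool = to;
        p->pbwait = false;
    }
    return n;
}
```

<p>If every process bound to the dying pool lands in the list, the old pool's reference count drops to zero, which is exactly the condition the assertion on line 454 checks.</p>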
<p>Now, back to my story. I turned to MDB to give me some clues as to what was going
wrong:
</p>
<blockquote>
<pre>&gt; ::status
debugging crash dump vmcore.4 (64-bit) from balaclava
operating system: 5.11 onnv-bugfix (i86pc)
panic message: assertion failed: pool-&gt;pool_ref == 0, file:
../../common/os/pool.c, line: 454
dump content: all kernel and user pages
&gt; ::panicinfo
cpu 3
thread ffffff03afa72400
message assertion failed: pool-&gt;pool_ref == 0, file:
../../common/os/pool.c, line: 454
rdi fffffffffbf31690
rsi ffffff0008017988
rdx fffffffffbf311b0
rcx 1c6
r8 ffffff00080179c0
r9 20
rax 0
rbx 1c6
rbp ffffff00080179b0
r10 ffffff00080178d0
r11 ffffff01ce469680
r12 fffffffffbf311b0
r13 fffffffffbf31018
r14 fffffffffbc5b4d8
r15 3
fsbase 0
gsbase ffffff01d243b580
ds 4b
es 4b
fs 0
gs 1c3
trapno 0
err 0
rip fffffffffb84be90
cs 30
rflags 246
rsp ffffff00080178c8
ss 38
gdt_hi 0
gdt_lo f00001ef
idt_hi 0
idt_lo 10000fff
ldt 0
task 70
cr0 8005003b
cr2 feda43a8
cr3 e49a3000
cr4 6f8
</pre>
</blockquote>
<p>The faulting thread had an address of <code>ffffff03afa72400</code> and
was on CPU 3.</p>
<blockquote>
<pre>&gt; ffffff03afa72400::findstack -v
stack pointer for thread ffffff03afa72400: ffffff00080178c0
ffffff00080179b0 panic+0x9c()
ffffff0008017a00 assfail+0x7e(fffffffffbf31018, fffffffffbf311b0, 1c6)
ffffff0008017a50 pool_pool_destroy+0x16b(47)
ffffff0008017aa0 pool_destroy+0x40(2, 8067ce8, 47)
ffffff0008017ca0 pool_ioctl+0xa32(a300000000, 3, 8064ca0, 102003,
ffffff05de08fc48, ffffff0008017e8c)
ffffff0008017ce0 cdev_ioctl+0x48(a300000000, 3, 8064ca0, 102003,
ffffff05de08fc48, ffffff0008017e8c)
ffffff0008017d20 spec_ioctl+0x86(ffffff03ac78f700, 3, 8064ca0, 102003,
ffffff05de08fc48, ffffff0008017e8c, 0)
ffffff0008017da0 fop_ioctl+0x7b(ffffff03ac78f700, 3, 8064ca0, 102003,
ffffff05de08fc48, ffffff0008017e8c, 0)
ffffff0008017eb0 ioctl+0x174(3, 3, 8064ca0)
ffffff0008017f00 sys_syscall32+0x1fc()
</pre>
</blockquote>
<p>As expected, the failed assertion occurred in
<code>pool_pool_destroy()</code> after <code>pool_do_bind()</code> was called.
Pool 0x47 (a non-default pool) was being destroyed.
</p>
<blockquote>
<pre>&gt; ::cpuinfo
ID ADDR FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD PROC
0 fffffffffbc38fb0 1f 1 0 10 yes no t-0 ffffff03ad328700 ppdmgr
1 ffffff01d23a1580 1f 0 0 60 no no t-0 ffffff01e9f98000 ppdmgr
2 ffffff01d23ce580 1f 0 0 60 no no t-0 ffffff01d2a01560 mesa_vendor_sele
3 fffffffffbc40600 1b 1 0 41 no no t-0 ffffff03afa72400 pooladm
</pre>
</blockquote>
<p>pooladm was responsible for the panic.</p>
<blockquote>
<pre>&gt; pool_list::walk list | ::print -a pool_t
{
ffffff01ce887140 pool_id = 0
ffffff01ce887144 pool_ref = 0x5f
ffffff01ce887148 pool_link = {
ffffff01ce887148 list_next = 0xffffff01d0fb0b48
ffffff01ce887150 list_prev = pool_list+0x10
}
ffffff01ce887158 pool_props = 0xffffff01d28d36d0
ffffff01ce887160 pool_pset = 0xffffff01cf500508
}
{
ffffff01d0fb0b40 pool_id = 0x47
ffffff01d0fb0b44 pool_ref = 0x1
ffffff01d0fb0b48 pool_link = {
ffffff01d0fb0b48 list_next = pool_list+0x10
ffffff01d0fb0b50 list_prev = 0xffffff01ce887148
}
ffffff01d0fb0b58 pool_props = 0xffffff01d4d759c8
ffffff01d0fb0b60 pool_pset = 0xffffff01cf500508
}
</pre>
</blockquote>
<p>There were two pools. The first was the default pool, which appeared to have
been consistent when the system panicked. The second pool was being destroyed. However, its reference count was one when the assertion failed. Everything else in the guilty pool appeared
to have been consistent when the system panicked.</p>
<blockquote>
<pre>&gt; ::walk proc | ::print -a proc_t p_pool ! grep ffffff01d0fb0b40
ffffff07e7586fa0 p_pool = 0xffffff01d0fb0b40
&gt; ::offsetof proc_t p_pool
offsetof (proc_t, p_pool) = 0xc00
&gt; ffffff07e7586fa0-0xc00=X
e75863a0
</pre>
</blockquote>
<p>There was exactly one process that referred to the guilty pool:
<code>ffffff07e75863a0</code>.</p>
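<p>The MDB arithmetic above recovers the address of a structure from the address of one of its members by subtracting the member's offset. Here is a small C sketch of the same calculation. <code>struct toy_proc</code> and both helper names are hypothetical, and the padding array only mimics the <code>0xc00</code> offset reported by <code>::offsetof</code>; it is not the real <code>proc_t</code> layout. The session's <code>=X</code> output shows only the low 32 bits of the result, which is why the full address appears in the text rather than in the output.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for proc_t; only the member offset matters. */
struct toy_proc {
    char other_fields[0xc00];   /* padding so p_pool sits at offset 0xc00 */
    void *p_pool;
};

/* Recover the enclosing structure from the address of its p_pool member. */
struct toy_proc *proc_from_pool_field(void **field)
{
    return (struct toy_proc *)((char *)field -
        offsetof(struct toy_proc, p_pool));
}

/* The same subtraction on raw 64-bit addresses, as done by hand in MDB. */
uint64_t struct_base(uint64_t field_addr, uint64_t member_offset)
{
    return field_addr - member_offset;
}
```

<p>Calling <code>struct_base(0xffffff07e7586fa0, 0xc00)</code> yields <code>0xffffff07e75863a0</code>, the process address used in the rest of the session.</p>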
<blockquote>
<pre>&gt; ffffff07e75863a0::print proc_t p_zone
p_zone = 0xffffff01edf1bf00
&gt; ::walk zone
fffffffffbfb1180
ffffff01edf1bf00
&gt; 0xffffff01edf1bf00::print zone_t zone_name
zone_name = 0xffffff01e6eec200 "jj1"
&gt; 0xffffff01edf1bf00::print zone_t zone_pool
zone_pool = 0xffffff01ce887140
</pre>
</blockquote>
<p>The non-global zone <code>jj1</code> contained the guilty process. The global zone was the only other zone in the system. Notice that
<code>jj1</code> was associated with the default pool, not the pool that was
being destroyed, when the system panicked. So <code>jj1</code> itself and all
but one of its processes had already been rebound to the
default pool.</p>
<blockquote>
<pre>&gt; ffffff07e75863a0::ptree
fffffffffbc36f70 sched
ffffff01d25c33a0 init
ffffff07e3d2a3a0 svc.startd
ffffff07e5b563a0 ppd-cache-update
ffffff07e5c4f3a0 ppdmgr
ffffff07e60aa3a0 ppdmgr
ffffff07e70323a0 ppdmgr
ffffff07e731c3a0 ppdmgr
ffffff07e75863a0 ppdmgr
</pre>
</blockquote>
<p><code>cfork()</code> forked the guilty process. It was not created
via <code>newproc()</code>, for if it had been, then
it would not have had a parent process (<code>ffffff07e731c3a0</code>).</p>
<blockquote>
<pre>&gt; ffffff07e731c3a0::print proc_t p_pool
p_pool = 0xffffff01ce887140
&gt; ffffff07e731c3a0::print proc_t p_zone
p_zone = 0xffffff01edf1bf00
&gt; ffffff07e70323a0::print proc_t p_pool
p_pool = 0xffffff01ce887140
&gt; ffffff07e70323a0::print proc_t p_zone
p_zone = 0xffffff01edf1bf00
</pre>
</blockquote>
<p>Both the parent and grandparent of the guilty process were within zone
<code>jj1</code> and both referred to the default pool when the system
panicked.</p>
<p>I started to look for interleavings of code from <code>cfork()</code>, <code>proc_exit()</code>, and <code>pool_do_bind()</code> that would lead to inconsistent states, going so far as to create diagrams illustrating which locks were held (and the order in which they were acquired) at various points in the aforementioned functions, but found nothing that suggested a race condition. I struggled to understand the fork and exit code (a nontrivial task) to see if any of the invoked subroutines were generating race conditions, but did not find anything. A fellow engineer suggested three or four possible sources of the bug, including a three-way race between the aforementioned functions, but a little investigation and a few counterexamples put his theories to rest. I made no progress for at least two weeks.</p>
<p>My frustrations were about to drive me insane when I stumbled upon what I thought was the source of the bug. The problem was in the second loop, in pool.c:1491-1500:</p>
<blockquote>
<pre>1491 mutex_enter(&amp;p-&gt;p_lock);
1492 /*
1493 * Skip processes in local zones if we're not binding
1494 * zones to pools (P_ZONEID). Skip kernel processes also.
1495 */
1496 if ((!INGLOBALZONE(p) &amp;&amp; idtype != P_ZONEID) ||
1497 p-&gt;p_flag &amp; SSYS) {
1498 mutex_exit(&amp;p-&gt;p_lock);
1499 continue;
1500 }
</pre>
</blockquote>
<p>The problem was on line 1496. The first disjunct of this conditional statement made <code>pool_do_bind()</code> skip child processes that were not in the global zone. (<code>idtype == P_POOLID</code> [<code>idtype</code> is one of <code>pool_do_bind()</code>'s parameters] when a pool is being destroyed.) Therefore, if a process in a non-global zone was forking (but had not created its child's <code>proc_t</code> structure yet via <code>getproc()</code> [fork.c:907-1177, esp. 1055-1067]) when <code>pool_do_bind()</code> went through the first loop, then the second loop would never have added the process' child to <code>procs</code>. Thus the child process would have remained bound to pool <i>P</i>, resulting in the failed assertion.</p>
<p>Here is a sample execution that illustrates this bug (thread <i>A</i> is the thread that is destroying <i>P</i> while thread <i>B</i> is executing <code>cfork()</code>):</p>
<ol>
<li>A enters <code>pool_do_bind()</code>.</li>
<li>B enters <code>cfork()</code>.</li>
<li>B enters the pool barrier.</li>
<li>B enters <code>getproc()</code>.</li>
<li>B allocates and zeroes the child proc's <code>proc_t</code> structure.</li>
<li>A acquires <code>pidlock</code>, adds B's proc to <code>procs</code>, and releases <code>pidlock</code> (i.e., A goes through the first loop).</li>
<li>B adds the child proc to the process tree and the active process list (both of which require B to grab <code>pidlock</code>).</li>
<li>B attempts to exit the pool barrier via <code>pool_barrier_exit()</code>, but <code>PBWAIT</code> is set in its <code>proc_t</code>'s <code>p_poolflag</code> field, so it wakes A and blocks, waiting for A to signal it.</li>
<li>A grabs <code>pidlock</code> and examines all processes in <code>procs</code> (i.e., A goes through the second loop).</li>
<li>While examining <code>procs</code>, A looks at B's child process, sees that it is not in the global zone and <code>idtype</code> is not <code>P_ZONEID</code> (it is <code>P_POOLID</code>), and consequently does not add B's child process to <code>procs</code>.</li>
<li>A rebinds all processes in <code>procs</code> and decrements the old pool's (<i>P</i>'s) reference count accordingly.</li>
<li>A signals (wakes) all processes in <code>procs</code>.</li>
<li>B wakes up.</li>
<li>B turns its child process's LWPs loose.</li>
</ol>
<p>The solution was simple: extend the first disjunct of the above conditional statement (pool.c:1496) with another conjunct, <code>idtype != P_POOLID</code>, so that the first disjunct reads &quot;<code>!INGLOBALZONE(p) &amp;&amp; idtype != P_ZONEID &amp;&amp; idtype != P_POOLID</code>&quot;. That way emptying pools of processes (e.g., during pool destruction) will not skip new processes in non-global zones.</p>
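<p>To make the effect of the extra conjunct concrete, here is a minimal sketch that models the skip test as a pure function, before and after the change. The <code>toy_</code> names and the boolean parameters are illustrative assumptions; the real code tests <code>INGLOBALZONE(p)</code> and <code>p-&gt;p_flag &amp; SSYS</code> on a <code>proc_t</code>.</p>

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the idtype values mentioned in the post. */
enum toy_idtype { TOY_P_PID, TOY_P_ZONEID, TOY_P_POOLID };

/* Skip test as quoted from pool.c:1496-1497, before the fix. */
static bool
toy_skip_before(bool in_global_zone, bool is_sys, enum toy_idtype idtype)
{
	return ((!in_global_zone && idtype != TOY_P_ZONEID) || is_sys);
}

/* Skip test with the additional "idtype != P_POOLID" conjunct. */
static bool
toy_skip_after(bool in_global_zone, bool is_sys, enum toy_idtype idtype)
{
	return ((!in_global_zone && idtype != TOY_P_ZONEID &&
	    idtype != TOY_P_POOLID) || is_sys);
}
```

<p>For a non-system process in a non-global zone with <code>TOY_P_POOLID</code>, <code>toy_skip_before()</code> returns true (the child is skipped) while <code>toy_skip_after()</code> returns false (the child is examined). Kernel processes are still skipped in both versions.</p>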
<p>I thought, &quot;At last, I nailed it!&quot; but my success was short-lived. The same assertion failed after a few more runs through the zones test suite. I jumped back into MDB and examined the new dump, which was the same as the old dump (see above) with two exceptions: First, the guilty process had descendants in the new dump. That meant that the guilty process was not being spawned when <code>pool_do_bind()</code> executed. If it were being spawned when <code>pool_do_bind()</code> started the first loop, then its parent process would have blocked at <code>pool_barrier_exit()</code> and the child's LWPs would not have started until <code>pool_do_bind()</code> finished executing, which would have given the child no opportunity to spawn descendants.</p>
<p>Furthermore, if the child was being spawned when <code>pool_do_bind()</code> started the first loop and the child started spawning descendants between the time when the thread executing <code>pool_do_bind()</code> returned to <code>pool_pool_destroy()</code> and when the thread encountered the failed assertion, the child's descendants would have been bound to the child's pool, making the pool's reference count greater than one. But the pool's reference count was one, so the child was not being spawned. (One might claim that the descendants could have rebound themselves to other pools before the assertion was made, but that was impossible because the pool lock, which prohibited concurrent pool operations, was held while <code>pool_pool_destroy()</code> and <code>pool_do_bind()</code> were executed.)</p>
<p>Second, the guilty process was executing a subroutine called by <code>relvm()</code> (which was inside the pool barrier) within <code>proc_exit()</code>. That fact led me to think that some interaction between <code>proc_exit()</code> and <code>pool_do_bind()</code> was responsible for the bug.</p>
<p>Further source code analysis did not reveal anything, so I scattered over twenty static DTrace probes throughout <code>cfork()</code>, <code>proc_exit()</code>, and <code>pool_do_bind()</code> in a desperate effort to acquire more useful information. After taking a few more dumps, adjusting the probes, and parsing the DTrace buffers stored in the dumps, I acquired a vital clue: a process that was exiting (via <code>proc_exit()</code>) and had entered (but not exited) the pool barrier was not being caught by the first loop in <code>pool_do_bind()</code>. Curious, I looked closely at the code surrounding <code>pool_barrier_enter()</code> in <code>proc_exit()</code> and the first loop in <code>pool_do_bind()</code>. I noticed nothing out of the ordinary, so I thought, &quot;Great, I might as well reexamine functions called by <code>proc_exit()</code> and <code>pool_do_bind()</code> that I thought were correct.&quot; So I reexamined <code>procinset()</code> (which <code>pool_do_bind()</code> used in both the first and second loops to determine if a given process was bound to the pool that was being destroyed) and saw the following (uts/common/os/procset.c):</p>
<blockquote>
<pre>270 /*
271 * procinset returns 1 if the process pointed to by pp is in the process
272 * set specified by psp, otherwise 0 is returned. A process that is
273 * exiting, by which we mean that its p_tlist is NULL, cannot belong
274 * to any set; pp's p_lock must be held across the call to this function.
275 * The caller should ensure that the process does not belong to the SYS
276 * scheduling class.
277 *
278 * This function expects to be called with a valid procset_t.
279 * The set should be checked using checkprocset() before calling
280 * this function.
281 */
282 int
283 procinset(proc_t *pp, procset_t *psp)
284 {
285 int loperand = 0;
286 int roperand = 0;
287 int lwplinproc = 0;
288 int lwprinproc = 0;
289 kthread_t *tp = proctot(pp);
290
291 ASSERT(MUTEX_HELD(&amp;pp-&gt;p_lock));
292
293 if (tp == NULL)
294 return (0);
295
296 switch (psp-&gt;p_lidtype) {
</pre>
</blockquote>
<p>Notice lines 293-294. If a process' thread list was <code>NULL</code>, then <code>procinset()</code> indicated failure (the process was not in the process set). Now look at the code surrounding <code>pool_barrier_enter()</code> in <code>proc_exit()</code>:</p>
<blockquote>
<pre>470 mutex_enter(&amp;p-&gt;p_lock);
471
472 /*
473 * Clean up any DTrace probes associated with this process.
474 */
475 if (p-&gt;p_dtrace_probes) {
476 ASSERT(dtrace_fasttrap_exit_ptr != NULL);
477 dtrace_fasttrap_exit_ptr(p);
478 }
479
480 while ((tmp_id = p-&gt;p_itimerid) != 0) {
481 p-&gt;p_itimerid = 0;
482 mutex_exit(&amp;p-&gt;p_lock);
483 (void) untimeout(tmp_id);
484 mutex_enter(&amp;p-&gt;p_lock);
485 }
486
487 lwp_cleanup();
488
489 /*
490 * We are about to exit; prevent our resource associations from
491 * being changed.
492 */
493 pool_barrier_enter();
494
495 /*
496 * Block the process against /proc now that we have really
497 * acquired p-&gt;p_lock (to manipulate p_tlist at least).
498 */
499 prbarrier(p);
500
501 #ifdef SUN_SRC_COMPAT
502 if (code == CLD_KILLED)
503 u.u_acflag |= AXSIG;
504 #endif
505 sigfillset(&amp;p-&gt;p_ignore);
506 sigemptyset(&amp;p-&gt;p_siginfo);
507 sigemptyset(&amp;p-&gt;p_sig);
508 sigemptyset(&amp;p-&gt;p_extsig);
509 sigemptyset(&amp;t-&gt;t_sig);
510 sigemptyset(&amp;t-&gt;t_extsig);
511 sigemptyset(&amp;p-&gt;p_sigmask);
512 sigdelq(p, t, 0);
513 lwp-&gt;lwp_cursig = 0;
514 lwp-&gt;lwp_extsig = 0;
515 p-&gt;p_flag &amp;= ~(SKILLED | SEXTKILLED);
516 if (lwp-&gt;lwp_curinfo) {
517 siginfofree(lwp-&gt;lwp_curinfo);
518 lwp-&gt;lwp_curinfo = NULL;
519 }
520
521 t-&gt;t_proc_flag |= TP_LWPEXIT;
522 ASSERT(p-&gt;p_lwpcnt == 1 &amp;&amp; p-&gt;p_zombcnt == 0);
523 prlwpexit(t); /* notify /proc */
524 lwp_hash_out(p, t-&gt;t_tid);
525 prexit(p);
526
527 p-&gt;p_lwpcnt = 0;
528 p-&gt;p_tlist = NULL;
529 sigqfree(p);
530 term_mstate(t);
531 p-&gt;p_mterm = gethrtime();
532
533 exec_vp = p-&gt;p_exec;
534 execdir_vp = p-&gt;p_execdir;
535 p-&gt;p_exec = NULLVP;
536 p-&gt;p_execdir = NULLVP;
537 mutex_exit(&amp;p-&gt;p_lock);
</pre>
</blockquote>
<p>Notice anything fishy?</p>
<p><code>proc_exit()</code> set the exiting process' <code>p_tlist</code> field to <code>NULL</code> after entering the pool barrier but before releasing the process' <code>p_lock</code> (exit.c:528), which <code>pool_do_bind()</code> grabbed during the first loop before invoking <code>procinset()</code> (pool.c:1367-1378). So if a process entered the pool barrier but did not exit and another process attempted to destroy the pool, then <code>procinset()</code> would have informed the latter process that the former process was not bound to the pool that was being destroyed. Thus the thread executing <code>pool_do_bind()</code> would have skipped the exiting process, which would have remained bound to the dying pool. Hence the failed assertion.</p>
<p>(It is funny that I did not notice the comment in procset.c:272-274 when I first examined <code>procinset()</code>. It would have saved me much grief.)<br /></p>
<p>The following sample execution will illustrate my point. Suppose that thread <i>A</i> belongs to a process that is bound to a non-default pool <i>P</i>. Suppose further that <i>A</i> is in the middle of <code>proc_exit()</code> and that some other thread <i>B</i> (in a different process) is destroying <i>P</i> and is in the middle of <code>pool_do_bind()</code>. Then the following might happen:</p>
<ol>
<li>B constructs <code>procs</code> and grabs <code>pidlock</code>. (pool.c:1333-1357)</li>
<li>B begins checking each process in the active process list (i.e., it starts going through the first loop). (pool.c:1359-1366)</li>
<li>B is context-switched with A.</li>
<li>A grabs its process' <code>p_lock</code> and enters the pool barrier. (exit.c:470-493)</li>
<li>A sets its process' <code>p_tlist</code> field to <code>NULL</code>. (exit.c:528)</li>
<li>A releases its process' <code>p_lock</code>. (exit.c:537)</li>
<li>A is context-switched with B.</li>
<li>B grabs A's process' <code>p_lock</code>. (pool.c:1367)</li>
<li>B calls <code>procinset()</code> and sees a return value of zero. (pool.c:1373)</li>
<li>B skips A's process and does not add it to <code>procs</code>. (pool.c:1376-1377)</li>
<li>B finishes <code>pool_do_bind()</code> successfully and returns to <code>pool_pool_destroy()</code>.</li>
<li>B asserts that the targeted pool's reference count is zero and fails. (pool.c:454)</li>
</ol>
<p>Thus A's process would not be rebound to the default pool and the assertion would fail.</p>
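<p>The outcome of that interleaving can be reproduced with a small, single-threaded model. This is only a sketch under stated assumptions: <code>toy_pool</code>, <code>toy_proc</code>, <code>toy_procinset()</code>, and <code>toy_leftover_refs()</code> are hypothetical names, locking and the pool barrier are left out entirely, and the "exiting" process is simply one whose <code>p_tlist</code> has already been set to <code>NULL</code> by the time the loop inspects it.</p>

```c
#include <stddef.h>

struct toy_pool {
	int pool_ref;			/* number of processes bound to the pool */
};

struct toy_proc {
	struct toy_pool *p_pool;	/* pool the process is bound to */
	const void *p_tlist;		/* NULL once exit has cleared it */
};

/* Mirrors the early return quoted from procset.c:293-294. */
static int
toy_procinset(const struct toy_proc *p, const struct toy_pool *target)
{
	if (p->p_tlist == NULL)
		return (0);
	return (p->p_pool == target);
}

/*
 * Bind two processes to a doomed pool, optionally mark one of them as
 * exiting, run a simplified "first loop", and report how many
 * references the doomed pool still holds afterwards.
 */
static int
toy_leftover_refs(int one_proc_is_exiting)
{
	struct toy_pool deflt = { 0 };
	struct toy_pool doomed = { 2 };
	int thread = 0;			/* any non-NULL address will do */
	struct toy_proc procs[2] = {
		{ &doomed, &thread },
		{ &doomed, one_proc_is_exiting ? NULL : &thread },
	};
	size_t i;

	for (i = 0; i < 2; i++) {
		if (!toy_procinset(&procs[i], &doomed))
			continue;	/* the exiting process is skipped here */
		procs[i].p_pool = &deflt;
		doomed.pool_ref--;
		deflt.pool_ref++;
	}
	return (doomed.pool_ref);
}
```

<p>With no exiting process, <code>toy_leftover_refs(0)</code> leaves the doomed pool with zero references. With one exiting process, <code>toy_leftover_refs(1)</code> leaves a reference count of one, which is the state the assertion tripped over.</p>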
<p>The second loop in <code>pool_do_bind()</code> did not examine the missed process (even if its parent were added to <code>procs</code> during the first loop) because the second loop also used <code>procinset()</code> to determine if child processes were bound to the targeted pool. So <code>pool_do_bind()</code> was incapable of catching an exiting process as described above.</p>
<p>Further examination of <code>procinset()</code> revealed that a process' <code>p_tlist</code> field was used only when the <code>idtype</code> argument was <code>P_CID</code>. Thus the most straightforward fix was to check that <code>p_tlist != NULL</code> iff <code>idtype == P_CID</code>. I took this approach, which (in addition to a few minor changes elsewhere) worked beautifully. The bug never appeared again, even when I executed my own test that created several pools, bound one of them to a running zone, and destroyed the pools in a tight loop for days.</p>
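<p>The shape of that fix can be sketched as follows. This is an illustration rather than the actual patch: the <code>toy_</code> types and functions are hypothetical, only two id types are modeled, and the pool id is stored directly in the process for brevity. The point it shows is that the <code>NULL</code> check moves out of the function prologue and into the one case that actually dereferences the thread list.</p>

```c
#include <stddef.h>

/* Hypothetical stand-ins for two of the id types procinset() handles. */
enum toy_idtype { TOY_P_CID, TOY_P_POOLID };

struct toy_thread {
	int t_cid;				/* scheduling class id */
};

struct toy_proc {
	const struct toy_thread *p_tlist;	/* NULL once exit clears it */
	int p_poolid;				/* simplified pool binding */
};

static int
toy_procinset_fixed(const struct toy_proc *p, enum toy_idtype idtype, int id)
{
	switch (idtype) {
	case TOY_P_CID:
		/* Only this case needs a thread, so only it checks for one. */
		if (p->p_tlist == NULL)
			return (0);
		return (p->p_tlist->t_cid == id);
	case TOY_P_POOLID:
		/* Pool membership does not depend on the thread list. */
		return (p->p_poolid == id);
	}
	return (0);
}

/* Query an exiting process (NULL thread list) bound to pool 0x47. */
static int
toy_exiting_proc_matches(enum toy_idtype idtype, int id)
{
	struct toy_proc exiting = { NULL, 0x47 };

	return (toy_procinset_fixed(&exiting, idtype, id));
}
```

<p>In this model an exiting process is still reported as a member of pool <code>0x47</code> when queried by pool id, so a loop like the one in <code>pool_do_bind()</code> would no longer skip it, while a query by class id continues to return zero.</p>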
<p>Thus I found the two causes of the bug in roughly a month. You can imagine what a sigh of relief I gave when I verified my fix!</p>
<p>This episode has one moral: <i>RTFC</i> (<i>Read The F*%!@#&amp; Comments</i>)!<br /></p>https://blogs.oracle.com/lunchware/entry/the_best_way_to_learnThe Best Way to Learn Kernel Programming Is to Do It YourselfJordan Vaughan 2008-07-30T16:12:02+00:002008-07-31T08:10:29+00:00<p>I want to contribute to an open-source operating system in order to broaden my understanding of operating systems and make my mark in the F/OSS community.&nbsp; The three biggest contenders in my mind are the OpenSolaris OS, the Linux kernel, and the FreeBSD project.&nbsp; I don't think I could go wrong choosing any of them, but I decided that spending time learning how the kernels work and trying to navigate the source trees would be a waste of time.&nbsp; So I concluded that if I want to learn about operating systems on the lowest possible level, then I should construct a kernel for fun.</p>
<p>Following my tradition of sticking 'KANE' into my project names in honor of <a href="http://en.wikipedia.org/wiki/Kane_%28Command_%26_Conquer%29" target="_blank" title="Wikipedia entry on Kane">Kane</a> from <a href="http://en.wikipedia.org/wiki/Command_and_conquer" target="_blank" title="Wikipedia entry on Command and Conquer">the Command &amp; Conquer series of video games</a>, I have decided to name my new kernel KANEOS, the Kick-Ass New and Expeditious Operating System.&nbsp; I'm not quite sure what I'll put into it, but I'm looking at targeting 64-bit x86 extensions and multitasking.&nbsp; It'll be fun to write a small kernel that provides a basic standard C library.&nbsp; (Of course, I'll continue to contribute to OpenSolaris.&nbsp; :-D )<br /></p>
<p>I found a few websites that might be helpful for amateur kernel hackers like me:</p>
<ul>
<li><a href="http://www.osdever.net/" target="_blank" title="Bona Fide OS Development News">Bona Fide OS Development News</a></li>
<li><a href="http://www.osdev.org/" target="_blank" title="OS Development">OS Development</a></li>
<li><a href="http://www.nondot.org/sabre/os/articles" target="_blank" title="OSRC: The Operating System Resource Center">OSRC: The Operating System Resource Center</a></li>
</ul>
<p>Typing &quot;osdev&quot; into Google search yielded a fair number of OS developer sites, including the ones above.</p>
<p>I'm sure that I'll be in for a long but profitable experience.&nbsp; :-)&nbsp;</p>https://blogs.oracle.com/lunchware/entry/a_lunchware_licenseA Lunchware License?Jordan Vaughan 2008-07-29T16:56:18+00:002008-07-30T00:02:47+00:00<div align="left">
<p>This is my first blog for work, so I thought I'd kick it off with a note on open-source licensing.&nbsp; I'm a fan of F/OSS but used to struggle when deciding between open-source licenses for my software.&nbsp; For me, the battle was between the GPL, the revised BSD license, the MIT license, and <a href="http://en.wikipedia.org/wiki/University_of_Illinois/NCSA_Open_Source_License">the University of Illinois/NCSA Open Source License</a>.&nbsp; However, I tend to favor BSD-style licenses (&quot;permissive licenses&quot;) for the following reasons:</p>
<ol>
<li>For me, &quot;free software&quot; means that the licensed code can be incorporated into any application or library, including proprietary projects.&nbsp; Proprietary software should be able to incorporate and modify free software and remain proprietary.&nbsp; BSD-style licenses permit such incorporation; the GPL does not.</li>
<li>BSD-style licenses are pithy; the GPL is not.</li>
</ol>
<p>To complicate matters, I recently stumbled across <a href="http://en.wikipedia.org/wiki/Beerware">a Wikipedia article on &quot;beerware&quot;</a> and was instantly amused by its simplicity and liberalness.&nbsp; After some more consideration, I decided that all of my software would be licensed under either a revised BSD license or a beerware-like license.&nbsp; Unfortunately, I can't use the beerware license because I don't drink beer, so I created the &quot;Lunch-ware License&quot; for drys everywhere:</p>
<blockquote>
<pre>/*
 * THE LUNCH-WARE LICENSE
 * I, &lt;AUTHOR&gt; &lt;&lt;EMAIL&gt;&gt;, wrote this file in &lt;YEAR&gt;.
 * You can do whatever you want to do with it as long as you retain this notice
 * verbatim. If we meet some day and you think this stuff is worth it, you can
 * buy me a lunch in return.
 */</pre>
</blockquote>
<p>I doubt that this license will be widely used. Whatever. :-)</p>
</div>