Oracle Blog

Random thoughts of a disorganized mind...

Friday Oct 31, 2014

Last week, a blog post,
Hints for writing Unix tools by
Marius Eriksen, made the rounds.
It presented nine suggestions on what makes a command a good citizen of
the Unix command-line ecosystem, especially for fitting into pipelines
and filters.

This reminded me of a longer list of guidelines I recently gathered
as part of our efforts to train new hires in Solaris engineering.
I polled long-time engineers, trawled the Best Practices documents of
our Architecture Review Committee, cross-referenced the
WCAG2ICT accessibility guidelines
for non-web applications recommended by
Oracle’s
accessibility group, and linked to our online documentation, to come
up with our suggestions on writing new CLI tools for Solaris.

Since these may be useful to others writing commands, I figured I’d
share some of them. I’ve left out the bits specific to complying
with our internal policies or using private interfaces that aren’t
documented for external use, but many of these are generally applicable.
Do note that these are based in part on lessons learned from 40+ years
of Unix history, and that history means that many existing commands
do not follow these suggestions, and in some cases, can’t follow them
without breaking backwards compatibility, so please don’t start
calling tech support to complain about every case our old code isn’t
doing one of these things.

One of the key points of our best practices is that many commands are
part of a larger family, and it’s best to fit in with that existing
family. If you’re writing a Solaris-specific command, it should
follow the Solaris Command Line Interface Paradigm guidelines (as listed in
the Solaris Intro(1) man page), but GNU commands should instead follow the
GNU Coding Standards,
classic X11 commands should use the options described in the
OPTIONS section of the X(7) man page, and so on.

Command names & paths

Most new commands should have names 3-9 letters long.
Command names longer than 9 letters should be commands users
rarely have to type in.

Follow common naming patterns, such as:

Pattern   Usage
*adm      Command to change current state & administer a subsystem
*cfg      Command to make permanent configuration changes to a subsystem
*info     Command to print information about objects managed by a subsystem
*prop     Command to print properties of objects managed by a subsystem
*stat     Command to print/monitor statistics on a subsystem

Commands run by normal users should be delivered in
/usr/bin/. Commands normally only run by sysadmins
should be delivered in /usr/sbin/. Commands only run
by other programs, not humans, should be in an appropriate
subdirectory under /usr/lib/. (Commands not delivered with
the OS should use the appropriate subdirectory under /opt
instead of /usr in the above paths.)

Options

Never provide an option to take a password or
other sensitive data on the command line or in environment variables,
as ps and the proc tools can show those to other users.
(See Passing secrets to subprocesses.)

All commands should have a --help and -?
option to print recognized options/arguments/subcommands.

Option parsing should use one of the standard getopt() routines
if at all possible. If you don’t use one, your custom
parser will need to replicate a lot of things the standard routines
provide for error checking & handling.

Subcommands

If you are writing a command that uses subcommands, a little care
in how you design them can make your command much easier to use.

Good examples to follow: hg, zfs, dladm

The help subcommand should list the other subcommands,
but not overwhelm the user with pages of details on all of them.
(Remember, the Solaris kernel text console has no scrollback and users
with text-to-speech don’t want 10 minutes of output from it.)
Good examples: hg, svccfg

The help foo or foo --help subcommands should
list the options specific to that subcommand.
Good examples: hg

Look at existing commands with similar subcommands and use similar names
for your subcommands.

Text output

All functionality should be available when TERM=dumb.
Color output, bold text, terminal positioning codes,
etc. can be used to enhance output on more capable terminals, but
users need to be able to use the system without them. Users
may need to run different commands to get plain text interface
instead of curses/terminal mode, such as ed instead of
vi, or mailx vs. mutt, as long as it’s
clearly documented what they need to run instead, but they must be
able to get their work done in some way. (See WCAG2ICT #1.3.2, WCAG2ICT #1.4.1, & WCAG2ICT #1.4.3)
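
One way to honor this is to decide once, at startup, whether any terminal enhancements are allowed. The sketch below is only an assumption about how a tool might do that; the function names are hypothetical, and a real program would more likely consult terminfo for actual terminal capabilities.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Decide from the TERM value and tty status whether color may be used.
 * Split out from use_color() so the policy can be tested on its own.
 */
int
term_allows_color(const char *term, int is_tty)
{
    if (!is_tty)
        return 0;       /* output is a pipe or file: plain text only */
    if (term == NULL || term[0] == '\0')
        return 0;       /* unknown terminal: assume nothing */
    if (strcmp(term, "dumb") == 0)
        return 0;       /* TERM=dumb: plain text only */
    return 1;
}

/* Apply the policy to the current process's stdout and environment. */
int
use_color(void)
{
    return term_allows_color(getenv("TERM"), isatty(STDOUT_FILENO));
}
```

The rest of the program then checks use_color() before emitting any escape sequences, and prints the same information as plain text otherwise.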

Text output is generally composed of messages and
data. Messages are the text included in the program,
such as status descriptions, error messages, and output headers;
while data comes from the subsystem the command interacts with, and
depends on the system in question.

Messages displayed to users should use
gettext(3C)
to allow translation & localization.

Errors should be printed to stderr, other output to
stdout, unless specific output redirection options (such
as logging errors to a file) are given.
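
To make the message/data split concrete, here is a small sketch using gettext(3C). The text domain name "exampletool" and both function names are invented for illustration. The message string goes through gettext() so it can be translated, the object name is data and is passed through untouched, and the caller picks the stream (normally stderr for errors).

```c
#include <libintl.h>
#include <locale.h>
#include <stdio.h>

#define _(s) gettext(s)

/* Call once at startup. "exampletool" is a hypothetical text domain. */
void
init_messages(void)
{
    setlocale(LC_ALL, "");
    textdomain("exampletool");
}

/*
 * Report an error about one object. The format string is a translatable
 * message; `name` is data from the subsystem and is never translated.
 * Returns the fprintf() result (negative on output error).
 */
int
report_missing(FILE *errstream, const char *name)
{
    return fprintf(errstream, _("%s: no such object\n"), name);
}
```

A typical call would be report_missing(stderr, name). If no translation catalog is installed for the domain, gettext() simply returns the original English string.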

Users should be able to disable any use of ASCII art, line
drawing characters, figlet-style
text and any other output other than plain text which a
text-to-speech screen reader cannot figure out how to read, while
not losing information, only formatting of it. (See
WCAG2ICT #1.1.1
&
WCAG2ICT #1.4.5)

Commands that do any form of user authentication should
use a full PAM conversation to do so, allowing PAM to provide
required prompts for the user.

User Interaction

If you offer an interactive command prompt mode, such as svccfg
does for executing subcommands, consider using libtecla or similar support for command line editing in this
mode.

Any operation that may permanently alter or destroy data should
either have an “undo” option (such as rollback to prior snapshot)
or have a mode offering the user a chance to confirm (such as the
-i option to rm). (See WCAG2ICT #3.3.4)
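
A confirmation mode can be very small. The following is a minimal sketch, not taken from rm or any other Solaris command; the confirm() name is hypothetical, and a real tool would also want localized yes/no responses.

```c
#include <stdio.h>

/*
 * Ask a yes/no question on `out` and read the reply from `in`.
 * Returns 1 only if the reply starts with 'y' or 'Y'; end of input
 * or a read error counts as "no", so nothing is destroyed by default.
 */
int
confirm(FILE *in, FILE *out, const char *question)
{
    char reply[16];

    fprintf(out, "%s (y/n)? ", question);
    fflush(out);
    if (fgets(reply, sizeof (reply), in) == NULL)
        return 0;
    return (reply[0] == 'y' || reply[0] == 'Y');
}
```

Defaulting to "no" on anything unexpected is the design choice that matters here: an accidental Enter or a closed stdin should never be taken as consent.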

Users should be able to configure timeout lengths for any operation that
expects user interaction before a timeout expires. (See
WCAG2ICT #2.2.1)

Implementation

Commands are expected to return 0 on success, 1 upon a fatal
error, and 2 when invoked incorrectly (usage error). Standard C also
provides EXIT_SUCCESS and EXIT_FAILURE in stdlib.h for the first
two cases. If other exit status values can be returned, document them in
your man page.
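
A sketch of that convention in code, assuming a tool that classifies its result before exiting. EXIT_USAGE, enum outcome, and exit_status_for() are names made up for this example; only EXIT_SUCCESS and EXIT_FAILURE come from Standard C (in stdlib.h).

```c
#include <stdlib.h>

/* Hypothetical name for the usage-error status; not a standard macro. */
enum { EXIT_USAGE = 2 };

/* Hypothetical classification of how a command run ended. */
enum outcome { OUTCOME_OK, OUTCOME_FATAL, OUTCOME_USAGE };

/* Map an outcome to the process exit status described above. */
int
exit_status_for(enum outcome o)
{
    switch (o) {
    case OUTCOME_OK:
        return EXIT_SUCCESS;
    case OUTCOME_USAGE:
        return EXIT_USAGE;
    case OUTCOME_FATAL:
    default:
        return EXIT_FAILURE;
    }
}
```

main() can then end with return exit_status_for(result), which keeps the mapping in one place and makes it easy to document in the man page.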

Wednesday Oct 01, 2014

A few people have noticed a trend in the Oracle Solaris 11 update releases
of delivering more and more Solaris commands as 64-bit binaries, so I figured
it was time to write a detailed explanation to answer some of the questions
and help prepare users & developers for further change, as it now becomes
more critical to deliver shared objects in both 32-bit (ILP32) and
64-bit (LP64) versions.

I’d like to thank Ali, Gary, Margot, Rod, and Jeff for their feedback
on this post, and most especially Sharon for helping rework it to get to the
most important bits first.

What do you need to do?

You need to do what you have always done — ensure you have both
32-bit and 64-bit versions of any shared objects. This requirement is being
highlighted because the consequences of not providing both binary versions
are becoming more disruptive in each subsequent Solaris release.

Development requirements

If you develop software for Solaris, the requirement is to provide both
32-bit and 64-bit versions of any shared objects you provide for other software
to use, whether as libraries to link their programs against or as loadable
objects for frameworks such as localized input methods, PAM service modules,
custom crypt(3C) password hashes, or dozens of other shared object uses in
Solaris.

User requirements

Administrators and users should verify that the software that they install
and use provides both 32-bit & 64-bit shared objects. If they are not
provided, contact the developers to provide both binary versions.

You should also keep an eye on the
End of Features (EOF) Planned for Future Releases of Oracle Solaris
page to see if something you may still need is being removed because it was
determined not to be useful any more when it was reviewed for LP64 conversion.
The page is updated regularly as new items make their way through the internal
obsolescence review processes. If something appears there that would
cause you major grief, let Solaris development know through your sales or
support channels, so we can supply a better transition or replacement plan
where possible.

And of course, if you find bugs in the Solaris 64-bit converted software, or
find that you need a 64-bit version of a particular Solaris library or shared
object that’s not already available, file bugs via
Oracle Support or
the Oracle Partner Network for Solaris.

Delivery of X commands as LP64 binaries

Because Solaris 11 no longer requires programs to run on a 32-bit kernel,
and the minimum supported system has 64 times as much RAM as the first Ultra 1,
Oracle Solaris can now ship programs directly as 64-bit binaries, which better
equips them to handle modern-sized data sets while using the full
capabilities of today’s hardware, and we have started doing so.

For instance, before Solaris 11 shipped, I switched the default build flags
for the X Window System programs in Solaris to 64-bit (with a few exceptions).
Solaris has long shipped
all the non-obsolete public libraries for X as both 32-bit and 64-bit, and
the upstream open source versions of this software were made 64-bit clean
long ago, originally by DEC for their Alpha workstations, and maintained
since then by the BSD & Linux platforms that delivered 64-bit only
distributions instead of multilib models. For the most part, this was
just an implementation detail for the X programs - their functionality
didn’t change, nor did their interfaces.

One case with a visible impact from becoming LP64-only was
the Xorg server, which uses dlopen() to load shared object drivers for the
specific hardware in use. Solaris 10 8/07
moved from Xorg 6.9 to 7.2 and started delivering Xorg as LP64 binaries
for SPARC & x64, since video cards would soon have more VRAM than a
32-bit X server could access. This was also the first delivery of Xorg for
SPARC, and did not need to support 32-bit SPARC platforms, so on
SPARC Xorg was LP64 from the beginning. On x86, since Solaris was still
supporting 32-bit platforms, the 64-bit Xorg was added alongside the existing
32-bit version.
As of the 64-bit only release of Solaris 11, we could drop the 32-bit version
of Xorg in Solaris, but because Solaris wasn’t the only source
of the loadable driver modules (for instance, nvidia & VirtualBox
both provided drivers for their graphics), we ensured that 64-bit versions
were available, before
announcing
the end of support for 32-bit driver modules.

One less visible impact was in xdm, the old style login GUI for X, which Solaris still
ships, though most Solaris systems use the more modern GNOME display manager
(gdm) instead. In order to authenticate users, xdm uses the PAM framework, which allows administrators to configure a variety of login
methods, such as Kerberos or SmartCards. Administrators can also install
additional PAM methods to work with authentication systems Solaris doesn’t
have built-in support for, or additional pluggable crypt modules
to handle other password hashing methods.
While PAM has supported 64-bit modules since Solaris 7, and the crypt framework
has supported 64-bit modules since it was
introduced in Solaris 9,
most programs calling PAM and the crypt framework have been 32-bit.
Installing only the 32-bit version of a crypt or PAM module thus worked
most of the time. However, now that xdm is 64-bit, a 32-bit-only module
will generate failure messages for users trying to log in via xdm, because
the system won’t find a 64-bit module to load. While xdm and PAM
show that a non-existent binary can prevent system login, other less
prominent 64-bit shared objects are going to be required over time, so
providers of shared objects need to ensure they are installing both
32-bit & 64-bit versions of their software going forward.

Some users of custom input methods for different languages may also notice
that their input methods are not available in 64-bit programs, since input
methods are similarly provided as shared objects that are loaded via
dlopen() and thus also have to exist in the same ABI variants
as the calling programs.

Conversion of Solaris commands to LP64 binaries

This effort isn’t limited to X11 software — engineers in all areas
of Solaris are evaluating the various programs
and determining what needs to be done. They have two tasks - to ensure the
program itself is 64-bit clean, and to ensure that the surrounding ecosystem (such as the PAM modules
example above) is 64-bit ready. In some cases, they’ve found
software Solaris doesn’t need to ship any longer, like the
gettable tool for maintaining Internet host tables in the days before
DNS, and are publishing End of Support notices for them.
But for most cases, they’re working to deliver 64-bit versions into
Solaris as time allows.

The results of this work are already visible — the number of LP64
programs in /usr/bin and /usr/sbin in the full Oracle Solaris
package repositories has climbed each release:

In Solaris 11.0, X11 programs provided the bulk of the LP64 programs, but
there were some from other subsystems, such as gdb, emacs, and the NSS crypto
commands. Solaris 11.1 added LP64 crypto commands, including digest, decrypt,
elfsign, pktool, and tpmadm; as well as other commands like top & gzip.
Solaris 11.2 added a number of LP64 GNU programs, including the GNU coreutils
and groff. New tools added in 11.2 were 64-bit in their first delivery,
such as mlocate, the Intel GPU tools, and the jsl JavaScript Lint package.
The bzip2 compression program was made LP64 in 11.2 at the request of a
customer who uses it on large files and wanted the extra performance on
x64.

Solaris 11.2 also adds Java 8 packages, alongside Java 7, and the now deprecated
Java 6 packages. Java 8 for Solaris is 64-bit-only,
dropping the 32-bit binary option found in previous Java releases. As noted
under the Removal of 32-bit Solaris item in the Features Removed From JDK8 list, the Java plugin for web browsers was also removed
from Java 8 for Solaris, because a 64-bit Java plugin cannot be run in the
32-bit Firefox browser.

Solaris Data Model History

Solaris 11.2 represents the latest step in a 20 year journey.

The original Solaris 2.x ABI used a data model referred to as ILP32, in
which the size of the C language types “int”, “long”,
and pointers are all 32-bit numbers. This data model matched the SPARC &
x86 CPUs available at the time.

In 1995, Sun introduced its first SPARC v9 CPU, the UltraSPARC I,
which offered 64-bit integers and addresses. Solaris 7 followed,
bringing a second ABI to Solaris, using the LP64 model, in which
“int” remained 32 bits wide, but “long” and
pointer sizes doubled to 64 bits.
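
The two models can be told apart from the type sizes alone. This is a small sketch with hypothetical function names, which reports the model the program itself was compiled for.

```c
#include <stddef.h>

/*
 * Name the data model implied by the sizes (in bytes) of int, long,
 * and pointers. Anything else is reported as "other".
 */
const char *
data_model(size_t int_bytes, size_t long_bytes, size_t ptr_bytes)
{
    if (int_bytes == 4 && long_bytes == 4 && ptr_bytes == 4)
        return "ILP32";
    if (int_bytes == 4 && long_bytes == 8 && ptr_bytes == 8)
        return "LP64";
    return "other";
}

/* The data model this translation unit was compiled for. */
const char *
current_data_model(void)
{
    return data_model(sizeof (int), sizeof (long), sizeof (void *));
}
```

Building the same source once with the compiler's 32-bit flag and once with its 64-bit flag (typically -m32 and -m64) and printing current_data_model() shows the difference directly.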

This affected both the kernel and user space code, and Solaris 7 delivered
both 32-bit (ILP32) and 64-bit (LP64) kernels for UltraSPARC systems.
The Solaris kernel implementation only allowed running 64-bit user space
processes if a 64-bit kernel was loaded, but 32-bit user space software
could be used with either a 32-bit or 64-bit kernel.

For user-space programs, the class (32 or 64-bit) of libraries must match
the program that links to them. In order to support both 32 and 64-bit
programs, it was necessary to provide both 32 and 64-bit versions of
libraries. To preserve binary compatibility with existing 32-bit
software, the libraries in directories such as /usr/lib were left as
the 32-bit versions, and the 64-bit versions were added in a new
sparcv9 subdirectory. On modern Linux systems, this approach is now
called “multilib.”

Because the UltraSPARC CPUs did not impose any significant performance
penalty when running existing 32-bit code on a 64-bit CPU, Sun continued
to ship 32-bit user-space programs in the ILP32 ABI so that the same binary
could be used on both 32-bit-only and 64-bit-capable CPUs, reducing
development, testing, and support costs. It also reduced by half the memory
requirements for pointers and long ints (thus more easily fitting them in the
CPU caches) in programs which wouldn’t benefit from the larger sizes, an
important consideration
in systems with only 32MB of RAM.
For the small number of programs that had to run 64-bit to be able to read
64-bit kernel structures or debug other 64-bit binaries, Solaris shipped
both 32-bit & 64-bit versions, with isaexec
used to execute the version matching the kernel in use.

On the x86 side, nearly a decade later AMD’s first AMD64 CPUs provided
similar hardware support, and Solaris 10 introduced a matching LP64 ABI for
x86 platforms in 2005, with the 64-bit libraries delivered in
amd64 subdirectories. Even though 32-bit binaries did not run as
fast as fully 64-bit binaries, Solaris followed the same model of providing
mostly ILP32 programs to get the release to market faster; to save on
development, test, and support costs; and to be consistent with Solaris on
SPARC platforms.

In these transitional phases, 32-bit & 64-bit software and hardware
coexisted. For SPARC, support for 32-bit-only CPUs was phased out in
Solaris 10, when support for the last pre-sun4u platforms was dropped.
Support for the 32-bit kernel was dropped at the same time, because all
remaining supported hardware could run the 64-bit kernel. For x86,
support for 32-bit-only CPUs was dropped in Solaris 11.

Therefore, as of the shipping of Oracle Solaris 11 in November 2011,
the supported set of platforms have 64-bit kernels that can run 32-bit
or 64-bit user space binaries.

Other Differences Between 32-bit & 64-bit ABIs

While the size of the types is the defining difference between the two ABI
models, the opportunity to introduce a fresh new ABI after learning of the
mistakes and limitations of the old ABI was hard to resist, and other changes
were made as well. Significant differences include:

stdio interfaces in libc support file descriptors > 255 without any
compatibility issues or interface extensions

Some other platforms followed a different strategy. For example, Linux
introduced “x32”,
a new 32-bit ABI, and is
considering a proposal for year-2038-safe ABIs. Engineers at Sun long ago
debated a “large time” extension to the 32-bit ABI like the large
file interfaces, but decided to concentrate efforts on LP64 instead.
Because Solaris is not trying to maintain 32-bit kernel support for embedded
devices, that is not a problem we have to solve as we move forward.
The result should be a simpler system, which is always
a benefit for developers and ISVs.

We don’t know yet when we’ll finish this journey, but hopefully
we’ll get there before the industry starts converting software to run on
CPUs with 128-bit addressing.

Disclaimer: The preceding is intended to outline
our general product direction. It is intended for information purposes only,
and may not be incorporated into any contract. It is not a commitment to
deliver any material, code, or functionality, and should not be relied upon in
making purchasing decisions. The development, release, and timing of any
features or functionality described for Oracle’s products remains at the
sole discretion of Oracle.

One other change that showed up when gathering data for this list was that
the Oracle Database 12c prerequisites package was renamed between beta &
GA to better match the database naming style - previously it was called
group/prerequisite/oracle/oracle-rdbms-server-12cR1-preinstall
but is now
group/prerequisite/oracle/oracle-rdbms-server-12-1-preinstall.
Fortunately, you don't have to type in the whole FMRI to install it;
pkg install oracle-rdbms-server-12-1-preinstall is enough.

Detailed list of changes

This table shows most of the changes to the bundled packages between
the 11.2 beta released in April, and the 11.2 GA release in July.

As before, some were excluded for clarity, or to reduce noise and
duplication. All of the bundled packages which didn’t change the version number
in their packaging info are not included, even if they had updates to fix bugs,
security holes, or add support for new hardware or new features of Solaris.

Monday Jun 09, 2014

In Solaris 11.1,
I updated the system headers to enable use of several attributes on
functions, including noreturn and printf format, giving
compilers and static analyzers more information about how those functions
are used so they can give better warnings when building code.

In Solaris 11.2, I've gone back in and added one more attribute to
a number of functions in the system headers:
__attribute__((__deprecated__)). This is used to warn people
building software that they’re using function calls we recommend
no longer be used. While in many cases the
Solaris Binary Compatibility Guarantee means we won't ever remove
these functions from the system libraries, we still want to discourage
their use.
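
For readers who haven’t used the attribute, here is a minimal sketch of how a header might flag one function and offer another. The old_copy() and safe_copy() functions are invented for this example and are not the Solaris declarations.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical legacy interface: still available, but any call to it
 * makes gcc and other compilers that honor the attribute emit a warning.
 */
char *old_copy(char *dst, const char *src) __attribute__((__deprecated__));

/* Hypothetical replacement that knows the size of the destination. */
int safe_copy(char *dst, size_t dstlen, const char *src);

char *
old_copy(char *dst, const char *src)
{
    return strcpy(dst, src);    /* no bounds check: why it is discouraged */
}

/* Copy src into dst; returns 0 on success, -1 if it would not fit. */
int
safe_copy(char *dst, size_t dstlen, const char *src)
{
    size_t n = strlen(src);

    if (dstlen == 0 || n >= dstlen)
        return -1;
    memcpy(dst, src, n + 1);    /* n + 1 copies the terminating NUL too */
    return 0;
}
```

Defining the deprecated function does not trigger the warning; only callers see it, which is what makes the attribute suitable for system headers.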

I made passes through both the POSIX and C
standards, and some of the Solaris architecture review cases to come up with
an initial list which the Solaris architecture review committee accepted to
start with. This set is by no means a complete list of Obsolete function
interfaces, but should be a reasonable start at functions that are well
documented as deprecated and seem useful to warn developers away from.
More functions may be flagged in the future as they get deprecated, or if
further passes are made through our existing deprecated functions to flag
more of them.

To See or Not To See

To see these warnings, you will need to be building with either gcc
(versions 3.4, 4.5, 4.7, & 4.8 are available in the 11.2 package repo),
or with Oracle Solaris Studio 12.4 or later (which like Solaris 11.2, is
currently in beta testing).
For instance, take this oversimplified (and obviously buggy) implementation of
the cat command:

The exact warning given varies by compilers, and the compilers also have a
variety of flags to either raise the warnings to errors, or silence them.
Of course, the exact form of the output is Not An Interface that can
be relied on for automated parsing, just shown for example.

gets(3C) is actually a special case — as noted above, it is no
longer part of the C Standard Library in the C11 standard, so when compiling in
C11 mode (i.e. when __STDC_VERSION__ >= 201112L), the
<stdio.h> header will not provide a prototype for it, causing
the compiler to complain it is unknown:

The gets(3C) function of course is still in libc, so if you ignore the
error or provide your own prototype, you can still build code that calls it,
you just have to acknowledge you’re taking on the risk of doing so yourself.
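
If you would rather move off gets(3C) than carry that risk, the usual replacement is fgets(3C), which takes the buffer size. A minimal sketch, with read_line() being a hypothetical helper name:

```c
#include <stdio.h>
#include <string.h>

/*
 * Read one line from `in` into buf (at most size - 1 characters) and
 * strip the trailing newline, as gets() did. Returns buf, or NULL at
 * end of input or on error.
 */
char *
read_line(char *buf, size_t size, FILE *in)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return NULL;
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}
```

Unlike gets(), a line longer than the buffer is not written past the end of it; the remainder is returned by the next call, so callers that care about long lines need to check for that.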

Tuesday Apr 29, 2014

When Solaris 11.1 came out in October 2012, I posted about the
changes to the included FOSS packages. With the publication today of
Solaris 11.2 beta, I thought it would be nice to revisit this and
see what’s changed in the past year and a half. This time around, I’m
including some bundled packages that aren’t necessarily covered by a
free software or open source license, but are of interest to Solaris users.

Removing software in updates

Last time I discussed how IPS allowed us to make a variety of
changes in update releases much more easily than in the Solaris 10 package
system. One of these changes is obsoleting packages, and we’ve done that
in a couple rare cases in both Solaris 11.1 and 11.2 where the software is
abandoned by the upstream, and we’ve decided it would be worse to keep it
around, potentially broken, than to remove it on upgrade.

When we do this, notices will be posted to the End of Features for Solaris 11 web page, alongside the list of
features that have been declared deprecated and may be removed in
future releases. As you can see there, in Solaris 11.1 the Adobe
Flash Player and tavor HCA driver packages were removed.

In Solaris 11.2, three more packages have been removed. slocate
was a “secure” version of the locate utility, which wouldn’t show
a user any files that they didn’t have permission to access. Unfortunately,
this utility was broken by changes in the AST library, and since there is no
longer an upstream for it, we decided to follow the lead of several Linux
distros and move to mlocate
instead, which is added in this release.

The other two removed packages are both Xorg video drivers - the nv
driver for NVIDIA graphics, and the trident driver for old
Trident graphics chipsets. Most users will not notice these removals, but
if you had manually created an xorg.conf file specifying one of these drivers,
you may need to edit it to use the vesa driver instead.

NVIDIA had previously supported the nv open source driver and
contributed updates to X.Org to support new chipsets in it, but in 2010, they
announced they would no
longer do so, and considered nv deprecated, recommending the use of
the VESA driver for those who had no better driver to use. While we had
continued to ship the nv driver in Solaris, it led to an increasing number of
crashes, hangs, and other bugs for which the resolution was to remove the nv
driver and use vesa instead, so we are removing it to end those issues.
For systems with graphics devices new enough to be supported by the bundled
nvidia closed-source driver, this will have no effect. For those
with older devices, this will cause Xorg autoconfiguration to load the vesa
driver instead, until and unless the user
downloads & installs an appropriate NVIDIA legacy driver.

The trident driver was still in Solaris even after we dropped
32-bit support on x86, and years after Trident Microsystems exited the
graphics business and sold its graphics product line to XGI, as the
Sun Fire V20z server included a Trident chipset for the console video device.
Unfortunately, the upstream driver has been basically unmaintained since
then, and Oracle has had to apply patches to port to new Xorg releases.
Meanwhile, in order to resolve bugs that caused system hangs, the
trident driver was modified to not load on V20z systems, which left us
shipping an unmaintained driver solely for a system that could not use it,
but uses the vesa driver instead, so we decided to remove it as well.

If you had either of these Xorg driver packages installed, then when you
update to 11.2, pkg update will inform you there are release
notes for these drivers, to warn you of the possibility you may need to edit
your xorg.conf.

System Management Stack

The popular Puppet
system for automating configuration changes across machines has been included
in Solaris, and updated to support several Solaris features in both the
framework and in individual configuration providers. For instance,
configuration changes made via Puppet will be recorded in the Solaris audit
logs as part of a puppet session, and Puppet’s configuration file is generated
from SMF properties using the new SMF stencil facilities. Providers are included that can configure IPS
publishers, SMF properties, ZFS datasets, Solaris boot environments, and
a variety of Solaris NIC, VNIC, and VLAN settings.

Another addition is the Oracle Hardware Management Pack (HMP), a set of tools that work with the ILOM,
firmware, and other components in Sun/Oracle servers to configure low-level
system options. Previously these needed to be downloaded and installed
separately; now they are a simple pkg install away, and kept up to date with
the rest of the OS.

A collaboration with Intel led to the integration of a Solaris port of
Intel’s numatop tool for observing
memory access locality across CPUs.

The Java 8 development kit & runtime environment are both available
as well. The default installation clusters will only install Java 7, but
you can install the Java 8 runtime with “pkg install jre-8”
or get both the runtime & development kits with
“pkg install jdk-8”. The /usr/java mediated link,
through which all the links in /usr/bin for the java,
jar, javac, etc. commands flow, will be set by default to
the most recent version installed, so installing Java 8 will make that version
the default. You can see this via “ls -l /usr/java” reporting:

If you want to choose a different version to be default, you can manually set
the mediator to that version with
“pkg set-mediator -V 1.7 java”. Of course, for many
operations, you can directly access any installed java version via the
full path, such as /usr/jdk/instances/jdk1.8.0/bin/java instead of
relying on the /usr/bin symlinks.

One caveat to be aware of is that Java 8 for Solaris is only provided as
64-bit binaries, as all Solaris 11 and later machines are running 64-bit now.
This means that any JNI modules you rely on will need to be compiled as 64-bit
and any programs that try to load Java must be 64-bit. There is also no
64-bit version provided of either the Java plugin for web browsers, or the
Java Webstart program for starting Java client applications from web pages.

Desktop Stack

There were some feature updates in the X Window System layers of the
desktop stack though – most notably the Xorg server was upgraded from
1.12 to 1.14, and the accompanying Mesa library was upgraded
to version 9.0.3, which includes support for OpenGL 3.1 and GLSL 1.40
on Intel graphics. The bundled version of NVIDIA’s graphics driver was
also updated, to NVIDIA’s latest “long lived branch” - 331.
For users with older graphics cards which are no longer supported in this
branch, legacy branches are available from NVIDIA’s Unix driver download
site.

OpenStack

And last, but certainly not least, especially in the number of packages
added to the repository, is the addition of OpenStack support in Solaris.
The Cinder Block Storage Service, Glance Image Service, Horizon Dashboard,
Keystone Identity Service, Neutron Networking Service, and Nova Compute
Service from the OpenStack Grizzly (2013.1) release are all provided, in versions
tested and integrated with Solaris features.
Between the OpenStack packages themselves and all the python modules required
for them, there are over 100 new FOSS packages in this release.

Detailed list of changes

This table shows most of the changes to the bundled packages between
the original Solaris 11.1 release, the latest Solaris 11.1 support
repository update (SRU18, released April 14, 2014), and the Solaris
11.2 beta released today.

As with last time, some were excluded for clarity, or to reduce noise and
duplication. All of the bundled packages which didn’t change the version number
in their packaging info are not included, even if they had updates to fix bugs,
security holes, or add support for new hardware or new features of Solaris.

Saturday Mar 29, 2014

Fifteen years ago today, March 29, 1999, I showed up at Sun’s MPK29
building for my first day of work as a student intern. I was off-cycle from
the normal summer interns, since I wasn’t enrolled Spring semester but was
going to finish my last two credits (English & Math) during the Summer
Session at Berkeley. A friend from school convinced his manager at Sun to
give me a chance, and after interviewing me, they offered to bring me on
board for 3 months full-time, then dropping down to half-time when classes
started.

I joined the Release Engineering team as a tools developer and backup RE
in what was then the Power Client Desktop Software organization
(to differentiate it from the Thin Client group) in the Workstation Products
Group of Sun’s hardware arm. The organization delivered the
X Window System, OpenWindows, and CDE into Solaris,
and was in the midst of two big projects for the Solaris 7 8/99 & 11/99
releases: a project called “SolarEZ” to make the CDE desktop more usable
and to add support for syncing CDE calendar & mail apps with Palm Pilot PDAs;
and the project to merge the features from X11R6.1 through X11R6.4
(including Xinerama, Display Power Management, Xprint, and LBX) into
Sun’s proprietary X11 fork, which was still based on the X11R6.0 release.

I started out with some simple bug fixes to learn the various build
systems, before starting to try to write the scripts they wanted to
simplify the builds. My first bug was to find out why every time they
did a full build of Sun’s X gate they printed a copy of the specs.
(To save trees, they’d implemented a temporary workaround of setting
the default printer queue on the build machines to one not connected to
a printer, but then they had to go delete all the queued jobs every
few days to free disk space.) After a couple days of digging, and
learning far too much about Imake for my own good, I found the cause
was a merge error in the Imake configuration specs. The TroffCmd
setting for how to convert troff documents to PostScript had been
resynced to the upstream setting of psroff, but the flags were still
the ones for Sun’s troff. These flags to Sun’s troff generated a
PostScript file on disk, while they made psroff send the PostScript to
the printer - switching TroffCmd back to Solaris troff solved the
issue, so I filed Sun bug 4227466 and made my first commit to the
Solaris X11 code base.

Six months later, at the end of my internship, diploma in hand, Sun offered
to convert my position to a regular full-time employee, and I signed on.
(This is why I always get my anniversary recognition in September, since they
don’t count the internship.) Six months after that, the X11R6.4 project
was finishing up and several of the X11 engineers decided to move on, making
an opening for me to switch to the X11 engineering team as a developer.
Like many junior developers joining new teams, I started out doing a lot of
bug fixes, and some RFE’s, such as integrating Xvfb & Xnest into Solaris 9.

A couple years in, Sun’s attempts to reach agreement amongst all of the CDE
co-owners to open source CDE had failed, resulting only in the
not-really-open-source release of OpenMotif, so Sun chose to move on once
again, to the GNOME Desktop.
This required us to provide a lot of infrastructure in the X server and
libraries that GNOME clients needed, and I got to take on some more
challenging projects such as trying to port the Render extension from
XFree86 to Xsun, assisting on the STSF font rendering project,
adding wheel mouse support to Xsun,
and working to ensure Xsun had the underlying support needed by GNOME’s
accessibility projects. Unfortunately, the results of some of those
projects were not that good, but we learned a lot about how hard it was to
retrofit Xsun with the newly reinvigorated X11 development work and how
maintaining our own fork of all the shared code was simply slowing us down.

Then one fateful day in 2004, our manager was on vacation, so I got a call
from the director of the Solaris x86 platform team, who had been meeting with
video card vendors to decide what graphics to include in Sun’s upcoming AMD64
workstations. The one consistent answer they’d gotten was that the time and
cost to produce Solaris drivers went up and the feature set went down if they
had to port to Xsun as well, instead of being able to reuse the XFree86 driver
code they were already shipping for Linux. He asked what it would take to
ship XFree86 on Solaris instead, and we discussed the options, then I talked
with the other X engineers, and soon I was the lead of a project to replace
the Solaris X server.

This was right at the time XFree86 was starting to come apart while the
old X.Org industry consortium was trying to move X11 to an open development
model, resulting in the formation of the X.Org Foundation. We chose to just
go straight to Xorg, not XFree86, and not to fork as we’d done with
Xsun, but instead to just apply any necessary changes as patches, and try to
limit those to not changing core areas, so it was easier to keep up with new
releases.

Thus, right at the end of the Solaris 10 development cycle we integrated the
Xorg 6.8 X server for x86 platforms, and even made it the default for that
release. We had no SPARC drivers ported yet, just all the upstream x86 drivers,
so only provided Xsun in the SPARC release. The SPARC port, along with a
64-bit build for x64 systems, came in later Solaris 10 update releases.
This worked out much better than our attempts to retrofit Xsun ever did.

Along with the Solaris 10 release, Sun announced that Solaris was going to
be open sourced, and eventually as much of the OS source as we could release
would be. But for the Solaris 10 release we only had time to add the Xorg
server and the few libraries we needed for the new extensions. Most of the
libraries and clients were still using the code from Sun’s old X11R6 fork,
and we needed to figure out which changes we could publish and/or contribute
back upstream. We weren’t ready to do this yet on the opening day of
OpenSolaris.org, but I put together a plan for us to do so
going forward, combining the tasks of migrating from our fork to a patched
upstream code base, and the work of excising proprietary bits so it could
be released as fully open source.

Due to my work with the X.Org open source community, my participation in the
existing Solaris communities on Usenet and Yahoo Groups, and interest in X
from the initial OpenSolaris participants, I was pulled into the larger
OpenSolaris effort, culminating in serving 2 years on the OpenSolaris Governing
Board. When the OpenSolaris distro effort (“Project Indiana”) kicked off, I
got involved in it to figure out how to integrate the X11 packages into it. Between
driving the source tree changes and the distro integration work for X, I became
the Tech Lead for the Solaris X Window System for the Solaris 11 release.

Solaris 11 also included a few bug fixes I’d found via experimenting with the
Parfait static analysis tool created by Sun Labs. Since shortly
after starting in the X11 group, I’d been handling the security bugs in X.
Having tools to help find those seemed like a good idea, and that was
reinforced when we used Coverity to find a privilege escalation bug in Xorg.
When the Sun Labs team came to demo Parfait to us, I decided to
try it out, and found and fixed a number of bugs in X with it (though not in
security sensitive areas, just places that could cause memory leaks or
crashes in code not running with raised privileges).

After the Oracle acquisition, part of adopting Oracle’s security assurance
policies was using static analysis on our code base. With my
experience in using static analyzers on X, I helped create the overall
Solaris plan for using static analysis tools on our entire code base.
This plan was accepted and I was asked to take on the role of leading our
security assurance efforts across Solaris.

Another part of integrating Sun into Oracle was migrating all the bugs from
Sun’s bug database into Oracle’s, so we could have a unified system, allowing
our Engineered Systems teams to work together more productively, tracking bugs
from the hardware through the OS up to the database & application level in a
single system. Valerie Fenwick & Scott Rotondo had managed the Solaris bug
tracking for years, and I’d worked with them, representing both the X11 and
larger Desktop software teams. When Scott decided to move to the Oracle
Database Security team, Valerie asked me to replace him, just as the effort
to migrate the bugs was beginning. That turned into 2 years of planning,
mapping, updating tools and processes, coordinating changes across the
organization, traveling to multiple sites to train our engineers on the
Oracle tools, and lots of communication to prepare our teams for the changes.

As we looked to the finish line of that work, planning was beginning for the
Solaris 12 release. All of the above experiences led to me being named as the
Technical Lead for the Solaris 12 release as a whole, covering the entire OS,
coordinating and working with the tech leads for the individual consolidations.
We’re two years into that now, and while I can’t really say much more yet than
the official roadmap, I think we’ve got some solid plans in place
for the OS.

So here it is now 15 years after starting at Sun, 14 after joining the X11
team. I still do a little work on X in my spare time (especially around the
area of X security bugs), and officially am still in the X11 group, but most
of my day is spent on the rest of Solaris now, between planning Solaris 12,
managing our bug tracking, and ensuring our software is secure.

In those 15 years, I’ve learned a lot, accomplished a bit, and made a lot of
friends. I’ve also:

worked under 3 CEO’s (Scott, Jonathan, Larry), and 4 managers, but more VP’s and Directors than I can count thanks to Sun’s love of regular reorgs.

been part of the Workstation Products Group, the Webtop Applications group, the Software Globalization group, the User Experience Engineering group, the x64 Platform group, the Solaris Open Source group, and a few I’ve forgotten, again thanks to Sun’s love of regular reorgs.

been seen by my co-workers in a suit exactly 4 times (my initial job interview, and funerals for 3 colleagues we’ve lost over the years).

Where will I be in 15 more years? Hard to guess, but I bet it involves making
sure everything is ready for Y2038, so I can retire in peace when
that arrives.

For now, I offer my heartfelt thanks to everyone whose help made the last
15 years possible - none of it was done alone, and I couldn't have got here
without a lot of you.

Sunday Feb 23, 2014

There has been much discussion online this weekend of what happens when you
introduce an extra “goto fail” line into your code, such as
making a mistake using a merge tool to combine your changes into those made in
parallel to a code repository under active development by other engineers.
(That is the most common cause I’ve seen of such duplication in source
code, and thus the one I see most likely —
others guess similarly and discuss that side more in depth — but that’s not what this post is about.)

which he pointed out “If I compile with -Wall (enable all warnings),
neither GCC 4.8.2 or Clang 3.3 from Xcode make a peep about the dead code.” Others pointed out that the -Wunreachable-code option is
one of many that neither compiler includes in -Wall by default, since
both compilers treat -Wall not as “all possible warnings”
but rather more of “the subset of warnings that the compiler authors
think are suitable for use on most software”, leaving out those that
are experimental, unstable, controversial, or otherwise not quite ready for
prime time.
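
To make the discussion concrete, here is a minimal sketch of the kind of duplicated-line mistake being described. The names verify_steps and check_step are hypothetical stand-ins, not the code from the actual bug, and whether a given compiler reports the dead code depends on its version and on flags such as -Wunreachable-code.

```c
/* Hypothetical stand-in for a real verification step: 0 on success. */
static int check_step(int value)
{
    return (value < 0) ? -1 : 0;
}

/*
 * Illustrative only: runs three checks, bailing out on the first failure.
 * The second "goto fail" is the kind of duplicate a bad merge can leave
 * behind. It is not guarded by the if above it, so it always executes
 * and the third check becomes dead code.
 */
int verify_steps(int a, int b, int c)
{
    int err;

    if ((err = check_step(a)) != 0)
        goto fail;
    if ((err = check_step(b)) != 0)
        goto fail;
        goto fail;      /* duplicated line: always taken */
    if ((err = check_step(c)) != 0)     /* never reached */
        goto fail;

fail:
    return err;
}
```

In this sketch, calling verify_steps(1, 1, -1) skips the failing third check and reports success, which is why a warning about the unreachable block is worth having.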

Since this has a simple test case, and I’ve got a few compilers handy on my
Solaris x86 machine for doing various builds with, I did some testing myself:

Solaris Studio 12.3

Both Studio’s default compiler flags and old-school lint static
analyzer caught this by default. Though, as I noted last year, warnings about
some unreachable code chunks will be seen in some releases but not others,
as the information available to the compiler about the control flow evolves.

As anyone who has compiled much software with more than one of the above
(including just different versions of the same compiler) can attest, the
warnings generated on real code bases by various compilers will almost always
be different. There are many cases where gcc & clang warn about things
Studio doesn’t, and vice versa, as every compiler & analyzer has been
fed different test cases over the years of bad code that developers wanted them
to find, or specific problems they needed to solve. This is why when
compiling the Solaris kernel, we run two compilation passes on all the
sources - once with Studio cc to both get its code analysis and to make the
binaries we ship, and once with gcc just to get its code analysis, plus running
it through Oracle’s internal Parfait static analyzer as well.

Of course, it’s too late to stop the bug that caused this discussion from
slipping into the code base, and easy to second guess after the fact how it may
have been prevented. We have lots of tools at our disposal and they continue to
grow and improve, including human code review (though human review is much less
reliable and repeatable than most tools, humans are much more flexible at
finding things the automated tools weren't prepared to catch), and we need to
use multiple ones (more on the most sensitive code, such as security protocols)
to improve our chances of finding and fixing the bugs before we integrate them.
Hopefully we can learn from the ones that slip through how to improve our checks
for the future, such as adding -Wunreachable-code compiler flags where
appropriate.

Since it was only $5, I picked it up to see if it was worth recommending to others. When not on sale, the ebook is only $15, since it’s not a large book - the PDF is 168 pages, the epub on my iPad was 199 pages - and in both forms that includes an index of about 25 pages. That also made it a quick enough read that I could get through in an afternoon, skimming over the examples and reference materials.

This book is intended to get existing Solaris admins up to speed quickly on Solaris 11 - it’s not going to introduce the basics of system administration, but will tell you what commands to run now. If you don’t know what routers, subnets, or tunnels are, you probably want a more introductory book - if you know what they are, and need to know what to run instead of ifconfig or editing /etc/hostname.e1000g0 to configure them on Solaris 11, this book will help.

Phil’s biases as a long time server admin are obvious in some sections, such as the introduction to NWAM, the Network AutoMagic feature, which he suggests “from a server sysadmin perspective, it might perhaps be better named "Never Wake A Monster"” though he admits on a laptop it can be useful to adjust to networks in different locations. There’s also a few areas where you can tell the book was written before Solaris 11.1 was out and didn’t get updated for the latest changes.

As the author of the classic pkg-get tool for installing Solaris SVR4 packages from network repositories, he has plenty to say about the new IPS packaging system in Solaris 11 as well, providing some useful tips on finding packages and setting up local repositories, though he does discourage use of many of the more advanced pkg subcommands that can help admins take more control over exactly what gets installed and updated on their systems.

Overall it’s a decent aid, and something I may refer to in the future, as I don’t do system administration that often these days, and often need a refresher, especially when old habits no longer work. It’s more detailed in many sections than the official Transitioning From Oracle Solaris 10 to Oracle Solaris 11.1 manual, which mainly points off to the other Solaris 11 manuals for details, but it concentrates only on the areas a typical system administrator will be configuring, not developer or end-user visible changes. I’d recommend it to experienced admins looking for a hands-on guide for dealing with Solaris 11 systems, but it’s not the right level for those trying to plan a migration or learn Solaris administration from scratch.

As always, the above is solely my personal opinion, not an official Oracle corporate position or endorsement.

Friday Nov 01, 2013

A bit over twenty years ago, Sun formed an Architecture Review Committee (ARC)
that evaluates proposals to change interfaces between components in Sun software
products. During the OpenSolaris days, we opened many of these discussions to
the community. While they’re back behind closed doors, and at a different
company now, we still continue to hold these reviews for the software from
what’s now the Sun Systems Group division of Oracle.

Recently one of these reviews was held (via e-mail discussion) to review
a proposal to update our GNU
findutils package to the latest upstream release.
One of the upstream changes discussed was the addition of an
“oldfind” program. In findutils 4.3, find was modified to use
the fts() function to walk the directory tree, and
oldfind was created to provide the old mechanism in case there were
bugs in the new implementation that users needed to work around.

In Solaris 11 though, we still ship the find descended from SVR4 as
/usr/bin/find and the GNU find is available as either
/usr/bin/gfind or /usr/gnu/bin/find. This raised the
question of whether we should add oldfind, and if so, what we should call it.
Normally our policy is to only add the g* names for GNU commands
that conflict with an existing Solaris command – for instance, we ship
/usr/bin/emacs, not /usr/bin/gemacs. In this case however,
it seemed it would be more confusing to have /usr/bin/oldfind
be the older version of /usr/bin/gfind, not of /usr/bin/find.
Thus if we shipped it, it would make more sense to call it
/usr/bin/goldfind, which several ARC members noted read more naturally
as “gold find” than as “g old find”.

One of the concerns we often discuss in ARC is if a change is likely to be
understood by users or if it will result in more calls to support. As we hit
this part of the discussion on a Friday at the end of a long week, I couldn’t
resist putting forth a hypothetical support call for this command:

“Hello, Oracle Solaris Support, how may I help you?”

“My admin is out sick, but he sent an email that he put the findutils
package on our server, and I can run goldfind now. I tried it, but
goldfind didn’t find gold.”

“Did he get the binutils package too?”

“No he just said findutils, do we need binutils?”

“Well, gold comes in the binutils package, so goldfind would be able to
find gold if you got that package.”

“How much does Oracle charge for that package?”

“It’s free for Solaris users.”

“You mean Oracle ships packages of gold to customers for free?”

“Yes, if you get the binutils package, it includes GNU gold.”

“New gold? Is that some sort of alchemy, turning stuff into gold?”

“Not new gold, gold from the GNU project.”

“Oracle’s taking gold from the GNU project and shipping it to me?”

“Yes, if you get binutils, that package includes gold along with
the other tools from the GNU project.”

“And GNU doesn’t mind Oracle taking their gold and giving it to customers?”

“No, GNU is a non-profit whose goal is to share their software.”

“Sharing software sure, but gold? Where does a non-profit like GNU get gold anyway?”

“Oh, Google donated it to them.”

“Ah! So Oracle will give me the gold that GNU got from Google!”

“Yes, if you get the package from us.”

“How do I get the package with the gold?”

“Just run pkg install binutils and it will put it on your disk.”

“We’ve got multiple disks here - which one will it put it on?”

“The one with the system image - do you know which one that is?”

“Well the note from the admin says the system is on the first disk
and the users are on the second disk.”

“Okay, so it should go on the first disk then.”

“And where will I find the gold?”

“It will be in the /usr/bin directory.”

“In the user’s bin? So that’s on the second disk?”

“No, it would be on the system disk, with the other development tools,
like make, as, and what.”

“So what’s on the first disk?”

“Well if the system image is there the commands should all be there.”

“All the commands? Not just what?”

“Right, all the commands that come with the OS, like the shell, ps, and who.”

“So who’s on the first disk too?”

“Yes. Did your admin say when he’d be back?”

“No, just that he had a massive headache and was going home after I tried to get him to explain this stuff to me.”

Needless to say, we decided this might not be the best idea. Since the GNU
package hasn’t had to release a serious bug fix in the new find in the
past few years, the new GNU find seems pretty stable, and we always have the
SVR4 find to use as a fallback in Solaris. Adding oldfind therefore didn’t
seem really necessary, and we passed on including it when we
update to the new findutils release.

[Apologies to Abbott, Costello, their fans, and everyone who read this far.
The Gold (linker) page on Wikipedia may explain some of the above,
but can’t explain why goldfind is the old GNU find, but gold is the new GNU ld.]

Sunday Aug 04, 2013

One of the areas the X.Org Foundation has been working on in recent years
is trying to bring new developers on board. Programs like Google Summer of Code and
The X.Org Endless Vacation of Code (EVoC)
help with this somewhat, but we’ve also been trying to find ways to lower the
barrier to entry, both so the students in those programs can learn more and so
that other developers have an easier time joining us.

Some of our efforts here have been technological - one of the driving
efforts of the conversions from Imake to automake and from CVS to git was
to make use of tools developers would already be familiar and productive
with from other projects. The Modularization project, which broke up
X.Org from one giant tree into over 200 small ones, had the goal of making
it possible to fix a bug in a single library or driver without having to
download and build many megabytes of software & fonts that were not
being changed.

We finally ended up holding a first session as a virtual gathering
over the internet one weekend in March 2012. Bart, Matt Dew, Keith Packard,
Stéphane Marchesin, and I used Gobby
and Mumble to agree on an
outline and write several chapters, along with creating some accompanying
diagrams using graphviz dot.
Unfortunately, the combination of the work from home setting, in which people
kept dropping in and out as their home life intervened, the small number of
people, and lack of an experienced organizer made this not as productive as
other book sprints have been for other projects.

After the first weekend we had drafts of about 7 chapters and a preface,
and outlines of several more. We also had a large chunk of prewritten material
on graphics drivers that Stéphane had been working on already to link with.
Over the next few months, Bart edited and formatted the chapters we had,
while we got a little more written, including recruiting additional authors
such as Peter Hutterer.

We then gathered
again, this time in person, hosted by Matthias Hopf at the
Georg-Simon-Ohm-University in Nürnberg, Germany, over the two days prior
to the 2012 X.Org Developer’s
Conference. We had some additional people join us this time, including
Martin Peres, Lucas Stach, Ben Skeggs, and probably more I’ve forgotten.
At this session, Bart, Matt & I worked on polishing the rough edges on
the guide we’d been creating, while the others worked more on the driver
guide that Stéphane had started. Bart figured out the tools needed to
generate ebook formats of the guide, such as epub, and made a cover and
pulled together an overall structure for the guide, and by the end of that
sprint, we had something we could read on the Android ebook reader he'd
brought with him.

And then, we waited. Bart tried to figure out how to make the setup
maintainable and reproducible, and how to make it available to our users,
but his day job as a University professor kept taking time away from that.
Bart also gave a lightning talk on our book sprint experience
at the Write the Docs conference
in Portland earlier this year covering what worked and what we learned didn’t
work in our virtual sprint, and asking for ideas or help on how to go forward.

Meanwhile, the burden of fighting the spammers on the X.Org & FreeDesktop.Org
wikis had gotten overwhelming on the current wiki software, so the freedesktop
sitewranglers evaluated different solutions, and after looking at how well
ikiwiki had been working for the past few
years on the XCB wiki decided to
move the rest of the FreeDesktop hosted wikis to ikiwiki as well. One major
advantage is that it let us use the existing freedesktop authentication
infrastructure for wiki accounts instead of having to find different ways to
let our users in while keeping the spammers out. Another benefit is that
it uses Markdown
and git to update the wiki, so we can easily
push files in Markdown format to the wiki to publish them now.

In a shocking coincidence, the geeks who wrote the X.Org Developer's Guide
also used Markdown & git to author it, so with just a very little conversion
to make links work between sections, it was possible to publish the guide to
the wiki, where you can find it now:

It’s not perfect, it’s not even fully finished, but it’s better than nothing,
and since it’s now on a wiki, it’s even easier for others to fill in the missing
pieces or improve the existing bits.

Stéphane is still working on the companion guide to graphics drivers,
covering the stack from DRM and KMS in the kernel up to Xorg & Mesa.
He posted the PDF of the draft
as it was at the end of the March book sprint, but not yet a version with
the contributions added from the September followup.

When working on applications higher up the stack, not hacking on the
graphics stack itself, we refer developers to the documentation for their
toolkits, or to general
guides to OpenGL programming, as those are beyond what we can document
ourselves in X.Org.

And for other open source projects, if you’d like a chance to have a
professionally organized, in person doc sprint for your project,
applications
are being taken until August 7 for 2-4 projects to hold sprints as part of
the Google Summer of Code Doc Camp in October, including travel
expenses for up to 5 participants from the sponsors.

Sunday Nov 11, 2012

While Solaris 11.1 was under development, we started seeing some errors in
the builds of the upstream X.Org git master sources, such as:

"Display.c", line 65: Function has no return statement : x_io_error_handler
"hostx.c", line 341: Function has no return statement : x_io_error_handler

from functions that were defined to match a specific callback definition
that declared them as returning an int if they did return, but these were
calling exit() instead of returning so hadn't listed a return value.

These had been generating warnings for years which we'd been ignoring, but
X.Org has made enough progress in cleaning up code for compiler warnings
and static analysis issues lately, that the community turned up the default
error levels, including the gcc flag -Werror=return-type and the
equivalent Solaris Studio cc flags -v -errwarn=E_FUNC_HAS_NO_RETURN_STMT,
so now these became errors that stopped the build.
Yet on Solaris, gcc built this code fine, while Studio errored out.
Investigation showed this was due to the Solaris headers, which during
Solaris 10 development added a number of annotations to the headers when
gcc was being used for the amd64 kernel bringup before the Studio amd64
port was ready. Since Studio did not support the inline form of these
annotations at the time, but instead used #pragma for them,
the definitions were only present for gcc.
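
As a rough sketch of the mismatch described above (not the actual contents of the Solaris headers), a header can end up hiding a "does not return" annotation behind a gcc-only macro. The macro name MY_NORETURN and the functions die and checked_div are hypothetical; only the gcc __attribute__((__noreturn__)) syntax is taken from a real compiler.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical annotation macro: expands to the inline gcc attribute when
 * a gcc-compatible compiler is in use, and to nothing otherwise. A compiler
 * taking the #else branch never learns that die() does not return.
 */
#if defined(__GNUC__)
#define MY_NORETURN __attribute__((__noreturn__))
#else
#define MY_NORETURN
#endif

/* Hypothetical helper that reports an error and ends the program. */
static void die(const char *msg) MY_NORETURN;

static void die(const char *msg)
{
    fprintf(stderr, "fatal: %s\n", msg);
    exit(EXIT_FAILURE);
}

/*
 * Declared to return int. When the annotation is visible, the compiler
 * can tell that the b == 0 path never falls off the end of the function.
 */
int checked_div(int a, int b)
{
    if (b != 0)
        return a / b;
    die("division by zero");
}
```

A compiler that sees the empty definition of the macro has no way to know the last line of checked_div ends the program, so it may complain about a missing return statement where gcc stays quiet.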

To resolve this, I fixed both sides of the problem, so that it would work
for building new X.Org sources on older Solaris releases or with older
Studio compilers, as well as fixing the general problem before it broke
more software building on Solaris.

To the X.Org sources, I added the traditional Studio
#pragma does_not_return to recognize that functions like exit()
don't ever return, in patches such as this Xserver patch. Adding a dummy return statement was
ruled out as that introduced unreachable code errors from compilers and
analyzers that correctly realized you couldn't reach that code after a
return statement.

And on the Solaris 11.1 side, I updated the annotation definitions in
<sys/ccompile.h> to enable for Studio 12.0 and later compilers
the annotations already existing in a number of system headers for functions
like exit() and abort(). If you look in that file you'll
see the annotations we currently use, though the forms there haven't gone
through review to become a Committed interface, so may change in the future.

Actually getting this integrated into Solaris though took a bit more work
than just editing one header file. Our ELF binary build comparison tool,
wsdiff,
actually showed a large number of differences in the resulting binaries due
to the compiler using this information for branch prediction, code path
analysis, and other possible optimizations, so after comparing enough of the
disassembly output to be comfortable with the changes, we also made sure to
get this in early enough in the release cycle so that it would get plenty of
test exposure before the release.

It also required updating quite a bit of code to avoid introducing new lint
or compiler warnings or errors, and people building applications on top of
Solaris 11.1 and later may need to make similar changes if they want to keep
their build logs similarly clean.

Previously, if you had a function that was declared with a non-void return
type, lint and cc would warn if you didn't return a value, even if you called
a function like exit() or panic() that ended execution.
For instance:

would previously require a never executed return 0; after the
exit() to avoid lint warning "function falls off bottom without
returning value".

Now the compiler & lint will both issue "statement not reached"
warnings for a return 0; after the final exit(),
allowing (or in some cases, requiring) it to be removed. However, if
there is no return statement anywhere in the function, lint will warn
that you've declared a function returning a value that never does so,
suggesting you can declare it as void. Unfortunately, if your
function signature is required to match a certain form, such as in a
callback, you may not be able to do so, and will need to add a
/* LINTED */ to the end of the function.
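
A minimal sketch of that callback situation follows. The handler type and function names are hypothetical, and the exact placement of the /* LINTED */ comment is an assumption based on the description above, so check the lint documentation for your release before relying on it.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical callback type: handlers must be declared as returning int. */
typedef int (*error_handler_t)(const char *message);

static error_handler_t current_handler = NULL;

/* Install a new handler and hand back the previous one. */
error_handler_t set_error_handler(error_handler_t handler)
{
    error_handler_t previous = current_handler;

    current_handler = handler;
    return previous;
}

/*
 * Must match error_handler_t, so it cannot be declared void, yet it never
 * returns because exit() ends the program. No "return 0;" follows exit(),
 * since that statement would be reported as not reached.
 */
int fatal_error_handler(const char *message)
{
    fprintf(stderr, "fatal error: %s\n", message);
    exit(EXIT_FAILURE);
    /* LINTED */
}
```

The handler would be registered with set_error_handler(fatal_error_handler); the signature is dictated by the callback type, which is why changing the return type to void is not an option here.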

If you need your code to build on both a newer and an older release, then you
will either need to #ifdef these unreachable statements, or, to keep
your sources common across releases, add to your sources the corresponding
#pragma recognized by both current and older compiler versions, such as:

#pragma does_not_return(exit)
#pragma does_not_return(panic)

Hopefully this little extra work is paid for by the compilers & code
analyzers being able to better understand your code paths, giving you
better optimizations and more accurate errors & warning messages.

When you’re ready to upgrade to the packages from either this repo,
or the support repository, you’ll want to first read
How to Update to Oracle Solaris 11.1 Using the Image Packaging System by Pete Dennis,
as there are a couple issues you will need to be aware of to do that upgrade,
several of which are due to changes in the Free and Open Source Software (FOSS)
packages included with Solaris, as I’ll explain in a bit.

Solaris 11 can update more readily than Solaris 10

In the Solaris 10 and older update models, the way the updates were built
constrained what changes we could make in those releases. To change an
existing SVR4 package in those releases, we created a Solaris Patch, which
applied to a given version of the SVR4 package and replaced, added or deleted
files in it. These patches were released via the support websites (originally
SunSolve, now My Oracle Support)
for applying to existing Solaris 10 installations, and were also merged into
the install images for the next Solaris 10 update release.
(This Solaris
Patches blog post from Gerry Haskins dives deeper into that subject.)

Some of the restrictions of this model were that package refactoring, changes
to package dependencies, and even just changing the package version number,
were difficult to do in this hybrid patch/OS update model. For instance,
when Solaris 10 first shipped, it had the Xorg server from X11R6.8. Over the
first couple years of update releases we were able to keep it up to date by
replacing, adding, & removing files as necessary, taking it all the way
up to Xorg server release 1.3 (new version numbering begun after
the X11R7 split of the X11 tree into separate modules gave each module its
own version).
But if you run pkginfo on the SUNWxorg-server package, you’ll
see it still displays a version number of 6.8, confusing users as to which
version is actually included.

We stopped upgrading the Xorg server releases in Solaris 10 after 1.3, as
later versions added new dependencies, such as
HAL,
D-Bus,
and libpciaccess,
which were very difficult to manage in this patching model. (We later got
libpciaccess to work, but HAL & D-Bus would have been much harder due to
the greater dependency tree underneath those.) Similarly, every time the
GNOME team looked into upgrading Solaris 10 past GNOME 2.6, they found these
constraints made it so difficult it wasn’t worthwhile, and eventually
GNOME’s dependencies had changed enough it was completely infeasible.
Fortunately, this worked out for both the X11 & GNOME teams, with our
management making the business decision to concentrate on the
“Nevada” branch for desktop users - first as Solaris Express
Desktop Edition, and later as OpenSolaris, so we didn’t have to fight
to try to make the package updates fit into these tight constraints.

Meanwhile, the team designing the new packaging system for Solaris 11 was
seeing us struggle with these problems, and making this much easier to manage
for both the development teams and our users was one of their big goals for
the IPS design they were working on.
Now that we’ve reached the first update release to Solaris 11, we can
start to see the fruits of their labors, with more FOSS updates in 11.1 than
we had in many Solaris 10 update releases, keeping software more up to date
with the upstream communities.

Of course, just because we can more easily update now doesn’t always
mean we should or will do so; it just stops the package system limitations
from forcing the decision for us. So while we’ve upgraded the X Window
System in the 11.1 release from X11R7.6 to 7.7, the Solaris GNOME team decided
it was not the right time to try to make the jump from GNOME 2 to GNOME 3,
though they did update some individual components of the desktop, especially
those with security fixes like Firefox. In other parts of the system,
decisions as to what to update were prioritized based on how they affected
other projects, or what customer requests we’d gotten for them.

So with all that background in place, what packages did we actually update
or add between Solaris 11.0 and 11.1?

Core OS Functionality

One of the FOSS changes with the biggest impact in this release is the
upgrade from Grub Legacy (0.97) to Grub 2 (1.99) for the x64 platform boot
loader. This is the cause of one of the upgrade quirks, since to go from
Solaris 11.0 to 11.1 on x64 systems, you first need to update the Boot
Environment tools (such as beadm) to a new version that can handle boot
environments that use the Grub2 boot loader. System administrators can
find the details they need to know about the new Grub in the
Administering the GRand Unified Bootloader chapter of the
Booting and Shutting Down Oracle Solaris 11.1 Systems
guide. This change was necessary to be able to support new hardware coming
into the x64 marketplace, including systems using UEFI firmware or booting off
disk drives larger than 2 terabytes.

For both platforms, Solaris 11.1 adds
rsyslog as an optional alternative to
the traditional syslogd, and OpenSCAP
for checking that security configuration settings are compliant with site policies.

Note that the support repo actually has newer versions of BIND & fetchmail
than the 11.1 release, as some late breaking critical fixes came through from
the community upstream releases after the Solaris 11.1 release was frozen, and
made their way to the support repository. These are responsible for the other
big upgrade quirk in this release, in which to upgrade a system which already
installed those versions from the support repo, you need to either wait for
those packages to make their way to the 11.1 branch of the support repo, or
follow the steps in the aforementioned upgrade walkthrough
to let the package system know it's okay to temporarily downgrade those.

Developer Stack

While Solaris 11.0 included Python 2.7, many of the bundled Python modules
weren’t packaged for it yet, limiting its usability. For 11.1, many more
of the Python modules include 2.7 versions (enough that I filtered them out of
the below table, but you can always search on
the package repository server for them).

For other language runtimes and development tools, 11.1 expands the use of
IPS mediated links
to choose which version of a package is the default when the packages are
designed to allow multiple versions to install side by side.

For instance, in Solaris 11.0, GNU automake 1.9 and 1.10 were provided, and
developers had to run them as either automake-1.9 or
automake-1.10. In Solaris 11.1, when automake 1.11 was added, also
added was a /usr/bin/automake mediated link, which points to the
automake-1.11 program by default, but can be changed to another
version by running the pkg set-mediator command.

Mediated links were also used for the Java runtime & development kits in
11.1, changing the default versions to the Java 7 releases (the 1.7.0.x
package versions), while allowing admins to
switch
links such as /usr/bin/javac back to Java 6
if they need to for their site, to deal with
Java 7 compatibility
or other issues, without having to update each usage to use the full versioned
/usr/jdk/jdk1.6.0_35/bin/javac paths for every invocation.

Desktop Stack

As I mentioned before, we upgraded from X11R7.6 to X11R7.7, since a pleasant
coincidence made the X.Org release dates line up nicely with our feature &
code freeze dates for this release. (Or perhaps it wasn’t so
coincidental after all; one of the benefits of being
the person making the release
is being able to decide what schedule is most convenient for you, and this one
worked well for me.) For the table below, I’ve skipped listing the
packages in which we use the X11 “katamari” version for the Solaris
package version (mainly packages combining elements of multiple upstream
modules with independent version numbers), since they just all changed from
7.6 to 7.7.

In the graphics drivers, we worked with Intel to extend the Intel Integrated
Graphics Processor support to cover 3D graphics and kernel mode setting on
the Ivy Bridge chipsets, and updated Nvidia’s non-FOSS graphics driver
from 280.13 to 295.20.

Higher up in the desktop stack,
PulseAudio was added for audio support,
and liblouis for Braille support,
and the GNOME applications were built to use them.

The Mozilla applications, Firefox & Thunderbird, moved to the current
Extended Support
Release (ESR) versions, 10.x for each, to bring up-to-date security fixes
without having to be on Mozilla’s aggressive 6 week feature cycle release
train.

Detailed list of changes

This table shows most of the changes to the FOSS packages between Solaris 11.0
and 11.1. As noted above, some were excluded for clarity, or to reduce noise
and duplication. All the FOSS packages which didn't change the version number
in their packaging info are not included, even if they had updates to fix bugs,
security holes, or add support for new hardware or new features of Solaris.

Package | 11.0 | 11.1
archiver/unrar | 3.8.5 | 4.1.4
audio/sox | 14.3.0 | 14.3.2
backup/rdiff-backup | 1.2.1 | 1.3.3
communication/im/pidgin | 2.10.0 | 2.10.5
compress/gzip | 1.3.5 | 1.4
compress/xz | not included | 5.0.1
database/sqlite-3 | 3.7.6.3 | 3.7.11
desktop/remote-desktop/tigervnc | 1.0.90 | 1.1.0
desktop/window-manager/xcompmgr | 1.1.5 | 1.1.6
desktop/xscreensaver | 5.12 | 5.15
developer/build/autoconf | 2.63 | 2.68
developer/build/autoconf/xorg-macros | 1.15.0 | 1.17
developer/build/automake-111 | not included | 1.11.2
developer/build/cmake | 2.6.2 | 2.8.6
developer/build/gnu-make | 3.81 | 3.82
developer/build/imake | 1.0.4 | 1.0.5
developer/build/libtool | 1.5.22 | 2.4.2
developer/build/makedepend | 1.0.3 | 1.0.4
developer/documentation-tool/doxygen | 1.5.7.1 | 1.7.6.1
developer/gnu-binutils | 2.19 | 2.21.1
developer/java/jdepend | not included | 2.9
developer/java/jdk-6 | 1.6.0.26 | 1.6.0.35
developer/java/jdk-7 | 1.7.0.0 | 1.7.0.7
developer/java/jpackage-utils | not included | 1.7.5
developer/java/junit | 4.5 | 4.10
developer/lexer/jflex | not included | 1.4.1
developer/parser/byaccj | not included | 1.14
developer/parser/java_cup | not included | 0.10
developer/quilt | 0.47 | 0.60
developer/versioning/git | 1.7.3.2 | 1.7.9.2
developer/versioning/mercurial | 1.8.4 | 2.2.1
developer/versioning/subversion | 1.6.16 | 1.7.5
diagnostic/constype | 1.0.3 | 1.0.4
diagnostic/nmap | 5.21 | 5.51
diagnostic/scanpci | 0.12.1 | 0.13.1
diagnostic/wireshark | 1.4.8 | 1.8.2
diagnostic/xload | 1.1.0 | 1.1.1
editor/gnu-emacs | 23.1 | 23.4
editor/vim | 7.3.254 | 7.3.600
file/lndir | 1.0.2 | 1.0.3
image/editor/bitmap | 1.0.5 | 1.0.6
image/gnuplot | 4.4.0 | 4.6.0
image/library/libexif | 0.6.19 | 0.6.21
image/library/libpng | 1.4.8 | 1.4.11
image/library/librsvg | 2.26.3 | 2.34.1
image/xcursorgen | 1.0.4 | 1.0.5
library/audio/pulseaudio | not included | 1.1
library/cacao | 2.3.0.0 | 2.3.1.0
library/expat | 2.0.1 | 2.1.0
library/gc | 7.1 | 7.2
library/graphics/pixman | 0.22.0 | 0.24.4
library/guile | 1.8.4 | 1.8.6
library/java/javadb | 10.5.3.0 | 10.6.2.1
library/java/subversion | 1.6.16 | 1.7.5
library/json-c | not included | 0.9
library/libedit | not included | 3.0
library/libee | not included | 0.3.2
library/libestr | not included | 0.1.2
library/libevent | 1.3.5 | 1.4.14.2
library/liblouis | not included | 2.1.1
library/liblouisxml | not included | 2.1.0
library/libtecla | 1.6.0 | 1.6.1
library/libtool/libltdl | 1.5.22 | 2.4.2
library/nspr | 4.8.8 | 4.8.9
library/openldap | 2.4.25 | 2.4.30
library/pcre | 7.8 | 8.21
library/perl-5/subversion | 1.6.16 | 1.7.5
library/python-2/jsonrpclib | not included | 0.1.3
library/python-2/lxml | 2.1.2 | 2.3.3
library/python-2/nose | not included | 1.1.2
library/python-2/pyopenssl | not included | 0.11
library/python-2/subversion | 1.6.16 | 1.7.5
library/python-2/tkinter-26 | 2.6.4 | 2.6.8
library/python-2/tkinter-27 | 2.7.1 | 2.7.3
library/security/nss | 4.12.10 | 4.13.1
library/security/openssl | 1.0.0.5 (1.0.0e) | 1.0.0.10 (1.0.0j)
mail/thunderbird | 6.0 | 10.0.6
network/dns/bind | 9.6.3.4.3 | 9.6.3.7.2
package/pkgbuild | not included | 1.3.104
print/filter/enscript | not included | 1.6.4
print/filter/gutenprint | 5.2.4 | 5.2.7
print/lp/filter/foomatic-rip | 3.0.2 | 4.0.15
runtime/java/jre-6 | 1.6.0.26 | 1.6.0.35
runtime/java/jre-7 | 1.7.0.0 | 1.7.0.7
runtime/perl-512 | 5.12.3 | 5.12.4
runtime/python-26 | 2.6.4 | 2.6.8
runtime/python-27 | 2.7.1 | 2.7.3
runtime/ruby-18 | 1.8.7.334 | 1.8.7.357
runtime/tcl-8/tcl-sqlite-3 | 3.7.6.3 | 3.7.11
security/compliance/openscap | not included | 0.8.1
security/nss-utilities | 4.12.10 | 4.13.1
security/sudo | 1.8.1.2 | 1.8.4.5
service/network/dhcp/isc-dhcp | 4.1 | 4.1.0.6
service/network/dns/bind | 9.6.3.4.3 | 9.6.3.7.2
service/network/ftp (ProFTPD) | 1.3.3.0.5 | 1.3.3.0.7
service/network/samba | 3.5.10 | 3.6.6
shell/conflict | 0.2004.9.1 | 0.2010.6.27
shell/pipe-viewer | 1.1.4 | 1.2.0
shell/zsh | 4.3.12 | 4.3.17
system/boot/grub | 0.97 | 1.99
system/font/truetype/liberation | 1.4 | 1.7.2
system/library/freetype-2 | 2.4.6 | 2.4.9
system/library/libnet | 1.1.2.1 | 1.1.5
system/management/cim/pegasus | 2.9.1 | 2.11.0
system/management/ipmitool | 1.8.10 | 1.8.11
system/management/wbem/wbemcli | 1.3.7 | 1.3.9.1
system/network/routing/quagga | 0.99.8 | 0.99.19
system/rsyslog | not included | 6.2.0
terminal/luit | 1.1.0 | 1.1.1
text/convmv | 1.14 | 1.15
text/gawk | 3.1.5 | 3.1.8
text/gnu-grep | 2.5.4 | 2.10
web/browser/firefox | 6.0.2 | 10.0.6
web/browser/links | 1.0 | 1.0.3
web/java-servlet/tomcat | 6.0.33 | 6.0.35
web/php-53 | not included | 5.3.14
web/php-53/extension/php-apc | not included | 3.1.9
web/php-53/extension/php-idn | not included | 0.2.0
web/php-53/extension/php-memcache | not included | 3.0.6
web/php-53/extension/php-mysql | not included | 5.3.14
web/php-53/extension/php-pear | not included | 5.3.14
web/php-53/extension/php-suhosin | not included | 0.9.33
web/php-53/extension/php-tcpwrap | not included | 1.1.3
web/php-53/extension/php-xdebug | not included | 2.2.0
web/php-common | not included | 11.1
web/proxy/squid | 3.1.8 | 3.1.18
web/server/apache-22 | 2.2.20 | 2.2.22
web/server/apache-22/module/apache-sed | 2.2.20 | 2.2.22
web/server/apache-22/module/apache-wsgi | not included | 3.3
x11/diagnostic/xev | 1.1.0 | 1.2.0
x11/diagnostic/xscope | 1.3 | 1.3.1
x11/documentation/xorg-docs | 1.6 | 1.7
x11/keyboard/xkbcomp | 1.2.3 | 1.2.4
x11/library/libdmx | 1.1.1 | 1.1.2
x11/library/libdrm | 2.4.25 | 2.4.32
x11/library/libfontenc | 1.1.0 | 1.1.1
x11/library/libfs | 1.0.3 | 1.0.4
x11/library/libice | 1.0.7 | 1.0.8
x11/library/libsm | 1.2.0 | 1.2.1
x11/library/libx11 | 1.4.4 | 1.5.0
x11/library/libxau | 1.0.6 | 1.0.7
x11/library/libxcb | 1.7 | 1.8.1
x11/library/libxcursor | 1.1.12 | 1.1.13
x11/library/libxdmcp | 1.1.0 | 1.1.1
x11/library/libxext | 1.3.0 | 1.3.1
x11/library/libxfixes | 4.0.5 | 5.0
x11/library/libxfont | 1.4.4 | 1.4.5
x11/library/libxft | 2.2.0 | 2.3.1
x11/library/libxi | 1.4.3 | 1.6.1
x11/library/libxinerama | 1.1.1 | 1.1.2
x11/library/libxkbfile | 1.0.7 | 1.0.8
x11/library/libxmu | 1.1.0 | 1.1.1
x11/library/libxmuu | 1.1.0 | 1.1.1
x11/library/libxpm | 3.5.9 | 3.5.10
x11/library/libxrender | 0.9.6 | 0.9.7
x11/library/libxres | 1.0.5 | 1.0.6
x11/library/libxscrnsaver | 1.2.1 | 1.2.2
x11/library/libxtst | 1.2.0 | 1.2.1
x11/library/libxv | 1.0.6 | 1.0.7
x11/library/libxvmc | 1.0.6 | 1.0.7
x11/library/libxxf86vm | 1.1.1 | 1.1.2
x11/library/mesa | 7.10.2 | 7.11.2
x11/library/toolkit/libxaw7 | 1.0.9 | 1.0.11
x11/library/toolkit/libxt | 1.0.9 | 1.1.3
x11/library/xtrans | 1.2.6 | 1.2.7
x11/oclock | 1.0.2 | 1.0.3
x11/server/xdmx | 1.10.3 | 1.12.2
x11/server/xephyr | 1.10.3 | 1.12.2
x11/server/xorg | 1.10.3 | 1.12.2
x11/server/xorg/driver/xorg-input-keyboard | 1.6.0 | 1.6.1
x11/server/xorg/driver/xorg-input-mouse | 1.7.1 | 1.7.2
x11/server/xorg/driver/xorg-input-synaptics | 1.4.1 | 1.6.2
x11/server/xorg/driver/xorg-input-vmmouse | 12.7.0 | 12.8.0
x11/server/xorg/driver/xorg-video-ast | 0.91.10 | 0.93.10
x11/server/xorg/driver/xorg-video-ati | 6.14.1 | 6.14.4
x11/server/xorg/driver/xorg-video-cirrus | 1.3.2 | 1.4.0
x11/server/xorg/driver/xorg-video-dummy | 0.3.4 | 0.3.5
x11/server/xorg/driver/xorg-video-intel | 2.10.0 | 2.18.0
x11/server/xorg/driver/xorg-video-mach64 | 6.9.0 | 6.9.1
x11/server/xorg/driver/xorg-video-mga | 1.4.13 | 1.5.0
x11/server/xorg/driver/xorg-video-openchrome | 0.2.904 | 0.2.905
x11/server/xorg/driver/xorg-video-r128 | 6.8.1 | 6.8.2
x11/server/xorg/driver/xorg-video-trident | 1.3.4 | 1.3.5
x11/server/xorg/driver/xorg-video-vesa | 2.3.0 | 2.3.1
x11/server/xorg/driver/xorg-video-vmware | 11.0.3 | 12.0.2
x11/server/xserver-common | 1.10.3 | 1.12.2
x11/server/xvfb | 1.10.3 | 1.12.2
x11/server/xvnc | 1.0.90 | 1.1.0
x11/session/sessreg | 1.0.6 | 1.0.7
x11/session/xauth | 1.0.6 | 1.0.7
x11/session/xinit | 1.3.1 | 1.3.2
x11/transset | 0.9.1 | 1.0.0
x11/trusted/trusted-xorg | 1.10.3 | 1.12.2
x11/x11-window-dump | 1.0.4 | 1.0.5
x11/xclipboard | 1.1.1 | 1.1.2
x11/xclock | 1.0.5 | 1.0.6
x11/xfd | 1.1.0 | 1.1.1
x11/xfontsel | 1.0.3 | 1.0.4
x11/xfs | 1.1.1 | 1.1.2

P.S. To get the version numbers for this table, I ran a quick perl script over
the output from:

Thursday Oct 25, 2012

One of the first places you can see Solaris 11.1 changes is in the docs,
which have now been posted in
the Solaris 11.1 Library
on docs.oracle.com.
I spent a good deal of time reviewing documentation for this release,
and thought some would be interesting to blog about, but didn't review
all the changes (not by a long shot), and am not going to cover all
the changes here, so there's plenty left for you to discover on your own.

Just comparing the
Solaris 11.1 Library list of
docs against the
Solaris 11 list
will show a lot of reorganization and refactoring of the doc set,
especially in the system administration guides. Hopefully the new breakdown
will make it easier to get straight to the sections you need when a task is
at hand.

Packaging System

Unfortunately, the excellent in-depth guide for how to build packages for the
new Image Packaging System (IPS) in Solaris 11 wasn't done in time
to make the initial Solaris 11 doc set. An interim version was
published shortly after release, in PDF form on the
OTN IPS page. For Solaris 11.1 it was included
in the doc set, as Packaging and Delivering Software With the Image Packaging System in Oracle Solaris 11.1, so it should be easier to find, and
easier to share links to specific pages of the HTML version.

Also added in this release is a document containing the
lists of
all the packages in each of the major package groups in Solaris 11.1
(solaris-desktop, solaris-large-server, and solaris-small-server).
While you can simply get the contents of those groups from the package
repository, either via the web interface or the pkg command line, the
documentation puts them in handy tables for easier side-by-side comparison,
or viewing the lists before you've installed the system to pick which one
you want to initially install.

Security

One of the things Oracle likes to do for its products is to publish security
guides for administrators & developers to know how to build systems that
meet their security needs. For Solaris, we started this with Solaris 11,
providing a guide for sysadmins to find where the security relevant
configuration options were documented. The
Solaris 11.1 Security Guidelines
extend this to cover new security features, such as
Address Space Layout Randomization (ASLR) and
Read-Only Zones, as well as adding additional guidelines for existing
features, such as
how to limit the size of tmpfs filesystems, to avoid users
driving the system into swap thrashing situations.

In parallel, we updated the
Solaris C Library Functions security considerations list with
details of Solaris 11 enhancements such as FD_CLOEXEC flags, additional *at()
functions, and new stdio functions such as
asprintf() and
getline().
A number of code examples throughout the Solaris 11.1 doc set were updated to
follow these recommendations, changing unbounded strcpy() calls to strlcpy(),
sprintf() to snprintf(), etc. so that developers following our examples start
out with safer code. The Writing Device Drivers guide even had the
appendix updated to list which of these utility functions,
like snprintf() and strlcpy(), are now available via the Kernel DDI.
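
As a rough illustration of the sprintf() to snprintf() style of change described above, here is a minimal sketch. It is not taken from the Solaris docs: the helper name format_greeting and the message format are made up for the example. The idea is simply that the bounded call is told how big the destination is, and its return value lets the caller notice truncation.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the Solaris docs): format "Hello, <name>!"
 * into dst without ever writing more than dstsize bytes.
 * Returns 0 on success, or -1 if the output was truncated or an
 * encoding error occurred.
 */
int
format_greeting(char *dst, size_t dstsize, const char *name)
{
    /* The unbounded original would be: sprintf(dst, "Hello, %s!", name); */
    int needed = snprintf(dst, dstsize, "Hello, %s!", name);

    /*
     * snprintf() returns the length the full string would have had, so a
     * value >= dstsize means the result was cut short. When dstsize > 0,
     * dst still holds a NUL-terminated prefix in that case.
     */
    if (needed < 0 || (size_t)needed >= dstsize)
        return -1;
    return 0;
}
```

strlcpy() applies the same idea to plain copies: it takes the destination size and returns the length of the source string, so a return value greater than or equal to the size signals truncation.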

Little Things

Of course all the big new features got documented, and some major efforts
were put into refactoring and renovation, but there were also a lot of smaller
things that got fixed in the nearly one year between the Solaris 11
and 11.1 doc releases - again too many to list here, but a random sampling
of the ones I know about & found interesting or useful:

The sample dcmd sources in /usr/demo/mdb were updated to
include ::help output, so that developers like myself who follow the
examples don't forget to include it (until a helpful code reviewer pointed it
out while reviewing the mdb module changes for Xorg 1.12).
The README file in that directory was updated to show the correct paths for
installing both kernel & userspace modules, including the 64-bit variants.

Saturday Mar 31, 2012

As you probably know by now, a few months ago, we released
Solaris 11 after years of development.
That of course means we now need to figure out what comes next -
if Solaris 11 is “The First Cloud OS”, then what do
we need to make future releases of Solaris be, to be modern and
competitive when they're released? So we've been having planning
and brainstorming meetings, and I've captured some notes here from
just one of those we held a couple weeks ago with a number of the
Silicon Valley based engineers.

Now before someone sees an idea here and calls their product rep
wanting to know what's up, please be warned what follows are rough
ideas, and as I'll discuss later, none of them have any commitment,
schedule, working code, or even plan for integration in any possible
future product at this time. (Please don't make me force you to read
the full Oracle future product disclaimer here, you should know it by
heart already from the front of every Oracle product slide deck.)

To start with, we did some background research, looking at ideas
from other Oracle groups, and
competitive OS'es. We examined what was hot in
the technology arena and where the interesting startups were heading. We
then looked at Solaris to see where we could apply those ideas.

Making Network Admins into Socially Networking Admins

We all know an admin who has grumbled about being the only one stuck late
at work to fix a problem on the server, or having to work the weekend alone
to do scheduled maintenance. But admins are humans (at least most are), and
crave companionship and community with their fellow humans. And even when
they're alone in the server room, they're never far from a network connection,
allowing access to the wide world of wonders on the Internet.

Our solution here is not building a new social network - there's enough of
those already, and Oracle even has its
own Oracle Mix social network already. What we proposed is integrating
Solaris features to help engage our system admins with these social networks,
building community and bringing them recognition in the workplace, using
achievement recognition systems as found in many popular gaming platforms.

For instance, if you had a Facebook account, and a group of admin friends
there, you could register it with our Social Network Utility For Facebook,
and then your friends might see:

Alan earned the
achievement Critically Patched (April 2012) for patching all his
servers.
Matt is only at 50% -
encourage him to complete this achievement today!

To avoid any undue risk of advertising who has unpatched servers that are
easier targets for hackers to break into, this information would be tightly
protected via Facebook's world-renowned privacy settings to avoid it falling
into the wrong hands.

A related form of
gamification we
considered was replacing simple certifications with role-playing-game-style Experience
Levels. Instead of just knowing an admin passed a test establishing a
given level of competency, these would provide recruiters with a more
detailed level of how much real-world experience an admin has. Achievements such
as the one above would feed into it, but larger numbers of experience
points would be gained by tougher or more critical tasks - such as recovering
a down system, or migrating a service to a new platform. (As long as it
was an Oracle platform of course - migrating to an HP or IBM platform would
cause the admin to lose points with us.)

Unfortunately, we couldn't figure out a good way to prevent (if you will)
“gaming” the system. For instance, a disgruntled admin might
decide to start ignoring warnings from FMA that a part is beginning to fail
or skip preventative maintenance, in the hopes that they'd cause a
catastrophic failure to earn more points for bolstering their resume as they
look for a job elsewhere, and not worrying about the effect on your business
of a mission critical server going down.

More Z's for ZFS

Our suggested new feature for ZFS was inspired by the world's most successful
Z-startup of all time: Zynga.

Using the Social Network Utility For Facebook described above,
we'd tie it in with ZFS monitoring to help you out when you find yourself
in a jam needing more disk space than you have, and can't wait a month to
get a purchase order through channels to buy more. Instead with the click
of a button you could post to your group:

Alan can't find any
space in his server farm! Can you help?

Friends could loan you some space on their connected servers for a few
weeks, knowing that you'd return the favor when needed. ZFS would
create a new filesystem for your use on their system, and securely
share it with your system using Kerberized NFS.

If none of your friends have space, then you could buy temporary use space
in small increments at affordable rates right there in Facebook, using your
Facebook credits, and then file an expense report later, after the urgent
need has passed.

Universal Single Sign On

One thing all the engineers agreed on was that we still had far too many
"Single" sign ons to deal with in our daily work. On the web, every web
site used to have its own password database, forcing us to hope we could
remember what login name was still available on each site when we signed
up, and which unique password we came up with to avoid having to disclose
our other passwords to a new site.

In recent years, the web services world has finally been reducing the number
of logins we have to manage, with many services allowing you to login using
your identity from Google, Twitter or Facebook. So we proposed following
their lead, introducing PAM modules for web services - no more would you have
to type in whatever login name IT assigned and try to remember the password
you chose the last time password aging forced you to change it - you'd
simply choose which web service you wanted to authenticate against, and
would login to your Solaris account upon receipt of a cookie from their
identity service.

Pinning notes to the cloud

We also noted that we all have our own pile of notes we keep in our daily
work - in text files in our home directory, in notebooks we carry
around, on white boards in offices and common areas, on sticky notes on our
monitors, or on scraps of paper pinned to our bulletin boards. The contents
of the notes vary, some are things just for us, some are useful for our groups,
some we would share with the world.

For instance, when our group moved to a new building a couple years ago,
we had a white board in the hallway listing all the NIS & DNS servers,
subnets, and other network configuration information we needed to set up
our Solaris machines after the move. Similarly, as Solaris 11 was finishing
and we were all learning the new network configuration commands, we shared
notes in wikis and e-mails with our fellow engineers.

Users may also remember one of the popular features of Sun's old BigAdmin
site was a section for sharing scripts and tips such as these. Meanwhile,
the online "pin board" at Pinterest is
taking the web by storm. So we thought, why not mash those up to solve
this problem?

We proposed a new BigAddPin site where users could “pin”
notes, command snippets, configuration information, and so on. For instance,
once they had worked out the ideal Automated Installation manifest for their
app server, they could pin it up to share with the rest of their group, or
choose to make it public as an example for the world. Localized data,
such as our group's notes on the servers for our subnet, could be shared
only to users connecting from that subnet. And notes that they didn't want
others to see at all could be marked private, such as the list of phone
numbers to call for late night pizza delivery to the machine room, the
birthdays and anniversaries they can never remember but would be sleeping
on the couch if they forgot, or the list of automatically generated
completely random, impossible to remember root passwords to all their servers.

For greater integration with Solaris, we'd put support right into the
command shells — redirect output to a pinned note, set your path to include
pinned notes as scripts you can run, or bring up your recent shell history
and pin a set of commands to save for the next time you need to remember how
to do that operation.

Location service for Solaris servers

A longer term plan would involve convincing the hardware design groups
to put GPS locators with wireless transmitters in future server designs.
This would help both admins and service personnel trying to find servers
in today's massive data centers, and could feed into location presence apps
to help show potential customers that while they may not see many Solaris
machines on the desktop any more, they are all around. For instance,
while walking down Wall Street it might show “There are
over 2000 Solaris computers in this block.”

[Note: this proposal was made before the recent
media coverage of
a
location service aggregator app with less noble intentions, and in hindsight,
we failed to consider what happens when such data similarly falls into the
wrong hands. We certainly wouldn't want our app to be misinterpreted as
“There are over $20 million of SPARC servers in this building,
waiting for you to steal them.” so it's probably best it was rejected.]

Harnessing the power of the GPU for Security

Most modern OS'es make use of the widespread availability of high powered
GPU hardware in today's computers, with desktop environments requiring 3-D
graphics acceleration, whether in Ubuntu Unity, GNOME Shell on Fedora, or
Aero Glass on Windows, but we haven't yet made Solaris fully take advantage
of this, beyond our basic offering of Compiz on the desktop.

Meanwhile, more businesses are interested in increasing security by using
biometric authentication, but must also comply with laws in many countries
preventing discrimination against employees with physical limitations such
as missing eyes or fingers, not to mention the lost productivity when
employees can't login due to tinted contacts throwing off a retina scan
or a paper cut changing their fingerprint appearance until it heals.

Fortunately, the two groups considering these problems put their heads
together and found a common solution, using 3D technology to enable
authentication using the one body part all users are guaranteed to have -
pam_phrenology.so, a new PAM module that uses an array of USB-attached web
cams (or just one if the user is willing to spin their chair during login)
to take pictures of the user's head from all angles, create a 3D model and
compare it to the one in the authentication database. While
Mythbusters
has shown how easy it can be to fool common fingerprint scanners, we have
not yet seen any evidence that people can impersonate the shape of
another user's cranium, no matter how long they spend beating their head
against the wall to reshape it.

Unfortunately, there are still some unsolved technical challenges we haven't
figured out how to overcome.
Currently, a visit to the hair salon causes your existing authentication to
expire, and some users have found that shaving their heads is the only way
to avoid bad hair days becoming bad login days.

Reaction to these ideas

After gathering all our notes on these ideas from the engineering
brainstorming meeting, we took them in to present to our management.
Unfortunately, most of their reaction cannot be printed here, and they chose
not to accept any of these ideas as they were, but they did have some
feedback for us to consider as they sent us back to the drawing board.

They strongly suggested our ideas would be better presented if we
weren't trying to decipher ink blotches that had been smeared by the
condensation when we put our pint glasses on the napkins we were taking
notes on, and to that end let us know they would not be approving any
more engineering offsites in Irish themed pubs on the Friday of a Saint
Patrick's Day weekend. (Hopefully they mean that situation specifically and
aren't going to deny the funding for travel to this year's
X.Org Developer's
Conference just because it happens to be in Bavaria and ending on the
Friday of the weekend Oktoberfest
starts.)

They also mentioned that Oracle hadn't fully adopted some of
Sun's common practices and we might have to try
harder to get those to be accepted now that we are one unified company.

So as I said at the beginning, don't pester your sales rep just yet for
any of these, since they didn't get approved, but if you have better ideas,
pass them on and maybe they'll get into our next batch of planning.

Monday Dec 19, 2011

Last week, Brendan Gregg, who wrote the book on DTrace, published a new tool for visualizing hot sections of code by sampling the process stack every few milliseconds and then graphing which stacks were seen the most during the run: Flame Graphs. This looked neat, but my experience has been that I get the most understanding out of things like this by trying them out myself, so I did.

Fortunately, it's very easy to set up, as Brendan provided the tools as two standalone Perl scripts you can download from the Flame Graph GitHub repo. The next step is deciding what you want to run it on and capturing the data from a run of that.

This was due to two of the probes I'd defined - when the Xserver processes a request from a client, there's a request-start probe just before the request is processed, and a request-done probe right after the request is processed. If you just want to see what requests a client is making you can trace either one, but if you want to measure the time taken to run a request, or determine if something is happening while a request is being processed, you need both of the probes. When they first got integrated, the code was simple:

    #ifdef XSERVER_DTRACE
        XSERVER_REQUEST_START(GetRequestName(MAJOROP), MAJOROP,
                              ((xReq *)client->requestBuffer)->length,
                              client->index, client->requestBuffer);
    #endif
        // skipping over input checking and auditing, to the main event,
        // the call to the request handling proc for the specified opcode:
        result = (*client->requestVector[MAJOROP])(client);
    #ifdef XSERVER_DTRACE
        XSERVER_REQUEST_DONE(GetRequestName(MAJOROP), MAJOROP,
                             client->sequence, client->index, result);
    #endif

The compiler sees XSERVER_REQUEST_START and XSERVER_REQUEST_DONE as simple function calls, so it does whatever work is necessary to set up their arguments and then calls them. Later, during the linking process, the actual call instructions are replaced with no-ops and the addresses recorded, so that when a dtrace user enables the probe the call can be activated at that time. In these cases, that's not so bad, just a bunch of register accesses and memory loads of things that are going to be needed nearby. The one outlier is GetRequestName(MAJOROP), which looks like a function call but was really just a macro that used the opcode as the index into an array of strings and returned the string name for the opcode, so that DTrace probes could see the request names, especially for extensions which don't have static opcode mappings. For that, the compiler would just load a register with the address of the base of the array and then add the offset of the entry specified by MAJOROP in that array.

All was well and good for a bit, until a later project came along during the Xorg 1.5 development cycle to unify all the different lists of protocol object names in the Xserver, as there were different ones in use by the DTrace probes, the security extensions, and the resource management system. That replaced the simple array lookup macro with a function call. While the function doesn't do a lot more work, it does enough to be noticed, and thus the performance hit was taken in the hot path of request dispatching. Adam's patch to fix this simply uses is-enabled probes to only make those function calls when the probes are actually enabled. x11perf testing showed the win on an Athlon 64 3200+ test system running Solaris 11:
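The is-enabled guard described above can be sketched in plain C. This is only an illustration, not the actual Xserver code or Adam's patch: the flag, the counters, the simplified two-argument probe macro, and the lookup_major_name() and handle_request() helpers are all hypothetical stand-ins, so that the sketch compiles without a DTrace-generated header. In a real build, the probe macro and its _ENABLED() companion would come from the header that dtrace -h generates for the provider.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the DTrace-generated probe macros.
 * A plain flag simulates whether a consumer has enabled the probe. */
static int probes_enabled = 0;
static int probes_fired = 0;    /* how many times the probe "fired" */
static int name_lookups = 0;    /* how many times the costly lookup ran */

#define XSERVER_REQUEST_START_ENABLED() (probes_enabled)
#define XSERVER_REQUEST_START(name, op) \
    ((void)(name), (void)(op), probes_fired++)

/* Hypothetical stand-in for the function call that replaced the
 * simple array-lookup macro. */
static const char *lookup_major_name(int opcode)
{
    static const char *names[] = {
        "Unknown", "CreateWindow", "ChangeWindowAttributes"
    };
    name_lookups++;
    if ((size_t)opcode >= sizeof(names) / sizeof(names[0]))
        return names[0];
    return names[opcode];
}

/* Hypothetical request handler, standing in for the per-opcode proc. */
static int handle_request(int opcode)
{
    return opcode * 2;
}

/* Hot path: the argument setup, including the name lookup, only
 * happens when the probe is actually enabled. */
static int dispatch(int opcode)
{
    assert(opcode >= 0);
    if (XSERVER_REQUEST_START_ENABLED()) {
        XSERVER_REQUEST_START(lookup_major_name(opcode), opcode);
    }
    return handle_request(opcode);
}
```

With probes_enabled left at 0, calling dispatch() never touches lookup_major_name(), which is the point of the guard: the cost of building the probe arguments moves out of the hot path unless someone is actually tracing.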

To explain this command, I'll start at the end. For xinit, everything after the double dash is a set of arguments to the Xserver it starts. In this case, Xorg is told to look in the normal config paths for xorg.conf.dummy, which it finds as /etc/X11/xorg.conf.dummy, a simple config file that sets the driver to the Xvfb-like “dummy” driver. That driver just uses RAM as a frame buffer, which takes graphics driver considerations out of this test:

Section "Device"
Identifier "Card0"
Driver "dummy"
EndSection

Since I'm using a modern Xorg version, that's all the configuration needed; all the unspecified sections are autoconfigured.
xinit starts the Xserver, waits for the Xserver to signal that it has finished starting up, and then runs the first half of the command line as a client with DISPLAY set to the new Xserver. In this case it runs the dtrace command, which sets up the probes based on the examples in the Flame Graphs README, and then runs the command specified as the -c argument, the x11perf benchmark tool. When x11perf exits, dtrace stops the probes, generates its report, and then exits itself, which in turn causes xinit to shut down the X server and exit.

The resulting Flame Graphs are, in their native SVG interactive form:

Before

After

You can see in the first one a bar showing a little over 10% of the time was in stacks involving LookupMajorName, which is completely gone in the second one. Those who saw Adam's patch series come across the xorg-devel list last week may also notice the presence of XaceHook calls, which Adam optimized in another patch. Unfortunately, while I did build with that patch as well, we don't get the benefits of it since the XC-Security extension is on by default and fills in the hooks, so the server can't just bypass them as it does when the hooks are empty.

I also took measurements of what Xorg did as gdm started up and a test user logged in, which produced the much larger flame graph you can see in SVG or PNG. As you can see, the recursive calls in the font catalogue scanning functions make for some really tall flame graphs. You can also see that, to no one's surprise, xf86SlowBCopy is slow, and a large portion of the time is spent “blitting” bits from one place to another. Some potential areas for improvement stand out - like the 5.7% of time spent rescanning the font path because the Solaris gnome session startup scripts make xset fp calls to add the fonts for the current locale to the legacy font path for old clients that still use it, and another nearly 5% handling the ListFonts and ListFontsWithInfo calls, which tracing with the request-start probes showed were coming from the Java GUI for the Solaris Visual Panels gnome-panel applet.

Now because of the way the data for these is gathered, from looking at them alone you can't tell if a wide bar is one really long call to a function (as it is for the main() function bar in all these) or millions of individual calls (as it was for the ProcNoOperation calls in the x11perf -noop trace), but it does give a quick and easy way to pick out which functions the program is spending most of its time in, as a first pass for figuring out where to dig deeper for potential performance improvements.
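To make that ambiguity concrete, here is a small C sketch of a toy sampling profiler. It is not how DTrace or the Flame Graph scripts work internally; the call and profile structures, the tick-based timeline, and the function names are all made up for illustration. It shows one 1000-tick call and 1000 one-tick calls producing the same number of samples. The calls[] counter is there only so the two workloads can be told apart in the sketch; a sampling profiler has no such information.

```c
#include <assert.h>
#include <stddef.h>

enum { FN_MAIN = 0, FN_WORK = 1, FN_COUNT = 2 };

/* One call on the simulated timeline: which function ran, for how long. */
struct call {
    int fn;
    int ticks;
};

struct profile {
    int samples[FN_COUNT];  /* what a sampling profiler sees */
    int calls[FN_COUNT];    /* ground truth a sampler cannot see */
};

/* Walk the timeline and, every `interval` ticks, record which function
 * is on the CPU at that instant. */
static struct profile sample_timeline(const struct call *timeline,
                                      size_t n, int interval)
{
    struct profile p = { {0}, {0} };
    int now = 0;
    int next_sample = 0;

    assert(interval > 0);
    for (size_t i = 0; i < n; i++) {
        int end = now + timeline[i].ticks;
        p.calls[timeline[i].fn]++;
        while (next_sample < end) {
            p.samples[timeline[i].fn]++;
            next_sample += interval;
        }
        now = end;
    }
    return p;
}

/* Workload A: a single call that runs for 1000 ticks. */
static struct profile one_long_call(void)
{
    const struct call t[] = { { FN_WORK, 1000 } };
    return sample_timeline(t, 1, 10);
}

/* Workload B: 1000 back-to-back calls of one tick each. */
static struct profile many_short_calls(void)
{
    struct call t[1000];
    for (size_t i = 0; i < 1000; i++) {
        t[i].fn = FN_WORK;
        t[i].ticks = 1;
    }
    return sample_timeline(t, 1000, 10);
}
```

Both workloads yield the same samples[FN_WORK] value, so a graph built from the samples alone would draw the same bar width for each. Telling them apart takes a different kind of probe, such as one that fires on every function entry or every request.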

Brendan has made these scripts easy to use to generate these graphs, so I encourage you to try them out as well on some sample runs to get familiar with them, so that when you really need them, you know what cases they're good for and how to capture the data and generate the graphs for yourself. Trying really is the best method of learning.