Posted by timothy on Friday October 07, 2011 @09:49PM from the dplane-dplane-no-that's-fantasy-island dept.

mvar writes "DTrace co-author Adam Leventhal writes on his blog about DTrace for Linux: 'Yesterday (October 4, 2011) Oracle made the surprising announcement that they would be porting some key Solaris features, DTrace and Zones, to Oracle Enterprise Linux. As one of the original authors, the news about DTrace was particularly interesting to me, so I started digging. Even among Oracle employees, there's uncertainty about what was announced. Ed Screven gave us just a couple of bullet points in his keynote; Sergio Leunissen, the product manager for OEL, didn't have further details in his OpenWorld talk beyond it being a beta of limited functionality; and the entire Solaris team seemed completely taken by surprise. Leunissen stated that only the kernel components of DTrace are part of the port. It's unclear whether that means just fbt or includes sdt and the related providers. It sounds certain, though, that it won't pass the DTrace test suite, which is the deciding criterion between a DTrace port and some sort of work in progress.'"

As long as they leave strace in place. Apple replaced ktrace with dtrace and I've been hating it ever since. It's not that dtrace is bad, it's just that they have different purposes, and dtruss has several problems that ktrace/strace does not:

It's asynchronous, meaning it won't print write(1, "foo\n", 4); next to the actual output of foo.
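To make the complaint concrete, here is a minimal D sketch of tracing write(2) the way dtruss does (the process-name filter is a hypothetical example):

```d
/* DTrace buffers probe output in per-CPU buffers and flushes it
 * asynchronously, so lines like this will not interleave
 * deterministically with the traced program's own stdout --
 * unlike strace/ktrace, which print synchronously at each stop. */
syscall::write:entry
/execname == "cat"/          /* hypothetical process of interest */
{
    printf("write(%d, ..., %d)", (int)arg0, (int)arg2);
}
```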

So, are they porting Solaris functionality to OEL as a precursor to phasing out Solaris entirely? It would suck to see Solaris go from a nostalgia point of view, but it never made much sense to me why one company would continue to develop two Unix-like operating systems.

Phase out SPARC: probably exactly what they are planning. Add the features that are cool to OEL, then discontinue SPARC and Solaris. If you really, really want Solaris you can still go for OpenSolaris; otherwise go for OEL. That would be my guess on their strategy.

Oh crap, so they did. That is what you get when you don't admin systems for a couple of years. :-) That stinks. OpenSolaris was a bit crippled, if I recall; a lot of the features that were in the latest releases of Solaris weren't in the open version.

Solaris on x86 is a sad joke. The name "slowlaris" came about because of that hideous port.

Ever tried to run Solaris x86 on a normal PC? If you think Linux can't handle basic I/O on slow devices (really, try copying from multiple USB sticks at once and try to use your desktop at the same time!), Solaris lags up with the slightest bit of HDD I/O.

So, are they porting Solaris functionality to OEL as a precursor to phasing out Solaris entirely? It would suck to see Solaris go from a nostalgia point of view, but it never made much sense to me why one company would continue to develop two Unix-like operating systems.

I think it's more that they want to continue to differentiate OEL from RHEL and provide a direct migration path for RedHat customers to a full Oracle system.

Linux just doesn't make sense in my mind for the space Oracle's software competes in. It's not enterprise friendly. No stable driver ABI. No system interface stability standards. Nothing like Projects, iostat doesn't show tape drives, kernel and userland lack cohesion, to name a few of my personal nitpicks, but overall... very little progress.

Just some ideas that come to mind and show that actually Solaris isn't as innovative as the Linux camp...

Linux kernel innovations which are more flexible than most alternatives:

SELinux - label-based security which allows flexible controls over the privileges of objects and how they may interact with other labelled objects [Trusted Solaris]
TOMOYO - pathname+history-based security which provides flexible controls over the privileges of processes and how they may interact with objects
AppArmor - pathname-based security

Yeah, Solaris has some pretty awesome features, but at the end of the day all that may be irrelevant in the face of Market Pressures. Sun for many years shot themselves in the foot by failing to deliver useful tools for things like patching/updating, mass installation of Solaris servers (yes, there is jumpstart/wanboot, but it is clearly deficient), and failing to deliver a decent native volume manager (ZFS) until Too Late, and then not having it support root filesystems until Way Too Late.

The reality of Solaris is that there are all these features that look awesome in theory, until you actually have to implement them and discover the practical implications. Take Zones. Zones sound great, in theory. But, ever tried to patch a server with zones? It's a nightmare. And heaven help you if you actually have a server with zones from multiple, different apps and you need to get outage windows from all the different app groups in order to patch. Or LDoms. Again, they sound awesome. That is, until you realize that there are no tools to manage migrations when a server goes down hard (the most common case for which you would want to do a migration!). So, you end up having to write a bunch of scripts to duplicate LDom xml files etc. to do this, because Sun/Oracle didn't really think through how their technology would be used in a real environment. I also use AIX virtualization technology, and it's much better, and VMware (which is what we use for Linux servers) blows them both out of the water.

Things like this are why a lot of major companies, including the one I work at, are leaving Solaris as fast as they can. The reality is that it takes twice as many SA's per server on Solaris as it does for any other platform, we have lower virtualization densities, and it therefore costs a lot more money to run. For the kind of money we're talking about, we can deal with a few echoes in the interface for SAN's.

Yeah, Solaris has some pretty awesome features, but at the end of the day all that may be irrelevant in the face of Market Pressures

Sad but true. I remember the days when most freeware built and ran out of the box on SunOS/Solaris - now Linux-centrism is rampant, and a growing number of packages are difficult or impossible to build elsewhere.

Sun for many years shot themselves in the foot by failing to deliver useful tools for things like patching/updating

ORLY? Live Upgrade. It has its issues, but so far I've seen nothing close to it in the Linux world. When doing a big yum update, one has to cross fingers and hope everything still works afterwards, as there's no going back.

mass installation of Solaris servers (yes, there is jumpstart/wanboot, but it is clearly deficient)

Other than wanboot only working on (newer) SPARC systems and not x86/x64

ORLY? Live Upgrade. It has its issues, but so far I've seen nothing close to it in the Linux world. When doing a big yum update, one has to cross fingers and hope everything still works afterwards, as there's no going back.

Running LiveUpgrade in a large-scale production environment is kind of like baby-sitting someone else's four-year-old. When everything's going well you say to yourself, "wow, this is pretty nice! Maybe I'll even have one of these of my own someday!" Then the four-year-old has a meltdown.

Running LiveUpgrade in a large-scale production environment is kind of like baby-sitting someone else's four-year-old.

Perhaps. I did say that it has its issues, e.g. the expansion of sparse files that I submitted a bug for. I ran an LU on a specific system that had .dir/.pag files, and was startled when the new /var filled. Sun eventually put out an LU patch to address that with a new cpio, and then months later finally followed with x86.

When it doesn't work, heaven help you, because Oracle can't/won't fix it or give you any reasonable support in any reasonable period of time.

As for most of the problems listed: too little, too late. The boat for Solaris sailed long ago. I'm keen to see whether or not OpenSolaris goes anywhere, but unfortunately the free Unix market is probably fairly well covered.

Linux is released under version 2 of the GNU General Public License. This imposes a few restrictions and says that the code may not be distributed linked to any code that imposes more restrictions, nor can any derived works impose any more restrictions than are present in the license.

FreeBSD is released under the 2-clause BSD license, which says, basically, do what you want with this, just don't sue me if it doesn't work and don't claim you wrote it.

OpenSolaris was released under the CDDL, which is generally less restrictive than the GPL (no restrictions on what you can link it to), but adds some anti-patent clauses that are not present in the GPL. Because these restrictions are not present in the GPL, the GPL prevents CDDL code from being linked against it. This means that if ZFS or DTrace were ever ported to Linux by anyone other than the copyright holder they would not be allowed to distribute Linux along with their port.

In FreeBSD, ZFS and DTrace are optional kernel modules, so you can still build a system without them, but they are loadable if you are happy to accept the terms of the CDDL when you distribute FreeBSD. There's no technical reason why either couldn't be ported to any system (well, the Linux storage stack is a mess, so adding ZFS would be a bit harder, but it could be done), but few people are motivated to produce a port when they are not legally allowed to redistribute it.

as does the disclaimer on zfsonlinux.org, but I've read before that Sun was hired to do the actual work. The purpose was to host Lustre on ZFS, and Sun owned both properties at the time. I can't find a specific citation in the 5 minutes I gave it.

Well, whether or not that has any bearing on ZFS, unless they change the license on ZFS, they wouldn't have the ability to do it anyways. Or at least not without violating the terms of at least one of the involved licenses.

This is quite true... because frankly, the entire fs/lvm/raid stack sucks with big disks.

The ZFS solution is a really good answer to a lot of problems -- scalability, manageability, reliability.

It would be a very good thing if Oracle would execute a port of ZFS to Linux (under the GPL, of course), and while they are at it: port AVS and Open HA/Cluster as a superior alternative to DRBD, and port SMF, as a replacement for init, along with the fault manager.

Agreed, and it reminds me of the #1 argument against ZFS: BUT BUT BUT IT BREAKS LAYERS, OH NOES!

And the second one: ZFS doesn't have fsck!! What's one going to do without a proper fsck? ZFS IS A JOKE!

Once you get familiar with ZFS you realize how much sense it makes, even if it violates "rules". For example, someone complained about the fact that ZFS does LOTS of checksums, wasting CPU cycles, and doesn't have fsck. Well, it doesn't have fsck BECAUSE it does lots of checksums. Do you have time to wait for that 20TB array to finish fscking?
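The checksums-instead-of-fsck trade-off can be sketched in a few lines of Python. This is a conceptual illustration only, not ZFS's real mechanism (ZFS stores Fletcher or SHA-256 checksums in the parent block pointer; plain CRC32 just keeps the demo short): every block carries its checksum, so corruption is caught on read rather than by an offline whole-disk scan.

```python
import zlib

def write_block(store, addr, data):
    """Store a block together with the checksum of its contents."""
    store[addr] = (zlib.crc32(data), data)

def read_block(store, addr):
    """Verify the checksum on every read; corruption is detected
    immediately instead of during an offline whole-disk fsck."""
    cksum, data = store[addr]
    if zlib.crc32(data) != cksum:
        raise IOError("checksum mismatch in block %r" % addr)
    return data

disk = {}
write_block(disk, 0, b"important data")
assert read_block(disk, 0) == b"important data"

# Simulate silent on-disk corruption: flip a byte behind the
# checksum's back; the very next read detects it.
cksum, data = disk[0]
disk[0] = (cksum, b"importEnt data")
try:
    read_block(disk, 0)
    corruption_detected = False
except IOError:
    corruption_detected = True
assert corruption_detected
```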

The worst thing about the first complaint is that ZFS actually does have very clean layering. At the bottom you have the storage pool allocator, which is basically malloc() for persistent storage (equivalent to the block device layer, but with a more convenient interface); the data management unit sits on top of this and provides transactional I/O to the underlying storage; and the ZFS POSIX layer sits on top of that and provides POSIX filesystem semantics. You could replace the ZPL with something else entirely.
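The three layers described above can be sketched in Python. The class names mirror ZFS's layers (SPA, DMU, ZPL), but the code is purely illustrative -- these are not the real interfaces:

```python
class SPA:
    """Storage pool allocator: a 'malloc() for persistent
    storage' that hands out block addresses."""
    def __init__(self):
        self.blocks, self.next_addr = {}, 0
    def alloc(self, data):
        addr, self.next_addr = self.next_addr, self.next_addr + 1
        self.blocks[addr] = data
        return addr
    def read(self, addr):
        return self.blocks[addr]

class DMU:
    """Data management unit: object-level I/O over the SPA.
    (Real ZFS makes this transactional via copy-on-write in
    transaction groups; here we just allocate and repoint.)"""
    def __init__(self, spa):
        self.spa, self.objects = spa, {}
    def commit(self, obj_id, data):
        self.objects[obj_id] = self.spa.alloc(data)
    def fetch(self, obj_id):
        return self.spa.read(self.objects[obj_id])

class ZPL:
    """ZFS POSIX layer: pathname semantics on DMU objects."""
    def __init__(self, dmu):
        self.dmu, self.names = dmu, {}
    def write(self, path, data):
        self.names.setdefault(path, len(self.names))
        self.dmu.commit(self.names[path], data)
    def read(self, path):
        return self.dmu.fetch(self.names[path])

fs = ZPL(DMU(SPA()))
fs.write("/etc/motd", b"hello")
assert fs.read("/etc/motd") == b"hello"
```

Because each layer only talks to the one below it, swapping the ZPL for something else (an object store, a volume emulator like zvol) leaves the lower layers untouched -- which is the point of the layering defense above.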

ZFS has two major missing features. You can't scrub a file, as in "I have a legal obligation to overwrite the contents of this file"; you just can't ever do it on ZFS. The other problem with ZFS is that there is no way to tell it "this file is magic for the boot process, please put it in the first N physical sectors of physical disk Y".

ZFS has two major missing features. You can't scrub a file, as in "I have a legal obligation to overwrite the contents of this file"

You're right that there are some extremely unusual requirements that ZFS won't meet. But you also can't do it with SSDs that use wear leveling, or with any decently modern hard drive, because modern hard drives have block-relocation functions: sometimes a sector is taken out of service and replaced with a "spare block" if it is found to be failing or predicted to fail. The result is that on modern drives, overwriting a file in place doesn't guarantee the old bits are actually gone either.

For example, someone complained about the fact that zfs does LOTS of checksums, wasting CPU cycles; and doesn't have fsck. Well, it doesn't have fsck BECAUSE it does lots of checksums. Do you have time to wait for that 20TB array to finish fscking?

Which is a nice idea until something (for example a ZFS bug or a hardware glitch) causes part of the ZFS metadata to become corrupted, and the only way to get your 20 TB array working again is to dump all of the data somewhere else, recreate the array from scratch, and reload all the data.

Which is a nice idea until something (for example a ZFS bug or a hardware glitch) causes part of the ZFS metadata to become corrupted

The ZFS pool root metadata is protected by having 3 copies of it on the file system, and metadata blocks are checksummed just like the rest of the data blocks.

"dump all of the data somewhere else, recreate the array from scratch, and reload all the data"
is really something you should never have to do with ZFS, unless you actually have a storage device failure, and you don'

For example, someone complained about the fact that zfs does LOTS of checksums

Yes it does.

However, thanks to a new enhancement in the Xeon 5500 CPUs, the SSE4.2 instruction set, there is actually a CPU instruction for CRC32 accumulation.
Oh wow... we're going to use a couple of extra CPU clock cycles per read and write to protect our actual data integrity, and to allow us to do "online scrubs" of the filesystem to check for any surface errors, instead of some limited arcane filesystem-metadata-level consistency check.

Yes, I forgot about the new CPU instructions for checksumming and such, but even without them, CPUs today are SO powerful that I doubt you could peg all the cores. And if you're running a 20TB+ array I doubt you'll be running it in the same physical box as the DB server. You probably have a DB server with a ZFS backing store, connected by 10GbE or InfiniBand, or some other means.

If you have a single box, you probably even have multiple multi-core CPUs. So hitting 100% CPU usage would be unusual (I mean, you're supposed to have headroom to spare).

Some folks get confused because ZFS runs on Solaris (or any other compatible OS), so they think of it as just a filesystem. And yes, it is one, and it can perfectly well run as a filesystem for a SMALL database and appserver on the same host. Small as in "test server".

But ZFS is more suited to dedicated storage cabinets than to use as a local FS. You wouldn't run Oracle on your NetApp storage server; this is the same. One box (or more) for storage, another box (or more) for db, app, etc.

Developing something like ZFS is pretty simple. Just get some of the world's best programmers, have them work on a project with good management and leadership, and test and fix bugs. And tell them they're free to do it ANY way they want; there's no "right" or "wrong". There's no obligation to use md, LVM and an FS on top. In fact, md and LVM were created to overcome filesystem limitations. If you're creating a fs from the ground up, you might as well skip md and LVM. But some people on mailing lists can't accept that.

Until we realize that some network topologies and setups, not to mention connection requirements, well, require a session layer - say, SIP. If you haven't noticed, the IT world has, more or less, standardized on a presentation layer protocol - HTTP. Oh, and the application layer is just the semantics the app assigns to the data received over HTTP. So, in short, those who do not know their history are doomed to repeat it. That aside, I believe that ZFS has layering, just not identical to the classical Unix implementation.

If you're creating a filesystem, you can make it aware of its own backing storage (and adjust things like block size - because, you know, there are disks with 4K sectors now). You can have it manage caching by itself (and thus be aware of how much memory the host has, and how much of it is actually RAM and not virtual). You can have it check for redundancy and do online checks and repairs - which you realize is just awesome if you ever try to fsck a 20TB filesystem (and because it knows how much data is actually used, have it only check what's used, instead of blindly churning through the blank space of an array of disks). Add variable stripe size, thin provisioning, CoW, free snapshots and clones, and a lot of other stuff ZFS does because it doesn't need to "respect its elders", LVM and md.
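Of the features listed, copy-on-write is what makes the "free snapshots" possible, and the idea fits in a short Python sketch (illustrative only -- not ZFS's real data structures): a snapshot is just a reference to the current block tree, and later writes allocate new blocks instead of overwriting old ones.

```python
class CowFS:
    """Toy copy-on-write namespace with O(1) snapshots."""
    def __init__(self):
        self.live = {}          # name -> immutable data
        self.snapshots = {}

    def write(self, name, data):
        # Never mutate in place: rebind into a fresh mapping so any
        # snapshot that captured the old mapping still sees it.
        self.live = dict(self.live)
        self.live[name] = data

    def snapshot(self, label):
        # "Free": just keep a reference to the current mapping.
        self.snapshots[label] = self.live

fs = CowFS()
fs.write("report.txt", b"draft 1")
fs.snapshot("monday")
fs.write("report.txt", b"draft 2")

assert fs.live["report.txt"] == b"draft 2"
assert fs.snapshots["monday"]["report.txt"] == b"draft 1"
```

The toy copies the whole mapping on each write; real ZFS only copies the modified blocks and the path of metadata above them, so writes stay cheap while snapshots stay O(1).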

ZFS basically does everything I would want in a server file system (hell: any filesystem) and it's a crime that it hasn't been ported to Linux (and that OpenSolaris is sufficiently dead now that you can't run it on newer hardware at all).

We live in an age of 4-core CPUs being commodity items - ANY system can spare the cycles to ensure its own data integrity, since what else are you going to do with them? (Alternately: if you need them, then why does that particular machine deal with having a hard disk anyway?)

OpenSolaris may be dead, but FreeBSD now ships with version 28 of ZFS, which includes nice things like deduplication. iXsystems, which sells storage appliances based on FreeBSD, is funding ongoing development work, so ZFS on FreeBSD is going to stay actively maintained irrespective of whether Oracle makes future versions of the code available. FreeBSD also got the ability to boot from ZFS before Solaris did, and integrates ZFS very nicely into the existing storage stack (it works as a GEOM consumer and provider).

I've been using Nexenta to get ZFS for about 5 years; however, earlier this year I switched my new installs to Linux. ZFS on Linux has been working great for me and my clients. This is not to be confused with the FUSE ZFS port.

Oh stop it already. The GPL is just as toxic. All these damn viral copyleft licenses are toxic. Issues like these project an image of "free software" and "open source" advocates as squabbling children. As long as this keeps up you are limiting scope for acceptance.

It is toxic to an extent. It can in some cases prevent a bit of cooperation that would aid just about everyone involved were it possible. On the other hand, MIT and BSD could be classified as weak in not granting patent licenses and not being copyleft. Every license has its own costs and advantages. MIT and BSD can be very advantageous if the rate of development can outpace closed efforts, or if your project has universal interoperability as a goal. Don't get me wrong, I think the GPL is a great legal hack.

This is a great technology story - even if only for one version of Linux so far. DTrace will bring tremendous value for troubleshooting and performance analysis, and is a technology I use (almost) every day.

For example, yesterday I had a CPU bound workload with an unexpected level of variation, and used DTrace to measure the effect of CPU thread affinity and interrupt activity on that workload. I used DTrace to pull the runtime along with other details: number of scheduling events for that thread, along with the CPUs that the thread ran on; also, for preemption, the pre-emptor thread (to see why) along with both its user-level and kernel stack traces; also the interrupt thread and device. I fairly quickly showed that the runtime variation was caused by network interface interrupts from an entirely different application. This analysis would take quite a lot longer without DTrace, and may be prohibitively difficult to complete.
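A hedged sketch of the kind of one-off D script described (not the author's actual script; the process name is a hypothetical placeholder), using the sched provider's on-cpu and preempt probes:

```d
/* Count scheduling events per CPU for one workload, and capture
 * the stacks in play when it gets preempted, to see who is
 * stealing the CPU and why. */
sched:::on-cpu
/execname == "myworkload"/      /* hypothetical process name */
{
    @oncpu[cpu] = count();      /* which CPUs the thread ran on */
}

sched:::preempt
/execname == "myworkload"/
{
    @preempted = count();
    stack();                    /* kernel stack at preemption */
    ustack();                   /* user-level stack */
}
```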

Many of my uses of DTrace are much more straightforward than that; including identifying file system latency for applications, application response time, and CPU dispatcher queue latency. I've listed many more examples in the DTrace book (http://www.dtracebook.com). It should be a great resource of ideas for those looking to use DTrace on Linux - since the hardest part for people has been knowing where to start, given the ability to see everything.

I suspect Oracle is trying for another cash grab. Port the parts of DTrace that have to be in the kernel and open source them, then sell an add-on package (perhaps only for their Linux) with the rest of the functionality. Let's face it -- Oracle is much more focused and effective at monetizing technology than Sun ever was.

I don't think they'll convince anyone at the Linux Foundation to maintain their hooks in the kernel, so I only see it as an extra feature you get should you use their long-term kernel release and buy their support.

I guess Oracle wants to sell complete boxes, hardware to database included. If the customer asks why the box does not perform as predicted, given the total price, a reliable answer may be required. So if the customer wants to run the box on Linux, the mechanisms to answer those questions must be available on Linux.

The main question is, is it really feasible? And the other question is, what would dtrace give us that we don't already have with systemtap, besides compatibility with slowlaris? The first question tells us if we should pay attention and the second one tells us if we should care.

You obviously haven't had to use both in anger. SystemTap is another "me too" project like so many things on Linux, where the only people saying it's as good are the people who haven't used the product it's an imitation of. Oh, and then there's the RMS type who will say it's "better because freedom has value" or something to that effect. Doesn't help you when you're actually trying to tune an application for performance.

You know what's even more annoying than Linux's "me too" projects? All the stuff they COULD imitate but don't. I have no idea why Linux admins still have to grovel through logs or use stuff like splunk to guess at what's wrong with their hardware, but they do. Even a lousy knockoff is better than pretending the problem doesn't exist and leaving people to cobble together inferior workarounds.

This story appeared yesterday on Linux Today. And it's not even close to the first time this has happened. If we can read about this first on Linux Today then what's the point of coming to Slashdot? Especially 24 hours late.

Btrfs seems to have been in development forever, and the developers on the one hand say that it's mostly stable, but on the other there are still some pretty scary bugs. It doesn't make a terrible amount of sense for Oracle to develop two next-gen CoW filesystems.

Porting ZFS would take more effort than completing Btrfs, while DTrace provides functionality that doesn't exist at all. I wouldn't mind a ZFS port, since after using both, I kinda prefer it to Btrfs, but it's better to first port what's new and doesn't exist at all, and then go to porting things that are already there.

Google ran into problems because they didn't want to touch the GPL'ed sun code. If google had just used that code, modified it for their own use and re-named it they would have been protected by the implicit patent license in the GPL code and wouldn't have been sued. It was Andy Rubin's fear of the GPL that put Google in the position of being sued over Java.

Yesterday (October 4, 2011) Oracle made the surprising announcement that they would be porting some key Solaris features, DTrace and Zones, to Oracle Enterprise Linux. As one of the original authors, the news about DTrace was particularly interesting to me, so I started digging. Even among Oracle employees, there's uncertainty about what was announced.

This sounds like a typical PHB decision: make a crazy choice without consulting the engineers as to whether it's a good idea, possible, or even wanted.

I wouldn't expect this ever to see the light of the mainline kernel, or at least not until Oracle stops selling their Enterprise Linux offering.

DTrace on Linux will probably be something like Ksplice, where it's available only to paying customers (last I checked; correct me if I'm wrong).

It's a good thing this opens the door within Oracle to migrating more Solaris features to Linux, even if it's only for OEL for the time being. Personally, while being a Solaris sysadmin, I'm not wasting my time on it anymore.

It has been available for Linux since 2008: "02-Aug-2008: Work-in-progress port of Sun's DTrace system for Linux." It is actively maintained: http://www.crisp.demon.co.uk/tools.html. I don't see anything new brought to the table outside of keyboard, mouse, and framebuffer recording. I'm not sure a lot of Linux users would find that an attractive addition.

Let's be clear here. The Sun division of Oracle is being run by Mark Hurd, who was last seen gutting HP and screwing his staff member. Oracle will kill off all things Sun, either now or later. Solaris and Java are the only things they seem to care about, and both of those are still rather endangered.

Solaris still has some great advantages over Linux--enough to actually keep a handful of people on it despite Oracle. I assume that they're going to get those necessary features into Linux, and then dump Solaris.

I've used this... it has been very helpful for development, but I would not run it on a production machine. In most cases when I've run it, it has managed to eventually crash the machine, usually just after obtaining the information I was after. For driver development, that is fine. But not for production.