Oracle Blog

Blog for martinm

Tuesday Aug 25, 2009

The first (small) success!

Last time I worked on this I was left with close to nothing: neither the Debian nor the Gentoo install CDs could boot (more precisely: they did boot but couldn't mount the install image). The same problem showed up on a native SPARC Niagara2 system (T5240); only an old UltraSPARC-T1 box got further than booting a then-useless CD. I was able to track it down to a squashfs mount issue and opened bug 279472 against the Gentoo install media. After it was resolved (it was caused by an incompatibility between squashfs and the kernel...), install-sparc64-minimal-20090817.iso works and the install gets past booting the CD:

>> Activating mdev
>> Making tmpfs for /newroot
>> Looking for the cdrom
>> Attempting to mount media:- /dev/vdiskb1
>> Media found on /dev/vdiskb1
>> Determining root device...
>> Determining looptype ...
>> Mounting squashfs filesystem
>> Copying read-write image contents to tmpfs
>> No cdupdate.sh script found, skipping...
>> Booting (initramfs)..
INIT: version 2.86 booting
Gentoo Linux; http://www.gentoo.org/
Copyright 1999-2007 Gentoo Foundation; Distributed under the GPLv2
* Mounting proc at /proc ... [ ok ]
* Mounting sysfs at /sys ... [ ok ]
* Mounting /dev ... [ ok ]
* Starting udevd ... [ ok ]
* Populating /dev with existing devices through uevents ... [ ok ]
* Waiting for uevents to be processed ... [ !! ]
* Mounting devpts at /dev/pts ... [ ok ]
* Mounting local filesystems ... [ ok ]
* Mounting USB device filesystem (usbfs) ... [ ok ]
* Activating (possible) swap ... [ ok ]
* Setting system clock using the hardware clock [UTC] ... [ ok ]
* Configuring kernel parameters ... [ ok ]
* Updating environment ... [ ok ]
* Cleaning /var/lock, /var/run ... [ ok ]
* Updating inittab ... [ ok ]
* One of the files in /etc/{conf.d,init.d} or /etc/rc.conf
* has a modification time in the future!
* Caching service dependencies ... [ ok ]
* One of the files in /etc/{conf.d,init.d} or /etc/rc.conf
* has a modification time in the future!
* Caching service dependencies ... [ ok ]
* One of the files in /etc/{conf.d,init.d} or /etc/rc.conf
* has a modification time in the future!
* Caching service dependencies ... [ ok ]
* One of the files in /etc/{conf.d,init.d} or /etc/rc.conf
* has a modification time in the future!
* Caching service dependencies ... [ ok ]
* Setting hostname to livecd ... [ ok ]
* One of the files in /etc/{conf.d,init.d} or /etc/rc.conf
* has a modification time in the future!
* Caching service dependencies ... [ ok ]
* Starting lo
* Bringing up lo
* 127.0.0.1/8 [ ok ]
* Adding routes
* 127.0.0.0/8 ... [ ok ]
* One of the files in /etc/{conf.d,init.d} or /etc/rc.conf
* has a modification time in the future!
* Caching service dependencies ... [ ok ]
* One of the files in /etc/{conf.d,init.d} or /etc/rc.conf
* has a modification time in the future!
* Caching service dependencies ... [ ok ]
* Initializing random number generator ... [ ok ]
INIT: Entering runlevel: 3
* One of the files in /etc/{conf.d,init.d} or /etc/rc.conf
* has a modification time in the future!
* Caching service dependencies ... [ ok ]
* One of the files in /etc/{conf.d,init.d} or /etc/rc.conf
* has a modification time in the future!
* Caching service dependencies ... [ ok ]
* Starting syslog-ng ... [ ok ]
* One of the files in /etc/{conf.d,init.d} or /etc/rc.conf
* has a modification time in the future!
* Caching service dependencies ... [ ok ]
* Hardware detection started ...
\* Detected 8 active T2 (Niagara2) CPU(s) of 8 total
[ ok ]
* Unpacking firmware ...
tar: ./ql2400_fw.bin.4.02.02-MID: time stamp 2009-08-17 16:32:14 is 294342652.02588504 s in the future
tar: ./ql2100_fw.bin.1.19.38-TP: time stamp 2009-08-17 16:32:14 is 294342652.020644702 s in the future
tar: ./ql2400_fw.bin.4.00.26-IP: time stamp 2009-08-17 16:32:14 is 294342652.002011365 s in the future
tar: ./whiteheat_loader.fw: time stamp 2009-08-17 18:18:59 is 294349057.001625306 s in the future
tar: ./edgeport/boot2.fw: time stamp 2009-08-17 18:18:59 is 294349057.001033736 s in the future
tar: ./edgeport/boot.fw: time stamp 2009-08-17 18:18:59 is 294349057.000764341 s in the future
tar: ./edgeport/down2.fw: time stamp 2009-08-17 18:18:59 is 294349056.999340397 s in the future
tar: ./edgeport/down3.bin: time stamp 2009-08-17 18:18:59 is 294349056.998421595 s in the future
tar: ./edgeport/down.fw: time stamp 2009-08-17 18:18:59 is 294349056.997250881 s in the future
tar: ./edgeport: time stamp 2009-08-17 18:18:59 is 294349056.997046251 s in the future
tar: ./ql2400_fw.bin.4.00.27-IP: time stamp 2009-08-17 16:32:14 is 294342651.978127465 s in the future
tar: ./ql2400_fw.bin.4.00.22-IP: time stamp 2009-08-17 16:32:14 is 294342651.619182933 s in the future
tar: ./emi26/loader.fw: time stamp 2009-08-17 18:18:59 is 294349056.618648762 s in the future
tar: ./emi26/firmware.fw: time stamp 2009-08-17 18:18:59 is 294349056.61824269 s in the future
tar: ./emi26/bitstream.fw: time stamp 2009-08-17 18:18:59 is 294349056.608059889 s in the future
tar: ./emi26: time stamp 2009-08-17 18:18:59 is 294349056.607868344 s in the future
tar: ./ql2322_fw.bin.3.03.18: time stamp 2009-08-17 16:32:14 is 294342651.596939813 s in the future
tar: ./ql2322_fw.bin.3.03.20-IPX: time stamp 2009-08-17 16:32:14 is 294342651.585373532 s in the future
tar: ./keyspan_pda/xircom_pgs.fw: time stamp 2009-08-17 18:18:59 is 294349056.585084344 s in the future
tar: ./keyspan_pda/keyspan_pda.fw: time stamp 2009-08-17 18:18:59 is 294349056.584870918 s in the future
tar: ./keyspan_pda: time stamp 2009-08-17 18:18:59 is 294349056.584716098 s in the future
tar: ./whiteheat.fw: time stamp 2009-08-17 18:18:59 is 294349056.582405129 s in the future
tar: ./ql6312_fw.bin.3.03.18: time stamp 2009-08-17 16:32:15 is 294342652.572845455 s in the future
tar: ./ql2100_fw.bin.1.17.38: time stamp 2009-08-17 16:32:14 is 294342651.566923053 s in the future
tar: ./emi62/loader.fw: time stamp 2009-08-17 18:18:59 is 294349056.566665534 s in the future
tar: ./emi62/spdif.fw: time stamp 2009-08-17 18:18:59 is 294349056.564197985 s in the future
tar: ./emi62/midi.fw: time stamp 2009-08-17 18:18:59 is 294349056.561709436 s in the future
tar: ./emi62/bitstream.fw: time stamp 2009-08-17 18:18:59 is 294349056.551455273 s in the future
tar: ./emi62: time stamp 2009-08-17 18:18:59 is 294349056.551267465 s in the future
tar: ./ql2400_fw.bin.4.00.16: time stamp 2009-08-17 16:32:14 is 294342651.206974394 s in the future
tar: ./ql2400_fw.bin.4.00.18-IP: time stamp 2009-08-17 16:32:14 is 294342651.189752357 s in the future
tar: ./sun/cassini.bin: time stamp 2009-08-17 18:18:59 is 294349056.189298893 s in the future
tar: ./sun: time stamp 2009-08-17 18:18:59 is 294349056.189123841 s in the future
tar: ./ti_5052.fw: time stamp 2009-08-17 18:18:59 is 294349056.187469867 s in the future
tar: ./ql2300_fw.bin.3.03.20-IPX: time stamp 2009-08-17 16:32:14 is 294342651.176112833 s in the future
tar: ./ql2300_fw.bin.3.03.18: time stamp 2009-08-17 16:32:14 is 294342651.166202945 s in the future
tar: ./ql2500_fw.bin.4.02.02-MID: time stamp 2009-08-17 16:32:14 is 294342651.146737233 s in the future
tar: ./ti_3410.fw: time stamp 2009-08-17 18:18:59 is 294349056.145086666 s in the future
tar: ./ql2200_fw.bin.2.02.08-TP: time stamp 2009-08-17 16:32:14 is 294342650.990960402 s in the future
tar: ./kaweth/new_code_fix.bin: time stamp 2009-08-17 18:18:59 is 294349055.990153646 s in the future
tar: ./kaweth/trigger_code.bin: time stamp 2009-08-17 18:18:59 is 294349055.989936921 s in the future
tar: ./kaweth/trigger_code_fix.bin: time stamp 2009-08-17 18:18:59 is 294349055.989728222 s in the future
tar: ./kaweth/new_code.bin: time stamp 2009-08-17 18:18:59 is 294349055.989515785 s in the future
tar: ./kaweth: time stamp 2009-08-17 18:18:59 is 294349055.989361296 s in the future
tar: ./ql2500_fw.bin.4.02.02: time stamp 2009-08-17 16:32:14 is 294342650.974604057 s in the future
tar: ./ql2400_fw.bin.4.00.23-IP: time stamp 2009-08-17 16:32:14 is 294342650.958950448 s in the future
tar: .: time stamp 2009-08-17 18:18:59 is 294349055.958335788 s in the future
[ ok ]
* Not Loading APM Bios support ...
* Not Loading ACPI support ...
* One of the files in /etc/{conf.d,init.d} or /etc/rc.conf
* has a modification time in the future!
* Caching service dependencies ... [ ok ]
* Network device eth0 detected, DHCP broadcasting for IP ...
* One of the files in /etc/{conf.d,init.d} or /etc/rc.conf
* has a modification time in the future!
* Caching service dependencies ... [ ok ]
* Starting portmap ... [ ok ]
* One of the files in /etc/{conf.d,init.d} or /etc/rc.conf
* has a modification time in the future!
* Caching service dependencies ... [ ok ]
* ERROR: cannot start nfsmount as rpc.statd could not start
* One of the files in /etc/{conf.d,init.d} or /etc/rc.conf
* has a modification time in the future!
* Caching service dependencies ... [ ok ]
* Auto-scrambling root password for security ... [ ok ]
* One of the files in /etc/{conf.d,init.d} or /etc/rc.conf
* has a modification time in the future!
* Caching service dependencies ... [ ok ]
* Starting local ... [ ok ]
Welcome to the Gentoo Linux Minimal Installation CD!
The root password on this system has been auto-scrambled for security.
If any ethernet adapters were detected at boot, they should be auto-configured
if DHCP is available on your network. Type "net-setup eth0" to specify eth0 IP
address settings by hand.
Check /etc/kernels/kernel-config-* for kernel configuration(s).
The latest version of the Handbook is always available from the Gentoo web
site by typing "links http://www.gentoo.org/doc/en/handbook/handbook.xml".
To start an ssh server on this system, type "/etc/init.d/sshd start". If you
need to log in remotely as root, type "passwd root" to reset root's password
to a known value.
Please report any bugs you find to http://bugs.gentoo.org. Be sure to include
detailed information about how to reproduce the bug you are reporting.
Thank you for using Gentoo Linux!
livecd ~ #

Ladies and Gentlemen: the Gentoo install environment is up and running.

Next I'll follow the Gentoo docs to the letter and install to the local (virtual) disk.
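Concretely, the handbook's opening moves on this box would look roughly like the fragment below. This is a sketch of the plan, not a transcript: the disk device name follows the boot log above, while the partition layout, the filesystem choice, and the stage3 path are my assumptions.

```shell
# Partition the virtual disk, make filesystems, and unpack a stage3
# (all paths and choices are illustrative, not from a real run):
fdisk /dev/vdiska                     # create e.g. vdiska1 (root) + vdiska2 (swap)
mkfs.ext3 /dev/vdiska1                # root filesystem
mkswap /dev/vdiska2 && swapon /dev/vdiska2
mount /dev/vdiska1 /mnt/gentoo
cd /mnt/gentoo
tar xjpf /mnt/cdrom/stages/stage3-*.tar.bz2   # stage3 location is a guess
```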

Trying the mount command from the command line fails too. Next try: the same setup, but using the 2008.0 Gentoo minimal CD image:

>> Making tmpfs for /newroot
>> Looking for the cdrom
!! Media not found
>> No bootable medium found. Waiting for new devices...
>> Looking for the cdrom
!! Media not found
!! Could not find CD to boot, something else needed!
>> Determining root device...
!! Could not find the root block device in .
Please specify another value or: press Enter for the same, type "shell" for a shell, or "q" to skip...
root block device() :: shell
To leave and try again just press <Ctrl>+D
BusyBox v1.7.4 (2008-06-12 10:01:44 UTC) built-in shell (ash)
Enter 'help' for a list of built-in commands.
/bin/ash: can't access tty; job control turned off

Well, again no luck. This time the failure has an easy explanation: it looks like this CD image does not have a vdisk driver in it; the "/dev" directory does not list any vdisk devices...
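For completeness, the check from that BusyBox shell amounts to something like this. The device name is taken from the successful boot log above and the driver name from the mainline kernel's sunvdc driver; it's an environment-specific diagnostic, not something meaningful to run elsewhere.

```shell
# From the initramfs rescue shell: are there any virtual disk device
# nodes, and did a virtual disk driver register at all?
ls /dev/vdisk* 2>/dev/null || echo "no vdisk device nodes"
grep -i vdc /proc/devices 2>/dev/null || echo "no virtual disk (sunvdc) driver"
```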

I'll go check with Dr. Google once more, although it seems no one has tried this before.

Since the Linux hype is ongoing, and since I found some spare time, I decided to explore a land unknown (at least to me): how does one install a whole Linux system on recent SPARC gear? And, to avoid reinventing the wheel, the platform shall be a Logical Domain.

The first research targeted the available distributions. The Linux kernel itself has had LDom support since 2.6.2x (IIRC it was 2.6.23), but the kernel by itself is of little to no use. (Support for the sun4v architecture has been in 2.6 for longer, but besides sun4v support one also needs the vnet and vdisk drivers.) Ubuntu had dropped support for sun4v/SPARC systems, and my employer does not maintain its own distribution or extend the support of existing (commercial) distros, so the most important distributions (SuSE and Red Hat) were not an option.

This left me with free distros, namely

Gentoo, which is a little freaky because one usually recompiles everything. But this recompilation makes it very "portable" and should provide optimal performance.

Debian, which is well known in Linux land for running on virtually everything that is based on silicon and for being easy to administer in daily operation.

(I will not explore LFS which is even too freaky for me. Well, maybe I'll try that later too...)

The testing ground was a Niagara 2 based system in a Sun-internal lab; right now it's actually a T5440 hosting my playground LDoms (thanks to the simplicity of moving LDoms around, the actual system doesn't matter much).

Tuesday Oct 09, 2007

With the second incarnation of the true CMT systems just announced, I would like to take the chance to propose a deployment of Niagara 2 based systems that addresses security as well as high performance. The T5x20 systems both use the Niagara 2; this CPU can be viewed as a true system-on-chip:

Eight cores, each providing eight hardware threads (and a lot more); these translate to the CPUs of a conventional SMP system. Each core carries a floating point unit and a crypto unit that can do symmetric block ciphers and asymmetric (public key) algorithms. All cores are connected to a crossbar to communicate with each other and with the other components.

Eight banks of level-two cache, which translate to main memory in an SMP. These cache banks are connected on one end to the crossbar mentioned above and on the other end to the (chip-)external main memory.

An x8 PCI-E root complex and

A "network interface unit" (NIU) providing two 10Gbit ports.

One powerful use case for this architecture is a high-performance web server that requires integrated security.
The potential attacks on such a server that can be mitigated include:

Exploiting (undiscovered) security flaws

Denial-of-service attacks that overload it, willingly or accidentally

The whole scenario described below needs a T5120 or T5220 with any number of cores and a decent amount of memory. The amount of memory is of course governed by the applications to be run, but we will deploy at least three logical domains, so one should have 16GB of RAM in the system.

Logical Domains (LDoms) are a partitioning, or hardware virtualization, technology of sun4v based systems. Up to now these are systems based on the Niagara and Niagara 2 CPUs (their official names being Sun UltraSPARC T1 and T2). LDoms are implemented by a hypervisor running on the CPU that governs access to the physical hardware. The partitioning is realized by grouping physical resources into guests; these guests are the above-mentioned LDoms. The resources that can be distributed among the LDoms are the 64 hardware threads, the main memory, the PCIe root complex, and the NIU. One distinguishes three main kinds of LDoms:

The control domain, the only domain that can change the hypervisor configuration

Service or I/O domains, which have access to physical I/O devices and provide I/O services to other guests

Guest domains, which use the virtual devices provided by the service domains and run the actual applications

The picture gives an overview of the configuration; from left to right one has

The control domain, which is also a service domain delivering the boot devices as virtual disks to all other LDoms in the system. It runs a virtual switch, private to the LDoms inside the system, with access to the outside world. The control domain runs the virtual switch but does not have access to the virtual network the switch provides. All administration is done through this domain.

A regular logical domain in the middle, which is meant to host the application. There may be more logical domains of that kind, e.g. for a multi-part application or for test environments.

A frontend domain that is the central idea of the whole proposal: The on-chip NIU is assigned to this LDom, and all external traffic is handled by the 10Gbit interfaces. The frontend domain routes or "firewalls" the external traffic to the application domains from above.

The frontend domain shields the system from outside traffic; the on-chip, per-core crypto units can be used by a simple web server that terminates SSL connections there and serves static content. The physical assignment of the NIU protects the hypervisor from denial-of-service attacks, which could severely impact the hypervisor in a "classical" LDom deployment:
In a classical deployment a service domain transports incoming traffic through the hypervisor to the LDom the traffic is meant for. If the incoming interface is a 10Gbit interface hit by a denial-of-service attack, the hypervisor could end up handling only the malicious traffic, and that traffic would impose a severe load on the service domain running the virtual switch infrastructure that drives the incoming interface.
The frontend domain will need quite a few cores, although that of course depends on the load on the external interfaces.

Thursday Jul 26, 2007

After being asked many times "How do I install a guest LDom?" I decided to give a recipe that does not need an external JumpStart or netboot server. You will not find an LDoms 101 here; the target audience is experienced Solaris admins with basic knowledge of LDoms. And please do read this recipe as a recipe: when I cook, I take a recipe as an idea of how to prepare a meal, not as the law. But don't blame the author if you do something different.
I recommend having a look at the Logical Domains page on sun.com for additional information and links to all required software.
The main steps of the recipe are:

Make sure that the framework works and check the configuration of your control domain (your system might have been LDomified before and have a "more than one guest" configuration already running; if so, do an "ldm set-config factory-default" to return to the defaults)

Reduce the control domain to 1 core, 1GB of memory, and no MAUs. You might go for more resources on the service and control domain.

Set up the virtual I/O infrastructure: one private virtual (installation) switch (which will generate a vsw0 interface on the control domain), a virtual switch on every physical interface, and at least one virtual disk service. vsw0 will become the interface we'll run our JET server on.

"init 6" the control domain

Check that the virtual infrastructure is set up properly; pay special attention to the instance the private switch is running on (the private switch can be identified by the "mode=routed" line). I'll assume it'll be vsw0.

Install JET and make rarpd listen on vsw0. Add the Solaris DVD to JET. JET is not a prerequisite, but administering the installation without JET is not the easiest thing to do. Get it here

Configure the guest LDoms, all having one network interface in our private switch. Life becomes easier if you configure that interface first: it then becomes the first network interface in your LDom's OBP. The rest is up to you, but I'd like to recommend two things: give each guest at least 1GB to make Solaris happy, and choose the MAC addresses for the externally reachable interfaces manually from the 00:14:4F:FC:00:00 ~ 00:14:4F:FF:FF:FF range. Choosing the MAC addresses manually makes moving an LDom from one system to another easier, as the MAC address can stay the same.
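Picking those MAC addresses by hand gets tedious, so a tiny shell helper can map a small per-domain number into that recommended range. The function name and the numbering scheme are my own, purely illustrative:

```shell
# Hypothetical helper: derive a MAC in Sun's manual-assignment window
# 00:14:4F:FC:00:00 - 00:14:4F:FF:FF:FF from a small integer ID
# (0 .. 2^26-1); keep a list of IDs per domain to guarantee uniqueness.
ldom_mac() {
  id=$1
  printf '00:14:4F:%02X:%02X:%02X\n' \
    $(( 0xFC + (id >> 16) % 4 )) $(( (id >> 8) % 256 )) $(( id % 256 ))
}
ldom_mac 1       # -> 00:14:4F:FC:00:01
ldom_mac 65536   # -> 00:14:4F:FD:00:00
```

Keeping a simple text file of number-to-domain assignments then guarantees uniqueness across all systems the LDoms may migrate to.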

Prepare a JET template for the guest, with the MAC address of the interface on the private network

net-install the guest LDom. Some people define devaliases for all virtual interfaces, although that's not necessary (and I'm usually too lazy to do it). If you followed my recommendation, the install interface is the first one in the show-nets listing in OBP; choose it and start the installation: boot ^y - install

After the installation has finished, the private virtual network interface can be removed from the guest.
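Condensed into commands, the recipe above might look roughly like this. It's a sketch from memory against the LDoms 1.x manager, to be adapted, not pasted: the physical NIC name (e1000g0), the disk backend path, and the guest name ldg1 are assumptions, not taken from a live system.

```shell
# On the control domain (primary): shrink it and build the virtual I/O.
ldm set-vcpu 8 primary                  # 1 core = 8 threads on Niagara
ldm set-memory 1g primary
ldm set-mau 0 primary
ldm add-vds primary-vds0 primary        # virtual disk service
ldm add-vsw primary-vsw0 primary        # private install switch (no net-dev)
ldm add-vsw net-dev=e1000g0 primary-vsw1 primary   # switch on a physical NIC
init 6                                  # reboot to apply the reconfiguration

# Create a guest with its install interface on the private switch first:
ldm add-domain ldg1
ldm add-vcpu 8 ldg1
ldm add-memory 1g ldg1
ldm add-vnet vnet0 primary-vsw0 ldg1    # first vnet = install interface
ldm add-vnet mac-addr=00:14:4f:fc:00:01 vnet1 primary-vsw1 ldg1
ldm add-vdsdev /ldoms/ldg1.img vol1@primary-vds0
ldm add-vdisk vdisk0 vol1@primary-vds0 ldg1
ldm bind ldg1
ldm start ldg1
```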

About

Before Sun was acquired by Oracle I spent about 12 years in pre-sales covering SPARC and Solaris. Today I work in a field role in Oracle Microelectronics and focus on SPARC performance, including working and presenting at customer sites all over EMEA.