Monday, October 31, 2011

// on 2011-02-20. VLC Player (highly recommended) - plays lots of different video formats, including avi, mpeg, rm, rmvb, wmv, dvd, vob, and more!

// on 2011-02-20. Adium - a free instant messaging application for Mac OS X that can connect to AIM, MSN, Jabber, Yahoo, and more.

// on 2011-02-20. Google Chrome - a browser that combines a minimal design with sophisticated technology to make the web faster, safer, and easier.

// on 2011-03-26. Witch - Command-Tab is great… if you only ever have one window open in each of your applications. With more than one window, though, it's a hassle to find the one you want. Witch solves that problem by taking you directly to the window you want to reach.

// on 2011-03-26. Navicat - Navicat for MySQL is a powerful database administration and development tool for MySQL. It works with any MySQL database server from version 3.21 or above, and supports most of the latest MySQL features, including triggers, stored procedures, functions, events, views, and user management.

// TeamViewer - a solution for easy and friendly desktop sharing. You can remote control a partner’s desktop to give online assistance, or you can show your screen to a customer, all without worrying about firewalls, IP addresses and NAT.

// Chicken of the VNC - a VNC client for Mac OS X. A VNC client allows one to display and interact with a remote computer screen.

// JollysFastVNC - a secure ARD and VNC client. Its aim is to be the best and most secure VNC client on the Mac. TaoofMac actually thinks it has already reached this goal.

Imagine you have thousands of users on your FreeBSD server and for some reason you don’t want them to see each other’s files under any circumstances. Normally you’d use complicated ACLs and/or nested groups to solve this problem, but there’s a much simpler approach.

Using the MAC security framework you can load the BSDEXTENDED module, which gives you access to a very handy tool called ugidfw. This module is basically a file system firewall: the system iterates through the rule list whenever a certain subject tries to access an object.

First, you have to compile MAC support into the kernel by adding the “option MAC” option to your kernel config file. After recompiling and rebooting, you should be able to load the mac_bsdextended module using the kldload mac_bsdextended command. Let’s add ugidfw_enable="YES" to /etc/rc.conf. After that, we can load firewall rules by starting the /etc/rc.d/ugidfw script, which reads the default rule set in /etc/rc.bsdextended.
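The module and rc.conf steps can be sketched as two small shell functions. This is only a sketch under assumptions: the function names `load_bsdextended` and `enable_ugidfw_at_boot` are made up for illustration, and the kernel is assumed to have been rebuilt with MAC support already.

```shell
#!/bin/sh
# Hypothetical helpers; run as root on a kernel built with MAC support.

# Load the module now, unless it is already loaded.
load_bsdextended() {
    kldstat -q -m mac_bsdextended || kldload mac_bsdextended
}

# Make the rc script load the rules at boot; appends the knob only once.
# The rc.conf path can be overridden with the first argument.
enable_ugidfw_at_boot() {
    rc_conf="${1:-/etc/rc.conf}"
    grep -q '^ugidfw_enable=' "$rc_conf" 2>/dev/null ||
        echo 'ugidfw_enable="YES"' >> "$rc_conf"
}
```

Calling `load_bsdextended && enable_ugidfw_at_boot` and then starting /etc/rc.d/ugidfw should cover the sequence described above.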

Let’s assume that the users who need complete separation from the rest of the bunch have uids between 3000 and 4000, with the sole exception of the www user, which should be able to access all the files as their owners define it in the "other" permission field. To spice it up a little, I want to handle group permissions as well: two different users in the same primary group should be able to exercise their group rights on a shared file.

And the winner is:

sysctl -w security.mac.bsdextended.firstmatch_enabled=1

${CMD} set 99 subject uid 3000:4000 object gid_of_subject mode arswx

${CMD} set 100 subject not uid www object uid 3000:4000 mode n

The first rule says that everyone with a uid between 3000 and 4000 may access files owned by their own group without further restrictions from the firewall. If you don’t want to allow group access, replace gid_of_subject with uid_of_subject.

Since we enabled first match, subjects between 3000 and 4000 that match rule 99 never reach the second rule, which is set for everyone except the www user. That rule says that they have no access to objects owned by users between 3000 and 4000. Fortunately, because we set the firstmatch directive, users between 3000 and 4000 are not punished by this rule for files of their own group, as they exit the chain at their first match, rule 99.
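Put together, the rule set could live in a small script in the style of /etc/rc.bsdextended. This is a sketch, not the stock file: the function name `load_separation_rules` is hypothetical, and `CMD` defaults to the usual ugidfw path but can be overridden (for example with `CMD=echo`) to preview the rules without touching the system.

```shell
#!/bin/sh
# Sketch of an rc.bsdextended-style rule loader (function name is made up).
CMD="${CMD:-/usr/sbin/ugidfw}"

load_separation_rules() {
    # Stop evaluating at the first rule that matches.
    sysctl security.mac.bsdextended.firstmatch_enabled=1 || return 1
    # Rule 99: uids 3000..4000 get full access to objects of their own group.
    ${CMD} set 99 subject uid 3000:4000 object gid_of_subject mode arswx || return 1
    # Rule 100: everyone but www is denied objects owned by uids 3000..4000.
    ${CMD} set 100 subject not uid www object uid 3000:4000 mode n
}
```

After loading, `ugidfw list` shows the active rules in evaluation order.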

The following text shows how to use a new feature of FreeBSD 8, VIMAGE, in a multi-jail environment.

Compile VIMAGE support into your kernel

Add “option VIMAGE” to your kernel config and make sure to remove SCTP support. Lack of SCTP support is one of the reasons VIMAGE is still considered experimental.

If you don’t know how to build your own custom kernel image, follow the detailed instructions in the corresponding FreeBSD Handbook chapter.
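For reference, a custom kernel config for this could be generated as sketched below. The config name VIMAGE, the function name `write_vimage_conf` and the amd64 path in the usage note are assumptions for illustration; the config simply inherits GENERIC, adds VIMAGE and drops SCTP.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal VIMAGE kernel config to the given file.
write_vimage_conf() {
    conf="$1"
    cat > "$conf" <<'EOF'
include GENERIC
ident VIMAGE
options VIMAGE
nooptions SCTP
EOF
}
```

A possible use, assuming an amd64 source tree in /usr/src: run `write_vimage_conf /usr/src/sys/amd64/conf/VIMAGE`, then `make buildkernel KERNCONF=VIMAGE` and `make installkernel KERNCONF=VIMAGE` from /usr/src.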

Reboot with your new kernel

First let’s create a pair of epair interfaces, then quickly start two VIMAGE jails. I’m using the same fs root to keep it simple, but you should create your jails as you always do; you can even use ezjail for it. The only difference is the “vnet” jail parameter, which is passed as a command line argument to the jail binary. If you use rc.conf, you could try adding the “vnet” parameter to your jail__flags variable for automatic startup.
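A minimal sketch of these steps follows. The helper name `start_vnet_jail`, the jail names and the 10.0.0.0/24 addresses in the usage note are assumptions, not taken from the original setup.

```shell
#!/bin/sh
# Hypothetical helper: start a persistent vnet jail and hand it one end of a
# fresh epair. Arguments: jail name, jail root, address for the jail side.
# Prints the name of the host-side interface.
start_vnet_jail() {
    name="$1"; root="$2"; addr="$3"
    # "ifconfig epair create" prints the name of the "a" side, e.g. epair0a.
    host_if=$(ifconfig epair create) || return 1
    jail_if="${host_if%a}b"
    # The "vnet" parameter gives the jail its own network stack.
    jail -c vnet name="$name" host.hostname="$name" path="$root" persist || return 1
    # Move the "b" side into the jail, then configure it from inside.
    ifconfig "$jail_if" vnet "$name" || return 1
    jexec "$name" ifconfig "$jail_if" inet "$addr" up || return 1
    echo "$host_if"
}
```

For example, `start_vnet_jail jail1 / 10.0.0.2/24` followed by `start_vnet_jail jail2 / 10.0.0.3/24` would start two jails on the same root; the host side of the first pair can then be given an address such as 10.0.0.1/24.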

Let’s put the host IP we set for epair0a earlier on the bridge interface instead and bring UP the host side of epair1. (Note: If you assign an IP to an interface, its state should automatically change to UP)
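The bridging step might look like the sketch below. The helper name `bridge_epairs` and the example address are assumptions; the interface names depend on what `ifconfig epair create` returned on your system.

```shell
#!/bin/sh
# Hypothetical helper: bridge the host sides of two epairs and move the host
# address from the first epair onto the bridge.
# Arguments: first host-side epair, second host-side epair, host address/prefix.
bridge_epairs() {
    if_a="$1"; if_b="$2"; host_addr="$3"
    bridge=$(ifconfig bridge create) || return 1
    # Drop the address from the first epair; the bridge takes it over.
    ifconfig "$if_a" inet "${host_addr%/*}" -alias || return 1
    ifconfig "$bridge" addm "$if_a" addm "$if_b" || return 1
    ifconfig "$bridge" inet "$host_addr" up || return 1
    # The second epair never had an address, so bring it up explicitly.
    ifconfig "$if_b" up
}
```

With the names used so far this would be called as `bridge_epairs epair0a epair1a 10.0.0.1/24`.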

Now that FreeBSD 8 is out, among many changes we can find enhancements in the field of virtualization as well. A newly developed virtualization container called VIMAGE has been implemented to enable virtualization of the FreeBSD network stack.

As you may know, previous releases of FreeBSD only supported jails with IP addresses of the main network stack; meaning once you configured IP/IPv6 addresses on your host system, a subset of those addresses could be associated with each of your jails. As simple as it sounds, it doesn’t let you perform several networking related tasks inside a jail, and you couldn’t separate your jails from each other with a firewall as there were no real interfaces present in your system.

With VIMAGE you have a jail with a full instance of the host’s networking stack, including a loopback interface, routing tables, etc. Network interfaces created on the host system can be moved to any VIMAGE jail to enable its connection to the outside world with a new option of ifconfig called “vnet”.

vnet jail
Move the interface to the jail, specified by name or JID. If the jail has a virtual network stack, the interface will disappear from the current environment and become visible to the jail.

Note: Option “-vnet” does the opposite.

As you might not have as many network interfaces as jails, you might need some workarounds to tunnel network traffic between two interfaces of your system.

Forget TUN/TAP and VPNs. FreeBSD 8 has a special network device called epair, which lets you create a pair of interconnected Ethernet interfaces. If you move one of them to a VIMAGE jail, you are basically done. Feel free to bridge them or use VLANs; they will still work. I don’t know about the overhead of epair, but if all you care about is security, this might be the best choice for you on FreeBSD.

To enable VIMAGE you have to add “option VIMAGE” to your kernel configuration file and recompile/reinstall it.

This page will document changes that will be included in FreeBSD 9, including those that might end up being committed to earlier branches. In other words, it describes differences between 8.0 and 9.0, no matter what happens to the versions in between.

Everyone is encouraged to download a snapshot CD image and try all the new features (as well as the old ones). Developers are very interested in bug reports. Note that FreeBSD 9.0 is not released yet and both the snapshots and the default source trees have debugging enabled by default (which results in dramatic slowdowns, so don't benchmark them without removing the debugging options).

Overall system / architectural changes

Userland DTrace

The kernel parts of the DTrace system diagnostic framework were imported some time ago, but they are now completed with the support for userland tracing, making it usable in general userland software development and system administration. Userland DTrace is already used in some large well known software packages such as PostgreSQL and X.Org.

CLANG / LLVM compiler

As the GCC compiler suite was relicensed under GPLv3 after the 4.2 release, and the GPLv3 is a big disappointment for some users of BSD systems (mostly commercial users who have a no-GPLv3-beyond-company-doors policy), having an alternative, non-GPLv3 compiler for the base system has become highly desirable. Currently, the overall consensus is that GCC 4.3 will not be imported into the base system (the same goes for other GPLv3 code).

The LLVM and CLANG projects together offer a full BSD-licensed C/C++ compiler infrastructure that is, performance- and feature-wise, close to or better than GCC. LLVM is the back-end and CLANG is the front-end part of the infrastructure.

Recent development has shown that not only is it possible to start using LLVM+CLANG right away, it is also very stable. The probability of replacing GCC for the base system in the near future is high. LLVM/CLANG will also add benefits to the overall system such as better error reporting, Apple's Grand Central Dispatch system for developing multithreaded applications and possibly JIT compiling some internal structures like firewall rules.

Note that this mostly affects the base system. There is too much third party software (e.g. ports) that depends on GCC to completely replace it.

Kernel & low level improvements

Large-scale SMP support

This work brings in support for large SMP systems, with more than 32 CPUs. Previously, the kernel structures were unable to account for such a large number of CPUs so the newest method implements extensible CPU accounting. This is not an improvement in scalability in itself but is a prerequisite for large-scale SMP work.

Network kernel core dumps (netdump)

Netdump is a framework for sending kernel core dumps over the TCP/IP suite to a machine other than the one that crashed. This can be useful in a number of cases involving diskless workstations, disk driver debugging or embedded devices.

Initial NUMA support

As NUMA-like architectures have become almost ubiquitous, even on i386 / amd64 systems, there are potentially big performance gains to be had from supporting them in the operating system. New development aims to adapt the physical page allocator to be NUMA-aware.

Modern event timer infrastructure

To better support the many sources of timer ticks present in today's systems and to build the foundation for a tickless kernel, a new unifying timer infrastructure was created. It currently supports LAPIC, HPET, i8254 and RTC timers.

Tickless kernel

To improve performance in virtual machines and power usage in laptops, the "dynamic tick mode" (also called, a bit inappropriately, "tickless mode") can replace the classic, strictly periodic hardware timer interrupt ticking with one-shot variable-time ticks. This will save some CPU time which would otherwise be spent handling timer interrupts which have no work assigned to them.

Networking improvements

More SMP-scalable TCP/IP

Improvements to the networking stack introduce better scalability strategies based on the work by Alan Cox and others. With these changes, connections are expected to have clearer CPU affinity, less cache line contention and better use of modern hardware flow detection and handling.

New NFS client and server

The new NFS client and server introduce support for NFSv4 as their biggest feature, with ACL support, byte range locking and delegation support. They should also be easier to maintain and to upgrade to NFSv4.1 later.

Five new TCP congestion algorithms

This commit marks the first formal contribution of the "Five New TCP Congestion Control Algorithms for FreeBSD" FreeBSD Foundation funded project. More details about the project are available at: http://caia.swin.edu.au/freebsd/5cc/.
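Switching algorithms at runtime could be sketched as follows, assuming the algorithms ship as loadable `cc_<name>` kernel modules (for example cc_cubic or cc_htcp) and are selected through the `net.inet.tcp.cc.algorithm` sysctl. The function name `use_cc_algorithm` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: load a congestion control module and make it the
# system-wide default. Argument: algorithm name, e.g. "cubic".
use_cc_algorithm() {
    algo="$1"
    # Fails harmlessly if the module is already loaded or compiled in.
    kldload "cc_${algo}" 2>/dev/null
    sysctl net.inet.tcp.cc.algorithm="$algo"
}
```

`sysctl net.inet.tcp.cc.available` lists the algorithms the running kernel currently knows about.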

SIFTR - Statistical Information for TCP Research

SIFTR logs a range of statistics on active TCP connections to a log file, providing the ability to make highly granular measurements of TCP connection state. The tool is aimed at system administrators, developers and researchers.
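Turning the logging on might look like this. It is a sketch that assumes the `siftr` kernel module and its `net.inet.siftr.*` sysctls; the helper name `start_siftr` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: load SIFTR and start logging to the given file.
start_siftr() {
    logfile="${1:-/var/log/siftr.log}"
    # Fails harmlessly if the module is already loaded.
    kldload siftr 2>/dev/null
    sysctl net.inet.siftr.logfile="$logfile" || return 1
    sysctl net.inet.siftr.enabled=1
}
```

Setting `net.inet.siftr.enabled=0` stops the logging again.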

Storage subsystems' improvements

A move to support 4K drives

FreeBSD's GEOM and file systems have intrinsically supported large (or even arbitrary) sector sizes for a long time, but there is still the issue of detecting them and communicating this information across the layers. Some new development introduced SATA quirks to detect known 4K drives (with the ability for users to set their own quirks on non-detected drives), the gpart(8) utility will calculate the correct alignment or warn on misalignment, and the default fragment / block size for UFS was changed to 4K / 32K.
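The alignment question itself is simple arithmetic. The sketch below is not how gpart(8) implements it; it only illustrates the check, assuming partition offsets are counted in 512-byte logical sectors so that eight of them make one 4K physical sector. Both function names are made up.

```shell
#!/bin/sh
# Succeeds when a start offset (in 512-byte sectors) is on a 4 KiB boundary.
is_4k_aligned() {
    start_lba="$1"
    [ $(( (start_lba * 512) % 4096 )) -eq 0 ]
}

# Prints the first 4 KiB-aligned offset at or after the given one.
align_up_4k() {
    start_lba="$1"
    echo $(( (start_lba + 7) / 8 * 8 ))
}
```

The classic MBR start at sector 63 fails the check, and `align_up_4k 63` suggests sector 64 instead.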

Generic GEOM IO schedulers

The new framework, integrated with GEOM, allows for multiple disk IO schedulers to be used, if necessary, on different IO providers (e.g. drives). The usage of some IO schedulers can increase responsiveness in certain kinds of IO workloads, for example a mix of sequential and random IO.

HAST - High Availability Storage

HAST is a userland-based (ggate) implementation of a distributed storage device concept, similar to Linux's DRBD. It allows over-the-network mirroring of any GEOM storage devices in a semi-synchronous way (writes succeed when the data is sent over the wire).

UFS SoftUpdates+Journal (SU+J)

A new feature added to the existing UFS SoftUpdates code makes use of a small journal, technically an intent log, to keep track of metadata garbage collection, which has up to now been left as a job for (background) fsck after an unclean shutdown. The intent behind this is to eliminate the requirement for fsck or background fsck on file systems with SoftUpdates enabled after an unclean shutdown.

In effect, this feature combines the best of both worlds - the very fast operation of SoftUpdates with the removal of the need for fsck characteristic for journalling file systems. This is not a radical change - the well known SoftUpdates mechanism is still in its original form - but it completes the garbage collection step in a different way.
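On an existing file system, the journal would typically be switched on with tunefs(8). A small sketch, with a hypothetical helper name and the device path left to the caller:

```shell
#!/bin/sh
# Hypothetical helper: enable SU+J on a UFS file system and show the result.
# The file system must be unmounted (or mounted read-only) while tunefs runs.
enable_suj() {
    dev="$1"
    tunefs -j enable "$dev" || return 1
    # Print the current tunables so the change can be verified.
    tunefs -p "$dev"
}
```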

New driver for AHCI SATA drives

The new driver supports native AHCI via the CAM (common access method for storage) subsystem. AHCI drives are manipulated by camcontrol and support for new features like NCQ and port multipliers has been integrated. Performance has increased significantly, and hot-plugging and port multiplier handling are greatly improved.

ATA CAM implementation

The ATA disk drivers have all been moved to the CAM system, improving some features of them along the way. This makes CAM a very real central point and foundation of disk interfaces and management of (S)ATA, SCSI, USB and Firewire drives. Some SCSI controllers still have drivers outside CAM.

AES-XTS encryption mode in kernel

The XTS block cipher mode is especially suited for encrypting disk drives and other block devices. It avoids some security problems that arise when plain CBC chaining is used for sector-addressable encryption.

AES in XTS mode is used in GELI and is also supported when implemented via AES-NI.
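With GELI this could be used roughly as sketched below. The helper name `init_geli_xts` and the 256-bit key length are assumptions for illustration; `geli init` destroys access to existing data on the provider and asks for a passphrase interactively.

```shell
#!/bin/sh
# Hypothetical helper: set up a GELI provider with AES-XTS and attach it.
init_geli_xts() {
    provider="$1"
    # Optional hardware acceleration; fails harmlessly without AES-NI support.
    kldload aesni 2>/dev/null
    geli init -e AES-XTS -l 256 "$provider" || return 1
    geli attach "$provider"
}
```

After attaching, the decrypted device shows up with an .eli suffix next to the original provider.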

NFSv4 ACLs for UFS

The well known and loved UFS file system has for some time implemented POSIX.1e ACLs (access control lists) in addition to the classic Unix file permissions model. This file permission model greatly enhances the way files can be managed and allows new security models to be implemented. It is also a standard part of the FreeBSD kernel, ready to be used at any time.

However, the POSIX.1e standard apparently never became truly widespread in practice. Through market share domination (but not completely without technical merit) the NTFS (Microsoft Windows file system) ACL security model has become widely popular and implemented, so much so that it directly inspired the ACL model in NFS (Network File System) version 4. The POSIX model is simpler and more Unix-like, but the NTFS/NFSv4 model is more expressive.

The two ACL models are incompatible: security parameters set in the NFSv4 model cannot always be directly translated to the POSIX model. Due to this, and considering that NFSv4 ACLs are already directly implemented in ZFS, the introduction of NFSv4 ACLs in UFS is simply a feature-completeness step which makes both file systems similarly usable from NFSv4 clients.

The POSIX model still remains in the implementation, but is mutually exclusive (at the mount-point level) with the NFSv4 model.

Other changes

The following is a list of smaller and / or more obscure changes that nevertheless deserve a special mention since they will be of interest to certain users:

The next major release of FreeBSD, version 8, was intended to be an "evolutional" release with few exciting changes. Of course by now it is obvious this will be another in a series of releases with groundbreaking changes.

This page will document changes that will be included in FreeBSD 8, including those that might end up being committed to earlier branches. In other words, it describes differences between 7.0 and 8.0, no matter what happens to the versions in between.

Everyone is encouraged to download a snapshot CD image and try all the new features (as well as the old ones). Developers are very interested in bug reports. Note that FreeBSD 8.0 is not released yet and both the snapshots and the default source trees have debugging enabled by default (which results in dramatic slowdowns, so don't benchmark them without removing the debugging options).

Overall system / architectural changes

INET-less / IPv6-only kernel

As IPv6 development and deployment is progressing, at its own pace, there is interest in making it possible to run a FreeBSD system as IPv6-only (instead of the default configuration which is dual-hosted IPv4+IPv6).

Historically, BSD is the progenitor of all TCP/IP implementations and the IPv4 code in FreeBSD was sprawled across the network layers of the kernel, from device drivers to the higher socket layers. A recent initiative aims at fixing the layering violations in preparation to, at first, build a kernel without INET (i.e. IPv4) support, then build an IPv6-only kernel. This change involves large kernel subsystems such as the firewalls, bridging, NFS and others.

CLANG / LLVM compiler

As the GCC compiler suite was relicensed under GPLv3 after the 4.2 release, and the GPLv3 is a big disappointment for some users of BSD systems (mostly commercial users who have a no-GPLv3-beyond-company-doors policy), having an alternative, non-GPLv3 compiler for the base system has become highly desirable. Currently, the overall consensus is that GCC 4.3 will not be imported into the base system (the same goes for other GPLv3 code).

The LLVM and CLANG projects together offer a full BSD-licensed C/C++ compiler infrastructure that is, performance- and feature-wise, close to or better than GCC. LLVM is the back-end and CLANG is the front-end part of the infrastructure.

Recent development has shown that not only is it possible to start using LLVM+CLANG right away, it is also very stable. The probability of replacing GCC for the base system in the near future is high, though it probably won't happen by default for the 8.x series.

Note that this mostly affects the base system. There is too much third party software that depends on GCC to completely replace it.

Parallel port builds

The ports infrastructure is the part of the FreeBSD operating system that's responsible for making thousands (actually close to 20,000) of third party packages available to FreeBSD users. It enables everyone to install custom software from either source code (the traditional and preferred way) or from analogous binary packages.

The port infrastructure for source builds has been enhanced to allow parallel builds of individual ports. In the age of multi-core CPUs this means package build times will be drastically decreased. By default, all available logical CPUs will be used.

This enhancement is not tied to the 8.0 release and is available now on all recent versions of FreeBSD. Port dependency graphs will still be built serially (i.e. only one port at a time will be built, but each individual port will be built in parallel).
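If using every logical CPU is too much for a given machine, the job count can be capped per build. The sketch below assumes the `MAKE_JOBS_NUMBER` knob of the ports framework; the function name `build_port` and the port path in the usage note are only examples.

```shell
#!/bin/sh
# Hypothetical helper: build and install one port, optionally capping the
# number of parallel make jobs. Arguments: port directory, optional job count.
build_port() {
    port_dir="$1"; jobs="${2:-}"
    if [ -n "$jobs" ]; then
        make -C "$port_dir" MAKE_JOBS_NUMBER="$jobs" install clean
    else
        # No cap given: let the ports framework pick its default.
        make -C "$port_dir" install clean
    fi
}
```

For instance, `build_port /usr/ports/editors/vim 4` would limit that build to four jobs.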

Kernel & low level improvements

Better handling of mounted device removals

Panics on "hot" removal of devices with file systems mounted from them (the canonical example is the removal of USB flash memory keys while the file system was mounted) were the most commonly reported problem from end-users. New development, funded by the FreeBSD Foundation, has solved this issue.

Jails v2

The jails subsystem has been greatly enhanced and updated to support modern FreeBSD features. In addition to the support for multiple IP addresses per jail (or none), support for IPv6 and SCTP has been implemented, jails can be nested hierarchically and jails can now be restricted to certain CPUs. Jails are especially powerful when combined with ZFS, where system administrators can be allowed to create and manage their own file systems within the jails.

Xen dom-U support

Xen support has been integrated into FreeBSD, allowing it to be used as a 32-bit guest operating system on recent versions of Xen dom0 (not as a host!). A target for 8.0 is to make FreeBSD ready to be used on Amazon EC2. The project needs testing and sponsorship.

New USB stack

The USB stack received a significant overhaul and the new code fixes many standing problems. Some of the new features are full support for split transactions, isochronous transactions, removed dependency on Giant (MPSAFE), a new API and many more. See the SVN message for details.

The new USB stack will use old drivers' and kernel modules' names to increase backward compatibility.

MPSAFE TTY

The TTY layer is the traditional Unix interface to system users, providing them with interactive sessions to run shells, etc. The current TTY layer in FreeBSD is for the most part the traditional BSD TTY, which is integrated with the drivers and other layers in a way that, though efficient, makes it hard to maintain and extend. The initiative to rewrite the TTY layer aims to make it a true abstraction layer, operating on behalf of both sides of TTY. In addition, it will remove the TTY from the Giant lock, which will eliminate problems with lags and skippy user interface behaviour in the console and X.Org.

Kernel memory limit on AMD64 increased

Some modern features (of which the most notable currently is ZFS) require a large amount of kernel memory (this has nothing to do with traditional disk caches or the amount of memory visible to the system). Up to now, it was only possible to allocate up to 2 GB for kmem_max, which is becoming a bit cramped. This limit has recently been increased to 512 GB. Together with backpressure improvements for the ARC, this will make the users of ZFS happy.

procstat(1): A process inspection utility

Status: Committed to -CURRENT
Will appear in 8.0: sure
Author: Robert Watson
Web: announcement

procstat combines functionality from the now-deprecated procfs(4) and adds several new features. Some of the data procstat can provide are: the process' command line arguments, file descriptor information, stacks of the kernel threads in the process, security credentials of the process, thread information and virtual memory mappings. This utility is mostly useful for debugging.
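Each of the data sets listed above maps to a procstat(1) flag. A small sketch that dumps them all for one process; the wrapper name `inspect_process` is made up.

```shell
#!/bin/sh
# Hypothetical wrapper: print the main procstat views for one process ID.
inspect_process() {
    pid="$1"
    procstat -c "$pid"   # command line arguments
    procstat -f "$pid"   # file descriptors
    procstat -k "$pid"   # kernel stacks of the process' threads
    procstat -s "$pid"   # security credentials
    procstat -t "$pid"   # threads
    procstat -v "$pid"   # virtual memory mappings
}
```

Running it as `inspect_process $$` inspects the current shell.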

TextDumps: gathering information after kernel panic

Status: Committed to -CURRENT, MFCed
Will appear in 8.0: sure
Author: Robert Watson
Web: Q&A on textdumps

The usual thing that happens after a kernel panic is a kernel memory dump, either full or (in 7.0 and later) a "minidump". The new "textdump" feature doesn't store the actual kernel memory dump, but extracts commonly needed information from it, stores it into a tar archive of text files, and deletes the dump file. This significantly reduces the size requirements of collecting such information, speeds up development, and enables people to collect debugging information after a crash without kernel developer experience.

ULE 3.0: New version of the SMP-optimized scheduler

Evolution of the ULE scheduler resulted in support for fine-grained CPU affinity calculations, taking into account the physical topology of the CPUs (caches, cores, sockets) and much improved support for binding threads to CPUs. This results in additional functionalities (opens up the possibility of assigning individual CPUs to jails) and noticeable performance improvements.

Superpages

Most general-purpose processors provide support for memory pages of large sizes, called superpages. Superpages enable each entry in the translation lookaside buffer (TLB) to map a large physical memory region into a virtual address space. This dramatically increases TLB coverage, reduces TLB misses, and promises performance improvements for many applications. However, supporting superpages poses several challenges to the operating system, in terms of superpage allocation and promotion tradeoffs, fragmentation control, etc. The performance benefits are substantial, often exceeding 30%; these benefits are sustained even under stressful workload scenarios.

While they can be used on most x86 CPUs, benchmarking has shown that their greatest benefits are visible on quad-core and newer CPUs.

DTrace

DTrace is a tool and a language developed by Sun Microsystems to help with debugging and profiling operating systems. It can aggregate information from different parts of the kernel (userland tracing is not yet implemented) and analyze it in a way that's meaningful to the user.
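A typical first experiment is counting system calls per program. The sketch assumes a kernel with DTrace support and the `dtraceall` module; the wrapper name `count_syscalls_by_program` is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: aggregate system call counts by program name.
# Runs until interrupted with Ctrl-C, then prints the aggregation.
count_syscalls_by_program() {
    # Load all DTrace providers; fails harmlessly if already loaded.
    kldload dtraceall 2>/dev/null
    dtrace -n 'syscall:::entry { @[execname] = count(); }'
}
```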

Networking improvements

802.11s D3.03 wireless mesh networking

A wireless mesh network, sometimes called WMN, is a wireless network using a mesh topology instead of the more typical AP-client topology. These networks are often seen as a special type of ad-hoc network since there's no central node whose failure would break connectivity (in contrast with common wireless networks where there's a central Access Point). 802.11s is an amendment to the 802.11-2007 wireless standard that describes how a mesh network should operate on top of the existing 802.11 MAC.

VirtNet / VIMAGE / Imunes / Network stack virtualization

The network stack virtualization project aims at extending the FreeBSD kernel to maintain multiple independent instances of networking state. This will allow for complete networking independence between jails on a system, including giving each jail its own firewall, virtual network interfaces, rate limiting, routing tables, and IPSEC configuration.

VIMAGE+Jails will be experimental in 8.0; the system might not work as advertised, especially with regards to security.

Zero-copy BPF

BPF is the Berkeley Packet Filter, a facility used to capture raw network packets from the lower layers of the network stack according to user-defined filters and forward them to an application, as well as to insert raw packets into the network stack.

This improvement to BPF reduces the number of memory copy operations between the kernel and the application which improves performance in some cases.

Kernel NFS locking support

The in-kernel NFS lock manager improves the performance and behaviour of NFS locking (used to synchronize file access on remote machines). New features include multithreaded operation, deadlock detection, and transparent interaction with local file locks on the server.

NFSv4 support

NFSv4 is a major overhaul of the NFS protocol and brings many new features like a stateful protocol, performance improvements and stronger security (ACLs, strong authentication). Until recently, NFSv4 support in FreeBSD was partial (client-only) and somewhat unstable. New development aims to complete this support.

The introduced NFSv4 infrastructure also replaces the old NFSv2 and NFSv3 servers and clients with the new ones.

Storage subsystems' improvements

Experimental new driver for AHCI

The new driver, present but not enabled by default in 8.0, supports native AHCI via the CAM (common access method for storage) system. AHCI drives are manipulated by camcontrol and support for new features like NCQ has been integrated.

gvinum 2

gvinum is a logical volume manager based on and compatible with vinum, the FreeBSD's long-standing and practically traditional volume manager. Its features include JBOD, RAID 0, RAID 1 and RAID 5 modes of combining storage devices into higher level volumes, and due to the new version's integration with GEOM it can use and be used by other GEOM devices and classes.

Gvinum 2 is a significantly restructured version of gvinum and fixes many long-standing problems. The work done on gvinum makes it more usable and production ready, while maintaining compatibility with older versions. Gvinum exists in parallel with other GEOM classes like gmirror, gstripe and others.

Boot support for GPT partitions

Support for booting from GPT partitions has been committed to -CURRENT. This support includes the boot sector and loader that enable common i386 machines with a generic BIOS to boot from GPT-partitioned drives.

bsdlabel gets extended to 20 partitions

bsdlabel is (finally!) extended to support more than 8 partitions. The new limit of 20 partitions comes from the number of entries that fit in a single sector.

To make use of this change, GEOM_PART needs to be used instead of GEOM_BSD (this is default in 8.0 but will not work with older kernels). Old utilities like "bsdlabel" will not work with GEOM_PART; the new gpart utility must be used instead.

Security

ProPolice SSP (stack-smashing protection)

ProPolice helps prevent exploits that use stack-based buffer overflows by placing a random integer (called the "canary") on the stack right before the return address. It is set in the function's prologue and verified during the epilogue; if it has changed, then a buffer overflow has occurred and the program kills itself with SIGABRT (or panic() in case it's the kernel). Both userland and kernel may be protected.

Other changes

The following is a list of smaller and / or more obscure changes that nevertheless deserve a special mention since they will be of interest to certain users:

User-controllable CPU/IRQ binding (jhb)

User-controllable CPU-thread binding with support for CPU sets (jeffr)
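Both kinds of binding are driven from userland with cpuset(1). A sketch with made-up wrapper names; the CPU list uses the usual cpuset syntax such as "0" or "0-1".

```shell
#!/bin/sh
# Hypothetical wrappers around cpuset(1).

# Restrict a running process to the given CPUs. Arguments: pid, CPU list.
pin_process() {
    cpuset -l "$2" -p "$1"
}

# Bind an interrupt to the given CPUs. Arguments: IRQ number, CPU list.
pin_irq() {
    cpuset -l "$2" -x "$1"
}
```

For example, `pin_process 1234 0-1` keeps process 1234 on the first two CPUs, and `cpuset -g -p 1234` shows the resulting mask.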