Monday, September 30, 2013

Running “everything” in a web browser used to be a bold statement. However, thanks to the powerful HTML5/JavaScript stack, the web browser has increasingly become a dominant application delivery platform. Even the Linux kernel sandboxed in a web browser no longer sounds so crazy these days.
In this tutorial, I describe how to access an SSH terminal in a web browser on Linux. Web-based SSH is useful when the firewall you are behind is so restrictive that only HTTP(S) traffic can get through.
Shell In A Box (or shellinabox) is a web-based terminal emulator which can run as a web-based SSH client. It comes with its own web server (shellinaboxd), which exports a command-line shell to a web-based terminal emulator over an AJAX interface. Shell In A Box only needs JavaScript/CSS support from the web browser, and does not require any additional browser plugin.

Configure Shellinaboxd Web Server

By default, the shellinaboxd web server listens on TCP port 4200 on localhost. In this tutorial, I change the default port to 443 for HTTPS. To do so, modify the shellinabox configuration (found in /etc/default/shellinabox on Debian, Ubuntu or Linux Mint) as follows.

# TCP port that shellinaboxd's webserver listens on
PORT=443
# specify the IP address of a destination SSH server
OPTS="-s /:SSH:192.168.1.7"
# if you want to restrict access to shellinaboxd from localhost only
OPTS="-s /:SSH:192.168.1.7 --localhost-only"
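
After editing the configuration, restart the service so that shellinaboxd picks up the new port (the service name below is as packaged on Debian-family systems):

```shell
sudo service shellinabox restart
```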

Heads-up for Fedora users: according to the official documentation, some operations may not work out of the box when shellinaboxd runs under SELinux on Fedora. Refer to the documentation if you run into issues.

Provision a Self-Signed Certificate

During installation, Shell In A Box attempts to create a new self-signed certificate (certificate.pem) by using /usr/bin/openssl if no suitable certificate is found on your system. The created certificate is then placed in /var/lib/shellinabox.
If no certificate is found in the directory for some reason, you can create one yourself as follows.
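
This copy of the post lost the actual command; a minimal sketch using OpenSSL might look like the following (the 10-year validity and the /CN=localhost subject are assumptions you should adapt):

```shell
# Generate an RSA key and a self-signed certificate, valid for ~10 years,
# without a passphrase (-nodes) and without interactive prompts (-subj).
openssl req -x509 -newkey rsa:2048 -nodes \
    -keyout key.pem -out cert.pem \
    -days 3650 -subj "/CN=localhost"

# shellinaboxd expects the private key and certificate concatenated
# into a single PEM file named certificate.pem.
cat key.pem cert.pem > certificate.pem
```

Afterwards, move certificate.pem into /var/lib/shellinabox and make sure it is readable by the user shellinaboxd runs as.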

Cloud storage is everywhere in today’s multi-device environment, where people want to access their content across multiple devices wherever they go. Dropbox is the most widely used cloud storage service, owing to its elegant UI and flawless multi-platform compatibility. There are numerous official and unofficial Dropbox clients available across platforms.
Linux has its own share of Dropbox clients: CLI clients as well as GUI-based ones. Dropbox Uploader is an easy-to-use Dropbox CLI client written in BASH. In this tutorial, I describe how to access Dropbox from the command line on Linux by using Dropbox Uploader.

Install and Configure Dropbox Uploader on Linux

Make sure that curl is installed on your system, since Dropbox Uploader calls the Dropbox API via curl.
To configure Dropbox Uploader, simply run dropbox_uploader.sh. When you run the script for the first time, it will ask you to grant it access to your Dropbox account.

$ ./dropbox_uploader.sh

As instructed, open https://www2.dropbox.com/developers/apps
in your web browser, and create a new Dropbox app. Fill in the
information for the new app as shown below, and enter the app name
generated by Dropbox Uploader.
After you have created the new app, you will see its app key and secret on the next page. Make a note of them.
Enter the app key and secret in the terminal window where dropbox_uploader.sh is running. dropbox_uploader.sh will then generate an OAuth URL (e.g., http://www2.dropbox.com/1/oauth/authorize?oauth_token=XXXXXXX).
Open the OAuth URL in your web browser, and allow access to your Dropbox account.
This completes Dropbox Uploader configuration. To check whether
Dropbox Uploader is successfully authenticated, run the following
command.
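
The command itself was lost from this copy; with Dropbox Uploader, the info and list subcommands are the usual way to verify authentication (treat the exact subcommand names as assumptions if your version differs):

```shell
# Print account information; this succeeds only if the OAuth step worked.
./dropbox_uploader.sh info

# List the contents of the Dropbox root folder.
./dropbox_uploader.sh list /
```

Once this works, uploads and downloads follow the same pattern, e.g. `./dropbox_uploader.sh upload localfile.txt /remote/path`.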

There are many VNC clients available on
Linux, differing in their capabilities and operating system support.
If you are looking for a cross-platform VNC client, you have two
options: use either Java-based VNC viewers (e.g., RealVNC or TightVNC),
or web-based VNC clients.
VNC web clients are typically faster than Java-based VNC viewers, and can more easily be integrated into other third-party applications.
In this tutorial, I will describe how to access a VNC remote desktop in a web browser by using a web-based VNC client called noVNC. noVNC is an HTML5-based remote desktop web client which can communicate with a remote VNC server via WebSockets. Using noVNC, you can control a remote computer in a web browser over VNC.
noVNC has been integrated into a number of other projects, including OpenStack, OpenNebula, CloudSigma, Amahi and PocketVNC.

Web Browser Requirements

To run noVNC, your web browser must support HTML5, more specifically
HTML5 Canvas and WebSockets. The following browsers meet the
requirements: Chrome 8+, Firefox 3.6+, Safari 5+, iOS Safari 4.2+, Opera 11+, IE 9+, and Chrome Frame on IE 6-8. If your browser does not have native WebSockets support, you can use web-socket-js, which is included in the noVNC package.
For more detailed browser compatibility, refer to the official guide.

Install noVNC on Linux
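
The installation steps were not preserved in this copy; since noVNC is developed on GitHub, a clone of the repository is the usual install (the repository URL is an assumption based on the project's home at the time):

```shell
# Fetch noVNC (the client pages plus the utils/launch.sh helper).
git clone git://github.com/kanaka/noVNC.git
cd noVNC
```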

Launch Websockify WebSockets Proxy

The first step is to launch Websockify (which comes with the noVNC
package) on the local host. noVNC leverages Websockify to communicate with a
remote VNC server. Websockify is a WebSocket-to-TCP proxy/bridge,
which allows a web browser to connect to any application, server or
service via a local TCP proxy.
I assume that you have already set up a running VNC server somewhere. For
the purposes of this tutorial, I set up a VNC server at
192.168.1.10:5900 by using x11vnc.
To launch Websockify, use the startup script called launch.sh. This
script starts a mini-webserver as well as Websockify. The “--vnc”
option specifies the location of the remotely running VNC server.
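
The invocation itself did not survive in this copy; based on the description above, it would look like this (192.168.1.10:5900 is the example VNC server set up earlier):

```shell
# Start the mini-webserver and Websockify, proxying browser WebSocket
# connections to the remote VNC server.
./utils/launch.sh --vnc 192.168.1.10:5900
```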

At this point, you can open up a web browser, and navigate to the URL
shown in the output of Websockify (e.g.,
http://127.0.0.1:6080/vnc.html?host=127.0.0.1&port=6080).
If the remote VNC server requires password authentication, you will see the following screen in your web browser.
After you have successfully connected to a remote VNC server, you will be able to access the remote desktop as follows.
You can adjust the settings of a VNC session by clicking on the settings icon located in the top right corner.

Create Encrypted VNC Session with noVNC

By default a VNC session created by noVNC is not encrypted. If you
want, you can create encrypted VNC connections by using the WebSocket
‘wss://’ URI scheme. For that, you need to generate a self-signed
encryption certificate (e.g., by using OpenSSL), and have Websockify
load the certificate.
To create a self-signed certificate with OpenSSL:
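
The exact command did not survive in this copy; a minimal sketch is shown below (the self.pem file name is a convention for Websockify, and the subject is an assumption):

```shell
# Generate a key and a self-signed certificate, then combine them into
# one PEM file (self.pem) that Websockify can load.
openssl req -new -x509 -days 365 -nodes \
    -keyout novnc_key.pem -out novnc_cert.pem -subj "/CN=localhost"
cat novnc_key.pem novnc_cert.pem > self.pem
```

You can then start launch.sh with the --cert self.pem option and connect over https:// so the session is carried over wss://.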

Packet queues are a core component of any network stack or device. They
allow for asynchronous modules to communicate, increase performance and
have the side effect of impacting latency. This article aims to explain
where IP packets are queued on the transmit path of the Linux network
stack, how interesting new latency-reducing features, such as BQL, operate
and how to control buffering for reduced latency.
Figure 1. Simplified High-Level Overview of the Queues on the Transmit
Path of the Linux Network Stack

Driver Queue (aka Ring Buffer)

Between the IP stack and the network interface controller (NIC) lies the
driver queue. This queue typically is implemented as a first-in, first-out
(FIFO) ring buffer (http://en.wikipedia.org/wiki/Circular_buffer)—just think of it as a
fixed-sized buffer. The driver
queue does not contain the packet data. Instead, it consists of descriptors
that point to other data structures called socket kernel buffers (SKBs,
http://vger.kernel.org/%7Edavem/skb.html),
which hold the packet data and are used throughout the kernel.
Figure 2. Partially Full Driver Queue with Descriptors Pointing to SKBs
The input source for the driver queue is the IP stack that queues IP
packets. The packets may be generated locally or received on one NIC to be
routed out another when the device is functioning as an IP router. Packets
added to the driver queue by the IP stack are dequeued by the hardware
driver and sent across a data bus to the NIC hardware for transmission.
The reason the driver queue exists is to ensure that whenever the
system has data to transmit it is available to the NIC for immediate
transmission. That is, the driver queue gives the IP stack a location
to queue data asynchronously from the operation of the hardware. An
alternative design would be for the NIC to ask the IP stack for data
whenever the physical medium is ready to transmit. Because responding
to this request cannot be instantaneous, this design wastes valuable
transmission opportunities resulting in lower throughput. The opposite
of this design approach would be for the IP stack to wait after a packet
is created until the hardware is ready to transmit. This also is not
ideal, because the IP stack cannot move on to other work.

Huge Packets from the Stack

Most NICs have a fixed maximum transmission unit (MTU), which is the
biggest frame that can be transmitted by the physical media. For Ethernet,
the default MTU is 1,500 bytes, but some Ethernet networks support Jumbo
Frames (http://en.wikipedia.org/wiki/Jumbo_frame) of up to 9,000 bytes. Inside the IP network stack, the MTU can
manifest as a limit on the size of the packets that are sent to the
device for transmission. For example, if an application writes 2,000
bytes to a TCP socket, the IP stack needs to create two IP packets
to keep the packet size less than or equal to a 1,500 MTU. For large
data transfers, the comparably small MTU causes a large number of small
packets to be created and transferred through the driver queue.
In order to avoid the overhead associated with a large number of packets
on the transmit path, the Linux kernel implements several optimizations:
TCP segmentation offload (TSO), UDP fragmentation offload (UFO) and
generic segmentation offload (GSO). All of these optimizations allow the
IP stack to create packets that are larger than the MTU of the outgoing
NIC. For IPv4, packets as large as the IPv4 maximum of 65,535 bytes can
be created and queued to the driver queue. In the case of TSO and UFO,
the NIC hardware takes responsibility for breaking the single large
packet into packets small enough to be transmitted on the physical
interface. For NICs without hardware support, GSO performs the same
operation in software immediately before queueing to the driver queue.
Recall from earlier that the driver queue contains a fixed number of
descriptors that each point to packets of varying sizes. Since TSO,
UFO and GSO allow for much larger packets, these optimizations have the
side effect of greatly increasing the number of bytes that can be queued
in the driver queue. Figure 3 illustrates this concept in contrast with
Figure 2.
Figure 3. Large packets can be sent to the NIC when TSO, UFO or GSO
are enabled. This can greatly increase the number of bytes in the
driver queue.
Although the focus of this article is the transmit path, it is worth
noting that Linux has receive-side optimizations that operate
similarly to TSO, UFO and GSO and share the goal of reducing per-packet
overhead. Specifically, generic receive offload (GRO,
http://vger.kernel.org/%7Edavem/cgi-bin/blog.cgi/2010/08/30) allows the NIC
driver to combine received packets into a single large packet that is
then passed to the IP stack. When the device forwards these large packets,
GRO allows the original packets to be reconstructed, which is necessary
to maintain the end-to-end nature of the IP packet flow. However, there
is one side effect: when the large packet is broken up, it results in
several packets for the flow being queued at once. This
"micro-burst"
of packets can negatively impact inter-flow latency.

Starvation and Latency

Despite its necessity and benefits, the queue between the IP stack and
the hardware introduces two problems: starvation and latency.
If the NIC driver wakes to pull packets off of the queue for transmission
and the queue is empty, the hardware will miss a transmission opportunity,
thereby reducing the throughput of the system. This is referred to
as starvation. Note that an empty queue when the system does not
have anything to transmit is not starvation—this is normal. The
complication associated with avoiding starvation is that the IP stack
that is filling the queue and the hardware driver draining the queue run
asynchronously. Worse, the duration between fill or drain events varies
with the load on the system and external conditions, such as the network
interface's physical medium. For example, on a busy system, the IP stack
will get fewer opportunities to add packets to the queue, which increases
the chances that the hardware will drain the queue before more packets
are queued. For this reason, it is advantageous to have a very large
queue to reduce the probability of starvation and ensure high throughput.
Although a large queue is necessary for a busy system to maintain high
throughput, it has the downside of allowing for the introduction of a
large amount of latency.
Figure 4 shows a driver queue that is almost full with TCP segments
for a single high-bandwidth, bulk traffic flow (blue). Queued last is
a packet from a VoIP or gaming flow (yellow). Interactive applications
like VoIP or gaming typically emit small packets at fixed intervals that
are latency-sensitive, while a high-bandwidth data transfer generates a
higher packet rate and larger packets. This higher packet rate can fill
the queue between interactive packets, causing the transmission of the
interactive packet to be delayed.
Figure 4. Interactive Packet (Yellow) behind Bulk Flow Packets (Blue)
To illustrate this behaviour further, consider a scenario based on the
following assumptions:

A network interface that is capable of transmitting at 5 Mbit/sec or
5,000,000 bits/sec.

Each packet from the bulk flow is 1,500 bytes or 12,000 bits.

Each packet from the interactive flow is 500 bytes.

The depth of the queue is 128 descriptors.

There are 127 bulk data packets and one interactive packet queued last.

Given the above assumptions, the time required to drain the 127 bulk
packets and create a transmission opportunity for the interactive packet
is (127 * 12,000) / 5,000,000 = 0.304 seconds (304 milliseconds for those
who think of latency in terms of ping results). This amount of latency
is well beyond what is acceptable for interactive applications, and this
does not even represent the complete round-trip time—it is only the
time required to transmit the packets queued before the interactive one. As
described earlier, the size of the packets in the driver queue can be
larger than 1,500 bytes, if TSO, UFO or GSO are enabled. This makes the
latency problem correspondingly worse.
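
The drain-time arithmetic above is easy to check directly in the shell; multiplying by 1,000 first keeps the integer math in milliseconds:

```shell
# 127 bulk packets of 12,000 bits each, drained at 5,000,000 bits/sec.
echo "$(( 127 * 12000 * 1000 / 5000000 )) ms"   # prints "304 ms"
```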
Large latencies introduced by over-sized, unmanaged queues are known as
Bufferbloat (http://en.wikipedia.org/wiki/Bufferbloat). For a more
detailed explanation of this phenomenon, see the
Resources for this article.
As the above discussion illustrates, choosing the correct size for
the driver queue is a Goldilocks problem—it can't be too small, or
throughput suffers; it can't be too big, or latency suffers.

Byte Queue Limits (BQL)

Byte Queue Limits (BQL) is a feature in recent Linux kernels
(3.3 and later) that attempts to solve the problem of driver queue sizing
automatically. This is accomplished by adding a layer that enables and
disables queueing to the driver queue based on calculating the minimum
queue size required to avoid starvation under the current system
conditions. Recall from earlier that the smaller the amount of queued
data, the lower the maximum latency experienced by queued packets.
It is key to understand that the actual size of the driver queue is not
changed by BQL. Rather, BQL calculates a limit of how much data (in bytes)
can be queued at the current time. Any bytes over this limit must be
held or dropped by the layers above the driver queue.
A real-world example may help provide a sense of how much BQL affects
the amount of data that can be queued. On one of the author's servers, the
driver queue size defaults to 256 descriptors. Since the Ethernet MTU is
1,500 bytes, this means up to 256 * 1,500 = 384,000 bytes can be queued
to the driver queue (TSO, GSO and so forth are disabled, or this would be much
higher). However, the limit value calculated by BQL is 3,012 bytes. As you
can see, BQL greatly constrains the amount of data that can be queued.
BQL reduces network latency by limiting the amount of data in the driver
queue to the minimum required to avoid starvation. It also has the
important side effect of moving the point where most packets are queued
from the driver queue, which is a simple FIFO, to the queueing discipline
(QDisc) layer, which is capable of implementing much more complicated
queueing strategies.

Queueing Disciplines (QDisc)

The driver queue is a simple first-in, first-out (FIFO) queue. It treats
all packets equally and has no capabilities for distinguishing between
packets of different flows. This design keeps the NIC driver software
simple and fast. Note that more advanced Ethernet and most wireless
NICs support multiple independent transmission queues, but similarly, each
of these queues is typically a FIFO. A higher layer is responsible for
choosing which transmission queue to use.
Sandwiched between the IP stack and the driver queue is the queueing
discipline (QDisc) layer (Figure 1). This layer implements the
traffic management capabilities of the Linux kernel, which include traffic
classification, prioritization and rate shaping. The QDisc layer is
configured through the somewhat opaque tc command. There are three key
concepts to understand in the QDisc layer: QDiscs, classes and filters.
The QDisc is the Linux abstraction for traffic queues, which are more
complex than the standard FIFO queue. This interface allows the QDisc to
carry out complex queue management behaviors without requiring the IP
stack or the NIC driver to be modified. By default, every network interface
is assigned a pfifo_fast QDisc
(http://lartc.org/howto/lartc.qdisc.classless.html), which
implements a simple three-band
prioritization scheme based on the TOS bits. Despite being the default,
the pfifo_fast QDisc is far from the best choice, because it defaults to
having very deep queues (see txqueuelen below) and is not flow aware.
The second concept, which is closely related to the QDisc, is the
class. Individual QDiscs may implement classes in order to handle subsets
of the traffic differently—for example, the Hierarchical Token Bucket
(HTB, http://lartc.org/manpages/tc-htb.html) QDisc allows
the user to configure multiple classes, each with a
different bitrate, and direct traffic to each as desired. Not all QDiscs
have support for multiple classes. Those that do are referred to as classful
QDiscs, and those that do not are referred to as classless QDiscs.
Filters (also called classifiers) are the mechanism used to direct
traffic to a particular QDisc or class. There are many different filters
of varying complexity. The u32 filter
(http://www.lartc.org/lartc.html#LARTC.ADV-FILTER.U32) is the most
generic, and the flow
filter is the easiest to use.

Buffering between the Transport Layer and the Queueing Disciplines

In looking at the figures for this article, you may have noticed that there
are no packet queues above the QDisc layer. The network stack places
packets directly into the QDisc or else pushes back on the upper layers
(for example, socket buffer) if the queue is full. The obvious question that follows
is what happens when the stack has a lot of data to send? This can occur
as the result of a TCP connection with a large congestion window or, even
worse, an application sending UDP packets as fast as it can. The answer
is that for a QDisc with a single queue, the same problem outlined in
Figure 4 for the driver queue occurs. That is, the high-bandwidth or
high-packet rate flow can consume all of the space in the queue causing
packet loss and adding significant latency to other flows. Because Linux
defaults to the pfifo_fast QDisc, which effectively has a single queue
(most traffic is marked with TOS=0), this phenomenon is not uncommon.
As of Linux 3.6.0, the Linux kernel has a feature called TCP Small
Queues that aims to solve this problem for TCP. TCP Small Queues adds
a per-TCP-flow limit on the number of bytes that can be queued in the
QDisc and driver queue at any one time. This has the interesting side
effect of causing the kernel to push back on the application earlier,
which allows the application to prioritize writes to
the socket more effectively. At the time of this writing, it is still possible for single
flows from other transport protocols to flood the QDisc layer.
Another partial solution to the transport layer flood problem, which is
transport-layer-agnostic, is to use a QDisc that has many queues, ideally
one per network flow. Both the Stochastic Fairness Queueing (SFQ,
http://crpppc19.epfl.ch/cgi-bin/man/man2html?8+tc-sfq) and
Fair Queueing with Controlled Delay (fq_codel,
http://linuxmanpages.net/manpages/fedora18/man8/tc-fq_codel.8.html) QDiscs fit this problem
nicely, as they effectively have a queue-per-network flow.

How to Manipulate the Queue Sizes in Linux

Driver Queue:
The ethtool command
(http://linuxmanpages.net/manpages/fedora12/man8/ethtool.8.html) is used to control the driver queue size for Ethernet
devices. ethtool also provides low-level interface statistics as well
as the ability to enable and disable IP stack and driver features.
The -g flag to ethtool displays the driver queue (ring) parameters:
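
The output was not captured in this copy; on a typical NIC it looks something like the following (the interface name and values are illustrative):

```shell
$ ethtool -g eth0
Ring parameters for eth0:
Pre-set maximums:
RX:             4096
RX Mini:        0
RX Jumbo:       0
TX:             4096
Current hardware settings:
RX:             256
RX Mini:        0
RX Jumbo:       0
TX:             256
```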

You can see from the above output that the driver for this NIC defaults
to 256 descriptors in the transmission queue. Early in the Bufferbloat
investigation, it often was recommended to reduce the size of the driver
queue in order to reduce latency. With the introduction of BQL (assuming
your NIC driver supports it), there no longer is any reason to modify
the driver queue size (see below for how to configure BQL).
ethtool also allows you to view and manage optimization features, such
as TSO, GSO, UFO and GRO, via the -k and
-K flags. The -k flag displays
the current offload settings and -K modifies them.
As discussed above, some optimization features greatly increase the
number of bytes that can be queued in the driver queue. You should
disable these optimizations if you want to optimize for latency over
throughput. It's doubtful you will notice any CPU impact or throughput
decrease when disabling these features unless the system is handling
very high data rates.
Byte Queue Limits (BQL):
The BQL algorithm is self-tuning, so you probably don't need to modify
its configuration. BQL state and configuration can be found in a /sys
directory based on the location and name of the NIC. For example:
/sys/devices/pci0000:00/0000:00:14.0/net/eth0/queues/tx-0/byte_queue_limits.
To place a hard upper limit on the number of bytes that can be queued,
write the new value to the limit_max file:

echo "3000" > limit_max

What Is txqueuelen?
Often in early Bufferbloat discussions, the idea of statically reducing
the NIC transmission queue was mentioned. The txqueuelen field in the
ifconfig command's output or the qlen field in the ip command's output
show the current size of the transmission queue:
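
The command output was not preserved here; a typical example looks like this (the interface details are illustrative):

```shell
$ ip link show dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
```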

The length of the transmission queue in Linux defaults to 1,000 packets,
which is a large amount of buffering, especially at low bandwidths.
The interesting question is what queue does this value control? One might
guess that it controls the driver queue size, but in reality, it serves
as a default queue length for some of the QDiscs. Most important,
it is the default queue length for the pfifo_fast QDisc, which is the
default. The "limit" argument on the tc command line can be used to
override the txqueuelen default.
The length of the transmission queue is configured with the ip or
ifconfig commands:

ip link set txqueuelen 500 dev eth0

Queueing Disciplines:
As introduced earlier, the Linux kernel has a large number of queueing
disciplines (QDiscs), each of which implements its own packet queues and
behaviour. Describing the details of how to configure each of the QDiscs
is beyond the scope of this article. For full details, see the tc man page
(man tc). You can find details for each QDisc in
man tc qdisc-name
(for example, man tc htb or man tc
fq_codel).
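
As one concrete sketch (the interface name is an assumption), switching the root QDisc of an interface from the default pfifo_fast to the flow-aware fq_codel looks like this:

```shell
# Replace the root queueing discipline on eth0 with fq_codel.
sudo tc qdisc replace dev eth0 root fq_codel

# Confirm the active QDisc.
tc qdisc show dev eth0
```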
TCP Small Queues:
The per-socket TCP queue limit can be viewed and controlled with the
following /proc file:
/proc/sys/net/ipv4/tcp_limit_output_bytes.
You should not need to modify this value in any normal situation.

Oversized Queues Outside Your Control

Unfortunately, not all of the over-sized queues that will affect your
Internet performance are under your control. Most commonly, the problem
will lie in the device that attaches to your service provider (such as DSL or
cable modem) or in the service provider's equipment itself. In the latter
case, there isn't much you can do, because it is difficult to control the
traffic that is sent toward you. However, in the upstream direction, you
can shape the traffic to slightly below the link rate. This will stop the
queue in the device from having more than a few packets. Many residential
home routers have a rate limit setting that can be used to shape below
the link rate. Of course, if you use Linux on your home gateway, you can
take advantage of the QDisc features to optimize further. There are many
examples of tc scripts on-line to help get you started.
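
Most of those scripts reduce to something like the following sketch, which shapes egress to just below an assumed 10 Mbit/s uplink using HTB (the rate and interface name are placeholders):

```shell
# Queue packets locally (where the QDisc can manage them) rather than in
# the modem, by shaping all egress on eth0 to 9 Mbit/s.
sudo tc qdisc add dev eth0 root handle 1: htb default 10
sudo tc class add dev eth0 parent 1: classid 1:10 htb rate 9mbit ceil 9mbit
```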

Summary

Queueing in packet buffers is a necessary component of any packet network,
both within a device and across network elements. Properly managing the
size of these buffers is critical to achieving good network latency,
especially under load. Although static queue sizing can play a role in
decreasing latency, the real solution is intelligent management of the
amount of queued data. This is best accomplished through dynamic schemes,
such as BQL and active queue management (AQM,
http://en.wikipedia.org/wiki/Active_queue_management) techniques like Codel. This
article outlines where packets are queued in the Linux network stack,
how features related to queueing are configured and provides some guidance
on how to achieve low latency.

Tuesday, September 24, 2013

Video editing in Linux is a controversial topic. There are a number of
video editors for Ubuntu that work quite well. But are they any good
for serious movie editing? Perhaps not. But with the arrival of Linux
variants from big-shots such as Lightworks, things are slowly
starting to change. Remember the kind of sweeping change we witnessed in
the Linux gaming scene once Valve released their much-touted Steam client for Linux? But
that's another story. Here, we'll discuss 5 of the most potent video
editors available for Ubuntu.

Lightworks is a top-notch, professional-grade video/movie
editor which recently released a beta version for Linux as well.
Lightworks was perhaps one of the first to adopt computer-based
non-linear editing systems, and has been in development since 1989. The
release of an open-source version, as well as ports for Linux and Mac OS
X, was announced in May 2010. The Lightworks beta video editor is free to
download and use, and there is a paid PRO plan which gives you
extra features and codec support at $60/year.

Kdenlive is an open-source, non-linear video editing
application available for FreeBSD, Linux and Mac OS X. Kdenlive
was one of the earliest dedicated video editors developed for Linux,
with the project starting as early as 2002. Kdenlive 0.9.4 is
available in the Ubuntu Software Center by default. But if you want the latest version (Kdenlive 0.9.6), do the following in a terminal. Visit the Kdenlive download page for more options.
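
The commands were lost from this copy; at the time, the community kdenlive-release PPA was the usual route, so the sketch below assumes that PPA name:

```shell
sudo add-apt-repository ppa:sunab/kdenlive-release
sudo apt-get update
sudo apt-get install kdenlive
```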

OpenShot is perhaps one of the most active open source
video editing software projects out there. In my book, OpenShot is a
little more intuitive when compared to its competition. And after a
successful Kickstarter funding campaign recently, the team will be
launching a Windows and Mac version of OpenShot apart from the normal
Linux variant. Add the following PPA to install OpenShot in Ubuntu. More
download options here.
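
The PPA details did not survive in this copy; the project's own PPA (name assumed here) is added like so:

```shell
sudo add-apt-repository ppa:openshot.developers/ppa
sudo apt-get update
sudo apt-get install openshot openshot-doc
```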

Flowblade Movie Editor is an open-source, multitrack, non-linear
video editor for Linux. Flowblade is undoubtedly the hottest
new entrant into the Linux video editing scene. The project started only
last year, and there have been just three releases so far. The latest
release, Flowblade 0.10.0, came just two weeks ago, and it is already
showing an enormous amount of potential. Flowblade
is available only as DEB packages at the moment.

Cinelerra is professional, open-source video editing and compositing
software for Linux. Cinelerra was first
released on August 1, 2002. Cinelerra includes support for very
high-fidelity audio and video. The latest version, Cinelerra 4.4, was
released more than a year ago and featured faster startup and
increased responsiveness, among other improvements. Cinelerra has plenty
of download options. If you're an Ubuntu user, just do the following.
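
The commands were not preserved in this copy; a community PPA carried Cinelerra at the time (the PPA and package names below are assumptions):

```shell
sudo add-apt-repository ppa:cinelerra-ppa/ppa
sudo apt-get update
sudo apt-get install cinelerra
```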

I have deliberately not included Blender here because, even though it
can do video editing, Blender is much more than that. Blender is a
full-blown graphics processing suite with advanced 3D modelling
capabilities (Tears of Steel
was the latest in a long list of official Blender-made animated
movies). Did we miss any other good video editors for Linux? Let
us know in the comments. Thanks for reading.

Sunday, September 22, 2013

The user data manifesto defines basic rights for people to control their own data in the Internet age.

1. Own the data
The data that someone directly or indirectly creates belongs to the person who created it.

2. Know where the data is stored
Everybody should be able to know: where their personal data is physically stored, for how long, on which server, in what country, and what laws apply.

3. Choose the storage location
Everybody should always be able to migrate their personal data to a different provider, server or their own machine at any time, without being locked in to a specific vendor.

4. Control access
Everybody should be able to know, choose and control who has access to their own data to see or modify it.

5. Choose the conditions
If someone chooses to share their own data, then the owner of the data selects the sharing license and conditions.

6. Invulnerability of data
Everybody should be able to protect their own data against surveillance
and to federate their own data for backups to prevent data loss or for
any other reason.

7. Use it optimally
Everybody should be able to access and use their own data at all times, with any device they choose, in the way that is most convenient and easiest for them.

8. Server software transparency
Server software should be free and open-source software, so that the source code can be inspected to confirm that it works as specified.

This list covers services, projects and software that respect user data rights and this manifesto. Contact us to have a piece of software or a project added to the list.

GitHub is the most popular open source project hosting site. Earlier this year, GitHub reached a milestone
in the history of open source project management by hosting 6 million
projects over which 3.5 million people collaborate. You may wonder what
the hottest open source projects are among those 6 million projects.
In this post, I will describe the 20 most popular open source projects hosted at GitHub. To rank projects, I use the number of “stars” received by each project as a “popularity” metric. At GitHub, “starring” a project is a way to keep track of projects that you find interesting, so the number of stars added to a project presumably indicates the level of interest in the project among registered GitHub users.
I understand that any kind of “popularity” metric for open source
projects is subjective at best; the value of open source code is
very much in the eye of the beholder. With that said, this post is
meant to introduce cool projects that you may not be aware of, and as
casual reading for those interested in such tidbits. It is NOT meant
as a popularity contest or a competition among different projects.
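For the curious, star counts are exposed by GitHub's public repository API (`https://api.github.com/repos/OWNER/REPO`). Below is a minimal shell sketch that extracts the `stargazers_count` field; to keep it self-contained, it parses a saved sample response rather than making a live request, and the count shown is made up.

```shell
# GitHub's repository API returns JSON including a "stargazers_count" field, e.g.:
#   curl -s https://api.github.com/repos/twbs/bootstrap
# Here we parse a saved sample response (the count below is illustrative).
sample='{"full_name": "twbs/bootstrap", "stargazers_count": 63000}'
stars=$(printf '%s\n' "$sample" | sed -n 's/.*"stargazers_count": \([0-9]*\).*/\1/p')
echo "twbs/bootstrap has $stars stars"
```

In a real script you would pipe `curl` output into the same `sed` extraction (or use a proper JSON tool such as `jq` if it is installed).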

Bootstrap, which is developed by Twitter, is a powerful front-end
framework for web development. Bootstrap contains a collection of
HTML/CSS templates and JavaScript extensions to allow you to quickly
prototype the front-end UI of websites and web applications.

Node.js is a server-side JavaScript environment that enables
real-time web applications to scale by using an asynchronous,
event-driven model. Node.js uses Google’s V8 JavaScript engine to run
its JavaScript, and it is one of the hottest technologies today, used in
many production environments including LinkedIn, PayPal, Walmart,
Yahoo! and eBay.

jQuery is a cross-browser JavaScript library designed to simplify the
way you write client-side JavaScript. jQuery can handle HTML document
traversal, events, animation, AJAX interaction, and much more.
According to BuiltWith, jQuery is used by more than 60% of top-ranking websites today.

HTML5 Boilerplate is a professional looking front-end template for
building fast, robust, and adaptable web sites or web applications in
HTML5. If you want to learn HTML5 and CSS3, this is an excellent
starting point.

D3.js is a cross-browser JavaScript library for presenting documents
with dynamic and interactive graphics driven by data. D3.js can
visualize any digital data in W3C-compliant HTML5, CSS3, SVG or
JavaScript.

Impress.js is a CSS3-based presentation framework that allows you to
convert HTML content into a slideshow presentation with stunning
visualization and animation. Using impress.js, you can easily create
beautiful looking online presentations supported by all modern browsers.

Font Awesome is a suite of scalable vector icons that can be
customized in size, color or drop shadow by using CSS. It is designed to
be fully compatible with Bootstrap. Font Awesome is completely free for
commercial use.

AngularJS is a JavaScript framework developed by Google to assist in
writing client-side web applications with model–view–controller (MVC)
structure, so that both development and testing become easier.
AngularJS allows you to properly structure web applications by using
powerful features such as directives, data binding, filters, modules and
scope.

Homebrew is package management software for Mac OS X. It simplifies
the installation of free/open source software that Apple does not
ship with Mac OS X. As of today, Homebrew has the second largest number
of contributors on GitHub (next to the Linux kernel source tree maintained by Linus Torvalds).

Foundation is a responsive front-end framework that allows you to
easily build websites or applications that run on any kind of mobile
device. Foundation includes layout templates (like a fully responsive
grid), elements and best practices.

Three.js is a cross-browser JavaScript library that allows you to
create and display lightweight 3D animation and graphics in a web
browser without any proprietary browser plugin. It can be used along
with HTML5 Canvas, SVG or WebGL.

Brackets is a web code editor written in JavaScript, HTML and CSS,
which allows you to edit HTML and CSS. Brackets works directly in your
browser, so you can instantly switch between the code editor view and
the browser view, all within the web browser.

Oh My Zsh is a community-driven framework for managing ZSH
configurations, where contributors contribute their ZSH configurations
to GitHub, so that users can grab them. It comes bundled with more than
120 ZSH plugins, themes, functions, etc.

This year, the annual Bossie
Awards recognize 120 of the best open source projects for data centers
and clouds, desktops and mobile devices, and developers and IT pros.

What do a McLaren Supercar, a refrigerator, a camera, a washing
machine, and a cellphone have to do with open source? They're all
examples of how a good pile of code can take on a new life when it's set
free with an open source license.

The same forces are turned
loose in every corner of business computing, from application
development and big data analytics to the software that runs our
desktops, data centers, and clouds. This year's edition of our annual
Best of Open Source Software Awards rounds up more than 120 top projects
in seven categories:

When
the Android developers started releasing their operating system in
2007, they just wanted to colonize the world of mobile phones. The
iPhone was incredibly popular, and attracting any attention was a
challenge. Choosing an open source license was an easy path to
partnerships with the phone manufacturers around the world. After all,
giving people something free is an easy way to make friends.

In 2013, something unexpected happened: The camera engineers noticed
the explosion of creative new software apps for taking photographs with
mobile phones. Someone asked, "What if we put Android on our camera?"
Now Android cameras with better lenses can leverage the fertile software
ecosystem of Android apps.

This is the way that open source
software is supposed to work. Developers share, and software
proliferates. Is it any surprise that the folks at Samsung are now
making an Android refrigerator?
Or an Android clothes washer? Or an Android watch? Or that McLaren, the
maker of overjuiced cars, wants the radio in its car to run Android?
Will there be Android in our doorbells, our cats, and our sofas? Only
time and the programmers know. The source code is out there and anyone
can install it.

As in years past, this year's collection of Bossie
Award winners celebrates this tradition of sharing and
cross-fertilization. The open source software ecosystem continues to
flourish and grow as old projects continue to snowball while new
projects emerge to tackle new needs.

The most successful projects,
like Android, are finding new homes in unexpected places, and there
seem to be more unexpected places than ever. Throughout the Web and
the enterprise, open source is less and less the exception and more and
more the rule. It's in the server stacks, it's on the desktop, and it's a
big, big part of the mobile ecology.

The server stack is growing
increasingly open. Much of the software for maintaining our collection
of servers is now largely open source thanks to the proliferation of Linux,
but the operating system is just the beginning. Almost everything built
on top of -- or below -- the operating system is also available as an
open source package.

OpenStack
is a collection of open source packages that let you build a cloud that
rivals Amazon's. If you want your cloud to work the same way Amazon's
does, using the same scripts and commands, open source offers that too:
Eucalyptus. The cloud companies are using open source's flexibility as a
way to lure people into their infrastructures. If things don't work out
and you want to leave, open source presumably provides the exit. Just as
Eucalyptus mirrors Amazon, an OpenStack cloud in your data center should
behave like the OpenStack clouds run by Rackspace and HP, answering to
the same commands.

The
Bossie Awards also focus on an increasingly important layer, the one
that keeps all of these machines in the cloud in line. Orchestration
tools such as Puppet, Chef,
and Salt serve the needs of the harried sys admins who must organize
the various servers by making sure they're running the right combination
of software, patches, libraries, and extensions. These tools ensure the
code will be the right versions, the services will be initialized, and
everything will start and stop as it's supposed to. They automate the
tasks that keep the entire cloud in harmony.

Once the machines
are configured, another popular layer for the enterprise gets the
servers working together on answers to big questions. The cloud is not
just for databases and Web serving; more and more complex
analytical work is being done by clusters of machines that get spun up
to handle big mathematical jobs. Hadoop
is a buzzword that refers to both the core software for running
massively parallel jobs and the constellation of fat packages that help
Hadoop find the answers. Most of these companions are open source too,
and a number of them made the list for our awards.

These tools for big data are often closely aligned with the world of NoSQL data stores
that offer lightweight storage for extremely large data sets. This next
generation of data storage is dominated by open source offerings, and
we've recognized several that grew more sophisticated, more stable, and
more essential this year. The information for the increasingly social
and networked Internet is stored in open source tools.

By the way,
the past roles for open source aren't forgotten -- they've simply begun
to morph. Some of the awards go to a core of old-fashioned tools that
continue to grow more robust. Python, Ruby, WordPress, and the old
standard OpenOffice (in a freshly minted version 4) are better and
stronger than ever. Firefox -- both the browser and the operating system
-- received Bossies, illustrating the enduring strength of the openness
of HTML and the World Wide Web.

Some of these new roles are
surprising. One of HTML's close cousins or partners in crime,
JavaScript, is continuing its bold rush to colonize the server. Node.js,
a clever hack designed to repurpose the V8 JavaScript engine by
bringing it to the server, is now a flourishing ecosystem of its own.
The raw speed of the tool has attracted an explosion of attention, and
developers have shared thousands of modules that revise and extend the
core server.

What is notable is that many of the newest open
source tools are already on the front lines in enterprise shops. The
open source ethic began in the labs, where it continues to serve an
important role, aligning groups in pre-competitive areas and allowing
them to work together without worrying about ownership. A number of
important areas of research are advancing through open source blocks of
code, and our list of winners includes several projects for studying
social networks (Gephi, Neo4j, Giraph, Hama) and constructing
statistical models of data (Drill).

Throughout this long list, there continues to be a healthy competition between the different licenses.
The most generous and least encumbering options such as the MIT and BSD
licenses are generally applied to the tools built by researchers who
often aren't ready to commercialize their work. The more polished,
productlike tools backed by professional programmers are increasingly
being released under tighter rules that force more disclosure. Use of
the GPL 3.0 and the AGPL is growing more common as companies look to
push more sharing on those who benefit from open source.

Openly commercial

The companies behind open source projects are also becoming more adept at
creating tools that exert control and dominance. Many who are drawn in
by the lure of open source quickly discover that not everything is as
free as it seems. While the code continues to be shared openly,
companies often hold something back. Some charge for documentation,
others charge for privacy, but all of the successful companies have some
secret sauce they use to ensure their role.

Google, for instance,
is increasingly flexing its muscle to exert more control by pushing
more features into the ubiquitous Play Services. The Android operating
system may be free and available under the generous Apache license, but
more and more features are appearing in the Play Services layer that's
hidden away. The phone companies can customize and enhance the Android
layer all they want, but Google maintains control over the Play
Services.

This has some advantages. Some developers of Android
apps complain about the "matrix of pain," a term that refers to the
impossibly wide range of Android devices on the market. Any app that
they build should be tested against all of the phones and tablets, both
small and large. The Play Services offer some stability in this sea of
confusion.

This stability is more and more common in the professional stack as
the companies behind the projects find ways to sustain the development.
When the software is powering the servers and the apps that power the
business, that's what the enterprise customers demand. When the software
is running on cars, refrigerators, washing machines, and even mobile
phones, that's what the rest of the world needs too.

Friday, September 20, 2013

Most Linux system administrators spend their days at the command line,
configuring and monitoring their servers through an SSH session. The
command line is extremely powerful, but it can be difficult to keep all
the option switches and tools in your head. Man pages are only a
command away, but they're often not written for quick consultation, so
when we're stuck for some of the more arcane options, we reach for the
collection of cheat sheets that we've curated over the years.

Even command line masters occasionally need a little help, and we hope
that terminal beginners will find these concise lists useful too. All of
these tools are installed by default on a standard Linux box except for
Vim and Emacs, which may or may not be available (see the package
manager cheat sheets for how to get them).

Server Management

SSH is the standard tool for connecting securely to remote servers on the command line. (We hope you aren't using Telnet.)
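Beyond one-off `ssh user@host` commands, a `~/.ssh/config` file saves retyping connection details. A minimal sketch of such a config fragment (the host alias, address, user, and port below are all made up):

```
# ~/.ssh/config — per-host defaults (all values here are hypothetical)
Host web1
    HostName 192.0.2.10
    User admin
    Port 2222
```

With this in place, `ssh web1` behaves like `ssh -p 2222 admin@192.0.2.10`.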

Screen is a must-have application for those who SSH into multiple
servers or who want multiple sessions on the same server. Somewhat akin
to a window manager for terminals, screen lets users have multiple
command line instances open within the same window.

Bash is the default shell on most Linux distributions (on Ubuntu,
/bin/sh points to the mostly compatible Dash, but Bash remains the
default login shell). It's the glue that holds together all the other
command line tools, and whether you're working interactively or writing
scripts, this Bash cheat sheet will help make you more productive.
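As a small illustration of that glue, here is a sketch combining a variable, a pipeline, command substitution, and a conditional (the host names are made up):

```shell
# count entries starting with "web" in a small, made-up host list
hosts='web1
web2
db1'
count=$(printf '%s\n' "$hosts" | grep -c '^web')
if [ "$count" -gt 1 ]; then
    echo "found $count web hosts"
fi
```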

Cron is a tool for scheduling tasks. Its notation is simple, but if you
don't use it often, it's easy to forget how to set the right times
and intervals.
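The five fields are minute, hour, day of month, month, and day of week, in that order. A couple of hypothetical crontab entries (the script paths are made up):

```
# m h dom mon dow  command
# run a backup at 02:30 every Monday (1 = Monday):
30 2 * * 1  /usr/local/bin/backup.sh
# run a check every 15 minutes:
*/15 * * * *  /usr/local/bin/check.sh
```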

Writing and Manipulating Text

Vim is a powerful editor, and you'll find it or its older brother Vi on
most Linux systems. Vim has a modal interface that can be a bit daunting
for newcomers, but once you get to grips with how it works, it's very
natural.

Emacs is a text editor that throws the "do one thing well" philosophy
out of the window. The range of things that Emacs can do is seemingly
endless, and a good cheat sheet is necessary for getting to grips with
its finger-workout keyboard commands.

As a bonus for the Emacs users out there: check out Org mode. It's a
flexible plain text outliner that integrates with Emacs and can be used
for planning, to-dos, and writing.

Getting to grips with grep is essential if you deal with a lot of text files (as almost everyone managing a Linux server will).
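A quick sketch of the grep options you reach for most often — case-insensitive search (-i), line numbers (-n), and match counting (-c) — run here against an inline sample rather than a real log file:

```shell
# sample input (made up); grep normally reads files or a pipe
log='ok
Error: disk full
ok'
printf '%s\n' "$log" | grep -in 'error'   # prints "2:Error: disk full"
matches=$(printf '%s\n' "$log" | grep -ic 'error')
echo "$matches matching line(s)"
```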

Together Sed and Awk can do just about anything you might want to do with a text file.
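For instance, sed handles stream substitutions while awk shines at column-oriented work; a minimal sketch over made-up two-column data:

```shell
# made-up two-column data: name value
data='alice 10
bob 20'
# sed: substitute text in the stream
printf '%s\n' "$data" | sed 's/alice/carol/'
# awk: sum the second column
total=$(printf '%s\n' "$data" | awk '{ sum += $2 } END { print sum }')
echo "total=$total"   # total=30
```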

Package Management

RPM

Distributions that use RPM for package management, including Fedora, RHEL, and CentOS, have a couple of tools to choose from: Yum for high-level package management, and the RPM tool itself for manipulating and querying the package database at a lower level.

Deb Package Management

Debian-based distros like Ubuntu and its derivatives use "apt-get" for general package management, and "dpkg" for direct manipulation of debs.

Cheaters

If you're a regular user of cheat sheets and manage your servers from a
Mac, you might want to take a look at Brett Terpstra's cheat sheet app. Cheaters is a collection of scripts that will display an Automator-based pop-up containing a configurable selection of cheat sheets.

Check out the instructions on his site to find out how to integrate the
cheat sheets we've covered in this article with Cheaters.