Tech Advent Calendars: a combination of popular tech advent calendars
http://pipes.yahoo.com/pipes/pipe.info?_id=c9a42b2f23c108a801e1afa9ce7df012
Tue, 31 Mar 2015 20:55:30 +0000

[perf] Hardware Accelerated CSS: The Nice vs The Naughty
http://calendar.perfplanet.com/2014/hardware-accelerated-css-the-nice-vs-the-naughty/
Wed, 31 Dec 2014 19:00:52 +0000

Everyone is fascinated with smooth animation at 60 frames per second. It is hardly a surprise that one of the most prevalent pieces of performance advice to web developers is to use hardware-accelerated CSS. In some extreme cases, it is not uncommon to imply that forcing it via translate3d will automatically boost your application performance (hint: it is not the case).

Modern web browsers can take advantage of the GPU (graphics processing unit) to accelerate page rendering. Among many other features, a GPU can hold a limited number of textures (rectangles of pixels) and manipulate those textures efficiently, including applying certain transformations (translation, scaling, rotation, etc). This is extremely useful for achieving fluid animation. Instead of drawing the pixels for every animation frame, the browser will “snapshot” the DOM element and store it as a GPU texture (often called a layer). Later, the browser will simply tell the GPU to transform the said texture to give the perception of an animating DOM element. This is called GPU compositing, naturally referred to as “hardware acceleration”.

Unfortunately, a web browser is a complex piece of software (Firefox comprises millions of lines of code). Because of this, a blanket simplified statement such as “use translate3d for performance” is likely hit-or-miss. It is thus imperative to understand a little more of what happens under the hood so that you can assess each situation better.

Imagine for a moment that using a GPU-accelerated animation is like Vin Diesel driving Dominic’s iconic car, the Dodge Charger. Sure, with its custom 900 hp engine, it goes from 0 to 60 in a blink of an eye. But what good is that when you are crawling along with many other cars on a packed freeway during rush hour? Your choice of vehicle – in this case, the Charger – is just fine. The problem is that you are still at the mercy of the traffic conditions.

The same goes for GPU compositing. Many aspects of the animation still require the intervention of the CPU. After all, this is where the browser code is being executed. The bus connecting the CPU and GPU has a finite bandwidth, hence it is important to pay attention to the data transfer between them to prevent a congested channel. In other words, you should always mind the pixel traffic.

The first and foremost thing to be aware of is the number of composited layers being created. Since every layer is mapped to a GPU texture, having too many layers will exhaust the memory. This may lead to unexpected behavior, anything from frame skipping to a potential crash. Fortunately, you can easily check those layers using the web browser itself. With Firefox, go to about:config and toggle layers.draw-borders to true. If you are a Chrome user, open chrome://flags/#composited-layer-borders and enable it. For Safari fans, first run this in your OS X terminal: defaults write com.apple.Safari IncludeInternalDebugMenu 1. Relaunch Safari and there will be an additional top-level menu, Debug, where you can find Drawing/Compositing Flags, Show Compositing Borders. You can also get the memory consumption of every layer by looking at the Layers sidebar in Web Inspector.

When those web browsers are configured with the appropriate flag, every DOM element that is composited by the GPU will be marked with an additional colored border (as a quick test, try it on this Spinning Cube demo). This way, it is easy to verify whether your web page has too many layers.

Another important aspect of this GPU compositing business is keeping the traffic between the GPU and CPU to a minimum. In other words, the number of layer updates should ideally stay constant. Every time there is an update, a new set of pixels potentially needs to be transferred to the GPU. Thus, for performance reasons, it is important to avoid any layer updates once the animation has started. This is possible by carefully choosing the properties to be animated: transformation (translate, scale, rotate), opacity, or filters.
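As a quick illustration of sticking to compositor-friendly properties, here is a minimal sketch (the element, distances, and timings are made up for the example): only transform and opacity are ever touched, so no repaint or texture re-upload is needed per frame.

```javascript
// Animate using only compositor-friendly properties (transform, opacity).
// Computing the per-frame style values is a pure function, shown here as a
// helper; in a browser you would apply the result inside requestAnimationFrame.
function compositorFrame(progress) {
  // progress runs from 0 to 1 over the life of the animation
  return {
    transform: "translate3d(" + (progress * 200) + "px, 0, 0) " +
               "scale(" + (1 + progress * 0.5) + ")",
    opacity: String(1 - progress * 0.5)
  };
}

// In a browser (not run here):
// function tick(start) {
//   return function step(now) {
//     var progress = Math.min((now - start) / 1000, 1);
//     Object.assign(box.style, compositorFrame(progress));
//     if (progress < 1) requestAnimationFrame(step);
//   };
// }
// requestAnimationFrame(tick(performance.now()));
```

Animating top/left or background-color instead would force a repaint (and hence a texture upload) on every frame.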

If you use Safari’s Web Inspector, the Layers sidebar reveals the layer updates in the Paints field. It indicates how many times Safari uploads a new texture to reflect the content of the layer. Try it on the Colorful Boxes demo, where each box alternates its color between blue and green. Unfortunately, changing the box’s background color forces a layer update, which is why the Paints number keeps increasing during the animation. With just one box it might be fine, but a hundred boxes like that will bring the GPU to its knees. While this is a contrived case, it serves as a reminder that no amount of translate3d will save the day if your pixel traffic is disastrous to begin with!

Necessity is the mother of invention. The constraints of working with layers often lead to creative and surprising ways to exploit the system. A variant of “terraforming” can be achieved by having the initial and final portions of the UI reside in the same layer, with a clipping rectangle showing one part and hiding the other. Another slightly similar illusion is employing two layers superimposed on top of each other. The animation is carried out by simply changing the opacity of both layers so that the result is a tweening of both, as demonstrated in this Glowing Effect demo.

Another common practice for maintaining a reasonable pixel traffic is having a pool of layers. When some layers are not needed, they are not disposed of completely; instead they can be moved off-screen or set to fully transparent. In some cases, the user interface design implicitly permits a finite number of layers. The following screenshot shows the Cover Flow example, where only 9 (nine) images are visible at any given time. Even if it is supposed to display hundreds of book covers (as the user swipes left and right), you do not need to build tons of layers at once. With a little trickery, you can swap the content of a layer with the new image at the right time and the user will not notice it.
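The pooling idea above can be sketched in a few lines (the names here are illustrative, not from any real library): a fixed number of "layers" is created up front and recycled, so the GPU never holds more textures than the pool size.

```javascript
// A minimal layer-pool sketch: never create more than `size` layers;
// recycle and re-content existing ones instead.
function LayerPool(size, createLayer) {
  this.free = [];
  this.inUse = new Set();
  for (var i = 0; i < size; i++) this.free.push(createLayer(i));
}

LayerPool.prototype.acquire = function (content) {
  // When the pool is exhausted, recycle the oldest in-use layer
  // instead of ever creating a new texture.
  var layer = this.free.pop() || this.inUse.values().next().value;
  this.inUse.delete(layer);
  layer.content = content; // swap the content at the right time
  this.inUse.add(layer);
  return layer;
};

LayerPool.prototype.release = function (layer) {
  // Conceptually: move off-screen / make transparent, don't destroy.
  this.inUse.delete(layer);
  this.free.push(layer);
};
```

In the Cover Flow case, a pool of nine such layers would serve hundreds of covers.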

Also, never forget that you must always run a profiler to double-check your theory. Performance is a serious matter, and it would be a mistake to rely only on gut feelings. Chrome users should enable chrome://flags/#show-fps-counter. Its equivalent in Firefox is layers.acceleration.draw-fps in about:config. With the frame rate counter, run your animation and watch it carefully. If the frame rate drops below 60 fps (or whatever threshold you are aiming for), then it is time to investigate the issue. For this, Chrome’s Timeline feature or Safari’s Timelines panel will give detailed insight into every rendering operation: layout, painting, and compositing.

To prevent performance regressions, an automated variant of the above step is necessary. This is where browser-perf from Parashuram becomes extremely handy. As he already wrote a few weeks ago, browser-perf gathers some important rendering statistics from running tests on the web page. In this context, the metrics for layer counts and paint counts are extremely valuable. The data enables you to trigger an alert if those values regress over time.

While many articles have been written on the subject of hardware accelerated CSS, hopefully this post serves as another quick reference on how to (ab)use GPU composited animation in such a way that it does not land us in the naughty list of 2015. Stay out of trouble and happy accelerating!

[perf] Web Beacon Speedup for Improved User Experience
http://calendar.perfplanet.com/2014/web-beacon-speedup-for-improved-user-experience/
Wed, 31 Dec 2014 18:37:43 +0000

Web beacons have been used by site developers to understand the behavior of customers. These beacons are used, among other things, to count the users who visit a web page, track scrolling within the page, or count clicks on a particular ad or video.

The Problem

Sometimes these beacons take too long to respond, or too many of them are fired from the page, thereby slowing down the performance of the site.

Most beacon servers normally have a proxy server fronting them, so all requests pass through it. Let’s look at what we can do to minimize the impact of these beacons on site performance by using the proxy.

Figuring out what is Important

A simple version of a beacon is a tiny clear image that is the size of a pixel. When a web page with this image loads, it will make a call to a server for the image. These clear GIFs are invisible because they record specific activity on a web page rather than deliver content. So what is important is the “recording” part and what is irrelevant is the response from the beacon server (because it stays the same) back to the client.
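On the client side, the "recording" is just a matter of encoding the event in the request URL (a hypothetical sketch; the endpoint and parameter names are made up):

```javascript
// Build the URL for a pixel beacon: the tracked event travels as query
// parameters; the 1x1 GIF response itself is ignored.
function beaconUrl(base, params) {
  var query = Object.keys(params)
    .map(function (k) {
      return encodeURIComponent(k) + "=" + encodeURIComponent(params[k]);
    })
    .join("&");
  return base + "?" + query;
}

// In a browser (not run here), firing the beacon is just:
// new Image().src = beaconUrl("https://beacon.example.com/b.gif",
//                             { event: "scroll", depth: "75%" });
```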

What can we do to return this response faster?

The Secret Sauce

The first ingredient is stale-while-revalidate in the Cache-Control HTTP header. With it, we can instruct the proxy to respond with a cached copy of the beacon response. We can specify a large value (in seconds) for stale-while-revalidate to ensure that this gives real benefits.

The returned Cache-Control header will contain the following: "stale-while-revalidate={big_number}". But what about the “recording”? If we specify stale-while-revalidate, the proxy will not call the server if the cached copy is still fresh (as determined by max-age), but it will asynchronously call the server if the copy is stale. So the second ingredient is to set max-age in the Cache-Control header to 0, thereby making the cached copy always stale. The resulting Cache-Control header will look like the following:

Cache-Control: max-age=0, stale-while-revalidate={big_number}

So every call to the proxy results in two things:

Respond with a cached copy of the response right away

Asynchronously call the server and “record” the action
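The two-step flow above can be sketched as follows (this is a conceptual model in JavaScript, not actual ATS code; the function names are made up): respond from cache immediately, then revalidate in the background so the origin still "records" every hit.

```javascript
// Sketch of a proxy's stale-while-revalidate behavior for beacons.
// `cache` is a Map-like store; `fetchFromOrigin` returns a Promise.
function makeProxy(cache, fetchFromOrigin) {
  return function handle(key) {
    var entry = cache.get(key);
    if (entry) {
      // max-age=0 means the entry is always stale, so every warm hit also
      // triggers a background revalidation that records the beacon.
      fetchFromOrigin(key).then(function (fresh) {
        cache.set(key, fresh);
      });
      return Promise.resolve(entry); // served immediately, no origin wait
    }
    // Cold cache: only the very first request waits on the origin.
    return fetchFromOrigin(key).then(function (fresh) {
      cache.set(key, fresh);
      return fresh;
    });
  };
}
```

After the first request, clients always get the cached pixel instantly while the origin continues to see every hit asynchronously.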

But wait, there is more! The solution can be enhanced further with a few additions.

Stop overwriting cache entries

Since the HTTP response code from the beacon server on a successful call will be a 200, the cache entry will be continuously updated on each asynchronous call. This is unnecessary, since we already know that the response doesn’t change. Besides, writing to the cache continuously can be costly. So once a copy of the response is cached, we need to trick the proxy into not updating the cache.

Use the same cache copy for all URLs to a particular beacon server

Calls to the same beacon server can be different based on the URL query parameters. We can enhance the solution by using the same cache key for all these calls. This saves space in the cache.
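Normalizing every beacon URL to a single cache key can be as simple as stripping the query string (a sketch; real proxies do this with a cache-key configuration rather than application code):

```javascript
// Map every variant of a beacon URL to one shared cache entry.
function cacheKeyFor(url) {
  // /b.gif?event=click and /b.gif?event=scroll share the same entry,
  // since the cached response is identical for all of them.
  var queryStart = url.indexOf("?");
  return queryStart === -1 ? url : url.slice(0, queryStart);
}
```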

Cache pinning

We can prevent the old cached entries from cleanup in the proxy by using options such as Cache Pinning (in ATS) which ensures that certain objects stay put in the cache for a given amount of time.

Results

We did a test on one of the sites in Yahoo and saw an improvement close to 90% in the beacon response times.

As you can see, we can improve the performance of the site greatly by creatively using a proxy and making no change to the beacon servers themselves.

In conclusion

At Yahoo, we use Apache Traffic Server (ATS) as a proxy fronting almost all of our sites and beacon servers. ATS has the flexibility (via configurations and plugins) to allow us to implement all of the steps mentioned above. The results published above were tested against a Yahoo beacon server fronted by ATS.

Note: The open source stale-while-revalidate (SWR) plugin is currently not working, but Yahoo has a working version of this plugin. We intend to either contribute towards making SWR work in ATS or submit our plugin to open source soon.

[perf] Leverage Browser Storage For a Faster Web
http://calendar.perfplanet.com/2014/leverage-browser-storage-for-a-faster-web/
Tue, 30 Dec 2014 23:22:04 +0000

One of the cardinal rules of web performance is to reduce HTTP requests. The common interpretation of reducing HTTP requests limits the focus to bundling and minifying scripts, creating image sprites, and eliminating unused resources. Developers often overlook AJAX requests. Many AJAX calls GET the same, unchanged data as previous requests. Today’s rich web applications, like single-page applications, rely on numerous AJAX calls to retrieve data before it is rendered. Untamed, these AJAX requests create unnecessary chatter between client instances and the server.

All modern browsers provide at least two ways to cache and persist data locally: localStorage and IndexedDB (WebSQL is still around, but is deprecated). Browser storage provides a mechanism to cache data, allowing us to avoid many costly HTTP requests. Eliminating these requests makes our applications perform much faster and helps servers scale.

A few years ago Paul Irish published an example using localStorage and jQuery’s ajaxPrefilter mechanism that caches AJAX GET request results. Paul’s prefilter checks whether the desired data exists in localStorage before the request to the server is made. If the data exists, the prefilter cancels the AJAX call and executes the callback, passing it the cached data. If the data is not cached locally, the AJAX call is made and the response data is cached in localStorage for future requests.

The ajaxPrefilter is a simple but effective mechanism to reduce AJAX calls. Production applications need various levels of sophistication, because each piece of data needs a different time to live (TTL) or may not need to be cached at all. When designing a local caching architecture, developers should account for the application’s data needs.

The ajaxPrefilter is a great way to hook into the jQuery AJAX pipeline, but more and more applications do not rely on jQuery. In addition, a common caching library allows developers to reuse local caching across various data service modules and applications in a framework-agnostic manner. A common module I use implements a known interface of four functions:

setItem(key, value, ttl)

setObject(key, value, ttl)

getItem(key)

getObject(key)

The setObject and getObject functions call setItem and getItem internally, but abstract away the JSON.stringify and JSON.parse calls, because this functionality is repeated in many applications. The setItem function adds the value to localStorage using the key as the index. If a ttl value has been provided to the cache module, an additional time-to-live entry is also written to localStorage under a derived ttl key.

The ttl key’s value is used to test whether the data is stale. The value is a millisecond timestamp set to a time in the future. The value used in this article’s example is 24 hours, or 86400000 milliseconds. If a request is made after the TTL has passed, the cache library returns null and the stale data is removed from localStorage.
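The four-function interface described above might look something like the following sketch. To keep it runnable outside the browser, the storage backend and the clock are injectable (in a page you would pass window.localStorage and omit the clock); the "<key>-ttl" naming convention is an assumption, not the article's actual module.

```javascript
// Minimal cache module implementing setItem/getItem/setObject/getObject
// with optional time-to-live. `storage` must expose setItem/getItem/removeItem.
function makeCache(storage, now) {
  now = now || Date.now;
  return {
    setItem: function (key, value, ttl) {
      storage.setItem(key, value);
      // record expiry alongside the value when a ttl is given
      if (ttl) storage.setItem(key + "-ttl", String(now() + ttl));
    },
    getItem: function (key) {
      var expires = storage.getItem(key + "-ttl");
      if (expires && now() > Number(expires)) {
        // stale: purge both entries and miss, forcing a fresh fetch
        storage.removeItem(key);
        storage.removeItem(key + "-ttl");
        return null;
      }
      return storage.getItem(key);
    },
    setObject: function (key, value, ttl) {
      this.setItem(key, JSON.stringify(value), ttl);
    },
    getObject: function (key) {
      var raw = this.getItem(key);
      return raw === null ? null : JSON.parse(raw);
    }
  };
}
```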

The following code block demonstrates how to use a cache service to retrieve locally stored data or make an AJAX call when a local instance of the data is not available. The code uses the reqwest library to make AJAX calls and is part of larger module with values defined elsewhere. The example uses the Rotten Tomatoes API to retrieve a list of movies opening this week.

getNewMovies: function (callback) {
    var movie = this,
        url = rtRoot + "lists/movies/opening.json?apikey=" +
              apiKey + "&page_limit=" +
              defaultPageLimit + "&page=1",
        // use the cache provider to see if the data is available
        cached = movie.cache.getObject(newMovies);

    // if a local version is available, use it and make the callback
    if (cached) {
        if (callback) {
            callback(cached);
        }
    } else {
        // no local version of the data is available, make the AJAX call
        reqwest({
            url: url,
            type: "jsonp"
        })
        .then(function (resp) {
            // store the data in the local cache
            movie.cache.setObject(newMovies, resp.movies, moviesTTL);
            // make the success callback
            if (callback) {
                callback(resp.movies);
            }
        })
        .fail(function (e) {
            console.error("just look at what you have done!");
        });
    }
},

Using the developer tools you can check localStorage before an initial request is made. Here there are no values stored in localStorage:

Loading the page causes an AJAX request to be made:

Checking localStorage again shows we have the list of movies and a corresponding TTL value cached in localStorage:

On a second page load, we see that no AJAX call is made to retrieve the freshly cached data:

As you can see, there is no additional request made to load the unchanged data. The client experience is virtually instant because the data is retrieved from within the browser. Because the data is stored in localStorage, it also persists across sessions and does not increase the application’s memory footprint. If data needs to be purged when a user session ends, it can be stored in sessionStorage instead. The TTL key gives the data a life cycle, causing it to eventually become stale. When the data is stale, a request to the server is made, retrieving fresh data to be cached locally.

Caching data locally not only creates faster applications, but applications that scale better. When the server does not need to respond to excess API calls it can process other calls more efficiently. It also means online services do not need as many servers. This reduces business overhead because hosting fees are lower and IT does not need to respond to as many scaling issues.

This example focuses on localStorage, but sessionStorage and IndexedDB could also be used. The localStorage caching library is configured to use either localStorage (the default) or sessionStorage. A sibling library implementing the same getObject and setObject functions could be substituted to use IndexedDB as the data store.

[perf] Speeding up HTTPS with session resumption
http://calendar.perfplanet.com/2014/speeding-up-https-with-session-resumption/
Tue, 30 Dec 2014 21:07:21 +0000

One of the main drawbacks of HTTPS is the time it takes to set up a connection. Specifically, every new TLS connection requires a handshake in order to establish shared encryption keys. This handshake requires two extra round trips on top of the standard TCP handshake round trip. On a high-latency connection, waiting for three round trips before the first byte can be transferred can make sites appear to load slowly.

A TLS handshake.

TLS has several features that can be used to eliminate round trips when resuming a session. The two standardized session resumption mechanisms are session IDs (RFC 5246) and session tickets (RFC 5077). Using either technique, a client can resume a previously established session with a server using an abbreviated handshake, saving one round trip.

Session resumption based on session ID is available in all modern browsers. Both Firefox and Chrome also support session tickets. Support on the server side is also widespread, with nginx, Apache, HAProxy, IIS and others supporting both session IDs and session tickets natively.

Session ID resumption

Resuming an encrypted session is easy if both client and server keep the session keys around. By giving every connection a unique identifier, the server can know if an incoming connection has been seen before. If the server still has the session keys used in that session, it can be resumed.

An abbreviated handshake with session ID resumption

Session IDs require the server to keep the session state (i.e. the session keys) ready in case a previous session needs to be resumed. This requires the server to store a lot of state information, which can require significant amounts of memory.

Session ID sharing is available in Apache through the SSLSessionCache directive and nginx through the ssl_session_cache directive.
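In nginx, for example, a shared session-ID cache can be enabled with something like the following (the cache size and timeout here are arbitrary example values, not recommendations):

```nginx
# A cache shared by all worker processes; nginx's documentation notes
# that one megabyte can hold about 4000 sessions.
ssl_session_cache   shared:SSL:10m;
ssl_session_timeout 10m;
```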

Session ticket resumption

In session ticket resumption, the server doesn’t have to store state information for every session it has ever created. Instead, it saves the state in a blob of data and gives it to the client to maintain. Session tickets allow the server to outsource the storage of some of its state to clients, similar to the way HTTP cookies are sometimes used for authentication information.

A session ticket is an encrypted blob of data containing the information needed to resume a TLS connection (i.e. the session keys). It is typically encrypted with a “ticket key” known only to the server. The server sends a session ticket to the client during the initial handshake for it to store locally. When resuming a session, the client sends the session ticket back to the server which decrypts it and resumes the session.

An abbreviated handshake with session ticket resumption

Security Considerations of Session Tickets

If session tickets are implemented improperly there is a potential security downside. Some TLS cipher suites (such as ECDHE-RSA-AES128-SHA256) offer a security property called forward secrecy. If an attacker gets access to a server’s certificate private key, they can’t take past conversations and decrypt them.

With TLS session tickets, stealing the ticket key does allow an attacker to decrypt previous conversations. This makes the ticket key very valuable, and using the same key for too long compromises forward secrecy. To preserve forward secrecy, ticket keys should be rotated often.

Session ticket resumption is available in Apache through the SSLTicketKeyDefault directive and nginx through the ssl_session_tickets directive. There is no automated way to rotate ticket keys at the moment, but restarting your Apache or nginx instance will cause it to either reload the key from disk or create a new random key.

Load balancing

One of the challenges faced when using these techniques at scale is load balancing. For one server to resume a connection, it needs the key from the previous session. If the previous session was on a different server, the new server needs to get the original session keys somehow.

The approach to this problem taken by both CloudFlare and Twitter is to use a centralized key generation system. Ticket keys are created in a centralized location on a fixed schedule and securely distributed to all the servers used to load balance TLS connections. Implementing session ticket sharing requires a custom system tailored to your architecture.

Conclusions

Reducing the number of roundtrips necessary to establish a connection makes sites appear to load more quickly. For sites using HTTPS, session resumption can be used to improve connection establishment times for returning visitors on all browsers. When implemented correctly, they can provide a noticeable improvement to page load times, even in load-balanced environments.

[perf] Blast from the Past: Impact of TCP Components on Time to First Byte
http://calendar.perfplanet.com/2014/blast-from-the-past-impact-of-tcp-components-on-time-to-first-byte/
Mon, 29 Dec 2014 23:50:54 +0000

Season’s Greetings!

There are many reasons why your first byte can be slow, but I am going to talk about a very specific interaction that’s well known to network geeks but could use some circulation among front-end developers, for it happens to be in the critical path of the browser. In particular, it has a tendency to affect the boundaries at which the SSL record layer hands off control to the HTTP layer.

It is not a bad idea to refresh your basics of TCP before diving in to read the rest of this article.

Background & Motivation

As you can see, we spend 120 bytes of TCP header overhead to transport 1 byte of data. To prevent this gross inefficiency/underutilization of the network, the following distributed solution was implemented:

Sender
– delay transmission of partially filled segments until all previously transmitted data has been acknowledged (Nagle’s algorithm)

Receiver
– delay acknowledgements, hoping to piggy-back them on a data-bearing outgoing packet (delayed ACKs)

The BSD folks developed the client-side solution, and the server side was implemented by John Nagle. The deadlock introduced by the interaction of the two components was not noticed until it got deployed in real-world production networks.

Delayed Acknowledgements

Traditional TCP implementations hold up acknowledgements for up to 200ms, hoping to piggy-back the ACK on a data-bearing outgoing packet. However, an immediate ACK is generated after every second full-size (MSS-sized) packet. The motivation for delayed ACKs is to save bandwidth. In exchange, we incur the following costs:

* During slow start and congestion avoidance, congestion window growth is driven by the number of acknowledgements received. For bulk transfers, you tend to get an immediate ACK every second packet – so the sender sees half as many ACKs. This means that slow start is significantly slower than the “double cwnd every roundtrip” that you read about in textbooks. The naive expectation is to send 2, 4, 8, 16 segments in the first 4 roundtrips, whereas in reality you would see a pattern of 2, 3, 5, 8 segments per roundtrip with this in effect. The effect on congestion avoidance is also significant: the usual TCP throughput bound (for high bandwidth, large receiver windows, low-but-nonzero packet loss) is reduced by a factor of 1/sqrt(2) with delayed ACKs on.

* Various timer-related anomalies. For instance, if you were to let RTO fall below 200ms, the sender may decide a packet is lost when the receiver was simply holding up an ACK. Hence, TCP stacks generally don’t let RTO ever fall below 200ms, even when this would otherwise be desirable, such as on a small RTT link.
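The slower cwnd growth described in the first point can be checked with a toy model (a deliberate simplification: each ACK grows cwnd by one segment, and with delayed ACKs the receiver generates roughly one ACK per two segments, the last odd segment being acknowledged after the delay timer):

```javascript
// Simplified slow-start model: each round trip the sender sends cwnd
// segments, then grows cwnd by the number of ACKs received that round.
function slowStart(rounds, acksPerRound) {
  var cwnd = 2, sent = [];
  for (var i = 0; i < rounds; i++) {
    sent.push(cwnd);
    cwnd += acksPerRound(cwnd);
  }
  return sent;
}

// Textbook: one ACK per segment.
var textbook = slowStart(4, function (cwnd) { return cwnd; });          // [2, 4, 8, 16]
// Delayed ACKs: roughly one ACK per two segments.
var delayed = slowStart(4, function (cwnd) { return Math.ceil(cwnd / 2); }); // [2, 3, 5, 8]
```

The model reproduces the 2, 4, 8, 16 versus 2, 3, 5, 8 patterns quoted above.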

Linux adjusts this delayed ACK timeout to the inter-arrival spacing by sampling around a small time interval like 15-20ms, and enforces a typical maximum that’s much less than 200ms (around 40ms of delay is common). The Windows network stack, however, sets the maximum timeout at 200ms, with a rapid increase during the connection lifetime.

Note that in the case of request/response traffic such as HTTP, there is no hope of piggybacking ACKs on data anyway, so one of the key motivations for delayed ACKs is gone. On relatively high-bandwidth links, the bandwidth cost of additional ACKs is small. Overall, it is almost certain that delayed ACKs are more complicated to implement than the (small) gain warrants.

Nagle’s Algorithm

The first standard describing this algorithm is RFC 896, which says: send no new segments (of any size!) when new data arrives from the user while there are any unacknowledged segments (of any size!). This worked well. However, RFC 1122 relaxed the original central clause and allows you to send out full segments early. Now the significance of the outstanding packet size became controversial among implementers as to what constitutes a full-sized packet (MSS, PMTUD, the receiver not knowing the sender’s MSS, etc). I won’t go into the gory details of why the MSS calculation is messy, but rest assured that Nagle’s algorithm is not consistently implemented across stacks thanks to this confusion.

One could argue that Nagle is irrelevant to today’s Internet, and that floods of small packets are no longer a problem worth solving. This argument fails on at least three grounds:

Many people connect to the network over wireless links, which usually are both slow and shared

Even on fast links, excessive use of small packets makes inefficient use of expensive resources, such as routers

Nagle’s algorithm is a useful firewall against sloppy applications or complex bugs that would otherwise send too many tiny packets.

There was a proposal to rethink the Nagle algorithm which seems promising. Mac OS X is the only OS that has integrated this modification into its kernel, but given that this is a server-side option, it would be interesting to see the same on Linux to gauge its impact.

The Classic Deadlock Problem

Now that we have seen the components of TCP, let’s see how their interaction causes a deadlock. Instead of a contrived example to demonstrate this classic problem, let’s take a look at it in the wild. The thread has a lot of details, from WPT waterfalls to tcpdumps. The TL;DR can be read here if you are impatient.

The problem can be summarized as follows:

Nagle’s algorithm kicks in and starts to wait for its in-flight data to be acknowledged before sending more. Delayed ACK applies per packet: an even number of full-sized packets will trigger an ACK immediately, but an odd number of packets has to pay the delayed ACK timeout tax, which is 200ms in the case here using a Windows WPT client.

To summarize the performance-killing effect:

Nagle/Delayed ack Interaction:

Nagle won’t send the last bit of data until it gets an ACK

Delayed ACK won’t send that ACK until it gets some response data

TCP won’t release the response to the application until it gets all the data – and so on, until a timeout occurs

Standard Solution

This classic problem does not occur with non-persistent HTTP requests, because closing the TCP connection also immediately sends any data waiting for transmission. For persistent connections, this can be resolved by disabling Nagle’s algorithm, thus disabling the aspect of SWS (silly window syndrome) avoidance that interferes with performance. If traffic is predominantly HTTP-based, disabling Nagle’s algorithm in the TCP stack may generate a slightly larger number of packets, but throughput will usually be better. People routinely enable a socket option called TCP_NODELAY that effectively disables Nagle’s algorithm and lets the server send the data without waiting for the acknowledgement from the client.

Analytically, if the bottleneck is around 200ms per transaction, then you can only do 5 such transactions per second, compared to at least 15 (assuming each transaction takes 62ms), as seen in the following example.
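The back-of-the-envelope arithmetic above is simply:

```javascript
// Transactions per second on a single connection where each transaction
// is serialized behind the previous one.
function transactionsPerSecond(msPerTransaction) {
  return Math.floor(1000 / msPerTransaction);
}

var withNagleStall = transactionsPerSecond(200); // 5 per second
var withoutStall = transactionsPerSecond(62);    // 16 per second
```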

Nagle Turned ON:


Nagle Turned OFF:

Returning to the original poster’s problem: disabling Nagle did not work as intended, and it took a bugfix from the nginx folks to make it work.

Variants of the Problem

This problem, albeit reported in October 2014, has a precedent: in 1997 Heidemann investigated performance problems of persistent connections. Specifically, an object that occupies an odd number of full MSS packets followed by a short packet on a persistent connection will incur a 200 ms delay, which doesn’t occur if the object had an even number of large packets. This is common at the beginning or the end of a transfer, with a potential occurrence in between:

Short initial segment problem:

Slow start kicks off with the initial congestion window (2 in his time, 10 currently)

TCP delays ACKs by up to 200 ms but acknowledges at least every other full-sized segment, so an even number of full segments is ACKed promptly; when the first packet is smaller than the MSS, however, the ACK covering the first one to three packets can be held back for the full delay

The broader observation made in the paper is that delaying ACK (hope to piggyback) is rarely successful in request/response traffic. (They note that delayed ACK is also bad for FTP).

Odd/Short-Final-Segment problem

Suppose we have an odd number of full segments and a short final segment. The sender won’t send this final segment while it is small (less than half the client’s advertised window), because of silly window avoidance plus Nagle, until it sees an ACK. But, for the same reasons as above, the client will delay the ACK. So again, a roughly 200 ms delay.

Slow Start Restart Problem

If all data is acknowledged and no more data is sent for one retransmission timeout period, the congestion window is reset to its initial value and slow start repeats, since the network information might be out of date.

This is why persistent connections are not as effective as one might think. People usually disable the tcp_slow_start_after_idle knob to prevent this behavior; this is important for single-connection transfers like SPDY.

SSL Impact

My own brush with this problem took the form pictured below:

The two gray shaded areas above represent the slowness during the SSL handshake with a WPT client. We were at a loss to understand why the SSL handshake was so slow, particularly when dealing with Windows clients; now you all know the answer.

The Solution Space

The best way to attack this problem is in the application space, rather than tweaking the transport globally. For example, any application that writes to the network can selectively disable Nagle only on its last write, which improves performance while keeping Nagle’s benefits for the other writes.
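A minimal sketch of that selective approach, in Python with a hypothetical send_all_flush_last helper (the name is mine, not from the post): Nagle stays on for intermediate writes and is toggled off only for the final one:

```python
import socket

def send_all_flush_last(sock, chunks):
    """Send a sequence of buffers, disabling Nagle only for the final write.

    Intermediate writes keep Nagle's batching. Setting TCP_NODELAY just
    before the last write pushes any buffered tail segment onto the wire
    without waiting for the peer's (possibly delayed) ACK.
    """
    for chunk in chunks[:-1]:
        sock.sendall(chunk)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    sock.sendall(chunks[-1])
    # Re-enable Nagle for subsequent request/response rounds.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 0)

# Loopback demo.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)
cli = socket.create_connection(srv.getsockname())
conn, _ = srv.accept()
send_all_flush_last(cli, [b"HTTP/1.1 200 OK\r\n\r\n", b"tail"])
received = b""
while len(received) < 23:
    received += conn.recv(1024)
print(received)
for s in (cli, conn, srv):
    s.close()
```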

Most of the time these days, the underlying framework does the writes for you (nginx, Apache, OpenSSL, etc.), but given the rising trend of microservices, the following tactics can be employed to avoid these deadlocks if you are writing to the network yourself

Application Write Strategies :

Use Vectored (Scatter/Gather) I/O if a single module does all the writes to network

Investigate TCP_CORK and MSG_MORE semantics as they fit your application, to make sure you never send any segment smaller than the MSS

Microsoft and Apple could reduce the delayed-ACK timeout in their implementations, as 200 ms is an arbitrary choice based on the original estimate of segment inter-arrival time, for example by adopting an adaptive policy

Linux could implement the modifications to Nagle proposed by Minshall and Mogul, although that won’t help the SSL interaction issues
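The first strategy can be sketched briefly: with vectored (gather) I/O, one sendmsg call hands the kernel every buffer at once, so TCP never sees a lone undersized trailing write. The helper name here is mine, not from the post; on Linux, socket.MSG_MORE or TCP_CORK can additionally hint that more data follows:

```python
import socket

def vectored_send(sock, buffers):
    """Hand the kernel every buffer in one sendmsg() call (gather write).

    A single syscall lets TCP segment the combined payload optimally,
    avoiding a small trailing write that would trip the Nagle/delayed-ACK
    interaction.
    """
    return sock.sendmsg(buffers)

# Demo over a local socket pair (Unix; sendmsg is not available on Windows).
a, b = socket.socketpair()
sent = vectored_send(a, [b"header: 1\r\n", b"\r\n", b"body"])
print(sent)        # 17: all three buffers in one syscall
print(b.recv(64))  # b'header: 1\r\n\r\nbody'
a.close(); b.close()
```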

Final Thoughts & Conclusion

None of the classic TCP modeling efforts factor in the effect of delayed acknowledgements. Note that most web traffic flows never leave the slow-start phase, so the impact of delayed acknowledgements on short web flows is quite severe.

Most people tend to disable Nagle rather than delayed acknowledgements, because they can easily control the server side but have no control over the clients. There is a control knob at the edge servers of your CDN, where I have seen Nagle disabled with no deleterious effects.

So the last frontier is to selectively disable delayed acknowledgements at the client, rather than pay the tax of these messy interactions. Personally, I would like to study how much we lose (my guess is at most 1-2 RTTs for long flows and nothing for short flows) if we did disable delayed acknowledgements. This would put a number on the cost of SWS avoidance, so we could do a cost-benefit analysis.

A delayed ACK is a bet. You’re betting that there will be a reply, upon which an ACK can be piggybacked, before the fixed timer runs out. If the fixed timer runs out, and you have to send an ACK as a separate message, you lost the bet. Current TCP implementations will happily lose that bet forever without turning off the ACK delay. That’s just wrong.
The right answer is to track wins and losses on delayed and non-delayed ACKs. Don’t turn on ACK delay unless you’re sending a lot of non-delayed ACKs closely followed by packets on which the ACK could have been piggybacked. Turn it off when a delayed ACK has to be sent.

I should have pushed for this in the 1980s.

Thank you for reading thus far and I hope this conveyed the messy and somewhat surprising interactions among TCP components that affect your web performance timings.

Happy Holidays and Hopefully 2015 will see less of these problems

]]>performance[perf] Different Angles of Web Performancehttp://calendar.perfplanet.com/2014/different-angles-of-web-performance/
When people are talking about web performance, they may talk about different aspects of the subject depending on their role and the task on hand. The real life is rather messy, so we use abstractions that let us get away from details not important for the moment. The same reality may look quite differently depending [&#8230;]http://calendar.perfplanet.com/?p=1807Mon, 29 Dec 2014 17:30:34 +0000When people are talking about web performance, they may talk about different aspects of the subject depending on their role and the task on hand. The real life is rather messy, so we use abstractions that let us get away from details not important for the moment. The same reality may look quite differently depending on how we look at it. Adjusting our view for our specific needs, we probably may highlight four major angles to look at web performance.

1. How Fast is Fast Enough?

That angle of performance focuses on the requirement side: what performance should be, usually without diving into implementation details. The traditional approach here usually discusses system usability and user perception of performance. Older research (which can be traced back at least to Robert Miller’s paper published in 1968, still very good reading) usually focused on how fast the system should be to optimize user productivity (the typical scenario included users working with an internal system all the time, for example entering orders). Most researchers agreed that there are several threshold levels of human attention fundamental for human-computer interaction.

While it doesn’t look like anybody attacks the idea of several fundamental threshold levels of human attention, the specific numbers vary. Some reports suggest that response time expectations increase with time. Forrester research from 2009 suggests a two-second response time; in 2006 similar research suggested four seconds (both research efforts were sponsored by Akamai, a provider of web-accelerating solutions). While the trend probably exists (at least for Internet and mobile applications, where expectations have changed a lot recently), the approach of these reports has often been questioned because they just asked users, and it is known that user perception of time may be misleading.

While we still have users working with internal systems (probably many more now), that is not the focus anymore. All discussions are about sales now (or, more generically, conversions), and the typical model is free users in the Internet jungle, free to select where to go and what to abandon.

While it is important to know and understand these data, we shouldn’t forget that response time expectations depend on the number of elements viewed, the repetitiveness of the task, user assumptions of what the system is doing (see, for example, How Fast Is Fast Enough by Peter Sevcik), and interface interactions with the user (see, for example, An Introduction to Perceived Performance by Matt West). Stating a standard without specifying what page we are talking about may be an oversimplification.

We should be even more careful with statements about how much a change in performance will cost you in sales and conversions. The published numbers are important data points giving us an idea of what the relation can be, but we shouldn’t assume that it would be exactly the same in our specific case.

Discussing this “how fast is fast enough” angle, we see that it usually doesn’t get into details and often results in mantra-like statements; details are either absent altogether or you need to apply a lot of effort to find what is really behind the numbers. This angle concentrates on the requirements and the cost of deviations from them, leaving most other details (such as specific metrics, ways of aggregation, and levels of load) out of the discussion.

2. Web Performance Optimization

Another angle is Web Performance Optimization (WPO) which, in a way, established itself as a separate engineering discipline. WPO looks into minute performance details of every element of a specific page. Basically we analyze single-user performance, focusing on the front end, for specific pages and client configurations. We are looking into where exactly time is spent – while usually abstract from other details such as variability of response times or the level of load on the system. WPO is the central topic of the Performance Calendar, so I’d rather leave further description of this angle to experts in the field.

For the purpose of this discussion, I want only to highlight that here we have discussions on how exactly we should measure performance, as we have many relevant metrics for a web page (not to mention that a user action may not result in loading a web page at all, and we will probably have more such interactions in the future). See, for example, A non-geeky guide to understanding performance measurement terms by Joshua Bixby or Moving beyond window.onload() by Steve Souders for a discussion of the available options. The topic of web performance metrics has been getting a lot of attention recently, with several new approaches suggested, such as the Speed Index (see, for example, Measuring web performance by Patrick Meenan).

3. Presenting Data

The third angle is data aggregation and presentation: how do we monitor, analyze, and report? Response times, even if we agree on how to measure them, are not a single number. Not even a few numbers for typical configurations. It is a huge array of individual response times, with at least one number for every single action of every single user (or several, if we measure different metrics). Full raw data are just not comprehensible by a human mind; you need a way to aggregate, present, and visualize this information to make it useful.

The problem is that whatever way you aggregate information, you lose some. Different ways of aggregation (averages, percentiles, min and max values, etc.) have their pros and cons, but none is ideal. Rigorous Performance Testing on the Web by Grant Ellis has a nice discussion of the topic starting at slide 26, up to using histograms and CDFs (Cumulative Distribution Functions). We need different ways of aggregation for different purposes. For example, to track down issues we need a way to slice and dice information to narrow down the problematic area. In this case you need access to rather granular data, because if problematic results were averaged with other data, they would be practically useless for further analysis.

A completely different task is high-level reporting of a system’s health. You want to see the whole picture and the overall trend at once. No ideal solution has been suggested here either. One of the most interesting approaches is Apdex (Application Performance Index). While many are skeptical about it, and it looks like not much has happened with Apdex for many years, it still attracts a lot of interest. For example, Apdex is used by New Relic and, with some modification, by Dynatrace as the User Experience Index.
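For reference, the Apdex score itself is a simple aggregation; assuming the standard thresholds (satisfied up to T, tolerating up to 4T), a sketch looks like this:

```python
def apdex(response_times, t):
    """Apdex score: (satisfied + tolerating / 2) / total samples.

    satisfied: response <= t; tolerating: t < response <= 4t;
    anything slower counts as frustrated.
    """
    satisfied = sum(1 for r in response_times if r <= t)
    tolerating = sum(1 for r in response_times if t < r <= 4 * t)
    return (satisfied + tolerating / 2) / len(response_times)

# 2 satisfied, 1 tolerating, 1 frustrated -> (2 + 0.5) / 4
print(apdex([0.3, 0.5, 1.2, 6.0], t=0.5))  # 0.625
```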

So we have different ways to aggregate and report information, and we need all of them: from a high-level health indicator to deep-level slicing and dicing of information to get to specific issues. No ideal solutions have been found yet, but it looks like this angle has gotten a lot of interest recently as a new generation of monitoring products reaches maturity.

4. Load and Scalability

The fourth angle is load and scalability. It is most used in realms of back-end design and development, load testing, and back-end monitoring. For a high-level summary, see, for example, Andy Hawkes’ post When 80/20 Becomes 20/80 and my post Performance vs. Scalability.

There are systems nowadays, with parallelized architectures and auto-scaling, where load may not noticeably impact response times in normal modes of operation (if we forget about third-party components and services). In such cases this angle may be less important. But, unfortunately, such systems are much rarer in real life than it may seem from Internet discussions, and when you see such a system, it means that somebody did a very good job designing and optimizing the back end.

Why Do We Care?

These different angles are useful abstractions to concentrate on what is important at the moment and to get past the excessive details of real life. In a way, they are four different dimensions, and they may be considered orthogonal for some particular tasks. But in general they are not, and in reality they are heavily interconnected at some levels. So it is important to remember that the subject has other facets you may need to factor in.

Ideally performance should be addressed at all phases of system lifecycle: from the very beginning (performance requirements, how fast the system should be and how much load to handle) to design and development (using scalable design and using performance good practices, both back-end and front-end) to testing (for both single-user performance and load) to support and maintenance (closely monitoring performance in production and providing input for both development and testing for further improvement). We look at performance from different angles depending on lifecycle phases and task on hand – but we need all of them for a holistic view.

]]>performance[perf] MozJPEG 3.0http://calendar.perfplanet.com/2014/mozjpeg-3-0/
Mozilla has done a study of image formats and concluded that WebP and JPEG XR are not a big-enough improvement over well-optimized JPEG. In the study only HEVC (H.265) was significantly better, but it&#8217;s a patent-encumbered format, so it can&#8217;t be used freely (shhhh!) It seems that Mozilla has a short-term and a long-term plan [&#8230;]http://calendar.perfplanet.com/?p=1781Sun, 28 Dec 2014 19:05:58 +0000Mozilla has done a study of image formats and concluded that WebP and JPEG XR are not a big-enough improvement over well-optimized JPEG. In the study only HEVC (H.265) was significantly better, but it’s a patent-encumbered format, so it can’t be used freely (shhhh!)

It seems that Mozilla has a short-term and a long-term plan for image compression. They’re sponsoring development of the Daala codec, which is technically very interesting, but not production-ready yet.

For the short term Mozilla has developed MozJPEG — a modernized JPEG encoder that offers better compression while remaining fully standard-compliant, so it’s compatible with all browsers, operating systems and native apps, and you can use it today without waiting for the whole world to upgrade (BTW: if you need the same for images with alpha channel, try lossy PNG).

MozJPEG features

Advanced compression

In addition to all standard libjpeg optimizations MozJPEG performs progressive scan optimization—a trick from the jpegcrush/jpegrescan tool. Parameters of progressive JPEG passes influence compression, so it’s possible to make smaller files by tuning them just right. It’s like having ImageOptim built-in.

MozJPEG also brings to JPEG a technique from modern video codecs called trellis quantization. Simple JPEG encoders discard a fixed amount of detail according to the quality setting, but MozJPEG looks at how many bits it would cost to write each detail and discards the details that compress the least. This makes files much smaller, but also tends to make images look softer (which, depending on the image, may be a good or a bad thing).

Cleaner black-on-white text and lines

JPEG is known for being terrible for compression of text and cartoons. Well, I’ve fixed that (partially): now MozJPEG won’t create ugly gray halos around high-contrast edges on a white background.

libjpeg 6b (6.2 KB): cjpeg -sample 1x1 -quality 16

Deringing disabled (6 KB): cjpeg -noovershoot ...

MozJPEG (6 KB): cjpeg -sample 1x1 -quality 35.5

BTW: In this case JPEG’s file size is about the same as 32-color PNG’s! Image by xkcd.

Smoother high-DPI “compressive” images

JPEG files have quantization tables that allow fine-tuning of quality for various levels of detail. It’s like an audio equalizer with 64 knobs: e.g. you can choose whether you want sharper edges or less blocky gradients.

Modification of quantization tables used to be a hidden option reserved for the most advanced users. MozJPEG made it more accessible by adding a few good presets. It’s especially useful for high-resolution images. Standard JPEG tables have been tuned for low-DPI displays, and for compressive images tend to put too much emphasis on tiny details over accurate color reproduction.

ImageMagick (20.8 KB): convert -quality 18

MozJPEG (20.6 KB): cjpeg -quant-table 2 -quality 29.4

To be fair, technically you can get a decent result from ImageMagick, but it takes a few obscure options and an XML file.

libjpeg and libjpeg-turbo compatibility

MozJPEG is binary-compatible with both libjpeg-turbo and the classic libjpeg. This makes it possible to use MozJPEG as a drop-in replacement for libjpeg, and even install it as a system-wide replacement.

The latest release has been made with help of libjpeg-turbo maintainer. MozJPEG and libjpeg-turbo adopted a common future-proof API that allows programs to be compatible with both libraries and still take advantage of each library’s specific features.

How can I use it?

Ask makers of your favorite graphics programs to integrate MozJPEG! No, really. MozJPEG is a library for developers, and on its own is not useful to people who don’t like compiling C programs.

Run ./configure && make in mozjpeg source directory (you’ll need a compiler installed.)

Hope you don’t get a million compilation errors.

If it succeeds, you’ll be able to run ./cjpeg in this directory (the ./ prefix is important, otherwise you may invoke an old version bundled with your system.)

You can run sudo make install if you want cjpeg (and other mozjpeg libraries/tools) installed system-wide.

And then:

./cjpeg -quality 70 -outfile compressed-image.jpg source-image.png

The -quality option accepts fractional numbers (necessary if you want to make a fair benchmark) and two numbers separated by commas to set quality of brightness and color separately, e.g. -quality 60,70.

When piping from ImageMagick (e.g. convert source-image.png TGA:- | ./cjpeg -quality 70), the TGA:- | bit tells ImageMagick to quickly and cheaply pipe uncompressed data to cjpeg, so you don’t waste CPU on generating a temporary PNG file.

Happy compressing

Xiph.org’s researcher Tim Terriberry calls JPEG an “alien technology from the future”. JPEG, designed over 20 years ago, has got so many details right that it still remains competitive today, despite being much simpler and faster to decode than newer formats trying to dethrone it. And MozJPEG isn’t done yet!

]]>images[perf] Saving Money by Investing in Performance: A Financial Modelhttp://calendar.perfplanet.com/2014/saving-money-by-investing-in-performance-a-financial-model/
TL;DR We know that improving performance can affect revenue in many situations Performance can also save the business money and reduce costs Simple financial modeling can show why investing in performance makes business and financial sense Example Overview I like to make things simple. Ok, if you&#8217;ve skimmed ahead you&#8217;re already raising an eyebrow. Please [&#8230;]http://calendar.perfplanet.com/?p=1793Sun, 28 Dec 2014 07:02:42 +0000TL;DR

We know that improving performance can affect revenue in many situations

Performance can also save the business money and reduce costs
Simple financial modeling can show why investing in performance makes business and financial sense

Example

Overview

I like to make things simple. Ok, if you’ve skimmed ahead you’re already raising an eyebrow. Please bear with me. I promise that I’ll show you another way to easily convince the business to invest in performance.

When the business asks us why we should care about performance we always point them to the research done by the commerce giants. Performance means revenue! Yea!

The benefits to the business don’t have to stop at revenue. Performance can also improve the bottom line. This is especially important in situations where it is harder to prove revenue impacts from performance improvements. Focusing on infrastructure and operational savings can make it easier to convince VIPs to invest in performance (while also having the added benefit of happier and more productive users).

I’d like to share with you a financial model that I’ve used to show how performance can impact the cost of doing business.

Caveats

As with all things, there are always footnotes, caveats and provisos.

First, this is just another tool to demonstrate why performance matters – a tool that compares before & after. Using financial modeling can easily lead you into a rathole. There are always details that you will need to defend. Your objective should be to show directionality, not absolute position. (Let your financial experts in your organization compute the actual numbers.)

Second, I’m going to use shortcuts and generalities. This is based on my experience owning a business, my years managing Infrastructure and Operations, and the many conversations I’ve had with other managers of I&O. I use these shortcuts to, again, show directionality. Don’t mistake shortcut for inferior. To the contrary! If anything, using the shortcuts will give conservative numbers.

How it Works; When to Use Financial Modeling

The root premise of this financial model is that improving performance per webpage (or per transaction) is generally accomplished by:

reducing number of requests and round trips

reducing the bytes sent

reducing processing time on the back-end

Number 3 is often tightly connected with #1 and #2. That is, you are either building out more hardware or optimizing what you have. Building out more hardware per interaction doesn’t scale with user growth.

Therefore, this model will work best when you are optimizing backend processes, adding caching layers (back-end, cdn, client) or optimizing user workflows.

In contrast, this model will likely fall down if you try to use it to argue for optimizations such as leveraging the GPU for client rendering or adding WebP support.

Of course, this is just the beginning. There are many other financial models that you should consider. I’ll leave those for another post! For example:

how performance increases sales per user (ARPU)

how performance increases user growth (CAGR)

Basic Equation

The financial model I use boils down to a basic equation:

Cash_Flow = Capital_Expenses + Operational_Expenses

That is, how much money do you have to spend to buy new hardware (CapEx) and how much money do you have to spend to keep the hardware working and the electricity flowing (OpEx).

Each year in the model we will add new hardware, which will increase our operational costs.

Later on I might use Max PageView as a proxy for load and will compute the peak CashFlow/PV:

(OpEx + CapEx) / Max_PageView

Once we have these three data points projected, it is as simple as comparing different scenarios. Did your improvements slow the rate of new hardware purchased? Reduce the operating costs? What are the projected costs with and without improved performance?

The tricky part is computing the OpEx and CapEx. The equations I will be using are laid out in the sections below.

Avoiding Funny Numbers

You’re probably thinking about a million variables and inputs I should be using in the numbers above. As I mentioned above, I’m going to stick with generalizations and avoid all the particulars. However, one thing I’m going to stress is that I’m avoiding all the funny numbers – soft costs, contract renegotiations, etc. This will allow you to bypass a 7 week discussion with your procurement about the true cost of your enterprise agreement.

For these reasons, I will intentionally avoid:

Costs of User Productivity

(This is truest in funny money. Users and staff will be as productive with the time available. You will not get this money back – but you might be able to invest this time in other activities.)

Software Licensing costs

(the true cost of an enterprise agreement could power an improbability drive – calculating the savings will require the power from a small star)

Revenue from selling old hardware

Calculating Capital Expenses (CapEx)

Capital Expenditure is the easiest item to calculate.

We want to make sure we are just capturing the cost of procuring the hardware and getting it installed. Once procured, it is a “sunk cost” and can’t be recouped. A more sophisticated model could turn this into an amortization schedule, but we want to show cash flow impacts instead.

CapEx = Number_of_Servers * Average_Cost

To be clear, when we talk about the Average Cost of hardware we should think of it in two ways:

What is the fully loaded cost – more than just the list price of the hardware, but also what it takes to install. Cost to procure, security audits, colo service tickets, rack and stack, etc. That said, if you can’t get the fully loaded cost, be conservative and avoid “guestimation”.

You rarely have a uniform set of hardware. Instead of spending time inventorying your infrastructure use an average cost across hardware. Yes, you might have bigger boxes for db compared to app servers. Use an educated average to make the math simpler.

Some good numbers I’ve used are:

$5k for a pizza box

$100k for high density compute server

Virtual Servers also require CapEx. You have two options, one is to do the translation of number of VMs per server to actual hardware (don’t forget to factor in vMotion buffer). Or, if your IT group has the cost already computed, you can use the cost per VM. Bottom line: be consistent.

Calculating Operating Expenses (OpEx)

There are many ways to calculate the cost of operations for your application. The easiest way is to look at it in an aggregate view. That is, how many servers in total are used to deliver your app – regardless of the role.

The assumption we start with is that your current infrastructure is necessary to deliver the current level of performance. Increasing user traffic will likewise need to increase your infrastructure proportionally.

Most Co-Location providers these days use a simple billing model of charging only for energy used. The beauty of this model is that it usually includes everything you need for hosting as well. Functionally you can assume for the price of energy you get all the cooling, floor space and bandwidth you need.

Of course, each datacenter is a special snowflake. Don’t get bogged down in the details and keep the formula general. It is better to underestimate than to overestimate or worse, spend 6 weeks and arrive at a similar number.

The equation works out to:

OpEx = Number_of_Servers * KVA_per_Server * KVA_Price

The KVA per server can be the most challenging to calculate. Some hardware manufacturers provide a power calculator for server configurations. Many provide a range of potentials. Here is my recommendation:

Don’t try to calculate each piece of hardware. Pick one configuration that is representative

If in doubt, use the newest hardware’s power consumption since it will likely be the most efficient

If the manufacturer offers a fully loaded power use, use 80% of the value

If the manufacturer only offers one power profile – assume it to already be 80% loaded

Most hardware reports power as Watts and BTU. Assume a power-factor of 0.9 and use: KVA = Watts / 900
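That Watts-to-KVA rule of thumb as a one-liner (the 0.9 power factor is the article’s assumption):

```python
def watts_to_kva(watts, power_factor=0.9):
    """Rule of thumb from the text: kVA = W / (1000 * PF) = W / 900 at PF 0.9."""
    return watts / (1000 * power_factor)

print(watts_to_kva(450))   # 0.5 -- matches the "pizza box" approximation
print(watts_to_kva(3150))  # 3.5 -- matches the 6U high-density chassis figure
```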

When in doubt I use these approximation numbers:

0.5KVA for an average pizza box server

3.5KVA for a 6U high density compute chassis

As I mentioned, most colo providers charge by electricity used and bundle all the other amenities into this price. Like all colo solutions there is a range of offerings from high-end ($0.70/KVA/mo) to low-end ($0.20/KVA/mo). In my experience, I’ve found that the cost to run your own datacenter can be pretty close to the average cost of renting colo space.

You’ll probably have a hard time getting procurement to offer up the price you pay for colo, so to save you the time I’d recommend using a number around $0.50/KVA/mo. This should also be sufficiently padded to account for any MPLS lines or dedicated circuits that your data center might need.

What about IaaS?

So far this model assumes you own or lease your infrastructure. However, if you use IaaS this model will fall short since you don’t own capital and it is pure operational expenses. The tricky part is that the cost of operations is not based on hardware procured but based on utilization. Savings can still be realized and modeled but it requires a slightly different formula and ultimately requires more insight into your cost of operations. This is worthy of a different talk.

PageView and Interaction Cost

A useful model to measure user load on a system is to look at the maximum page views per second. Consider a page view to be a request that returns Content-Type: text/html. The principle is that each user ‘interaction’ or ‘transaction’ with your website will return HTML.

Using PageView isn’t always perfect – especially for single url apps. The goal is to find a metric that everyone can agree on and consistently represents the volume of user activity on your site.

Your current configuration of infrastructure is designed to meet a peak in volume of traffic. That is, you have built it to sustain the peak traffic throughout the year and get by with the least number of user complaints. This peak could be Black Friday or it could be annual performance review time.

Using the maximum page view per second will tell us how much money is spent to maintain this peak traffic:

Interaction_Cost = (OpEx + CapEx) / Page_View_per_Second

Each year, you expect to grow the business. As you grow, you will build more hardware in lock step. If you do nothing to improve the performance, then you should expect the Interaction Cost to remain constant, year over year.

Example:

Let me share an example use of this model – based on real life events.

In this example, let’s assume a retailer, with this configuration:

140 “pizza box” type servers ($5k, 0.5KVA)

400 Page View/s peak

30% YoY growth rate

The problem is that the home page and one key category page account for 40% of the site traffic, all of which cannot be cached by the local Varnish or CDN layers and must go back to the datacenter. This is because:

the HTML includes basic personalization rendered server-side (“Hello Colin”)

a unique shopper cookie is generated for each request (intrinsic to business KPIs)

You’re probably shaking your head. I know. But this kind of web application is all too common.

Investing in changes to the website’s HTML generation would improve the local cache layer and the offload to the CDN, not to mention improve the time to first byte substantially. (To accomplish this, client-side JavaScript could be used to inject the personalization as well as GUID generation for shopper tracking. This way the business goals can still be met and the page becomes cacheable.)

With even a small TTL (e.g. 1 minute), offload from the datacenter will increase by 40%. Assuming we don’t turn off any previously commissioned hardware, we can delay expanding the datacenter footprint by one year!

The results are more than enough to justify the cost of the investment. The best part is that these projected savings don’t include any of the costs the infrastructure teams incur to maintain a growing footprint. Your infrastructure and operations teams will approve!

Plugging the numbers in, we can see the cash spent this year (year 0) and the projected cash flow for next year if we don’t make the changes. In contrast, we have now shown how our performance improvement impacts the total cost of ownership.
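As a quick sanity check of the example's arithmetic, using only the figures given above, the offload is enough to absorb a full year of growth without new hardware:

```python
peak_pv = 400   # current peak page views/s hitting the datacenter
growth = 0.30   # 30% YoY traffic growth
offload = 0.40  # share of traffic newly served by the Varnish/CDN layers

next_year_total = peak_pv * (1 + growth)            # 520 PV/s of user traffic
next_year_origin = next_year_total * (1 - offload)  # ~312 PV/s reach the origin

# Existing capacity (400 PV/s) still covers next year's origin load,
# so the datacenter expansion can be deferred by a year.
assert next_year_origin < peak_pv
```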

Looking at it another way, we can see that the cost per interaction also decreases.

How can I use this model?

The example I gave shows how increasing caching impacts operations, but it doesn’t stop there. Any change that makes applications more efficient, takes advantage of caches, reduces the number of requests, or makes requests smaller will impact the cost to the business. Using this simple model, we can project the financial impact of those performance improvements.

This is only the beginning. I believe that there are many other financial models that can be used to help convince the business that performance matters!

]]>performance[perf] The Power of Perceived Performancehttp://calendar.perfplanet.com/2014/the-power-of-perceived-performance/
Recent years have seen a huge flux of SPAs — Single Page Applications. Though they enhance user experience, implementing SPAs for large-scale web applications is indeed a complex task. At eBay, we faced a similar challenge when we wanted to migrate one of our key desktop flows (search and item pages) to an app-like experience, from the [&#8230;]http://calendar.perfplanet.com/?p=1779Sat, 27 Dec 2014 05:23:07 +0000Recent years have seen a huge flux of SPAs — Single Page Applications. Though they enhance user experience, implementing SPAs for large-scale web applications is indeed a complex task. At eBay, we faced a similar challenge when we wanted to migrate one of our key desktop flows (search and item pages) to an app-like experience, from the current state of full page refreshes. Some of the key challenges were

Server & client sync: Solving this challenge is super critical for e-commerce applications. Both the browser and the server should maintain the state of the app. At any point in time the URL should be portable, meaning it should render the page markup in the same state as it was previously. From an e-commerce perspective, three main circumstances make this point critical: SEO, browser refreshes (especially for items ending soon), and URL sharing.

Code redundancy: This point follows from the previous one. In order for the app to be rendered on the server, all logic built in JavaScript for the browser should be replicated on the server. The result, however, is a maintenance nightmare, especially for large distributed teams. Although there are solutions like rendr and PhantomJS to simulate the browser environment on the server, they don’t work at scale.

Performance penalty for the first hit: Most SPA frameworks out there require the initial rendering on page load to happen on the browser. This is an anti-performance pattern and has proven to be a bad way to go for many (see, for example, Twitter’s experience). In addition, with initial rendering on the browser we lose the huge benefits of the preload scanners in modern browsers. This again reiterates the point that server side rendering is not just an add-on, but a must.

Browser Back/Forward: This may come as a surprise to many, but from what we have observed, even the slightest deviation from the default back/forward behavior of the browser has impacts on consumer behavior. Users are so accustomed to these actions (mainly in the desktop environment) that we need to make sure they work as expected. This is not much of a SPA challenge, but something to keep in mind.

Considering the above facts and still wanting to build a seamless experience, we decided to go the PJAX route. PJAX (pushState + AJAX) is a technique that delivers a fast browsing experience, without the SPA overhead. It has worked well for biggies like Twitter and GitHub. When looking to implement a PJAX library, we learned about YouTube’s SPF — Structured Page Fragments — from the 2014 Velocity conference (yes, SPF not SPA; we know it’s confusing). A quick dive into SPF indicated it was pretty much what we wanted. Moreover, the contributors to SPF responded promptly to all our queries and enhancement requests, thus enabling us to get started quickly. So what does SPF offer?

Application code remains intact: With SPF, we don’t have to change the way we build applications. Also, no specialized client treatment is required. The only change needed was to add a conditional hook (based on a particular request URL param) in our server response pipeline to respond with HTML in JSON format, instead of with standard HTML. This benefit is huge to us, as development teams can build and maintain applications while being agnostic about how client-side navigations might happen.
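A toy sketch of such a hook, in Python for brevity. The `spf=navigate` parameter and the JSON shape (a `title` plus `body` fragments keyed by container id) follow SPF's documented convention, but the function and field names here are illustrative, not eBay's actual code:

```python
import json

def handle_request(query, title, body_html):
    """Serve the same server-rendered markup two ways: full HTML for normal
    requests, HTML-in-JSON fragments when the SPF client adds its param."""
    if query.get("spf") == "navigate":
        # Fragments keyed by container id; the client swaps them into the DOM.
        return json.dumps({"title": title, "body": {"content": body_html}})
    return (f"<html><head><title>{title}</title></head>"
            f"<body><div id=\"content\">{body_html}</div></body></html>")
```

The application's rendering pipeline stays untouched; only this final serialization step branches on the request parameter.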

Always rendered on server: With SPF, markup is always rendered on the server. Along with the code maintenance benefit, this also removes the dependency on client hardware specifically for rendering. Although client machines are getting increasingly powerful, we still have a sizable set of our global users on the lower hardware spectrum whom we need to support. Our previous attempts at client-side rendering for these users were not fruitful. On a side note, we are very interested in trying out React for client-side rendering after initial load.

JavaScript & CSS optimizations

Moving to SPF provided an opportunity for us to clean up JavaScript and CSS. We did two types of optimization.

JavaScript Events: Our pages did not have a standard way of handling events — some were handled at an individual element level and some used delegation. This situation was not ideal, as complex pages were sometimes sluggish because they had tons of events. Now with PJAX, we’ve brought in a standard: to widgetize our UI modules and delegate events at the widget container level. This made event handling more efficient. Furthermore, we needed to re-initialize only those widgets that were changed on page navigation.

Resource Bundling: Most pages were bundled in a way that there was only one JavaScript and one CSS URL per page. All library JS (jQuery, Raptor, tracking, utils, etc.) and CSS (skin) were combined with application JS and CSS, making them one URL each. While this was good for reducing HTTP requests, it also worked against caching. When a user navigates from one page to another, the entire CSS and JS have to be downloaded and executed; library files were the big chunk of this overhead, which was unnecessary. With SPF, this bundling approach would fail right away, since it is a single-page context and executing the same library code (like jQuery) would result in unintended behaviors. To fix this, we took on the task of creating two resource bundles for key pages — bundling all common JS and CSS shared across pages as one resource, and bundling the application JS and CSS per page as the second resource. This saves a lot of time in terms of resource parsing and execution, as only the small amount of application JS and CSS has to be processed on each navigation. Also, in SPF mode the resource processing happens only for the first navigation; for repeated views, the previously executed CSS and JS can be leveraged.

Progress Indicators

Now back to why the title “The Power of Perceived Performance.” Moving to SPF measurably increased performance on each navigation. But we had a problem: the performance gain was not visually perceptible. In an internal demo, the feedback was “yeah, it seems to be a little quicker, but nothing else is different.” We were scratching our heads about what was missing. Finally, it all came down to one thing — a progress indicator. Yes, we did not have progress indicators when users navigated pages in SPF mode.

Transitions or progress indicators mask the slowness in applications. There has been much research around this, and we actually experienced the humongous impact it has. Close observation of all major websites that use the PJAX technique reveals they use some sort of progress indicator. For instance, Twitter navigation uses a small throbber, replacing the bird icon in the static header. GitHub replaces the icon right next to a file or folder with a throbber. YouTube shows a red progress bar at the top to indicate the progress of a user’s action.

When we were considering how to implement a transition for SPF-based navigation, a lot of fancy ideas came up. From internal testing, the feedback we received was clear: more than the fancy stuff, customers just need an elegant transition. We ended up with a real-time progress bar similar to YouTube’s. With the progress indicator in place, we did another internal launch. This time, the feedback was unanimous: “Wow! The page looks fast.”

Without progress bar

With progress bar

Progress bar

It was surprising how a tiny progress indicator could change the perception of an entire application. The performance numbers with and without the progress indicators were the same. But just with that indicator, the application feels much faster. This is the real power of perceived performance. As a bonus, avoiding the re-parse and re-execution of large CSS and JavaScript on each navigation made our page interaction ready instantly.

Currently the PJAX-based navigation is enabled within the item page and is in production A/B testing. Search is next, and soon other key flows will follow. The ultimate goal is ONE eBay desktop experience.

]]>performance[perf] Simplify speed with the HALT numberhttp://calendar.perfplanet.com/2014/simplify-speed-with-the-halt-number/
Speed engineers rely on business buy-in to get the job done. Telling a clear story will gain you resources and help when it&#8217;s time to balance speed against other marketing considerations. The cost of a garbled message? Wasted time, frustrated users, and lost sales. Last year, the big sites got 23% slower and lost billions [&#8230;]http://calendar.perfplanet.com/?p=1786Fri, 26 Dec 2014 07:01:50 +0000

Speed engineers rely on business buy-in to get the job done. Telling a clear story will gain you resources and help when it’s time to balance speed against other marketing considerations.

We already know how to start the speed story simply. “Customers show up to our slow website, get thwarted, and move on without spending money”. But what happens next? How do you handle the exec who gut-checks by pulling up his cached site from a high-speed T3 line?

Boil it down

You need hard numbers to clearly explain the situation. Unfortunately, many speed measures aren’t intuitive or even very useful. By the time you’ve explained 95th percentile, you’ve lost your audience.

A single measure can explain how the site is performing across all your different pages and customers. It’s got to handle everyone’s intuitive feel that some pages are especially important. And this number should quickly give the gist, even if the details are tricky to compute. Hierarchical Average Load Time (HALT) combines the critical attributes:

HALT is a single number that’s simple to grasp. However, the calculation takes some effort. HALT integrates many different measures, so let’s unpack it a little.

1 – Break your pages down

Different audiences react to speed differently. If your customers display different speed behavior by country, mobile/fixed, or any other trait, split your page-level data. Just make sure these factors are statistically significant.

2 – Start at the median

The calculation begins with the median (not average) load time on each page. Averages get bumped around by this type of data, but the median — the load time of the middle data point — isn’t skewed by a few 100-second outliers.
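A two-line illustration with Python's statistics module (the figures are invented):

```python
from statistics import mean, median

# Five typical page loads plus one stalled 100-second outlier (seconds).
load_times = [1.8, 2.0, 2.1, 2.2, 2.4, 100.0]

print(median(load_times))  # ~2.15: still reflects the typical experience
print(mean(load_times))    # ~18.4: dragged far above what most users saw
```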

3 – Measure in proportion to the userbase

Your coworkers normally access a Potemkin village version of the website: cached files, a fast connection, and servers next to the browser. It’s artificial, but it can feel real when your arguments for performance optimization are being evaluated.

If 10% of our users happen to access the site from Topeka, Kansas on a dial-up modem with Internet Explorer during business hours, then 10% of measurements should replicate those conditions. This is simple to execute if you’re using Real User Measurement (RUM) tools like Google Analytics. If RUM doesn’t work for you, approximate this sampling with careful setup in synthetic measurement tools like WebPageTest.

4 – Weight by importance

Let’s face it, your legal disclaimers are less important to your customers than your home page. A weighted average of all the median load times keeps everything in proportion.

Page value – what’s the average value of goals achieved (goods sold, videos watched, etc.) after visiting this page? If you don’t have an analytics package installed, poll your marketing team to take an educated guess.

Speed sensitivity – do customers care about speed on this particular page? Is it reflected in page value? For example, your Cart Checkout page might show much more sensitivity than your home page. Be careful – correlation is definitely not causation. If the numbers are unavailable, assume constant sensitivity.
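Putting steps 1–4 together, a sketch of the HALT calculation. The page names, samples, and weights are all made up; each weight stands in for traffic share × page value × speed sensitivity:

```python
from statistics import median

# Per-page load-time samples (seconds), gathered in proportion to the user
# base, plus a weight combining traffic share, page value, and sensitivity.
pages = {
    "home":     {"samples": [2.1, 2.4, 2.0, 2.6], "weight": 0.5},
    "checkout": {"samples": [3.0, 3.4, 2.9, 3.3], "weight": 0.4},
    "legal":    {"samples": [1.2, 1.1, 1.4, 1.3], "weight": 0.1},
}

def halt(pages):
    """Weighted average of per-page median load times, in seconds."""
    total = sum(p["weight"] for p in pages.values())
    return sum(median(p["samples"]) * p["weight"] for p in pages.values()) / total
```

With these numbers, the important-but-slow checkout page pulls the HALT figure up even though the fast legal page barely moves it — exactly the proportionality the weighting is meant to capture.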

Your HALT number will be measured in seconds – and hopefully trend downwards. If it isn’t headed the right way, start digging in:

Apply speed technology and coding changes across the board

Focus on the most important pages and customer demographics

Watch for shifts in the key pages and customers

Keep it clean

Load time makes sense to many people, but there are good arguments for different measurements like Speed Index, Time To Interact or Above-The-Fold Time.

Whatever you do, play it square. Some useful techniques can modify load times (like deferred loads). If they unfairly skew times, make sure to either highlight the change or adjust numbers in compensation.

Likewise, if you take shortcuts, make them obvious. If you don’t have all the information you’d like, no problem. But as you make changes to your numbers, highlight the differences.

Remember to use your words and numbers like poetry – pack them with meaning. Your story will sink in better, and coworkers will start thinking about speed as a way to reach their goals.

]]>Uncategorized[perl] So here it is Merry Christmashttp://perladvent.org/2014/2014-12-25.html
<div class='pod'><p>Another year of Perl Advent had drawn to a close, and what a year it&#39;s been. I hope you&#39;ve enjoyed reading all the articles as much as I have.</p>
<h3 id="Everybodys-having-fun">Everybody&#39;s having fun</h3>
<p>Of course, just because Advent is over for another year, doesn&#39;t mean the gift of Perl modules has to end. As I write this there are one hundred and forty one thousand seven hundred and forty nine modules on the CPAN and each year we only get to publish twenty four advent articles. So where can you read about the rest of these modules and other exciting things happening in Perl?</p>
<ul>
<li>You can search for Perl modules at <a rel="nofollow" target="_blank" href="https://metacpan.org/">metacpan</a>
where you can also see a list of <a rel="nofollow" target="_blank" href="https://metacpan.org/recent">recently
published modules</a></li><li><a rel="nofollow" target="_blank" href="http://cpanratings.perl.org/">CPAN Ratings</a> offers ratings
and reviews of Perl modules. It's like a democratic advent calendar each day
of the year!</li>
<li>The <a rel="nofollow" target="_blank" href="http://perlweekly.com/">Perl Weekly</a> provides one small
succinct email a week with a summary of the interesting things happening in Perl
that week, including new modules, interesting articles and upcoming events.</li>
<li><a rel="nofollow" target="_blank" href="http://blogs.perl.org/">blogs.perl.org</a> hosts a vast number of
Perl blogs if you're looking for more articles to read throughout the year. The <a rel="nofollow" target="_blank" href="http://ironman.enlightenedperl.org/">
Perl Ironman Challenge</a> links to a lot more blogs as part of the <em>updating
your Perl blog weekly</em> challenge</li>
<li>You're reading the original programming advent calendar,
but if you like the format there's a whole collection of them available dedicated
to other aspects of Perl (e.g. <a rel="nofollow" target="_blank" href="http://www.catalystframework.org/calendar/2014">Catalyst</a>,
<a rel="nofollow" target="_blank" href="http://advent.perldancer.org/2014">Dancer</a>, <a rel="nofollow" target="_blank" href="http://perl6advent.wordpress.com/2014/">Perl 6</a>,
<a rel="nofollow" target="_blank" href="http://advent.perl.kr/2014/">Seoul.pm</a>, <a rel="nofollow" target="_blank" href="http://shadow.cat/blog/matt-s-trout/">MSTPAN</a>, and
<a rel="nofollow" target="_blank" href="http://blogs.perl.org/users/perlancar/2014/12/perlancars-2014-advent-calendar.html">perlancar</a>.) Other
languages have also adopted the format and <a rel="nofollow" target="_blank" href="http://www.lenjaffe.com/AdventPlanet/2014/">Advent Planet</a>
has a meta calendar that links to each entry for that day for over twenty
calendars</li>
<li>Of course, if you want more Perl Advent articles, there's always the
previous <a rel="nofollow" target="_blank" href="http://perladvent.org/archives.html">fourteen years of advent
calendars</a> to go through. Happy reading!</li>
</ul>
<p>The Perl Advent Calendar wouldn&#39;t be possible without the hard work of all the article authors. This year we owe thanks to Alex Balhatchet, Augustina Ragwitz, Dave Cross, Graham Ollis, John SJ Anderson, Legolas Greenleaf, Marcus Ramberg, Mark Allen, Mark Fowler, Neil Bowers, Nick Patch, Olaf Alders, Paul &quot;LeoNerd&quot; Evans and Ricardo Signes. These people not only devoted untold time writing and formatting their articles, but put up with me repeatedly hassling them with my self imposed deadlines and bugging them with copy corrections. Thank you for putting up with me.</p>
<p>Thanks also to everyone who submitted corrections, helped organize, setup hosting, debugged software issues, and otherwise did the unglamorous things that made the advent calendar happen.</p>
<h3 id="Look-to-the-future-now">Look to the future now</h3>
<p>As a Perl programmer reading the calendar chances are you have an article in you for publishing next year even if you don&#39;t know what it is yet. I encourage you to subscribe to the <a rel="nofollow" target="_blank" href="http://mail.pm.org/mailman/listinfo/perladvent">mailing list</a> where we organize all of this and simply drop us an email saying you&#39;d be interested in writing (you&#39;ll get a follow up email from me in about eight months when we start working out scheduling for 2015.)</p>
<h3 id="Its-only-just-begun">It&#39;s only just begun</h3>
<p>Perl does a lot for you. It keeps you entertained, it makes your life easier, and chances are it keeps you employed. In this season of giving you might consider making a small donation to <a rel="nofollow" target="_blank" href="https://donate.perlfoundation.org/donate.html">The Perl Foundation</a> who do important work, including sponsoring developers to work full time on improving the core of Perl.</p>
<h3 id="Its-Christmas">It&#39;s Christmas!</h3>
<p>Merry Christmas, one and all.</p>
</div></a>Mark Fowlerhttp://perladvent.org/2014/2014-12-25.htmlThu, 25 Dec 2014 05:00:00 +0000[java] Merry Christmas everyone!http://feedproxy.google.com/~r/JavaAdventCalendar/~3/YHT6vuqbpz8/merry-christmas-everyone.html
<div dir="ltr" style="text-align:left;">This is the first year of the <a rel="nofollow" target="_blank" href="http://www.javaadvent.com">Java Advent Project</a> and I am really grateful to all the people that got involved, published articles, tweeted, shared, +1ed, etc. <br/>It was an unbelievable journey and all the glory needs to go to the people that took some time from their loved ones to give us their wisdom. As they say, the <a rel="nofollow" target="_blank" href="http://www.javaadvent.com/2014/12">Class of 2014</a> of Java Advent is comprised of (in the order of publishing date): <ul> <li><a rel="nofollow" target="_blank" href="https://hype-free.blogspot.com/">Attila-Mihály Balázs</a>, <a rel="nofollow" target="_blank" href="http://www.transylvania-jug.org/">Transylvania JUG</a> member</li> <li><a rel="nofollow" target="_blank" href="https://www.linkedin.com/profile/view?id=82429506">Florin Bunau</a></li> <li><a rel="nofollow" target="_blank" href="http://www.petrikainulainen.net/">Petri Kainulainen</a></li> <li><a rel="nofollow" target="_blank" href="https://twitter.com/shelajev">Oleg Shelajev</a> from <a rel="nofollow" target="_blank" href="http://zeroturnaround.com">ZeroTurnaround</a></li> <li>The perfect <a rel="nofollow" target="_blank" href="http://plumbr.eu/">plumbr</a> for your web application, <a rel="nofollow" target="_blank" href="https://twitter.com/iNikem">Nikita Salnikov-Tarnovski</a></li> <li>If you want to hear out your developers, <a rel="nofollow" target="_blank" href="http://trishagee.github.io/">Trisha Gee</a> can help you do it properly</li> <li>If you like to measure performance in nanoseconds, <a rel="nofollow" target="_blank" href="http://stackoverflow.com/users/57695/peter-lawrey">Peter Lawrey</a> (<a rel="nofollow" target="_blank" href="http://stackoverflow.com/users/57695/peter-lawrey">stackoverflow</a>) and <a rel="nofollow" target="_blank" href="http://openhft.net/">OpenHFT</a> can help you with it!</li> <li><a 
rel="nofollow" target="_blank" href="http://blog.eisele.net/">Markus Eisele</a> from <a rel="nofollow" target="_blank" href="http://blog.eisele.net/">Eisele.net</a></li> <li><a rel="nofollow" target="_blank" href="https://www.blogger.com/blogger.g?blogID=2481158163384033132">Rene Jansen</a></li> <li><a rel="nofollow" target="_blank" href="http://www.linkedin.com/in/mitemitreski">Mite Mitreski</a> from <a rel="nofollow" target="_blank" href="http://blog.mitemitreski.com/">mitemitreski.com</a></li> <li><a rel="nofollow" target="_blank" href="http://blog.schauderhaft.de/">Jens Schauder</a></li> <li><a rel="nofollow" target="_blank" href="http://www.jooq.org/">jOOQ</a>s papa - Lukas Eder(<a rel="nofollow" target="_blank" href="http://ch.linkedin.com/in/lukaseder">linkedin</a>, <a rel="nofollow" target="_blank" href="https://twitter.com/lukaseder">twitter</a>)</li> <li><a rel="nofollow" target="_blank" href="https://plus.google.com/110574453398024123194">Mani Sarkar</a></li> <li><a rel="nofollow" target="_blank" href="https://www.linkedin.com/pub/marcin-grzejszczak/19/651/155">Marcin Grzejszczak</a> from <a rel="nofollow" target="_blank" href="http://toomuchcoding.blogspot.ro/">Too much coding</a></li> <li><a rel="nofollow" target="_blank" href="https://www.blogger.com/blogger.g?blogID=2481158163384033132">Rudiger Moller</a></li><li><a rel="nofollow" target="_blank" href="http://uk.linkedin.com/in/martijnverburg">Martijn Verburg</a> from <a rel="nofollow" target="_blank" href="http://www.jclarity.com/">jClarity</a></li> <li><a rel="nofollow" target="_blank" href="https://www.linkedin.com/profile/view?id=12080828">Alexander Turner</a> from <a rel="nofollow" target="_blank" href="https://nerds-central.blogspot.com/">Nerds Central</a></li> <li>And yours truly <a rel="nofollow" target="_blank" href="https://www.linkedin.com/profile/view?id=21470605">Olimpiu POP</a></li></ul> <br/>Thank you girls and guys for making it happen yet once more. 
And sorry for all the stressing and pushing. Also, last but not least, thanks to <a rel="nofollow" target="_blank" href="https://www.voxxed.com/">Voxxed</a> editors <a rel="nofollow" target="_blank" href="https://www.linkedin.com/profile/view?id=69144451&authType=NAME_SEARCH&authToken=T1GE&locale=en_US&srchid=214706051419364268205&srchindex=1&srchtotal=17&trk=vsrp_people_res_name&trkInfo=VSRPsearchId%3A214706051419364268205%2CVSRPtargetId%3A69144451%2CVSRPcmpt%3Aprimary">Lucy Carey</a> and <a rel="nofollow" target="_blank" href="http://www.linkedin.com/in/mitemitreski">Mite Mitreski</a>. <br /></div><div class="feedflare">
</div>Olimpiu Poptag:blogger.com,1999:blog-2481158163384033132.post-3764847274804782430Thu, 25 Dec 2014 03:04:00 +0000[sysadvent] Day 25 - Windows has Configuration Management?!?http://feedproxy.google.com/~r/sysadvent/~3/p7sWRCtdrqs/day-25-windows-has-configuration.html
<p>Written by: Steven Murawski (<a rel="nofollow" target="_blank" href="https://twitter.com/stevenmurawski">@stevenmurawski</a>)<br>
Edited by: William Shipway (<a rel="nofollow" target="_blank" href="https://twitter.com/@shipw">@shipw</a>)</p>
<p>Windows Server administration has long been the domain of &#8220;admins&#8221; mousing their way through a number of Microsoft and third party management UIs (and I was one of them for a while). There have always been a stalwart few who, by hook or by crook, found a way to automate the almost unautomateable. But this group remained on the fringes of Windows administration. They were labeled as heretics and shunned, until someone needed to do something not easily accomplished by a swipe of the mouse. </p>
<p>The sea winds have shifted and over the past seven or eight years, Microsoft released PowerShell and began focusing on providing a first class experience to the tool makers and automation-minded. The earlier group of tool makers and automators gained traction and began to develop a larger following, as more Microsoft and third party products added support for PowerShell. That intrepid group of early automators formed the core of the PowerShell community and began welcoming new converts - whether they were true believers or forced into acceptance by the lack of some capability in their comfortable management UIs. Now, most Windows Server administrators have delved into the command line and have begun to succumb to the siren call of automation.</p>
<p>Just as the PowerShell community&#8217;s evangelism was reaching a fevered pitch, Microsoft added another management tool - Desired State Configuration. The tool-makers and automators were stunned. Cries of &#8220;what about my deployment scripts?&#8221; and &#8220;but, I already built my VM templates!&#8221; echoed through the halls. Early adopters of PowerShell v3 lamented &#8220;isn&#8217;t this what workflows were for?&#8221;. Some had already begun to explore the dark arts of configuration management using tools like Chef and Puppet to bring order to their infrastructure management. With the help of those in the community who blazed a trail in implementing configuration management on Windows, those cries of dismay began to turn into rabid curiosity and even envy. The administrators began to read books like the Phoenix Project and hear stories from companies like Stack Exchange, Etsy, Facebook, and Amazon about this cult of DevOps. They wanted access to this new realm of possibilities, where production deployments don&#8217;t mean a week of late nights in the office and requests for new servers don&#8217;t go to the bottom of the pile to sit for a month to &#8220;percolate&#8221;.</p>
<p>Read on, dear reader to understand the full story of Desired State Configuration and its place in the new DevOps world where Windows Server administrators find themselves.</p>
<h3>An Introduction to Desired State Configuration</h3>
<p>With the release of Windows Server 2012 R2 and <a rel="nofollow" target="_blank" href="https://www.microsoft.com/en-us/download/details.aspx?id=40855">Windows Management Framework 4</a>, Microsoft <a rel="nofollow" target="_blank" href="http://blogs.msdn.com/b/powershell/archive/2013/11/01/configuration-in-a-devops-world-windows-powershell-desired-state-configuration.aspx">introduced Desired State Configuration</a> (DSC). DSC consists of three main components: the Local Configuration Manager, a configuration Domain Specific Language (DSL), and resources (with a pattern for building more). DSC is available on Windows Server 2012 R2 and Windows 8.1 64 bit out of the box and can be installed on Windows Server 2012, Windows Server 2008 R2, and Windows 7 64 bit with Windows Management Framework 4. There is an evolving ecosystem around Desired State Configuration, including support for a number of systems management and deployment projects. To me, one of the most important benefits of the introduction of Desired State Configuration is the awakening of the Windows administration community to configuration management concepts.</p>
<h3>A Platform Play</h3>
<p>The inclusion of Desired State Configuration may seem like a slap in the face to existing configuration management vendors, but that is not the case. Desired State Configuration is a platform level capability similar to PerfMon or Event Tracing for Windows. DSC is not intended to wholesale replace other configuration management platforms, but to be a base which other platforms can build on in a consistent manner.</p>
<h4>The Evolution of DSC</h4>
<p>One of the major knocks against administering Windows servers in the past has been the horrendous story around automation. Command-line tools were either lacking coverage or just plain missing. The shell was in a sorry state. </p>
<p>Then, shortly before Windows Server 2008 shipped, PowerShell came about. Initially, PowerShell had relatively poor native coverage for managing Windows, but it worked with .NET, WMI, and COM, so it could do just about anything you needed. </p>
<p>More coverage was introduced with each release of Windows Server. Windows Server 2012 had an explosion of coverage via native PowerShell commands for just about everything on the platform.</p>
<p>PowerShell appeared to be the management API for configuring Windows servers. The downside of a straight PowerShell interface is that PowerShell commands aren&#8217;t necessarily idempotent. Some, like Add-WindowsFeature, are, and do the right thing if the command is run repeatedly. Others, like New-Website, are not, and will throw errors if the site already exists.</p>
<p>DSC was introduced to provide a common management API that offers consistent behavior. Under the covers, it is mostly PowerShell that is running, but the patterns the resources follow ensure that only the work that needs to be done is done, and when a resource is in the proper state, that it is left alone.</p>
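<p>The resource pattern amounts to a Test/Set contract: Set runs only when Test reports drift from the desired state. A hypothetical sketch of that contract (in Python for brevity — real DSC resources are PowerShell modules, and this class is purely illustrative):</p>

```python
class WebsiteResource:
    """Sketch of the idempotent 'ensure' pattern DSC resources follow."""

    def __init__(self):
        self.sites = set()  # stand-in for actual server state

    def test(self, name):
        # Is the node already in the desired state?
        return name in self.sites

    def set(self, name):
        # Converge: do the work only when needed.
        self.sites.add(name)

    def apply(self, name):
        # Unlike calling New-Website directly, re-applying the same
        # configuration is a safe no-op instead of an error.
        if not self.test(name):
            self.set(name)

r = WebsiteResource()
r.apply("shop")
r.apply("shop")  # second run changes nothing
```

<p>When a resource is already in the desired state, Set never fires — which is exactly the behavior described above.</p>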
<p>Being a platform feature means that there is a consistent, supported mechanism for customers and vendors to manage and evolve the configured state of Windows servers.</p>
<h4>Standards Based</h4>
<p>Desired State Configuration was built using standards already supported on the Windows platform - <a rel="nofollow" target="_blank" href="http://www.dmtf.org/standards/cim">CIM</a> and <a rel="nofollow" target="_blank" href="http://www.dmtf.org/standards/wsman">WSMAN</a>. </p>
<p>CIM, Common Information Model, is the DMTF standard that WMI is based upon and provides structure and schema for DSC.</p>
<p>WSMAN, WS-Management, is a web services protocol and DMTF standard for management traffic. WinRM and PowerShell remoting are built on this transport as well. </p>
<p>While these might not be the greatest standards in the world, they do provide a consistent manner for interacting with the Desired State Configuration service.</p>
<h4>An Evolving API</h4>
<p>Though Windows Management Framework (WMF) 4 was only recently introduced (it has been released for just over a year), WMF 5 development is well under way and includes many enhancements and bug fixes. One major change is to make the DSC engine&#8217;s API friendlier for third-party configuration management systems to use. </p>
<p>There was also a recent rollup patch for Server 2012 R2 (KB3000850) that contains a number of bugfixes and some tweaks for ensuring compatibility with changes coming in WMF 5.</p>
<h3>Diving In</h3>
<p>Now that we&#8217;ve got a bit of history and rationale for existence out of the way, we can dig in to the substance of Desired State Configuration.</p>
<h4>The Local Configuration Manager</h4>
<p>The engine that manages the consistency of a Windows server is the Local Configuration Manager (LCM). The LCM is exposed as a WMI (CIM) class (MSFT_DscLocalConfigurationManager) in the Root/Microsoft/Windows/DesiredStateConfiguration namespace.</p>
<p>The LCM is responsible for periodically checking the state of resources in a configuration document. This agent controls</p>
<ul>
<li>whether resources are allowed to reboot the node as part of a configuration cycle</li>
<li>how the agent should treat deviance from the configuration state (apply and never check, apply and report deviance, apply and autocorrect problems)</li>
<li>how often consistency checks should be run</li>
<li>and more&#8230;</li>
</ul>
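<p>These settings can be inspected and adjusted from PowerShell. A quick sketch (the property values here are illustrative, not recommendations):</p>
<pre><code># Inspect the current LCM settings on the local node
Get-DscLocalConfigurationManager

# A meta-configuration that adjusts the LCM's behavior
configuration LcmSettings
{
    node 'localhost'
    {
        LocalConfigurationManager
        {
            ConfigurationMode              = 'ApplyAndAutoCorrect'
            RebootNodeIfNeeded             = $true
            ConfigurationModeFrequencyMins = 30
        }
    }
}
LcmSettings
Set-DscLocalConfigurationManager -Path .&#92;LcmSettings
</code></pre>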
<p>It has a plugin/extension point with the concept of Download Managers. Download Managers are used for Pull mode configurations. There are two download managers that ship in the box, one using a simple REST endpoint to retrieve configurations and one using an SMB file share. As it currently stands, these are not open for replacement by third parties (but it could be made so - please weigh in to the PowerShell team about that before WMF 5 is done!). </p>
<h5>A Quick Note - Push vs. Pull</h5>
<p>DSC configurations can be imperatively pushed to a node (via the Start-DscConfiguration cmdlet or directly to the WMI API), or if a Download Manager is configured it can pull a configuration and resources from a central repository (currently either SMB file share or REST-based pull server). If a node is in PULL mode, when a new configuration is retrieved, it is parsed to find the various modules required for the configuration to be applied. If any of the requisite modules and versions are not present on the local node, the pull server can supply those.</p>
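<p>A push deployment is a one-liner once the MOF documents have been generated (the path here is illustrative):</p>
<pre><code># Push the generated configuration documents to their target nodes
Start-DscConfiguration -Path c:&#92;dsc&#92;SysAdvent -Wait -Verbose
</code></pre>
<p>Without -Wait, the configuration is applied as a background job; -Verbose streams the resource-by-resource progress back to your console, which is invaluable while developing configurations.</p>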
<h4>DSC Resources</h4>
<p>Resources are the second major component of the DSC ecosystem, and are what make things happen in the context of DSC. There are three ways of creating DSC resources: they can be written in PowerShell, as WMI classes, or, in Windows Management Framework 5, as PowerShell classes. As PowerShell class-based resources are still an experimental feature and the level of effort to create WMI-based resources is pretty high, we&#8217;ll focus on PowerShell-based resources here.</p>
<p>DSC resources are implemented as PowerShell modules. They are hosted inside another PowerShell module under a DSCResources folder. The host module needs to have a module metadata file and have a module version defined in order for it to host DSC resources. </p>
<p>The resources themselves are PowerShell modules that expose three functions or cmdlets:</p>
<ul>
<li>Get-TargetResource</li>
<li>Test-TargetResource</li>
<li>Set-TargetResource</li>
</ul>
<p>Get-TargetResource returns the currently configured state (or lack thereof) of the resource. The function returns a hashtable that the LCM converts to an object at a later stage. </p>
<p>Test-TargetResource is used to determine if the resource is in the desired state or not. It returns a boolean.</p>
<p>Set-TargetResource is responsible for getting the resource into the desired state. It is only executed when Test-TargetResource reports that the resource is not already in the desired state. </p>
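<p>A bare-bones skeleton for such a resource module looks something like this (the Name parameter and the states returned are placeholders; a real resource would check and change actual system state):</p>
<pre><code>function Get-TargetResource
{
    param ([Parameter(Mandatory)] [string] $Name)
    # Return the currently configured state as a hashtable;
    # the LCM converts it to an object at a later stage.
    return @{ Name = $Name; Ensure = 'Absent' }
}

function Test-TargetResource
{
    param ([Parameter(Mandatory)] [string] $Name)
    # Return $true only when the resource is already in the desired state.
    return $false
}

function Set-TargetResource
{
    param ([Parameter(Mandatory)] [string] $Name)
    # Do whatever work is needed to bring the resource into the desired state.
}

Export-ModuleMember -Function *-TargetResource
</code></pre>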
<h4>The Configuration DSL</h4>
<p>Also introduced with Desired State Configuration are some domain specific language extensions on top of PowerShell. Actually, Windows Management Framework 4 added some public extension points in PowerShell for creating new keywords, which is what DSC uses. </p>
<p><em>Stick with me here, as it may get a bit confusing - I&#8217;ll be using &#8220;configuration&#8221; in two contexts. First is the configuration script. This is defined in PowerShell and can be defined in a script file, a module, or an ad hoc entry at the command line. The second use of &#8220;configuration&#8221; is in the context of the configuration document. This is the final serialized representation of the configuration for a particular machine or class of machines. This document is in Managed Object Format (MOF) and is how CIM classes are serialized.</em></p>
<p>The first keyword defined is configuration. The configuration keyword indicates that the subsequent scriptblock will be a configuration document and should be parsed differently. All your standard PowerShell constructs and commands are valid inside of a configuration, as are a few new keywords. There are two static keywords and a series of dynamic keywords in a configuration document.</p>
<p>The first two static keywords are node and Import-DscResource. I&#8217;ll deal with the latter first, since it seems very oddly named. Import-DscResource looks in name like a cmdlet or function, but is a keyword that is valid only in a configuration document and only outside of the context of a node. Import-DscResource identifies custom and third-party modules to make available in a configuration document. By default, only DSC resources in modules located at $pshome/modules (usually c:&#92;windows&#92;system32&#92;windowspowershell&#92;v1.0&#92;modules) can be used without using Import-DscResource and specifying which modules to make resources available from.</p>
<p>The second static keyword is the node keyword. Node is used to identify the machine or class of machines that the configuration is targeted at. Resources are generally assigned inside node declarations.</p>
<p>The configuration also includes a number of potential dynamic keywords which represent the DSC resources available for the configuration.</p>
<p>An example configuration script looks something like:</p>
<pre><code>configuration SysAdvent
{
    Import-DscResource -ModuleName cWebAdministration

    node $AllNodes.where({$_.role -like 'web'}).NodeName
    {
        windowsfeature IIS
        {
            Name = 'web-server'
        }

        cWebsite FourthCoffee
        {
            Name            = 'FourthCoffee'
            State           = 'Started'
            ApplicationPool = 'FourthCoffeeAppPool'
            PhysicalPath    = 'c:&#92;websites&#92;fourthcoffee'
            DependsOn       = '[windowsfeature]IIS'
        }
    }
}
</code></pre>
<p>The above configuration script, when run, creates a command in the current PowerShell session called SysAdvent. Running that command will generate a configuration document for every server in a collection that has the role of a web server. The configuration command has a common parameter of ConfigurationData which is where AllNodes comes from (more on that in a bit). The result of this command will be a MOF document describing the desired configuration for every node identified as a web server.</p>
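<p>Invoking the generated command looks something like this ($ConfigurationData is covered in a bit; the output path is illustrative, and -OutputPath is optional):</p>
<pre><code># Generate MOF documents for every matching node
SysAdvent -ConfigurationData $ConfigurationData -OutputPath c:&#92;dsc&#92;SysAdvent
</code></pre>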
<p>MOF documents created by the command are written in a folder (of the same name as the configuration) created in the current working directory. Files are named for the node they represent (e.g. server1.mof). You can specify a custom output location. Here is our newly created MOF document:</p>
<pre><code>/*
@TargetNode='localhost'
@GeneratedBy=Administrator
@GenerationDate=12/22/2014 04:12:56
@GenerationHost=ARMORY
*/

instance of MSFT_RoleResource as $MSFT_RoleResource1ref
{
    SourceInfo = "::7::7::windowsfeature";
    ModuleName = "PSDesiredStateConfiguration";
    ModuleVersion = "1.0";
    ResourceID = "[WindowsFeature]IIS";
    Name = "web-server";
    ConfigurationName = "SysAdvent";
};

instance of PSHOrg_cWebsite as $PSHOrg_cWebsite1ref
{
    ResourceID = "[cWebsite]FourthCoffee";
    PhysicalPath = "c:&#92;&#92;websites&#92;&#92;fourthcoffee";
    State = "Started";
    ApplicationPool = "FourthCoffeeAppPool";
    SourceInfo = "::12::7::cWebsite";
    Name = "FourthCoffee";
    ModuleName = "cWebAdministration";
    ModuleVersion = "1.1.1";
    DependsOn = { "[windowsfeature]IIS" };
    ConfigurationName = "SysAdvent";
};

instance of OMI_ConfigurationDocument
{
    Version="1.0.0";
    Author="Administrator";
    GenerationDate="12/22/2014 04:12:56";
    GenerationHost="ARMORY";
    Name="SysAdvent";
};</code></pre>
<h4>Other Tidbits</h4>
<p>There are a few other things one should know in preparation for digging into DSC.</p>
<h5>ConfigurationData and AllNodes</h5>
<p>Configurations have support for a convention-based approach to separating environmental data from the structural configuration. The configuration script represents the structure or model for the machine, and the environmental data (via ConfigurationData) fleshes out the details.</p>
<p>ConfigurationData is represented by a hashtable with at least one key - AllNodes. AllNodes is an array of hashtables representing the nodes that should have configurations generated; it becomes an automatic variable that can be referenced in the configuration (like in the example above). The full hashtable is also available inside the configuration as $ConfigurationData, so you can create custom keys and reference those in your configuration document. The PowerShell team reserves the right to use any key in the ConfigurationData hashtable that is prefixed with PS.</p>
<p>Example:</p>
<pre><code>$ConfigurationData = @{
    AllNodes = @(
        @{NodeName = '*'; InterestingData = 'Every node can reference me.'},
        @{NodeName = 'Server1'; Role = 'Web'},
        @{NodeName = 'Server2'; Role = 'SQL'}
    )
}
SysAdvent -ConfigurationData $ConfigurationData
</code></pre>
<h5>DependsOn</h5>
<p>Resources in DSC are not ordered by default and there is no guarantee of ordering. The current WMF 4 implementation and the previews of WMF 5 all seem to serially process resources, but there is NO guarantee that will stay that way. If you need things to happen in a certain order, you need to use DependsOn to tell a resource what needs to happen first before that one can execute.</p>
<h5>Node Names</h5>
<p>In PUSH mode, the node name is either the server name, FQDN, or IP address (any valid way you can address that node via PowerShell remoting).</p>
<p>In PULL mode, the node name is not the server name. Servers are assigned a GUID and they use that to identify which configuration to retrieve from a pull server. Where this GUID comes from is up to you - you can generate them on the fly, pull one from AD, or use one from another system. Since the GUID is the identifier, you can use one GUID to represent an individual server or a class of servers.</p>
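<p>In WMF 4, pointing a node&#8217;s LCM at a REST pull server is done with a meta-configuration along these lines (the GUID and server URL are placeholders):</p>
<pre><code>configuration PullClient
{
    node 'localhost'
    {
        LocalConfigurationManager
        {
            # The GUID identifying which configuration this node retrieves
            ConfigurationID           = '1c1c1c1c-2d2d-3e3e-4f4f-5a5a5a5a5a5a'
            RefreshMode               = 'Pull'
            DownloadManagerName       = 'WebDownloadManager'
            DownloadManagerCustomData = @{
                ServerUrl = 'https://pull.example.com/PSDSCPullServer.svc'
            }
        }
    }
}
</code></pre>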
<h5>WMF 5 - In Production</h5>
<p>If you are running Windows Server 2012 R2, <a rel="nofollow" target="_blank" href="http://blogs.msdn.com/b/powershell/archive/2014/12/10/wmf-5-0-preview-defining-quot-experimental-designs-quot-and-quot-stable-designs-quot.aspx">you can stay on the bleeding edge AND get production support</a>. The PowerShell team recently announced that if you are using WMF 5, you can get production support for what they call &#8220;stable&#8221; designs - those features that either existed in previous versions of the Management Framework or have reached a level that the team is ready to provide support. Other features, which are more in flux, are labeled experimental and don&#8217;t carry the same support level. With this change, you can safely deploy WMF 5 and begin to test new features and get the bug fixes faster than waiting for the full release. WMF previews are released roughly quarterly.</p>
<p>With WMF 5, you can dig into new and advanced features like Debug mode, partial configurations, and separate pull servers for different resource types.</p>
<h3>Building an Ecosystem</h3>
<p>No tooling is complete without a community around it and Desired State Configuration is no different.</p>
<h4>PowerShellGet and OneGet</h4>
<p><a rel="nofollow" target="_blank" href="https://github.com/oneget/oneget">OneGet</a> and <a rel="nofollow" target="_blank" href="http://blogs.msdn.com/b/mvpawardprogram/archive/2014/10/06/package-management-for-powershell-modules-with-powershellget.aspx">PowerShellGet</a> are coming onto the scene with WMF 5 (although after they release they should be available somewhat downlevel too). OneGet is a package manager manager and provides an abstraction layer on top of things like nuget, chocolatey, and PowerShellGet, and eventually tools like npm, RubyGems, and more. PowerShellGet provides a way to publish and consume external modules, including those that contain DSC resources. </p>
<p>Finding new resources becomes as easy as:</p>
<pre><code>Find-Module -Includes DscResource
</code></pre>
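<p>and installing one is just as simple (the module name here is one example of the community resource modules):</p>
<pre><code>Install-Module -Name xWebAdministration
</code></pre>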
<h4>Third Parties</h4>
<h5>Chef</h5>
<p>Back in July 2014, Chef made a preview of our DSC integration available (<a rel="nofollow" target="_blank" href="https://www.youtube.com/watch?v=mXaAIawzNic">video</a>, <a rel="nofollow" target="_blank" href="https://supermarket.chef.io/cookbooks/dsc">cookbook</a>) and in September <a rel="nofollow" target="_blank" href="https://www.chef.io/blog/2014/09/08/chef-11-16-gets-into-powershell-dsc/">shipped our first production-supported integration</a> (<a rel="nofollow" target="_blank" href="https://docs.chef.io/resource_dsc_script.html">the dsc_script resource</a>) and have more coming. DSC offers Chef increased coverage on the Windows platform.</p>
<h5>ScriptRock</h5>
<p>The guys at ScriptRock (full disclosure - they are friends of mine) have done a pretty interesting thing by <a rel="nofollow" target="_blank" href="http://www.scriptrock.com/blog/powershell-dsc-with-guardrail">taking a configuration visualization and testing tool and offering an export of the configuration as a DSC script</a>. Very cool.</p>
<h5>Puppet</h5>
<p>There is a <a rel="nofollow" target="_blank" href="https://forge.puppetlabs.com/msutter/dsc">Puppet module on the Forge</a> showing some DSC integration. I&#8217;m not too familiar with the state of that project, but it&#8217;s great to see it!</p>
<h5>Aditi</h5>
<p>Brewmaster from Aditi is a deployment tool and can leverage DSC to get a server in shape to host a particular application, allowing you to distribute a DSC configuration with an application.</p>
<h4>PowerShell.Org</h4>
<p>PowerShell.Org hosts a <a rel="nofollow" target="_blank" href="http://powershell.org/wp/dsc-hub/">DSC Hub</a> containing forums, blog posts, podcasts, videos and a free e-book on DSC.</p>
<h3>So, What Are You Waiting For?</h3>
<p>Start digging in! There&#8217;s a ton of content out there. Shout at me on Twitter (<a rel="nofollow" target="_blank" href="https://twitter.com/stevenmurawski">@stevenmurawski</a>) or via <a rel="nofollow" target="_blank" href="http://stevenmurawski.com">my blog</a> if you have any questions.</p>Christopher Webbertag:blogger.com,1999:blog-3615332969083650973.post-2675725091070196977Thu, 25 Dec 2014 00:00:00 +0000[24ways] Cohesive UXhttp://feedproxy.google.com/~r/24ways/~3/5DxEWYCsfuY/
<p><a rel="nofollow" target="_blank" href="http://cameronmoll.com/">Cameron Moll</a> brings the tenth 24 ways to a close with a look at the increasing need for common experiences across devices. Despite our differences, there are more things we share than divide us. Merry Christmas!</p>
feeds@allinthehead.com (Cameron Moll)http://24ways.org/2014/cohesive-ux/Wed, 24 Dec 2014 12:00:00 +0000[perl] Out of Order Perlhttp://perladvent.org/2014/2014-12-24.html
<div class='pod'><p>This article will cover asynchronous programming with the AnyEvent library and will show use cases for managing multiple asynchronous requests in a single application. In addition, I hope to introduce good techniques for using metrics to drive technology decisions!</p>
<h3 id="Is-an-Asynchronous-Solution-right-for-me">Is an Asynchronous Solution right for me?</h3>
<p>New software techniques and practices are always rearing their heads in the industry. While asynchronous functionality is not new to Perl, it is not widely used where it could be, and maybe over-used where it shouldn&#39;t be. When considering whether or not to adopt some new methodology on existing software, it&#39;s important to make sure you have clearly identified what problem you are trying to solve. It&#39;s fine if you want to play around with something new, but remember that every solution has a cost. The cost of asynchronous functionality is that it can be difficult to read and debug (callback soup anyone?) and you have to consider compatibility of your current web framework as well as your current code base.</p>
<h4 id="Step-1:-Identify-the-Problem">Step 1: Identify the Problem</h4>
<blockquote>“70% of users are unable to download the TPS Report from the website because the
website times out.”</blockquote>
<blockquote>“Web site latency is reported for 75% of users.”</blockquote>
<h4 id="Step-2:-Identify-the-cause-the-clarify-the-problem-with-numbers">Step 2: Identify the cause to clarify the problem (with numbers!)</h4>
<p>To identify the cause, you can check logs, collect metrics (a lot of folks use statsd), or attempt to reproduce the problem in a staging environment.</p>
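<p>Even without a metrics stack, a quick-and-dirty timing wrapper using the core Time::HiRes module can put a number on a suspect operation (a sketch, not code from the sample application):</p>
<pre><code>use Time::HiRes qw(gettimeofday tv_interval);

my $start = [gettimeofday];
# ... run the suspect report query or HTTP calls here ...
my $elapsed = tv_interval($start);
warn sprintf "TPS report step took %.2f seconds\n", $elapsed;
</code></pre>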
<blockquote>“The TPS report database query takes 30 seconds to run”</blockquote>
<blockquote>“Requests to HTTP services are averaging 1 second because each request is made individually and is only made once the previous request is complete.”</blockquote>
<h4 id="Step-3:-Propose-a-solution-based-on-data-and-facts">Step 3: Propose a solution (based on data and facts)</h4>
<p>Now that you&#39;ve collected data and clarified your problem in terms of metrics, you can say &quot;Asynchronous HTTP requests would reduce overall report loading time to only the amount of time it takes the longest request to return.&quot;</p>
<p>Implement a solution based on your results&#8230; and use metrics to determine success.</p>
<p>There is a fully functioning sample application! The sample application from which the code snippets in this article come is located on <a rel="nofollow" target="_blank" href="http://github.com/missaugustina/perl-out-of-order">github</a>.</p>
<p>The sample application is a simple Mojolicious::Lite application that generates a &ldquo;TPS&rdquo; report. This report makes calls to 2 external services to collect data and then uses that data to make a complex call to the database. I&#39;m just using simple timing metrics to show the results for each of the reports. You can run it yourself to see the results as it records them in the database.</p>
<h4>First Improvement: Make it Async!</h4>
<p>As we ascertained from our problem statement exercise just now, kicking off HTTP requests at the same time would drastically improve our report&#39;s performance.</p>
<p>Here&#39;s how you currently use LWP to make HTTP requests:</p>
<table class='code-listing'><tr><td class='line-numbers'><br /><code>1:&nbsp;<br />2:&nbsp;<br />3:&nbsp;<br />4:&nbsp;<br />5:&nbsp;<br />6:&nbsp;<br />7:&nbsp;<br />8:&nbsp;</code><br />&nbsp;</td><td class='code'><br /><code><span class="keyword">my</span> <span class="symbol">$ua</span> <span class="operator">=</span> <span class="word">LWP::UserAgent</span><span class="operator">-&gt;</span><span class="word">new</span><span class="structure">();</span><br /><span class="keyword">my</span> <span class="symbol">$req</span> <span class="operator">=</span> <span class="word">HTTP::Request</span><span class="operator">-&gt;</span><span class="word">new</span><span class="structure">(</span> <span class="word">GET</span> <span class="operator">=&gt;</span> <span class="symbol">$url</span> <span class="structure">);</span><br /><br /><span class="comment"># make the request, and wait until we get the results back<br /></span><span class="keyword">my</span> <span class="symbol">$res</span> <span class="operator">=</span> <span class="symbol">$ua</span><span class="operator">-&gt;</span><span class="word">request</span><span class="structure">(</span><span class="symbol">$req</span><span class="structure">);</span><br /><br /><span class="keyword">my</span> <span class="symbol">$content</span> <span class="operator">=</span> <span class="symbol">$res</span><span class="operator">-&gt;</span><span class="word">content</span><span class="structure">;</span><br /><span class="keyword">return</span> <span class="symbol">$content</span><span class="structure">;</span></code><br />&nbsp;</td></table>
<p>Here&#39;s how it looks in our sample application:</p>
<table class='code-listing'><tr><td class='line-numbers'><br /><code>1:&nbsp;<br />2:&nbsp;<br />3:&nbsp;<br />4:&nbsp;<br />5:&nbsp;<br />6:&nbsp;<br />7:&nbsp;<br />8:&nbsp;</code><br />&nbsp;</td><td class='code'><br /><code><span class="keyword">while</span> <span class="structure">(</span><span class="keyword">my</span> <span class="structure">(</span><span class="symbol">$service_name</span><span class="operator">,</span> <span class="symbol">$url</span><span class="structure">)</span> <span class="operator">=</span> <span class="structure">(</span><span class="word">each</span> <span class="cast">%</span><span class="structure">{</span><span class="symbol">$self</span><span class="operator">-&gt;</span><span class="word">urls</span><span class="structure">}))</span> <span class="structure">{</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="keyword">my</span> <span class="symbol">$req</span> <span class="operator">=</span> <span class="word">HTTP::Request</span><span class="operator">-&gt;</span><span class="word">new</span><span class="structure">(</span> <span class="word">GET</span> <span class="operator">=&gt;</span> <span class="symbol">$url</span> <span class="structure">);</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="keyword">my</span> <span class="symbol">$res</span> <span class="operator">=</span> <span class="symbol">$self</span><span class="operator">-&gt;</span><span class="word">_user_agent</span><span class="operator">-&gt;</span><span class="word">request</span><span class="structure">(</span><span class="symbol">$req</span><span class="structure">);</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="keyword">my</span> <span class="symbol">$content</span> <span class="operator">=</span> <span class="symbol">$res</span><span class="operator">-&gt;</span><span class="word">content</span><span class="structure">;</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="symbol">$http_data</span><span class="operator">-&gt;</span><span 
class="structure">{</span><span class="symbol">$service_name</span><span class="structure">}</span> <span class="operator">=</span> <span class="symbol">$content</span><span class="structure">;</span><br /><span class="structure">}</span><br />&nbsp;&nbsp;<br /><span class="keyword">return</span> <span class="symbol">$http_data</span><span class="structure">;</span></code><br />&nbsp;</td></table>
<p>Notice that these requests happen for EACH URL in our list, for EACH row in the database results, one after the other, waiting for the previous request to return before starting the next. If we have 10 URLs and each one takes 3 milliseconds, our total request takes 10 x 3 milliseconds, or 30 milliseconds total. If any one of them takes more time than that, it will hold up processing of the others and make our request take even longer.</p>
<p>There are a lot of asynchronous libraries in the CPAN for you to consider. I&#39;m using AnyEvent because we&#39;re going to be talking to RabbitMQ later and AnyEvent::RabbitMQ is one of the better libraries for doing that. When considering what asynchronous library to use, you always need to think about your requirements and to determine if it&#39;s likely you&#39;ll be adding other asynchronous functionality. Swapping one out for another, however, isn&#39;t impossible as they all pretty much follow the same rules (they just call things by different names). What you&#39;ll learn here about how to use AnyEvent will also apply in concept to any other library you might want to use instead.</p>
<p>Here&#39;s what the asynchronous solution looks like in our sample application using AnyEvent::HTTP.</p>
<table class='code-listing'><tr><td class='line-numbers'><br /><code>1:&nbsp;<br />2:&nbsp;<br />3:&nbsp;<br />4:&nbsp;<br />5:&nbsp;<br />6:&nbsp;<br />7:&nbsp;<br />8:&nbsp;<br />9:&nbsp;<br />10:&nbsp;<br />11:&nbsp;<br />12:&nbsp;<br />13:&nbsp;<br />14:&nbsp;<br />15:&nbsp;<br />16:&nbsp;<br />17:&nbsp;<br />18:&nbsp;<br />19:&nbsp;<br />20:&nbsp;<br />21:&nbsp;<br />22:&nbsp;<br />23:&nbsp;<br />24:&nbsp;<br />25:&nbsp;<br />26:&nbsp;<br />27:&nbsp;</code><br />&nbsp;</td><td class='code'><br /><code><span class="keyword">my</span> <span class="symbol">$cv</span> <span class="operator">=</span> <span class="word">AnyEvent</span><span class="operator">-&gt;</span><span class="word">condvar</span><span class="structure">;</span><br />&nbsp;&nbsp;<br /><span class="keyword">my</span> <span class="symbol">$result</span><span class="structure">;</span><br /><br /><span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">begin</span><span class="structure">(</span><span class="keyword">sub</span> <span class="structure">{</span> <span class="core">shift</span><span class="operator">-&gt;</span><span class="word">send</span><span class="structure">(</span><span class="symbol">$result</span><span class="structure">)</span> <span class="structure">});</span><br /><br /><span class="keyword">while</span> <span class="structure">(</span><span class="keyword">my</span> <span class="structure">(</span><span class="symbol">$service_name</span><span class="operator">,</span> <span class="symbol">$url</span><span class="structure">)</span> <span class="operator">=</span> <span class="structure">(</span><span class="word">each</span> <span class="cast">%</span><span class="structure">{</span><span class="symbol">$self</span><span class="operator">-&gt;</span><span class="word">urls</span><span class="structure">}))</span> <span class="structure">{</span><br />&nbsp;&nbsp;<span class="symbol">$cv</span><span class="operator">-&gt;</span><span 
class="word">begin</span><span class="structure">;</span><br /><br />&nbsp;&nbsp;<span class="keyword">my</span> <span class="symbol">$request</span><span class="structure">;</span><br />&nbsp;&nbsp;<br />&nbsp;&nbsp;<span class="symbol">$request</span> <span class="operator">=</span> <span class="word">http_request</span><span class="structure">(</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">GET</span> <span class="operator">=&gt;</span> <span class="symbol">$url</span><span class="operator">,</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">timeout</span> <span class="operator">=&gt;</span> <span class="number">2</span><span class="operator">,</span> <span class="comment"># seconds</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="keyword">sub</span> <span class="structure">{</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="keyword">my</span> <span class="structure">(</span><span class="symbol">$content</span><span class="operator">,</span> <span class="symbol">$headers</span><span class="structure">)</span> <span class="operator">=</span> <span class="magic">@_</span><span class="structure">;</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="symbol">$result</span><span class="operator">-&gt;</span><span class="structure">{</span><span class="symbol">$service_name</span><span class="structure">}</span> <span class="operator">=</span> <span class="symbol">$content</span><span class="structure">;</span><br /><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="core">undef</span> <span class="symbol">$request</span><span class="structure">;</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">end</span><span class="structure">;</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="structure">}</span><br />&nbsp;&nbsp;<span class="structure">);</span><br /><span class="structure">}</span><br /><br /><span class="symbol">$cv</span><span 
class="operator">-&gt;</span><span class="word">end</span><span class="structure">;</span><br /><br /><span class="keyword">return</span> <span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">recv</span><span class="structure">;</span></code><br />&nbsp;</td></table>
<p>Let&#39;s add a few more comments to try and make it a little clearer what&#39;s going on:</p>
<table class='code-listing'><tr><td class='line-numbers'><br /><code>1:&nbsp;<br />2:&nbsp;<br />3:&nbsp;<br />4:&nbsp;<br />5:&nbsp;<br />6:&nbsp;<br />7:&nbsp;<br />8:&nbsp;<br />9:&nbsp;<br />10:&nbsp;<br />11:&nbsp;<br />12:&nbsp;<br />13:&nbsp;<br />14:&nbsp;<br />15:&nbsp;<br />16:&nbsp;<br />17:&nbsp;<br />18:&nbsp;<br />19:&nbsp;<br />20:&nbsp;<br />21:&nbsp;<br />22:&nbsp;<br />23:&nbsp;<br />24:&nbsp;<br />25:&nbsp;<br />26:&nbsp;<br />27:&nbsp;<br />28:&nbsp;<br />29:&nbsp;<br />30:&nbsp;<br />31:&nbsp;<br />32:&nbsp;<br />33:&nbsp;<br />34:&nbsp;<br />35:&nbsp;<br />36:&nbsp;<br />37:&nbsp;<br />38:&nbsp;<br />39:&nbsp;<br />40:&nbsp;<br />41:&nbsp;<br />42:&nbsp;<br />43:&nbsp;<br />44:&nbsp;<br />45:&nbsp;<br />46:&nbsp;<br />47:&nbsp;<br />48:&nbsp;<br />49:&nbsp;<br />50:&nbsp;<br />51:&nbsp;<br />52:&nbsp;<br />53:&nbsp;<br />54:&nbsp;<br />55:&nbsp;</code><br />&nbsp;</td><td class='code'><br /><code><span class="comment"># A condition variable. Basically a mechanism to tell if we're done<br /># processing all the requests or not.<br /></span><span class="keyword">my</span> <span class="symbol">$cv</span> <span class="operator">=</span> <span class="word">AnyEvent</span><span class="operator">-&gt;</span><span class="word">condvar</span><span class="structure">;</span><br /><br /><span class="keyword">my</span> <span class="symbol">$result</span><span class="structure">;</span><br /><br /><span class="comment"># tell the condition variable that it's not &quot;ready&quot; until it sees the &quot;end&quot;<br /># call at the end of the code signifying we're done setting up all the<br /># requests. 
This is good practice to avoid accidentally completing<br /># while we're still setting up all the requests.<br />#<br /># Also pass the code block that will be executed at this point when the<br /># condition variable is &quot;ready&quot;.<br /></span><span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">begin</span><span class="structure">(</span><span class="keyword">sub</span> <span class="structure">{</span> <span class="core">shift</span><span class="operator">-&gt;</span><span class="word">send</span><span class="structure">(</span><span class="symbol">$result</span><span class="structure">)</span> <span class="structure">});</span><br /><br /><span class="comment"># for each of our urls, start a request in parallel<br /></span><span class="keyword">while</span> <span class="structure">(</span><span class="keyword">my</span> <span class="structure">(</span><span class="symbol">$service_name</span><span class="operator">,</span> <span class="symbol">$url</span><span class="structure">)</span> <span class="operator">=</span> <span class="structure">(</span><span class="word">each</span> <span class="cast">%</span><span class="structure">{</span><span class="symbol">$self</span><span class="operator">-&gt;</span><span class="word">urls</span><span class="structure">}))</span> <span class="structure">{</span><br /><br /><span class="comment"> # add another thing to the list of things that must be completed<br />&nbsp;&nbsp;# before the condition variable is &quot;ready&quot;. i.e. let the condition variable<br />&nbsp;&nbsp;# know that we must wait until the HTTP request returns and the callback<br />&nbsp;&nbsp;# calls a corresponding &quot;end&quot; on the condition variable before we're done.<br /></span> <span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">begin</span><span class="structure">;</span><br />&nbsp;&nbsp;<br /><span class="comment"> # schedule a HTTP request to made asynchronously. 
Once it's done<br />&nbsp;&nbsp;# and we've got the content, call the callback.<br /></span> <span class="keyword">my</span> <span class="symbol">$request</span><span class="structure">;</span><br />&nbsp;&nbsp;<span class="symbol">$request</span> <span class="operator">=</span> <span class="word">http_request</span><span class="structure">(</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">GET</span> <span class="operator">=&gt;</span> <span class="symbol">$url</span><span class="operator">,</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">timeout</span> <span class="operator">=&gt;</span> <span class="number">2</span><span class="operator">,</span> <span class="comment"># seconds</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="keyword">sub</span> <span class="structure">{</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="keyword">my</span> <span class="structure">(</span><span class="symbol">$content</span><span class="operator">,</span> <span class="symbol">$headers</span><span class="structure">)</span> <span class="operator">=</span> <span class="magic">@_</span><span class="structure">;</span><br /><br /><span class="comment"> # save the content for this request (i.e. 
what we downloaded) in our<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# result<br /></span> <span class="symbol">$result</span><span class="operator">-&gt;</span><span class="structure">{</span><span class="symbol">$service_name</span><span class="structure">}</span> <span class="operator">=</span> <span class="symbol">$content</span><span class="structure">;</span><br /><br /><span class="comment"> # clean ourselves up<br /></span> <span class="core">undef</span> <span class="symbol">$request</span><span class="structure">;</span><br /><br /><span class="comment"> # and let the condition variable know that this scheduled thing<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# is done and once the last one is done it can fire.<br /></span> <span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">end</span><span class="structure">;</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="structure">}</span><br />&nbsp;&nbsp;<span class="structure">);</span><br /><span class="structure">}</span><br /><br /><span class="comment"># signify that we're done with the stage of setting up all the requests. The<br /># condition variable won't become &quot;ready&quot; until all the requests that called &quot;begin&quot;<br /># above have also completed and called a corresponding number of &quot;end&quot; calls<br /></span><span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">end</span><span class="structure">;</span><br /><br /><span class="comment"># wait (i.e. block) until the condition variable is ready, i.e. until a<br /># corresponding number of &quot;end&quot; calls have been called for each &quot;begin&quot; call<br /></span><span class="keyword">return</span> <span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">recv</span><span class="structure">;</span></code><br />&nbsp;</td></table>
<p>We&#39;ll get back to this code and explain each section bit by bit once we&#39;ve learned a little more about AnyEvent.</p>
<p>First, let&#39;s look at the advantage of using this significantly more complicated code. With an asynchronous solution, if each request takes 3 milliseconds, then we get our results back in 3 milliseconds. If one of those requests takes 1 second, then we get all of our results back in 1 second. Getting our results back takes only as long as the longest single request!</p>
<h3 id="What-does-it-mean-to-be-Asynchronous">What does it mean to be Asynchronous?</h3>
<h4 id="Asynchronous-code-is-event-driven">Asynchronous code is event-driven</h4>
<p>An event queue manages code execution. If you&#39;ve ever worked with user interfaces, you&#39;ll be familiar with the notion of an event listener. An event in a user interface would be a button click: you register a function to execute when the button is pushed. In our example, the function (callback) we&#39;ve registered for the event we&#39;re interested in is pushed onto an event queue and fires after all the other code executes. The event loop processes a queue of callback functions: when an asynchronous function executes, its callback is pushed onto the event queue.</p>
<h4 id="Results-are-returned-in-the-background-while-your-application-does-other-things">Results are returned in the background while your application does other things</h4>
<p>The rest of your application proceeds ahead using a placeholder for the result until it reaches a point where it can&#39;t go any further without the actual result (for example writing that result out to a data store).</p>
<p>The basic formula for Async: <b>Event Loop + Listener + Callbacks</b></p>
<p>A listener registers your interest in a particular event. A callback is a function that runs when your request is in a <i>ready</i> state. No matter which async library you use, they all follow this formula.</p>
<p>Now we&#39;ll talk about more specifically what this looks like in the AnyEvent libraries.</p>
<h3 id="AnyEvent-101">AnyEvent 101</h3>
<h4 id="condvar"><code>condvar</code></h4>
<p>AnyEvent is just an abstraction layer on top of event loops. The <code>condvar()</code> method creates a condition variable, which represents a value that may or may not yet be available. The condition variable doesn&#39;t have a value until it is &ldquo;ready&rdquo;, meaning the request has completed. You use the <code>condvar()</code> method to interface with the event loop.</p>
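<p>As a tiny standalone sketch (not from the article&#39;s sample code), a condvar can be driven by any event source; here a one-second timer stands in for the asynchronous work:</p>

```perl
use AnyEvent;

my $cv = AnyEvent->condvar;   # no value yet, not "ready"

# a one-shot timer stands in for any asynchronous event source
my $w = AnyEvent->timer(after => 1, cb => sub {
    $cv->send('now I have a value');   # mark the condvar "ready"
});

my $value = $cv->recv;        # blocks until send() is called
print "$value\n";
```

<p>Note that <code>recv</code> runs the event loop while it waits, which is how the timer gets a chance to fire at all.</p>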
<h4 id="send"><code>send</code></h4>
<p>The <code>send()</code> method sets the condition variable to <i>ready</i>, indicating that an event is complete. In the case of multiple requests, you can bind a callback to <code>send()</code> so that you can store the values retrieved each time the condvar reaches a ready state.</p>
<h4 id="recv"><code>recv</code></h4>
<p>This returns a value when the <code>condvar</code> is in a ready state. Note that this call blocks: you place it at the point where your program cannot continue until it has the value it needs. I&#39;ll show why this is important later when we introduce RabbitMQ and have to manage multiple asynchronous requests.</p>
<h4 id="begin-and-end"><code>begin</code> and <code>end</code></h4>
<p>These methods are syntactic sugar for using a counter within an event loop. <code>begin</code> increments the counter and <code>end</code> decrements it; the event loop runs until the counter reaches 0. You can use these methods when you&#39;re managing multiple requests within an event loop and you want to make sure the event loop doesn&#39;t terminate until all of the requests you want have finished.</p>
<p>If you were to implement this yourself, you could create a loop where you create a <code>condvar</code> for each individual request. This syntactic sugar lets you declare just one <code>condvar</code> and then specify when you want to set its ready state so <code>recv</code> can return the result.</p>
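<p>For instance (a hypothetical sketch using timers rather than HTTP requests), three pending events can share one <code>condvar</code> through the counter:</p>

```perl
use AnyEvent;

my $cv = AnyEvent->condvar;
$cv->begin;                    # counter = 1: guard the setup phase

my @watchers;
for my $delay (1 .. 3) {
    $cv->begin;                # one increment per pending timer
    push @watchers, AnyEvent->timer(after => $delay, cb => sub {
        $cv->end;              # one decrement per completed timer
    });
}

$cv->end;                      # setup finished; drop the guard
$cv->recv;                     # returns once the counter hits zero,
                               # i.e. after all three timers have fired
```

<p>The outer <code>begin</code>/<code>end</code> pair guarantees the counter can&#39;t hit zero while we&#39;re still registering timers, which is exactly the pattern the article&#39;s HTTP example uses.</p>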
<h3 id="AnyEvent::HTTP-Code-Walkthrough">AnyEvent::HTTP Code Walkthrough</h3>
<p>Remember that code sample from earlier? Now that we have a better vocabulary, we can walk through it in more detail and hopefully understand how we came to write the code like that.</p>
<p>The first step is to initialize the <code>condvar</code>. Remember at this point, it has no actual value!</p>
<table class='code-listing'><tr><td class='line-numbers'><br /><code>1:&nbsp;</code><br />&nbsp;</td><td class='code'><br /><code><span class="keyword">my</span> <span class="symbol">$cv</span> <span class="operator">=</span> <span class="word">AnyEvent</span><span class="operator">-&gt;</span><span class="word">condvar</span><span class="structure">;</span></code><br />&nbsp;</td></table>
<p>Next, you need to determine what you want to return at the <i>ready</i> state and assign it. This is where async programming can get a little tricky (hence the title of this article)! You need to think backwards when you start thinking async. Think from the result and work your way back. Decide at what point your application absolutely needs this data and put your <code>recv</code> call there.</p>
<table class='code-listing'><tr><td class='line-numbers'><br /><code>1:&nbsp;<br />2:&nbsp;</code><br />&nbsp;</td><td class='code'><br /><code><span class="operator">...</span><span class="word">all</span> <span class="word">the</span> <span class="word">other</span> <span class="word">code</span><span class="operator">...</span><br /><span class="keyword">return</span> <span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">recv</span><span class="structure">;</span></code><br />&nbsp;</td></table>
<p>Now we need to work out what code goes between these two statements. We need to figure out when to call <code>send</code> so we can capture our result. We&#39;re going to call it within a callback bound to a <code>begin</code> call. Each time we want to fetch data from a URL, we&#39;ll increment our event loop counter so that we don&#39;t exit the event loop before we&#39;ve received a response for each of our URL requests. In English: once all the URLs are fetched, set the <code>condvar</code> to ready and return the result.</p>
<table class='code-listing'><tr><td class='line-numbers'><br /><code>1:&nbsp;<br />2:&nbsp;<br />3:&nbsp;<br />4:&nbsp;</code><br />&nbsp;</td><td class='code'><br /><code><span class="keyword">my</span> <span class="symbol">$cv</span> <span class="operator">=</span> <span class="word">AnyEvent</span><span class="operator">-&gt;</span><span class="word">condvar</span><span class="structure">;</span><br /><span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">begin</span><span class="structure">(</span><span class="keyword">sub</span> <span class="structure">{</span> <span class="core">shift</span><span class="operator">-&gt;</span><span class="word">send</span><span class="structure">(</span><span class="symbol">$result</span><span class="structure">)</span> <span class="structure">});</span> <span class="comment"># &lt;------------- we add this</span><br /><span class="operator">...</span><br /><span class="keyword">return</span> <span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">recv</span><span class="structure">;</span></code><br />&nbsp;</td></table>
<p>Because we&#39;ve declared a <code>begin</code>, we need to declare a matching <code>end</code> outside of the loop to ensure that <code>send</code> gets called. Without this call to <code>end</code> the counter would never reach zero, and if the while loop had no data at all, <code>recv</code> would block forever. This ensures the <code>send</code> we registered with our <code>begin</code> up above gets called.</p>
<table class='code-listing'><tr><td class='line-numbers'><br /><code>1:&nbsp;<br />2:&nbsp;<br />3:&nbsp;<br />4:&nbsp;<br />5:&nbsp;</code><br />&nbsp;</td><td class='code'><br /><code><span class="keyword">my</span> <span class="symbol">$cv</span> <span class="operator">=</span> <span class="word">AnyEvent</span><span class="operator">-&gt;</span><span class="word">condvar</span><span class="structure">;</span><br /><span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">begin</span><span class="structure">(</span><span class="keyword">sub</span> <span class="structure">{</span> <span class="core">shift</span><span class="operator">-&gt;</span><span class="word">send</span><span class="structure">(</span><span class="symbol">$result</span><span class="structure">)</span> <span class="structure">});</span><br /><span class="operator">...</span><br /><span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">end</span><span class="structure">;</span> <span class="comment"># &lt;------------- we add this</span><br /><span class="keyword">return</span> <span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">recv</span><span class="structure">;</span></code><br />&nbsp;</td></table>
<p>Now that we&#39;ve defined our start and end points, we can focus on the actual logic around retrieving the URLs. We&#39;re going to use the AnyEvent::HTTP library, in particular the <code>http_request</code> function. Calling <code>http_request</code> means that you want the request to be made whenever possible (versus a procedural call, which would mean stop everything and do it right now). This registers an IO watcher and tells the event loop that we are interested in this IO event.</p>
<p>When the request finishes, we want to invoke the callback that we&#39;ve bound to the function call.</p>
<p>This function doesn&#39;t actually perform the request; it returns right away and tells the event loop to make the request whenever it can. When this chunk of code runs, the event loop is not running, because this chunk of code has control.</p>
<table class='code-listing'><tr><td class='line-numbers'><br /><code>1:&nbsp;<br />2:&nbsp;<br />3:&nbsp;<br />4:&nbsp;<br />5:&nbsp;<br />6:&nbsp;<br />7:&nbsp;<br />8:&nbsp;<br />9:&nbsp;<br />10:&nbsp;<br />11:&nbsp;<br />12:&nbsp;<br />13:&nbsp;<br />14:&nbsp;<br />15:&nbsp;<br />16:&nbsp;<br />17:&nbsp;<br />18:&nbsp;<br />19:&nbsp;<br />20:&nbsp;</code><br />&nbsp;</td><td class='code'><br /><code><span class="keyword">my</span> <span class="symbol">$cv</span> <span class="operator">=</span> <span class="word">AnyEvent</span><span class="operator">-&gt;</span><span class="word">condvar</span><span class="structure">;</span><br /><span class="keyword">my</span> <span class="symbol">$result</span><span class="structure">;</span> <br /><span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">begin</span><span class="structure">(</span><span class="keyword">sub</span> <span class="structure">{</span> <span class="core">shift</span><span class="operator">-&gt;</span><span class="word">send</span><span class="structure">(</span><span class="symbol">$result</span><span class="structure">)</span> <span class="structure">});</span><br /><br /><span class="keyword">while</span> <span class="structure">(</span><span class="keyword">my</span> <span class="structure">(</span><span class="symbol">$service_name</span><span class="operator">,</span> <span class="symbol">$url</span><span class="structure">)</span> <span class="operator">=</span> <span class="structure">(</span><span class="word">each</span> <span class="cast">%</span><span class="structure">{</span><span class="symbol">$self</span><span class="operator">-&gt;</span><span class="word">urls</span><span class="structure">}))</span> <span class="structure">{</span><br />&nbsp;&nbsp;<span class="operator">...</span><br />&nbsp;&nbsp;<span class="word">my</span> <span class="symbol">$request</span><span class="structure">;</span> <span class="operator">&lt;----</span><br 
/>&nbsp;&nbsp;<span class="symbol">$request</span> <span class="operator">=</span> <span class="word">http_request</span><span class="structure">(</span> <span class="readline">&lt;----<br />&nbsp;&nbsp;&nbsp;&nbsp;GET =&gt;</span> <span class="symbol">$url</span><span class="operator">,</span> <span class="readline">&lt;---- we add<br />&nbsp;&nbsp;&nbsp;&nbsp;timeout =&gt;</span> <span class="number">2</span><span class="operator">,</span> <span class="comment"># seconds &lt;---- all of</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="keyword">sub</span> <span class="structure">{</span> <span class="operator">&lt;----</span> <span class="word">this</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">my</span> <span class="structure">(</span><span class="symbol">$content</span><span class="operator">,</span> <span class="symbol">$headers</span><span class="structure">)</span> <span class="operator">=</span> <span class="magic">@_</span><span class="structure">;</span> <span class="operator">&lt;----</span> <span class="word">code</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="symbol">$result</span><span class="operator">-&gt;</span><span class="structure">{</span><span class="symbol">$service_name</span><span class="structure">}</span> <span class="operator">=</span> <span class="symbol">$content</span><span class="structure">;</span> <span class="operator">&lt;----</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="operator">...</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="structure">}</span> <span class="operator">&lt;----</span><br />&nbsp;&nbsp;<span class="structure">);</span> <span class="operator">&lt;----</span><br /><span class="structure">}</span><br /><br /><span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">end</span><span class="structure">;</span><br /><span class="keyword">return</span> <span class="symbol">$cv</span><span class="operator">-&gt;</span><span 
class="word">recv</span><span class="structure">;</span></code><br />&nbsp;</td></table>
<p>We need to unregister our interest in this event once we&#39;re done with it.</p>
<table class='code-listing'><tr><td class='line-numbers'><br /><code>1:&nbsp;<br />2:&nbsp;<br />3:&nbsp;<br />4:&nbsp;<br />5:&nbsp;<br />6:&nbsp;<br />7:&nbsp;<br />8:&nbsp;<br />9:&nbsp;<br />10:&nbsp;<br />11:&nbsp;<br />12:&nbsp;<br />13:&nbsp;<br />14:&nbsp;<br />15:&nbsp;<br />16:&nbsp;<br />17:&nbsp;<br />18:&nbsp;<br />19:&nbsp;<br />20:&nbsp;<br />21:&nbsp;<br />22:&nbsp;<br />23:&nbsp;</code><br />&nbsp;</td><td class='code'><br /><code><span class="keyword">my</span> <span class="symbol">$cv</span> <span class="operator">=</span> <span class="word">AnyEvent</span><span class="operator">-&gt;</span><span class="word">condvar</span><span class="structure">;</span><br /><span class="keyword">my</span> <span class="symbol">$result</span><span class="structure">;</span> <br /><span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">begin</span><span class="structure">(</span><span class="keyword">sub</span> <span class="structure">{</span> <span class="core">shift</span><span class="operator">-&gt;</span><span class="word">send</span><span class="structure">(</span><span class="symbol">$result</span><span class="structure">)</span> <span class="structure">});</span><br /><br /><span class="keyword">while</span> <span class="structure">(</span><span class="keyword">my</span> <span class="structure">(</span><span class="symbol">$service_name</span><span class="operator">,</span> <span class="symbol">$url</span><span class="structure">)</span> <span class="operator">=</span> <span class="structure">(</span><span class="word">each</span> <span class="cast">%</span><span class="structure">{</span><span class="symbol">$self</span><span class="operator">-&gt;</span><span class="word">urls</span><span class="structure">}))</span> <span class="structure">{</span><br />&nbsp;&nbsp;<span class="operator">...</span><br />&nbsp;&nbsp;<span class="word">my</span> <span class="symbol">$request</span><span class="structure">;</span> <br 
/>&nbsp;&nbsp;<span class="symbol">$request</span> <span class="operator">=</span> <span class="word">http_request</span><span class="structure">(</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">GET</span> <span class="operator">=&gt;</span> <span class="symbol">$url</span><span class="operator">,</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">timeout</span> <span class="operator">=&gt;</span> <span class="number">2</span><span class="operator">,</span> <span class="comment"># seconds</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="keyword">sub</span> <span class="structure">{</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="keyword">my</span> <span class="structure">(</span><span class="symbol">$content</span><span class="operator">,</span> <span class="symbol">$headers</span><span class="structure">)</span> <span class="operator">=</span> <span class="magic">@_</span><span class="structure">;</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="symbol">$result</span><span class="operator">-&gt;</span><span class="structure">{</span><span class="symbol">$service_name</span><span class="structure">}</span> <span class="operator">=</span> <span class="symbol">$content</span><span class="structure">;</span><br /><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="core">undef</span> <span class="symbol">$request</span><span class="structure">;</span> <span class="comment"># &lt;------------- we add this</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="operator">...</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="structure">}</span><br />&nbsp;&nbsp;<span class="structure">);</span><br />&nbsp;&nbsp;<span class="operator">...</span><br /><span class="structure">}</span><br /><br /><span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">end</span><span class="structure">;</span><br /><span class="keyword">return</span> <span class="symbol">$cv</span><span class="operator">-&gt;</span><span 
class="word">recv</span><span class="structure">;</span></code><br />&nbsp;</td></table>
<p>The inner begin and end calls define the individual requests that we want to track in the event loop. For each begin, the AnyEvent event loop will increment a counter. For each end the AnyEvent event loop decrements the counter. Once the counter reaches 0, AnyEvent can set the <code>condvar</code> to <i>ready</i>.</p>
<table class='code-listing'><tr><td class='line-numbers'><br /><code>1:&nbsp;<br />2:&nbsp;<br />3:&nbsp;<br />4:&nbsp;<br />5:&nbsp;<br />6:&nbsp;<br />7:&nbsp;<br />8:&nbsp;<br />9:&nbsp;<br />10:&nbsp;<br />11:&nbsp;<br />12:&nbsp;<br />13:&nbsp;<br />14:&nbsp;<br />15:&nbsp;<br />16:&nbsp;<br />17:&nbsp;<br />18:&nbsp;<br />19:&nbsp;<br />20:&nbsp;<br />21:&nbsp;<br />22:&nbsp;<br />23:&nbsp;<br />24:&nbsp;<br />25:&nbsp;<br />26:&nbsp;<br />27:&nbsp;</code><br />&nbsp;</td><td class='code'><br /><code><span class="keyword">my</span> <span class="symbol">$cv</span> <span class="operator">=</span> <span class="word">AnyEvent</span><span class="operator">-&gt;</span><span class="word">condvar</span><span class="structure">;</span><br />&nbsp;&nbsp;<br /><span class="keyword">my</span> <span class="symbol">$result</span><span class="structure">;</span><br />&nbsp;&nbsp;&nbsp;<br /><span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">begin</span><span class="structure">(</span><span class="keyword">sub</span> <span class="structure">{</span> <span class="core">shift</span><span class="operator">-&gt;</span><span class="word">send</span><span class="structure">(</span><span class="symbol">$result</span><span class="structure">)</span> <span class="structure">});</span><br /><br /><span class="keyword">while</span> <span class="structure">(</span><span class="keyword">my</span> <span class="structure">(</span><span class="symbol">$service_name</span><span class="operator">,</span> <span class="symbol">$url</span><span class="structure">)</span> <span class="operator">=</span> <span class="structure">(</span><span class="word">each</span> <span class="cast">%</span><span class="structure">{</span><span class="symbol">$self</span><span class="operator">-&gt;</span><span class="word">urls</span><span class="structure">}))</span> <span class="structure">{</span><br />&nbsp;&nbsp;<span class="symbol">$cv</span><span 
class="operator">-&gt;</span><span class="word">begin</span><span class="structure">;</span> <span class="comment"># &lt;------------- we add this</span><br /><br />&nbsp;&nbsp;<span class="keyword">my</span> <span class="symbol">$request</span><span class="structure">;</span><br />&nbsp;&nbsp;<br />&nbsp;&nbsp;<span class="symbol">$request</span> <span class="operator">=</span> <span class="word">http_request</span><span class="structure">(</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">GET</span> <span class="operator">=&gt;</span> <span class="symbol">$url</span><span class="operator">,</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">timeout</span> <span class="operator">=&gt;</span> <span class="number">2</span><span class="operator">,</span> <span class="comment"># seconds</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="keyword">sub</span> <span class="structure">{</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="keyword">my</span> <span class="structure">(</span><span class="symbol">$content</span><span class="operator">,</span> <span class="symbol">$headers</span><span class="structure">)</span> <span class="operator">=</span> <span class="magic">@_</span><span class="structure">;</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="symbol">$result</span><span class="operator">-&gt;</span><span class="structure">{</span><span class="symbol">$service_name</span><span class="structure">}</span> <span class="operator">=</span> <span class="symbol">$content</span><span class="structure">;</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="core">undef</span> <span class="symbol">$request</span><span class="structure">;</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">end</span><span class="structure">;</span> <span class="comment"># &lt;------------- and we add this</span><br 
/>&nbsp;&nbsp;&nbsp;&nbsp;<span class="structure">}</span><br />&nbsp;&nbsp;<span class="structure">);</span><br /><span class="structure">}</span><br /><br /><span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">end</span><span class="structure">;</span><br /><br /><span class="keyword">return</span> <span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">recv</span><span class="structure">;</span></code><br />&nbsp;</td></table>
<p>And that&#39;s it, we&#39;ve got a working example!</p>
<h3 id="When-should-I-use-Asynchronous-Programming">When should I use Asynchronous Programming?</h3>
<p>An asynchronous solution works best for problems that meet all of the following criteria:</p>
<ul>
<li><p>You need to do many things that could take a while</p>
</li>
<li><p>You don&#39;t care about the order you do those things in</p>
</li>
</ul>
<p>Given that an asynchronous solution can be hard to read and potentially hard to maintain, you should use metrics to determine if this added complexity really buys you anything.</p>
<h3 id="Next-Improvement:-Lets-create-a-work-queue">Next Improvement: Let&#39;s create a work queue!</h3>
<p>Work queues are good for throttling the work your application has to do and for scheduling when that work needs to happen, and they are a good solution for anything that could take long enough to cause timeouts for the end user. To create a work queue, you need a message queue: one part of your application posts messages (work to be done) to it, and another part of your application, or a separate <i>worker</i> process, pulls the messages off and performs the work.</p>
<p>You can throttle the database load by deciding how many messages your worker handles at one time. This could be as simple as a while loop that pauses between batches. You could also spin up more workers if your message queue gets too long. If you need to make massive updates to lots of records and don&#39;t want to do them at a peak time, when the database indexes are constantly being updated by a large number of writes, you can schedule your worker processes to run at an off-peak time. This can be as simple as a Perl script scheduled with a cron job.</p>
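<p>In its simplest form, the throttled worker loop described above might look like this sketch, where <code>fetch_batch</code> and <code>process_message</code> are hypothetical stand-ins for your queue client and your report logic:</p>

```perl
use strict;
use warnings;

while (1) {
    # throttle: never take more than 10 messages off the queue at once
    my @messages = fetch_batch(10);

    process_message($_) for @messages;

    # pause between batches so the database gets a breather
    sleep 5;
}
```

<p>Tuning the batch size and the pause length gives you two independent knobs for controlling how hard the worker hits the database.</p>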
<p>In our example, let&#39;s say our users all hit the database at the same time every week for these reports. The reports hit the database pretty hard, which causes timeouts for other users and causes reports to fail to complete. Again, before we jump into adding yet more complexity to our technology stack, we need to consider whether this solution is the best one for our problem.</p>
<p>Other solutions could be to move our single database instance into a cluster, or to pre-compute the data into predictable date chunks. If we have a table that has a lengthy history we could look at windowing or partitioning of the table to limit the total results the query has to run against. Let&#39;s say in this situation, we don&#39;t have the resources to make changes to the database. Apparently that&#39;s another department and there&#39;s a lengthy red tape process we need to follow in order to get anything done in that regard. While that points to greater systemic problems at our example company, this does often reflect reality and we still need to come up with a solution.</p>
<p>Enter RabbitMQ. RabbitMQ, implemented in Erlang, runs as a separate process that your various Perl processes communicate with via AMQP (Advanced Message Queuing Protocol). RabbitMQ essentially receives and forwards messages. You could also use Redis and the Redis::Requeue library for the same thing, but RabbitMQ offers us an asynchronous option.</p>
<h3 id="What-does-Asynchronous-Programming-have-to-do-with-RabbitMQ">What does Asynchronous Programming have to do with RabbitMQ?</h3>
<p>Socket connections are traditionally blocking: the connection attempt blocks until the connection is established. So to connect to RabbitMQ, we would need to wait for the connection, and we would need to be able to respond to error messages. AnyEvent is an asynchronous library, and AnyEvent::RabbitMQ is the canonical Perl library for interacting with RabbitMQ.</p>
<p>Using an asynchronous connection with RabbitMQ is important because RabbitMQ sends a heartbeat to determine whether the connection is still active. Most web framework applications are blocking by nature: code is only triggered when a specific request (usually via a URL route) is made.</p>
<p>Once the publisher has established a connection, RabbitMQ can create an exchange. This exchange handles all incoming updates and redistributes them to the available queues. Once the exchange is created, RabbitMQ can send messages.</p>
<p>If the web application does not respond to the RabbitMQ heartbeat, RabbitMQ will hang up on you, closing the connection. So unless you want to reconnect every time you want to send a message to RabbitMQ, you&#39;ll need to double-check that your web framework has a way to respond to this heartbeat.</p>
<p>In the example code, I&#39;ve chosen to use Mojolicious because Mojo::IOLoop handles multiple reactor backends.</p>
<h4 id="Step-1:-Replace-the-original-work-section-with-publishing-a-message-to-RabbitMQ">Step 1: Replace the original &ldquo;work&rdquo; section with publishing a message to RabbitMQ</h4>
<p>Some things to think about:</p>
<ul>
<li><p>do you need to keep a record of the queue data?</p>
</li>
<li><p>do you need to recreate the queue if the messages are lost?</p>
</li>
</ul>
<p>First we need to set up our RabbitMQ connection. This is in the <a rel="nofollow" target="_blank" href="https://github.com/missaugustina/perl-out-of-order/blob/master/worker-queue/bin/main.pl">main.pl</a> file of the worker queue version of the sample code.</p>
<table class='code-listing'><tr><td class='line-numbers'><br /><code>1:&nbsp;<br />2:&nbsp;<br />3:&nbsp;<br />4:&nbsp;<br />5:&nbsp;<br />6:&nbsp;<br />7:&nbsp;<br />8:&nbsp;<br />9:&nbsp;<br />10:&nbsp;<br />11:&nbsp;</code><br />&nbsp;</td><td class='code'><br /><code><span class="keyword">my</span> <span class="symbol">$cv</span> <span class="operator">=</span> <span class="word">AnyEvent</span><span class="operator">-&gt;</span><span class="word">condvar</span><span class="structure">;</span><br /><br /><span class="keyword">my</span> <span class="symbol">$ar</span> <span class="operator">=</span> <span class="word">AnyEvent::RabbitMQ</span><span class="operator">-&gt;</span><span class="word">new</span><span class="operator">-&gt;</span><span class="word">load_xml_spec</span><span class="structure">()</span><span class="operator">-&gt;</span><span class="word">connect</span><span class="structure">(</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">host</span> <span class="operator">=&gt;</span> <span class="single">'localhost'</span><span class="operator">,</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">port</span> <span class="operator">=&gt;</span> <span class="number">5672</span><span class="operator">,</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">user</span> <span class="operator">=&gt;</span> <span class="single">'guest'</span><span class="operator">,</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">pass</span> <span class="operator">=&gt;</span> <span class="single">'guest'</span><span class="operator">,</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">vhost</span> <span class="operator">=&gt;</span> <span class="single">'/'</span><span class="operator">,</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">on_success</span> <span class="operator">=&gt;</span> <span class="keyword">sub</span> <span class="structure">{</span> <span class="operator">...</span> <span class="structure">}</span><span 
class="operator">,</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="operator">...</span><br /><span class="structure">);</span></code><br />&nbsp;</td></table>
<p>This essentially connects to the message queue and then calls the <code>on_success</code> callback. Inside this <code>on_success</code> callback we then open the channel:</p>
<table class='code-listing'><tr><td class='line-numbers'><br /><code>1:&nbsp;<br />2:&nbsp;<br />3:&nbsp;<br />4:&nbsp;<br />5:&nbsp;<br />6:&nbsp;<br />7:&nbsp;</code><br />&nbsp;</td><td class='code'><br /><code><span class="word">on_success</span> <span class="operator">=&gt;</span> <span class="keyword">sub</span> <span class="structure">{</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="keyword">my</span> <span class="symbol">$ar</span> <span class="operator">=</span> <span class="core">shift</span><span class="structure">;</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="symbol">$ar</span><span class="operator">-&gt;</span><span class="word">open_channel</span><span class="structure">(</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">on_success</span> <span class="operator">=&gt;</span> <span class="keyword">sub</span> <span class="structure">{</span> <span class="operator">...</span> <span class="structure">}</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="operator">...</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="structure">);</span><br /><span class="structure">}</span><span class="operator">,</span></code><br />&nbsp;</td></table>
<p>And inside the <code>on_success</code> callback of the <code>open_channel</code> we have the bit of code that actually puts stuff on the message queue:</p>
<table class='code-listing'><tr><td class='line-numbers'><br /><code>1:&nbsp;<br />2:&nbsp;<br />3:&nbsp;<br />4:&nbsp;<br />5:&nbsp;<br />6:&nbsp;<br />7:&nbsp;<br />8:&nbsp;<br />9:&nbsp;<br />10:&nbsp;<br />11:&nbsp;<br />12:&nbsp;<br />13:&nbsp;<br />14:&nbsp;<br />15:&nbsp;<br />16:&nbsp;<br />17:&nbsp;<br />18:&nbsp;<br />19:&nbsp;<br />20:&nbsp;<br />21:&nbsp;<br />22:&nbsp;<br />23:&nbsp;<br />24:&nbsp;</code><br />&nbsp;</td><td class='code'><br /><code><span class="word">on_success</span> <span class="operator">=&gt;</span> <span class="keyword">sub</span> <span class="structure">{</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="keyword">my</span> <span class="symbol">$channel</span> <span class="operator">=</span> <span class="core">shift</span><span class="structure">;</span><br /><br /><span class="comment"> # use a named queue &quot;reports&quot;<br /></span> <span class="symbol">$channel</span><span class="operator">-&gt;</span><span class="word">declare_queue</span><span class="structure">(</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">queue</span> <span class="operator">=&gt;</span> <span class="single">'reports'</span><span class="operator">,</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">auto_delete</span> <span class="operator">=&gt;</span> <span class="number">0</span><span class="operator">,</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="structure">);</span><br /><br /><span class="comment"> # publish (send) to the message queue<br /></span> <span class="keyword">my</span> <span class="symbol">%publish_args</span> <span class="operator">=</span> <span class="structure">(</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">header</span> <span class="operator">=&gt;</span> <span class="structure">{</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
class="word">content_type</span> <span class="operator">=&gt;</span> <span class="single">'application/json'</span><span class="operator">,</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="structure">}</span><span class="operator">,</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">body</span> <span class="operator">=&gt;</span> <span class="word">encode_json</span><span class="structure">(</span><span class="symbol">$params</span><span class="structure">)</span><span class="operator">,</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">routing_key</span> <span class="operator">=&gt;</span> <span class="single">'reports'</span><span class="operator">,</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="structure">);</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="symbol">$channel</span><span class="operator">-&gt;</span><span class="word">publish</span><span class="structure">(</span><span class="symbol">%publish_args</span><span class="structure">);</span><br /><br /><span class="comment"> # and update the condition variable to say we're done with the<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# queue sending part<br /></span> <span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">send</span><span class="structure">(</span><span class="double">&quot;Added report request to queue&quot;</span><span class="structure">);</span><br />&nbsp;&nbsp;<span class="structure">}</span><span class="operator">,</span><br />&nbsp;&nbsp;<span class="operator">...</span></code><br />&nbsp;</td></table>
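<p>To make the shape of those publish arguments concrete, here is a small Python sketch (my own illustration, not part of the article&#39;s code): the body is JSON-encoded, a header declares the content type, and the routing key names the queue the worker will read from.</p>

```python
import json

def make_publish_args(params, queue="reports"):
    """Assemble publish arguments analogous to the Perl %publish_args:
    JSON-encoded body, a content-type header, and a routing key
    naming the queue."""
    return {
        "header": {"content_type": "application/json"},
        "body": json.dumps(params),
        "routing_key": queue,
    }

args = make_publish_args({"report_id": 42, "format": "pdf"})
```

<p>The worker on the other end simply reverses the encoding with the matching JSON decode, just as <code>decode_json</code> does in <code>report_worker.pl</code>.</p>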
<h4 id="Step-2:-Move-the-work-to-a-worker-process">Step 2: Move the <i>work</i> to a worker process</h4>
<p>Now that we&#39;ve got something requesting work, we need a worker on the other end of the message queue actually doing the work.</p>
<p>You can schedule your worker as a cron job; this works well with RabbitMQ because a new run won&#39;t overstep a previous one that hasn&#39;t finished yet. If you aren&#39;t a fan of cron jobs, you can instead run the worker on an <code>AnyEvent-&gt;timer</code> loop. There are quite a few options for running your worker process on a schedule; the sample application just uses a simple Perl script with the idea that it would be run from cron.</p>
<p>The worker will need to take a message, process it, then grab the next one. You can run as many of these as your workload demands. To determine that workload, log metrics and keep an eye on them. Don&#39;t make assumptions that aren&#39;t based on actual data!</p>
<p>We&#39;ll use the <code>consume()</code> method to get the next message from the message queue. From each message we extract the delivery tag, which we&#39;ll need later to communicate with the queue. The payload is the content of the queue item; in our example it&#39;s JSON-encoded, so we need to decode it before we can do anything with it. This code can be found in the <a rel="nofollow" target="_blank" href="https://github.com/missaugustina/perl-out-of-order/blob/master/worker-queue/bin/report_worker.pl">report_worker.pl</a> file.</p>
<table class='code-listing'><tr><td class='line-numbers'><br /><code>1:&nbsp;<br />2:&nbsp;<br />3:&nbsp;<br />4:&nbsp;<br />5:&nbsp;<br />6:&nbsp;<br />7:&nbsp;<br />8:&nbsp;<br />9:&nbsp;<br />10:&nbsp;<br />11:&nbsp;<br />12:&nbsp;<br />13:&nbsp;<br />14:&nbsp;<br />15:&nbsp;<br />16:&nbsp;<br />17:&nbsp;<br />18:&nbsp;<br />19:&nbsp;<br />20:&nbsp;<br />21:&nbsp;<br />22:&nbsp;<br />23:&nbsp;<br />24:&nbsp;</code><br />&nbsp;</td><td class='code'><br /><code><span class="comment"># connect to the message queue as in the previous example<br /></span><span class="symbol">$ar</span><span class="operator">-&gt;</span><span class="word">open_channel</span><span class="structure">(</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">on_success</span> <span class="operator">=&gt;</span> <span class="keyword">sub</span> <span class="structure">{</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="keyword">my</span> <span class="symbol">$channel</span> <span class="operator">=</span> <span class="core">shift</span><span class="structure">;</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="operator">...</span><br /><br /><span class="comment"> # consume (read) messages when they arrive<br /></span> <span class="symbol">$channel</span><span class="operator">-&gt;</span><span class="word">consume</span><span class="structure">(</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">on_consume</span> <span class="operator">=&gt;</span> <span class="keyword">sub</span> <span class="structure">{</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="keyword">my</span> <span class="structure">(</span><span class="symbol">$message</span><span class="structure">)</span> <span class="operator">=</span> <span class="magic">@_</span><span class="structure">;</span><br 
/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="keyword">my</span> <span class="symbol">$frame</span> <span class="operator">=</span> <span class="symbol">$message</span><span class="operator">-&gt;</span><span class="structure">{</span><span class="word">deliver</span><span class="structure">}</span><span class="operator">-&gt;</span><span class="word">method_frame</span><span class="structure">;</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="keyword">my</span> <span class="symbol">$delivery_tag</span> <span class="operator">=</span> <span class="symbol">$frame</span><span class="operator">-&gt;</span><span class="word">delivery_tag</span><span class="structure">;</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="keyword">my</span> <span class="symbol">$params</span> <span class="operator">=</span> <span class="word">decode_json</span><span class="structure">(</span><span class="symbol">$message</span><span class="operator">-&gt;</span><span class="structure">{</span><span class="word">body</span><span class="structure">}</span><span class="operator">-&gt;</span><span class="word">payload</span><span class="structure">);</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<br /><span class="comment"> # do work specified in %params...<br /></span> <br /><span class="comment"> # confirm that we've done the work and the message<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# queue can drop the message<br /></span> <span class="symbol">$channel</span><span class="operator">-&gt;</span><span class="word">ack</span><span 
class="structure">(</span><span class="word">delivery_tag</span> <span class="operator">=&gt;</span> <span class="symbol">$delivery_tag</span><span class="structure">);</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="structure">}</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="structure">);</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="structure">}</span><span class="operator">,</span><br /><span class="structure">);</span></code><br />&nbsp;</td></table>
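<p>The consume-then-ack contract is worth pausing on: the broker keeps a message pending until the worker acknowledges it by delivery tag, so a worker that dies mid-job doesn&#39;t silently lose work. This toy Python queue (a stand-in of my own, not the RabbitMQ API) sketches that lifecycle.</p>

```python
import json

class ToyQueue:
    """Toy broker queue: a message stays pending until it is acked."""
    def __init__(self):
        self.pending = {}    # delivery_tag -> message body
        self.next_tag = 1

    def publish(self, body):
        self.pending[self.next_tag] = body
        self.next_tag += 1

    def consume(self):
        # Deliver the oldest unacked message along with its delivery tag.
        tag = min(self.pending)
        return tag, self.pending[tag]

    def ack(self, delivery_tag):
        # Only after an ack may the broker drop the message.
        del self.pending[delivery_tag]

q = ToyQueue()
q.publish(json.dumps({"report": "monthly"}))
tag, payload = q.consume()
params = json.loads(payload)   # decode the JSON payload, as the worker does
# ... do the work described in params ...
q.ack(tag)                     # the queue may now discard the message
```

<p>If the worker crashed before the <code>ack</code>, the message would still be in <code>pending</code> and could be redelivered, which is exactly the durability the article relies on.</p>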
<h3 id="The-Gotcha">The Gotcha</h3>
<p>Here&#39;s what the work looks like in the <i>do work</i> section mentioned above. We create a ReportBuilder instance and tell it to build the report.</p>
<table class='code-listing'><tr><td class='line-numbers'><br /><code>1:&nbsp;<br />2:&nbsp;<br />3:&nbsp;<br />4:&nbsp;</code><br />&nbsp;</td><td class='code'><br /><code><span class="keyword">my</span> <span class="symbol">$report_builder</span> <span class="operator">=</span> <span class="word">Poo::ReportBuilder</span><span class="operator">-&gt;</span><span class="word">new</span><span class="structure">();</span><br /><span class="keyword">my</span> <span class="symbol">$report_data</span> <span class="operator">=</span> <span class="symbol">$report_builder</span><span class="operator">-&gt;</span><span class="word">build_report</span><span class="structure">(</span><span class="symbol">$params</span><span class="structure">);</span><br /><span class="keyword">my</span> <span class="symbol">$report_json</span> <span class="operator">=</span> <span class="word">encode_json</span><span class="structure">(</span><span class="symbol">$report_data</span><span class="structure">);</span><br /><span class="keyword">my</span> <span class="symbol">$report</span> <span class="operator">=</span> <span class="word">Poo::Report</span><span class="operator">-&gt;</span><span class="word">new</span><span class="structure">(</span><span class="symbol">$params</span><span class="structure">)</span><span class="operator">-&gt;</span><span class="word">save</span><span class="structure">;</span></code><br />&nbsp;</td></table>
<p>Now here&#39;s the gotcha, and this is an easy mistake to make when you&#39;re first dealing with asynchronous programming: Remember that our <code>build_report</code> function itself uses AnyEvent::HTTP to build the reports - to run more than one web request at a time - and we didn&#39;t write it in a way that plays well with the AnyEvent code we just wrote to deal with the message queue.</p>
<p>If you were to run this code as is, you would get the following error:</p>
<pre><code> AnyEvent::CondVar: recursive blocking wait attempted at ../lib/Poo/ReportBuilder.pm line 238</code></pre>
<p>For reference, I&#39;ve implemented this in the <a rel="nofollow" target="_blank" href="https://github.com/missaugustina/perl-out-of-order/tree/master/async">async</a> version of the application on Github, with a comment explaining why not to do it. Use that code to play around with adding another asynchronous call (say, by adding the RabbitMQ code) and see what happens.</p>
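<p>The same class of failure exists in other event frameworks, which can make it easier to recognise. As a rough analogue of my own (Python&#39;s asyncio, not AnyEvent), attempting a blocking wait from inside a callback that the event loop itself is running raises an equivalent error:</p>

```python
import asyncio

def recursive_wait_error():
    """Trigger asyncio's version of a 'recursive blocking wait'."""
    loop = asyncio.new_event_loop()
    captured = {}

    def bad_callback():
        fut = loop.create_future()  # a result that has not arrived yet
        try:
            # Blocking inside the running loop: the same mistake as
            # calling recv() from within AnyEvent callback code.
            loop.run_until_complete(fut)
        except RuntimeError as exc:
            captured["error"] = str(exc)
        loop.stop()

    loop.call_soon(bad_callback)
    loop.run_forever()
    loop.close()
    return captured["error"]
```

<p>Here asyncio reports that the event loop is already running, just as AnyEvent reports the recursive blocking wait: both are telling you that a blocking wait was attempted from code the event loop was in the middle of running.</p>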
<p>Remember, we have to call <code>recv</code> at the point in our code where we absolutely need the result before continuing. Because the program cannot proceed until this method returns something, <code>recv()</code> is a blocking call - and we don&#39;t want to block in the middle of generating a report; we need to get back to the event loop to respond to those heartbeats!</p>
<p><code>recv</code> belongs only at the very top level of a program. Everywhere below that, use callbacks rather than blocking on the <code>condvar</code>: anything that depends on a return value shouldn&#39;t wait on a variable, it should bind to a callback.</p>
<p>This is also why asynchronous programming can be confusing, and hard to maintain and debug. There are techniques that improve this workflow, including Promises and Futures, but they&#39;re outside the scope of this article. I do encourage you to research them, though, as they are reasonable solutions for dealing with callback soup.</p>
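<p>As a taste of that style in another language (a Python illustration of mine, not the Perl modules linked at the end), a future lets dependent code attach itself as a completion callback instead of blocking on a result variable:</p>

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    # Stand-in for a slow HTTP request.
    return "body of %s" % url

results = []
with ThreadPoolExecutor(max_workers=2) as pool:
    future = pool.submit(fetch, "http://example.com/report")
    # Bind a callback instead of blocking on future.result() mid-pipeline.
    future.add_done_callback(lambda f: results.append(f.result()))
# Leaving the with-block waits for outstanding work to finish,
# so by this point the callback has fired.
```

<p>The dependent step (appending the body) runs when the result exists, and the main flow never blocks in the middle of its pipeline.</p>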
<p>Our original <code>build_report</code> method had this code:</p>
<table class='code-listing'><tr><td class='line-numbers'><br /><code>1:&nbsp;<br />2:&nbsp;<br />3:&nbsp;<br />4:&nbsp;<br />5:&nbsp;<br />6:&nbsp;</code><br />&nbsp;</td><td class='code'><br /><code><span class="keyword">sub</span> <span class="word">build_report</span> <span class="structure">{</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="operator">...</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">my</span> <span class="symbol">$http_data</span> <span class="operator">=</span> <span class="symbol">$self</span><span class="operator">-&gt;</span><span class="word">_get_data_from_urls</span><span class="structure">(</span> <span class="cast">&#92;</span><span class="symbol">%urls_list</span> <span class="structure">);</span><br /><br /><span class="comment"> # do stuff with $http_data...<br /></span><span class="structure">}</span></code><br />&nbsp;</td></table>
<p>We need to change this to bind to a callback so a caller can manage their own <code>condvar</code> and their own <code>recv</code> call.</p>
<table class='code-listing'><tr><td class='line-numbers'><br /><code>1:&nbsp;<br />2:&nbsp;<br />3:&nbsp;<br />4:&nbsp;<br />5:&nbsp;<br />6:&nbsp;<br />7:&nbsp;<br />8:&nbsp;<br />9:&nbsp;<br />10:&nbsp;</code><br />&nbsp;</td><td class='code'><br /><code><span class="keyword">sub</span> <span class="word">build_report</span> <span class="structure">{</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="operator">...</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="symbol">$self</span><span class="operator">-&gt;</span><span class="word">_get_data_from_urls</span><span class="structure">(</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="cast">&#92;</span><span class="symbol">%urls_list</span><span class="operator">,</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="keyword">sub</span> <span class="structure">{</span> <br /><span class="comment"> # do stuff with $http_data...<br /></span> <span class="operator">...</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="structure">}</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="structure">);</span><br /><span class="structure">}</span></code><br />&nbsp;</td></table>
<p>We&#39;ll alter the <code>_get_data_from_urls()</code> method by removing the <code>recv</code> call and using the <code>cb</code> method instead. As soon as something is sent to the <code>condvar</code>, its <code>cb</code> method invokes the registered function. The caller is responsible for providing that callback.</p>
<table class='code-listing'><tr><td class='line-numbers'><br /><code>1:&nbsp;<br />2:&nbsp;<br />3:&nbsp;<br />4:&nbsp;<br />5:&nbsp;<br />6:&nbsp;<br />7:&nbsp;<br />8:&nbsp;<br />9:&nbsp;<br />10:&nbsp;<br />11:&nbsp;<br />12:&nbsp;<br />13:&nbsp;<br />14:&nbsp;<br />15:&nbsp;<br />16:&nbsp;<br />17:&nbsp;<br />18:&nbsp;<br />19:&nbsp;<br />20:&nbsp;<br />21:&nbsp;<br />22:&nbsp;<br />23:&nbsp;<br />24:&nbsp;<br />25:&nbsp;<br />26:&nbsp;<br />27:&nbsp;<br />28:&nbsp;<br />29:&nbsp;<br />30:&nbsp;<br />31:&nbsp;<br />32:&nbsp;<br />33:&nbsp;<br />34:&nbsp;<br />35:&nbsp;<br />36:&nbsp;</code><br />&nbsp;</td><td class='code'><br /><code><span class="keyword">sub</span> <span class="word">_get_data_from_urls</span> <span class="structure">{</span><br />&nbsp;&nbsp;<span class="keyword">my</span> <span class="symbol">$self</span> <span class="operator">=</span> <span class="core">shift</span><span class="structure">;</span><br />&nbsp;&nbsp;<span class="keyword">my</span> <span class="symbol">$urls</span> <span class="operator">=</span> <span class="core">shift</span><span class="structure">;</span><br />&nbsp;&nbsp;<span class="keyword">my</span> <span class="symbol">$callback</span> <span class="operator">=</span> <span class="core">shift</span><span class="structure">;</span><br /><br />&nbsp;&nbsp;<span class="keyword">my</span> <span class="symbol">$cv</span> <span class="operator">=</span> <span class="word">AnyEvent</span><span class="operator">-&gt;</span><span class="word">condvar</span><span class="structure">;</span><br />&nbsp;&nbsp;<br />&nbsp;&nbsp;<span class="keyword">my</span> <span class="symbol">$result</span><span class="structure">;</span><br /><br />&nbsp;&nbsp;<span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">begin</span><span class="structure">(</span><span class="keyword">sub</span> <span class="structure">{</span> <span class="core">shift</span><span class="operator">-&gt;</span><span class="word">send</span><span class="structure">(</span><span class="symbol">$result</span><span class="structure">)</span> <span class="structure">});</span><br /><br />&nbsp;&nbsp;<span 
class="keyword">while</span> <span class="structure">(</span><span class="keyword">my</span> <span class="structure">(</span><span class="symbol">$customerid</span><span class="operator">,</span> <span class="symbol">$urls</span><span class="structure">)</span> <span class="operator">=</span> <span class="structure">(</span><span class="word">each</span> <span class="cast">%</span><span class="structure">{</span><span class="symbol">$urls</span><span class="structure">}))</span> <span class="structure">{</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="keyword">for</span> <span class="keyword">my</span> <span class="symbol">$service_name</span> <span class="structure">(</span><span class="word">keys</span> <span class="cast">%</span><span class="structure">{</span><span class="symbol">$urls</span><span class="structure">})</span> <span class="structure">{</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">begin</span><span class="structure">;</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="keyword">my</span> <span class="symbol">$request</span><span class="structure">;</span><br /><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="symbol">$request</span> <span class="operator">=</span> <span class="word">http_request</span><span class="structure">(</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">GET</span> <span class="operator">=&gt;</span> <span class="symbol">$urls</span><span class="operator">-&gt;</span><span class="structure">{</span><span class="symbol">$service_name</span><span class="structure">}</span><span class="operator">,</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="word">timeout</span> <span class="operator">=&gt;</span> <span class="number">2</span><span class="operator">,</span> <span class="comment"># seconds</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span 
class="keyword">sub</span> <span class="structure">{</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="keyword">my</span> <span class="structure">(</span><span class="symbol">$body</span><span class="operator">,</span> <span class="symbol">$hdr</span><span class="structure">)</span> <span class="operator">=</span> <span class="magic">@_</span><span class="structure">;</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="symbol">$result</span><span class="operator">-&gt;</span><span class="structure">{</span><span class="symbol">$customerid</span><span class="structure">}</span><span class="operator">-&gt;</span><span class="structure">{</span><span class="symbol">$service_name</span><span class="structure">}</span> <span class="operator">=</span> <span class="symbol">$body</span><span class="structure">;</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">end</span><span class="structure">;</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="structure">}</span><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="structure">);</span><br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="structure">}</span><br />&nbsp;&nbsp;<span class="structure">}</span><br />&nbsp;&nbsp;<br />&nbsp;&nbsp;<span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">end</span><span class="structure">;</span><br />&nbsp;&nbsp;<br /><span class="comment"> # this replaces &quot;my $http_result = $cv-&gt;recv;&quot; # &lt;--<br />&nbsp;&nbsp;# now the condition variable calls the callback when it # &lt;-- new<br />&nbsp;&nbsp;# reaches the ready state # &lt;-- code<br /></span> <span class="symbol">$cv</span><span class="operator">-&gt;</span><span class="word">cb</span><span class="structure">(</span><span 
class="symbol">$callback</span><span class="structure">);</span> <span class="comment"># &lt;--</span><br /><span class="structure">}</span></code><br />&nbsp;</td></table>
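<p>The begin/end bookkeeping above is what lets one condition variable wait on many in-flight requests. Here is a tiny Python sketch of the idea (my toy, not AnyEvent&#39;s implementation): <code>begin</code> increments a counter, <code>end</code> decrements it, and when it reaches zero the condvar sends its result and fires whatever callback <code>cb</code> registered.</p>

```python
class CondVar:
    """Toy analogue of an AnyEvent condvar's begin/end/send/cb protocol."""
    def __init__(self):
        self.count = 0
        self.finish = None      # runs when the counter falls to zero
        self.callback = None    # registered via cb()
        self.ready = False
        self.value = None

    def begin(self, finish=None):
        if finish is not None:
            self.finish = finish
        self.count += 1

    def end(self):
        self.count -= 1
        if self.count == 0:
            (self.finish or (lambda cv: cv.send(None)))(self)

    def send(self, value):
        self.ready, self.value = True, value
        if self.callback:
            self.callback(self)

    def cb(self, callback):
        self.callback = callback
        if self.ready:          # already sent: fire straight away
            callback(self)

# Mirror the shape of _get_data_from_urls: one guard begin, one begin
# per request, and a callback bound with cb instead of a blocking recv.
result, got = {}, []
cv = CondVar()
cv.begin(lambda cv: cv.send(result))        # guard, matched by the final end
for service in ("news", "weather"):
    cv.begin()
    result[service] = "body for " + service  # pretend a response landed
    cv.end()
cv.end()                                     # guard end: counter hits zero
cv.cb(lambda cv: got.append(cv.value))
```

<p>The guard <code>begin</code>/<code>end</code> pair matters: without it, the first response arriving before the others were even requested would drop the counter to zero and fire the callback early.</p>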
<h3 id="Sample-Code-Available">Sample Code Available</h3>
<p>Again, I want to reiterate, I have <a rel="nofollow" target="_blank" href="https://github.com/missaugustina/perl-out-of-order/">sample code</a> available for you to download and play with. Please experiment with this, try setting different values and using your debugger to step through the code to see when things get called and what values they produce. Asynchronous programming isn&#39;t something you really understand until you have to use it, and it&#39;s worth understanding.</p>
<h2 id="See-Also">See Also</h2>
<ul>
<li><p><a rel="nofollow" target="_blank" href="https://metacpan.org/module/AnyEvent">AnyEvent</a></p>
</li>
<li><p><a rel="nofollow" target="_blank" href="http://github.com/missaugustina/perl-out-of-order">Sample code</a></p>
</li>
<li><p><a rel="nofollow" target="_blank" href="https://www.youtube.com/watch?v=VYBLCvMu_pA">YAPC::NA talk video</a></p>
</li>
<li><p><a rel="nofollow" target="_blank" href="https://metacpan.org/module/AnyEvent::HTTP">AnyEvent::HTTP</a></p>
</li>
<li><p><a rel="nofollow" target="_blank" href="https://metacpan.org/module/AnyEvent::RabbitMQ">AnyEvent::RabbitMQ</a></p>
</li>
<li><p><a rel="nofollow" target="_blank" href="http://www.rabbitmq.com/">http://www.rabbitmq.com/</a></p>
</li>
<li><p><a rel="nofollow" target="_blank" href="https://metacpan.org/module/Promises">Promises</a></p>
</li>
<li><p><a rel="nofollow" target="_blank" href="https://metacpan.org/module/Future">Future</a></p>
</li>
</ul>
</div>
<p>Augustina Ragwitz<br />
<a rel="nofollow" target="_blank" href="http://perladvent.org/2014/2014-12-24.html">http://perladvent.org/2014/2014-12-24.html</a><br />
Wed, 24 Dec 2014 05:00:00 +0000</p>
<p>[java] A Musical Finale<br />
<a rel="nofollow" target="_blank" href="http://feedproxy.google.com/~r/JavaAdventCalendar/~3/DvLi6qSq238/a-musical-finale.html">http://feedproxy.google.com/~r/JavaAdventCalendar/~3/DvLi6qSq238/a-musical-finale.html</a></p>
<div dir="ltr" style="text-align:left;"><h2 style="text-align:left;">What could be more fitting than Christmas music for Christmas Eve?</h2><div><div><div style="text-align:left;">In this post I want to discuss the joy of making music with Java and why/how I have come to use Python...<br /><br /></div><h3>But first, let us celebrate the season!</h3>We are all human and irrespective of our beliefs, it seems we all enjoy music of some form. For me some of the most beautiful music of all was written by Johan Sebastian Bach. Between 1708 and 1717 he wrote a set of pieces which are collectively called Orgelbüchlein (Little Organ Book). For this post and to celebrate the Java Advent Calendar I tasked Sonic Field to play this piece of music modelling the sounds of a 18th century pipe organ. If you did not know, yes some German organs of about that time really were able to produce huge sounds with reed pipes (for example, <a rel="nofollow" target="_blank" href="http://youtu.be/F51uHpH3yQk">Passacaglia And Fugue</a> the <a rel="nofollow" target="_blank" href="http://www.organartmedia.com/trost">Trost Organ</a>). The piece here is a 'Choral Prelude' which is based on what we would in English commonly call a Carol to be sung by an ensemble.<br /><br /><div class="separator" style="clear:both;text-align:center;"></div><div class="separator" style="clear:both;text-align:center;"><embed width="320" height="266" src="https://www.youtube.com/v/Xdm0_eKyaT0?version=3&f=user_uploads&c=google-webdrive-0&app=youtube_gdata" type="application/x-shockwave-flash"></iframe></div><br /><div style="text-align:center;">BWV 610 Jesu, meine Freude [Jesus, my joy]<br /><i>This performance dedicated to the Java Advent Calendar</i><br /><i>and created exclusively on the JVM using pure</i><br /><i>mathematics.</i></div></div><div><br /><b>How was this piece created?</b><br />Step one is to transcribe the score into midi. 
Fortunately, someone else already did this for me using automated score reading software. Not so fortunately, this software makes all sorts of mistakes which have to be fixed. The biggest issue with automatically generated midi files is that they end up with <a rel="nofollow" target="_blank" href="http://manual.ardour.org/editing-and-arranging/edit-midi/handle-overlapping-notes/">overlapped notes on the same channel</a>; that is strictly impossible in midi and ends up with an ambiguous interpretation of what the sound should be. Midi considers audio as note on, note off. So Note On, Note On, Note Off, Note Off is ambiguous; does it mean:<br /><br />One note overlapping the next or:<br /><span style="font-family:Courier New, Courier, monospace;">-----------------</span><br /><span style="font-family:Courier New, Courier, monospace;">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;---------------</span><br /><br />One note entirely contained in a longer note?<br /><span style="font-family:Courier New, Courier, monospace;">----------------------------</span><br /><span style="font-family:Courier New, Courier, monospace;">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;----</span><br /><span style="font-family:Courier New, Courier, monospace;"><br /></span>Fortunately, tricks can be used to try and figure this out based on note length etc. The Java decoder always treats notes as fully contained. The Python method looks for very short notes which are contained in long ones and guesses the real intention was two long notes which ended up overlapped slightly. 
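<p>To make the ambiguity concrete before looking at the real code: pairing the very same On, On, Off, Off stream under the two interpretations yields different notes. A small illustrative Python function of my own (not Sonic Field&#39;s code):</p>

```python
def pair_notes(events, policy="contained"):
    """Pair note-on/note-off events for one pitch on one channel.

    events: list of ("on"|"off", tick) tuples.
    "contained": an off closes the most recent on (LIFO) -> nested notes,
                 the interpretation the Java decoder always uses.
    "overlap":   an off closes the oldest on (FIFO) -> overlapping notes.
    """
    open_notes, notes = [], []
    for kind, tick in events:
        if kind == "on":
            open_notes.append(tick)
        else:
            start = open_notes.pop() if policy == "contained" else open_notes.pop(0)
            notes.append((start, tick))
    return sorted(notes)

stream = [("on", 0), ("on", 10), ("off", 12), ("off", 30)]
pair_notes(stream, "contained")  # (10, 12) nested entirely inside (0, 30)
pair_notes(stream, "overlap")    # (0, 12) overlapping into (10, 30)
```

<p>Nothing in the midi stream itself says which reading was intended, which is why heuristics like the note-length trick below are needed.</p>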
Here is the Python (the Java version is on <a rel="nofollow" target="_blank" href="https://github.com/nerds-central/SonicFieldRepo/blob/master/SonicField/src/com/nerdscentral/audio/io/MidiFunctions.java">github</a>).<br /><pre class="prettyprint"><code>
def repareOverlapMidi(midi,blip=5):
    print "Interpretation Pass"
    mute=True
    while mute:
        endAt=len(midi)-1
        mute=False
        index=0
        midiOut=[]
        this=[]
        next=[]
        print "Demerge pass:",endAt
        midi=sorted(midi, key=lambda tup: tup[0])
        midi=sorted(midi, key=lambda tup: tup[3])
        while index&lt;endAt:
            this=midi[index]
            next=midi[index+1]
            ttickOn,ttickOff,tnote,tkey,tvelocity=this
            ntickOn,ntickOff,nnote,nkey,nvelocity=next

            # Merge interpretation
            finished=False
            dif=(ttickOff-ttickOn)
            if dif&lt;blip and tkey==nkey and ttickOff&gt;=ntickOn and ttickOff&lt;=ntickOff:
                print "Separating: ",this,next," Diff: ",(ttickOff-ntickOn)
                midiOut.append([ttickOn ,ntickOn ,tnote,tkey,tvelocity])
                midiOut.append([ttickOff,ntickOff,nnote,nkey,nvelocity])
                index+=1
                mute=True
            elif dif&lt;blip:
                print "Removing blip: ",(ttickOff-ttickOn)
                index+=1
                mute=True
                continue
            else:
                midiOut.append(this)
            # iterate the loop
            index+=1
            if index==endAt:
                midiOut.append(next)
        if not mute:
            return midiOut
        midi=midiOut
</code></pre>[<a rel="nofollow" target="_blank" href="https://github.com/nerds-central/SonicFieldRepo/blob/master/SonicField/scripts/python/Bach-Large-Organ/note-formation.sy">This AGPL code is on Github</a>]<br /><br />Then comes some real fun. If you know the original piece, you might have noticed that the introduction is not original. I added that in the midi editing software <a rel="nofollow" target="_blank" href="http://ariamaestosa.sourceforge.net/">Aria Maestosa</a>. 
It does not need to be done this way; we do not even need to use midi files. A lot of the music I have created in Sonic Field is coded directly in Python. However, midi is how it was done here.<br /><br />Once we have a clean set of notes, they need to be converted into sounds. That is done with 'voicing'. I will talk a little about that to set the scene, then we can get back to a more Java-oriented discussion. After all, this is the Java advent calendar!<br /><br />Voicing is exactly the sort of activity which brings Python to the fore. Java is a wordy language with a large degree of strictness; it favours well-constructed, stable structures. Python relies on its clean syntax rules and layout and the <a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Principle_of_least_astonishment">principle of least astonishment</a>. For me, this Pythonic approach really helps with the very human process of making a sound:<br /><br /><pre class="prettyprint"><code>
def chA():
    global midi,index
    print "##### Channel A #####"
    index+=1
    midi=shorterThanMidi(midis[index],beat,512)
    midi=dampVelocity(midi,80,0.75)
    doMidi(voice=leadDiapason,vCorrect=1.0)
    postProcess()

    midi=longAsMidi(midis[index],beat,512)
    midi=legatoMidi(midi,beat,128)
    midi=dampVelocity(midi,68,0.75)
    midi=dampVelocity(midi,80,0.75)
    doMidi(voice=orchestralOboe,vCorrect=0.35,flatEnv=True,pan=0.2)
    postProcessTremolate(rate=3.5)
    doMidi(voice=orchestralOboe,vCorrect=0.35,flatEnv=True,pan=0.8)
    postProcessTremolate(rate=4.5)
</code></pre><br />Above is a 'voice'. Contrary to what one might think, a synthesised sound does not often consist of just one sound source; it consists of many. A piece of music might have many 'voices', and each voice will be a composite of several sounds. To create just the one voice above I have split the notes into long notes and short notes. 
Then the actual notes are created by a call to doMidi. This takes advantage of Python's 'named arguments with default values' feature. Here is the signature for doMidi:<br /><pre class="prettyprint"><code>
def doMidi(voice,vCorrect,pitchShift=1.0,qFactor=1.0,subBass=False,flatEnv=False,pure=False,pan=-1,rawBass=False,pitchAdd=0.0,decay=False,bend=True):
</code></pre><br /><table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float:left;margin-right:1em;text-align:left;"><tbody><tr><td style="text-align:center;"><a rel="nofollow" target="_blank" href="http://1.bp.blogspot.com/-EAAntOrHBnw/VJb3guSNz_I/AAAAAAAAqWA/sG7LfMZrbYc/s1600/sing-spectrum.tiff" style="clear:left;margin-bottom:1em;margin-left:auto;margin-right:auto;"><img border="0" src="http://1.bp.blogspot.com/-EAAntOrHBnw/VJb3guSNz_I/AAAAAAAAqWA/sG7LfMZrbYc/s1600/sing-spectrum.tiff" height="146" width="320"/></a></td></tr><tr><td class="tr-caption" style="text-align:center;"><i>The most complex (unsurprisingly) voice to create is that</i><br /><i>of a human singing. I have been working on this for</i><br /><i>a long time and there is a long way to go; however, here</i><br /><i>is a spectrogram of a piece of music which does</i><br /><i>a passable job of sounding like someone singing.</i></td></tr></tbody></table>The first argument is actually a reference to a function which will create the basic tone. The rest of the arguments describe how that tone will be manipulated in the note formation. Whilst an approach like this can be mimicked using a builder pattern in Java, that language does not lend itself to the 'playing around' nature of Python (at least for me).<br /><br />For example, I could just run the script, add flatEnv=True to the arguments, run it again and compare the two sounds. 
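To make the appeal of that calling style concrete, here is a tiny stand-alone sketch in plain Python (the names are invented for illustration, not the real Sonic Field functions); only the parameter being experimented with needs to be spelled out:

```python
def do_note(voice, v_correct, flat_env=False, pan=-1, pitch_shift=1.0):
    # A stand-in for note synthesis: just describe the settings in play.
    shape = "flat" if flat_env else "shaped"
    return "%s vol=%.2f env=%s pan=%s pitch*%s" % (voice, v_correct, shape, pan, pitch_shift)

# Tweak a single argument between runs and compare the two results:
first = do_note("oboe", 0.35)
second = do_note("oboe", 0.35, flat_env=True)
```

Every other parameter keeps its default, so each experiment is a one-token change to the script.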
It is an intuitive way of working.<br /><br />Anyhow, once the voices have been composited from many tones and tweaked into the shape and texture we want, they turn up as a huge list of lists of notes which are all mixed together and written out to disk as a flat file format which is basically just a dump of the underlying double data. At this point it sounds terrible! Making the notes is often only half the story.<br /><br /><div class="separator" style="clear:both;text-align:center;"><embed width="320" height="266" src="https://www.youtube.com/v/LGYcCJXKtgw?version=3&f=user_uploads&c=google-webdrive-0&app=youtube_gdata" type="application/x-shockwave-flash" /></div><div class="separator" style="clear:both;text-align:center;"><i><span style="font-size:x-small;">Voice Synthesis by Sonic Field</span></i></div><div class="separator" style="clear:both;text-align:center;"><i><span style="font-size:x-small;">played specifically for this post.</span></i></div><br />You see, real sounds happen in a space. Our chorale is expected to be performed in a church. Notes played without a space around them sound completely artificial and lack any interest. To solve this we use impulse response reverberation. The mathematics behind this is rather complex and so I will not go into it in detail. However, in the next section I will start to look at this as a perfect example of why Java is not only necessary but ideal as the back end to Python/Jython.<br /><br /><h3>You seem to like Python Alex - Why Bother With Java?</h3><div class="p1">My post might seem a bit like a Python sales job so far. What I have really been doing is justifying the use of Python when Java is already such a good language (especially when written in a great IDE like Eclipse for Java). Let us look at something Python would be very bad indeed at. 
Here is the code for performing the <a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Fast_Fourier_transform">Fast Fourier Transform</a>, which is at the heart of putting sounds into a space.</div><div class="p1"><br /><pre class="prettyprint"><code>
package com.nerdscentral.audio.pitch.algorithm;

public class CacheableFFT
{
    private final int n, m;

    // Lookup tables. Only need to recompute when size of FFT changes.
    private final double[] cos;
    private final double[] sin;
    private final boolean forward;

    public boolean isForward()
    {
        return forward;
    }

    public int size()
    {
        return n;
    }

    public CacheableFFT(int n1, boolean isForward)
    {
        this.forward = isForward;
        this.n = n1;
        this.m = (int) (Math.log(n1) / Math.log(2));

        // Make sure n is a power of 2
        if (n1 != (1 &lt;&lt; m)) throw new RuntimeException(Messages.getString("CacheableFFT.0")); //$NON-NLS-1$

        cos = new double[n1 / 2];
        sin = new double[n1 / 2];
        double dir = isForward ? -2 * Math.PI : 2 * Math.PI;

        for (int i = 0; i &lt; n1 / 2; i++)
        {
            cos[i] = Math.cos(dir * i / n1);
            sin[i] = Math.sin(dir * i / n1);
        }
    }

    public void fft(double[] x, double[] y)
    {
        int i, j, k, n1, n2, a;
        double c, s, t1, t2;

        // Bit-reverse
        j = 0;
        n2 = n / 2;
        for (i = 1; i &lt; n - 1; i++)
        {
            n1 = n2;
            while (j &gt;= n1)
            {
                j = j - n1;
                n1 = n1 / 2;
            }
            j = j + n1;

            if (i &lt; j)
            {
                t1 = x[i];
                x[i] = x[j];
                x[j] = t1;
                t1 = y[i];
                y[i] = y[j];
                y[j] = t1;
            }
        }

        // FFT
        n1 = 0;
        n2 = 1;

        for (i = 0; i &lt; m; i++)
        {
            n1 = n2;
            n2 = n2 + n2;
            a = 0;

            for (j = 0; j &lt; n1; j++)
            {
                c = cos[a];
                s = sin[a];
                a += 1 &lt;&lt; (m - i - 1);

                for (k = j; k &lt; n; k = k + n2)
                {
                    t1 = c * x[k + n1] - s * y[k + n1];
                    t2 = s * x[k + n1] + c * y[k + n1];
                    x[k + n1] = x[k] - t1;
                    y[k + n1] = y[k] - t2;
                    x[k] = x[k] + t1;
                    y[k] = y[k] + t2;
                }
            }
        }
    }
}
</code></pre>[<a rel="nofollow" target="_blank" href="https://github.com/nerds-central/SonicFieldRepo/blob/master/SonicField/src/com/nerdscentral/audio/pitch/algorithm/CacheableFFT.java">This AGPL code is on Github</a>]<br /><br />It would be complete lunacy to implement this mathematics in Jython (<a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Late_binding#Late_binding_in_dynamically-typed_languages">dynamic late binding</a>&nbsp;would give unusably bad performance). Java does a great job of running it quickly and efficiently. In Java this runs just about as fast as it could in any language, and the clean, simple object structure of Java makes using the 'caching' system straightforward. 
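Reusing a transform object whenever the length and direction repeat is plain memoisation. A minimal sketch of that idea in ordinary Python (FakeFFT and get_fft are invented stand-ins, not the real Sonic Field code):

```python
class FakeFFT(object):
    # Stand-in for CacheableFFT: the cos/sin lookup tables would be built here, once.
    def __init__(self, length, forward):
        self.length = length
        self.forward = forward

_fft_cache = {}

def get_fft(length, forward=True):
    # One transform object per (length, direction) pair, so the table-building
    # cost is paid at most once for each window size.
    key = (length, forward)
    if key not in _fft_cache:
        _fft_cache[key] = FakeFFT(length, forward)
    return _fft_cache[key]
```

Calling get_fft(1024) twice hands back the same object, so repeated windows of the same size cost nothing extra to set up.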
The caching comes from the fact that the cos and sin multipliers of the FFT can be re-used when the transform is the same length. Now, in the creation of <a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Reverberation">reverberation effects</a> (those effects which put sound into a space) FFT lengths are the same over and over again due to <a rel="nofollow" target="_blank" href="http://www.ni.com/white-paper/4844/en/">windowing</a>. So the speed and object oriented power of Java have both fed into creating a clean, high performance implementation.<br /><br />But we can go further and parallelise the FFT:<br /><pre class="prettyprint"><code>
def reverbInner(signal,convol,grainLength):
    def rii():
        mag=sf.Magnitude(+signal)
        if mag&gt;0:
            signal_=sf.Concatenate(signal,sf.Silence(grainLength))
            signal_=sf.FrequencyDomain(signal_)
            signal_=sf.CrossMultiply(convol,signal_)
            signal_=sf.TimeDomain(signal_)
            newMag=sf.Magnitude(+signal_)
            if newMag&gt;0:
                signal_=sf.NumericVolume(signal_,mag/newMag)
                # tail out clicks due to amplitude at end of signal
                return sf.Realise(signal_)
            else:
                return sf.Silence(sf.Length(signal_))
        else:
            -convol
            return signal
    return sf_do(rii)

def reverberate(signal,convol):
    def revi():
        grainLength = sf.Length(+convol)
        convol_=sf.FrequencyDomain(sf.Concatenate(convol,sf.Silence(grainLength)))
        signal_=sf.Concatenate(signal,sf.Silence(grainLength))
        out=[]
        for grain in sf.Granulate(signal_,grainLength):
            (signal_i,at)=grain
            out.append((reverbInner(signal_i,+convol_,grainLength),at))
        -convol_
        return sf.Clean(sf.FixSize(sf.MixAt(out)))
    return sf_do(revi)
</code></pre>Here we have the Python which performs the FFT to produce <a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Convolution_reverb">impulse response 
reverberation</a>&nbsp;(<i>convolution reverb</i> is another name for this approach). The second function breaks the sound into grains. Each grain is then processed individually, and they all have the same length. This performs the windowing effect I talked about earlier (I use a <a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Window_function#Triangular_window">triangular window</a> which is not ideal but works well enough due to the long window size). If the grains are long enough, the impact of lots of little FFT calculations is basically the same as the effect of one huge one. However, FFT is an nLog(n) process, so lots of little calculations are a lot faster than one big one. In effect, windowing makes the FFT a linearly scaling calculation.<br /><br />Note that the granulation process is performed in a future. We define a closure called <i>revi</i> and pass it to <i>sf_do()</i>, which executes it at some point in the future based on demand and the number of threads available. Next we can look at the code which performs the FFT on each grain - <i>rii</i>. That again is performed in a future. In other words, the individual windowed FFT calculations are all performed in futures. The expression of a parallel windowed FFT engine in C or FORTRAN ends up very complex and rather intractable, and I have not personally come across one which is integrated into a generalised, thread pooled, future based scheduler. Nevertheless, the combination of Jython and Java makes such a thing very easy to create.<br /><br /><h3 style="text-align:left;">How are the two meshed?</h3>Now that I hope I have made a good argument for hybrid programming between a great dynamic language (in this case Python) and a powerful mid level static language (in this case Java), it is time to look at how the two are fused together. There are many ways of doing this but Sonic Field picks a very distinct approach. 
It does not offer a general interface between the two where lots of intermediate code is generated and each method in Java is exposed separately into Python; rather it uses a uniform single interface with virtual dispatch.<br /><br />Sonic Field defines a very (aggressively) simple calling convention from Python into Java which initially might look like a major pain in the behind but works out to create a very flexible and powerful approach.<br /><br />Sonic Field defines 'operators' which all implement the following interface:<br /><pre class="prettyprint"><code>
/* For Copyright and License see LICENSE.txt and COPYING.txt in the root directory */
package com.nerdscentral.sython;

import java.io.Serializable;

/**
 * @author AlexTu
 * 
 */
public interface SFPL_Operator extends Serializable
{
    /**
     * &lt;b&gt;Gives the key word which the parser will use for this operator&lt;/b&gt;
     * 
     * @return the key word
     */
    public String Word();

    /**
     * &lt;b&gt;Operate&lt;/b&gt; Whatever this operator does when SFPL is running is done by this method. The execution loop calls this
     * method with the current execution context and the passed forward operand.
     * 
     * @param input
     *            the operand passed into this operator
     * @param context
     *            the current execution context
     * @return the operand passed forward from this operator
     * @throws SFPL_RuntimeException
     */
    public Object Interpret(Object input, SFPL_Context context) throws SFPL_RuntimeException;
}
</code></pre><div style="font-weight:normal;"><span style="font-size:small;">[</span><a rel="nofollow" target="_blank" href="https://github.com/nerds-central/SonicFieldRepo/blob/master/SonicField/src/com/nerdscentral/sython/SFPL_Operator.java"><span style="font-size:small;">This AGPL code is on Github</span></a><span style="font-size:small;">]</span></div><div><br /></div><div><span style="font-size:small;"><span style="font-weight:normal;">The Word() method returns the name of the operator as it will be expressed in Python. The Interpret() method processes arguments passed to it from Python. 
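The dispatch model behind this interface can be mimicked in a few lines of ordinary Python (a toy sketch with invented names, not the real Jython bridge): each operator publishes its word, and a registry maps words to operators.

```python
class Volume(object):
    # A toy operator in the SFPL_Operator mould.
    def word(self):
        return "Volume"

    def interpret(self, input, context):
        # Scale every sample of the operand by the context's gain.
        return [s * context["gain"] for s in input]

def build_registry(operators):
    # Word -> operator, ready for uniform virtual dispatch.
    return dict((op.word(), op) for op in operators)

registry = build_registry([Volume()])
louder = registry["Volume"].interpret([1, 2, 3], {"gain": 2.0})  # [2.0, 4.0, 6.0]
```

Adding a new operation means adding one object to the registry; the execution loop itself never changes.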
As Sonic Field comes up it creates a Jython interpreter and then adds the operators to it.&nbsp;</span><span style="font-weight:normal;">The mechanism for doing this is a little involved so rather than go into detail here, I will simply give links to the code on github:</span></span></div><div></div><ul><li><a rel="nofollow" target="_blank" href="https://github.com/nerds-central/SonicFieldRepo/blob/master/SonicField/src/com/nerdscentral/sython/Sython.java" style="font-weight:normal;"><span style="font-size:small;">SonicField.java</span></a></li><li><span style="font-size:small;font-weight:normal;"><a rel="nofollow" target="_blank" href="https://github.com/nerds-central/SonicFieldRepo/blob/master/SonicField/src/com/nerdscentral/sython/Sython.java" style="font-weight:normal;">SonicField.py</a></span></li></ul><div><div style="text-align:left;"><span style="font-size:small;font-weight:normal;">The result is that every operator is exposed in Python as sf.xxx where xxx is the return from the Word() method. With clever operator overloading and other syntactical tricks in Python I am sure that the approach could be refined. Right now, there are a lot of sf.xxx calls in Sonic Field Python (I call it Synthon) but I have not gotten around to improving on this simple and effective approach.</span><br /><span style="font-size:small;font-weight:normal;"><br /></span></div></div><div>You might have noticed that everything passed into Java from Python is just 'object'. This seems a bit crude at first take. However, as we touched on in the section on futures in the <a rel="nofollow" target="_blank" href="http://www.javaadvent.com/2014/12/a-serpentine-path-to-music.html">previous post</a>, it offers many advantages because the translation from Jython to Java is orchestrated via the Caster object and a layer of Python which transparently perform many useful translations. 
For example, the code automatically translates multiple arguments in Jython to a list of objects in Java:<br /><pre class="prettyprint"><code>
def run(self,word,input,args):
    if len(args)!=0:
        args=list(args)
        args.insert(0,input)
        input=args
    trace=''.join(traceback.format_stack())
    SFSignal.setPythonStack(trace)
    ret=self.processors.get(word).Interpret(input,self.context)
    return ret
</code></pre><br /></div>Here we can see how the arguments are processed into a list (which in Jython is implemented as an ArrayList) if there is more than one, but passed as a single object if there is only one. We can also see how the Python stack trace is passed into a thread local in the Java SFSignal object. Should an SFSignal not be freed or be double collected, this Python stack is displayed to help debug the program.<br /><br /><h4 style="text-align:left;">Is this interface approach a generally good idea for Jython/Java Communication?</h4>Definitely not! It works here because of the nature of the Sonic Field audio processing architecture. We have processors which can be chained. Each processor has a simple input and output. The semantic content passed between Python and Java is quite limited. In more general purpose programming, this simple architecture, rather than being flexible and powerful, would be highly restrictive, and the normal Jython interface with Java would be much more effective. Again, we can see a great example of this simplicity in the previous post when talking about threading (where Python accesses Java Future objects). 
Another example is the direct interaction of Python with SFData objects in this post on <a rel="nofollow" target="_blank" href="http://sonic-field.blogspot.co.uk/2014/03/python-creating-oscillators-in-python.html">modelling oscillators in Python</a>.<br /><pre class="prettyprint"><code>
from com.nerdscentral.audio import SFData
...
data=SFData.build(length)
for x in range(0,length):
    s,t,xs=osc.next()
    data.setSample(x,s)
</code></pre><br />This violates the programming model of Sonic Field by creating audio samples directly from Jython, but at the same time it illustrates the power of Jython! It also created one of the most unusual soundscapes I have so far achieved with the technology:<br /><br /><div class="separator" style="clear:both;text-align:center;"><embed width="320" height="266" src="https://www.youtube.com/v/nV5T2n6RAa0?version=3&f=user_uploads&c=google-webdrive-0&app=youtube_gdata" type="application/x-shockwave-flash" /></div><div style="text-align:center;"><span style="font-size:x-small;"><i>Engines of war, sound modelling</i></span></div><div style="text-align:center;"><span style="font-size:x-small;"><i>from oscillators in Python.</i></span></div><h3 style="text-align:left;">Wrapping It Up</h3>Well, that is all folks. I could ramble on for ever, but I think I have answered most if not all of the questions I set out in the first post. The key ones that really interest me are about creativity and hybrid programming. 
Naturally, I am obsessed with performance as I am by profession an optimisation consultant, but moving away from my day job, can Jython and Java be a creative environment and do they offer more creativity than pure Java?<br /><br /><table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float:left;margin-right:1em;text-align:left;"><tbody><tr><td style="text-align:center;"><a rel="nofollow" target="_blank" href="http://4.bp.blogspot.com/-_OASFw1c9Ak/VJajaqMzERI/AAAAAAAAqU0/n6Es1yyaNwA/s1600/F3.large.jpg" style="clear:left;margin-bottom:1em;margin-left:auto;margin-right:auto;"><img border="0" src="http://4.bp.blogspot.com/-_OASFw1c9Ak/VJajaqMzERI/AAAAAAAAqU0/n6Es1yyaNwA/s1600/F3.large.jpg" height="296" width="320"/></a></td></tr><tr><td class="tr-caption" style="text-align:center;">Transition State Analysis using<br />hybrid programming</td></tr></tbody></table>Too many years ago I worked on a similar hybrid approach in scientific computing. The <a rel="nofollow" target="_blank" href="http://people.bath.ac.uk/chsihw/grace/grace.html">GRACE</a> software which I helped develop as part of the team at <a rel="nofollow" target="_blank" href="http://www.bath.ac.uk/">Bath</a> was able to break new ground because it was easier to explore ideas in the hybrid approach than writing raw FORTRAN constantly. 
I cannot present in deterministic, reductionist language a consistent argument for why this applied then to science or now to music; nevertheless, experience from myself and others has shown this to be a strong argument.<br /><br /><i><b>Whether you agree or disagree; irrespective of whether you like the music or detest it; I wish you a very merry Christmas indeed.</b></i></div><div></div></div></div></div><br/><br/><em style="background-color:#fcffee;color:#222222;font-family:Verdana, Geneva, sans-serif;font-size:18px;line-height:24.6399993896484px;">This post is part of the&nbsp;<a rel="nofollow" target="_blank" href="http://javaadvent.com/" style="color:#888888;text-decoration:none;">Java Advent Calendar</a>&nbsp;and is licensed under the&nbsp;<a rel="nofollow" target="_blank" href="https://creativecommons.org/licenses/by/3.0/" style="color:#888888;text-decoration:none;">Creative Commons 3.0 Attribution</a>&nbsp;license. If you like it, please spread the word by sharing, tweeting, FB, G+ and so on!</em><div class="feedflare">
</div>Alexander Turnertag:blogger.com,1999:blog-2481158163384033132.post-373003818198838120Wed, 24 Dec 2014 00:30:00 +0000[sysadvent] Day 24 - 12 days of SecDevOpshttp://feedproxy.google.com/~r/sysadvent/~3/LK4uoLjMd1U/day-24-12-days-of-secdevops.html
<p>Written by: Jen Andre (<a rel="nofollow" target="_blank" href="https://twitter.com/fun_cuddles">@fun_cuddles</a>)<br>
Edited by: Ben Cotton (<a rel="nofollow" target="_blank" href="https://twitter.com/funnelfiasco">@funnelfiasco</a>)</p>
<p>Ah, the holidays. The time of year when we want to be throwing back the eggnogs, chilling in front of our fake fireplaces, maybe catching a funny Christmas day movie&#8230; but oh no we can’t, because guess what, a certain entertainment company was held hostage by a security breach the likes of which corporate America has never seen before&#8230; and no more movie for you. </p>
<p>It’s an interesting time to be a security defender. The recent Sony breach has just put a period on the worst-of-the-worst scenarios that us tinfoil-hat, paranoid security people have been ranting about all along: one bad breach could be business shattering. </p>
<p>But let’s step back, and look at the theme of this blog: the 12 days of SecDevOps. Besides being a ridiculous title that I’m 90% sure my ops director chose specifically as a troll for me (thanks, Pete), it underlines an important concept. <strong>Whether `security` is in your job title or not, operations is increasingly becoming the front-line for implementing security defenses.</strong></p>
<p>Given that reality, and the fact that security breaches are NOT going away, and that most of us don’t have yacht-sized security budgets, I thought it would be interesting to come up with 12 practical, high-impact things that small organizations could be doing to shore up their security posture. </p>
<h3>Day 1: Fear and Loathing and Risk Assessment and Hipsters</h3>
<p>Risk assessment. It’s not just some big words auditors love to use. It’s simply weighing the probability of bad things happening against the cost to mitigate the risk of that bad thing happening. And using that to make good security decisions as you make day-to-day architecture and ops choices:</p>
<blockquote>
<p>risk = (threat) x (probability) x (business impact)*</p>
<p>*whoever told you there would be no math lied to you</p>
</blockquote>
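<p>Plugging hypothetical numbers into that formula makes the trade-off concrete (the scenarios and 1-10 scores below are invented purely for illustration):</p>

```python
def risk(threat, probability, impact):
    # risk = (threat) x (probability) x (business impact), each scored 1-10
    return threat * probability * impact

scenarios = {
    "SomeNewHipsterDB loses writes": risk(6, 7, 8),
    "Chef server compromised": risk(9, 2, 10),
    "Laptop left in a taxi": risk(4, 5, 3),
}

# Tackle the biggest numbers first.
for name, score in sorted(scenarios.items(), key=lambda kv: -kv[1]):
    print(name, score)
```

<p>Even rough scores give you an ordering to argue about, which is the real point of the exercise.</p>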
<p>You may not be aware of it, but as an ops person you are likely doing risk assessment already, except more likely around things like uptime and reliability. Consider this scenario:</p>
<ul>
<li>John, the web guy, proposes replacing PostgreSQL with SomeNewHipsterDB.</li>
<li>You ask yourself, ‘huh, what’s the chances that I’m going to get paged at 3am because writes stop happening and my web site starts screaming in pain?’ You are probably not having warm-fuzzy feelings about this plan.</li>
<li>Your development and ops team evaluates the <strong>benefits</strong> to the engineering team and business for switching to SomeNewHipsterDB and weighs it against the <strong>probability</strong> that you are going to get woken up all of the time, and the <strong>impact</strong> it will have on your sunny disposition and decide that yeah… maybe not gonna do it.</li>
<li>Or, you do, except you <strong>mitigate</strong> this risk by saying ‘John, you will be forever paged for all SomeNewHipsterDB issues. Done.’</li>
</ul>
<p>Cool. Now do this for security. Every time you are making architecture choices, or changing configuration of your infrastructure, or considering some new third-party service SaaS you’ll be sending data to, you should be asking yourself: what’s the impact if that service or system gets hacked? How will you mitigate the risks? </p>
<p>This doesn’t have to be a formal or fancy report. It can be a running text file or spreadsheet with all of the possible points of failure. Get everyone involved with thinking of ways pieces of the infrastructure or organization can be hacked, and ways you are protected against those worst-case scenarios. It can be like ‘ANYONE WHO OWNS OUR CHEF SERVER COULD DESTROY EVERYTHING [but we have uber-monitoring and Jane over there reviews audit logs daily]’. Start with the scenario: what if&#8230;? Then have conversations in which engineers and business owners defend why what you’re doing is good enough. Make security a fundamentally collaborative process.</p>
<h3>Day 2: Shared Secrets: Figure it Out Now</h3>
<p>There are 3 things in life that are inevitable: death, taxes&#8230; and the fact that a sales guy left to his own devices will always put all of his passwords in a plain text file (or if fancy, an Excel spreadsheet).</p>
<p>The lesson is this: password management isn’t something that just the technical team decides and manages for itself. We should be advocating <strong>organization-wide</strong> education on managing credentials, because guess what? Access to Salesforce, Gmail, and all of these SaaS services with sensitive business data are being used by people who are not engineers. </p>
<p>Solution? As part of every employee’s onboarding process, <strong>install password management on an employee’s workstation, and show them how to use it</strong> (e.g. 1Password or LastPass, or whatever your tool of choice is). Start doing this from the outset, as it’s best to figure this out on Day 1 rather than 200 employees in. </p>
<h3>Day 3: Shared Secrets for Infrastructure, Too</h3>
<p>When it comes to infrastructure secrets, there are extra concerns because in most cases, systems need to be able to access these secrets in a non-interactive, automated way (e.g. I need to be able to spin up an app server that knows how to authenticate to my database). </p>
<p>If all of your infra passwords start unencrypted somewhere in a git repo, You Are Going To Have A Bad Time. <a rel="nofollow" target="_blank" href="https://coderanger.net/chef-secrets/">Noah has a good article on various options for managing shared secrets in your infrastructure.</a></p>
<h3>Day 4: Config Management On All Of The Things (So You Aren’t Sweating from Shell Shocks)</h3>
<p>This should be obvious to everyone who drinks from the DevOps Koolaid, but CM has done beautiful things for patch management. It may be tempting to deploy a one-off box used for dev manually without config management installed, but guess what? <a rel="nofollow" target="_blank" href="http://www.browserstack.com/attack-and-downtime-on-9-November">In the case of Browser Stack, that turned out to be a massive achilles heel.</a></p>
<p>Making the process easy for devs to get access to the infrastructure they need (while giving you the ability to manage systems) is key. Do this right away.</p>
<h3>Day 5: Secure your Development Environments (Because No One Else Will)</h3>
<p>If left to their own devices, development environments tend to veer toward chaos. This isn’t just because developers are lazy (and as a developer, I mean this in the nicest possible way) but because of the nature of the prototyping and testing process. </p>
<p>From a security perspective, this all means bad juju (see Browser Stack example above). I can assure you that if you start building your prototype or dev infrastructure exposed to the public internet, deploying it without even the basic config management, it will stay that way forever. </p>
<p>So: if you are using AWS, start with an Amazon VPC with strict perimeter security, and require VPN access for any development infrastructure. Get some config management on everything, even if it’s just for system patches. </p>
<p>Put some bounds around the chaos early on, and this will make it easy to mature the security controls as the product and organization mature. </p>
<h3>Day 6: 2-Factor all of the things (well, the important things)</h3>
<p>Require 2-factor wherever you can. Google Apps has made enforcing this super easy, and technologies like DuoSecurity and YubiKey make adding 2-factor to your critical infrastructure (e.g., your VPN accounts) far, far less annoying than it used to be.</p>
<h3>Day 7: Encrypt your Emails (and other communications)</h3>
<p>Encrypt your emails. It’s annoying to set up, but guess what? Hackers just <em>love</em> to post juicy stuff on pastebin. Again, from Day 1, help every single employee configure PGP or SMIME encryption as part of the onboarding process. Once installed, it’s relatively painless to use (as long as you don’t mind archaic mail clients from 1999). </p>
<p>This is especially important to drill into executives because they tend to have more sensitive emails (e.g. their private boardroom chatter), and are particularly susceptible to phishing-style attacks. With the recent Sony email leaks, you now have some leverage. You can throw the ‘Angelina Jolie’ emails in front of them and ask: how much do you think business and reputations would suffer were their entire email archives publicly disclosed via a breach? </p>
<p>For many of us, chat is as crucial as email in terms of the type of reputation-critical information we put there. It may not be reasonable to switch to a self-hosted chat solution, but in that case, ensure you are picking a service that helps YOU mitigate your risk. E.g., do you need all of the history? Do you need private history for user chats? </p>
<h3>Day 8: Security Monitoring: Start Small, Plan Big</h3>
<p>Put the infrastructure in place to collect as much security data as possible, then start slowly making potential security issues <em>visible</em> by adding reports and alerts that deal with threat scenarios you are most worried about.</p>
<p>Start small. Remember that risk assessment list you made? Identify what you are most afraid of (um, that PHP CMS that has hundreds of vulnerabilities reported per year? Your VPN server?) and tackle monitoring for those items first. </p>
<p>Instrumenting your infrastructure from day 1 for security monitoring (even if it’s just collecting all of the system and application logs) puts you in a good position later on to start sophisticated reporting and intrusion detection on that data. </p>
<h3>Day 9: Code/Design Reviews</h3>
<p>Although there have been a lot of advancements in static and dynamic source code analysis tools (<a rel="nofollow" target="_blank" href="http://samate.nist.gov/index.php/Source_Code_Security_Analyzers.html">which you can integrate right into your CI process</a>), a good old-fashioned code review by a human being goes a long way. If you’re using GitHub, just make it part of the development workflow and testing pipeline. Whenever changes are made to authentication or authorization, have someone look for automated tests that deal with those cases.</p>
<h3>Day 10: Test Your Users</h3>
<p>Phish yourself regularly. It’s really easy to do, and can be illuminating to the rest of the business, which may not be as technical as the operations/engineering side and may not really understand the impact of opening an email attachment or of not checking the URL of a site they are logging into. <a rel="nofollow" target="_blank" href="http://krebsonsecurity.com/2012/01/phishing-your-employees-101/">You can use some open source tools</a>, but there are also many services now that you can pay to do this for you.</p>
<h3>Day 11: Make an Incident Response Plan Now</h3>
<p>So, you see something odd in your logs. Like, Bob your DBA ran a Postgres backup on production DB, tar’d it up, and sent it to an FTP server in Singapore. Bob lives in Reston VA, and this is definitely not normal. You start seeing evidence of other weird stuff ‘bob’ is doing that he shouldn’t be.</p>
<p>What now? Do you email Bob and say ‘something weird is happening?’ Do you call the Director of Ops? Do you put a message in a lonely chat room? </p>
<p>Figure out a plan for escalating possible critical security issues. It doesn’t have to be fancy or use specialized ITIL incident response workflow tools. Make a group in PagerDuty. Have an out-of-band channel for communicating details, in case your normal network goes the way of Sony and is totally compromised, or just plain isn’t working. Maybe it’s as simple as an email list that doesn’t use the corporate email accounts, or a conference bridge everyone can hop on.</p>
<h3>Day 12: Don’t be the Security ‘A**hole’</h3>
<p>You. Yes, you. Don’t be the security a**hole that gets in everyone’s way and loses sight of the real reason for everyone’s existence: to run a business. You can be the security champion without being the blocker. In fact, that’s the only way to be effective. If a user is coming to you and saying ‘this is really really annoying, I don’t want to do it’ - <strong>listen to them</strong>. Too many security personnel disregard the usability issue of security controls for the sake of security theater, which leads to (unsurprisingly) abandonment, cynicism, and apathy when it comes to real security concerns.</p>
<p>DevOps is really a philosophy: it’s not a job title or a set of tools, it’s the concept of using modern tools and processes to facilitate collaboration between the engineers who deliver the code and those who must maintain it. Um, that was a lot of words, but the key word is <strong>collaboration</strong>. It’s no longer acceptable to throw ‘security over the wall’ and expect your users and ops people to just do what you say. </p>
<p>The best security cultures are not prescriptive, they are collaborative. They understand that business needs to get done. They are intellectually honest and admit ‘yeah, we could get hacked’ - but what can we do about this in a way that doesn’t bring everything to a halt? <a rel="nofollow" target="_blank" href="https://www.duosecurity.com/blog/duo-tech-talk-building-a-modern-security-engineering-organization">Zane Lackey has a great talk on building a modern security engineering organization</a> that expounds many of these ideas, and more.</p>Christopher Webbertag:blogger.com,1999:blog-3615332969083650973.post-7223093802223447718Wed, 24 Dec 2014 00:00:00 +0000[ux] The Elf that Came in From the Coldhttp://uxmas.com/2014/the-elf-that-came-in-from-the-cold
<p>UCD City was asleep. The gin joints were shut. Hoods and flatfoots and gumshoes alike were all in their holes. Snow fell like a silent movie, blanketing the usual street noise of passing cars and fighting alley cats. Christmas neon flashed on every second apartment. Behind the lopsided shutters of my one-man agency I poured myself another lonely egg nog.</p>
<p>The doorknob of my front door jiggled. I thought nothing of it&mdash;probably kids on holidays up past their bedtime. Suddenly&nbsp;the door burst open, and an elf tumbled in, clutching a mobile phone. His eyes screamed silently as he reached toward&nbsp;me.</p>
<p>&ldquo;Merry &hellip;&rdquo;</p>
<p>He hit the floor. Only then did I notice the kitchen knife protruding from his kidneys.</p>
<p style="text-align:center;"><img alt="A silhouette of an elf at the doorway" src="http://uxmas.com/images/uploads/elf-in-doorway.jpg" style="width:549px;height:496px;"/></p>
<p>I heard footsteps downstairs. Leaping over the elf&rsquo;s body and flying down the staircase railings, I spied a shadow fleeing the building. I made it to the street, breathless, in time to watch twin tail lights screech around the corner. Another win for the bad guys. The neighbourhood fell silent and lonely again.</p>
<p>Footprints were melting into the muddy slush: size 12 at least, probably boots. I glanced up at the window to my room where a dead elf lay.</p>
<p>I was definitely going to be on the naughty list again this year.</p>
<p style="text-align:center;">*&nbsp;&nbsp; &nbsp;*&nbsp;&nbsp; &nbsp;*</p>
<p>Most of the dark red wasn&#39;t an intentional part of the elf&#39;s costume&mdash;it was now pooling on my floorboards. I was no stranger to crime, but this was heinous: a dead pixie on my doorstep on Christmas Eve? Hardly a Merry Christmas.</p>
<p>What had he been trying to say? &ldquo;Merry!&rdquo; What did that mean? Merry Christmas? How was that a dying man&#39;s secret this time of year? And why to me? I&#39;d consulted with Santa&#39;s Workshop much earlier in my career&mdash;half a lifetime ago. But I&rsquo;d had nothing to do with the operation since.</p>
<p>There was a buzz from the smartphone in his little hand: a SnapChat message. I swiped the screen before thinking what I was doing, and deserved what I saw.</p>
<p style="text-align:center;"><img alt="A SnapChat photo of Santa, bound and gagged" src="http://uxmas.com/images/uploads/santa-tied-up.jpg" style="width:550px;height:466px;"/></p>
<p>It was Saint Nick himself, bound, gagged, and seated on a stool. He looked haggard and beaten, and two armed men stood over him. Good grief&mdash;who would dare mess with Santa? Was all of Christmas in jeopardy? I cursed as SnapChat erased the photo. Evidence and answers were slipping through my fingers. Damned <a rel="nofollow" target="_blank" href="http://uxmas.com/2014/designing-for-ephemerality">ephemeral messaging</a>!</p>
<p>I rifled through the elf&#39;s pockets, and turned up a Sharpie marker, some half-finished sketches on a bunch of post-it notes, the elf&rsquo;s company RFID card, and a sealed envelope.</p>
<p>I wrapped the envelope in a plastic bag, and pinned it to the outside window ledge. Then I took a closer look at the rest of the items.</p>
<p>Printed on the RFID card was the elf&rsquo;s name and position:</p>
<p style="text-align:center;"><em><strong>Zachary Kringle</strong><br />
Chief of Operations<br />
Workshop Division<br />
Santa&rsquo;s Workshop Inc.</em></p>
<p>Not a junior then. I sifted through&nbsp;the post-it notes. The sketches were pretty basic&mdash;an early concept for some sort of board game similar to Snakes &amp; Ladders, with additional cards that formed part of the mix. The sketches were annotated&mdash;<strong>process</strong>, <strong>iterate</strong>, <strong>collaborative</strong>&mdash;obviously some new gift idea he&rsquo;d been working on. Too bad he&rsquo;d croaked. It looked like a good game.</p>
<p>I retrieved the sealed envelope from the bag on the windowsill. The gum was now frozen stiff, making it easy to open the envelope without damaging it. There was a photo inside. I recognised the face instantly.</p>
<p>Mary Christmas.</p>
<p style="text-align:center;"><img alt="A polaroid photograph of Mary Christmas getting into a black car" src="http://uxmas.com/images/uploads/mary-getting-into-car.jpg" style="width:550px;height:517px;"/></p>
<p>The elf&#39;s words made more sense now: &ldquo;Mary&rdquo;, not &ldquo;Merry&rdquo;. Panic hit me like ice water. Could the Mary I knew really be behind such a brutal crime?</p>
<p>Mary was as curvy as always. Not a young woman, but a plenty real one, like someone had poured her into her clothes and forgot to say when. It had been taken recently, and with a zoom lens: she was getting into a black car while her minders stood around. Behind her, the corner of a warehouse partially hid a dockside crane.</p>
<p>I had my lead.</p>
<p>There were two things I needed to do right away. The first was a simple task. I sat on my desk looking at the elf and called Uncle Xavier, the only straight cop I knew in homicide. Xavier worked old-school, and although we didn&rsquo;t always see eye to eye, we both knew the way the game was played. I spilled my story but conveniently omitted the photo&mdash;I would confront Mary myself when the time was right.</p>
<p>The last few hours of downing egg nog had made my brain as lazy as an old dog. So I shrugged on my hat and trenchcoat, packed my trusty 357 snubnose and hit the icy pavement to clear my head.</p>
<p style="text-align:center;">*&nbsp;&nbsp; &nbsp;*&nbsp;&nbsp; &nbsp;*</p>
<p>A dead elf, a captive Santa, Mary Christmas and a mystery warehouse: not much of a brief. I remembered a project I&#39;d worked on with an engineer at Path-E-Tech Management. As the engineer said himself:</p>
<p>&ldquo;We interviewed hundreds of users, and turned all their suggestions into features. As it turns out, <a rel="nofollow" target="_blank" href="http://uxmas.com/2014/is-the-user-always-right">every user we talked to was an idiot</a>. And their dumb suggestions ruined our product. In hindsight <a rel="nofollow" target="_blank" href="http://www.dilbert.com/strips/comic/2012-05-07/">we should have talked to people outside the building</a>.&rdquo;</p>
<p>Not the best attitude, but it does say something about user research outside the obvious. And that&rsquo;s exactly what I planned to do.</p>
<p>A pile of cardboard and old blankets moved as I walked past. "Help an old graphic artist fallen on bad times?" a voice croaked from within.</p>
<p>"You see anyone come past here in the last half hour?" I asked.</p>
<p>A&nbsp;hooligan in a turtleneck poked his head out.&nbsp;&ldquo;Just a short guy&mdash;real shorty he was, and a gorilla. Jumped into a car and took off that way.&rdquo;</p>
<p>I flipped him a few coins. He grinned in appreciation.</p>
<p>"Thanks man, Merry Christmas", he said.</p>
<p>"Yeah. Merry Christmas."</p>
<p>No elf I&#39;d ever met had size twelve boots and looked like a gorilla. Someone outside the Workshop had stabbed him in the back&mdash;someone who didn&#39;t want Christmas to run the usual clockwork. But who? And why was someone sending a SnapChat of a nicked Saint Nick to the elf? Did they know he was dead? What would happen if Santa wasn&#39;t released in time? <a rel="nofollow" target="_blank" href="http://uxmas.com/2014/the-5ws-and-1h-of-user-interviews">These were the questions that needed answering.</a></p>
<p>I grunted. <a rel="nofollow" target="_blank" href="http://uxmas.com/2014/santa-the-child-and-the-magic-of-ux">Two billion unhappy kids</a> is what would happen. The stakes couldn&rsquo;t be higher.</p>
<p>I talked to anyone who would listen: lovers buying last-minute gifts, retailers and late-night carol singers. And when my first coffee-stained notebook was full, I turned south&mdash;I needed to triangulate my findings with a <a rel="nofollow" target="_blank" href="http://uxmas.com/2014/3-ways-to-select-the-perfect-method-for-your-research-goal">different technique altogether</a>.</p>
<p style="text-align:center;">*&nbsp;&nbsp; &nbsp;*&nbsp;&nbsp; &nbsp;*</p>
<p>I stopped at a local toy store and made my way towards the back. Two figures behind frosted glass moved like a puppet show, and I stepped behind a giant cuddly Pink Panther to eavesdrop. I waited patiently, catching only snippets of information, but enough to make out something about global shipments. I was on the right path.</p>
<p>The silhouettes left via a back door, and I waited until I heard the door slam. I stepped from my hiding place and into the back room. I fumbled with the lock on the drawer of the desk where they&rsquo;d been seated. The lock was no match for my expertise with a paper clip, and once it was open I found the jackpot: the store manager&rsquo;s journal.</p>
<p>My greedy eyes ran over the entries from the past few weeks. I felt dirty&mdash;it wasn&#39;t my usual method of conducting a <a rel="nofollow" target="_blank" href="http://uxmastery.com/resources/techniques/#diary-study">diary study</a>, but it sure delivered the goods. I tried to ignore the personal and stick to business: names, meeting times, summary notes &hellip; I struck gold with a purchase order form for a toy shipment. It listed a warehouse address down on the waterfront. I took a quick snap using my camera phone.</p>
<p>There was a creak behind me and a torch blinded me.</p>
<p style="text-align:center;"><img alt="Our hero is held at gunpoint" src="http://uxmas.com/images/uploads/grab-some-air.jpg" style="width:550px;height:455px;"/></p>
<p>"Grab some air, pal" a gruff voice barked from the shadows. I raised my hands.</p>
<p>I turned to look at him. I didn&#39;t like his face. When he said his name was Skinny I didn&#39;t like his name either. But the secret to being a great detective is to listen more and talk less&mdash;I kept my trap shut.</p>
<p>Two goons loomed behind him: dull eyes, flat heads, black hats, and no brains at all. I looked uneasily at their hands. They were big&mdash;as big as plates of pork ribs, and twice as greasy. I knew exactly what was coming next.</p>
<p>With a nod from Mr Skinny, the other two moved in and started to work me over. I braced myself for the blows, and winced as they rained down on me.&nbsp;Then my world went dark.</p>
<p style="text-align:center;">*&nbsp;&nbsp; &nbsp;*&nbsp;&nbsp; &nbsp;*</p>
<p>The thing about getting licked by goons is that if you&#39;re lucky enough to wake up, you&#39;ll always be missing grey cells. I was a veteran now. I hoped what I&rsquo;d lost in intelligence was more than made up for by experience. At least that&rsquo;s what I told myself when returning the favour to some gunsel. When you&rsquo;re on a case, you need to defend your thinking, and can&rsquo;t pull punches.</p>
<p>I dragged my sorry ass to a joint where I knew I&#39;d find a few fellow gumshoes. Donna was there, absent-mindedly <a rel="nofollow" target="_blank" href="http://uxmastery.com/design-games-card-sorting/">sorting cards into groups</a>. Patrick was there too, cradling a rye as he always did. He had a reputation for <a rel="nofollow" target="_blank" href="http://www.uxdrinkinggame.com/">drinking anyone under the table</a>, especially after a &ldquo;UX fail&rdquo;. He&#39;d made a game of it, they said. But this time I didn&rsquo;t feel like playing.</p>
<p>I accepted a glass of hooch but waved away their questions. All I wanted was quiet companionship. They got the gist and went back to cards and drinking.</p>
<p>My body was bruised, but my thinking was as sharp as ever. If Skinny and his goons wanted me out of the way they could have left me alone to just drink myself to death. Why didn&rsquo;t they finish me? They obviously didn&rsquo;t know why I was there.</p>
<p>After regaining my balance I returned to my office and stared at the post-its stuck between strips of peeling pink wallpaper.</p>
<p>Outside the snow had turned to rain. Someone somewhere was blowing a horn as mournful as a turkey in December. It went nicely with my mood.</p>
<p>The post-its were grouped with all the nuggets of info I&rsquo;d collected so far. I had an affinity for mapping out thoughts like this since working with the feds. I added my new findings: the waterfront warehouse address, and the dates from the journal.</p>
<p>But there wasn&#39;t yet enough to work with&mdash;I couldn&#39;t draw any threads, and I reluctantly realised what I had to do. I preferred to work alone but there was no way this could be solved without getting another set of eyes on the case. And there was only one person who could help, even if she was too close to the furnace.</p>
<p>I picked up the phone and dialled her number for the first time in years.</p>
<p style="text-align:center;">*&nbsp;&nbsp; &nbsp;*&nbsp;&nbsp; &nbsp;*</p>
<p>Mary Christmas crossed her legs. She knew they were good.</p>
<p>&ldquo;I&rsquo;m just not that kind of girl,&rdquo; she snarled. &ldquo;I played your game once, but now I do things differently. Sorry&mdash;you&rsquo;re on your own.&rdquo;</p>
<p>I hinted that I knew she was in up to her neck with the impolite crowd.</p>
<p>"I don&#39;t like to see cheap hoods messing with a sweet kid like you, Princess."</p>
<p>She went bright red, and I should have seen the slap coming. I nursed my jaw and just let her fume.</p>
<p>"Careful, Mac", she said. "Don&#39;t be a dick."</p>
<p>I let the obvious comment slide.</p>
<p style="text-align:center;"><img alt="A close-up portrait of the enigmatic Mary Christmas" src="http://uxmas.com/images/uploads/mary-up-close.jpg" style="width:550px;height:520px;"/></p>
<p>"That elf died on my doormat, and those goons don&#39;t mess about either." I reminded her. "I&#39;m just saying ... don&#39;t go there."</p>
<p>She knocked back her shot and glared at me. "No," she said, "don&#39;t you go there. If I were you, I&#39;d take the easy option and go back to your two-bit sleazy dive. Forget any of this ever happened!"</p>
<p>They were the same words as the Dear John letter she&#39;d written me all those years ago. I was obviously still learning.</p>
<p>We looked at each other sadly over the table. "Lucky I&#39;m not you then," I said, and this time I was the one to walk out.</p>
<p style="text-align:center;">*&nbsp;&nbsp; &nbsp;*&nbsp;&nbsp; &nbsp;*</p>
<p>Something woke me to the world of cigarette smoke and last night&#39;s leftover <a rel="nofollow" target="_blank" href="http://uxmas.com/2014/gluten-free-mince-pies">fruit mince pies</a>. It was the office phone.</p>
<p>"Mac here," I mumbled.</p>
<p>"Sit this dance out, Mac," snapped a voice. Not Mr Skinny&mdash;he was older, more sure of himself. A Danish accent?</p>
<p>I waited for him to continue, but the line went dead.</p>
<p>Perhaps it was someone working with Skinny? Maybe the higher-ups were starting to show concern.</p>
<p>The thought gave me a mild sense of satisfaction. But the other side definitely had all the cards&mdash;and all the eyes. Time to get up and work. I pulled the blinds shut and prepared myself a <a rel="nofollow" target="_blank" href="http://uxmas.com/2014/mid-uxmas-season-tipple">Mexican Hot Chocolate</a>. Taking a long swig, I turned back to the notes on my wall.</p>
<p>It was going to be a long night.</p>
<p style="text-align:center;">*&nbsp;&nbsp; &nbsp;*&nbsp;&nbsp; &nbsp;*</p>
<p>"You&#39;re not to work this case any more, Mac. You&#39;re fired!"</p>
<p>Uncle Xavier was none too happy that I&rsquo;d been doing my own investigation, especially as I wasn&rsquo;t off the suspects list.</p>
<p>"Maybe you&#39;ve fired me, but I haven&#39;t fired me!"</p>
<p>The fuzz were on my case, the goons were hitting on me, and Mary was playing hard to get. I had to crack this before someone caught up, or Christmas would be cancelled and I&#39;d be making licence plates.</p>
<p>I hadn&#39;t started this thing, but it was up to me to finish it.</p>
<p>Luckily, I had a plan.</p>
<p style="text-align:center;">*&nbsp;&nbsp; &nbsp;*&nbsp;&nbsp; &nbsp;*</p>
<p>Anyone that knows me will say I&#39;m a sucker for punishment. I keep going back, failing again and again. Most people think I&#39;m nuts, but you actually learn a lot from failing. More than winning the first time, at least. I never had the brains to win first time, but if I made my mind up I could stick to something like superglue. And I always came through in the end.</p>
<p>My limpet-style antics were about to pay off again. &ldquo;Follow the data,&rdquo; they say. My coffee-stained notebook contained multiple statements from Christmas shoppers stating that new toy store stock always seemed to be available in store on Wednesdays. Tonight was Tuesday.</p>
<p>The old waterfront warehouse that Santa&rsquo;s Workshop had used before going digital back in the &lsquo;80s was long abandoned. Were they planning to open operations again? That&#39;d explain the elf&#39;s involvement. And if Skinny and his goons managed to squeeze Santa for his trade secrets, they&rsquo;d have a monopoly on Christmas.</p>
<p>I was battling with barbed wire in the fence when a waft of perfume cut through the stink of grease and bilgewater and rotting fish. I sighed and sat pretty. I knew that smell. Mary strutted around the corner like she owned the place. Which I guess she did&mdash;the three minders she had with her underlined that fact.</p>
<p>&ldquo;You&rsquo;re getting rusty,&rdquo; she said to me, looking at the wire.</p>
<p>&ldquo;You know me. I like to get stuck into things.&rdquo;</p>
<p>&ldquo;Tangled up in a mess, you mean.&rdquo;</p>
<p>&ldquo;It&rsquo;d help if you gave me a hand rather than just standing there.&rdquo;</p>
<p>The worry in my own eyes was reflected in hers.</p>
<p>&ldquo;You&rsquo;re gonna need this.&rdquo; I handed her the elf&rsquo;s RFID card.</p>
<p>Peering through a window, we saw a warehouse filled with crates of toys, puddings, holly, tinsel, and Christmas decorations. Either Christmas had been postponed, or these were next year&rsquo;s goods.</p>
<p>One entire wall was covered with notes and sketches: annotated blueprints, moodboards, experience maps. This was a major project. I recognised some of the sketches from those I&rsquo;d taken out of the elf&rsquo;s pockets.</p>
<p>I stopped short when I saw the broad back and shoulders of a red-suited old man, sitting in the corner, bound and helpless. He turned to face us. His face said a lot. Those eyes would&rsquo;ve even made a bloodhound sad.</p>
<p style="text-align:center;">*&nbsp;&nbsp; &nbsp;*&nbsp;&nbsp; &nbsp;*</p>
<p>Mary swiped the elf&rsquo;s ID card, and the massive doors of the warehouse rattled open, revealing four figures. I&rsquo;d have bet any money the biggest wore size 12 boots. He dwarfed the three others, especially the skinny runt at the front. I nodded in recognition at the two goons and they nodded back. They both packed Chicago heaters. Things were getting crowded.</p>
<p>Mr Skinny reached for the bulge inside his coat. He revealed the top corner of a Samsung Galaxy S6.</p>
<p>&ldquo;Don&rsquo;t make me use this,&rdquo; he drawled.</p>
<p>What was he going to do&mdash;<a rel="nofollow" target="_blank" href="http://uxmas.com/2014/digital-transgressions">SnapChat me to death</a>?</p>
<p>I made a beeline for Santa, and managed to untie him before the goons stitched a neat burst of holes just in front of our toes. Santa didn&rsquo;t flinch. He shook the ropes free and flew at the gorilla, catching him off guard while he reloaded.</p>
<p>The two large men rolled across the floor while I fired a Hail Mary at Skinny and his goons to distract them. For an old guy, Santa could give plenty.</p>
<p>Their fisticuffs interspersed with gunfire in the cavernous warehouse was deafening. Bullets flew like wasps. Mary&#39;s men were solid but I was running low on ammo.&nbsp;I ducked behind a shipping container of Buzz Lightyear figurines, and crouched beside Mary. She pointed across the floor at a row of crates&mdash;one crate in particular looked like a single shove would topple it.</p>
<p>&ldquo;If you&rsquo;re thinking what I&rsquo;m thinking, you&rsquo;re crazy,&rdquo; I whispered.</p>
<p>&ldquo;This is nothing compared to ballet school!&rdquo; she chuckled. She made a dash for it and covered&nbsp;the gap in less than a second&mdash;quick enough to reach the opposite side before the goons noticed. I wasn&rsquo;t going to be left huddling there like a sissy, but they were watching me now, and I&rsquo;d have to hightail it. I dug deep and sprinted towards Mary, throwing in a Commando roll for old time&rsquo;s sake. I landed safely beside her, but my hat and coat now had some extra ventilation.</p>
<p>&ldquo;Right as rain.&rdquo; I grinned, <a rel="nofollow" target="_blank" href="http://uxmas.com/2014/hooked-on-a-feeling">feeling more alive than ever</a>.</p>
<p>We leant into the crate and felt it start to set off a domino effect, the smaller crates crashing into ever larger ones. It was a giant, slow-motion tsunami. I couldn&rsquo;t hear it over the chatter of the Tommy guns, but I was surprised the goons couldn&rsquo;t feel each crash. Too late for them now anyway. Fifty tonnes of toy blocks makes a God almighty noise when it lands. It also turns goons into pancakes.</p>
<p>The dust settled&nbsp;as Xavier&#39;s men arrived, sirens blaring, late to the party&nbsp;as always. At least we&#39;d saved Christmas. Santa&#39;s global delivery&nbsp;project would meet its annual deadline once more.</p>
<p>&ldquo;You&rsquo;re one helluva mental model, Mary.&rdquo;</p>
<p>"You old dog,"&nbsp;she grinned.</p>
<p>I&rsquo;d certainly <a rel="nofollow" target="_blank" href="http://uxmas.com/2014/old-dog-how-can-you-learn-some-new-tricks">learnt some new tricks</a>. Not sure I&rsquo;d be up for putting them to use any time soon, though. This caper had almost done me, and <a rel="nofollow" target="_blank" href="http://uxmas.com/2014/were-living-in-the-fast-web">I longed for the slow life</a>.</p>
<p>A sprig of mistletoe and holly from the firefight dangled from a girder above us.</p>
<p>"Tell me you had nothing to do with this," I pleaded with Mary. "Even if you lie, I&#39;ll believe you."</p>
<p>She smiled and we kissed, but I knew it was for the last time.</p>
<p style="text-align:center;"><strong><em>The End.</em></strong></p>Tue, 23 Dec 2014 21:00:00 +0000[24ways] Taglines and Truismshttp://feedproxy.google.com/~r/24ways/~3/eVYk4olZJWg/
<p><a rel="nofollow" target="_blank" href="http://stuffandnonsense.co.uk/">Andrew Clarke</a> poses the question: if we&#8217;re all telling prospective clients that we craft and design delightful, beautiful, and remarkable digital experiences, what marks any of us out?</p>feeds@allinthehead.com (Andrew Clarke)http://24ways.org/2014/taglines-and-truisms/Tue, 23 Dec 2014 12:00:00 +0000[perl] CLDR TL;DRhttp://perladvent.org/2014/2014-12-23.html
<div class='pod'><p>2014 has been an exciting year for CLDR development on the CPAN. But first, what is the CLDR? The <a rel="nofollow" target="_blank" href="http://cldr.unicode.org/">Unicode Common Locale Data Repository</a> is a standardized repository of locale data along with a specification for its use and implementation. The simplest use case is easy access to translations for use in user interfaces, including month and day names, country and language names, and units of measure such as hours, bytes, meters, and even furlongs. More complex uses include localized ranges of dates using the local calendaring and numbering systems.</p>
<p>The <a rel="nofollow" target="_blank" href="http://www.unicode.org/reports/tr35/">CLDR specification</a>, however, is increasingly complex and the amount of data is increasingly large. This makes sense because natural languages are complex and each release supports additional minority locales. Fortunately, the CPAN has had more CLDR-based development this year than ever before. This means you don&rsquo;t have to worry about reading complex specifications or manually parsing large XML data structures.</p>
<h3 id="What-are-locales">What are locales?</h3>
<p>There are two parts to a locale: an identifier and data. The identifier is used to specify user preferences, generally based on languages and regions. The simplest locale is a language code alone, such as <b>es</b> for Spanish and <b>zh</b> for Chinese. Including the user&rsquo;s country in the locale can provide additional valuable information. For example there are many differences in displaying dates and even numbers between European Spanish (<b>es-ES</b>) and Mexican Spanish (<b>es-MX</b>). Much additional information can be explicitly included in the locale, but most of the time it&rsquo;s implicitly derived from the language and region. For example, many locales default to the Gregorian calendar while some default to the Buddhist or other calendars; <b>zh-CN</b> defaults to Simplified Han script while <b>zh-TW</b> defaults to Traditional Han. However, if you want to get really explicit, you could say <b>tlh-Hira-AQ-u-ca-julian-nu-roman</b> for Klingon in the Hiragana script as used in Antarctica with the Julian calendar and Roman numerals.</p>
<p>Whenever possible, include the user&rsquo;s language and country when constructing a locale identifier in order to provide the most localized experience.</p>
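<p>To make the structure of these identifiers concrete, here is a deliberately simplified parser, sketched in Python for brevity. It covers only the subtag shapes used in the examples above and is nowhere near a full BCP 47 implementation:</p>

```python
import re

# Simplified CLDR/BCP 47-style locale identifier:
#   language (2-3 lowercase letters)
#   optional script (titlecase, 4 letters, e.g. Hira)
#   optional region (2 uppercase letters or 3 digits)
#   optional Unicode "-u-" extension as key/value pairs
LOCALE = re.compile(
    r"^(?P<language>[a-z]{2,3})"
    r"(?:-(?P<script>[A-Z][a-z]{3}))?"
    r"(?:-(?P<region>[A-Z]{2}|\d{3}))?"
    r"(?:-u(?P<ext>(?:-[a-z0-9]+)+))?$"
)

def parse_locale(tag):
    """Break a locale tag into its named parts."""
    m = LOCALE.match(tag)
    if not m:
        raise ValueError(f"unsupported locale tag: {tag}")
    parts = {k: v for k, v in m.groupdict().items() if v and k != "ext"}
    if m.group("ext"):
        # Unicode extensions come in key/value pairs: ca-julian, nu-roman, ...
        keys = m.group("ext").strip("-").split("-")
        parts["extensions"] = dict(zip(keys[::2], keys[1::2]))
    return parts

print(parse_locale("es-MX"))
print(parse_locale("tlh-Hira-AQ-u-ca-julian-nu-roman"))
```

<p>Running it on the Klingon example yields language <b>tlh</b>, script <b>Hira</b>, region <b>AQ</b>, and the calendar/numbering extensions, which is exactly the information a CLDR library uses to pick its formatting data.</p>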
<p>Now let&rsquo;s take a tour of some simple solutions to common localization problems using CPAN modules.</p>
<h3 id="CLDR::Number">CLDR::Number</h3>
<p><a rel="nofollow" target="_blank" href="https://metacpan.org/module/CLDR::Number">CLDR::Number</a> is a new module that became stable early this year and provides localized formatting of numbers, prices, and even percentages and ranges of numbers. Full disclosure: I wrote this module and it powers Shutterstock in 20 languages, 150+ countries, and many currencies.</p>
<p>Here&rsquo;s an example of formatting numbers:</p>
<table class='code-listing'><tr><td class='line-numbers'><br /><code>1:&nbsp;<br />2:&nbsp;<br />3:&nbsp;<br />4:&nbsp;<br />5:&nbsp;<br />6:&nbsp;</code><br />&nbsp;</td><td class='code'><br /><code><span class="keyword">use</span> <span class="word">CLDR::Number</span><span class="structure">;</span><br /><br /><span class="keyword">my</span> <span class="symbol">$cldr</span> <span class="operator">=</span> <span class="word">CLDR::Number</span><span class="operator">-&gt;</span><span class="word">new</span><span class="structure">(</span><span class="word">locale</span> <span class="operator">=&gt;</span> <span class="symbol">$locale</span><span class="structure">);</span><br /><span class="keyword">my</span> <span class="symbol">$decf</span> <span class="operator">=</span> <span class="symbol">$cldr</span><span class="operator">-&gt;</span><span class="word">decimal_formatter</span><span class="structure">;</span><br /><br /><span class="word">say</span> <span class="double">&quot;$locale: &quot;</span><span class="operator">,</span> <span class="symbol">$decf</span><span class="operator">-&gt;</span><span class="word">format</span><span class="structure">(</span><span class="float">123456.7</span><span class="structure">);</span></code><br />&nbsp;</td></table>
<p>Now let&rsquo;s see the results in European Spanish, Mexican Spanish, and Bengali:</p>
<ul>
<li><p>es-ES: 123 456,7</p>
</li>
<li><p>es-MX: 123,456.7</p>
</li>
<li><p>bn-IN: &#x9E7;,&#x9E8;&#x9E9;,&#x9EA;&#x9EB;&#x9EC;.&#x9ED;</p>
</li>
</ul>
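<p>The mechanics behind these differences can be sketched in a few lines of Python. The separator table here is hand-written toy data for just three locales (real code should take its symbols from CLDR, as CLDR::Number does), but it shows why the same number renders differently per locale:</p>

```python
# Toy per-locale separator table -- illustrative only, not CLDR data.
SEPARATORS = {
    "es-ES": {"group": " ", "decimal": ","},   # 123 456,7
    "es-MX": {"group": ",", "decimal": "."},   # 123,456.7
    "en-US": {"group": ",", "decimal": "."},
}

def format_decimal(value, locale):
    """Group integer digits in threes and apply the locale's separators."""
    sep = SEPARATORS[locale]
    integer, _, fraction = f"{value}".partition(".")
    # Collect three-digit groups from the right, then rejoin left to right.
    groups = []
    while integer:
        groups.append(integer[-3:])
        integer = integer[:-3]
    out = sep["group"].join(reversed(groups))
    return out + sep["decimal"] + fraction if fraction else out

print(format_decimal(123456.7, "es-ES"))  # 123 456,7
print(format_decimal(123456.7, "es-MX"))  # 123,456.7
```

<p>Real CLDR data also varies the group size (Bengali, as shown above, groups the first three digits and then pairs), which is precisely why you want a library rather than a hand-rolled table like this one.</p>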
<p>This demonstrates that both the language and the country can significantly change the results of basic number formatting. Now let&rsquo;s see this applied to prices in different currencies.</p>
<table class='code-listing'><tr><td class='line-numbers'><br /><code>1:&nbsp;<br />2:&nbsp;<br />3:&nbsp;</code><br />&nbsp;</td><td class='code'><br /><code><span class="keyword">my</span> <span class="symbol">$curf</span> <span class="operator">=</span> <span class="symbol">$cldr</span><span class="operator">-&gt;</span><span class="word">currency_formatter</span><span class="structure">(</span><span class="word">currency_code</span> <span class="operator">=&gt;</span> <span class="symbol">$currency</span><span class="structure">);</span><br /><br /><span class="word">say</span> <span class="double">&quot;$locale / $currency: &quot;</span><span class="operator">,</span> <span class="symbol">$curf</span><span class="operator">-&gt;</span><span class="word">format</span><span class="structure">(</span><span class="float">9.99</span><span class="structure">);</span></code><br />&nbsp;</td></table>
<p>Here are the results in American English and Canadian English, for both US Dollars and Canadian Dollars:</p>
<ul>
<li><p>en-US / USD: $9.99</p>
</li>
<li><p>en-CA / USD: US$9.99</p>
</li>
<li><p>en-CA / CAD: $9.99</p>
</li>
<li><p>en-US / CAD: CA$9.99</p>
</li>
</ul>
<p>This demonstrates that localized formatting is important even when your only supported language is English. When it comes to currency formatting, the language, country, and currency each can significantly change the results.</p>
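<p>A formatter object also need not be locked to one locale. The following is a minimal sketch, assuming CLDR::Number&rsquo;s documented read-write <code>locale</code> attribute, reusing one currency formatter for two of the locale/currency pairs shown above:</p>

```perl
use v5.10;
use CLDR::Number;

# a price in Canadian Dollars, first formatted for an American user
my $curf = CLDR::Number->new(locale => 'en-US')
    ->currency_formatter(currency_code => 'CAD');

say $curf->format(9.99);  # CA$9.99

# the locale attribute is read-write, so the same formatter
# can be reused for a Canadian user
$curf->locale('en-CA');
say $curf->format(9.99);  # $9.99
```

<p>Sharing one formatter this way avoids rebuilding locale data for every price displayed.</p>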
<h3 id="Locale::CLDR">Locale::CLDR</h3>
<p><a rel="nofollow" target="_blank" href="https://metacpan.org/module/Locale::CLDR">Locale::CLDR</a> is another new module released earlier this year by John Imrie, with the goal of providing access to all of the CLDR via locale objects&mdash;an impressive task!</p>
<p>Different locales use different punctuation, a detail that is commonly overlooked even in applications translated into many languages. Fortunately, Locale::CLDR makes this aspect of localization easy.</p>
<p>Here is a simple solution to formatting a list of strings for the user:</p>
<table class='code-listing'><tr><td class='line-numbers'><br /><code>1:&nbsp;<br />2:&nbsp;<br />3:&nbsp;<br />4:&nbsp;<br />5:&nbsp;<br />6:&nbsp;</code><br />&nbsp;</td><td class='code'><br /><code><span class="keyword">use</span> <span class="word">Locale::CLDR</span><span class="structure">;</span><br /><br /><span class="keyword">my</span> <span class="symbol">$cldr</span> <span class="operator">=</span> <span class="word">Locale::CLDR</span><span class="operator">-&gt;</span><span class="word">new</span><span class="structure">(</span><span class="symbol">$locale</span><span class="structure">);</span><br /><span class="keyword">my</span> <span class="symbol">@gifts</span> <span class="operator">=</span> <span class="words">qw( foo bar baz )</span><span class="structure">;</span><br /><br /><span class="word">say</span> <span class="double">&quot;$locale: &quot;</span><span class="operator">,</span> <span class="symbol">$cldr</span><span class="operator">-&gt;</span><span class="word">list</span><span class="structure">(</span><span class="word">map</span> <span class="structure">{</span> <span class="symbol">$cldr</span><span class="operator">-&gt;</span><span class="word">quote</span><span class="structure">(</span><span class="magic">$_</span><span class="structure">)</span> <span class="structure">}</span> <span class="symbol">@gifts</span><span class="structure">);</span></code><br />&nbsp;</td></table>
<p>The <code>quote</code> method is used to quote each element and the <code>list</code> method is used to format the entire list. Let&rsquo;s take a look at the results in Portuguese, French, and Urdu.</p>
<ul>
<li><p>pt: &ldquo;foo&rdquo;, &ldquo;bar&rdquo; e &ldquo;baz&rdquo;</p>
</li>
<li><p>fr: &laquo;foo&raquo;, &laquo;bar&raquo; et &laquo;baz&raquo;</p>
</li>
<li><p>ur: &rdquo;foo&ldquo;&#x60C; &rdquo;bar&ldquo;&#x60C; &#x627;&#x648;&#x631; &rdquo;baz&ldquo;</p>
</li>
</ul>
<p>Note that for support of all locales, you currently need the Locale::CLDR v0.25.x release from CPAN rather than v0.26.x, because the latter is being split into locale bundles and that work is still in progress.</p>
<h3 id="New-year-new-development">New year, new development</h3>
<p>We&rsquo;ve had a great year for Perl localization and I hope 2015 will be even better. Once the most important <a rel="nofollow" target="_blank" href="https://metacpan.org/module/CLDR::Number::TODO">CLDR::Number::TODO</a> tasks are completed, <a rel="nofollow" target="_blank" href="https://metacpan.org/module/DateTime::Locale">DateTime::Locale</a> will receive some much-needed love. The top gift on my wishlist is a Perl wrapper for <a rel="nofollow" target="_blank" href="http://site.icu-project.org/">ICU4C</a> (International Components for Unicode), a mature project providing full CLDR support. I&rsquo;m confident that if I continue to fill my uncle&rsquo;s boots with coleslaw on Yaksmas Eve, the Gilded Yak may finally deliver.</p>
<h2 id="See-Also">See Also</h2>
<ul>
<li><p><a rel="nofollow" target="_blank" href="https://metacpan.org/module/Locales">Locales</a> provides much of the basic CLDR data, such as names of countries and languages.</p>
</li>
<li><p><a rel="nofollow" target="_blank" href="https://metacpan.org/module/DateTime">DateTime</a> provides CLDR-based formatting using <a rel="nofollow" target="_blank" href="https://metacpan.org/module/DateTime::Locale">DateTime::Locale</a>.</p>
</li>
<li><p><a rel="nofollow" target="_blank" href="https://metacpan.org/module/Geo::Region">Geo::Region</a> provides UN M.49 and CLDR geographical region and grouping data.</p>
</li>
</ul>
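<p>The DateTime entry above can be seen in action with a short sketch. This is a hedged example of DateTime&rsquo;s documented <code>format_cldr</code> method, with an illustrative CLDR date pattern; the locale data comes from DateTime::Locale:</p>

```perl
use v5.10;
use DateTime;

# a localized date, formatted with a CLDR date pattern
my $dt = DateTime->new(
    year   => 2014,
    month  => 12,
    day    => 23,
    locale => 'fr',
);

# EEEE = full weekday name, MMMM = full month name
say $dt->format_cldr('EEEE d MMMM y');  # mardi 23 décembre 2014
```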
</div>
<p>Nick Patch, <a href="http://perladvent.org/2014/2014-12-23.html">Perl Advent Calendar</a>, Tue, 23 Dec 2014</p>