Featured in
Process & Practices

In-App Subscriptions Made Easy

There are various types of subscriptions: recurring, non-recurring, free-trial periods, varying billing cycles, and almost any billing variation one can imagine. Given the lack of reliable information online, you might discover that mobile subscriptions behave differently from what you expected. This article will make your life somewhat easier when implementing in-app subscriptions.

Featured in
Enterprise Architecture

EIP Designer: Bridging the Gap Between EA and Development

This article presents the EIP Designer project, an Eclipse-based tool for introducing integration patterns into an EA design, providing fluidity and continuity while filling the gap between EA practices and concrete software development.

Google Announces Cloud Dataflow Beta at Google I/O

At its annual developer conference, Google announced a set of new initiatives for cloud computing. At the top of the list is Cloud Dataflow -- a way of managing complex data pipelines.

InfoQ spoke with Brian Goldfarb, the Head of Product Marketing for Google's cloud platform. He pointed out that Cloud Dataflow handles both batch and streaming data. Imagine analyzing millions of tweets posted during a worldwide event in real time. In one pipeline segment, you read the tweets. In the next segment you extract tags. In another segment, you classify tweets by sentiment (positive, negative, or other). In the next segment, you filter for keywords. And so on. Map/Reduce -- an older paradigm for handling large data sets -- doesn't readily deal with such real-time data, and doesn't easily apply to such long, complex pipelines.

Google's new paradigm uses the same API for both batch processing and real-time processing, and the same API for both simple and complex pipelines. With the product, the developer concentrates exclusively on the data logic, leaving pipeline optimization details to the Google cloud. Instead of treating each pipeline segment separately, Cloud Dataflow takes into account the way segments interact with one another. That way, a single slow segment doesn't necessarily stall the action in all the downstream segments. To handle the traffic among segments, Cloud Dataflow uses aggregation by key, sliding windows, parts of Map/Reduce, and many other techniques.
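The sliding-window support, for instance, is exposed directly in the Dataflow Java SDK. Here is a minimal sketch, assuming a streaming PCollection of hashtags named tags and arbitrary window sizes (both assumptions, not details from the announcement):

    import com.google.cloud.dataflow.sdk.transforms.Count;
    import com.google.cloud.dataflow.sdk.transforms.windowing.SlidingWindows;
    import com.google.cloud.dataflow.sdk.transforms.windowing.Window;
    import com.google.cloud.dataflow.sdk.values.KV;
    import com.google.cloud.dataflow.sdk.values.PCollection;
    import org.joda.time.Duration;

    // 'tags' is an assumed PCollection<String> of hashtags extracted upstream.
    // Count occurrences per tag over a five-minute window that slides every minute.
    PCollection<KV<String, Long>> tagCounts =
        tags.apply(Window.<String>into(
                SlidingWindows.of(Duration.standardMinutes(5))
                              .every(Duration.standardMinutes(1))))
            .apply(Count.<String>perElement());

The same windowing declaration works whether the pipeline runs in batch or streaming mode, which is the point of the unified API.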

With Cloud Dataflow, Google's cloud makes choices about the best way to optimize any particular application. The developer can accept these optimizations for most scenarios, and override the defaults for edge cases.

For the developer, most scenarios involve coding against relatively simple parts of the API. Here's an example in the spirit of the one shown in the conference keynote:
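The sketch below reconstructs the idea using the Dataflow Java SDK; the bucket paths, placeholder transform bodies, and class name are assumptions, while TweetTransformer and CalculateSentiment are the segment names mentioned in the keynote:

    import com.google.cloud.dataflow.sdk.Pipeline;
    import com.google.cloud.dataflow.sdk.io.TextIO;
    import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
    import com.google.cloud.dataflow.sdk.transforms.DoFn;
    import com.google.cloud.dataflow.sdk.transforms.ParDo;

    public class TweetSentimentPipeline {

      // Hypothetical segment: normalizes a raw tweet record into plain text.
      static class TweetTransformer extends DoFn<String, String> {
        @Override
        public void processElement(ProcessContext c) {
          c.output(c.element().trim().toLowerCase()); // placeholder logic
        }
      }

      // Hypothetical segment: labels each tweet positive, negative, or other.
      static class CalculateSentiment extends DoFn<String, String> {
        @Override
        public void processElement(ProcessContext c) {
          String tweet = c.element();
          String label = tweet.contains(":)") ? "positive"
                       : tweet.contains(":(") ? "negative" : "other";
          c.output(label + "\t" + tweet);
        }
      }

      public static void main(String[] args) {
        Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

        p.apply(TextIO.Read.from("gs://my-bucket/tweets/*"))  // read the tweets
         .apply(ParDo.of(new TweetTransformer()))             // extract text
         .apply(ParDo.of(new CalculateSentiment()))           // classify sentiment
         .apply(TextIO.Write.to("gs://my-bucket/sentiment")); // write results

        p.run(); // the service optimizes and executes the segment graph
      }
    }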

The developer defines the pipeline, making sure that the code inside each segment (TweetTransformer, CalculateSentiment, and so on) is efficient and correct. Google's cloud then orchestrates the flow among the segments. Google's cloud also takes care of the low-level VM details. Operations such as deploying, scaling, spinning up, and spinning down are all done behind the scenes.

To accompany Cloud Dataflow, Google has four new tools to make the developer's work easier and more productive. The tools are Cloud Save, Cloud Debug, Cloud Trace, and Cloud Monitor.

Cloud Save is a simple API for saving and retrieving user information in the cloud. This information can include application data, user preferences, and other per-user state.

New paradigms like Storm, Spark, and Giraph have already emerged in the Hadoop space and displaced MapReduce. But because MapReduce is already so heavily used, transforming existing systems remains difficult.