Blockchain technology is on the radar of a number of tech corporations – and IBM is leading the way, according to a study. Big Blue has been making considerable strides with research and development projects aimed at broadening the scope of distributed ledger technology to industries beyond financial services.

Last month it announced a partnership with Nestle, Unilever, Wal-Mart and other food giants to use the technology to trace the movement of food and tackle contamination faster. According to research firm Juniper Research, IBM is better positioned than rival Microsoft when it comes to blockchain credentials: more than 40 percent of tech executives and leaders in the blockchain sector ranked IBM first, while only 20 percent said the same of Microsoft.

A blockchain is a huge decentralized log of data spread across numerous locations. It secures the data through cryptographically linked 'blocks', accessed via a peer-to-peer network. The purpose of the original blockchain was solely financial, serving as a distributed ledger for bitcoin transactions. In July, Juniper said that more than half (57%) of the world's large corporations are considering deploying their own blockchain solutions. The race to build distributed ledger solutions has become increasingly heated, with EY announcing its own platform aimed at securing insurance for the shipping industry earlier this month.

Edge computing refers to placing data processing power at the edge of a network rather than concentrating it in a cloud or a central data warehouse.

Cloud computing has dominated IT discussions for the last two decades, particularly since Amazon popularized the term in 2006 with the release of its Elastic Compute Cloud. In its simplest form, cloud computing is the centralization of computing services to take advantage of a shared data center infrastructure and economies of scale to reduce costs. However, latency, influenced by the number of router hops, packet delays introduced by virtualization, and server placement within a data center, has always been a key issue in cloud migration.

This is where edge computing comes in. Edge computing is essentially the process of decentralizing computing services and moving them closer to the source of data. This can have a significant impact on latency, as it can drastically reduce the volume of data moved and the distance it travels.

The term “edge computing” covers a wide range of technologies, including peer-to-peer networks, grid/mesh computing, fog computing, blockchain, and content delivery networks. It has been popular within the mobile sector, is now branching out into almost every industry, and has also been a driver of innovation within OpenStack, the open source cloud computing project.

The relationship between edge and cloud

There is much speculation about edge replacing cloud, and in some cases, it may do so. However, in many situations, the two have a symbiotic relationship. For instance, services such as web hosting and IoT benefit greatly from edge computing when it comes to performance and initial processing of data. These services, however, still require a robust cloud backend for things like centralized storage and data analysis.
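To make that division of labor concrete, here is a minimal, illustrative sketch of the pattern, assuming a hypothetical fleet of temperature sensors and a made-up cloud ingest URL: the edge node aggregates raw readings locally and forwards only a compact summary to the cloud backend, shrinking both the volume of data moved and the distance the raw data travels.

```python
import json
import statistics
import urllib.request

# Hypothetical cloud ingest endpoint; replace with a real backend.
CLOUD_ENDPOINT = "https://cloud.example.com/ingest"

def summarize_readings(readings):
    """Aggregate a batch of raw sensor readings at the edge."""
    return {
        "count": len(readings),
        "mean": statistics.mean(readings),
        "min": min(readings),
        "max": max(readings),
    }

def forward_to_cloud(summary):
    """Ship only the small summary, not the raw readings, upstream."""
    payload = json.dumps(summary).encode("utf-8")
    request = urllib.request.Request(
        CLOUD_ENDPOINT,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status

# A batch of raw readings is processed locally; only four numbers
# (plus a count) ever leave the edge node.
readings = [21.3, 21.4, 22.0, 21.9, 21.7]
print(summarize_readings(readings))
```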

Edge computing: a brief history

Edge computing can be traced back to the 1990s, when Akamai launched its content delivery network (CDN), which introduced nodes at locations geographically closer to the end user. These nodes store cached static content such as images and videos. Edge computing takes this concept further by allowing nodes to perform basic computational tasks. In 1997, computer scientist Brian Noble demonstrated how mobile technology could use edge computing for speech recognition. Two years later, the same method was used to extend the battery life of mobile phones. At the time, this process was termed “cyber foraging,” and it is essentially how both Apple’s Siri and Google’s speech recognition services work today.

1999 saw the arrival of peer-to-peer computing. In 2006, cloud computing emerged with the release of Amazon’s EC2 service, and companies have adopted it in huge numbers since then. In 2009, “The Case for VM-Based Cloudlets in Mobile Computing” was published, examining the relationship between latency and cloud computing. The paper advocated a “two-level architecture: the first level is today’s unmodified cloud infrastructure,” while the second level “consists of dispersed elements called cloudlets with state cached from the first level.” This is the theoretical basis for many aspects of modern edge computing, and in 2012 Cisco introduced the term “fog computing” for dispersed cloud infrastructure designed to promote IoT scalability.

This brings us to current edge solutions, of which there are many. Whether in purely distributed systems such as blockchain and peer-to-peer networks, or in mixed systems such as AWS’s Lambda@Edge, Greengrass, and Microsoft Azure IoT Edge, edge computing has become a key factor driving the adoption of technologies such as IoT.

EMC has announced it will acquire Greenplum, a data warehousing and business analytics software firm, for an undisclosed sum. EMC will use this acquisition to form the basis of a new Data Computing Products Division led by Bill Cook, CEO of Greenplum, who will report to Pat Gelsinger, COO of EMC's Information Infrastructure Products. To put that into perspective, Backup and Recovery Solutions (where Data Domain and other related acquisitions now live) is also a separate EMC division reporting to Gelsinger. BRS is a big division with a lot of products. Therefore, I think one can safely bet that Data Computing Products at EMC will grow in scale and scope.

And here we see that EMC's marketing minds were hard at work. While EMC is positioning Greenplum in business analytics, this new division is not being called the EMC Business Analytics Division, nor the Data Warehousing/Business Intelligence Division, nor the even sexier Cloud Analytics Division. No. This is the Data Computing Products Division. What is data computing, or a data computing product? I'll let EMC explain, because I'm not sure that I can.

First observation: A common assumption is that EMC is doing this to respond to Oracle's success with its Exadata solution and NetApp's acquisition of Bycast. True, EMC's yet-to-be-seen Data Computing solutions will likely compete with the new, post-Sun Oracle as a systems vendor. But we should add to the list Hewlett-Packard, IBM, Teradata, database vendors, cloud software vendors, and anyone else in the new, cloud-friendly business analytics space. EMC sees Greenplum's database and cloud portal technologies as disruptive to the traditional data warehousing/business intelligence market. And, in the hands of EMC's worldwide marketing and sales force, Greenplum could very well be disruptive--within the DW/BI marketplace and beyond. Read the specs on Greenplum's massively parallel processing (MPP) database product. They're impressive. What's even better for EMC is that the Greenplum Database likes to be integrated with storage. This acquisition cries out for an integrated hardware/software stack (think Vblock) or a yet-to-be-named EMC Data Computing appliance (GBlock?). Is EMC ooching toward becoming a systems vendor? No. Not without a major services group.

Right now, EMC just wants to be a major player in the fast-growing business analytics segment--hence the Greenplum acquisition and the creation of the EMC Data Computing Products Division. However, the challenge for EMC will be to grow a presence in a space where its large and formidable sales force is virtually unknown. In EMC's favor is the fact that DW/BI is rapidly evolving: from a relatively slow batch-processing application that pulls data from a few sources, to one that takes massive amounts of data from a variety of sources, including real-time sensor data, and delivers results in real or near-real time to many concurrent users. Business analytics is a new opportunity for all.

Second observation: We've seen very little mention so far of Greenplum Chorus, which is even more germane to EMC's "Journey to the Private Cloud" strategy. Greenplum calls it the "first commercial Enterprise Data Cloud." I'm not here to argue the validity of that statement. I only wish to point out that anyone who attended EMC World 2010 couldn't help but notice the Journey to the Private Cloud signage. It was everywhere. Greenplum Chorus provides cloud-based self-service provisioning of data marts, allows cloud analytics users to share data sets and data marts with others, and supports social networking and collaboration. As such, Chorus fits right in as a major journey-to-the-cloud destination. Let me restate my previous conjecture in a slightly different way: cloud data analytics is a new opportunity for all.

Expect to see a few of Greenplum's partners defect. (HP and Sun are good bets here, I think.) However, integration projects with open-source software (Hadoop, for example) will proceed. Expect to see more EMC Data Computing products relatively soon, such as an EMC/Greenplum data computing appliance or at least some DIY reference models. In a recent post, EMC blogger extraordinaire Chuck Hollis alludes to running parts of the Greenplum processing stack on any storage array that uses x86 processing technology. That allusion could presage a new optimized array from EMC, or a version of a current EMC storage array that supports Greenplum Database processing offload.

At the UC Office of the President, the IT function is always on the lookout for opportunities to partner with other campuses and leverage their services. This has proven to be both cost-effective and a great way to develop strong cross-location relationships, ultimately creating even more efficient uses of shared services.

Currently, UCOP uses the services of the Production Control Shared Service Center (PCSSC), located at UC Berkeley, for batch scheduling and managed file transfers. The PCSSC was formed in 2012 through a collaboration between UC San Francisco, UCOP, and UC Berkeley to consolidate duplicate functions into a shared service.

As the formal transition of this service occurred, meetings were held to determine how incidents would be handled. At that time, UC Berkeley was using Footprints and UCOP was using ServiceNow, so the plan going forward was to use email and phone calls for raising requests and reporting incidents. As you can imagine, this resulted in a few actions falling through the cracks and did not provide great visibility into the status of the various activities.

When UC Berkeley established its instance of ServiceNow, a collaboration between the two campuses’ ServiceNow development teams yielded the creation and implementation of reusable integration code called UCMITI – UC Multi Instance Task Integration. Through the work of Manager and Architect Scott Hall’s team from UC Berkeley and Nithin Reddy, ServiceNow administrator at UCOP, the project moved very quickly to production. “The close collaboration between our two teams, along with a rapid iterative approach to design and development, allowed us to accomplish quite a lot in a short amount of time,” Hall said.

With UCMITI activated within each ServiceNow instance, tasks became integrated. When PCSSC opens a ServiceNow incident in their environment at UC Berkeley, the integration creates and assigns a task to UCOP within UCOP’s ServiceNow instance. At present, task creation and updates are a one-way communication; however, future work is planned to include bi-directional coordination.
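UCMITI itself is internal to the UC teams, so the sketch below only illustrates the general one-way pattern described above, using ServiceNow's standard REST Table API: when a local incident opens, POST a corresponding task record into the remote instance. The instance URL, credentials, and field values are all hypothetical.

```python
import requests

# Hypothetical target instance and service account.
REMOTE_INSTANCE = "https://ucop.service-now.com"
AUTH = ("integration.user", "integration.password")

def create_remote_task(incident_number, short_description):
    """Mirror a locally opened incident as a task in the remote
    ServiceNow instance via the REST Table API."""
    response = requests.post(
        f"{REMOTE_INSTANCE}/api/now/table/task",
        auth=AUTH,
        headers={"Accept": "application/json"},
        json={
            "short_description": short_description,
            "description": f"Mirrored from source incident {incident_number}",
        },
    )
    response.raise_for_status()
    return response.json()["result"]["sys_id"]

task_id = create_remote_task("INC0012345", "Nightly batch schedule failed")
print(f"Created remote task {task_id}")
```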

The result is that incidents are created, tracked, and managed in the system of record for each organization, giving visibility to the local support organizations for incident resolution. The integration took on greater importance when UCOP moved to UCPath because it provides critical and timely communication and incident resolution to UCPath customers.

So ultimately, our initial collaboration in forming the PCSSC put us on the path to improve that service via more collaborative efforts, and to develop the framework that would enable the service to adapt to future needs!

Machine learning is the real reason for Apache Spark because, at the end of the day, you don't want to just ship and transform data from A to B (a process called ETL: Extract, Transform, Load). You want to run advanced data analysis algorithms on top of your data, and you want to run them at scale. This is where Apache Spark kicks in.

Apache Spark, at its core, provides the runtime for massively parallel data processing, and different parallel machine learning libraries run on top of it. This matters because there is an abundance of machine learning algorithms for popular programming languages like R and Python, but they are not scalable: as soon as the data exceeds the system's available main memory, they crash.

Apache Spark, in contrast, can combine multiple compute nodes into a cluster and, even on a single node, can transparently spill data to disk, thereby avoiding the main-memory bottleneck. Two interesting machine learning libraries ship with Apache Spark, but in this work we'll also cover third-party machine learning libraries.
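As a small illustration of that spilling behavior, PySpark lets you mark a dataset with a storage level that keeps partitions in memory when they fit and transparently writes the rest to local disk; a minimal sketch:

```python
from pyspark import StorageLevel
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("spill-demo").getOrCreate()

# A DataFrame that may well exceed the executors' available memory.
df = spark.range(0, 100_000_000)

# MEMORY_AND_DISK caches partitions in memory where possible and
# spills the remainder to local disk instead of failing.
df.persist(StorageLevel.MEMORY_AND_DISK)
print(df.count())

spark.stop()
```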

The classical Spark MLlib module offers a growing but still incomplete list of machine learning algorithms. Since the introduction of the DataFrame-based machine learning API called SparkML, the destiny of MLlib is clear: it is kept only for backward-compatibility reasons.

This is indeed a very wise decision; as we will discover in the next two chapters, structured data processing and the related optimization frameworks are currently disrupting the whole Apache Spark ecosystem. With SparkML, we have a machine learning library in place that can take advantage of these improvements out of the box, using them as an underlying layer.
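To give a feel for the DataFrame-based API, here is a minimal SparkML pipeline on a tiny, made-up in-memory dataset: a VectorAssembler turns raw columns into the single feature vector the estimators expect, and a logistic regression is fitted and applied in one pipeline.

```python
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("sparkml-demo").getOrCreate()

# Tiny illustrative training set: two features and a binary label.
train = spark.createDataFrame(
    [(0.0, 1.1, 0.0), (2.0, 1.0, 1.0), (2.1, 1.3, 1.0), (0.1, 1.2, 0.0)],
    ["f1", "f2", "label"],
)

# SparkML estimators consume a single vector column, assembled here.
assembler = VectorAssembler(inputCols=["f1", "f2"], outputCol="features")
lr = LogisticRegression(featuresCol="features", labelCol="label")

model = Pipeline(stages=[assembler, lr]).fit(train)
model.transform(train).select("features", "label", "prediction").show()

spark.stop()
```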

A recent press release out of the company states, “Informatica, the Enterprise Cloud Data Management leader accelerating data-driven digital transformation, today announced that Gartner, Inc., a leading IT research and advisory firm, has positioned Informatica as a Leader in its 2017 Magic Quadrant for Data Integration Tools for the 12th consecutive year. This year’s report marks the fourth year in a row that Gartner has positioned Informatica as the furthest and highest on the completeness of vision and ability to execute axes, respectively. The complete report, including the quadrant graphic, was published on August 3, 2017, and is available at https://informatica.com/data-integration-magic-quadrant.html. ‘Gartner estimates that the data integration tool market generated more than $2.7 billion in software revenue (in constant currency) at the end of 2016. A projected five-year compound annual growth rate of 6.32% will bring the total market revenue to around $4 billion in 2021 (see ‘Forecast: Enterprise Software Markets, Worldwide, 2014-2021, 2Q17 Update’)’.”

The release goes on, “According to the Gartner report, ‘the data integration tool market has established a focus on transformational technologies and approaches demanded by data and analytics leaders. The presence of legacy, resilient systems, and innovation all in the market together requires robust, consistent delivery of highly developed practices.’ The report notes that ‘the biggest change in the market from 2016 is the pervasive yet elusive demand for metadata-driven solutions. Consumers are asking for hybrid deployment not just in the cloud and on-premises (which is metadata-driven combined with services distribution), but also across multiple data tiers throughout broad deployment models, plus the ability to blend data integration with application integration platforms (which is metadata-driven in combination with workflow management and process orchestration) and a supplier focus on product and delivery initiatives to support these demands’.”

Bitcoin topped the $2,500 mark for the first time recently. A surge in demand from China coupled with an increase in ICOs (initial coin offerings) is likely driving the most recent spike in price. With this milestone now in the rearview mirror, I thought it might be interesting to discuss a few applications spanning two technologies that I’m somewhat familiar with: unified communications (UC) and blockchains.

A Brief History

I began learning about blockchain algorithms in 2011, which led me to start bitcoin mining when 1.00 BTC was worth $6.00 USD. Note that this was when GPU mining (hashing with high-end graphics cards) was still economical. At the height of my mining career, I probably had 40 GPUs running concurrently.

I eventually recruited my brother to help out. We mined thousands of bitcoins. It was fun, profitable and exciting to be (sort of) on the fringe of society. I dabbled a bit in next-gen ASIC mining equipment but eventually let the hobby go when it started to feel like another job. I was busy enough building a tech startup with some awesome co-founders. Our company was experiencing a growth spurt around the time bitcoin emerged. I helped write some early UC applications for our platform, so there were more than enough exciting challenges to keep my mind occupied. Fast-forward to today, where unified communications and blockchain may now be on a collision course.

What Is A Blockchain?

Without going into a lot of technical detail, a blockchain is essentially an ever-growing list of transactions (listed in blocks) that are verified and permanently recorded. Each new block is linked to the previous block in chronological order, thus forming the chain. Blockchains are commonly stored in public distributed databases that allow for decentralized (peer-to-peer) ratification and acceptance. Blockchains are used in cryptocurrencies (like bitcoin) because they ensure a clear record of who owns what and are effectively immune from retroactive changes.
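A toy model makes the linking concrete: each block stores the hash of its predecessor, so altering any historical block changes its hash and breaks every link after it. The sketch below is a deliberately minimal illustration (no proof-of-work, no network):

```python
import hashlib
import json
import time

def make_block(transactions, prev_hash):
    """Create a block whose hash covers its contents and the
    hash of the previous block, forming the chain."""
    block = {
        "timestamp": time.time(),
        "transactions": transactions,
        "prev_hash": prev_hash,
    }
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    block["hash"] = hashlib.sha256(payload).hexdigest()
    return block

def block_hash(block):
    """Recompute a block's hash from its recorded fields."""
    fields = {k: block[k] for k in ("timestamp", "transactions", "prev_hash")}
    payload = json.dumps(fields, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

# Build a short chain; each block commits to its predecessor.
genesis = make_block(["alice pays bob 1"], prev_hash="0" * 64)
second = make_block(["bob pays carol 1"], prev_hash=genesis["hash"])

# Tampering with an old block invalidates every later link.
genesis["transactions"] = ["alice pays mallory 1000"]
print(block_hash(genesis) == second["prev_hash"])  # False
```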

Have you all thought about upgrading from WCS v7.0 to WCS v8.0? If not, then it's probably about time you did. The migration is more straightforward than in previous releases; however, the effort depends on your customizations and architecture.

“Why should I upgrade?” you might ask. Well, IBM WebSphere Commerce v8.0 delivers a new approach to e-commerce with an improved business user experience to help merchandisers and marketers deepen customer engagement and greatly enhance business results. WCS v8.0 also includes a new customer service capability that enables organizations to provide a seamless experience across digital and call center channels.