DrC0s
https://drc0s.wordpress.com
Musings of Dao-Clinicist / 道可道 非常道

Cool FOSS’ heads prevail once again
Sat, 23 Sep 2017 17:24:45 +0000
https://drc0s.wordpress.com/2017/09/23/cool-foss-heads-prevail-once-again/

As you have seen in my last post or elsewhere, Facebook recently added a dubious patent clause to the license of multiple projects, including ReactJS. And predictably, a number of organizations, companies, and open-source advocates made it clear that it’s way too dangerous to keep using the code under such restrictions because of possible legal repercussions.

Well, I am pleased to tell all my readers that Facebook has back-tracked on this after the Apache Foundation, WordPress, and many others expressed their clear intention of switching to safe alternatives to React.js and other frameworks from FB, or of banning their use. As you all know, FOSS is a free-market ecosystem; it thrives on the forces of intellectual competition, always offering multiple choices to its users. And this approach won again: facing the danger of losing their user base and, effectively, rendering themselves irrelevant, they made the decision to, once again, re-license some of their projects under MIT.

Namely, ReactJS will be released under the new license. So if you are using it – make sure to update your dependencies to v16 once it is out next week. Remember, re-licensing isn’t usually retroactive: releases published under the old license stay under it, so don’t fall into that trap.

Disclaimer: I am not using, planning to use, or recommending any Facebook-sponsored projects.

https://drc0s.wordpress.com/2017/08/26/facebook-licensed-code-is-kicked-out/

I won’t bother you with many details, as they are readily available elsewhere. I just want to point out that Facebook is hedging their open-source “exposure”. What they are effectively saying is “Go ahead and use our awesome stuff. But if we ever decide that you’re competing with us, we’ll yank your license to use our frameworks so fast your shoes will fall off.” It doesn’t matter if someone has developed this code for you: you won’t be able to use it anyway.

That’s the essence. It is the original intention of the license behind ReactJS and a few other frameworks. And that’s why the Apache Foundation has moved the license to Cat-X (Category X, the list of licenses that Apache projects may not include), prohibiting any of its projects from touching things like ReactJS. Facebook software is NOT compatible with projects developed under the widely accepted and respected ALv2.

Here’s the excerpt:

Facebook BSD+Patents license

The Facebook BSD+Patents license includes a specification of a PATENTS file that passes along risk to downstream consumers of our software imbalanced in favor of the licensor, not the licensee, thereby violating our Apache legal policy of being a universal donor. The terms of Facebook BSD+Patents license are not a subset of those found in the ALv2, and they cannot be sublicensed as ALv2.

These are the unintended consequences of meddling with well-thought-out open-source software licenses. That is the beauty of open source: if you try to lock people in or out – they will move. It doesn’t matter how much money you have, how big you are, or what your SJW position is. Developers will go, and the users will as well.

I’m sure we haven’t heard the last of it yet. And that’s the damning and loud application of the golden rule!

Gab.ai was kicked from GooglePlay…
Sat, 19 Aug 2017 00:04:20 +0000
https://drc0s.wordpress.com/2017/08/18/gab-ai-was-kicked-from-googleplay/

But who cares… All you need to do is go to Gab’s website, and right there on the left side is a link where you can grab the .apk package for the mobile app and install it.

Make sure your phone’s settings allow installing applications from “Unknown sources” (I will let you figure out how to do it ;), and voilà. Enjoy!

Apache process (webinar)
Wed, 16 Aug 2017 23:57:44 +0000
https://drc0s.wordpress.com/2017/08/17/apache-process-webinar/

A few days ago, I gave this talk about the Apache Software Foundation processes (however few of them there are) and how communities operate. If you are interested, there’s a recording of the webinar.

Finally, I have moved away from Google!
Wed, 16 Aug 2017 23:35:56 +0000
https://drc0s.wordpress.com/2017/08/17/finally-i-have-moved-away-from-google/

As you might have noticed, my blog is no longer hosted on Blogger.com (actually Google).

I did it for two reasons:

I had been planning it for a long time because of the somewhat mediocre functionality of Blogger.

The last straw was Google’s reaction to the intellectual argument of James Damore (if you aren’t familiar with the story, it means you probably were a part of the first Mars expedition). I cannot trust my content to a company that suppresses free speech in full disregard of individual rights protected by the law of the land.

I made an effort to make sure that old URLs are working properly and redirect you to the new location. That should take care of cached searches and bookmarks. If you notice that something is missing – please let me know, so I can fix it ASAP.

https://drc0s.wordpress.com/2016/07/15/apache-ignite-vs-alluxio-former-tachyon/

Well, ever since the company behind the read-only open-source project called Tachyon decided to change the name of the project, I have been puzzled. If you build something successful, you want its name to be recognized, right? In marketing, it is called “brand recognition”.

Why would Coca-Cola rename their product to SludgeWaters? Indeed, it doesn’t make much sense! The most infamous brand-recognition screw-up was when SUNW (Sun Microsystems) got renamed to JAVA on the NASDAQ. And _that_ ended well, for sure. The brilliant idea belonged to the Silicon Valley class clown with the ponytail. I am sure you know whom I am referring to.

At any rate, why would an allegedly successful software project change its name in the middle of its rise? I have a hypothesis that it was caused by the fact that any time one searches for Tachyon on Google (or elsewhere), the first link popping up would be to my blog post from last year, and the close second would point to the story of how the Tachyon BDFL decided to remove my benign answer from their public mailing list.

So, in the interest of preserving history, I am putting up a new post, correcting the name to reflect the new reality of the Alluxio project. The technical findings still stand, so just go and read the year-old blog post to figure out where the old application with the new name is falling short.

Last but not least, since the time of the original write-up, Apache Ignite has graduated to an Apache top-level project (TLP), which is why the “(incubating)” suffix is dropped as well.

Let’s speed up Apache Hive with Apache Ignite & Apache Bigtop
Thu, 08 Oct 2015 00:41:00 +0000
https://drc0s.wordpress.com/2015/10/08/lets-speed-up-apache-hive-with-apache-ignite-apache-bigtop/

Today we will be looking into how we can speed up Hive using Apache Ignite. For this particular exercise I will be using the Apache Bigtop stack v1.0 because I don’t care to waste my time on manual cluster setup, nor do I want to use any of the overly complex stuff like Cloudera Manager or Ambari. I am a Unix command-line guy, and the CLI leaves all these fancy yet half-baked contraptions biting the dust. Let’s start.

For simplicity, I’d suggest using Docker. If you don’t know how to use Docker, you can do the same on your own system and clean up the mess later. Or better yet – learn how to use Docker (if you’re on a Mac – you’re on your own!). Despite all the hype around it, it is still a useful tool in some cases. I’ll be using a container from an official Bigtop Ubuntu-14.04 image:

Now you can follow bigtop-deploy/puppet/README.md on how to deploy your cluster. Make sure you have selected hadoop, yarn, ignite-hadoop, and hive while editing /etc/puppet/hieradata/site.yaml (as specified in the README.md). Once the puppet apply command has finished, you should have a nice single-node cluster running HDFS, YARN, and ignite-hadoop with IGFS. Hive should be configured and ready to run. Let’s do a couple more steps to get the data in place and ready for the experiments:

And now to Hive. Make sure it is executed with the proper configuration to take advantage of the in-memory data fabric provided by Apache Ignite. Let’s start the Hive CLI to work with the Ignite cluster, set up the tables, and run some queries:

Notice the times of both queries. Quit the Hive session and restart it with the standard config to run on top of YARN:

% hive cli

-- All the tables are still in place, so let’s just repeat the queries:
SELECT COUNT(*) FROM batting WHERE year > 1909 AND year <= 1969;
SELECT a.year, a.player_id, a.runs FROM batting a JOIN (SELECT year, MAX(runs) runs FROM batting GROUP BY year) b ON (a.year = b.year AND a.runs = b.runs);

Once again: notice the execution times and appreciate the difference! Enjoy!
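
If you don’t have a cluster at hand and just want to see what the two queries compute, here is a minimal sketch. It is not part of the walkthrough above: it uses Python’s built-in sqlite3 module instead of Hive, and the sample rows, the column types, and the helper names build_sample_db and run_queries are all made up for illustration. It says nothing about Hive or Ignite timings.

```python
import sqlite3

# Hypothetical sample rows (year, player_id, runs); NOT real batting data.
SAMPLE_ROWS = [
    (1905, "p01", 80),
    (1910, "p02", 95),
    (1910, "p03", 110),
    (1969, "p04", 120),
    (1969, "p05", 120),
    (1970, "p06", 130),
]

# First query from the post: how many rows fall into the (1909, 1969] range.
COUNT_QUERY = "SELECT COUNT(*) FROM batting WHERE year > 1909 AND year <= 1969"

# Second query from the post: join each row against the per-year maximum of
# runs, which keeps only the top scorer(s) of every year. The ORDER BY is an
# addition here so that the output order is deterministic.
TOP_SCORER_QUERY = """
SELECT a.year, a.player_id, a.runs
FROM batting a
JOIN (SELECT year, MAX(runs) AS runs FROM batting GROUP BY year) b
  ON (a.year = b.year AND a.runs = b.runs)
ORDER BY a.year, a.player_id
"""


def build_sample_db():
    """Create an in-memory SQLite database with an assumed batting schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE batting (year INTEGER, player_id TEXT, runs INTEGER)"
    )
    conn.executemany("INSERT INTO batting VALUES (?, ?, ?)", SAMPLE_ROWS)
    return conn


def run_queries(conn):
    """Run both queries and return (row count, list of top-scorer rows)."""
    count = conn.execute(COUNT_QUERY).fetchone()[0]
    top_scorers = conn.execute(TOP_SCORER_QUERY).fetchall()
    return count, top_scorers


if __name__ == "__main__":
    count, top_scorers = run_queries(build_sample_db())
    print("rows in range:", count)
    for row in top_scorers:
        print(row)
```

Note that the join returns more than one row for a year when several players tie for the maximum, which is exactly what the Hive version would do as well.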

30+ times faster Hadoop MapReduce application with Bigtop and Ignite
Wed, 06 May 2015 01:50:00 +0000
https://drc0s.wordpress.com/2015/05/06/30-time-faster-hadoop-mapreduce-application-with-bigtop-and-ingite/

Did you ever wonder how you can deploy a Hadoop stack quickly? Or what can be done to speed up that slow MapReduce job? Look no further – with Apache Bigtop you can get a Hadoop cluster stack deployed in a matter of a few minutes with no hassle and no sweat. And how do you run your old MapReduce applications very fast? Apache Ignite (incubating) gives you that option out of the box with its Hadoop Accelerator.

The stack being deployed in the following demo is from Apache Bigtop 1.0 RC (Hadoop 2.6, Ignite 1.0, etc.). Enjoy!

Apache Ignite vs Apache Spark
Wed, 29 Apr 2015 00:45:00 +0000
https://drc0s.wordpress.com/2015/04/29/apache-ignite-vs-apache-spark/

Complementary to my earlier post on Apache Ignite’s in-memory file-system and caching capabilities, I would like to cover the main points of difference between Ignite and Spark. I see questions like this coming up repeatedly. It is easier to have them answered in one place, so you don’t need to fish around the Net for the answers.

– The main difference is, of course, that Ignite is an in-memory computing system, i.e. one that treats RAM as the primary storage facility, whereas others – Spark included – only use RAM for processing. The former, memory-first approach is faster because the system can do better indexing, reduce the fetch time, avoid (de)serializations, etc.

– Also, unlike Spark’s, the streaming in Ignite isn’t quantized by the size of an RDD. In other words, you don’t need to form an RDD first before processing it; you can actually do real streaming. This means there are no delays in processing the stream content in the case of Ignite.

– Spill-overs are a common issue for in-memory computing systems: after all, memory is limited. In Spark, where RDDs are immutable, if an RDD got created with its size > 1/2 of the node’s RAM, then a transformation and the generation of the resulting RDD’ will likely fill all of the node’s memory, which will cause a spill-over unless the new RDD is created on a different node. Tachyon was essentially an attempt to address this, using old RAM-drive tech with all its limitations. Ignite doesn’t have this issue with data spill-overs, as its caches can be updated in an atomic or transactional manner. However, spill-overs are still possible: the strategies to deal with them are explained here.

– As one of its components, Ignite provides a first-class-citizen file-system caching layer. Note, I have already addressed the differences between Tachyon and Ignite, but for some reason my post got deleted from their user list. I wonder why?

– Ignite supports full SQL99 as one of the ways to process the data, with full support for ACID transactions.

– Ignite supports in-memory SQL indexes, which make it possible to avoid full scans of data sets, directly leading to very significant performance improvements (also see the first point).

– With Ignite, a Java programmer doesn’t need to learn the ropes of Scala. The programming model also encourages the use of Groovy. And I will withhold my professional opinion about the latter in order to keep this post focused and civilized.
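
To make the point about indexes and full scans a bit more tangible, here is a small sketch. It is an analogy only: it uses Python’s built-in sqlite3 module with an in-memory database, not Ignite, and the batting table, the index name idx_batting_year, and the helper names query_plan and build_table are assumptions made up for the illustration. It asks the SQLite planner how it would execute the same lookup before and after an index exists.

```python
import sqlite3

# The lookup whose execution plan we inspect (table and columns are assumed).
LOOKUP = "SELECT player_id, runs FROM batting WHERE year = ?"


def build_table(n_rows=1000):
    """Create an in-memory table filled with synthetic, made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE batting (year INTEGER, player_id TEXT, runs INTEGER)"
    )
    conn.executemany(
        "INSERT INTO batting VALUES (?, ?, ?)",
        ((1900 + i % 100, f"p{i}", i % 150) for i in range(n_rows)),
    )
    return conn


def query_plan(conn, sql, params=()):
    """Return SQLite's human-readable plan for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The textual description is the last column of every plan row.
    return " ".join(str(row[-1]) for row in rows)


if __name__ == "__main__":
    conn = build_table()

    # Without an index the planner has to visit every row of the table.
    print("no index:  ", query_plan(conn, LOOKUP, (1950,)))

    # With an index on the filtered column it can jump to the matching rows.
    conn.execute("CREATE INDEX idx_batting_year ON batting (year)")
    print("with index:", query_plan(conn, LOOKUP, (1950,)))
```

The exact wording of the plan differs between SQLite versions, but the first plan reports a scan of the table and the second a search using the index. The same idea, applied to data that already lives in RAM, is what the index bullet above is about.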

I can keep on rambling for a long time, but you might consider reading this and that, where Nikita Ivanov – one of the founders of this project – has a good reflection on other key differences. Also, if you like what you read – consider joining the Apache Ignite (incubating) community and start contributing!

Apache Ignite (incubating) vs Tachyon
Tue, 28 Apr 2015 01:51:00 +0000
https://drc0s.wordpress.com/2015/04/28/apache-ignite-incubating-vs-tachyon/

After the discovery that my explanation of the differences between Apache Ignite (incubating) and the Tachyon caching project had been deleted, I found out that my attempt to clarify the situation was purged as well.
About the same time, I got a private email from the tachyon-user Google group explaining to me that my message “was deleted because it was a marketing message”.

So, it looks like any message even slightly critical of the Tachyon project will be deleted as a ‘marketing msg’, in true FOSS spirit! Looks like the community building got off on the wrong foot on that one. So, I have decided to post the original message, which of course was sent back to me via email the moment it got posted in the original thread.

Apache Ignite (incubating) is a fully developed In-Memory Computing (IMC) platform (aka data fabric). “Support for the Hadoop ecosystem” is one of the components of the fabric. And it has two parts:
– file system caching: a fully transparent cache that gives a significant performance boost to HDFS IO. In a way, it’s similar to what Tachyon tries to achieve. Unlike Tachyon, the cached data is an integral part of a bigger data fabric that can be used by any Ignite services.
– MR accelerator that allows running “classic” MR jobs on the Ignite in-memory engine. Basically, Ignite MR (much like its SQL and other computation components) is just a way to work with data stored in the cluster memory. Shall I mention that Ignite MR is about 30 times – that’s 3000% – faster than Hadoop MR? No code changes are needed, BTW.

When you say “Tachyon… support big data stack natively.” you should keep in mind that Ignite Hadoop acceleration is very native as well: you can run MR, Hive, HBase, Spark, etc. on top of IgniteFS without changing anything.

And here’s the catch, BTW: file system caching in Ignite is a part of its ‘data fabric’ paradigm, like the services, advanced clustering, distributed messaging, ACID real-time transactions, etc. Adding the HDFS and MR acceleration layer was pretty straightforward, as it was built on the advanced Ignite core, which has been in real-world production for 5+ years. However, it is very hard to achieve the same level of enterprise computing when you start from an in-memory file system like Tachyon. Not bashing anything – just saying.

I would encourage you to check ignite.incubator.apache.org: read the docs, try version 1.0 from https://dist.apache.org/repos/dist/release/incubator/ignite/1.0.0/ (setup is a breeze) and join our Apache community. If you are interested in using Ignite with Hadoop – Apache Bigtop offers this integration, including seamless cluster deployment, which lets you get started with a fully functional cluster in a few minutes.

In the interest of full disclosure: I am an Apache Incubator mentor for the Ignite project.