Edward Capriolo

Feature Flag dark launch library from the guy who has his own everything

Hopefully you do not deploy "dangerous" code, but sometimes, if you want to "get dangerous", you might only want to roll the code out to a portion of your users. So if someone says "let's get dangerous" think

First, huffpost bloggers are not the same as huffpost reporters. I have a huffpo blog and I am not affiliated with huffpost anymore. (I am not sure if the writer mentioned in the article is a blogger or an employee.)

Second, the Huffington Post has been openly against Trump from the beginning. At the end of every Trump article is this footer: http://www.huffingtonpost.com/entry/donald-trump-women-sick_us_5804d6ece4b0e8c198a8fb66

secret or illegal cooperation or conspiracy, especially in order to cheat or deceive others.

The Huffington Post has declared it does not like Trump; it is not a secret. I can not speak for what is legal, and without the secret you can not have a conspiracy, and I also do not see how the information is deceptive.

I have decided to change gears a bit and review one of my favorite Android games, Deus Ex: The Fall. I was a big fan of Deus Ex 3 which came out on the xbox. For those not familiar, Deus Ex is a sneak shooter. I actually play 'the fall' on train rides home; it took me a few months of playing it periodically to beat it.

What makes this game special?

In the near future humans can be outfitted with augmentations, or "augs". They do things like steady your gun arm, provide mimetic camouflage, etc. The way the Deus Ex game balances is that you can not afford all the augs, so you pick and choose the ones that match your game play. For example, if you like the run and gun type, you focus on body armor, speed enhancements and take downs, but if you want to sneak around you focus on stealth enhancements.

What makes a BAD sneak shooter?

What makes a bad sneak shooter is huge missions. When you're walking through a warehouse and you have to choke out 500 people over 4 hours of game play, this is just annoying. Think about it, could you imagine that in three or four hours no one realized that 500 security guards have not checked in? Or in 4 hours that one guy at the computer would not go for a bathroom break and just happen to look in one of the 90 lockers you have hidden bodies in? Just not possible and kinda silly.

Why does 'The Fall' avoid this?

Well obviously this is an Android game, so by its nature it avoids huge levels. This actually gives the game the right feel: they are small levels with a few rooms, you execute a few tactical take downs and you get a reward! In the xbox game a lot of time is spent moving/hiding bodies, so as not to alert others and bring about a free for all. In 'the fall' the bodies just vanish after a few seconds. Bodies vanishing is not realistic, but I think it goes with the style: you knock someone out and you move on. When I play I simply force myself into the mind of a character and play a 'realistic' way. There is no way an augmented human is going to huddle in a corner waiting for 3 hours for 3 different people to be in the "perfect place". You just make a move and be damned with the consequences.

Controls

I was rather impressed with the controls; in fact I enjoyed them more than the console version. On screen you can switch weapons fast, and icons appear when you are in take down range. A rather cool thing is that in the settings menu you can adjust the placement of each of the on screen controls. I was super impressed by this. I really did not have to move anything, but the fact that you could I thought was pretty neat.

Tidbits

One thing I enjoy is that around the game there are PDAs and computers that you can read or hack into to get some back story into the game and hints into what is unfolding. I really like that in all games; they did this in Gears of War with journals and cogs. The nice part is this is always optional. You are not forced to watch 10 minute movies, but if you care you can review the data in the world. You can also talk to random people like a standard RPG, and while they do not have a ton to say that is still pretty cool.

Plot

You are an ex special forces character with augs drawn into something bigger than you. You are living below the radar and have to go on a variety of missions to acquire the drugs that keep you from rejecting your augs. As that goes down you have to deal with people who offer you what you need in exchange for your services and you are free to embark on side quests. For a 99 cent Android game this plot is amazing, and it would still be a fairly in depth plot for a console game.

Pros

Flexible game play, large environment to explore, upgradeable character attributes, upgradeable weapons. Nice graphics and controls for a cell phone game. Retained a lot of the feel from the xbox game while moving to a cell platform.

Cons

While it is a sneak shooter the game is more biased towards the sneak; even with armor upgrades a couple of well placed shots from enemies can put you down. The game is less fun to play as a shooter IMHO. Environments seem more detailed than characters.

Overall

If you like the console game and you have a 30 minute train ride home every day this game is amazing. Since it is an older game it is totally worth the cost, about 99 cents. I would still happily pay 3 or 4 dollars for it.

A sick blind devotion to python completely unchecked by reason

I was talking to a Python user about Spark:

Me: "What were you looking to use spark for?"

Them: "I hear there is PySpark."

Me: "Yes, very interesting, what are you looking to use it for?"

Them: "PySpark."

ROFL: The only take away about the spark platform is PySpark? Nothing else seemingly was interesting or caught your attention? Really nothing about streaming or in memory processing, just PySpark? lol #blinders

You would think [data] scientists want to learn things?

I encounter this debate mostly with hive-streaming. When someone asks me about hive streaming I look at the problem. Admittedly there are actually a couple of tasks most easily addressed with streaming. But the majority of streaming things can be solved much more efficiently and correctly by writing a simple UDF or UDAF in Java. What normally is a common reply when a Hive Committer, who wrote a book on hive, explains unequivocally that a UDF is better for performance, debugging, testability, and is not that hard to write?

"I don't want to learn how to compile things | learn about java | learn about what you think is the right way to do things". You would think that a data scientist who is trying to search for great truths would actually want to find the best way to use a tool they have been working with for years.

Just to note: In hive streaming everything moves between processes via pipes, which is something like 4 context switches and two serializations for each row (not including the processing that has to happen in the pipe).

I don't care that 100% of the environment is Java, I'm f*ckin special

A few years back someone (prototyping in python) suggested we install LibHDFS. Later someone suggested we install WebHDFS. The only reason to install these things is that they must use python to do things, even if there already are prior examples of doing this exact task in java in our code base. Sysadmins should install new libraries, open new ports, monitor new services, and we should change our architecture, just because the python user does not want to use Java for a task that 10 previous people have used java for.

"I'm Just prototyping"

This is the biggest hand wave. When scoping out a new project don't bother looking for the best tool for the job. Just start hacking away at something and then whatever type of monstrosity appears, just say it's already done, someone will just have you jam it into production anyway. Good luck supporting the "prototype" with no unit tests in production for the next 4 years. You would think that someone would take a lead from a professional coder and absorb their best practices. No of course not, they instead will just tell you how best practices don't apply to them. #ThisISSparta!

Anyway it's 7:00 am and I woke up to write this so that I can vent. But yea it's not python, it's not data scientists, but there is just a hybrid intersection of the two that is so vexing.

After working for a few companies a few things have become clear to me. Some background, I have been at small companies with no code, large companies with little code, small companies with a lot of code, and large companies where we constantly re-write the same code.

I was watching an episode of 'shark tank'. Contestant X had a product, call it 'Product X', and four of the five sharks offered nothing. The 5th shark, being very shark like, used this opportunity to offer a 'bad' deal. The maker of 'Product X' thought it over, refused the deal, and left with no deal. The other sharks were more impressed with 'Contestant X' than 'Product X'. They remarked that "No deal is better than a Bad Deal". This statement is profound and software products should be managed the same way.

Think about the phrase tech-debt. People might say tech-debt kills your agility. But it is really not the tech-debt alone that kills your agility, it is 'bad deals' that lead to tech-debt. As software gets larger it becomes harder to shape and harder to manage. At some point software becomes very big, and change causes a cascade of tech-debt. Few people want to remove a feature. Think about Monkeys on a Ladder, and compare this to your software. Does anyone ever ask you to remove a feature? Even if something is rarely used or never used someone might advocate keeping it, as it might be used later. Removing something is viewed as a loss, even if it really is addition by subtraction. Even if no one knows who asked for this rule people might advocate keeping it anyway! Heck even if you find the person who wanted the feature and they are no longer at the company, and no one else uses it, people might advocate keeping it anyway!

The result of just-keep-it thinking is you end up keeping around code you won't use, which prevents you from easily adding new code. How many times have you heard someone say, 'Project X (scoff)!? That thing is a mess! I can re-write that in scala-on-rails in 3 days'. 4 weeks later when Project X on-scala-on-rails is released a customer contacts you about how they were affected because some small business rule was not ported correctly due to an oversight.

The solution to these oversights is not test-coverage or sprints dedicated to removing tech-debt. The solution is never to make a bad deal. Do not write software with niche cases. Do not write software with surprising rules. The way I do this is a mental litmus test: Take the exit criteria of an issue and ask yourself, "Will I remember this rule in one year?". If someone asks you to implement something and you realize it was implemented a year ago and no one ever used it, push back: let them know the software has already gone in this direction and it led nowhere. If you're a business and you're struggling to close deals because the 'tech people' can not implement X in time, close a deal that does not involve X.

I used to be fairly anti-cloudera. I was never really convinced you needed someone to package up hadoop for you; your admins should just learn it. These days Hadoop is N degrees harder and I don't really have as much give-a-crap for learning to configure all the knobs that change names all the time. Thus I am more or less happy to let cloudera handle installing the 9000 hadoop components.

But really cloudera's testing is not that great. In my last version of cdh, decommissioning NodeManagers causes yarn to stop accepting jobs. ::Major fail:: Upgrade, and in the new version hive can not support custom hive serdes because of an upstream Hive bug.

Still no out of the box tez support even though it's clearly the way forward (and would make everything umpteen times faster)

Does not really look like cloudera can/wants to keep up with Hive's release cycle

Sabotaging features by adding check boxes and disabling things that work out of the box "Check the box for Enable Hive on Spark (Unsupported)."

Constant complaints in manager that you should have a metastore server or should have zookeeper when truth is most users won't need either. (and I sure do not need this)

N day wait to confirm bugs, "Whenever we get to it" fixes

1 zillion unneeded jars in classpath, hbase etc, that I'm not actually using with hive.

I'm tired of dealing with back-revved revisions and cloudera's "Why aren't you just using impala" type stance.

I am going back to rolling my own. I will still use cdh to manage hdfs proper and YAWN, but this hive situation is unmanageable. Hive on cloudera is like Python on Redhat 5. You are painted into an annoying box and you have no direct way to make it better other than ignoring it entirely and rolling your own!

Hello again! The last blog in this series was about cleanup compaction. While cleanup and compaction are interesting I do not think they have that web scale 'pop'. Definitely not sexy enough for Nibiru, the world's first Internet of Things NoSql. I decided to treat myself and do something fun, so I decided now would be a good time to build triggers/coprocessor support.

We might as well start by defining some terminology. Many databases have trigger support. Typically a trigger is a type of insert or update query that happens inside the RDBMS as a result of another insert or update operation. I first saw the term co-processor used in Google's BigTable white paper. Hbase, an open source implementation based on the BigTable spec, has different types of coprocessors.

HBase has a region server that serves the region (shard), and replication is provided by the file system. In the Cassandra/Dynamo style a row key has multiple natural endpoints, no replicated file system, and the system needs to actively execute the operation across N replicas as we showed here.

Triggers/CoProcessors were batted around with Cassandra for a while. The implementation can be debated, for example should the trigger run closer to the storage layer or closer to the coordinator level? Unlike Hbase where we can be sure one region server is "in charge" of a key, we would need a distributed locking mechanism to be "in charge" of a key in Cassandra and distributed locking is "heavy". Another potential implementation would be leveraging idempotent and retry-able operations like writes and deletes with timestamps. There are probably other ways to go about triggers as well.

Pick your poison

I decided to take the approach of coordinator triggers. In a previous blog we showed the coordinator is the piece that receives the request from the client and dispatches it to multiple servers. The good part of this implementation is that we can easily hook into the code before the result is returned to the client. The downside is that the trigger could time out after the initial user operation (and it can not be easily unrolled if we wanted to try that). Maybe in a later blog we can build triggers closer to the storage layer.

public enum TriggerLevel {
  /** Request will block while trigger is executing, trigger can timeout */
  BLOCKING,
  /** Request will not block while trigger is executing.
   * Trigger operations may be dropped if back pressure */
  NON_BLOCKING_VOLATILE,
  /** Request will not block while trigger is executing.
   * Trigger operations retry, potentially later */
  NON_BLOCKING_RETRYABLE
}
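To make the three levels a little more concrete, here is a rough sketch of how a coordinator might act on them. This is not the actual Nibiru code: TriggerRunner, its pool sizes, and the timeout argument are all made up for illustration, and the retryable level is left unimplemented. BLOCKING waits on a Future with a timeout, and NON_BLOCKING_VOLATILE hands work to a bounded pool that silently drops tasks when it is full.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Repeated from the post so the sketch compiles on its own.
enum TriggerLevel { BLOCKING, NON_BLOCKING_VOLATILE, NON_BLOCKING_RETRYABLE }

/** Hypothetical helper, invented for this example. */
class TriggerRunner {
    private final ExecutorService blockingPool = Executors.newSingleThreadExecutor();
    // Bounded queue plus DiscardPolicy: when the queue is full, new work is dropped.
    private final ThreadPoolExecutor volatilePool = new ThreadPoolExecutor(
            1, 1, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(16),
            new ThreadPoolExecutor.DiscardPolicy());

    /** Returns true if the trigger completed (BLOCKING) or was handed off (VOLATILE). */
    boolean run(TriggerLevel level, Runnable trigger, long timeoutMillis) {
        switch (level) {
            case BLOCKING: {
                Future<?> future = blockingPool.submit(trigger);
                try {
                    future.get(timeoutMillis, TimeUnit.MILLISECONDS);
                    return true;
                } catch (TimeoutException e) {
                    future.cancel(true); // the user's write already happened and is not unrolled
                    return false;
                } catch (ExecutionException e) {
                    return false; // the trigger itself threw
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            case NON_BLOCKING_VOLATILE:
                volatilePool.execute(trigger); // may be silently dropped under back pressure
                return true;
            default:
                throw new UnsupportedOperationException("retry queue is not sketched here");
        }
    }

    void shutdown() {
        blockingPool.shutdownNow();
        volatilePool.shutdownNow();
    }
}

class TriggerRunnerDemo {
    public static void main(String[] args) {
        TriggerRunner runner = new TriggerRunner();
        boolean done = runner.run(TriggerLevel.BLOCKING, () -> System.out.println("trigger ran"), 1000);
        System.out.println("completed in time: " + done);
        runner.shutdown();
    }
}
```

The timeout branch shows the downside mentioned earlier: by the time the trigger times out the original operation has already been applied.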

Next, the user needs an interface to plug the trigger logic into. We give the user access to the message, the response, and the server. In most cases we have avoided passing the Server to make interfaces very discrete, but here we are going for flexible.
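A minimal sketch of what such an interface could look like follows. The names CoordinatorTrigger, Message, Response, Server and Coordinator here are stand-ins invented for the example, not the real classes, which carry far more state. The point is only the shape: the trigger gets the message, the response, and the server, and runs before the response goes back to the client.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-ins for the real request, response and server classes. */
class Message {
    final String columnFamily;
    final String rowKey;

    Message(String columnFamily, String rowKey) {
        this.columnFamily = columnFamily;
        this.rowKey = rowKey;
    }
}

class Response {
    final boolean ok;

    Response(boolean ok) { this.ok = ok; }
}

class Server {
    final List<String> auditLog = new ArrayList<>();
}

/** Hypothetical trigger contract: sees the message, the response and the whole server. */
interface CoordinatorTrigger {
    Response exec(Message message, Response response, Server server);
}

/** Example trigger that records every request it sees and passes the response through. */
class AuditTrigger implements CoordinatorTrigger {
    @Override
    public Response exec(Message message, Response response, Server server) {
        server.auditLog.add(message.columnFamily + "/" + message.rowKey + " ok=" + response.ok);
        return response;
    }
}

class Coordinator {
    private final Server server;
    private final List<CoordinatorTrigger> triggers = new ArrayList<>();

    Coordinator(Server server) { this.server = server; }

    void register(CoordinatorTrigger trigger) { triggers.add(trigger); }

    Response handle(Message message) {
        Response response = new Response(true); // stands in for dispatching to the replicas
        for (CoordinatorTrigger trigger : triggers) {
            // The hook runs before the client sees the result.
            response = trigger.exec(message, response, server);
        }
        return response;
    }
}

class CoordinatorTriggerDemo {
    public static void main(String[] args) {
        Server server = new Server();
        Coordinator coordinator = new Coordinator(server);
        coordinator.register(new AuditTrigger());
        coordinator.handle(new Message("Pets", "rex"));
        System.out.println(server.auditLog);
    }
}
```

Passing the whole Server is the flexible-but-leaky choice: the trigger can reach anything, which is exactly why it was avoided elsewhere.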

Let's get testing

A typical use case for triggers is building a reverse index during an insert. For each insert to a column family named Pets we will check to see if the column name is "age". If the column name matches we make another insert into another column family that organizes the data by age.
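Here is a small sketch of that reverse index idea using plain in-memory maps instead of a real server. TinyStore, InsertTrigger and the PetsByAge column family name are assumptions made for the example. In the index, the age becomes the row key and the pet's original row key becomes the column name.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory stand-in for a server holding column families. */
class TinyStore {
    // columnFamily -> rowKey -> columnName -> value
    private final Map<String, Map<String, Map<String, String>>> data = new TreeMap<>();

    void put(String columnFamily, String rowKey, String column, String value) {
        data.computeIfAbsent(columnFamily, cf -> new TreeMap<>())
            .computeIfAbsent(rowKey, rk -> new TreeMap<>())
            .put(column, value);
    }

    String get(String columnFamily, String rowKey, String column) {
        Map<String, Map<String, String>> rows = data.get(columnFamily);
        if (rows == null) {
            return null;
        }
        Map<String, String> columns = rows.get(rowKey);
        return columns == null ? null : columns.get(column);
    }
}

/** Hypothetical hook called after a successful insert. */
interface InsertTrigger {
    void afterInsert(TinyStore store, String columnFamily, String rowKey, String column, String value);
}

/** Writes a reverse index row for every "age" column inserted into Pets. */
class PetAgeReverseIndex implements InsertTrigger {
    @Override
    public void afterInsert(TinyStore store, String columnFamily, String rowKey, String column, String value) {
        if ("Pets".equals(columnFamily) && "age".equals(column)) {
            // Row key is the age, column name is the pet, value is unused.
            store.put("PetsByAge", value, rowKey, "");
        }
    }
}

class ReverseIndexDemo {
    public static void main(String[] args) {
        TinyStore store = new TinyStore();
        InsertTrigger trigger = new PetAgeReverseIndex();

        store.put("Pets", "rex", "age", "3");
        trigger.afterInsert(store, "Pets", "rex", "age", "3");

        // A non-null value means rex is indexed under age 3.
        System.out.println("rex indexed under age 3: " + (store.get("PetsByAge", "3", "rex") != null));
    }
}
```

Because the index write is a plain insert, running the trigger twice for the same row leaves the same result, which matters if the trigger is ever retried.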

One of the best NoSql-isms is when someone tells you about some elaborate feature, next they tell you NOT to use it. EVER!

Here is an example:

The second workaround is to add ?search_type=dfs_query_then_fetch to your search requests. The dfs stands for Distributed Frequency Search, and it tells Elasticsearch to first retrieve the local IDF from each shard in order to calculate the global IDF across the whole index

Sounds great! Until you read the next advice:

Don’t use dfs_query_then_fetch in production. It really isn’t required. Just having enough data will ensure that your term frequencies are well distributed. There is no reason to add this extra DFS step to every query that you run.

In our last blog we showed how we can have nodes dynamically join our cluster to achieve a web scale, Internet Of Things, (Internet Of Things is now a buzzword I say once an hour) NoSql database. That blog had an incredible info graphic that demonstrated what happens when a node joins our cluster. Here it is again in techNoSQLcolor:

Now, remember our data files are write once so we can't change them after the fact. After the split, requests that get sent to node1 are cut in half, but the data files on node1 contain more data than they need to. What we need is a way to remove all the data that is on a node that is no longer needed.

------------------------------------------------

Cassandra has a command for this called 'cleanup' that needs to be run on each node. The theory is that, in the olden days, a node join could go bad in some way and the system could be "recovered" by manually adjusting the tokens on each node and doing various repair processes. In practice not many people (including myself) know exactly what to do when node joins go wrong: adjust tokens, move files, run repairs? The system SHOULD be able to automatically remove the old data, but no one has gotten to this yet as far as I can tell.

------------------------------------------------

To handle cleanup we need two things:

A command that can iterate the data files (SsTables) and remove data that no longer belongs on the node.

A variable that controls whether normal compaction processes clean up data automatically.

You may want to look back at our previous blog on compaction to get an idea of how we merge SsTables.

Let's get to it

We are going to enhance the compaction process to handle this special case. First, we have a boolean that controls cleanup. If the token does not belong on this node we do not write it during compaction.
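As a rough illustration of that check, here is a toy compaction over in-memory sorted maps. It is a sketch under simplifying assumptions, not the real compaction code: each "SsTable" is just a token to value map, newer tables overwrite older ones instead of comparing timestamps, and the names compact, cleanupOutOfRange and belongsOnThisNode are invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.function.Predicate;

class CleanupCompactionSketch {

    /**
     * Merges tables (oldest first) into one. When cleanupOutOfRange is true,
     * rows whose token no longer belongs on this node are not rewritten.
     */
    static SortedMap<String, String> compact(List<SortedMap<String, String>> tablesOldestFirst,
                                             boolean cleanupOutOfRange,
                                             Predicate<String> belongsOnThisNode) {
        SortedMap<String, String> merged = new TreeMap<>();
        for (SortedMap<String, String> table : tablesOldestFirst) {
            for (Map.Entry<String, String> row : table.entrySet()) {
                if (cleanupOutOfRange && !belongsOnThisNode.test(row.getKey())) {
                    continue; // token moved to another node: simply do not write it
                }
                merged.put(row.getKey(), row.getValue()); // newer tables overwrite older ones
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        SortedMap<String, String> older = new TreeMap<>();
        older.put("1", "a-old");
        older.put("7", "b");
        SortedMap<String, String> newer = new TreeMap<>();
        newer.put("1", "a-new");
        newer.put("9", "c");

        // Pretend this node only owns tokens that sort before "5".
        Predicate<String> ownsLowTokens = token -> token.compareTo("5") < 0;

        System.out.println("with cleanup:    " + compact(Arrays.asList(older, newer), true, ownsLowTokens));
        System.out.println("without cleanup: " + compact(Arrays.asList(older, newer), false, ownsLowTokens));
    }
}
```

Since the data files are write once, nothing is deleted in place. The unwanted rows just never make it into the new file.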

Testing time

For our coordinator, we made more of an integration test by launching a second server and joining it to the first. I typically like to be about 20% integration tests, 80% unit tests, and 0% mock tests. Why do I take this approach?

First, I believe mock tests are cheating. That is not to say that mocking does not have its uses, but I feel it is used to cover up code smells. If you have a good design and good API, not much mocking should be needed. Integration tests are good at proving the entire process can run end-to-end, but they are long and redundant.

Unit tests do two important things for me: they test things, working like a tripwire for bad code and bad assumptions, and they document things. They document things because they show what components should do. They tell a story.

The story for cleanup is simple: A system has some data on disk. After topology changes (like node join or leave or change of replication factor) some of that data is no longer required to be on a given node and can be removed.

I wrote this test by hiding code in methods with friendly names that say what they are doing. It is a little cute I know but why not? We insert 10 rows directly to the server to start things off. When the test is done only 1 of the 10 rows should still be on disk.

Rather than writing a long involved integration test to move data off the node, we implement a router that routes token "1" locally and routes everything else nowhere! This way when we clean up the data everything else should go. (No need for mocking libraries, just good old Object Oriented Design)
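The router trick can be sketched like this. The Router interface and every other name below are assumptions for the example rather than the real API. The hand-written OnlyTokenOneRouter plays the role a mock would otherwise play, and the helper methods have friendly names that say what they do.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical router contract: which nodes should hold a given token. */
interface Router {
    List<String> routesTo(String token);
}

/** Test double: token "1" lives on the local node, every other token lives nowhere. */
class OnlyTokenOneRouter implements Router {
    private final String localNode;

    OnlyTokenOneRouter(String localNode) { this.localNode = localNode; }

    @Override
    public List<String> routesTo(String token) {
        return "1".equals(token)
                ? Collections.singletonList(localNode)
                : Collections.<String>emptyList();
    }
}

class RouterCleanupDemo {

    /** Inserts 10 rows with tokens "0" through "9" directly, bypassing any routing. */
    static Map<String, String> insertTenRows() {
        Map<String, String> onDisk = new TreeMap<>();
        for (int i = 0; i < 10; i++) {
            onDisk.put(String.valueOf(i), "row-" + i);
        }
        return onDisk;
    }

    /** Keeps only the rows the router still assigns to the local node. */
    static Map<String, String> rowsThatSurviveCleanup(Map<String, String> onDisk, Router router, String localNode) {
        Map<String, String> kept = new TreeMap<>();
        for (Map.Entry<String, String> row : onDisk.entrySet()) {
            if (router.routesTo(row.getKey()).contains(localNode)) {
                kept.put(row.getKey(), row.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, String> onDisk = insertTenRows();
        Map<String, String> kept = rowsThatSurviveCleanup(onDisk, new OnlyTokenOneRouter("node1"), "node1");
        System.out.println(onDisk.size() + " rows before cleanup, " + kept.size() + " after");
    }
}
```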