I like some of the freedom PyTorch affords when it comes to putting together your models. With Keras, it seems like whoever works on the documentation cares a lot more. Also, the author / originator of Keras is active on Twitter daily and posts good stuff. As weird as it seems, for similar network architectures, I hit the GPU out-of-memory error more often on Keras than on PyTorch. Also, for complicated or deep networks, building the computation graph takes forever. The way I figure, if we're going to be functional / bleeding-edge data scientists, and we already know how to program, then it's in our best interest to just bite the bullet and learn both. I hope someone passionate goes in and submits a bunch of decent pull requests for PyTorch's docs, though.