Tag: database

Socket.IO 1.0: Socket.IO has hit version 1.0 – the Node.js and browser library which started life as an implementation of the WebSockets interface and has gone on to “become the EventEmitter of the web”. The 1.0 release and changes are broken down in a blog posting, the first on a newly redesigned, and much more useful, Socket.IO website. In brief, modularisation, tighter code, binary support (so you can emit blobs and buffers), automated testing, better scalability using redis, more integration (including PHP support), better debugging support (and silence by default), sleeker APIs and CDN delivery. And the future plans include handling Node.js streams, Socket.IO support in Web Inspector and Firefox Dev Tools and more language and framework support. A splendid tool to have in your arsenal.

Git 2.0: The distributed version control system which made distributed version control systems cool before even version control could be cool, Git, has reached version 2.0. In the announcement of the new release, there’s a long list of all the changes and notes on backward compatibility. The 2.0 release has been anticipated by the developers for a while so a lot of ground work had already been done in previous 1.x versions making the 2.0 release look more like a minor release than a major version bump but there’s still plenty of changes and a foundation prepared for future changes. On that subject, there’s a promise of a shorter release cycle for the next release as delays have meant a number of features ‘cooking’ for longer in the ‘next’ branch.

OrientDB 1.7: Version 1.7 of the Document/Graph/Sql/NoSQL database OrientDB is available. The announcement for 1.7 notes better perforamnce, new clustering options, support for SSL and sharding, simplified configuration, new SQL commands including parallel queries, plugins for Lucene-based full text searching and more. There’s an Apache 2 licensed community edition of the database and commercially sold and supported professional and enterprise editions.

ElasticSearch 1.0 springs out: The search-oriented NoSQL database, built upon Lucene, ElasticSearch has hit version 1.0. It’s a big release with a lot of changes and a lot of new features – an API for selective snapshot/restore, federated search, aggregation, distributed percolation and software “circuit breakers” to stop some more dangerous actions from overwhelming the system. An interesting post from Found.no on ElasticSearch sums up the pros and cons (like no authentication or authorisation) places ElasticSearch in the domain of “secondary store” to be used alongside a primary database.

TokuMX 1.4: Tokutek’s “MongoDB-with-Toku-engine”, TokuMX, has hit version 1.4 and is addressing the performance of sharding and replication. Toku’s engine is reputed to be very good for particular use cases and it’s interesting to see alternative storage engines under the MongoDB infrastructure.

Python 3.4 on final approach: Out of beta and hitting release candidate a few days ago, Python 3.4 is now imminent. It’s expected to land just over a month from now on March 16. Check back to our coverage of the last beta for more details of what’s coming.

Python 3.4’s beta days: The first beta of Python 3.4 has arrived and it has got the good stuff. Pathlib lets coders work with pure paths or filesystem dependent paths with the selection of the latter taken care of for them. There’s a standardised enum module along with new statistics, asyncio and tracemalloc modules. Throw in a new pickling protocol, new string and binary hashing algorithms, a C API for custom memory allocators and standardise on pip as a packaging format and you are talking a tasty new Python due to land at the end of February 2014.

Neo4J 2.0 goes RC: The Neo4J graph database is heading into the home straight with a 2.0.0 release candidate and a warning that if you’ve been tracking their version 2.0 milestones you will need to perform a manual update on your database before using 2.0.0RC1. Now tagged as feature complete, the new RC will be bringing matching with properties, optional matches, relationship merges and more simplified syntax to Neo4J’s Cypher query language. That’s in addition to the Neo4J browser and other changes made over the five other milestones (5, 4,3, 2, 1).

Redis 2.8.0: Salvatore Sanfilippo has announced that, after almost a year of development, Redis 2.8.0 is done. If you don’t know it, Redis is a key/value store which can also handle hashes, lists and sets. The new version include a partial resync for slaves option, iterable collections, a rewritten config system, IPv6 suppport, pub/sub keyspace notifications and better consistency support and key expiration. Actually 2.8.1 is out for download too – see the release notes for more on the BSD licensed key/value store.

Facebook Rocks: Another database open sourced by Facebook? Yup, and demonstrating that the term “database” covers a lot of ground, Facebook’s latest is RocksDB, an embedded key-value store for those userfacing situations where you need a lot of woosh, little latency. Lead developer, Dhurba Borthakur, explains in a blog posting that RocksDB is based on Google’s LevelDB and is tuned to run on many-core servers which making efficient use of storage to cut down on write wear. It’s implemented as a C++ library with arbitrary byte streams for keys and values and all the major components are pluggable and replaceable. It’s published under a BSD licence and comes with an additional patent licence.

OSI gets a GM: The Open Source Initiative has long been a purely volunteer organisation and that has limited what it has been able to do. But that’s changing with the appointment of the first employee, Patrick Masson, who’s taken on the post of General Manager at the OSI. Masson has introduced himself to the membership and is setting out on his tasks of running working groups, expanding membership and updating the OSI’s communications. It’ll be interesting to see what a difference it makes.

Cosmic Sans Neue: Who doesn’t like programmer fonts with their mono-spaced elegance? But maybe you want something a tiny bit quirkier. Check out Cosmic Sans Neue Mono, which has a tiny bit of quirkyness, not only in it’s name but in some of the character shapes. You can also find it on GitHub and it’s available under the SIL Open Font Licence

Facebook, in their now traditional goal of taking on big data problems, solving them and then open sourcing the result, have open-sourced Presto, a distributed SQL query engine “optimized for ad-hoc analysis at interactive speed”. This type of app is designed for the folks who need to work out what people who like chips and cheese and rock but dont like bagels or opera also have, statistically, in common. Its a simple enough question, but when you get up to Facebook scale, its a hard question to answer. This is the land of Hadoop and Hadoop has its own SQL-like query engine, Hive.

But unlike Hive which converts queries into MapReduce tasks saving intermediate results to disk, Presto has a query and execution engine which runs in memory and is pipelined through the network. Presto is implemented in Java for easy integration with other parts of Facebook that are also built in Java and compiles parts of queries down to bytecode, letting the JVM JIT compile to machine code to get the best out of the Java environment. Although it doesn’t need Hive, Presto does need a datasource for its queries and it includes a plugin for Hive, though it only uses the Hive metastore service, presumably to obtain structural information, and then accesses the data over HDFS.

The Facebook announcement says “Presto is 10x better than Hive/MapReduce in terms of CPU efficiency and latency for most queries at Facebook” and has been in use internally since Spring of this year with multiple deployments and one cluster scaled to a thousand nodes. A thousand users actively use it with 30000 queries and processing a petabyte a day. Thats a good work out for any big data offering.

There’s plenty missing from Presto; various joins and aggregations are restricted and there’s no way to write results back into tables – they go straight to the client. Those issues, plus improved performance, query accelerators, hot cached data subsets and a high performance HBase connector are all on the roadmap for Presto.

Presto is licensed under the Apache License 2.0 but does not appear to be heading to the foundation with active development taking place around Facebook’s GitHub repository.

Slackware updated: The venerable Slackware Linux has had its annual update for 2013 announced by Patrick Volkerding and a fine update it appears to be. A 3.10.17 Linux kernel, X11R7.7 X Windows, 64-bit UEFI installation support and updates across the board for dev tools, applications, desktops (Xfce 4.10.1 and KDE 4.10.5) and more. And Slackware ARM 14.1 is also available.

MariaDB 10.0 goes Beta: As MariaDB, the community-supported and developed MySQL fork, branches away from MySQL with version 10.0, the first 10.0 Beta has been released with enhanced replication, more storage engines supported, engine independent query statistics, regexps with PCRE, admin improvements with roles and more. Google sponsored one enhancement (parallel replication) and blogged about the release noting it is already deploying 10.0 into non-production MySQL instances to aid the MariaDB debugging and development process. In beta, the focus should be on stabilising the 10.x feature set, so if you are considering MariaDB 10.x for future use, now is a good time to check it out.

Glassfish goes open only: Oracle have pulled commercial support from the Glassfish server for future releases and are pointing users over at their commercial WebLogic Server. They are carrying on development of the server as the reference implementation of future Java EE platforms, but the fear is the quality of the RI will suffer with no commercial imperative to keep quality and performance high. Oracle may well have backed the wrong Java EE web server from a community point of view – I know no one who goes “Hey, lets do that on Weblogic” – but now the competitive field is wide open. The X-EE Factor auditions for series… One other takeaway comes from Tomitribe – Open source isn’t free and if we want it to be industrially healthy, then the industry needs to make sure some money ends up in the open source communities.

Android Crypto Misuse: Develop for Android (or Java in general)? Write code that uses cryptography? Then read this paper – An Empirical Study of Cryptographic Misuse in Android Applications(pdf). From the abstract, “We develop program analysis techniques to automatically check programs on the Google Play marketplace, and find that 10,327 out of 11,748 applications that use cryptographic APIs – 88% overall – make at least one mistake”. Scary eh. Very worth a read though.

Catching up on Codescaling with some of the less mentioned things worth noting…

FreeBSD 10.0’s latest beta: It’s into the home/RC straight for FreeBSD 10 with the release of the third and hopefully last beta of the development cycle. The original schedule would have seen RC2 available around now, but with a focus on a quality release, there’s been a bit of slippage. Check out this FreeBSD News item from September for a feel of what’s going in. I’m looking forward to the switch to LLVM/Clang and seeing how the tickless kernel works out.

SQL injection attacks by Google?: Sucuri have come across an odd thing, Google doing SQL Injection attacks. Basically, Google’s bots crawl a site with links which would carry out an SQLi attack if followed… and then follow them like the bots they are which carries out the attack. Google may want to add at least some filtering to their bots in future, but its something to remind any application that ingests URLs from the web to follow them that URLs are not necessarily passive.

Rust reworks stack plan: For those interested in the implementation of languages, the Rust developers have decided to drop segmented stacks. Segmented stacks were stacks that were allocated small and expanded as needed. This would have allowed threads to have a much smaller footprint, but it didn’t quite work out that way. Followups on the thread discuss the cost of memory, both having it and accessing it, and alternative strategies.

InfluxDB: Databases for time series data are in and the latest open source addition to the game is InfluxDB which prides itself in no external dependencies. The Go-based MIT-licensed code has a JSONic HTTP API, an SQLish query language and a playground server to get running with. Its early days for InfluxDB, but its off to a good start.

Mozilla’s Circus Renewed: Mozilla’s Services project has announced a new version of its process/socket manager called Circus. Built using Python and ZeroMQ and recently redeveloped to be Python 3 compatible and fully asynchronous, the software lets an administrator manage processes and sockets on servers through a command line, Python API or web console. You can find the code on mozilla-services github.