Leveraging DeepMind’s breakthrough AI approaches takes some work, but the results are astounding. In this article, Toptal Freelance Deep Learning Engineer Neven Pičuljan guides us through the building blocks of reinforcement learning, training a neural network to play Flappy Bird using the PyTorch framework.

The Protein Data Bank (PDB) bioinformatics database is the world’s largest repository of experimentally-determined structures of proteins, nucleic acids, and complex assemblies. All data is gathered using experimental methods such as X-ray, spectroscopy, crystallography, NMR, etc. This article explains how to extract, filter, and clean data from the PDB to make it suitable for further analysis.

Employing Python to make machine learning predictions can be a daunting task, especially if your goal is to create a real-time solution. However, Tensorflow and Scikit-Learn can significantly speed up implementation.

In this article, Toptal Python Developer Guillaume Ferry outlines a simple architecture that should help you progress from a basic proof of concept to a minimal viable product without much hassle.

Machine learning and artificial intelligence are popular topics, vast domains with multiple paradigms to solve any given challenge.

In this article, Toptal Machine Learning Expert Adam Stelmaszczyk walks us through implementing deep Q-learning, a fundamental algorithm in the AI/ML world, with modern libraries such as TensorFlow, TensorBoard, Keras, and OpenAI Gym.

Launch yourself into Ethereum smart contract development with the Truffle framework! In this tutorial, Toptal Freelance Ethereum Developer Radek Ostrowski shows you how to create a practical ĐApp in the Solidity language, complete with its own ERC20 token.

For a large codebase, managing database schema can become tedious, especially if you maintain multiple testing environments or customers that update the product at different paces. Sometimes, documenting the latest schema or database changes isn’t enough.

In this article, Toptal Database Engineer Ivan Pavlov introduces us to concepts that help manage database states.

Open-source, IoT, and Ethereum smart contracts work together with a new utility coin to make transportation more accessible and reduce vehicle waste. In this article, Toptal Freelance Ethereum Developer Michał Mikolajczyk explains the motivations and methodology behind his startup’s latest initiative.

The AI revolution is transforming the consumer world. Some developers may shy away from AI, which has traditionally been a heavily technical specialty. In this article, Toptal Freelance Salesforce Developer Fahad Munawar Khan explores how accessible to developers the AI cloud has become, even for non-Salesforce apps, with Salesforce Einstein.

The goal of this article is to introduce you to the underlying principles of accessibility and help you flawlessly implement web accessibility guidelines and standards on your next project. Even minor improvements can help your content rank better, reach more people, and improve the overall user experience.

The Hungarian graph algorithm solves the linear assignment problem in polynomial time. By modeling resources (e.g., contractors and available contracts) as a graph, the Hungarian algorithm can be used to efficiently determine an optimum way of allocating resources.

This article will give you an introduction to the principles behind public-key cryptosystems and introduce you to the Santana Rocha-Villas Boas (SRVB) cryptosystem, developed by the author of the article and prof. Daniel Santana Rocha. The algorithm authors are making a campaign that includes a financial reward to anyone who manages to crack the code.

Sharing related information among isolated systems has become increasingly important. There are many methods to choose from to perform that task for SQL Server, but it’s important to know which is better for each use case.

WordPress is a very popular way to get a site up and running quickly. However, in their haste, plenty of developers end up making horrible decisions. Some mistakes, like leaving WP_DEBUG set to “true,” may be easy to make. Others, like lumping all your JavaScript into a single file, are as common as lazy engineers. Whichever mistake you manage to make, read on to find out the 12 most common WordPress mistakes that new and seasoned developers make.

Social networks are among the biggest sources of data today, and this means they are an extremely valuable asset for marketers, big data specialists, and even individual users like journalists and other professionals. Harnessing the potential of real-time Twitter data is also useful in many time-sensitive business processes.

In this article, Toptal Freelance Software Engineer Hanee’ Medhat explains how you can build a simple Python application to leverage the power of Apache Spark, and then use it to read and process tweets to identify trending hashtags.

Security has always been a primary concern for database experts, and with the advent of new, decentralized services, it’s become even more crucial. Microsoft addressed the need for an added level of security in SQL with the introduction of Always Encrypted functionality in SQL Server 2016.

Consistent Hashing is a distributed hashing scheme that operates independently of the number of servers or objects in a distributed hash table. It powers many high-traffic dynamic websites and web applications.

In this tutorial, Toptal Freelance Software Engineer Juan Pablo Carzolio will walk us through what it is and how hashing, distributed hashing and consistent hashing work.

Limited SQL scalability has prompted the industry to develop and deploy a number of NoSQL database management systems, with a focus on performance, reliability, and consistency. The trend was driven by proprietary NoSQL databases developed by Google and Amazon. Eventually, open-source systems like MongoDB, Cassandra, and Hypertable brought NoSQL within reach of everyone.

In this post, Senior Software Engineer Mohamad Altarade dives into some of them and explains why NoSQL will probably be with us for years to come.

With the rise of big data and data science, storage and retrieval have become a critical pipeline component for data use and analysis. Recently, new data storage technologies have emerged. But the question is: Which one should you choose? Which one is best suited for data engineering?

In this article, Toptal Data Scientist Ken Hu compares three prominent storage technologies within the context of data engineering.

The Hadoop Distributed File System (HDFS) is a scalable, open source solution for storing and processing large volumes of data. With its built-in replication and resilience to disk failures, HDFS is an ideal system for storing and processing data for analytics.

In this step-by-step tutorial, Toptal Database Developer Dallas H. Snider details how to migrate existing data from a PostgreSQL database into the more efficient HDFS.

Genome data is one of the most widely analyzed datasets in the realm of Bioinformatics. The SciPy stack offers a suite of popular Python packages designed for numerical computing, data transformation, analysis and visualization, which is ideal for many bioinformatic analysis needs.

In this tutorial, Toptal Software Engineer Zhuyi Xue walks us through some of the capabilities of the SciPy stack. He also answers some interesting questions about the human genome, including: How much of the genome is incomplete? How long is a typical gene?

As a language, R is strongly tied to data and is thus used mostly by statisticians and data scientists. Many who already use R for machine learning, though, are not aware that data munging can be done faster in R, meaning another tool is not required for that task.

In this article, Freelance Software Engineer Jan Gorecki explores tabular data transformations and introduces us to one of the fastest open-source data wrangling tools available.

Ever tried to create a JSON data structure that includes entities with bidirectional relationships? If you have, you know that this often results in errors or exceptions being thrown.

In this article, Toptal Freelance Software Engineer Nirmel Murtic provides a robust working approach to avoiding these errors when creating JSON structures that included entities with bidirectional (i.e. circular) relationships.

More than 60 percent of trading activities with different assets rely on automated trading and machine learning instead of human traders. Today, specialized programs based on particular algorithms and learned patterns automatically buy and sell assets in various markets, with a goal to achieve a positive return in the long run.

In this article, Toptal Freelance Data Scientist Andrea Nalon explains how to predict, using machine learning and Python, which trade should be made next on the S&P 500 to get a positive gain.

Clustering algorithms are very important to unsupervised learning and are key elements of machine learning in general. These algorithms give meaning to data that are not labelled and help find structure in chaos. But not all clustering algorithms are created equal; each has its own pros and cons.

In this article, Toptal Freelance Software Engineer Lovro Iliassich explores a heap of clustering algorithms, from the well known K-Means algorithm to the elegant, state-of-the-art Affinity Propagation technique.

When dealing with data analysis, most companies rely on MS Excel or Google Sheets, but dealing with data presented this way isn’t very eye-catching or intuitive. It’s once you add visualizations to this data that things become a little easier to manage. That’s the topic of today’s tutorial by our guest author from Adobe, Rohit Boggarapu. Join us as he guides us though the process of making interactive drill-down charts using jQuery and FusionCharts.

Today, a massive amount of data is available in the form of networks or graphs. For example, the World Wide Web, with its web pages and hyperlinks, social networks, semantic networks, biological networks, citation networks for scientific literature, and so on.

A tree is a special type of graph, and is naturally suited to represent many types of data. The analysis of trees is an important field in computer and data science. In this article, we will look at the analysis of the link structure in trees. In particular, we will focus on tree kernels, a method for comparing tree graphs to each other, allowing us to get quantifiable measurements of their similarities or differences. This an important process for many modern applications such as classification and data analysis.

Hackathons often inspire engineers to create amazing software. By blending various technologies together, really useful and often fun projects can be realized in a short period of time.

In this article, Toptal engineer Radek Ostrowski shares his experience participating in the IBM Sparkathon, and walks us through how he elegantly combined the power of Apache Spark and Docker in IBM Bluemix to build a weather app.

Machine Learning, in computing, is where art meets science. Perfecting a machine learning tool is a lot about understanding data and choosing the right algorithm. But why choose one algorithm when you can choose many and make them all work to achieve one thing: improved results.

In this article, Toptal Engineer Necati Demir walks us through some elegant techniques of ensemble methods where a combination of data splits and multiple algorithms is used to produce machine learning results with higher accuracy.

In this article, Toptal engineer Ivan Voras provides a useful overview and comparison of multi-processing network server models, with the goal being to take some of the mystery out of writing high performance networking code. The article is intended for “system programmers”, i.e., back-end developers who will work with the low-level details of their applications, implementing network server code.

Developers often work on only one machine, and have their whole development environment on that machine. Testing database replication before deploying changes in this kind of a development environment can be a challenging task.

In this article, Toptal engineer Ivan Bojovic guides us through a step-by-step tutorial on how to implement MySQL master-slave replication on one machine.

Developers sometimes run into tasks that require access to email mailboxes. In most cases, this is done using the Internet Message Access Protocol, or IMAP. As a PHP developer, I first turned to PHP’s built in IMAP library, but this library is buggy and impossible to debug or modify. So today we will create a working IMAP email client from the ground up using PHP. We will also see how to use Gmail’s special commands.

Image processing algorithms are often very resource intensive due to fact that they process pixels on an image one at a time and often requires multiple passes. Successive Mean Quantization Transform (SMQT) is one such resource intensive algorithm that can process images taken in low-light conditions and reveal details from dark regions of the image.

In this article, Toptal engineer Daniel Angel Munoz Trejo gives us some insight into how the SMQT algorithm works and walks us through a clever optimization technique to make the algorithm a viable option for handheld devices.

When Mike Bostock created D3.js, he introduced a tried and true reusable charts pattern for implementing the same chart in any number of selections. However, the limitations of this pattern are realized once the chart is initialized. In this article, Toptal engineer Rob Moore presents a revised reusable charts pattern that leverages the full power of D3.js.

Bitcoin created a lot of buzz on the Internet. It was ridiculed, it was attacked, and eventually it was accepted and became a part of our lives. However, Bitcoin is not alone. At this moment, there are over 700 AltCoin implementations, which use similar principles of CryptoCurrency.

Google’s new cloud code platform does not appear to be taking on GitHub head on. Instead, Cloud Source Repositories (CSR) will allow users to connect to repositories hosted on GitHub or Bitbucket. However, everything is automatically synced to the Google Cloud Source Repository.

The good news is that a Google CSR can be connected to another Git repository hosted on GitHub or Bitbucket. All changes will be synchronised on both platforms, as you can set Google CSR to automatically mirror from GitHub and Bitbucket.

In-memory data collection manipulation is something that we often need to do in data-centric reporting and visualization applications. When needed, we often tend to resort to complex loops, list comprehensions, and other suboptimal means, which can easily end up being a huge mess of hard-to-maintain spaghetti code. Supergroup.js is an in-memory data manipulation library that can be used to solve some common data manipulation challenges on limited datasets.

To retain its users, any application or website must run fast. For mission critical environments, a couple of milliseconds delay in getting information might create big problems. As database sizes grow day by day, we need to fetch data as fast as possible, and write the data back into the database as fast as possible. To make sure all operations are executing smoothly, we have to tune Microsoft SQL Server for performance.

Microsoft Bond is a modern data serialization framework. It provides powerful DSL and flexible protocols, code generators for C++ and C#, efficient protocol implementations for Windows, Linux, and Mac OS X. This article is a quick guide of the features and use of this framework.

Analysts have come to recognize social network data as a virtual treasure trove of information for sensing public opinion trends and groundswells of support. In this article, Toptal Engineer Elder Santos describes the techniques he employed for a proof-of-concept that effectively analyzed Twitter Trend Topics to predict, as a sample test case, regional voting patterns in the 2014 Brazilian presidential election.

Machine learning has changed the way we deal with data. Data driven problems, that are difficult to solve using standard methods, can often be tackled with much more ease using machine learning algorithms. In this article, we will explore Azure Machine Learning features and capabilities through solving one of the problems that we face in our everyday lives.

In today’s data driven world, researches are busy answering interesting questions by churning through huge volumes of data. Some obvious challenges they face are due the sheer size of dataset that they have to deal with. In this article, we take a peek at a simple business intelligence platform implemented on top of the MongoDB Aggregation Pipeline.

Although the most wide-spread and supported way of running Django is on a Linux system (e.g., with uwsgi and nginx), it actually doesn’t take much work to get it to run on IIS. In this article, Toptal Engineer Ivan Voras walks you through a step-by-step tutorial, clearly explaining how to install Django on IIS.

When coming across the term “text search”, one usually thinks of a large body of text, which is indexed in a way that makes it possible to quickly look up one or more search terms when they are entered by a user. This is a classic problem in computer science, to which many solutions exist.

But how about a reverse scenario? What if what’s available for indexing beforehand is a group of search phrases, and only at runtime is a large body of text presented for searching?

Bitcoin blockchain is the backbone of the network and provides a tamper-proof data structure, providing a shared public ledger open to all. This article provides insight in blockchain technology, current status and its potential.

Since almost all smartphones today are equipped with location sensors, motion sensors, bluetooth, and wifi, today’s mobile apps can use context awareness to dramatically increase their capabilities and value. This article walks you through building a context aware app that employs complex event processing.

There are thousands of different Android powered devices, with different screen sizes, chip architectures, hardware configurations, and software versions. Unfortunately, segmentation is the price to pay for openness, and there are thousands ways your app can fail on different devices.

Regardless of such huge segmentation, the majority of bugs are actually introduced because of logic errors. These bugs are easily prevented, as long as we get the basics right!

Here’s a quick rundown of the 10 most common mistakes Android developers make.

How do we understand and interpret the huge amounts of data coming out of simulations? How do we visualize potential gigabytes of datapoints in a large dataset? In this article I will give a quick introduction to VTK and its pipeline architecture, and go on to discuss a real-life visualization example.

React.js is a fantastic library. It is only one part of a front-end application stack, however. It doesn’t have much to offer when it comes to managing data and state. Facebook, the makers of React, have offered some guidance there in the form of Flux. I’ll introduce basic Flux control flow, discuss what’s missing for Stores, and how to use Backbone Models and Collections to fill the gap in a “Flux-compliant” way.

Data conversion, translation, and mapping is by no means rocket science, but it is by all means tedious. This article introduces MetaDapper, a .NET library that strives to simplify, streamline, and automate the data conversion process to the greatest extent possible.

Once you step beyond the comfortable confines of English-only character sets, you quickly find yourself entangled in the wonderfully wacky world of UTF-8.

Indeed, navigating through UTF-8 related issues can be a frustrating and hair-pulling experience. This post provides a concise cookbook for addressing these issues when working with PHP and MySQL in particular, based on practical experience and lessons learned.

A potentially critical problem, nicknamed “Heartbleed”, has surfaced in the widely-used OpenSSL cryptographic library. The vulnerability is particularly dangerous in that potentially critical data can be leaked and the attack leaves no trace.

As a user, chances are that sites you frequent regularly are affected and your data may have been compromised. As a developer or sys admin, sites or servers you’re responsible for are likely to have been affected.

Here are the key facts you need to know about this dangerous bug and how to mitigate your vulnerability.

The recent resurgence in Artificial Intelligence has been powered in no small part by a new trend in machine learning, known as “Deep Learning”. In this article, I’ll introduce you to the key concepts and algorithms behind Deep Learning, beginning with the simplest building block.

The need to adapt legacy code and systems to meet modern day performance and processing demands is widespread. This post provides a case study of the use of Erlang and CloudI to adapt legacy code, consisting of a decades-old collection of multi-user game software written in C, to the 21st century.

Testing. It always seems to get left to the last minute, then cut because you’re out of time, budget, or whatever else. Management wonders why developers can’t just “get it right the first time”, and developers (especially on large systems) can be taken off-guard when different stakeholders describe different parts of the system.

With behavior-driven development, you can turn testing into a shared process that focuses on the behaviors of the system, why they matter, and who cares.

As a veteran telecommuter through multiple jobs in my career, I have witnessed and experienced the many joys of being a remote worker. As for the horror stories, I have more than a few I could tell. With a bit of artistic inclination and a talent for mathematics, I also have a fascination with patterns: design patterns, architectural patterns, behavioral patterns, social patterns, weather patterns—all sorts of patterns!

When I first encountered anti-patterns, I discovered a trove of wisdom I wish I had known before I had learned the hard way. Anti-patterns are recognizable repeated patterns that contribute significantly to failure. For example, the manager that keeps interrupting the employee in order to see if the employee is getting any work done is engaging in an anti-pattern that serves to prevent the employee from getting any work done!

Based on my own experiences and experiences of friends and co-workers, I am assembling descriptions of anti-patterns related to telecommuting.

In 2007, Bennett Haselton revealed a minor hack with major implications: querying ranges of numbers on Google would return pages of sensitive information, including Credit Card numbers, Social Security numbers, and more. While Haselton’s hack was addressed and patched, I was able to tweak his original technique to bypass Google’s filter and return the same old dangerous results.

Database tuning can be an incredibly difficult task, particularly when working with large-scale data where even the most minor change can have a dramatic (positive or negative) impact on performance.

In mid-sized and large companies, most database tuning will be handled by a Database Administrator (DBA). But there are plenty of developers who have to perform DBA-like tasks; meanwhile, DBAs often struggle to work well with developers.

In this article, Toptal Freelance Software Engineer Rodrigo Koch provides developers with database tuning tips and explains how developers and DBAs can work together effectively.

From the very first days in our lives as programmers, we’ve all dealt with data structures: Arrays, linked lists, trees, sets, stacks and queues are our everyday companions, and the experienced programmer knows when and why to use them.

In this article we’ll see how an oft-neglected data structure, the trie, really shines in application domains with specific features, like word games.

Web Developers often fail to consider the consequences of thousands of users accessing our applications at the same time. Perhaps it’s because we love to rapidly prototype; perhaps it’s because testing such scenarios is simply hard.

Regardless, I’m going to argue that ignoring scalability is not as bad as it sounds—if you use the proper set of tools and follow good development practices. In this case: the Play! framework and the Scala language.

But this isn’t just another article about cohort analysis. If you already know the importance of the topic and want to skip the introduction, you can jump to the simulator, where you can either simulate startup growth based on retention, churn, and a number of other factors, or analyze your own PayPal logs with the code I’ve open sourced.

If, however, you don’t realize that these are some of the most important metrics around–continue reading.

Porn is a big industry. There aren’t many sites on the Internet that can rival the traffic of its biggest players.

And juggling this immense traffic is tough. To make things even harder, much of the content served from porn sites is made up of low latency live streams rather than simple static video content. But for all of the challenges involved, rarely have I read about the developers who take them on. So I decided to write about my own experience on the job.