Extract the Value of Cryptocurrency from Sentiment Analysis

Deep learning provides a way to analyze sentiments about cryptocurrencies by scanning and evaluating comments across the web, including news headlines, Twitter* posts, and Reddit* posts.

"I have learned that there is correlation between sentiment and cryptocurrency prices. This is something that may be helpful to other developers: to explore new industries and see how current AI technologies can be applied to create a solution or insight within a new area."

Challenge

Determining the valuation of cryptocurrencies is difficult because the worth does not tightly correspond to factors such as cash flow or available assets, as it does in the conventional stock market. With dozens of currencies available in the market, a system is needed to methodically evaluate their worth.

Solution

Emerging neural network models, including recursive neural tensor networks (RNTNs), provide a promising mechanism for determining the feelings about any given currency by scanning and parsing comments on social media. Favorable sentiments about a currency can be shown to correspond with an uptick in the currency’s value across digital coin exchanges.
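One simple way to check whether sentiment and price move together is a Pearson correlation coefficient. The sketch below is illustrative only and is not taken from the project: the function name `pearson` and the two input series are assumptions, and the numbers are made-up placeholders rather than market data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

# Placeholder inputs (hypothetical, NOT real data): one sentiment score and
# one percentage price change per day.
daily_sentiment = [0.1, 0.4, -0.2, 0.6, 0.3]
daily_return_pct = [0.5, 1.2, -0.8, 2.0, 0.9]

r = pearson(daily_sentiment, daily_return_pct)
print(f"correlation: {r:.2f}")
```

A value near +1 would suggest that sentiment and price changes rise and fall together; a correlation alone does not show that one causes the other.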

Background and Project History

Teju Tadi, an Intel AI Ambassador with a strong interest in blockchain and cryptocurrencies, launched a project in May 2017, applying deep learning techniques to investigate the correlation between trader sentiments for cryptocurrencies and their market value. Sentiment analysis as a way of helping gauge trading trends has already gained adherents in the traditional stock market. Because the valuation data available for cryptocurrencies is more nebulous, Teju is refining techniques to combine trader sentiments with other factors to create better ways to anticipate trends.

"Many firms in the equities space have been employing techniques," he said, "which make key investment decisions based on social media data and news headlines. There are algorithms that make decisions to instantaneously buy a certain equity as soon as positive news is released—faster than any one person could. The same methodology, I thought, could be applied to the cryptocurrency space, because a lot of these currencies are sentiment driven. Social media and news inevitably affect the prices of various currencies greatly."

"I first got interested in blockchain technology and cryptocurrencies back when I was high school," Teju said. "At that time there were fewer than 100 cryptocurrencies and the industry was still in its infancy. The market evolved once the Ethereum blockchain protocol launched. Ethereum and other platform tokens enabled developers to deploy software for blockchain protocols much faster. As of today, there are over 1600 cryptocurrencies and blockchain protocols and many projects still launching."

Teju’s interest in machine learning and deep learning grew as he began working on projects outside of school. A project using machine-learning strategies, ProvidR, won first place in the Google* Community Leaders Program (CLP) Case Competition. ProvidR offered a scalable solution to food insecurity, aggregating requests by people who visited a particular food bank. He then began a project working with Intel, Face It, which employed deep learning techniques to map an individual’s facial structure and then recommend a hairstyle suited to their appearance. This early work led to his current endeavor, Deep Learning for Cryptocurrency Trading, which capitalizes on the expertise he has gained in finance, blockchain architecture, cryptocurrency, and AI.

"Working with Intel’s AI Student Ambassador program, Teju said, "has opened so many opportunities for me. It’s allowed me to connect with peers working in AI space across the world. It gives me access to Intel engineers who are more than willing to help me on my projects. Intel also provides me with access to hardware which helps me test and deploy any applications that I am working on. Most of all it gives me a great environment, which fosters and enables me to pursue my AI-related interests and projects."

The Evolution of Cryptocurrencies and Blockchain

Cryptocurrencies serve as a medium for exchanging digital assets; transactions within this medium are recorded using blockchain techniques, which operate as an encrypted, electronic ledger providing a permanent history of all activities. Because cryptocurrencies are issued based on a finite supply, investors anticipate that rising demand will generate increasing value in the long term. The tamper-resistant nature of the blockchain's historical entries, logically linked together in a continuous chain, provides a mechanism to securely authorize and log transfers of cryptocurrencies from one party to another.

Even as the potential of blockchain as a secure digital alternative to traditional financial processes is being realized, the underlying blockchain architecture—enabling distributed database capabilities—presents opportunities in other applications. Enterprises are exploring ways in which blockchain could be used as a part of Internet of Things (IoT) solutions to manage and gain insights into supply chain activities and global operations. Blockchain architecture could also be used in healthcare information systems to distribute information from medical devices and consolidate and distribute an individual's healthcare records.

"Because blockchain is based on a distributed, peer-to-peer topology where data can be stored globally on thousands of servers—and anyone on the network can see everyone else’s entries in real-time—it’s virtually impossible for one entity to gain control of or game the network."1

—Lucas Mearian, senior reporter, Computerworld

Commercial Opportunities

In an article for the Intel® Developer Zone (Intel® DZ), Teju stated, "The long-term vision of this project is to be able to develop an AI cryptocurrency trading bot that can not only consider trader sentiment to make trading decisions but also take advantage of other opportunities such as arbitrage, which is the purchase and sale of an asset to profit from a difference in the price."

Teju took the insights gained from his research as he helped establish a business, Mycointrac*, focused on providing cryptocurrency market intelligence. "Once the product is fully developed," he said, "I plan to utilize the data provided by it as one of the factors to make key investment decisions for my new cryptocurrency hedge fund, Sentience Investments L.P., which has been operational since January first. The plan is to develop trading strategies based on a number of high-frequency, machine-learning techniques, as well as deep learning and sentiment analysis."

Each individual exchange, Teju explained, has its own supply and demand and its own set of buyers and sellers. The market as it stands is very inefficient. A cryptocurrency in a certain exchange might be trading at USD 3 and then, in another market, it will be trading at USD 5, for the same currency. Traders use these differential valuations to advantage by executing arbitrage—buying on the exchange where it is trading for USD 3 and then selling it at the exchange where it is trading for USD 5 for a riskless profit of USD 2.
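The arithmetic in the example above can be sketched in a few lines. This is a minimal illustration, not Mycointrac code: the exchange names, the `best_arbitrage` and `arbitrage_profit` helpers, and the optional `fee_rate` parameter are all assumptions introduced here.

```python
def best_arbitrage(quotes):
    """Return (buy_exchange, sell_exchange, spread) for one currency.

    quotes maps an exchange name to the price of the same currency there.
    """
    if len(quotes) < 2:
        raise ValueError("need quotes from at least two exchanges")
    buy_exchange = min(quotes, key=quotes.get)    # cheapest place to buy
    sell_exchange = max(quotes, key=quotes.get)   # most expensive place to sell
    return buy_exchange, sell_exchange, quotes[sell_exchange] - quotes[buy_exchange]

def arbitrage_profit(buy_price, sell_price, quantity=1.0, fee_rate=0.0):
    """Profit from buying at buy_price and selling at sell_price.

    fee_rate is a hypothetical proportional fee charged on each leg.
    """
    cost = buy_price * quantity * (1 + fee_rate)
    proceeds = sell_price * quantity * (1 - fee_rate)
    return proceeds - cost

# Hypothetical exchange names; prices follow the USD 3 / USD 5 example.
quotes = {"exchange_a": 3.0, "exchange_b": 5.0}
buy_on, sell_on, spread = best_arbitrage(quotes)
profit = arbitrage_profit(quotes[buy_on], quotes[sell_on])
print(buy_on, sell_on, spread, profit)
```

With no fees the profit per unit equals the USD 2 spread. Setting `fee_rate` above zero shows how trading costs shrink that figure.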

"In Mycointrac, if you click on one of the coins on the market and scroll," said Teju, "you'll see the gap price versus the index. That, basically, is arbitrage. That is showing you how much it is trading up or down at a certain exchange. Sometimes it is like a few percent. Sometimes it could be 50 percent or 100 percent. You will easily see 5 to 10 percent differences on most coins." Figure 2 shows live sentiment tracking displayed in Mycointrac. Working with a team, Teju is taking insights from this project to launch a new site, Mycoinrisk*, focusing on fraud prevention and risk mitigation in the cryptocurrency space.

Figure 2. Live sentiment mapping on Mycointrac gauges cryptocurrency chatter on the web.

"As machine learning and artificial intelligence (AI), applications continue to increase and impact accounting and finance responsibilities, the human professionals have an opportunity as well. Not only will they be more productive and proficient, but they will be able to handle more clients and deliver more value because they can determine actionable insight rather than just crunch numbers. Machines will be able to propel innovation in the industry."2

—Bernard Marr, author and keynote speaker on Business and Technology

As is the case with all AI projects, models are continually refined over time, with iterative training to strengthen the results and improve precision of the output. Teju explored several neural network models before deciding on an RNTN as the most effective way to perform natural language processing of social media feeds and news items.

"Many of these cryptocurrency price movements," Teju said, "could be determined by herd instinct. Herd Instinct, according to behavioral finance, is a mentality characterized by lack of individual decision making, causing people to think and act in the same way as the majority of those around them. The price movements tend to be based on market sentiment and the opinions of the communities surrounding the cryptocurrency. Based on these reasons, I believe that sentiment analysis of news headlines, Reddit posts, and Twitter* posts should be the best indicator of the direction of cryptocurrency price movements."

Recurrent neural networks (RNNs) have been a prominent technique for sentiment analysis, Teju noted. RNNs parse a string of text and tokenize the words, determining the frequency of words used and creating what is called a bag-of-words model, often used in document classification with word frequency being used to train a classifier. The subjectivity of each word is searched from a lexicon in which emotional values were prerecorded by researchers. From this data, the overall sentiment is gauged.
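The word-frequency and lexicon lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions: the `LEXICON` entries and their scores are invented placeholders (real lexicons are compiled by researchers and are far larger), and the helper names are hypothetical.

```python
import re
from collections import Counter

# Tiny hypothetical lexicon: word -> prerecorded emotional value.
LEXICON = {"good": 1.0, "great": 1.5, "surge": 1.0, "bad": -1.0, "crash": -1.5}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def bag_of_words(text):
    """Count how often each token occurs; word order is discarded."""
    return Counter(tokenize(text))

def lexicon_score(text):
    """Sum the lexicon value of every word, weighted by its frequency."""
    bag = bag_of_words(text)
    return sum(LEXICON.get(word, 0.0) * count for word, count in bag.items())

headline = "Great news: great surge for the coin"
print(bag_of_words(headline))
print(lexicon_score(headline))
```

Words missing from the lexicon contribute nothing to the score, which is one reason short texts with few scored words are hard to gauge this way.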

"RNNs work well for longer texts," Teju said, "but are ineffective at analyzing sentiment in shorter texts, such as news headlines, Reddit posts, and Twitter posts. RNNs fail to consider all the semantics of linguistics by failing to consider compositionality—the order of words in a string. Because of this, RNNs are ineffective at identifying change in sentiment and understanding the scope of negation."

Recursive Neural Tensor Network

After considering the alternatives, Teju decided that an RNTN would be the best option for his project because it can assess the semantic compositionality of text. For shorter pieces of text, such as a tweet, compositionality is vital to accurately determining sentiment from a sparse set of information.

"RNTNs," Teju said, "are great at considering syntactical order. RNTNs are made up of multiple parts including the parent group known as the root, the child groups known as the leaves, and the scores. Leaf groups receive input and the root group uses a classifier to determine the class and score."

Recursive neural tensor networks (RNTNs) are neural nets useful for natural-language processing. They have a tree structure with a neural net at each node. You can use recursive neural tensor networks for boundary segmentation, to determine which word groups are positive and which are negative. The same applies to sentences as a whole.

Word vectors are used as features and serve as the basis of sequential classification. They are then grouped into subphrases, and the subphrases are combined into a sentence that can be classified by sentiment and other metrics.

Data that is ingested by the sentiment analyzer is parsed into a binary tree. Specific vector representations are formed of all the words and are represented as leaves. From the bottom up, these vectors become the parameters to optimize and serve as feature inputs to a softmax classifier. Vectors are classified into five classes and assigned a score.
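A single tree node can be sketched as follows, assuming the composition published with the Stanford Sentiment Treebank: the two child vectors are concatenated into c, each parent entry is tanh(c^T V[k] c + W[k] . c), and a softmax layer maps the parent vector to five classes. The dimensions, the random untrained weights, and the placeholder word vectors are all assumptions made for illustration; this is not the project's code.

```python
import math
import random

def compose(a, b, V, W):
    """RNTN composition of two child vectors into one parent vector."""
    c = a + b  # concatenate children: length 2d
    parent = []
    for V_k, W_k in zip(V, W):  # one tensor slice and one weight row per output entry
        bilinear = sum(c[i] * V_k[i][j] * c[j]
                       for i in range(len(c)) for j in range(len(c)))
        linear = sum(w * x for w, x in zip(W_k, c))
        parent.append(math.tanh(bilinear + linear))
    return parent

def softmax(scores):
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(vec, Ws):
    """Probability of each sentiment class for a node vector."""
    return softmax([sum(w * x for w, x in zip(row, vec)) for row in Ws])

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d, n_classes = 3, 5                                        # toy sizes (assumed)
V = [random_matrix(2 * d, 2 * d, rng) for _ in range(d)]   # tensor: d slices of 2d x 2d
W = random_matrix(d, 2 * d, rng)                           # standard layer weights
Ws = random_matrix(n_classes, d, rng)                      # softmax classifier weights
not_vec = [rng.uniform(-1, 1) for _ in range(d)]           # placeholder word vectors
good_vec = [rng.uniform(-1, 1) for _ in range(d)]

phrase_vec = compose(not_vec, good_vec, V, W)  # vector for the phrase "not good"
probs = classify(phrase_vec, Ws)               # five class probabilities
print(probs)
```

In a trained model the word vectors, `V`, `W`, and `Ws` would all be learned from labeled parse trees; here they are random, so the printed probabilities carry no meaning.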

"The next step," Teju said, "is where recursion occurs. When similarities are encoded between two words, the two vectors move across to the next root. A score and class are outputted. A score represents the positivity or negativity of a parse while the class encodes the structure in current parses. The first leaf group receives the parse and then the second leaf receives the next word. The score of the parse with all three words are outputted and it moves on to the next root group."

"The recursion process continues until all inputs are used up, with every single word included. In practical applications RNTN’s end up being more complex than this. Rather than using the immediate next word in a sentence for the next leaf group, an RNTN would try all the next words and eventually checks vectors that represent entire sub-parses. Performing this at every step of the recursive process, the RNTN can analyze every possible score of the syntactic parse."

Figure 3 shows an example of how a sentence is parsed and analyzed using an RNTN approach.
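The bottom-up recursion described above can be sketched as a greedy merge: score every adjacent pair of nodes, combine the best-scoring pair into a parent, and repeat until one root remains. To keep the sketch short it uses a plain tanh layer for composition instead of the full tensor term, and the weights `W`, the scoring vector `u`, and the word vectors are random placeholders. All names are hypothetical and this is not the project's code.

```python
import math
import random

def compose(a, b, W):
    """Simplified composition: tanh(W . [a; b]) (tensor term omitted)."""
    c = a + b
    return [math.tanh(sum(w * x for w, x in zip(row, c))) for row in W]

def score(vec, u):
    """Scalar score saying how plausible a merged node is."""
    return sum(w * x for w, x in zip(u, vec))

def greedy_parse(words, vectors, W, u):
    """Merge the best-scoring adjacent pair until a single root remains."""
    if not words or len(words) != len(vectors):
        raise ValueError("need one vector per word and at least one word")
    nodes = list(zip(words, vectors))  # each node: (subtree, vector)
    while len(nodes) > 1:
        candidates = []
        for i in range(len(nodes) - 1):
            parent = compose(nodes[i][1], nodes[i + 1][1], W)
            candidates.append((score(parent, u), i, parent))
        _, i, parent = max(candidates, key=lambda cand: cand[0])
        merged = ((nodes[i][0], nodes[i + 1][0]), parent)
        nodes[i:i + 2] = [merged]      # replace the pair with its parent
    return nodes[0]

def leaves(tree):
    """Flatten a nested-tuple parse tree back into its words."""
    if isinstance(tree, tuple):
        return leaves(tree[0]) + leaves(tree[1])
    return [tree]

rng = random.Random(0)
d = 3
W = [[rng.uniform(-0.5, 0.5) for _ in range(2 * d)] for _ in range(d)]
u = [rng.uniform(-0.5, 0.5) for _ in range(d)]
words = ["this", "coin", "is", "not", "good"]
vectors = [[rng.uniform(-1, 1) for _ in range(d)] for _ in words]

tree, root_vec = greedy_parse(words, vectors, W, u)
print(tree)
```

The root vector summarizes the whole sentence and could be passed to a classifier. With untrained random weights, the tree shape printed here is arbitrary.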

Enabling Technologies

Teju gave a nod to the benefits of using Intel® technologies in his projects. "My solution utilizes Intel® AI DevCloud, Intel® Distribution for Python*, and Intel® Optimization for Caffe*," he said.

Intel AI DevCloud served as the development platform during the early stages of the project. "At the start of the project, for the sandbox version, I used Intel AI DevCloud to run the recurrent neural networks and experiment with Twitter data to see how the models were working. For my initial project with Intel and for my current project with Intel, it is completely using Intel AI DevCloud and its supporting technologies."

Intel AI DevCloud, a server cluster featuring Intel® Xeon® Scalable processors, available to Intel® AI Developer Program members free of charge, is preloaded with frameworks and tools to quickly launch machine learning and deep learning projects. Pre-installed components include neon™ framework, Intel® Optimization for Theano*, Intel® Optimization for TensorFlow*, Intel® Optimization for Caffe*, and the Keras library. Connections, once approved, take only about 10 minutes to set up, by means of a Linux® terminal or graphical user interface client, such as PuTTY. Access through Microsoft Windows* is also supported. At this point, you're ready to begin training models or running Python* code. For a thorough introduction to the process, read Getting started with the Intel AI DevCloud. Deep learning can be a challenge for those just getting started, but fortunately there are many resources for gaining an understanding of models and initiating training.

Connections with libraries, code, and examples also proved invaluable during the project development. "Intel® Developer Zone (Intel® DZ) has been a great resource to learn about how I could utilize Intel technologies to better build my product," Teju said. "There are also a great number of projects out there built by peers in the AI industry from which I got both motivation and valuable insight into the various ways AI and ML were being used in a wide range of industries and use cases."

"I recommend checking out the Stanford Sentiment Treebank to learn about recursive neural tensor networks. I also recommend taking a look at the Intel Developer Zone as there are many libraries, tutorials, technical articles, and a plethora of digital content that you could learn from."

AI is Providing New Opportunities in the Financial Sector

Through the design and development of specialized chips, sponsored research, educational outreach, and industry partnerships, Intel is firmly committed to advancing the state of AI to solve difficult challenges in medicine, manufacturing, agriculture, scientific research, and other industry sectors. Intel works closely with government organizations, non-government organizations, educational institutions, and corporations to uncover and advance solutions that address major challenges in the sciences.

For example, consistently gaining high returns on stock market investments has been phenomenally difficult, even for very experienced investors. Bringing AI techniques to bear on this challenge, an international team devised algorithms using past market data to simulate real-time investment. These techniques demonstrated a 73 percent return on investment compared with the 9 percent typical of real market scenarios. The algorithms proved particularly effective during times of extreme market volatility, suggesting that AI can detect and respond to patterns that human investment managers fail to recognize. The lead author of the study was Dr. Christopher Kraus, chair for Statistics and Econometrics at the School of Business and Economics at Germany’s Friedrich-Alexander-Universität Erlangen-Nürnberg.4

"AI's role in finance may not attract as much attention in Hollywood, but it is likely to have a far greater economic impact than consumer tech. From extending investment opportunities to the underbanked to thwarting fraud to mitigating investment risks, AI has the potential to not only revolutionize the industry, but also to improve the financial health of millions of people in the US and across the world."5

For Intel® AI Developer Program members, the Intel® AI DevCloud provides a cloud platform and framework for machine learning and deep learning training. Powered by Intel Xeon Scalable processors, the Intel AI DevCloud is available for up to 30 days of free remote access to support projects by academy members.