In the last course of our specialization, Overview of Advanced Methods of Reinforcement Learning in Finance, we will take a deeper look into topics discussed in our third course, Reinforcement Learning in Finance.
In particular, we will talk about links between Reinforcement Learning, option pricing and physics, implications of Inverse Reinforcement Learning for modeling market impact and price dynamics, and perception-action cycles in Reinforcement Learning. Finally, we will overview trending and potential applications of Reinforcement Learning for high-frequency trading, cryptocurrencies, peer-to-peer lending, and more.
After taking this course, students will be able to:
- explain fundamental concepts of finance such as market equilibrium, no arbitrage, and predictability,
- discuss market modeling,
- apply the methods of Reinforcement Learning to high-frequency trading, credit risk in peer-to-peer lending, and cryptocurrency trading.

Taught by

Igor Halperin

Transcript

So far in this specialization, our examples implicitly assumed a very classical model of a financial market. It's called a quote-driven market, and it works as follows. First, it's a centralized market. It's centered around the market maker, also known as a specialist or a dealer. The dealer aggregates all buy and sell orders and sets quotes for bid and ask prices. This means that the dealer is committed to buy and sell the asset at the prices quoted. This is called providing liquidity in financial jargon. So, there are separate bid and ask prices for each asset, and performance of a trade is guaranteed by the dealer. The price that we typically operate with in conventional financial models is the so-called mid-price, which is just the average of the bid and ask prices in this scheme. An example of such a system is the specialist system of the New York Stock Exchange, or NYSE. On this photograph, taken from the Wikipedia page, you can see the trading floor of the NYSE about a century ago. The gentlemen standing there are market specialists. Back then, they would take in buy and sell orders and shout out their quotes on the floor of the exchange. Those days are long gone. Today's orders are executed electronically, but the essence of the mechanics is the same as in this picture. This means that in a quote-driven market there are always two prices for the same security, one to buy and one to sell, and you can submit a market order in this market, and it will be executed by a dealer on your behalf. This is also the market assumed in classical financial models such as the Black-Scholes option pricing model. We discussed this model in our course on reinforcement learning and also in the first two weeks of this course. Let me remind you of some characteristics of this model. First, this model provides a single number as the price of an option. This unique option price is given by the expected cost of hedging this option.
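As a brief aside on the quote-driven mechanics described above, the relation between a dealer's two-sided quote and the mid-price can be sketched in a few lines of Python. The `DealerQuote` class and the numbers used are hypothetical and serve only as an illustration; they are not part of the lecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DealerQuote:
    """A two-sided quote posted by a dealer (hypothetical structure)."""

    bid: float  # price at which the dealer commits to buy the asset
    ask: float  # price at which the dealer commits to sell the asset

    def __post_init__(self):
        # A dealer would not bid above their own ask price.
        if self.bid > self.ask:
            raise ValueError("bid must not exceed ask")

    @property
    def mid(self) -> float:
        """Mid-price: the average of the bid and ask prices."""
        return (self.bid + self.ask) / 2.0

    @property
    def spread(self) -> float:
        """Bid-ask spread: the gap between the two quoted prices."""
        return self.ask - self.bid


# Illustrative numbers only.
quote = DealerQuote(bid=99.5, ask=100.5)
print(quote.mid, quote.spread)
```

Conventional models work with `mid` alone, which is why the gap between the two quoted prices does not appear in them.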
This model also assumes infinite liquidity. This means that you can buy any number of shares and it will not have any impact on the price. Also, this model doesn't take into account transaction costs. In the course on reinforcement learning, we looked into how to bring the risk of mis-hedging back into option pricing. We also mentioned that in certain regimes the assumptions of the Black-Scholes model about the market become especially problematic, and two such cases are very large option trades and high-frequency trading. In both cases, non-vanishing market impact and finite market liquidity become critical factors. In both cases, it turns out that the mechanics of execution of buy and sell orders are important for understanding these aspects of market behavior. This is usually referred to as market microstructure, and this is what we are going to talk about a bit in this lecture. So, let me start with another photograph of a part of the New York Stock Exchange, taken these days rather than 100 years ago. This is a photograph of a data center of the New York Stock Exchange. There are no more people shouting on the floor. Trading is now done electronically. All market orders are now registered and stored as historical data, sampled at a very high resolution. So, fast execution of market orders and the availability of transaction data is one characteristic of modern markets with electronic execution, but it turns out that it's not the only new thing that electronic trading platforms bring to us. They also change the way the market operates. Modern exchange-driven markets are so-called order-driven markets. Electronic platforms such as the New York Stock Exchange or NASDAQ aggregate all available orders into a limit order book, or LOB for short. We will talk at length about the LOB in our next video, while here we start with the wider picture. Also, very importantly, such platforms are examples of decentralized markets.
So, the market is no longer locked around one dealer who takes on all trades by providing liquidity on both the buy and sell sides. These days, order-driven markets account for more than 80 percent of all traded stock volume. The same stock can be traded at different exchanges or venues. Now, several things change drastically in such markets relative to the classical quote-driven markets. First, both buyers and sellers now have nearly real-time access to the microscopic level of the market state. The market now operates as a two-sided market platform known as a double auction. It's called a double auction because it essentially matches buyers and sellers both ways by auctioning, as we will see shortly. Because it works as a double auction, such a market leads to highly complex dynamics. The dynamics arise as a result of interactions between many heterogeneous agents. Therefore, the limit order book can be viewed as an example of a highly complex system that can be modeled using a number of appropriate methods. Some of them are described in the paper by Rama Cont that I quote on this slide, and you can consult this paper for more details. So, the first thing that should be emphasized about electronic trading with a limit order book is that the amount of data available increases dramatically. The table that you see here is taken from the paper that I cited, and it shows data for three stocks: Citigroup, General Electric, and General Motors. It shows that the average number of orders for each of these highly liquid stocks is measured in thousands within a 10-second interval. Price changes within one day are in the tens of thousands. Please note that this table refers to data that is almost 10 years old. So today, these numbers are even higher. Another very important point to understand here is the hierarchy of time scales in the markets. Here's another table, taken from the same paper by Rama Cont, that might help to understand that.
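To make the double-auction idea mentioned above a bit more concrete, here is a minimal sketch of matching buyers and sellers by price priority. It is a toy batch version written under simplifying assumptions (all orders are known up front, ties are broken by position in the input list, and the rule for setting the trade price is left out). The function name `match_double_auction` and the sample orders are hypothetical and do not describe how any particular exchange works.

```python
import heapq


def match_double_auction(bids, asks):
    """Match (price, quantity) buy and sell orders by price priority.

    Returns a list of (bid_price, ask_price, quantity) matches.
    Unmatched orders are simply left over; the trade price rule is omitted.
    """
    # Highest bid first (negated price), lowest ask first.
    # The list index acts as a tie-breaker for equal prices.
    buy_heap = [(-price, i, qty) for i, (price, qty) in enumerate(bids)]
    sell_heap = [(price, i, qty) for i, (price, qty) in enumerate(asks)]
    heapq.heapify(buy_heap)
    heapq.heapify(sell_heap)

    matches = []
    while buy_heap and sell_heap:
        neg_bid, bid_idx, bid_qty = buy_heap[0]
        ask, ask_idx, ask_qty = sell_heap[0]
        bid = -neg_bid
        if bid < ask:
            break  # the best buyer will not pay the best seller's price
        qty = min(bid_qty, ask_qty)
        matches.append((bid, ask, qty))
        heapq.heappop(buy_heap)
        heapq.heappop(sell_heap)
        # Put back whatever part of either order remains unfilled.
        if bid_qty > qty:
            heapq.heappush(buy_heap, (neg_bid, bid_idx, bid_qty - qty))
        if ask_qty > qty:
            heapq.heappush(sell_heap, (ask, ask_idx, ask_qty - qty))
    return matches


# Illustrative orders only: (price, quantity).
bids = [(101.0, 50), (100.0, 30), (99.0, 20)]
asks = [(100.0, 40), (100.5, 30), (102.0, 10)]
matches = match_double_auction(bids, asks)
print(matches)
```

In this example the best bid at 101.0 is matched first against the ask at 100.0 and then against part of the ask at 100.5. Matching stops once the best remaining bid (100.0) is below the best remaining ask (100.5).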
There are three broad categories of time scales here. The first one can be referred to as the ultra-high-frequency regime, and the paper by Cont defines it as time scales below 0.1 second. For this regime, market microstructure, latency of execution, and liquidity become the most important factors. The second regime is between 1 second and 100 seconds, and it's called the high-frequency regime. For example, problems such as optimal execution fall into this time-scale regime. Finally, there is the regime of time scales above 100 seconds, which we can call the daily regime. This regime is mostly relevant for such tasks as trading strategies, option hedging, optimal portfolio management, and so on. So, let's take a short stop here and continue momentarily after our Q&A session.
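The three time-scale regimes quoted in the lecture can be written down as a small lookup function. This is only a sketch: the function name `classify_time_scale` and the label strings are hypothetical, and the thresholds are the ones stated in the lecture. The lecture does not assign the interval between 0.1 second and 1 second to any regime, so the sketch reports it as unclassified rather than guessing.

```python
def classify_time_scale(seconds: float) -> str:
    """Map a time scale in seconds to the regime named in the lecture."""
    if seconds <= 0:
        raise ValueError("time scale must be positive")
    if seconds < 0.1:
        # Microstructure, latency of execution, liquidity.
        return "ultra-high frequency"
    if 1.0 <= seconds <= 100.0:
        # Problems such as optimal execution.
        return "high frequency"
    if seconds > 100.0:
        # Trading strategies, option hedging, portfolio management.
        return "daily"
    # Between 0.1 s and 1 s: not covered by the thresholds in the lecture.
    return "unclassified"


for t in (0.001, 10.0, 3600.0, 0.5):
    print(t, classify_time_scale(t))
```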