The problem with the paper is not overfit. They claim to have run their simulation with out of band ("live") data. The actual problem with the paper is that we have no idea if their simulator is any good, which means that their result (89% return in 50 days) could be totally bogus. In other words, we don't know if the actual bitcoin exchange would fill their orders at the same prices (if at all) as their simulator does. A decent simulator for a high-frequency strategy like this is not trivial because you have to incorporate all the exchange behavior (documented or not) into your simulator, and then you have to validate your results by comparing your simulated results to the results of some actual trading. The fact that they spent none of the paper on the details of the simulator makes me extremely skeptical.
People so often overlook the role that execution plays in trading. As time scales shrink the impact of execution increases, and indeed many HFT strategies are not profitable without top tier execution (both in terms of fee schedules and technology). This is also one of the places that most academics fail when analysing a trading strategy. They make typical "assume a frictionless surface" types of assumptions that break down in a real market.
To your point, high-frequency simulation is damn hard and it is highly likely that they failed here and have tainted their results with bad simulations. Most of these papers would be well served to avoid dollars and cents and simply analyse the statistical qualities of their signal to their target over time. That is the first step in this business, anyway. Simulations only happen once the signal has been through the wringer.
I see no mention of the spread, costs, the ability to put on a short position, or exchange lag variability (a huge issue when simulating even on modern exchanges, let alone fly-by-night bitcoin markets). Additionally, it is very easy to find trend-following signals that "work" but break down very quickly when conditions change. That seems to make up most of the prediction, along with an order book imbalance signal that might be more stable.
I think a better approach would be looking at order book features and lags vs. other markets. The costs are high on BTC markets so it would be pretty tough to overcome those though. Anyone with experience to do this is probably doing it somewhere more lucrative. I think BTC markets only trade a few million USD a day.
About 35 million USD in trades on the public bitcoin-paired markets today. About 40 million USD today in total volume for all pairs on these exchanges.
Public spot FX markets and OTC trading do huge volumes too.
Some BTC exchanges charge like 60 bps per trade. Finding a signal to overcome that cost and do enough trading to make it worthwhile would be quite difficult.
Even with live data as a validation set (and ignoring your point about execution friction), it's inappropriate to naively infer expected future return from total return during the validation run, since asset price fluctuations don't follow a normal distribution. Very high volatility events happen far more often than a bell curve predicts, hence a few lucky or unlucky outliers can skew your total. The Sharpe ratio, which they used to measure risk exposure, also dangerously assumes a normal distribution.
>we don't know if the actual bitcoin exchange would fill their orders at the same prices
This is the problem with many simulated trading algorithm "success" stories. No one actually knows the effect that the simulated trades would have on the real market, or how others trading against the algorithm would react to these trades. The numbers may be somewhat close to reality if the simulated trades are very small in relation to the trading volume for the asset. However, Bitcoin markets are small to begin with, which probably makes the results in this paper unreliable.
>Specifically, every two seconds they predicted the average price movement over the following 10 seconds. If the price movement was higher than a certain threshold, they bought a Bitcoin; if it was lower than the opposite threshold, they sold one; and if it was in-between, they did nothing.
If they are indeed predicting the price at X+10 seconds at second X, they have more than enough time to act on that info without having to do HFT.
That assumes there is no bias to their signal returns over time, which is extremely unlikely. 90% of the moves that go their way could happen in the first second, for example.
Many HFTs hold positions for seconds or minutes, and predict prices out over similar time horizons, but they'll lose their best winners to competitors if they aren't fast and adept at executing.
This is short-term trading. "Every two seconds they predicted the average price movement (on OKcoin) over the following 10 seconds. If the price movement was higher than a certain threshold, they bought a Bitcoin; if it was lower than the opposite threshold, they sold one; and if it was in-between, they did nothing." I don't see them allowing for commissions and fees.
OKcoin, at peak, had a trading volume so high that it's generally considered to be fake - the exchange operators manipulating the price. What this group at MIT may have done is reverse-engineered the fake trade generation algorithm.
> What this group at MIT may have done is reverse-engineered the fake trade generation algorithm.
Just to be clear, there is nothing wrong with this. In fact, sitting around and reverse engineering what other traders are doing is what many funds do. I'm in this group so I'm happy to answer questions if anyone has any.
> Every two seconds they predicted the average price movement (on OKcoin) over the following 10 seconds. If the price movement was higher than a certain threshold, they bought a Bitcoin; if it was lower than the opposite threshold, they sold one; and if it was in-between, they did nothing.
To be clear, this is the core of what most HFT systems do these days. Consume many different factors, give each factor a decay factor to tell the system when the signal goes stale, and distill them all into a value that says buy, sell, or stay.
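For anyone wondering what that looks like in practice, here is a minimal sketch, assuming exponential decay with a per-factor half-life and a symmetric threshold. The names `Factor`, `combined_signal`, and `decide`, and all the numbers, are hypothetical illustrations, not anything taken from the paper or from a real trading system.

```python
from dataclasses import dataclass


@dataclass
class Factor:
    value: float        # signed prediction, positive means "expect price up"
    age_s: float        # seconds since this factor was last updated
    half_life_s: float  # assumed time for the factor's weight to halve


def combined_signal(factors):
    """Sum the factor values, each down-weighted by exponential decay."""
    return sum(f.value * 0.5 ** (f.age_s / f.half_life_s) for f in factors)


def decide(signal, threshold):
    """Map the combined value onto buy / sell / stay."""
    if signal > threshold:
        return "buy"
    if signal < -threshold:
        return "sell"
    return "stay"


# A fresh bullish factor and a stale bearish one (two half-lives old).
factors = [Factor(1.0, 0.0, 5.0), Factor(-0.4, 10.0, 5.0)]
signal = combined_signal(factors)
print(signal, decide(signal, threshold=0.5))
```

The stale factor keeps only a quarter of its weight here, so the fresh one dominates. Real systems presumably fit the half-lives and weights from data rather than hard-coding them.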
> Just to be clear, there is nothing wrong with this. In fact, sitting around and reverse engineering what other traders are doing is what many funds do. I'm in this group so I'm happy to answer questions if anyone has any.
His point is that the posted trades were not in fact tradeable.
Where can I read about the actual strategies used? Perhaps some that aren't used any more. I've read an absolute ton about how to set up a trading system with hadoop and cassandra and blah blah but that's all pretty trivial. I'd be more interested in the strategies... this decay factor is interesting. Where did you find out about this?
A simple signal would be more shares bid than offered at the inside market. Imagine a market with 1000000 shares bid at $4 and 100 shares offered at $4.01, it's more likely to tick up than down in the very near-term. This isn't tradable though since the predictions aren't large enough to overcome the spread and you can't expect to join the 1000000 share bid and trade when it's good (the 1000000 shares that made you think it was going to move up all have to trade or cancel for you to trade, not looking so great anymore).
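A rough sketch of that signal, assuming the usual normalized form (bid size minus ask size, over their sum). The function name `book_imbalance` is made up for illustration, and the inputs are the numbers from the example above.

```python
def book_imbalance(bid_size, ask_size):
    """Imbalance in [-1, 1]; positive means more size bid than offered."""
    total = bid_size + ask_size
    if total == 0:
        return 0.0  # empty book: no information either way
    return (bid_size - ask_size) / total


# 1,000,000 shares bid at $4.00 versus 100 offered at $4.01.
imbalance = book_imbalance(1_000_000, 100)
print(f"{imbalance:.4f}")
```

The value comes out very close to +1, which says "likely to tick up", but as noted above that alone does not tell you whether you can actually get filled at a price that makes the prediction worth anything.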
Real HFTs use other features of the order book, movements in related products (maybe a move in oil futures in the last minute impacts the price of an airline stock in the next 10 seconds, etc.), and so on. The signals are not really that complicated, but competitors eventually converge on knowing the same signals and compete them out of existence/profitability. Also, how you execute around your prediction matters just as much if not more.
This is how I trade currencies, usually a Yen pair on a 30min timeframe. It does work and money can be made, but it is a lot less profitable than they claim (on the Yen pairs I trade). Generally, a very outsized event can erase quite a lot of gains. Even smaller ones can make you give back a bit (such as NY ebola news or surprise BoJ news). Over time those commissions/fees add up. Still, over time I would bet they could be profitable on this strategy on some timeframe. Of course, just profitability is a very different claim than their own.
This seems like massive historical overfit, which can lead to arbitrarily precise fit, but no predictive capability.
Any model, if given enough parameters, can be made to match historical data to an arbitrary degree.
I also run several Bitcoin bots. I can tell you that slippage is not insignificant. If you make transactions every ~10 seconds and incur 0.1% fees each time, this is an extremely significant effect in aggregate. Also bid-ask spreads, while usually small, often aren't in periods of high volume.
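To put a number on "extremely significant", here is a back-of-the-envelope sketch. It assumes the whole balance is turned over on every trade and that prices never move, which is deliberately crude; `capital_after_fees` is a hypothetical helper, not something from the paper.

```python
def capital_after_fees(fee_rate, n_trades):
    """Fraction of capital left after paying a proportional fee on every
    trade, assuming full turnover each time and no price movement."""
    return (1.0 - fee_rate) ** n_trades


# One trade every 10 seconds for an hour, at 0.1% per trade.
trades_per_hour = 3600 // 10
remaining = capital_after_fees(0.001, trades_per_hour)
print(f"fraction of capital left after one hour: {remaining:.3f}")
```

Under those assumptions roughly 30% of the balance is gone in an hour from fees alone, before any spread or slippage. Swapping in a different fee rate (for example 0.006 for 60 bps) shows how quickly the hurdle grows.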
Agreed. According to the paper: They trained the model, once, on 3-4 months of data from Feb to May. Then they tested it, once, on ~6 weeks of data from May 6 - June 24.
There was no cross validation involved in assessing the performance of this model. Using one subset for training, and one subset for testing and calling that conclusive is naive.
No interesting conclusions can be taken away from this paper due to flawed methodology and failure to take into account real world variables like bid-ask spread, commissions, how quickly trades can be executed, whether orders will even be filled, etc.
Cross validation too uses training sets and test sets. This sort of time-ordered data will not be independent, so the prequential approach seems to be a more suitable approach to measuring forecast accuracy. (I haven't read the paper.)
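As a sketch of what I mean, here is an expanding-window walk-forward split, which is one common block-wise approximation of prequential evaluation: every test block comes strictly after all the data the model was trained on. The function name and the sizes are hypothetical.

```python
def walk_forward_splits(n_samples, initial_train, test_size):
    """Yield (train_indices, test_indices) pairs where the test block
    always follows the training data in time order."""
    start = initial_train
    while start + test_size <= n_samples:
        yield range(0, start), range(start, start + test_size)
        start += test_size  # the block just tested joins the training set


splits = list(walk_forward_splits(n_samples=10, initial_train=4, test_size=2))
for train, test in splits:
    print(list(train), "->", list(test))
```

Averaging the forecast error over all the test blocks gives several out-of-sample measurements instead of a single train/test pair, without ever letting the model see the future.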
Anyways, as somebody who has spent GIANT amounts of time experimenting with machine learning using market data (mainly stocks), I can give you TONS of algorithms I've created that would show similar, even much higher, gains when you only test on 6 weeks of data. Been there, done that. For example, in 6 weeks you make a 100% return, then in the next 3 weeks you suffer a 50% loss. Reality sets in...
Well, they did trade it out of sample, so that probably means the model wasn't overfit. But, I agree, they probably used unrealistic execution assumptions.
"With four parameters I can fit an elephant, and with five I can make him wiggle his trunk." -- John von Neumann
The green "real" signal in your picture is amusing when juxtaposed with the red "zero-noise-assumption" signal in the last frame. TBH this accounts for most of my distrust of e.g. climate modeling.
In red is your model whereas in green is the real one, M being the number of parameters.
The technical term for the last one is "overfitting" if I remember correctly. But in the case you have an enormous amount of data, it is unlikely to happen.
If the parameter space for my model includes, let's say 10 binary decisions (which is very conservative), that's 1024 possible states of my model. If I tested all 1024 states against historical data, it is likely that some of them might do very well (depending on the general architecture of the model of course). What if I then selected the successful minority and held them up as clever strategies? Their success would very likely have been arbitrary. By basically brute-forcing enough strategies, I will inevitably come across some that were historically successful. But these same historically successful strategies are unlikely to outperform another random strategy in the future. It's not impossible you'll find a nugget of wisdom hidden from everyone else, just much less likely than the more simple explanation I'm offering.
So to your point, it's not just the size of the parameter space versus the data set that matters. Brute-forcing the former alone will likely produce a deceptive minority of winners.
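Here is a toy version of that argument, with a big simplifying assumption: each of the 1024 "strategies" is replaced by a pure coin flip per day, so by construction none of them has any edge. The function name, the 50-day window, and the seed are all arbitrary choices for illustration.

```python
import random


def best_of_many(n_strategies=1024, n_days=50, seed=0):
    """Pick the best in-sample coin-flip strategy, then report how the
    same strategy does on fresh, independent data."""
    rng = random.Random(seed)

    def pnl():
        # Daily result is +1 or -1 with equal probability: zero true edge.
        return sum(rng.choice((-1, 1)) for _ in range(n_days))

    in_sample = [pnl() for _ in range(n_strategies)]
    out_of_sample = [pnl() for _ in range(n_strategies)]
    best = max(range(n_strategies), key=in_sample.__getitem__)
    return in_sample[best], out_of_sample[best]


best_in, best_out = best_of_many()
print("best strategy in sample:", best_in, "| same strategy out of sample:", best_out)
```

The in-sample winner looks impressive purely because it is the maximum of 1024 random draws; its out-of-sample result is just another coin-flip sum with an expected value of zero.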
"Predicting the future without a causal understanding of the system is epistemologically questionable."
That is an intriguing assertion, but it is circular. One can only demonstrate a causal understanding of a system by making usefully accurate predictions about the system's behavior.
To attempt that with a system consisting of market prices of tradeable securities is an exercise in frustration, because such markets do not operate by consistent, unchanging causal rules. In fact, financial markets are not systems at all in the usual sense, because their parts and connections are continually changing.
Sometime in 2013, before the bitcoin prices exploded, I downloaded some bitcoin historical price data and ran symbolic regression on it with Eureqa. It came up with a formula that fit the observed data fairly well, and wasn't very complicated.
But when I extrapolated it forward a few months, it predicted the price would explode to unreasonable levels. I was disappointed and threw it away, assuming that it must be wrong.
I'm late to comment, but something which I'd like to point out is that this is done by the same team behind the Twitter trending topic prediction technique from a few years back, as mentioned also in the article [1].
When their Twitter technique was released, I spent a few weeks reading through Nikolov's PhD thesis (the advisor gets most of the fame in the press articles but Nikolov's thesis has all the details) and trying to implement it in R. My observations at the time: extremely simple algorithm which would be shot down by most peer reviewers for being not very novel (the affiliation helps a lot here). That said, I believe greatly in pragmatism, and the approach was actually working well. What I did find out, however, is that there was a great deal of data selection and pre-processing involved, making the approach hard to implement in a real-life, real-time setup. I get similar feelings from this work.
The paper states that the strategy was simulated with live data and makes no mention of slippage. I've never traded bitcoin so I'm not sure how difficult it is to get fills, but that along with spreads are non-trivial components of real trading.
It would have been better if they set up a wallet, put bitcoin in it, and demonstrated the trades that were automatically executed by this wallet. Since bitcoin is a ledger history, we could be 100% certain that SOME algorithm 'did the right thing', although we couldn't be certain that they hadn't spun up more than one algorithm and then just shown us the best one.
...That's not how Bitcoin trading works. In order to trade, your bitcoins need to be deposited into an exchange. Trades are not recorded on bitcoin's public ledger because there isn't a transaction for every trade. Bitcoin cannot currently handle that many transactions.
OKcoin (the exchange they relied on in the paper for data) claims 0% commissions and a bid-ask spread of just 0.04 RMB. How real that is, I have no idea. But if you relied on this information, you could believe that you can still take a few additional spreads' worth of slippage (in addition to crossing) to compensate for execution latency and still come out very far ahead.
Success in predicting markets is measured in profit.
This research team should start a company that offers a service that allows users to deposit bitcoins, which the company then invests according to their alleged predictions, and then pay interest on deposits, and keep a part of the profit for themselves.
Doubling the initial investment one time is one thing, but this hypothetical company being able to double its investment every 50 days for years is something else. I doubt they can. A doubling every 50 days is x160 every year.
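The arithmetic behind that figure, as a quick sketch. It assumes a 365-day year and that the per-period multiple can be compounded indefinitely, which is exactly the assumption being doubted; `annual_multiple` is a hypothetical helper.

```python
def annual_multiple(period_multiple, period_days, days_per_year=365):
    """Compound a per-period growth multiple over one year."""
    return period_multiple ** (days_per_year / period_days)


doubling = annual_multiple(2.0, 50)   # doubling every 50 days
claimed = annual_multiple(1.89, 50)   # the paper's 89% in 50 days
print(f"doubling: x{doubling:.0f} per year, 89% per 50 days: x{claimed:.0f} per year")
```

Doubling every 50 days works out to a little under x160 per year, and compounding the reported 89% instead gives roughly x100 per year, which is still an extraordinary claim.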
I think claims of being able to predict market prices should be met with great skepticism. Especially prices of easily traded commodities, including bitcoins.
The only proper measure of an ability to predict market prices is profit, because profit also measures the extent of the predictions: how much can you move the market (by trading according to your predictions) until you can no longer predict what will happen? Obviously, there's a limit. No one can extract unlimited profit from any market. So there definitely is a limit to how much you can earn from your algorithm. If you can earn 10% p.a. on an investment of maximum $5000, your algorithm isn't really worth much. If you can earn 1000% p.a. on an investment of up to $100M, your algorithm is great. But without knowing these figures we really only have a claim, seemingly a claim of them being able to make a lot of money, but choosing not to do so.
If everyone began using the paper's strategy, would the strategy still work?
Also, the strategy seems less effective than portrayed in the news article. If you look at the "results" section, it seems like the profit flatlined shortly after starting, then had success due to some major trading event, then eventually flatlined again: http://i.imgur.com/CBjEjgo.png
Wouldn't it be more accurate to say "this strategy is effective under some very specific circumstances"?
It seems like the key insight of the paper, but there's no mention of where it originated from. Is it a common equation in statistical modeling? I'd like to learn more about it. Does anyone have any suggested reading or coursework I should study?
I think equation 4 is just derived from equation 3. The top bit counts every time before that y_i has taken a particular value y, and multiplies it by the distance of the x value then, x_i, from the value of x now. (Squared and exponentiated because this is the pdf of the normal distribution.)
Here's an interpretation: if there were no noise, you might just count the number of times x took on a particular value and y took on a particular value, and divide that by the total number of times x took on that value. This would give you an empirical estimate of the prob of y given x.
Because there's noise, they weight the counts by the pdf of the normal distribution of x - x_i. So, whenever x_i was close to current x, and y_i was a given y, that increases the probability of y occurring now.
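Here is a minimal sketch of that interpretation, not checked against the paper's equation 4. It assumes a scalar x, discrete y labels, and a fixed kernel constant `c`; the function name and the toy history are made up.

```python
import math


def conditional_prob(x, y, history, c=1.0):
    """Estimate P(y | x) from past (x_i, y_i) pairs, weighting each past
    observation by a Gaussian kernel exp(-c * (x - x_i)**2)."""
    weights = [math.exp(-c * (x - xi) ** 2) for xi, _ in history]
    matching = sum(w for w, (_, yi) in zip(weights, history) if yi == y)
    return matching / sum(weights)


# Two nearby observations said "up", one far-away observation said "down".
history = [(0.0, "up"), (0.1, "up"), (2.0, "down")]
p_up = conditional_prob(0.05, "up", history)
p_down = conditional_prob(0.05, "down", history)
print(f"P(up | x=0.05) = {p_up:.3f}, P(down | x=0.05) = {p_down:.3f}")
```

With no noise (a very large `c`) this collapses to plain counting of exact matches; a smaller `c` lets observations at nearby x values contribute too.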
How many of the people who are bashing the paper in this discussion are machine learning experts? Especially the overfitting crowd - looks like emotional attachment to one's favorite topics is not really impacted by said person's overall education and expertise.
When you predict the future of a market, you change the future of that market. People start investing on the basis of your predictions and whatever opportunity for profit you found is closed. This is why HFT people iterate constantly and also why they put their servers as physically close to the market as possible.
This kind of innovation is cool, but it's a zero sum game. The bitcoin markets are already driven by competing bots. Their profits will be reduced as other bots iterate on their algorithms.
Please don't spout out 'zero sum' like you think it means something insightful here. Any marketplace is 'zero sum' if you think about it. There's a buyer and seller. So what?
I agree that his usage of zero sum was inaccurate, but so is yours.
Marketplaces are not at all inherently zero sum games. Wikipedia's definition (which is a fine one) is:
"a participant's gain (or loss) of utility is exactly balanced by the losses (or gains) of the utility of the other participant(s). If the total gains of the participants are added up and the total losses are subtracted, they will sum to zero."
The key thing here is that it is measured in utility (not in the price of the good). Of course every trade or transaction is flat in that I sold it to you for the price you bought it at (that is obvious and not what a zero sum game means). The issue is whether we are both made better off or not.
If I don't want to bear the risk of holding bitcoins and am happier mitigating that risk by selling them for dollars, my utility increased. Similarly, if you purchased them from me because you want to bear such risk, your utility increased. Even removing the risk/uncertainty from the equation, my willingness to pay for a good is not the same as the market price. Thus, if I sell bitcoins because my value of them is less than the current market price, then I am improved. Similarly, if you value bitcoins above the current market price and buy them from me, then your willingness to pay was higher than the price you paid. Thus, you have significant consumer surplus from that trade.
These are called Pareto efficient transactions. Most transactions are actually Pareto efficient, where at least one of the individuals increased their utility and no one involved in the trade decreased their utility.
This excerpt does a good job articulating this:
"Specifically, all trade is by definition positive sum, because when two parties agree to an exchange each party must consider the goods it is receiving to be more valuable than the goods it is delivering. In fact, all economic exchanges must benefit both parties to the point that each party can overcome its transaction costs, or the transaction would simply not take place."
For example, futures and options are zero-sum because every dollar of profit that one trader makes is offset by a dollar of loss from another trader.
But stock markets are not zero-sum. Prices can be bid up without a single share changing hands, creating new wealth out of thin air, and everyone wins. Conversely, prices can go to zero, destroying wealth and making everyone a loser.
Same thing for most other markets you can think of, from real estate to used cars to your local flea market. As with bitcoin exchanges, none of these are zero-sum since there doesn't have to be a loser for every winner.
You're mixing the concepts of realized and un-realized profit. A market price change marks your open position and gives you an unrealized (loss)gain. But to realize that you have to trade, plain and simple. And someone must take the other side of that.
A more compelling argument against zero-sum is that traders and investors have different time frames and objectives. And, indeed, the classic argument in the futures market is that a farmer who buys a hedge from a speculator represents a potential positive outcome for both.
I start a company to make cheap autonomous flying cars. I invest $1,000 of my own money and build a working prototype. It looks promising, so you buy 50% of my shares for $10,000.
I just made a 20x return on my investment, and you own 50% of a great opportunity. Who's the loser here that makes this a zero-sum market?
Our autonomous flying cars go into production, and now your 50% is worth $100 billion, and you sell your shares to Elon Musk. Where's the zero sum?
First, I never claimed the markets were or were not zero-sum, only that your attempt to explain was wrong as it mixed realized and unrealized profit.
Second, zero sum requires a definition of utility for each participant. If those definitions vary (and they almost assuredly do) then a reasonable argument can be made that to define whether the activity is a zero-sum game or not requires a commonality amongst those utilities (dollars, for example). This is often not the case in trading (for example, the farmer hedging his crop with a speculator who seems to profit short term).
Once your company has shares, you made $10,000 for 50% of your stock which you paid $500 for. But the investor merely has your shares. So you are up $9000, whoever got your $1000 is up $1000, and the investor is down $10,000. That adds up to zero.
Again, not saying you're wrong, but your example is useless.
Sorry, very badly phrased by me. I'm trying to say 'zero sum - so what?' and 'not zero sum - so what?' I regret writing the second part, I was trying to say (badly) that any simple model of a market could be zero sum. You can add in more things to the model (commission, whatever) and say now it's not zero sum. And you can add in opportunity costs and claim it's zero sum again. But who cares? It means nothing.
Now I've unintentionally started an argument about what is / is not zero sum and what bits of a market you take your definitions from. That was the exact opposite of what I was trying to say. If it is zero-sum, why does that even change anything or make the trading (from the original paper) worthwhile or not? I'm trying to say that 'zero sum' or not, it changes nothing and gives no additional insight.
Zero sum markets are not really a thing. The normal phrase is 'zero sum game'. Buying and holding stocks is not a zero sum game - you make money. Trading stocks between speculators where the trading commissions exceed the dividends and long term market appreciation is a zero sum game. Same market, different games.
They should have made more money rather than publishing more quickly. It used to be possible to do these sorts of things to the stock market but when these sorts of regularities are discovered the process of exploiting them also eliminates them once enough money is being made. Heck, a major trading firm got started by noticing that stocks went down on the weekend (and of course they don't any more).
Can an HFT-knowledgeable commenter chime in on the viability of the Sharpe ratio here?
From a physics perspective, it appears that the Sharpe ratio of 4.1 is roughly equivalent to a 4.1-sigma claim that their algorithm is better than random trading. I can't check easily, but I'd guess that the movement of Bitcoin prices isn't normally-distributed (looking at the paper's time series suggests that there's more low-frequency power there). If so, I'd guess that a more robust measure of the claim's significance would show it to be less significant.
Put differently, I'd guess more than 1 in 15,000 random sets of 2872 trades (their number of trades) would yield comparable profit. Furthermore, a simple buy-and-hold would've yielded a 20+% return over the same period.
A Sharpe of 4 is ok. Most "HFT" type strategies have Sharpes so high they don't talk about Sharpe anymore as it is meaningless. It is simply a different type of trading. See Virtu's pnl distribution in their S-1. As a point of reference, Blair Hull is on record saying they are interested in nothing below a Sharpe of 10 (r-finance talk, should be googleable).
Ever thought how our big data overlords (Google, Facebook, MS, Twitter, etc) just need to check for correlations between their users' data input and stock exchange movement?