tag:blogger.com,1999:blog-51081799898817447252024-03-14T02:45:47.202-07:00Trading with Pythonsjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.comBlogger24125tag:blogger.com,1999:blog-5108179989881744725.post-2392843141886085592017-05-20T13:58:00.001-07:002017-05-22T14:47:37.501-07:00Yahoo is dead, long live Yahoo!<div dir="ltr" style="text-align: left;" trbidi="on">
On 18 May 2017 the ichart data api of Yahoo Finance went down without any notice, and it does not look like it is coming back. This has left many (including me) with broken code and without a decent free end-of-day data source. Something needs to be done. Now.<br />
<br />
Apparently Yahoo! does not want us to download free data automatically, but it is still possible to download it by hand by clicking the 'download' button on the <a href="https://uk.finance.yahoo.com/quote/SPY/history?p=SPY" target="_blank">ticker webpage</a>. Automatic downloading is made more difficult by a <i>cookie-crumb</i> pair, but luckily it is still possible.<br />
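For those who want to roll their own, the flow looks roughly like this (a sketch using only the standard library; the endpoint url and the crumb pattern are my assumptions based on how the page behaves at the time of writing, and Yahoo may change them at any moment):<br />
<pre class="brush:python">import re
import urllib.request
from http.cookiejar import CookieJar

# pattern that matches the crumb embedded in the history page
CRUMB_RE = re.compile(r'"CrumbStore":\{"crumb":"(?P&lt;crumb&gt;[^"]+)"\}'.replace('&lt;', '<').replace('&gt;', '>'))

def get_cookie_crumb(symbol='SPY'):
    """Fetch the history page once; the cookie jar keeps the session cookie
    and the crumb is scraped from the embedded page data."""
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar()))
    url = 'https://uk.finance.yahoo.com/quote/%s/history?p=%s' % (symbol, symbol)
    html = opener.open(url).read().decode('utf-8')
    return opener, CRUMB_RE.search(html).group('crumb')
</pre>
The returned opener carries the session cookie, and the crumb then goes into the subsequent download requests; see the notebook for the full version.<br />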
<br />
I'm now working on updating the <i>tradingWithPython.yahooFinance </i>library, and as part of that work I've created a prototype script that shows how to get the data. Because so many people are now scrambling to get their code working, I'm sharing this as soon as possible.<br />
<br />
<a href="https://github.com/sjev/trading-with-python/blob/master/scratch/get_yahoo_data.ipynb" target="_blank">The notebook can be found on Github,</a> enjoy!<br />
<br />
<b>Update: </b><i style="font-weight: bold;"><a href="https://github.com/sjev/trading-with-python/blob/fix_yahoo/lib/yahooFinance.py">yahooFinance.py</a> </i>has been fixed! <br />
<br />
<br />
<i>Note: </i>the data provided seems to be adjusted for splits, but not for dividends.</div>
sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com3tag:blogger.com,1999:blog-5108179989881744725.post-24149759709560334302016-02-20T08:09:00.000-08:002016-02-20T08:09:03.059-08:00A simple statistical edge in SPY<div dir="ltr" style="text-align: left;" trbidi="on">
I've recently read a great post on the turingfinance blog about <a href="http://www.turingfinance.com/how-to-be-a-quant/" target="_blank">how to be a quant</a>. In short, it describes a scientific approach to developing trading strategies. For me personally, observing data, thinking in models and forming hypotheses is second nature, as it should be for any good engineer.<br />
<br />
In this post I'm going to illustrate this approach by explicitly going through a number of the steps (just a couple, not all of them) involved in the development of a trading strategy.<br />
<br />
Let's take a look at the most common trading instrument, the S&P 500 ETF 'SPY'. I'll start with observations.<br />
<br />
<b>Observations</b><br />
It has occurred to me that when there is much talk in the media about the market crashing (after big losses over a timespan of several days), quite a significant rebound often follows.
In the past I've made a couple of mistakes by closing my positions to cut losses short, only to miss out on a recovery in the following days.<br />
<br />
<b>General theory</b><br />
After a period of consecutive losses, many traders will liquidate their positions out of fear of an even larger loss. Much of this behavior is governed by fear rather than calculated risk. Smarter traders then step in for the bargains.<br />
<br />
<b>Hypothesis: </b>Next-day returns of SPY will show an upward bias after a number of consecutive losses.<br />
<br />
To test the hypothesis, I've calculated the number of consecutive 'down' days. Everything below a -0.1% daily return qualifies as a 'down' day.<br />
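Counting consecutive down days is straightforward in pandas; a minimal sketch (my own illustration, not the exact code behind the figures):<br />
<pre class="brush:python">import pandas as pd

def consecutive_down_days(returns, threshold=-0.001):
    """Count consecutive 'down' days (daily return below threshold)."""
    down = returns < threshold
    # restart the running count every time a non-down day occurs
    return down.groupby((~down).cumsum()).cumsum()

rets = pd.Series([0.01, -0.005, -0.02, -0.003, 0.002, -0.01])
print(consecutive_down_days(rets).tolist())  # [0, 1, 2, 3, 0, 1]
</pre>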
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEglZ80s1NRCXsvZJ_eMreZq-z6r-a4R2gMtLGTlsFT8rNIXkkhBzNkYvH8SgmagSu1mMGsHxDFr4O9ZszldTV2cX0k_cQ_KjEYqjqLqM18THKkzxZGTgliO-CogJTzH7I-DoL629AlLZOS5/s1600/nr_days.png" imageanchor="1"><img border="0" height="213" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEglZ80s1NRCXsvZJ_eMreZq-z6r-a4R2gMtLGTlsFT8rNIXkkhBzNkYvH8SgmagSu1mMGsHxDFr4O9ZszldTV2cX0k_cQ_KjEYqjqLqM18THKkzxZGTgliO-CogJTzH7I-DoL629AlLZOS5/s320/nr_days.png" width="320" /></a><br />
<br />
The return series is near-random, so as one would expect, the chance of 5 or more consecutive down days is low, resulting in a very limited number of occurrences. A low number of occurrences leads to unreliable statistical estimates, so I'll stop at 5.<br />
<br />
Below is a visualisation of next-day returns as a function of the number of down days.<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhRjlKAIb6XAX0iArK7A3U0XhSBDr0gvj7Ye65mIedRaAU5Gw8bykmbrQiPRPP3qHk-r6-TpUC-RJylJpaZAIhDmMQCy0gAWiHFus3_MwyALP3_p3Nsq59LVtmSZqveXUv7B5ZT6Ltlly1v/s1600/next_day_return.png" imageanchor="1"><img border="0" height="213" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhRjlKAIb6XAX0iArK7A3U0XhSBDr0gvj7Ye65mIedRaAU5Gw8bykmbrQiPRPP3qHk-r6-TpUC-RJylJpaZAIhDmMQCy0gAWiHFus3_MwyALP3_p3Nsq59LVtmSZqveXUv7B5ZT6Ltlly1v/s320/next_day_return.png" width="320" /></a><br />
<br />
I've also plotted the 90% confidence interval of the returns between the lines. It turns out that the average return *is* positively correlated with the number of down days. Hypothesis <b>confirmed</b>.<br />
<br />
However, you can clearly see that this extra <i>alpha</i> is very small compared to the band of probable return outcomes. But even a tiny edge can be exploited (find a statistical advantage and repeat it as often as possible). The next step is to investigate whether this edge can be turned into a trading strategy.<br />
<br />
Given the data above, a trading strategy can be formulated:<br />
<b>After 3 or more consecutive losses, go long. Exit on the next close.</b><br />
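In pandas the rule can be sketched like this (an illustration under my assumptions, not the exact research code):<br />
<pre class="brush:python">import pandas as pd

def rebound_strategy(returns, min_down=3, threshold=-0.001):
    """Returns earned by going long the day after min_down consecutive losses."""
    down = returns < threshold
    run = down.groupby((~down).cumsum()).cumsum()  # consecutive down-day count
    signal = (run >= min_down).shift(1, fill_value=False)  # enter on the next day
    return returns.where(signal, 0.0)
</pre>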
<br />
Below is a result of this strategy compared to pure buy-and-hold.<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi_c6B0NkDQaq9wSLvhNe05hO5KSY8p9wzpwmUoEpWutDUWCeb_mj0VZqP4XQJXBumAenkDy4pcyo2YRIVPHd56pQ0EAUVMkmdAGT194ZZ8_AvkCpMlhcAoW5PDjFIqITA31NofOWQ4ZoUr/s1600/strat.png" imageanchor="1"><img border="0" height="213" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi_c6B0NkDQaq9wSLvhNe05hO5KSY8p9wzpwmUoEpWutDUWCeb_mj0VZqP4XQJXBumAenkDy4pcyo2YRIVPHd56pQ0EAUVMkmdAGT194ZZ8_AvkCpMlhcAoW5PDjFIqITA31NofOWQ4ZoUr/s320/strat.png" width="320" /></a><br />
This does not look bad at all! Looking at the Sharpe ratios, the strategy scores a decent 2.2 versus 0.44 for the B&H. This is actually pretty good! (don't get too excited though, as I did not account for commission costs, slippage etc.)<br />
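For reference, the Sharpe ratios quoted here are the usual annualised daily-return kind, which boils down to (risk-free rate assumed zero):<br />
<pre class="brush:python">import numpy as np

def sharpe(daily_returns, periods=252):
    """Annualised Sharpe ratio of a series of daily returns."""
    r = np.asarray(daily_returns, dtype=float)
    return np.sqrt(periods) * r.mean() / r.std()
</pre>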
<br />
<br />
While the strategy above is not something I would like to trade, simply because of the long time span, the theory itself provokes further thoughts that could produce something useful. If the same principle applies to intraday data, a form of scalping strategy could be built. In the example above I've oversimplified the world a bit by only counting the *number* of down days, without paying attention to the depth of the drawdown. Also, the position exit is just a basic 'next-day close'. There is much to be improved, but the essence in my opinion is this:<br />
<br />
<i>future returns of SPY are influenced by drawdown and drawdown duration over the previous 3 to 5 days.</i><br />
<br />
<br />
<br />
<br />
<br /></div>
sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com7tag:blogger.com,1999:blog-5108179989881744725.post-32408372415809841082014-11-17T15:34:00.000-08:002014-11-17T15:43:41.196-08:00Trading VXX with nearest neighbors prediction<div dir="ltr" style="text-align: left;" trbidi="on">
An experienced trader knows what behavior to expect from the market based on a set of indicators and their interpretation. The latter is often done based on his memory or some kind of model. Finding a good set of indicators and processing their information poses a big challenge. First, one needs to understand which factors are correlated with <i>future </i>prices. Data that does not have any predictive quality only introduces noise and complexity, decreasing strategy performance. Finding good indicators is a science on its own, often requiring a deep understanding of market dynamics. This part of strategy design cannot be easily automated. Luckily, once a good set of indicators has been found, the trader's memory and 'intuition' can easily be replaced with a statistical model, which will likely perform much better, as computers have flawless memory and can make perfect statistical estimations.<br />
<br />
Regarding volatility trading, it took me quite some time to understand what drives its movements. In particular, I'm interested in variables that <i>predict</i> future returns of VXX and XIV. I will not go into a full-length explanation here, but just present a conclusion: my two most valuable indicators for volatility are the term-structure slope and the current volatility premium.<br />
My definition of these two is:<br />
<br />
<ul style="text-align: left;">
<li><i>volatility premium = VIX-realizedVol</i></li>
<li><i>delta (term structure slope) = VIX-VXV</i></li>
</ul>
<div>
<i>VIX & VXV</i> are the forward 1- and 3-month implied volatilities of the S&P 500. <i>realizedVol</i> here is the 10-day realized volatility of SPY, calculated with the Yang-Zhang formula. <i>delta</i> has often been discussed on the <a href="http://vixandmore.blogspot.nl/" target="_blank">VixAndMore </a>blog, while <i>premium</i> is well-known from option trading.</div>
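<div>
As a sketch, the two indicators could be computed along these lines (note: to keep the example short I substitute a simple close-to-close estimator for the Yang-Zhang realized volatility used in the post):<br />
<pre class="brush:python">import numpy as np
import pandas as pd

def vol_indicators(vix, vxv, spy_close, window=10):
    """Volatility premium and term-structure slope (delta)."""
    logret = np.log(spy_close / spy_close.shift(1))
    # close-to-close proxy; the post itself uses the Yang-Zhang estimator
    realized = logret.rolling(window).std() * np.sqrt(252) * 100
    premium = vix - realized      # premium = VIX - realizedVol
    delta = vix - vxv             # delta = VIX - VXV
    return premium, delta
</pre>
</div>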
<div>
<br /></div>
<div>
It makes sense to go short volatility when the <i>premium </i>is high and futures are in contango (<i>delta</i> < 0). This creates a tailwind from both the premium and the daily roll along the term structure in VXX. But this is just a rough estimation. A good trading strategy would combine information from both <i>premium</i> and <i>delta</i> to come up with a prediction of the trading direction in VXX.</div>
<div>
I've been struggling for a very long time to come up with a good way to combine the noisy data from both indicators. I've tried most of the 'standard' approaches, like linear regression and writing a bunch of 'if-thens', but all with only very minor improvements compared to using a single indicator. A good example of such a 'single indicator' strategy with simple rules can be found on the <a href="http://www.tradingtheodds.com/2014/11/ddns-volatility-risk-premium-strategy-revisited-3/" target="_blank">TradingTheOdds </a>blog. It does not look bad, but what can be done with multiple indicators?</div>
<div>
<br /></div>
<div>
I'll start with some out-of-sample VXX data that I got from <a href="http://marketsci.wordpress.com/2012/04/18/free-historical-vxx-data/" target="_blank">MarketSci</a>. Note that this is simulated data from before VXX was created. </div>
<div>
<br /></div>
<div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhPcz2ZUHN5AmzAZOn3qyqQI7ylrFcp81o7bc0W2luP2RMmr4PBf_MEItGDNcgpJknc7nN-8dA4o3zZyNfF48hqH9ER5oTjTexyuSvCSIKPI2VfK_cBiODcrNW0aE3zkHKXtrCAaicoXAFc/s1600/vxx_est.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhPcz2ZUHN5AmzAZOn3qyqQI7ylrFcp81o7bc0W2luP2RMmr4PBf_MEItGDNcgpJknc7nN-8dA4o3zZyNfF48hqH9ER5oTjTexyuSvCSIKPI2VfK_cBiODcrNW0aE3zkHKXtrCAaicoXAFc/s1600/vxx_est.png" height="170" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
The indicators for the same period are plotted below:<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdtZncfgj4pB0XjwOGODVaiP2JCjZRS0xPTTpbFS8_2BUnLFFFp9I031rLwxqCKJQlPKvFDurMrR6qupOrNi7XYKZynJ-pr_pX8YcFqNaqB3IQvgXtYBjyeBMG8nU73YovUwt35hBr3s_1/s1600/indicators.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdtZncfgj4pB0XjwOGODVaiP2JCjZRS0xPTTpbFS8_2BUnLFFFp9I031rLwxqCKJQlPKvFDurMrR6qupOrNi7XYKZynJ-pr_pX8YcFqNaqB3IQvgXtYBjyeBMG8nU73YovUwt35hBr3s_1/s1600/indicators.png" height="170" width="320" /></a></div>
<br />
<br />
If we take one of the indicators (premium in this case) and plot it against future returns of VXX, some correlation can be seen, but the data is extremely noisy:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg6pO4wYT3KyVl41ouGu59OOqFKX1bZoHhOKGtSBGFxXzjcAWHV2hicLpcTVFVwmBaP222TvcnwlL1e8NpVfj4KeLvYwmbXytza99l9UdZ2Dk_nD_ijvn1xUcq8oQXnx6sY77B2-WGVUcRe/s1600/premium-futreturn.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg6pO4wYT3KyVl41ouGu59OOqFKX1bZoHhOKGtSBGFxXzjcAWHV2hicLpcTVFVwmBaP222TvcnwlL1e8NpVfj4KeLvYwmbXytza99l9UdZ2Dk_nD_ijvn1xUcq8oQXnx6sY77B2-WGVUcRe/s1600/premium-futreturn.png" height="170" width="320" /></a></div>
<br />
<br />
Still, it is clear that a negative premium is likely to be followed by positive VXX returns on the next day.<br />
Combining both premium and delta into one model has been a challenge for me, but I always wanted to do a statistical approximation. In essence, for a combination of (delta, premium) I'd like to find all historic values closest to the current values and make an estimation of future returns based on them. A couple of times I've started writing my own nearest-neighbor interpolation algorithms, but every time I had to give up... until I came across the <a href="http://scikit-learn.org/stable/auto_examples/neighbors/plot_regression.html#example-neighbors-plot-regression-py" target="_blank">scikit nearest neighbors regression</a>. It enabled me to quickly build a predictor based on two inputs, and the results are so good that I'm a bit worried I've made a mistake somewhere...<br />
<br />
Here is what I did:<br />
<br />
<ol style="text-align: left;">
<li>create a dataset of [<i>delta,premium</i>] -> [<i>VXX next day return</i>] (in-sample)</li>
<li>create a nearest-neighbor predictor based on the dataset above</li>
<li>trade the strategy (out-of-sample) with the rules:</li>
<ul>
<li>go long if predicted return > 0</li>
<li>go short if predicted return < 0</li>
</ul>
</ol>
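<div>
Using scikit-learn, the three steps above can be sketched as follows (with random placeholder data standing in for the real [delta, premium] history):</div>
<pre class="brush:python">import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 2))                # placeholder [delta, premium]
y = 0.1 * X[:, 1] + rng.normal(0, 0.5, 1000)  # placeholder VXX next-day returns

split = 600                                   # in-sample / out-of-sample boundary
model = KNeighborsRegressor(n_neighbors=80).fit(X[:split], y[:split])

signal = np.sign(model.predict(X[split:]))    # +1 long, -1 short
pnl = np.cumsum(signal * y[split:])           # out-of-sample cumulative pnl
</pre>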
<div>
The strategy could not be simpler...</div>
<div>
<br /></div>
<div>
The results seem extremely good and get better when more neighbors are used for the estimation.</div>
<div>
First, with 10 points, the strategy is excellent in-sample but flat out-of-sample (the red line in the figure below marks the last point in-sample)</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh3Y6ydXsnxM0XCmmMLgKjH8dw8Vp6PNR8ztLsBoBd3nwcfuyz3Ks459U8Asa4-Yzt6bS1QPG4cUpyai7yDs4t1ONjXlR-y_iann-Pw8eIX9_4EFNfubeybpG-nV3cYE_Y2_qYZGTqTRvb5/s1600/pnl_n10.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh3Y6ydXsnxM0XCmmMLgKjH8dw8Vp6PNR8ztLsBoBd3nwcfuyz3Ks459U8Asa4-Yzt6bS1QPG4cUpyai7yDs4t1ONjXlR-y_iann-Pw8eIX9_4EFNfubeybpG-nV3cYE_Y2_qYZGTqTRvb5/s1600/pnl_n10.png" height="170" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div>
Then, performance gets better with 40 and 80 points:</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEge17SGB0NHqZ9ohp7vodsarK6yuuBH0UEs9oICry2vnHg9NvoSvzTy3g-zeNZoTBx656xCvfo0NoELr-xKh_E0nudQgMLPrSs1tp-JGAecGVmk9w6UklluBVdFlLNwXluTMGh8ZEo4WHjg/s1600/pnl_n40.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEge17SGB0NHqZ9ohp7vodsarK6yuuBH0UEs9oICry2vnHg9NvoSvzTy3g-zeNZoTBx656xCvfo0NoELr-xKh_E0nudQgMLPrSs1tp-JGAecGVmk9w6UklluBVdFlLNwXluTMGh8ZEo4WHjg/s1600/pnl_n40.png" height="170" width="320" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQhwhGRxBueAFGptCRbT7vVrPNQjg15Q8EMtvqOTaUsLKYCOfAy5fAj_cM2ED07k-I0Sy3bNmblUiV6qgfcSMVKOFo3_0VW2fLdbL_KDiidC6H1qSBuoyS5zRRMbmeoSLUiFIS6_ZZ2dIQ/s1600/pnl_n80.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQhwhGRxBueAFGptCRbT7vVrPNQjg15Q8EMtvqOTaUsLKYCOfAy5fAj_cM2ED07k-I0Sy3bNmblUiV6qgfcSMVKOFo3_0VW2fLdbL_KDiidC6H1qSBuoyS5zRRMbmeoSLUiFIS6_ZZ2dIQ/s1600/pnl_n80.png" height="170" width="320" /></a></div>
<div>
<br /></div>
<br />
In the last two plots, the strategy seems to perform the same in- and out-of-sample. The Sharpe ratio is around 2.3.<br />
I'm very pleased with the results and have the feeling that I've only been scratching the surface of what is possible with this technique.<br />
<br />
<br />
<br /></div>
</div>
sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com11tag:blogger.com,1999:blog-5108179989881744725.post-73958752530973967642014-07-16T14:41:00.000-07:002014-07-16T14:41:08.078-07:00Simple backtesting module<div dir="ltr" style="text-align: left;" trbidi="on">
<br />
My search for an ideal backtesting tool (my definition of 'ideal' is described in the earlier 'Backtesting dilemmas' posts) did not result in something I could use right away. However, reviewing the available options helped me understand better what I really want. Of the options I've looked at, <a href="https://github.com/ematvey/pybacktest" target="_blank">pybacktest</a> was the one I liked most because of its simplicity and speed. After going through the source code, I got some ideas to make it simpler and a bit more elegant. From there, it was only a small step to writing my own backtester, which is now available in the <a href="http://www.tradingwithpython.com/?page_id=504" target="_blank">TradingWithPython library</a>.<br />
<br />
I have chosen an approach where the backtester contains the functionality that all trading strategies share and that often gets copy-pasted: things like calculating positions and pnl, performance metrics and making plots.<br />
<br />
Strategy-specific functionality, like determining entry and exit points, should be done outside of the backtester. A typical workflow would be:<br />
<i>find entry and exits -> calculate pnl and make plots with backtester -> post-process strategy data</i><br />
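The 'calculate pnl' step in that workflow boils down to something like this (a generic sketch, not the library's actual API):<br />
<pre class="brush:python">import pandas as pd

def daily_pnl(price, position):
    """Pnl of a position series: shares held at the previous close
    times the price change, so a trade earns from the bar after entry."""
    return (position.shift(1) * price.diff()).fillna(0.0)
</pre>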
<br />
At this moment the module is very minimal (take a look at the source <a href="https://code.google.com/p/trading-with-python/source/browse/trunk/lib/backtest.py" target="_blank">here</a>), but in the future I plan on adding profit-target and stop-loss exits and multi-asset portfolios.<br />
<br />
Usage of the backtesting module is shown in this <a href="http://nbviewer.ipython.org/urls/dl.dropboxusercontent.com/u/11352905/notebooks/twp_302b_backtesting.ipynb" target="_blank"><b><span style="color: blue;">example notebook</span></b></a></div>
sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com5tag:blogger.com,1999:blog-5108179989881744725.post-44240169158894275982014-06-07T17:03:00.003-07:002014-06-07T17:04:59.759-07:00Boosting performance with Cython<div dir="ltr" style="text-align: left;" trbidi="on">
<div dir="ltr" style="text-align: left;" trbidi="on">
<br /></div>
Even with my old pc (AMD Athlon II, 3GB ram), I seldom run into performance issues when running vectorized code. But unfortunately there are plenty of cases that cannot be easily vectorized, for example the <i>drawdown</i> function. My implementation of it was extremely slow, so I decided to use it as a test case for speeding things up.
I'll be using the SPY timeseries with ~5k samples as test data.
Here is the original version of my <i>drawdown</i> function (as it is now implemented in the <i>TradingWithPython</i> library):
<br />
<pre class="brush:python">import pandas as pd

def drawdown(pnl):
    """
    calculate max drawdown and duration
    Returns:
        drawdown : vector of drawdown values
        duration : vector of drawdown duration
    """
    cumret = pnl
    highwatermark = [0]
    idx = pnl.index
    drawdown = pd.Series(index=idx)
    drawdowndur = pd.Series(index=idx)
    for t in range(1, len(idx)):
        highwatermark.append(max(highwatermark[t-1], cumret[t]))
        drawdown[t] = highwatermark[t] - cumret[t]
        drawdowndur[t] = 0 if drawdown[t] == 0 else drawdowndur[t-1] + 1
    return drawdown, drawdowndur

%timeit drawdown(spy)
1 loops, best of 3: 1.21 s per loop
</pre>
Hmm, 1.2 seconds is not too speedy for such a simple function.
There are some things here that could be a real drag on performance, such as the list *highwatermark*, which is appended to on every loop iteration. Accessing a Series by its index also involves some processing that is not strictly necessary.
Let's take a look at what happens when this function is rewritten to work with numpy data:
<br />
<pre class="brush:python">import numpy as np

def dd(s):
    ''' simple drawdown function '''
    highwatermark = np.zeros(len(s))
    drawdown = np.zeros(len(s))
    drawdowndur = np.zeros(len(s))
    for t in range(1, len(s)):
        highwatermark[t] = max(highwatermark[t-1], s[t])
        drawdown[t] = highwatermark[t] - s[t]
        drawdowndur[t] = 0 if drawdown[t] == 0 else drawdowndur[t-1] + 1
    return drawdown, drawdowndur

%timeit dd(spy.values)
10 loops, best of 3: 27.9 ms per loop
</pre>
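As an aside, the running maximum itself can be fully vectorized with numpy, which removes the loop for the drawdown depth (the duration column is the part that genuinely resists vectorization). A sketch matching the loop version:
<pre class="brush:python">import numpy as np

def dd_depth(s):
    """Vectorized drawdown depth, equivalent to the loop version's first output."""
    a = s.astype(float)
    a[0] = 0.0                            # the loop starts with highwatermark = 0
    highwatermark = np.maximum.accumulate(a)  # running maximum in one pass
    depth = highwatermark - s
    depth[0] = 0.0                        # the loop never touches index 0
    return depth
</pre>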
Well, this is <b>much</b> faster than the original function, approximately a 40x speed increase. Still, there is much room for improvement by moving to compiled code with <i><a href="http://cython.org/">cython</a></i>. Below I rewrite the dd function from above, using optimisation tips that I found in the <a href="http://docs.cython.org/src/tutorial/numpy.html">cython tutorial</a>. Note that this is my first ever try at optimizing functions with Cython.
<br />
<pre class="brush:python">%%cython
import numpy as np
cimport numpy as np
cimport cython

DTYPE = np.float64
ctypedef np.float64_t DTYPE_t

@cython.boundscheck(False)  # turn off bounds-checking for the entire function
def dd_c(np.ndarray[DTYPE_t] s):
    ''' simple drawdown function '''
    cdef np.ndarray[DTYPE_t] highwatermark = np.zeros(len(s), dtype=DTYPE)
    cdef np.ndarray[DTYPE_t] drawdown = np.zeros(len(s), dtype=DTYPE)
    cdef np.ndarray[DTYPE_t] drawdowndur = np.zeros(len(s), dtype=DTYPE)
    cdef int t
    for t in range(1, len(s)):
        highwatermark[t] = max(highwatermark[t-1], s[t])
        drawdown[t] = highwatermark[t] - s[t]
        drawdowndur[t] = 0 if drawdown[t] == 0 else drawdowndur[t-1] + 1
    return drawdown, drawdowndur

%timeit dd_c(spy.values)
10000 loops, best of 3: 121 µs per loop
</pre>
Wow, this version runs in 121 <i>micro</i>seconds, making it <b>ten thousand</b> times faster than my original version!
I must say that I'm very impressed by what the Cython and IPython teams have achieved! The speed combined with the ease of use is just awesome!<br />
P.S. I used to do code optimisations in Matlab using pure C and .mex wrapping; it was all just a pain in the ass compared to this.
</div>
sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com1tag:blogger.com,1999:blog-5108179989881744725.post-38794471731926038082014-05-27T15:17:00.000-07:002014-05-27T15:17:45.826-07:00Backtesting dilemmas: pyalgotrade review<p>Ok, moving on to the next contestant: <a href="http://gbeced.github.io/pyalgotrade/"><strong>PyAlgoTrade</strong></a></p>
<p>First impression: actively developed, pretty good documentation, more than enough features (TA indicators, optimizers etc). Looks good, so I went on with the installation, which also went smoothly. </p>
<p>The tutorial seems to be a little bit out of date, as the first command <code>yahoofinance.get_daily_csv()</code> throws an error about an unknown function. No worries, the documentation is up to date and I find that the missing function has been renamed to <code>yahoofinance.download_daily_bars(symbol,year,csvFile)</code>. The problem is that this function only downloads data for <em>one</em> year instead of everything from that year to the current date. So pretty useless. <br>
After I downloaded the data myself and saved it to csv, I needed to adjust the column names, because pyalgotrade apparently expects <code>Date,Adj Close,Close,High,Low,Open,Volume</code> in the header. That is all minor trouble.</p>
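<p>The column fix itself is a one-liner in pandas; something along these lines (a sketch with made-up data):</p>
<pre class="brush:python">import pandas as pd

# made-up one-row dataset, just to show the reordering
df = pd.DataFrame({'Open': [100.0], 'High': [101.0], 'Low': [99.0],
                   'Close': [100.5], 'Adj Close': [100.5], 'Volume': [1000]},
                  index=pd.to_datetime(['2014-05-27']))
df.index.name = 'Date'
# header order pyalgotrade expects: Date,Adj Close,Close,High,Low,Open,Volume
csv_text = df[['Adj Close', 'Close', 'High', 'Low', 'Open', 'Volume']].to_csv()
</pre>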
<p>Following through to performance testing with the SMA strategy provided in the tutorial. My dataset consists of 5370 days of SPY:</p>
<pre>%timeit myStrategy.run()
1 loops, best of 3: 1.2 s per loop
</pre>
<p>That is actually pretty good for an event-based framework. </p>
<p>But then I tried searching the documentation for the functionality needed to backtest spreads and multiple-asset portfolios, and just could not find any. Then I tried to find a way to feed a pandas DataFrame as input to a strategy, and that happens to be <a href="https://github.com/gbeced/pyalgotrade/issues/4">not possible</a>, which is again a big disappointment. I did not state it as a requirement in the previous post, but I now come to the realisation that <a href="http://pandas.pydata.org/">pandas</a> support is a must for any framework that works with time series data. Pandas was a reason for me to switch from Matlab to Python, and I never want to go back. </p>
<p><strong>Conclusion</strong>: pyalgotrade does not meet my requirement for flexibility. It looks like it was designed with classic TA and single-instrument trading in mind. I don’t see it as a good tool for backtesting strategies that involve multiple assets, hedging etc. </p>sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com1tag:blogger.com,1999:blog-5108179989881744725.post-50466778021293620682014-05-26T14:51:00.000-07:002014-05-27T15:18:12.250-07:00Backtesting dilemmas<div dir="ltr" style="text-align: left;" trbidi="on">
A quantitative trader faces quite some challenges on the way to a successful trading strategy. Here I’ll discuss a couple of dilemmas involved in backtesting. A good trading simulation must:<br />
<ol>
<li>Be a good approximation of the real world. This is of course the most important requirement.</li>
<li>Allow unlimited flexibility: the tooling should not stand in the way of testing out-of-the-box ideas. Everything that can be quantified should be usable.</li>
<li>Be easy to implement & maintain. It is all about productivity and being able to test many ideas to find one that works.</li>
<li>Allow for parameter scans, walk-forward testing and optimisations. This is needed for investigating strategy performance and stability depending on strategy parameters.</li>
</ol>
The problem with satisfying all of the requirements above is that #2 and #3 conflict. There is no tool that can do everything without the cost of high complexity (= low maintainability). Typically, a third-party point-and-click tool will severely limit the freedom to test custom signals and odd portfolios, while at the other end of the spectrum a custom-coded diy solution will require tens of hours or more to implement, with high chances of ending up with cluttered and unreadable code. So in an attempt to combine the best of both worlds, let’s start somewhere in the middle: use an existing backtesting framework and adapt it to our taste.<br />
In the following posts I’ll be looking at three possible candidates I’ve found:<br />
<ul>
<li><a href="https://github.com/quantopian/zipline">Zipline</a> is widely known and is the engine behind Quantopian</li>
<li><a href="http://gbeced.github.io/pyalgotrade/">PyAlgotrade</a> seems to be actively developed and well-documented</li>
<li><a href="https://github.com/ematvey/pybacktest">pybacktest</a> is a light-weight vector-based framework that might be interesting because of its simplicity and performance.</li>
</ul>
I’ll be looking at the suitability of these tools by benchmarking them against a hypothetical trading strategy. If none of these options fits my requirements, I will have to decide whether I want to invest in writing my own framework (at least by looking at the available options I’ll know what does <em>not</em> work) or stick with custom code for each strategy.<br />
The first one up for evaluation is <strong>Zipline</strong>. <br />
My first impression of Zipline and <a href="https://www.quantopian.com/">Quantopian </a> is a positive one. Zipline is backed by a team of developers and is tested in production, so the code quality should be good (few bugs). There is good documentation on the <a href="https://www.quantopian.com/help#api-doco">site</a> and an example <a href="http://nbviewer.ipython.org/github/twiecki/financial-analysis-python-tutorial/blob/master/3.%20Backtesting%20using%20Zipline.ipynb">notebook on github</a>. <br />
To get the hang of it, I downloaded the example notebook and started playing with it. To my disappointment, I quickly ran into trouble with the first example, <em>Simplest Zipline Algorithm: Buy Apple</em>. The dataset has only 3028 days, but running this example just took forever. Here is what I measured:<br />
<pre>dma = DualMovingAverage()
%timeit perf = dma.run(data)
1 loops, best of 3: 52.5 s per loop
</pre>
I did not expect stellar performance, as zipline is an event-based backtester, but almost a minute for 3000 samples is just too slow. This kind of performance would be prohibitive for any kind of scan or optimization. Another problem would arise when working with larger datasets, like intraday data or multiple securities, which can easily contain hundreds of thousands of samples.<br />
Unfortunately, I will have to drop Zipline from the list of usable backtesters, as it misses my requirement #4 by a fat margin. <br />
In the following post I will be looking at PyAlgotrade.<br />
<em>Note: My current system is a couple of years old, running an AMD Athlon II X2 @2800MHz with 3GB of RAM. With vector-based backtesting I’m used to calculation times of less than a second for a single backtest and a minute or two for a parameter scan. A basic walk-forward test with 10 steps and a parameter scan on a 20x20 grid would result in a whopping 66 hours with zipline. I’m not that patient.</em></div>
sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com3tag:blogger.com,1999:blog-5108179989881744725.post-31259189328077413012014-01-15T14:20:00.002-08:002014-01-15T14:20:58.213-08:00Starting IPython notebook from windows file explorer<div dir="ltr" style="text-align: left;" trbidi="on">
<div dir="ltr" style="text-align: left;" trbidi="on">
<div dir="ltr" style="text-align: left;" trbidi="on">
I organize my IPython notebooks by saving them in different directories. This however brings an inconvenience, because to access the notebooks I need to open a terminal and type '<i>ipython notebook --pylab=inline' </i> each and every time. I'm sure the ipython team will solve this in the long run, but in the meantime there is a pretty decent way to quickly access the notebooks from the file explorer.<br />
<br />
All you need to do is add a context-menu item that starts the ipython server in your desired directory:<br />
<br />
<br /></div>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhWRL46CkGpCw6K6RyIs8ZfUeNRS4sra9DfXndTUr7BHUfs1Dsf4vWettQFCyZVZnpLrttczsJaWFvV9EAuAnEiCp9DztaCFSMxDHbTOWXvtvaDcTzd2l_C7iuPPDlgIlUOn-mmSbJyPmRY/s1600/launch_ipython_server.png" imageanchor="1"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhWRL46CkGpCw6K6RyIs8ZfUeNRS4sra9DfXndTUr7BHUfs1Dsf4vWettQFCyZVZnpLrttczsJaWFvV9EAuAnEiCp9DztaCFSMxDHbTOWXvtvaDcTzd2l_C7iuPPDlgIlUOn-mmSbJyPmRY/s320/launch_ipython_server.png" /></a><br />
<br />
A quick way to add the context item is by running this <a href="https://dl.dropboxusercontent.com/u/11352905/registerIpythonNotebook.reg">registry patch</a>. (<b>Note</b>: the patch assumes that your python installation is located in C:\Anaconda. If not, you’ll need to open the .reg file in a text editor and set the right path on the last line).<br />
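In case the patch link goes stale, the .reg file contains essentially the following (the key names and the command line are my reconstruction of such a patch, not its exact contents; adjust the Anaconda path as needed):<br />
<pre>Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\Directory\Background\shell\IPython notebook]
@="Open IPython notebook here"

[HKEY_CLASSES_ROOT\Directory\Background\shell\IPython notebook\command]
@="cmd.exe /k cd /d \"%V\" && C:\\Anaconda\\Scripts\\ipython.exe notebook --pylab=inline"
</pre>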
<br />
Instructions on adding the registry keys manually can be found on <a href="http://flothesof.github.io/IPythonNotebook-shortcut-Windows7-explorer.html">Frolian's blog</a>.</div>
</div>
sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com0tag:blogger.com,1999:blog-5108179989881744725.post-56680814932240150682014-01-13T14:56:00.000-08:002014-01-14T14:22:11.254-08:00Leveraged ETFs in 2013, where is your decay now?<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
<span style="font-family: Helvetica Neue, Helvetica, Arial, sans-serif;"><span style="font-size: 14px; line-height: 20px;">Many people think that leveraged etfs underperform their benchmarks in the long term. This is true for choppy markets, but not in trending conditions, either up or down. Leverage only affects the <i>most likely</i> outcome, not the <i>expected</i> outcome. For more background please read <a href="http://matlab-trading.blogspot.nl/2011/05/guess-what-leveraged-etfs-dont-decay.html">this post</a>.</span></span></div>
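A small Monte Carlo sketch of my own (not from the linked post) makes the point concrete: in a choppy, zero-drift market a daily-rebalanced 2x fund keeps roughly the same <i>mean</i> terminal value as the unleveraged fund, while its <i>median</i> terminal value lags behind.

```python
import numpy as np

# Daily-rebalanced 2x fund vs its 1x benchmark in a choppy, zero-drift market.
# Illustrative assumptions: 250 trading days, 2% daily volatility, no costs.
rng = np.random.default_rng(42)
r = rng.normal(0.0, 0.02, size=(10000, 250))  # simulated daily benchmark returns

nav_1x = (1 + r).prod(axis=1)      # terminal value of 1$ in the benchmark
nav_2x = (1 + 2 * r).prod(axis=1)  # terminal value of 1$ in the 2x fund
# Both means stay near 1$, but the 2x median drops well below the 1x median:
# the 'decay' only shows up in the most likely outcome, not the expected one.
```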
<div style="text-align: justify;">
<span style="font-family: Helvetica Neue, Helvetica, Arial, sans-serif;"><span style="font-size: 14px; line-height: 20px;"><br /></span></span></div>
<div style="text-align: justify;">
<span style="font-family: Helvetica Neue, Helvetica, Arial, sans-serif;"><span style="font-size: 14px; line-height: 20px;">2013 has been a very good year for stocks, which trended up for most of the year. Let's see what would happen if we shorted some of the leveraged etfs exactly a year ago and hedged them with their benchmark. </span></span><br />
<span style="font-family: Helvetica Neue, Helvetica, Arial, sans-serif;"><span style="font-size: 14px; line-height: 20px;">Knowing the leveraged etf behavior I would expect that leveraged etfs outperformed their benchmark, so the strategy that would try to profit from the decay would lose money.</span></span><br />
<span style="font-family: Helvetica Neue, Helvetica, Arial, sans-serif;"><span style="font-size: 14px; line-height: 20px;"><br /></span></span>
<span style="font-family: Helvetica Neue, Helvetica, Arial, sans-serif;"><span style="font-size: 14px; line-height: 20px;">I will be considering these pairs:</span></span><br />
<span style="font-family: Helvetica Neue, Helvetica, Arial, sans-serif;"><span style="font-size: 14px; line-height: 20px;"><br /></span></span>
<span style="font-family: Helvetica Neue, Helvetica, Arial, sans-serif;"><span style="font-size: 14px; line-height: 20px;">SPY 2 SSO -1 </span></span><br />
<span style="font-family: Helvetica Neue, Helvetica, Arial, sans-serif;"><span style="font-size: 14px; line-height: 20px;">SPY -2 SDS -1</span></span><br />
<span style="font-family: Helvetica Neue, Helvetica, Arial, sans-serif;"><span style="font-size: 14px; line-height: 20px;">QQQ 2 QLD -1</span></span><br />
<span style="font-family: Helvetica Neue, Helvetica, Arial, sans-serif;"><span style="font-size: 14px; line-height: 20px;">QQQ -2 QID -1</span></span><br />
<span style="font-family: 'Helvetica Neue', Helvetica, Arial, sans-serif; font-size: 14px; line-height: 20px;">IYF -2 SKF -1</span><br />
<span style="font-family: Helvetica Neue, Helvetica, Arial, sans-serif;"><span style="font-size: 14px; line-height: 20px;"><br /></span></span>
<span style="font-family: Helvetica Neue, Helvetica, Arial, sans-serif;"><span style="font-size: 14px; line-height: 20px;">Each leveraged etf is held short (-1 $) and hedged with a 1x etf. Notice that to hedge an inverse etf, a negative position is held in the 1x etf.</span></span><br />
<span style="font-family: Helvetica Neue, Helvetica, Arial, sans-serif;"><span style="font-size: 14px; line-height: 20px;"><br /></span></span>
<span style="font-family: Helvetica Neue, Helvetica, Arial, sans-serif;"><span style="font-size: 14px; line-height: 20px;">Here is one example: SPY vs SSO. </span></span><br />
<span style="font-family: Helvetica Neue, Helvetica, Arial, sans-serif;"><span style="font-size: 14px; line-height: 20px;">Once we normalize the prices to 100$ at the beginning of the backtest period (250 days), it is apparent that the 2x etf outperforms the 1x etf.</span></span><br />
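The normalization and the hedged-pair P&amp;L can be sketched in a few lines of pandas. The prices below are simulated for illustration only; in the real backtest they would be downloaded SPY and SSO closes.

```python
import numpy as np
import pandas as pd

# Simulated prices standing in for downloaded SPY/SSO closes (250 days).
rng = np.random.default_rng(0)
r = rng.normal(0.0005, 0.01, 250)
px = pd.DataFrame({'SPY': 100 * (1 + r).cumprod(),
                   'SSO': 100 * (1 + 2 * r).cumprod()})

norm = 100 * px / px.iloc[0]  # normalize both to 100$ at the start of the period

# Hedged pair 'SPY 2 SSO -1': short 1$ of the 2x etf, hedge with 2$ of the benchmark.
ret = px.pct_change().fillna(0)
pnl = (2 * ret['SPY'] - ret['SSO']).cumsum()  # constant-capital daily P&L in $
```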
<span style="font-family: Helvetica Neue, Helvetica, Arial, sans-serif;"><span style="font-size: 14px; line-height: 20px;"><br /></span></span>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQ5SFUrfZBXTR8TW8icbnzOjnA70MYR5aYuhpDfJgCXsPyZeiC3RuazctBrXWJzb46m-YkZAI45nNOGtFoBH_zIF4YlLKX9e91w3wQlo1RS7hpgECcW_JBqlFaiGYv0U8XG5dOQvePoTEV/s1600/leveraged_pair.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="213" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQ5SFUrfZBXTR8TW8icbnzOjnA70MYR5aYuhpDfJgCXsPyZeiC3RuazctBrXWJzb46m-YkZAI45nNOGtFoBH_zIF4YlLKX9e91w3wQlo1RS7hpgECcW_JBqlFaiGYv0U8XG5dOQvePoTEV/s320/leveraged_pair.png" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
Now the results of the backtest on the pairs above:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg6PJM64Ok86qgToemjo7tIERLjVGebGIqpecCLwgrMOlbQ0Bou6MTksngMeQ-qrNi43ENwBh3WBkh_-32TSkEQr5_vce27TuGFcajzOZki_z0YkUF0TVhaclc_Nv-gBUyxBBfVSzodE2jL/s1600/leveraged_pnl.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="213" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg6PJM64Ok86qgToemjo7tIERLjVGebGIqpecCLwgrMOlbQ0Bou6MTksngMeQ-qrNi43ENwBh3WBkh_-32TSkEQr5_vce27TuGFcajzOZki_z0YkUF0TVhaclc_Nv-gBUyxBBfVSzodE2jL/s320/leveraged_pnl.png" width="320" /></a></div>
All the 2x etfs (including inverse) have outperformed their benchmark over the course of 2013. According to expectations, the strategy exploiting 'beta decay' would not be profitable.<br />
<br />
I would think that playing leveraged etfs against their unleveraged counterparts does not provide any edge, unless you know the market conditions beforehand (trending or range-bound). But if you do know the coming market regime, there are much easier ways to profit from it. Unfortunately, nobody has yet been really successful at predicting the market regime, even in the very short term.<br />
<br />
<br />
<i>Full source code of the calculations is available for the subscribers of the <a href="http://www.tradingwithpython.com/">Trading With Python</a> course. Notebook #307</i><br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<span style="font-family: Helvetica Neue, Helvetica, Arial, sans-serif;"><span style="font-size: 14px; line-height: 20px;"><br /></span></span></div>
</div>
sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com5tag:blogger.com,1999:blog-5108179989881744725.post-69677006315100488182014-01-02T05:31:00.001-08:002014-01-02T06:25:53.828-08:00Putting a price tag on TWTR<div dir="ltr" style="text-align: left;" trbidi="on">
Here is my shot at Twitter valuation. I'd like to start with a disclaimer: at this moment a large portion of my portfolio consists of a short TWTR position, so my opinion is rather skewed. The reason I've done my own analysis is that my bet did not work out well, and Twitter made a parabolic move in December 2013. So the question that I'm trying to answer here is "should I take my loss or hold on to my shorts?".<br />
<br />
At the time of writing, TWTR trades around the $64 mark, with a market cap of 34.7 B$. Up till now the company has not made any profit, losing 142M$ in 2013 after making 534M$ in revenues. The last two numbers give us yearly company spending of 676M$.<br />
<br />
<h3 style="text-align: left;">
Price derived from user value</h3>
Twitter can be compared with Facebook, Google and LinkedIn to get an idea of user numbers and their values. The table below summarises user numbers per company and a value per user derived from the market cap. (source for number of users: Wikipedia; the number for Google is based on the number of unique searches)<br />
<table border="1" class="dataframe" style="background-color: white; border-collapse: collapse; border-spacing: 0px; border: 1px solid black; color: black; font-family: 'Helvetica Neue', Helvetica, Arial, sans-serif; font-size: 14.44444465637207px; line-height: 20px; margin: 1em 2em; max-width: 100%;"><thead>
<tr style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; text-align: right;"><th style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; text-align: left; vertical-align: middle;"></th><th style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; text-align: left; vertical-align: middle;">users [millions]</th><th style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; text-align: left; vertical-align: middle;">user value [$]</th></tr>
</thead><tbody>
<tr style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em;"><th style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">FB</th><td style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">1190</td><td style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">113</td></tr>
<tr style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em;"><th style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">TWTR</th><td style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">250</td><td style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">139</td></tr>
<tr style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em;"><th style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">GOOG</th><td style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">2000</td><td style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">187</td></tr>
<tr style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em;"><th style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">LNKD</th><td style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">259</td><td style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">100</td></tr>
</tbody></table>
It becomes apparent that the market valuation per user is very similar for all of these companies. However, my personal opinion is that:<br />
<div>
<ul style="text-align: left;">
<li>TWTR is currently more valuable per user than FB or LNKD. This is not logical, as both competitors have more valuable personal user data at their disposal. </li>
<li>GOOG has been excelling at extracting ad revenue from its users. To do that, it has a set of highly diversified offerings, from its search engine to Google+, Docs and Gmail. TWTR has nothing resembling that, while its value per user is only 35% lower than that of Google.</li>
<li>TWTR has limited room to grow its user base as it does not offer products comparable to FB or GOOG offerings. TWTR has been around for seven years now and most people wanting an account have got their chance. The rest just don't care.</li>
<li>The TWTR user base is volatile and is likely to move on to the next hot thing when it becomes available.</li>
</ul>
<div>
I think the best reference here would be LNKD, which has a stable niche in the professional market. By this metric TWTR would be <b>overvalued</b>. Setting user value at 100$ for TWTR would produce a fair TWTR <b>price of 46 $.</b></div>
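The 46 $ figure follows directly from the share count implied by the current quote; a quick check with the numbers quoted above:

```python
# Back out the share count from the quoted market cap and price,
# then reprice TWTR at the assumed 100$ per user (numbers from the text).
market_cap = 34.7e9   # $
price = 64.0          # $ per share
users = 250e6
user_value = 100.0    # LNKD-like value per user

shares = market_cap / price  # ~542 million shares
fair_price = users * user_value / shares
print('fair price: %.0f $' % fair_price)  # prints: fair price: 46 $
```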
</div>
<div>
<b><br /></b></div>
<h3 style="text-align: left;">
<b>Price derived from future earnings</b></h3>
<div>
There is plenty of data available on future earnings estimates. One of the most useful sources I've found is <a href="http://blogs.wsj.com/moneybeat/2013/11/06/twitter-ipo-wall-streets-revenue-forecasts/">here</a>.</div>
<div>
Using those numbers while subtracting company spending, which I assume to remain constant, produces these numbers:</div>
<div>
<table border="1" class="dataframe" style="background-color: white; border-collapse: collapse; border-spacing: 0px; border: 1px solid black; color: black; font-family: 'Helvetica Neue', Helvetica, Arial, sans-serif; font-size: 14.44444465637207px; line-height: 20px; margin: 1em 2em; max-width: 100%;"><thead>
<tr style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; text-align: right;"><th style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; text-align: left; vertical-align: middle;"></th><th style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; text-align: left; vertical-align: middle;">banks</th><th style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; text-align: left; vertical-align: middle;">independents</th></tr>
</thead><tbody>
<tr style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em;"><th style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">2013</th><td style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">-51</td><td style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">-43</td></tr>
<tr style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em;"><th style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">2014</th><td style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">292</td><td style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">462</td></tr>
<tr style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em;"><th style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">2015</th><td style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">612</td><td style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">1120</td></tr>
</tbody></table>
</div>
<div>
<b>Net income in M$</b></div>
<div>
<b><br /></b></div>
<div>
Assuming that a healthy company will have a PE ratio of around 30, we can calculate share prices:</div>
<div>
<table border="1" class="dataframe" style="background-color: white; border-collapse: collapse; border-spacing: 0px; border: 1px solid black; color: black; font-family: 'Helvetica Neue', Helvetica, Arial, sans-serif; font-size: 14.44444465637207px; line-height: 20px; margin: 1em 2em; max-width: 100%;"><thead>
<tr style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; text-align: right;"><th style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; text-align: left; vertical-align: middle;"></th><th style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; text-align: left; vertical-align: middle;">banks</th><th style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; text-align: left; vertical-align: middle;">independents</th><th style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; text-align: left; vertical-align: middle;">average</th></tr>
</thead><tbody>
<tr style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em;"><th style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">2013</th><td style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">-2.81</td><td style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">-2.37</td><td style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">-2.59</td></tr>
<tr style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em;"><th style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">2014</th><td style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">16.08</td><td style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">25.45</td><td style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">20.76</td></tr>
<tr style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em;"><th style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">2015</th><td style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">33.71</td><td style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">61.69</td><td style="border-collapse: collapse; border: 1px solid black; margin: 1em 2em; padding: 4px; vertical-align: middle;">47.70</td></tr>
</tbody></table>
</div>
<div>
<b>TWTR price in $ based on PE=30</b></div>
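Each price in the table is just the estimated net income times the PE ratio, divided by the share count implied by the current quote. For example, the 2015 'banks' estimate (the small difference with the table comes from rounding of the share count):

```python
pe = 30
shares = 34.7e9 / 64.0  # share count implied by market cap and quote, ~542 M

net_income_2015_banks = 612e6  # $, from the net income table above
price_2015_banks = net_income_2015_banks * pe / shares
print('%.2f $' % price_2015_banks)  # close to the 33.71 $ in the table
```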
<div>
<b><br /></b></div>
<div>
Again, the average price estimate is around the <b>46-48 $</b> mark, which is where it was around the IPO. The current price of 64$ is around <b>36% too high to be reasonable.</b></div>
<div>
<b><br /></b></div>
<div>
<br /></div>
<h3 style="text-align: left;">
<b>Conclusion</b></h3>
<div>
Based on available information, an <b>optimistic valuation of</b> <b>TWTR</b> should be in the <b>46-48 $</b> range. There are no clear reasons for it to trade higher, and many operational risks that could push it lower.</div>
<div>
My guess is that during the IPO enough professionals have reviewed the price, setting it at a fair price level. What happened next was an irrational market move not justified by new information. Just take a look at the bullish frenzy on <a href="http://stocktwits.com/symbol/TWTR?q=twtr">stocktwits</a>, with people claiming things like 'this bird will fly to $100'. Pure emotion, which never works out well.</div>
<div>
<br /></div>
<div>
The only thing left for me now is to put my money where my mouth is and stick to my shorts. Time will tell.</div>
</div>
sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com3tag:blogger.com,1999:blog-5108179989881744725.post-74106158814537533212013-09-19T14:47:00.001-07:002013-09-19T14:47:53.915-07:00Trading With Python course available!<div dir="ltr" style="text-align: left;" trbidi="on">
The Trading With Python course is now available for <a href="http://tradingwithpython.us7.list-manage2.com/track/click?u=f68da0ba659633adaebb96a78&id=91084d3223&e=2dcafb482c" style="color: #6dc6dd; word-wrap: break-word !important;" target="_self">subscription</a>! I have received very positive feedback from the pilot I held this spring, and this time it is going to be even better. The course is now hosted on a new <a href="http://tradingwithpython.us7.list-manage1.com/track/click?u=f68da0ba659633adaebb96a78&id=f5d1406a0a&e=2dcafb482c" style="color: #6dc6dd; word-wrap: break-word !important;" target="_self">TradingWithPython </a>website,
and the material has been updated and restructured. I even decided to
include new material, adding more trading strategies and ideas.<br />
<br />
For an overview of the included topics please take a look at the course <a href="http://tradingwithpython.us7.list-manage.com/track/click?u=f68da0ba659633adaebb96a78&id=06d2eaf1f2&e=2dcafb482c" style="color: #6dc6dd; word-wrap: break-word !important;" target="_self">contents</a> .<br />
<br /></div>
sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com0tag:blogger.com,1999:blog-5108179989881744725.post-13155584393784727832013-08-18T12:53:00.001-07:002013-09-18T13:57:16.829-07:00Short VXX strategy<div dir="ltr" style="text-align: left;" trbidi="on">
Shorting the short-term volatility etn VXX may seem like a great idea when you look at the chart from quite a distance. Due to the contango in the volatility futures, the etn experiences quite some headwind most of the time and loses a little bit of its value every day. This happens due to daily rebalancing; for more information please look into the prospectus.<br />
In an ideal world, if you hold it long enough, a profit generated by time decay in the futures and etn rebalancing is guaranteed, <b>but</b><i style="font-weight: bold;"> </i>in the short term you'd have to go through some pretty heavy drawdowns. Just look back at the summer of 2011. I was unfortunate (or foolish) enough to hold a short VXX position just before the VIX went up. I almost blew up my account back then: an 80% drawdown in just a couple of days resulting in the threat of a margin call by my broker. A margin call would have meant cashing in the loss. This is not a situation I'd ever like to be in again. I knew it would not be easy to keep a cool head at all times, but experiencing the stress and pressure of the situation was something different. Luckily I knew how VXX tends to behave, so I did not panic, but switched sides to XIV to avoid a margin call. The story ends well: 8 months later my portfolio was back at strength and I had learned a very valuable lesson.<br />
<br />
To start with a word of warning here: <b>do not trade volatility unless you know exactly how much risk you are taking.</b><br />
Having said that, let's take a look at a strategy that minimizes some of the risks by shorting VXX only when it is appropriate.<br />
<br />
<b>Strategy thesis: </b>VXX experiences most drag when the futures curve is in a steep contango. The futures curve is approximated by the VIX-VXV relationship. We will short VXX when VXV has an unusually high premium over VIX.<br />
<br />
First, let's take a look at the VIX-VXV relationship:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQrG9jZJRdyjEPODF9KvSemX-B_984Xs0V5dPtLgBFrjelL2FDFlzqpOfdZigu40OYQDEbfQeGXaCXIVC_HhnrwEgKfRuKVsxEOG7Uv_a3KuLsWiq8podOK09n__orWPDSdbwMriwD9jxw/s1600/vix_vs_vxv.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQrG9jZJRdyjEPODF9KvSemX-B_984Xs0V5dPtLgBFrjelL2FDFlzqpOfdZigu40OYQDEbfQeGXaCXIVC_HhnrwEgKfRuKVsxEOG7Uv_a3KuLsWiq8podOK09n__orWPDSdbwMriwD9jxw/s320/vix_vs_vxv.png" width="320" /></a></div>
<br />
The chart above shows VIX-VXV data since January 2010. Data points from last year are shown in red.<br />
I have chosen to use a quadratic fit between the two, approximating <i>VXV = f(VIX)</i>. The <i>f(VIX)</i> is plotted as a blue line.<br />
The values above the line represent situations where the futures are in stronger than normal contango. Now I define a <i>delta</i> indicator, which is the deviation from the fit: <i>delta = VXV - f(VIX)</i>.<br />
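The fit and the delta indicator take only a few lines with numpy's polynomial tools. This is my sketch of the procedure; <i>vix</i> and <i>vxv</i> would be aligned series of downloaded index closes.

```python
import numpy as np
import pandas as pd

def vxv_fit_delta(vix, vxv):
    """Quadratic fit VXV = f(VIX) and the deviation delta = VXV - f(VIX).

    vix, vxv : aligned pandas Series of index closes.
    """
    coefs = np.polyfit(vix.values, vxv.values, 2)  # 2nd-order polynomial fit
    fitted = np.polyval(coefs, vix.values)
    # delta > 0 : futures curve in stronger-than-normal contango
    delta = pd.Series(vxv.values - fitted, index=vix.index, name='delta')
    return coefs, delta
```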
<i><br /></i>
Now let's take a look at the price of VXX along with delta:<br />
<i><br /></i>
<i><br /></i>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVtuUjQsW4djSGC7p4H5EKJqed8olctHdeWgdU5Kk6EE_mhnahoMPiUXF8Ucsdu_8X_TChMsNV744Jsl1zvxNUiNLvHEq80NvOEfUVf30Lo7VxZ5-xTfzWTcwFwnoiHiS04xHKCntJPPN6/s1600/vxx_vs_delta.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="256" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVtuUjQsW4djSGC7p4H5EKJqed8olctHdeWgdU5Kk6EE_mhnahoMPiUXF8Ucsdu_8X_TChMsNV744Jsl1zvxNUiNLvHEq80NvOEfUVf30Lo7VxZ5-xTfzWTcwFwnoiHiS04xHKCntJPPN6/s320/vxx_vs_delta.png" width="320" /></a></div>
Above: price of VXX on a log scale. Below: delta. Green markers indicate delta &gt; 0, red markers delta &lt; 0.<br />
It is apparent that green areas correspond to negative returns in VXX.<br />
<br />
Let's simulate a strategy with these assumptions:<br />
<br />
<ul style="text-align: left;">
<li>Short VXX when delta > 0</li>
<li>Constant capital (bet on each day is 100$)</li>
<li>No slippage or transaction costs</li>
</ul>
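These rules translate to a short vectorized sketch (my reconstruction, not the course notebook; I shift the signal by one day so the strategy only acts on information available at the close):

```python
import numpy as np
import pandas as pd

def backtest_short_vxx(vxx, delta, capital=100.0):
    """Short capital$ of VXX on days when delta > 0; no slippage or costs.

    vxx, delta : aligned pandas Series (VXX close and the delta indicator).
    Returns the cumulative P&L in $.
    """
    ret = vxx.pct_change().fillna(0)
    signal = (delta > 0).astype(float).shift(1).fillna(0)  # trade on yesterday's delta
    return (-capital * signal * ret).cumsum()              # short: gain when VXX falls
```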
<div>
This strategy is compared with the one that trades short every day, but does not take delta into account.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgCzkVlQqdElEdScODll2QVI6SDhfkANrZyBmxhToz9bRc37g3cUOA8vTACyApxfIbQuCuF8lVTrJswHf10tILwWVRRhXsFMagW8csbGjhzxq2_UVD8G4OYGbwD58IvS8jKH9rDj7KReOE/s1600/short_VXX_pnl.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="256" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgCzkVlQqdElEdScODll2QVI6SDhfkANrZyBmxhToz9bRc37g3cUOA8vTACyApxfIbQuCuF8lVTrJswHf10tILwWVRRhXsFMagW8csbGjhzxq2_UVD8G4OYGbwD58IvS8jKH9rDj7KReOE/s320/short_VXX_pnl.png" width="320" /></a></div>
<div>
The green line represents our VXX short strategy, blue line is the dumb one.<br />
<br />
Metrics:</div>
<pre> Delta>0 Dumb
Sharpe: 1.9 1.2
Max DD: 33% 114% (!!!)
</pre>
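For completeness, the two metrics above can be computed like this (a sketch; both are expressed relative to the constant 100$ bet):

```python
import numpy as np
import pandas as pd

def sharpe(daily_pnl, capital=100.0):
    # Annualized Sharpe ratio of daily P&L on a constant capital bet (risk-free rate 0).
    r = daily_pnl / capital
    return np.sqrt(252) * r.mean() / r.std()

def max_drawdown(cum_pnl, capital=100.0):
    # Largest peak-to-trough drop of the cumulative P&L, as a fraction of the bet.
    peak = cum_pnl.cummax()
    return ((peak - cum_pnl) / capital).max()
```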
<div>
Sharpe of 1.9 for a simple end-of-day strategy is not bad at all in my opinion. But even more important is that the gut-wrenching drawdowns are largely avoided by paying attention to the forward futures curve.<br />
<br />
<b>Building this strategy step-by-step will be discussed during the coming <a href="http://www.tradingwithpython.com/">Trading With Python course. </a></b><br />
<br /></div>
<div>
<br /></div>
</div>
sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com5tag:blogger.com,1999:blog-5108179989881744725.post-28767739584066646742013-08-18T11:46:00.000-07:002013-08-18T11:46:02.268-07:00Getting short volume from BATS<div dir="ltr" style="text-align: left;" trbidi="on">
In my last post I have gone through the steps needed to get the short volume data from the BATS exchange. The code provided was however of the quick-n-dirty variety. I have now packaged everything to <a href="https://code.google.com/p/trading-with-python/source/browse/trunk/lib/bats.py">bats.py </a>module that can be found on google code. (you will need the rest of the TradingWithPython library to run bats.py)<br />
<br />
Usage:<br />
<pre class="brush:python">
import tradingWithPython as twp # main library
import tradingWithPython.lib.bats as bats # bats module
dl = bats.BATS_Data('C:\\batsData') # init with directory to save data
dl.updateDb() # update data
s = dl.loadData() # process zip files
</pre>
<br /></div>sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com0tag:blogger.com,1999:blog-5108179989881744725.post-35223070944126043792013-08-15T15:21:00.001-07:002013-08-15T15:23:23.521-07:00Building an indicator from short volume data<div dir="ltr" style="text-align: left;" trbidi="on">
The price of an asset or an ETF is of course the best indicator there is, but unfortunately there is only so much information contained in it. Some people seem to think that the more indicators (rsi, macd, moving average crossover etc), the better, but if all of them are based on the same underlying price series, they will all contain a subset of the same limited information contained in the price.<br />
We need more information, <i>additional</i> to what is contained in the price, to make a more informed guess about what is going to happen in the near future. An excellent example of combining all sorts of info into a clever analysis can be found on the <a href="http://theshortsideoflong.blogspot.nl/">The Short Side of Long blog</a>. Producing this kind of analysis requires a great amount of work, for which I simply don't have the time as I only trade part-time.<br />
So I built my own 'market dashboard' that automatically collects information for me and presents it in an easily digestible form. In this post I'm going to show how to build an indicator based on short volume data. This post will illustrate the process of data gathering and processing.<br />
<br />
<b>Step 1: Find data source. </b><br />
BATS exchange provides daily volume data for free on their <a href="http://www.batstrading.com/market_data/shortsales/">site</a>.<br />
<br />
<b>Step 2: Get data manually & inspect</b><br />
Short volume data of the BATS exchange is contained in a text file that is zipped. Each day has its own zip file. After downloading and unzipping the txt file, this is what's inside (first several lines):<br />
<br />
<pre>Date|Symbol|Short Volume|Total Volume|Market Center
20111230|A|26209|71422|Z
20111230|AA|298405|487461|Z
20111230|AACC|300|3120|Z
20111230|AAN|3600|10100|Z
20111230|AAON|1875|6156|Z
</pre>
....<br />
<br />
In total a file contains around 6000 symbols.<br />
This data needs quite some work before it can be presented in a meaningful manner.<br />
<br />
<b>Step 3: Automatically get data</b><br />
What I really want is not just the data for one day, but a ratio of short volume to total volume for the past several years, and I don't really feel like downloading 500+ zip files and copy-pasting them in excel manually.<br />
Luckily, full automation is only a couple of code lines away:<br />
First we need to dynamically create an url from which a file will be downloaded:<br />
<br />
<pre class="brush:python">from string import Template
def createUrl(date):
s = Template('http://www.batstrading.com/market_data/shortsales/$year/$month/$fName-dl?mkt=bzx')
fName = 'BATSshvol%s.txt.zip' % date.strftime('%Y%m%d')
url = s.substitute(fName=fName, year=date.year, month='%02d' % date.month)
return url,fName
</pre>
Output:
<br />
<pre>http://www.batstrading.com/market_data/shortsales/2013/08/BATSshvol20130813.txt.zip-dl?mkt=bzx
</pre>
<br />
Now we can download multiple files at once:<br />
<br />
<pre class="brush:python">import os
import urllib  # note: dataDir and dates are defined earlier in the script
for i,date in enumerate(dates):
source, fName = createUrl(date)# create url and file name
dest = os.path.join(dataDir,fName)
if not os.path.exists(dest): # don't download files that are present
print 'Downloading [%i/%i]' %(i,len(dates)), source
urllib.urlretrieve(source, dest)
else:
print 'x',
</pre>
Output:<br />
<pre>Downloading [0/657] http://www.batstrading.com/market_data/shortsales/2011/01/BATSshvol20110103.txt.zip-dl?mkt=bzx
Downloading [1/657] http://www.batstrading.com/market_data/shortsales/2011/01/BATSshvol20110104.txt.zip-dl?mkt=bzx
Downloading [2/657] http://www.batstrading.com/market_data/shortsales/2011/01/BATSshvol20110105.txt.zip-dl?mkt=bzx
Downloading [3/657] http://www.batstrading.com/market_data/shortsales/2011/01/BATSshvol20110106.txt.zip-dl?mkt=bzx
</pre>
<br />
<b>Step 4. Parse downloaded files</b><br />
<br />
We can use zip and pandas libraries to parse a single file:<br />
<pre class="brush:python">import datetime as dt
import numpy as np
import pandas as pd
import zipfile
import StringIO
def readZip(fName):
zipped = zipfile.ZipFile(fName) # open zip file
lines = zipped.read(zipped.namelist()[0]) # unzip and read first file
buf = StringIO.StringIO(lines) # create buffer
df = pd.read_csv(buf,sep='|',index_col=1,parse_dates=False,dtype={'Date':object,'Short Volume':np.float32,'Total Volume':np.float32}) # parse to table
s = df['Short Volume']/df['Total Volume'] # calculate ratio
s.name = dt.datetime.strptime(df['Date'][-1],'%Y%m%d')
return s
</pre>
<br />
It returns a ratio of Short Volume/Total Volume for all symbols in the zip file:
<br />
<pre>Symbol
A 0.531976
AA 0.682770
AAIT 0.000000
AAME 0.000000
AAN 0.506451
AAON 0.633841
AAP 0.413083
AAPL 0.642275
AAT 0.263158
AAU 0.494845
AAV 0.407976
AAWW 0.259511
AAXJ 0.334937
AB 0.857143
ABAX 0.812500
...
ZLC 0.192725
ZLCS 0.018182
ZLTQ 0.540341
ZMH 0.413315
ZN 0.266667
ZNGA 0.636890
ZNH 0.125000
ZOLT 0.472636
ZOOM 0.000000
ZQK 0.583743
ZROZ 0.024390
ZSL 0.482461
ZTR 0.584526
ZTS 0.300384
ZUMZ 0.385345
Name: 2013-08-13 00:00:00, Length: 5859, dtype: float32
</pre>
<b>Step 5: Make a chart:</b><br />
<b><br /></b>
Now the only thing left is to parse all downloaded files, combine them into a single table and plot the result:<br />
<br />
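That combining step can be sketched as follows (a minimal sketch using the current pandas API; <i>readZip</i> and <i>dataDir</i> are assumed to be defined as in the previous steps):

```python
import glob
import os

import pandas as pd

def combineRatios(series_list):
    # stack the per-day Series (name = date, index = symbol)
    # into a date x symbol table, sorted by date
    return pd.concat(series_list, axis=1).T.sort_index()

# parse every downloaded file and plot the cross-sectional average:
# ratios = combineRatios([readZip(f) for f in glob.glob(os.path.join(dataDir, '*.zip'))])
# ratios.mean(axis=1).plot()
```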
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg5jzx6lWJCJSaeTFAQN2Cc_rm1Rsf44NX2IDBKxbJzqAFNLnuR799pyXge8obarBO81LtoXUM5JUn2ASCkCI_r975w7auo21DY0VyOWF2wyLxb3XBMg8E2R2GaVVGXv21TrYFx3RPsavLe/s1600/bats_short.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="256" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg5jzx6lWJCJSaeTFAQN2Cc_rm1Rsf44NX2IDBKxbJzqAFNLnuR799pyXge8obarBO81LtoXUM5JUn2ASCkCI_r975w7auo21DY0VyOWF2wyLxb3XBMg8E2R2GaVVGXv21TrYFx3RPsavLe/s320/bats_short.png" width="320" /></a></div>
In the figure above I have plotted the <i>average</i> short volume ratio for the past two years. I also could have used a subset of symbols if I wanted to take a look at a specific sector or stock. A quick look at the data gives me the impression that high short volume ratios usually correspond to market bottoms, and low ratios seem to be good entry points for a long position.<br />
<br />
Starting from here, this short volume ratio can be used as a basis for strategy development.<br />
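As an illustration, a toy mean-reversion signal on the average ratio could look like this (a hypothetical sketch using the current pandas API, not a tested strategy; the window and threshold are arbitrary):

```python
import pandas as pd

def shortRatioSignal(ratio, window=20, z=1.5):
    # +1 when the ratio spikes above its rolling band, -1 when it dips below
    m = ratio.rolling(window).mean()
    s = ratio.rolling(window).std()
    zscore = (ratio - m) / s
    return (zscore > z).astype(int) - (zscore < -z).astype(int)
```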
<br /></div>
sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com3tag:blogger.com,1999:blog-5108179989881744725.post-40957798886344990502013-03-17T07:07:00.000-07:002013-03-17T07:07:07.676-07:00Trading With Python course - status update<div dir="ltr" style="text-align: left;" trbidi="on">
I am happy to announce that a sufficient number of people have shown interest in taking the course. This means that the course will definitely take place.<br />
Starting today I will be preparing a new website and material for the course, which will start in the second week of April.<br />
<br /></div>
sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com1tag:blogger.com,1999:blog-5108179989881744725.post-47183925360748328272012-01-12T13:02:00.000-08:002012-01-12T13:02:20.173-08:00Reconstructing VXX from CBOE futures dataMany people in the blogosphere have reconstructed the VXX to see how it would have performed before its inception. The procedure to do this is not very complicated and is well described in the VXX prospectus and on the <a href="http://investing.kuchita.com/2011/08/16/how-the-vxx-is-calculated-and-why-backwardation-amplfies-it/" target="_blank">Intelligent Investor Blog</a>. Doing this by hand, however, is very tedious work: it requires downloading data for each future separately, combining them in a spreadsheet, etc. <br />
The scripts below automate this process. The first one, <i>downloadVixFutures.py</i>, gets the data from CBOE, saves each file in a data directory and then combines them into a single csv file, <i>vix_futures.csv</i>.<br />
The second script, <i>reconstructVXX.py</i>, parses <i>vix_futures.csv</i>, calculates the daily returns of VXX and saves the results to <i>reconstructedVXX.csv</i>.<br />
To check the calculations, I've compared my simulated results with the SPVXSTR index data; the two agree pretty well, as the charts below show.<br />
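At the heart of the reconstruction is the linear roll from the first to the second month contract. The weighting formula can be illustrated in isolation (a simplified sketch, with <i>dr</i> = days remaining in the roll period and <i>dt</i> = its total length):

```python
def rollWeights(dr, dt):
    # percentage allocated to the first and the second month contracts
    w1 = 100.0 * dr / dt
    return w1, 100.0 - w1

# halfway through a 20-day roll period the index holds both contracts 50/50
print(rollWeights(10, 20))  # (50.0, 50.0)
```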
<br />
<b>Note: For a fee, I can provide support to get the code running or create a stand-alone program, contact me if you are interested.</b><br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEga9jk564OYT4uac9H4XpvBTzbDTpsld6nRIZm5bUqpd9EOZtqtlWaBohctr6n6UxcEhvoooWOjS1FjizXkmiulZZmH5gxKvo9HUprboMjKLAozMQbjUg40yAc6-wt9VTVeI01Rm2fumy6l/s1600/verify_results.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="238" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEga9jk564OYT4uac9H4XpvBTzbDTpsld6nRIZm5bUqpd9EOZtqtlWaBohctr6n6UxcEhvoooWOjS1FjizXkmiulZZmH5gxKvo9HUprboMjKLAozMQbjUg40yAc6-wt9VTVeI01Rm2fumy6l/s320/verify_results.png" width="320" /></a></div><br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgeZzRe4HmBiU-Nq1ou2y_NLgPi_ohF8qmjo6l96qC6y1_C8JmVO-VJNMpeRoCjWRoKizsr4clLtkrcPdqQtoSqby2xVFhdwl0bLi1FDxnKqydDRnVJLd1v94N-iU8n014D5qg69VOqE_yl/s1600/verify_2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="238" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgeZzRe4HmBiU-Nq1ou2y_NLgPi_ohF8qmjo6l96qC6y1_C8JmVO-VJNMpeRoCjWRoKizsr4clLtkrcPdqQtoSqby2xVFhdwl0bLi1FDxnKqydDRnVJLd1v94N-iU8n014D5qg69VOqE_yl/s320/verify_2.png" width="320" /></a></div><br />
<br />
<br />
--------------------------------source codes--------------------------------------------<br />
<br />
Code for getting futures data from CBOE and combining it into a single table<br />
<i>downloadVixFutures.py</i><br />
<br />
<pre class="brush:python">#-------------------------------------------------------------------------------
# Name:        download CBOE futures
# Purpose:     get VIX futures data from CBOE, process data to a single file
#
# Created:     15-10-2011
# Copyright:   (c) Jev Kuznetsov 2011
# Licence:     BSD
#-------------------------------------------------------------------------------
#!/usr/bin/env python

from urllib import urlretrieve
import os
from pandas import *
import datetime
import numpy as np

m_codes = ['F','G','H','J','K','M','N','Q','U','V','X','Z'] # month codes of the futures
codes = dict(zip(m_codes, range(1, len(m_codes)+1)))

dataDir = os.path.dirname(__file__)+'/data'

def saveVixFutureData(year, month, path, forceDownload=False):
    ''' Get future from CBOE and save to file '''
    fName = "CFE_{0}{1}_VX.csv".format(m_codes[month], str(year)[-2:])
    if os.path.exists(path+'\\'+fName) and not forceDownload:
        print 'File already downloaded, skipping'
        return

    urlStr = "http://cfe.cboe.com/Publish/ScheduledTask/MktData/datahouse/{0}".format(fName)
    print 'Getting: %s' % urlStr
    try:
        urlretrieve(urlStr, path+'\\'+fName)
    except Exception as e:
        print e

def buildDataTable(dataDir):
    """ create single data sheet """
    files = os.listdir(dataDir)

    data = {}
    for fName in files:
        print 'Processing: ', fName
        try:
            df = DataFrame.from_csv(dataDir+'/'+fName)
            code = fName.split('.')[0].split('_')[1]
            month = '%02d' % codes[code[0]]
            year = '20'+code[1:]
            newCode = year+'_'+month
            data[newCode] = df
        except Exception as e:
            print 'Could not process:', e

    full = DataFrame()
    for k, df in data.iteritems():
        s = df['Settle']
        s.name = k
        s[s&lt;5] = np.nan
        if len(s.dropna()) &gt; 0:
            full = full.join(s, how='outer')
        else:
            print s.name, ': Empty dataset.'

    full[full&lt;5] = np.nan
    full = full[sorted(full.columns)]

    # use only data after this date
    startDate = datetime.datetime(2008,1,1)
    idx = full.index &gt;= startDate
    full = full.ix[idx,:]
    #full.plot(ax=gca())

    print 'Saving vix_futures.csv'
    full.to_csv('vix_futures.csv')

if __name__ == '__main__':

    if not os.path.exists(dataDir):
        print 'creating data directory %s' % dataDir
        os.makedirs(dataDir)

    for year in range(2008, 2013):
        for month in range(12):
            print 'Getting data for {0}/{1}'.format(year, month+1)
            saveVixFutureData(year, month, dataDir)

    print 'Raw data was saved to {0}'.format(dataDir)
    buildDataTable(dataDir)
</pre><br />
Code for reconstructing the VXX <br />
<i>reconstructVXX.py</i><br />
<pre class="brush:python">"""
Reconstructing VXX from futures data

author: Jev Kuznetsov
License : BSD
"""
from __future__ import division
from pandas import *
import numpy as np

class Future(object):
    """ vix future class, used to keep data structures simple """
    def __init__(self, series, code=None):
        """ code is optional, example '2010_01' """
        self.series = series.dropna()  # price data
        self.settleDate = self.series.index[-1]
        self.dt = len(self.series)     # roll period (this is default, should be recalculated)
        self.code = code               # string code 'YYYY_MM'

    def monthNr(self):
        """ get month nr from the future code """
        return int(self.code.split('_')[1])

    def dr(self, date):
        """ days remaining before settlement, on a given date """
        return sum(self.series.index &gt; date)

    def price(self, date):
        """ price on a date """
        return self.series.get_value(date)

def returns(df):
    """ daily return """
    return df/df.shift(1) - 1

def reconstructVXX():
    """
    calculate VXX returns
    needs a previously preprocessed file vix_futures.csv
    """
    X = DataFrame.from_csv('vix_futures.csv')  # raw data table

    # build end dates list &amp; futures classes
    futures = []
    codes = X.columns
    endDates = []
    for code in codes:
        f = Future(X[code], code=code)
        print code, ':', f.settleDate
        endDates.append(f.settleDate)
        futures.append(f)

    endDates = np.array(endDates)

    # set roll period of each future
    for i in range(1, len(futures)):
        futures[i].dt = futures[i].dr(futures[i-1].settleDate)

    # Y is the result table
    idx = X.index
    Y = DataFrame(index=idx, columns=['first','second','days_left','w1','w2','ret'])

    # W is the weight matrix
    W = DataFrame(data=np.zeros(X.values.shape), index=idx, columns=X.columns)

    # for VXX calculation see http://www.ipathetn.com/static/pdf/vix-prospectus.pdf
    # page PS-20
    for date in idx:
        i = np.nonzero(endDates &gt;= date)[0][0]  # find the first not yet expired future
        first = futures[i]     # first month futures class
        second = futures[i+1]  # second month futures class

        dr = first.dr(date)    # number of remaining dates in the first futures contract
        dt = first.dt          # number of business days in roll period

        W.set_value(date, codes[i], 100*dr/dt)
        W.set_value(date, codes[i+1], 100*(dt-dr)/dt)

        # this is all just debug info
        Y.set_value(date, 'first', first.price(date))
        Y.set_value(date, 'second', second.price(date))
        Y.set_value(date, 'days_left', first.dr(date))
        Y.set_value(date, 'w1', 100*dr/dt)
        Y.set_value(date, 'w2', 100*(dt-dr)/dt)

    valCurr = (X*W.shift(1)).sum(axis=1)           # value on day N
    valYest = (X.shift(1)*W.shift(1)).sum(axis=1)  # value on day N-1

    Y['ret'] = valCurr/valYest - 1  # index return on day N
    return Y

##-------------------Main script---------------------------
Y = reconstructVXX()
print Y.head(30)
Y.to_csv('reconstructedVXX.csv')
</pre>sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com9tag:blogger.com,1999:blog-5108179989881744725.post-6812393452040047682011-12-26T04:20:00.000-08:002011-12-26T04:20:18.295-08:00howto: Observer patternThe <a href="http://en.wikipedia.org/wiki/Observer_pattern">observer pattern</a> comes in very handy when dealing with complex systems. It allows for class-to-class communication with a very simple structure. Even more important is the ability to separate functionality into different modules, for example running a single 'broker' as a wrapper around some API and letting multiple strategies subscribe to relevant broker events. There are some ready-made modules available, but the best way to understand how this process works is to write the whole system from scratch. In many languages this is a very tedious task, but thanks to the power of Python it only takes a couple of lines.<br />
<br />
The following example code creates a <i>Sender</i> class (named Alice). Sender keeps track of interested listeners and notifies them accordingly. In more detail, this is achieved by a dictionary containing a function-event mapping, Sender.listeners.<br />
A listener can be of any type; here I create a bunch of <i>ExampleListener</i> instances, named Bob, Dave & Charlie. Each has a method that is subscribed to <i>Sender</i>. The only special thing about the subscribed method is that it should accept three parameters: <i>sender, event, message</i>. Sender is a reference to the <i>Sender</i> instance, so a listener knows who sent the message. Event is an identifier, for which I usually use a string. Optionally, message is the data that is passed to the function.<br />
A nice detail is that if a listener method throws an exception, it is automatically unsubscribed from further events.<br />
<br />
<br />
<pre class="brush:python">'''
Created on 26 dec. 2011
Copyright: Jev Kuznetsov
License: BSD

sender-receiver pattern.
'''

import tradingWithPython.lib.logger as logger
import types

class Sender(object):
    """
    Sender -&gt; dispatches messages to interested callables
    """
    def __init__(self):
        self.listeners = {}
        self.logger = logger.getLogger()

    def register(self, listener, events=None):
        """
        register a listener function

        Parameters
        -----------
        listener : external listener function
        events : tuple or list of relevant events (default=None)
        """
        if events is not None and type(events) not in (types.TupleType, types.ListType):
            events = (events,)

        self.listeners[listener] = events

    def dispatch(self, event=None, msg=None):
        """ notify listeners """
        for listener, events in self.listeners.items():
            if events is None or event is None or event in events:
                try:
                    listener(self, event, msg)
                except (Exception,):
                    self.unregister(listener)
                    errmsg = "Exception in message dispatch: Handler '{0}' unregistered for event '{1}' ".format(listener.func_name, event)
                    self.logger.exception(errmsg)

    def unregister(self, listener):
        """ unregister listener function """
        del self.listeners[listener]

#---------------test functions--------------
class ExampleListener(object):
    def __init__(self, name=None):
        self.name = name

    def method(self, sender, event, msg=None):
        print "[{0}] got event {1} with message {2}".format(self.name, event, msg)

if __name__ == "__main__":
    print 'demonstrating event system'

    alice = Sender()
    bob = ExampleListener('bob')
    charlie = ExampleListener('charlie')
    dave = ExampleListener('dave')

    # add subscribers to messages from alice
    alice.register(bob.method, events='event1')      # listen to 'event1'
    alice.register(charlie.method, events='event2')  # listen to 'event2'
    alice.register(dave.method)                      # listen to all events

    # dispatch some events
    alice.dispatch(event='event1')
    alice.dispatch(event='event2', msg=[1,2,3])
    alice.dispatch(msg='attention to all')

    print 'Done.'
</pre>sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com0tag:blogger.com,1999:blog-5108179989881744725.post-76900262134363105972011-12-14T14:22:00.000-08:002011-12-14T14:23:01.444-08:00Plotting with guiqwtWhile it's been quiet on this blog, behind the scenes I have been very busy trying to build an interactive spread scanner. To make one, a list of ingredients is needed:<br />
<br />
GUI toolkit: PyQt - check.<br />
data acquisition: IbPy & tradingWithPython.lib.yahooData - check.<br />
data container: pandas & sqlite - check.<br />
plotting library: matplotlib - ehm... no.<br />
<br />
After tinkering with matplotlib in PyQt for several days I must admit that its use in interactive applications is far from optimal: slow, difficult to integrate and offering little interactivity. PyQwt proved to work a little better, but it had its own quirks and was a bit too low-level for me.<br />
But as it often happens with Python, somebody, somewhere has already written a kick-ass toolkit that is just perfect for the job. And it looks like <a href="http://packages.python.org/guiqwt/index.html">guiqwt</a> is just it. Interactive charts are just a couple of code lines away now, take a look at an example here: <a href="http://code.google.com/p/trading-with-python/source/browse/trunk/cookbook/guiqwt_CurveDialog.py">Creating curve dialog</a> . For this I used guiqwt example code with some minor tweaks.<br />
<br />
And of course a pretty picture of the result:<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiIekZOFykAVU3XsggtC57vZDyRy4JgG0mIHpsSD9w1KozN4AU4qYNUBboE6fkvbp35DSb_hYG4UTMmkCtQILXlJqQW7kK2AnY1jxb5q_r02Mde6dksdShX5roHA45y9PrCdKssc-T1Clqe/s1600/guiqwt_example.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="216" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiIekZOFykAVU3XsggtC57vZDyRy4JgG0mIHpsSD9w1KozN4AU4qYNUBboE6fkvbp35DSb_hYG4UTMmkCtQILXlJqQW7kK2AnY1jxb5q_r02Mde6dksdShX5roHA45y9PrCdKssc-T1Clqe/s320/guiqwt_example.png" width="320" /></a></div><br />
...If only I knew how to set dates on the x-axis....sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com2tag:blogger.com,1999:blog-5108179989881744725.post-6926910721139526192011-11-04T13:24:00.000-07:002011-11-04T13:24:37.772-07:00How to set up a Python development environmentIf you would like to start playing with the code from this blog and write your own, you need to set up a development environment first. I've already put a summary of tools and software packages on the <a href="http://tradingwithpython.blogspot.com/p/setting-up-development-environment.html">tools page</a> and to make it even easier, here are the steps you'll need to follow to get up and running:<br />
<br />
1. Install <a href="http://code.google.com/p/pythonxy/wiki/Downloads">PythonXY</a>. : this includes Python 2.7 and tools Spyder, Ipython etc.<br />
2. Install <a href="http://tortoisesvn.net/downloads.html">Tortoise SVN</a>. This is a utility that you need to pull the source code from Google Code<br />
3. Install <a href="http://pypi.python.org/packages/2.7/p/pandas/pandas-0.5.0.win32-py2.7.exe#md5=c2badf1d82d48a57abcff72228d28cd9">Pandas</a> (time series library)<br />
<br />
This is all you need for now.<br />
To get the code, use the 'SVN Checkout' Windows Explorer context menu that becomes available after installing Tortoise SVN. Check out like this (change the checkout directory to the location you want, but it should end with <i>tradingWithPython</i>):<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhPSvwOay5nrgFqMkgOesIwyiWBzfISesjQCxXNM5AgyYOgumSFBq3DvU30EBbRs9E-hAavCO6HTp-nN5ZdfzHjRrI-TSHfDrHfvuJYbsPykjQ5uCSHHyWy8-K1ljXRwGP0UUiuJjtIJQTn/s1600/svn_checkout.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="248" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhPSvwOay5nrgFqMkgOesIwyiWBzfISesjQCxXNM5AgyYOgumSFBq3DvU30EBbRs9E-hAavCO6HTp-nN5ZdfzHjRrI-TSHfDrHfvuJYbsPykjQ5uCSHHyWy8-K1ljXRwGP0UUiuJjtIJQTn/s320/svn_checkout.png" width="320" /></a></div>If all goes well, the most recent version of the files will be downloaded. I'll be writing more code and improving current one, you'll be able to stay in sync with my code by using 'svn update' context menu.<br />
<br />
The final step is to launch Spyder (through the PythonXY launcher or start menu) and add the directory just above the '<i>tradingWithPython</i>' directory (in my example C:\Users\jev\Desktop) to the Python path. Do this with 'Tools'->'PYTHONPATH manager'.<br />
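The same can also be done per script in code, without touching the Spyder settings (a sketch; the path below is my example location, adjust it to wherever you placed the checkout):

```python
import sys

# hypothetical location: the directory just above the 'tradingWithPython' checkout
checkoutParent = r'C:\Users\jev\Desktop'
if checkoutParent not in sys.path:
    sys.path.append(checkoutParent)

# import tradingWithPython  # should now succeed
```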
Ok, all done, now you can run the examples from the \cookbook dir.sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com4tag:blogger.com,1999:blog-5108179989881744725.post-81017273498649984202011-10-28T13:33:00.000-07:002011-10-28T13:33:40.237-07:00kung fu pandas will solve your data problemsI love researching strategies, but sadly I spend too much time on low-level work like filtering and aligning datasets. In fact, about 80% of my time is spent on this mind-numbing work. There had to be a better way than hacking all the filtering code myself, and there is!<br />
Some time ago I came across the data analysis toolkit <a href="http://pandas.sourceforge.net/">pandas</a>, especially suited for working with financial data. After just scratching the surface of its capabilities I'm already blown away by what it delivers. The package is being actively developed by <a href="http://wesmckinney.com/blog/">Wes McKinney</a> and his ambition is to create the most powerful and flexible open source data analysis/manipulation tool available. Well, I think he is well on his way!<br />
<br />
Let's take a look at just how easy it is to align two datasets:<br />
<br />
<pre class="brush:python">from pandas import DataFrame

from tradingWithPython.lib import yahooFinance

startDate = (2005, 1, 1)

# create two timeseries. data for SPY goes much further back
# than data of VXX
spy = yahooFinance.getHistoricData('SPY', sDate=startDate)
vxx = yahooFinance.getHistoricData('VXX', sDate=startDate)

# combine the two datasets
X = DataFrame({'VXX': vxx['adj_close'], 'SPY': spy['adj_close']})

# remove NaN entries
X = X.dropna()

# make a nice picture
X.plot()
</pre>Two lines of code! (this could even fit on one line, but I've split it for readability)<br />
Here is the result:<br />
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiO7fzENTR-4Ly80jgYSAbWvC8d5-ECSwyylUi07Lv87BXB3B3238v_Q36Xem8UiupQsABYqIna3RgYlcRSdRgfKBzBiLAppPFFQJbb5YvykCMgmebQnzkFtHDso89bnnCOdeB8tWDiRzSM/s1600/pandasPlot.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="238" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiO7fzENTR-4Ly80jgYSAbWvC8d5-ECSwyylUi07Lv87BXB3B3238v_Q36Xem8UiupQsABYqIna3RgYlcRSdRgfKBzBiLAppPFFQJbb5YvykCMgmebQnzkFtHDso89bnnCOdeB8tWDiRzSM/s320/pandasPlot.png" width="320" /></a></div><br />
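The alignment itself is nothing specific to my data: <i>DataFrame</i> lines the series up on their date index and <i>dropna()</i> keeps only the overlapping dates. A tiny synthetic illustration (made-up numbers, current pandas syntax):

```python
import pandas as pd

# synthetic stand-ins: the 'SPY' history starts one day earlier than 'VXX'
spy = pd.Series([120.0, 121.0, 122.0], index=pd.date_range('2009-01-29', periods=3))
vxx = pd.Series([100.0, 99.0], index=pd.date_range('2009-01-30', periods=2))

X = pd.DataFrame({'SPY': spy, 'VXX': vxx}).dropna()  # only dates present in both remain
print(len(X))  # 2
```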
Man, this could have saved me a ton of time! But it still will help me in the future, as I'll be using its DataFrame object as a standard in my further work.sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com6tag:blogger.com,1999:blog-5108179989881744725.post-17267439858086512992011-10-17T13:38:00.000-07:002011-10-17T13:38:56.510-07:00Tools & CookbookI've added two pages specifically to help new users to get started.<br />
<b><a href="http://tradingwithpython.blogspot.com/p/setting-up-development-environment.html">Tools</a>: </b>here you'll find all the info you need to set up a development environment.<br />
<b><a href="http://tradingwithpython.blogspot.com/p/cookbook.html">Cookbook</a></b>: Overview of recipes I've written. The code itself is hosted on Google Code.sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com1tag:blogger.com,1999:blog-5108179989881744725.post-1482082132095115832011-10-15T10:38:00.000-07:002011-10-15T10:38:12.877-07:00How to download a bunch of csv filesI'll be analyzing VIX futures data, so to start with, I need all the files from CBOE. They use a consistent format for file names, which is easy to generate. The following script creates a data directory and then downloads all csv files for the period 2004-2011. As you can see, not much coding is required, less than 30 lines!<br />
<br />
<br />
<pre class="brush:python">
from urllib import urlretrieve
import os

m_codes = ['F','G','H','J','K','M','N','Q','U','V','X','Z'] # month codes of the futures
dataDir = os.getenv("USERPROFILE")+'\\twpData\\vixFutures'  # data directory

def saveVixFutureData(year, month, path):
    ''' Get future from CBOE and save to file '''
    fName = "CFE_{0}{1}_VX.csv".format(m_codes[month], str(year)[-2:])
    urlStr = "http://cfe.cboe.com/Publish/ScheduledTask/MktData/datahouse/{0}".format(fName)
    try:
        urlretrieve(urlStr, path+'\\'+fName)
    except Exception as e:
        print e

if __name__ == '__main__':
    if not os.path.exists(dataDir):
        os.makedirs(dataDir)

    for year in range(2004, 2012):
        for month in range(12):
            print 'Getting data for {0}/{1}'.format(year, month+1)
            saveVixFutureData(year, month, dataDir)

    print 'Data was saved to {0}'.format(dataDir)
</pre>sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com1tag:blogger.com,1999:blog-5108179989881744725.post-6810820344642557072011-10-14T11:54:00.000-07:002011-10-14T12:12:34.110-07:00Interactive work with IPythonPretty good screencast on Python basics.<br />
More of this good stuff can be found <a href="http://www.youtube.com/playlist?list=PL7E11B34616530F5E">here</a>.<br />
<br />
<iframe allowfullscreen="" frameborder="0" height="315" src="http://www.youtube.com/embed/v_3NjQB3q-Q?rel=0" width="560"></iframe>sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com3tag:blogger.com,1999:blog-5108179989881744725.post-30716806531037070662011-09-25T14:09:00.000-07:002011-09-25T14:09:51.345-07:00Getting StartedFirst of all, I must admit that it took me quite some time even to consider switching from Matlab to Python. I've been working with Matlab for more than ten years, and I must say that it is an excellent tool for research. However, both in my professional and private work there were quite a few cases where Matlab was not the best tool for the job. This forced me to learn things like Labwindows and C# for GUI programming, and PHP for website building. Not to forget that Matlab is not a free (not as in speech, nor as in beer) product.<br />
Still, the main reason I'm trying to make the switch is that Python is a 'Swiss army knife of programming'. It can be used to make excellent GUIs with Qt, conduct research with SciPy, program dynamic websites and so on.<br />
<br />
All posts on this blog will be based on the following tooling:<br />
<br />
<br />
<ul><li>As a base distribution I'm using <a href="http://code.google.com/p/pythonxy/wiki/Welcome">PythonXY</a>, which is an excellent package containing (almost) everything one needs to do scientific programming with Python.</li>
<li>To connect to Interactive Brokers you'll also need <a href="http://code.google.com/p/ibpy/">IbPy</a>.</li>
</ul>sjevhttp://www.blogger.com/profile/17452562180989360928noreply@blogger.com0