Trading with Python

Wednesday, January 15, 2014

Starting IPython notebook from windows file exlorer

I organize my IPython notebooks by saving them in different directories. This brings however an inconvenience, because to access the notebooks I need to open a terminal and type 'ipython notebook --pylab=inline' each and every time. I'm sure the ipython team will solve this in the long run, but in the meantime there is a pretty descent way to quickly access the notebooks from the file explorer.

All you need to do is add a context menu that starts ipython server in your desired directory:

A quick way to add the context item is by running this registry patch. (Note: the patch assumes that you have your python installation located in C:\Anaconda . If not, you’ll need to open the .reg file in a text editor and set the right path on the last line).

Instructions on adding the registry keys manually can be found on Frolian's blog.

Monday, January 13, 2014

Leveraged ETFs in 2013, where is your decay now?

Many people think that leveraged etfs in the long term underperform their benchmarks. This is true for choppy markets, but not in the case of trending conditions, either up or down. Leverage only has effect on the most likely outcome, not on the expected outcome. For more background please read this post.

2013 has been a very good year for stocks, which trended up for most of the year. Let's see what would happen if we shorted some of the leveraged etfs exactly a year ago and hedged them with their benchmark.
Knowing the leveraged etf behavior I would expect that leveraged etfs outperformed their benchmark, so the strategy that would try to profit from the decay would lose money.

I will be considering these pairs:

SPY 2 SSO -1
SPY -2 SDS -1
QQQ 2 QLD -1
QQQ -2 QID -1
IYF -2 SKF -1

Each leveraged etf is held short (-1 $) and hedged with an 1x etf. Notice that to hedge an inverse etf a negative position is held in the 1x etf.

Here is one example: SPY vs SSO.
Once we normalize the prices to 100$ at the beginning of the backtest period (250 days) it is apparent that the 2x etf outperforms 1x etf.

Now the results of the backtest on the pairs above:

All the 2x etfs (including inverse) have outperformed their benchmark over the course of 2013. According to expectations, the strategy exploiting 'beta decay' would not be profitable.

I would think that playing leveraged etfs against their unleveraged counterpart does not provide any edge, unless you know the market conditions beforehand (trending or range-bound). But if you do know the coming market regime, there are much easier ways to profit from it. Unfortunately, nobody has yet been really succesful at predicting the market regime at even the very short term.

Full source code of the calculations is available for the subscribers of the Trading With Python course. Notebook #307

Thursday, January 2, 2014

Putting a price tag on TWTR

Here is my shot at Twitter valuation. I'd like to start with a disclaimer: at this moment a large portion of my portrolio consists of short TWTR position, so my opinion is rather skewed. The reason I've done my own analysis is that my bet did not work out well, and Twitter made a parabolic move in December 2013. So the question that I'm trying to answer here is "should I take my loss or hold on to my shorts?".

At the time of writing, TWTR trades around $64 mark, with a market cap of 34.7 B$. Up till now the company has not made any profit, loosing 142M$ in 3013 after making 534M$ in revenues. The last two numbers give us yearly company spendings of 676M$.

Price derived from user value

Twitter can be compared with with Facebook, Google and LinkedIn to get an idea of user numbers and their values. The table below summarises user numbers per company and a value per user derived from the market cap. (source for number of users: Wikipedia, number for Google is based on number of unique searches)

	users [millions]	user value [$]
FB	1190	113
TWTR	250	139
GOOG	2000	187
LNKD	259	100

It becomes apparent that the market valuation per user is very similar for all of the companies, however my personal opinion is that:

TWTR is currently more valuable per user thatn FB or LNKD. This is not logical as both competitors have more valuable personal user data at their disposal.
GOOG has been excelling at extracting ad revenue from its users. To do that, it has a set of highly diversified offerings, from search engine to Google+ , Docs and Gmail. TWTR has nothing resembling that, while its value per user is only 35% lower thatn that of Google.
TWTR has a limited room to grow its user base as it does not offer products comparable to FB or GOOG offerings. TWTR has been around for seven years now and most people wanting an accout have got their chance. The rest just does not care.
TWTR user base is volatile and is likely to move to the next hot thing when it will become available.

I think the best reference here would be LNKD, which has a stable niche in the professional market. By this metric TWTR would be overvalued. Setting user value at 100$ for TWTR would produce a fair TWTR price of 46 $.

Price derived from future earnings

There is enough data available of the future earnings estimates. One of the most useful ones I've found is here.

Using those numbers while subtracting company spendings, which I assume to remain constant , produces this numbers:

	banks	independents
2013	-51	-43
2014	292	462
2015	612	1120

Net income in M$

With an assumption that a healthy company will have a final PE ratio of around 30, we can calculate share prices:

	banks	independents	average
2013	-2.81	-2.37	-2.59
2014	16.08	25.45	20.76
2015	33.71	61.69	47.70

TWTR price in $ based on PE=30

Again, average price estimate is around 46-48 $ mark which is what it was around the IPO. Current price of 64$ is around 36% too high to be reasonable.

Conclusion

Based on available information, optimistic valuation of TWTR should be in the 46-48 $ range. There are no clear reasons it should be trading higher and many operational risks to trade lower.

My guess is that during the IPO enough professionals have reviewed the price, setting it at a fair price level. What happened next was an irrational market move not justified by new information. Just take a look at the bullish frenzy on stocktwits, with people claiming things like 'this bird will fly to $100'. Pure emotion, which never works out well.

The only thing that rests me now is to put my money where my mouth is and stick to my shorts. Time will tell.

Thursday, September 19, 2013

Trading With Python course available!

The Trading With Python course is now available for subscription! I have received very positive feedback from the pilot I held this spring, and this time it is going to be even better. The course is now hosted on a new TradingWithPython website, and the material has been updated and restructured. I even decided to include new material, adding more trading strategies and ideas.

For an overview of the included topics please take a look at the course contents .

Sunday, August 18, 2013

Short VXX strategy

Shorting the short-term volatility etn VXX may seem like a great idea when you look at the chart from quite a distance. Due to the contango in the volatility futures, the etn experiences quite some headwind most of the time and looses a little bit its value every day. This happens due to daily rebalancing, for more information please look into the prospect.
In an ideal world, if you hold it long enough, a profit generated by time decay in the futures and etn rebalancing is guaranteed, but in the short term, you'd have to go through some pretty heavy drawdowns. Just look back at the summer of 2011. I have been unfortunate (or foolish) enough to hold a short VXX position just before the VIX went up. I have almost blown my account by then: 80% drawdown in just a couple of days resulting in a threat of margin call by my broker. Margin call would mean cashing the loss. This is not a situation I'd ever like to be in again. I knew it would not be easy to keep head cool at all times, but experiencing the stress and pressure of the situation was something different. Luckily I knew how VXX tends to behave, so I did not panic, but switched side to XIV to avoid a margin call. The story ends well, 8 month later my portfolio was back at strength and I have learned a very valuable lesson.

To start with a word of warning here: do not trade volatility unless you know exactly how much risk you are taking.
Having said that, let's take a look at a strategy that minimizes some of the risks by shorting VXX only when it is appropriate.

Strategy thesis: VXX experiences most drag when the futures curve is in a steep contango. The futures curve is approximated by the VIX-VXV relationship. We will short VXX when VXV has an unusually high premium over VIX.

First, let's take a look at the VIX-VXV relationship:

The chart above shows VIX-VXV data since January 2010. Data points from last year are shown in red.
I have chosen to use a quadratic fit between the two, approximating VXV = f(VIX) . The f(VIX) is plotted as a blue line.
The values above the line represent situation when the futures are in stronger than normal contango. Now I define a delta indicator, which is the deviation from the fit: delta = VXV-f(VIX).

Now let's take a look at the price of VXX along with delta:

Above: price of VXX on log scale. Below: delta. Green markers indicat delta > 0 , red markers delta<0.
It is apparent that green areas correspond to a negative returns in the VXX.

Let's simulate a strategy with this these assumptions:

Short VXX when delta > 0
Constant capital ( bet on each day is 100$)
No slippage or transaction costs

This strategy is compared with the one that trades short every day, but does not take delta into account.

The green line represents our VXX short strategy, blue line is the dumb one.

Metrics:

          Delta>0     Dumb
Sharpe:    1.9         1.2
Max DD:    33%         114% (!!!)

Sharpe of 1.9 for a simple end-of-day strategy is not bad at all in my opinion. But even more important is that the gut-wrenching drawdowns are largely avoided by paying attention to the forward futures curve.

Building this strategy step-by-step will be discussed during the coming Trading With Python course.

Getting short volume from BATS

In my last post I have gone through the steps needed to get the short volume data from the BATS exchange. The code provided was however of the quick-n-dirty variety. I have now packaged everything to bats.py module that can be found on google code. (you will need the rest of the TradingWithPython library to run bats.py)

Usage:


import tradingWithPython as twp # main library
import tradingWithPython.lib.bats as bats # bats module

dl = bats.BATS_Data('C:\\batsData') # init with directory to save data
dl.updateDb() # update data
s = dl.loadData() # process zip files

Thursday, August 15, 2013

Building an indicator from short volume data

Price of an asset or an ETF is of course the best indicator there is, but unfortunately there is only only so much information contained in it. Some people seem to think that the more indicators (rsi, macd, moving average crossover etc) , the better, but if all of them are based at the same underlying price series, they will all contain a subset of the same limited information contained in the price.
We need more information additional to what is contained the price to make a more informed guess about what is going to happen in the near future. An excellent example of combining all sorts of info to a clever analysis can be found on the The Short Side of Long blog. Producing this kind of analysis requires a great amount of work, for which I simply don't have the time as I only trade part-time.
So I built my own 'market dashboard' that automatically collects information for me and presents it in an easily digestible form. In this post I'm going to show how to build an indicator based on short volume data. This post will illustrate the process of data gathering and processing.

Step 1: Find data source.
BATS exchange provides daily volume data for free on their site.

Step 2: Get data manually & inspect
Short volume data of the BATS exchange is contained in a text file that is zipped. Each day has its own zip file. After downloading and unzipping the txt file, this is what's inside (first several lines):

Date|Symbol|Short Volume|Total Volume|Market Center
20111230|A|26209|71422|Z
20111230|AA|298405|487461|Z
20111230|AACC|300|3120|Z
20111230|AAN|3600|10100|Z
20111230|AAON|1875|6156|Z

....

In total a file contains around 6000 symbols.
This data is needs quite some work before it can be presented in a meaningful manner.

Step 3: Automatically get data
What I really want is not just the data for one day, but a ratio of short volume to total volume for the past several years, and I don't really feel like downloading 500+ zip files and copy-pasting them in excel manually.
Luckily, full automation is only a couple of code lines away:
First we need to dynamically create an url from which a file will be downloaded:

from string import Template

def createUrl(date):
    s = Template('http://www.batstrading.com/market_data/shortsales/$year/$month/$fName-dl?mkt=bzx')
    fName = 'BATSshvol%s.txt.zip' % date.strftime('%Y%m%d')
    
    url = s.substitute(fName=fName, year=date.year, month='%02d' % date.month)
    
    return url,fName

Output:

http://www.batstrading.com/market_data/shortsales/2013/08/BATSshvol20130813.txt.zip-dl?mkt=bzx

Now we can download multiple files at once:

import urllib

for i,date in enumerate(dates):
    source, fName =  createUrl(date)# create url and file name
    dest = os.path.join(dataDir,fName) 
    if not os.path.exists(dest): # don't download files that are present
        print 'Downloading [%i/%i]' %(i,len(dates)), source
        urllib.urlretrieve(source, dest)
    else:
        print 'x',

Output:

Downloading [0/657] http://www.batstrading.com/market_data/shortsales/2011/01/BATSshvol20110103.txt.zip-dl?mkt=bzx
Downloading [1/657] http://www.batstrading.com/market_data/shortsales/2011/01/BATSshvol20110104.txt.zip-dl?mkt=bzx
Downloading [2/657] http://www.batstrading.com/market_data/shortsales/2011/01/BATSshvol20110105.txt.zip-dl?mkt=bzx
Downloading [3/657] http://www.batstrading.com/market_data/shortsales/2011/01/BATSshvol20110106.txt.zip-dl?mkt=bzx

Step 4. Parse downloaded files

We can use zip and pandas libraries to parse a single file:

import datetime as dt
import zipfile
import StringIO

def readZip(fName):
    zipped = zipfile.ZipFile(fName) # open zip file 
    lines = zipped.read(zipped.namelist()[0]) # unzip and read first file 
    buf = StringIO.StringIO(lines) # create buffer
    df = pd.read_csv(buf,sep='|',index_col=1,parse_dates=False,dtype={'Date':object,'Short Volume':np.float32,'Total Volume':np.float32}) # parse to table
    s = df['Short Volume']/df['Total Volume'] # calculate ratio
    s.name = dt.datetime.strptime(df['Date'][-1],'%Y%m%d')
    
    return s

It returns a ratio of Short Volume/Total Volume for all symbols in the zip file:

Symbol
A         0.531976
AA        0.682770
AAIT      0.000000
AAME      0.000000
AAN       0.506451
AAON      0.633841
AAP       0.413083
AAPL      0.642275
AAT       0.263158
AAU       0.494845
AAV       0.407976
AAWW      0.259511
AAXJ      0.334937
AB        0.857143
ABAX      0.812500
...
ZLC       0.192725
ZLCS      0.018182
ZLTQ      0.540341
ZMH       0.413315
ZN        0.266667
ZNGA      0.636890
ZNH       0.125000
ZOLT      0.472636
ZOOM      0.000000
ZQK       0.583743
ZROZ      0.024390
ZSL       0.482461
ZTR       0.584526
ZTS       0.300384
ZUMZ      0.385345
Name: 2013-08-13 00:00:00, Length: 5859, dtype: float32

Step 5: Make a chart:

Now the only thing left is to parse all downloaded files and combine them to a single table and plot the result:

In the figure above I have plotted the average short volume ratio for the past two years. I also could have used a subset of symbols if I wanted to take a look at a specific sector or stock. Quick look at the data gives me an impression that high short volume ratios usually correspond with market bottoms and low ratios seem to be good entry points for a long position.

Starting from here, this short volume ratio can be used as a basis for strategy development.

Sunday, March 17, 2013

Trading With Python course - status update

I am happy to announce that a sufficient number of people have showed their interest in taking the course. This means that the course will definitely take place.
Starting today I will be preparing a new website and material for the course, which will start in the second week of April.

Thursday, January 12, 2012

Reconstructing VXX from CBOE futures data

Many people in the blogsphere have reconstructed the VXX to see how it should perform before its inception. The procedure to do this is not very complicated and well-described in the VXX prospectus and on the Intelligent Investor Blog. Doing this by hand however is a very tedious work, requiring to download data for each future separately, combine them in a spreadsheet etc.
The scripts below automate this process. The first one, downloadVixFutures.py , gets the data from cboe, saves each file in a data directory and then combines them in a single csv file, vix_futures.csv
The second script reconstructVXX.py parses the vix_futures.csv, calculates the daily returns of VXX and saves results to reconstructedVXX.csv.
To check the calculations, I've compared my simulated results with the SPVXSTR index data, the two agree pretty well, see the charts below.

Note: For a fee, I can provide support to get the code running or create a stand-alone program, contact me if you are interested.

--------------------------------source codes--------------------------------------------

Code for getting futures data from CBOE and combining it into a single table
downloadVixFutures.py

#-------------------------------------------------------------------------------
# Name:        download CBOE futures
# Purpose:     get VIX futures data from CBOE, process data to a single file
#
#
# Created:     15-10-2011
# Copyright:   (c) Jev Kuznetsov 2011
# Licence:     BSD
#-------------------------------------------------------------------------------
#!/usr/bin/env python



from urllib import urlretrieve
import os
from pandas import *
import datetime
import numpy as np

m_codes = ['F','G','H','J','K','M','N','Q','U','V','X','Z'] #month codes of the futures
codes = dict(zip(m_codes,range(1,len(m_codes)+1)))

dataDir = os.path.dirname(__file__)+'/data'



   

def saveVixFutureData(year,month, path, forceDownload=False):
    ''' Get future from CBOE and save to file '''
    fName = "CFE_{0}{1}_VX.csv".format(m_codes[month],str(year)[-2:])
    if os.path.exists(path+'\\'+fName) or forceDownload:
        print 'File already downloaded, skipping'
        return
    
    urlStr = "http://cfe.cboe.com/Publish/ScheduledTask/MktData/datahouse/{0}".format(fName)
    print 'Getting: %s' % urlStr
    try:
        urlretrieve(urlStr,path+'\\'+fName)
    except Exception as e:
        print e

def buildDataTable(dataDir):
    """ create single data sheet """
    files = os.listdir(dataDir)

    data = {}
    for fName in files:
        print 'Processing: ', fName
        try:
            df = DataFrame.from_csv(dataDir+'/'+fName)
            
                    
            code = fName.split('.')[0].split('_')[1]
            month = '%02d' % codes[code[0]]
            year = '20'+code[1:]
            newCode = year+'_'+month
            data[newCode] = df
        except Exception as e:
            print 'Could not process:', e
        
        
    full = DataFrame()
    for k,df in data.iteritems():
        s = df['Settle']
        s.name = k
        s[s<5] = np.nan
        if len(s.dropna())>0:
            full = full.join(s,how='outer')
        else:
            print s.name, ': Empty dataset.'
    
    full[full<5]=np.nan
    full = full[sorted(full.columns)]
        
    # use only data after this date
    startDate = datetime.datetime(2008,1,1)
    
    idx = full.index >= startDate
    full = full.ix[idx,:]
    
    #full.plot(ax=gca())
    print 'Saving vix_futures.csv'
    full.to_csv('vix_futures.csv')


if __name__ == '__main__':

    if not os.path.exists(dataDir):
         print 'creating data directory %s' % dataDir
         os.makedirs(dataDir)

    for year in range(2008,2013):
        for month in range(12):
            print 'Getting data for {0}/{1}'.format(year,month+1)
            saveVixFutureData(year,month,dataDir)

    print 'Raw wata was saved to {0}'.format(dataDir)
    
    buildDataTable(dataDir)

Code for reconstructing the VXX
reconstructVXX.py

"""
Reconstructing VXX from futures data

author: Jev Kuznetsov

License : BSD
"""
from __future__ import division
from pandas import *
import numpy as np

class Future(object):
    """ vix future class, used to keep data structures simple """
    def __init__(self,series,code=None):
        """ code is optional, example '2010_01' """
        self.series = series.dropna() # price data
        self.settleDate = self.series.index[-1]
        self.dt = len(self.series) # roll period (this is default, should be recalculated)
        self.code = code # string code 'YYYY_MM'
    
    def monthNr(self):
        """ get month nr from the future code """
        return int(self.code.split('_')[1])
         
    def dr(self,date):
        """ days remaining before settlement, on a given date """
        return(sum(self.series.index>date))
    
    
    def price(self,date):
        """ price on a date """
        return self.series.get_value(date)


def returns(df):
    """ daily return """
    return (df/df.shift(1)-1)


def recounstructVXX():
    """ 
    calculate VXX returns 
    needs a previously preprocessed file vix_futures.csv     
    """
    X = DataFrame.from_csv('vix_futures.csv') # raw data table
    
    # build end dates list & futures classes
    futures = []
    codes = X.columns
    endDates = []
    for code in codes:
        f = Future(X[code],code=code)
        print code,':', f.settleDate
        endDates.append(f.settleDate)
        futures.append(f)

    endDates = np.array(endDates) 

    # set roll period of each future
    for i in range(1,len(futures)):
        futures[i].dt = futures[i].dr(futures[i-1].settleDate)


    # Y is the result table
    idx = X.index
    Y = DataFrame(index=idx, columns=['first','second','days_left','w1','w2','ret'])
  
    # W is the weight matrix
    W = DataFrame(data = np.zeros(X.values.shape),index=idx,columns = X.columns)

    # for VXX calculation see http://www.ipathetn.com/static/pdf/vix-prospectus.pdf
    # page PS-20
    for date in idx:
        i =nonzero(endDates>=date)[0][0] # find first not exprired future
        first = futures[i] # first month futures class
        second = futures[i+1] # second month futures class
        
        dr = first.dr(date) # number of remaining dates in the first futures contract
        dt = first.dt #number of business days in roll period
        
        W.set_value(date,codes[i],100*dr/dt)
        W.set_value(date,codes[i+1],100*(dt-dr)/dt)
     
        # this is all just debug info
        Y.set_value(date,'first',first.price(date))
        Y.set_value(date,'second',second.price(date))
        Y.set_value(date,'days_left',first.dr(date))
        Y.set_value(date,'w1',100*dr/dt)
        Y.set_value(date,'w2',100*(dt-dr)/dt)
    
    
    valCurr = (X*W.shift(1)).sum(axis=1) # value on day N
    valYest = (X.shift(1)*W.shift(1)).sum(axis=1) # value on day N-1
    Y['ret'] = valCurr/valYest-1    # index return on day N

    return Y


  
    

##-------------------Main script---------------------------

Y = recounstructVXX()

print Y.head(30)#
Y.to_csv('reconstructedVXX.csv')

Monday, December 26, 2011

howto: Observer pattern

The observer pattern comes very handy when dealing with complex systems. It allows for class-to class communication with a very simple structure. Even more important is the ability to separate functionality in different modules, for example running a single 'broker' as a wrapper to some api and letting multiple strategies subscribe to relevant broker events. There are some ready-made modules available, but the best way to understand how this process works is to write the whole system from scratch. In many languages this is a very tedious task, but thanks to the power of Python it only takes a couple of lines to do this.

The following example code creates a Sender class (named Alice). Sender keeps track of interested listeners and notifies them accordingly. In more detail, this is achieved by a dictionary containing a function-event mapping, Sender.listeners.
A listener class can be of any type, here I make a bunch of ExampleListener classes, named Bob,Dave & Charlie. All of them have a method, that is that is subscribed to Sender. The only special thing about the subscribed method is that it should contain three parameters: sender, event, message. Sender is the class reference of the Sender class, so a listener would know who sent the message. Event is an identifier, for which I usually use a string. Optionally, a message is the data that is passed to a function.
A nice detail is that if a listener method throws an exception, it is automatically unsubscribed from further events.

'''
Created on 26 dec. 2011
Copyright: Jev Kuznetsov
License: BSD

sender-reciever pattern.

'''

import tradingWithPython.lib.logger as logger
import types

class Sender(object):
    """
    Sender -> dispatches messages to interested callables 
    """
    def __init__(self):
        self.listeners = {}
        self.logger = logger.getLogger()
        
        
    def register(self,listener,events=None):
        """
        register a listener function
        
        Parameters
        -----------
        listener : external listener function
        events  : tuple or list of relevant events (default=None)
        """
        if events is not None and type(events) not in (types.TupleType,types.ListType):
            events = (events,)
             
        self.listeners[listener] = events
        
    def dispatch(self,event=None, msg=None):
        """notify listeners """
        for listener,events in self.listeners.items():
            if events is None or event is None or event in events:
                try:
                    listener(self,event,msg)
                except (Exception,):
                    self.unregister(listener)
                    errmsg = "Exception in message dispatch: Handler '{0}' unregistered for event '{1}'  ".format(listener.func_name,event)
                    self.logger.exception(errmsg)
            
    def unregister(self,listener):
        """ unregister listener function """
        del self.listeners[listener]             
                   
#---------------test functions--------------

class ExampleListener(object):
    def __init__(self,name=None):
        self.name = name
    
    def method(self,sender,event,msg=None):
        print "[{0}] got event {1} with message {2}".format(self.name,event,msg)
                   

if __name__=="__main__":
    print 'demonstrating event system'
    
    
    alice = Sender()
    bob = ExampleListener('bob')
    charlie = ExampleListener('charlie')
    dave = ExampleListener('dave')
    
    
    # add subscribers to messages from alice
    alice.register(bob.method,events='event1') # listen to 'event1'
    alice.register(charlie.method,events ='event2') # listen to 'event2'
    alice.register(dave.method) # listen to all events
    
    # dispatch some events
    alice.dispatch(event='event1')
    alice.dispatch(event='event2',msg=[1,2,3])
    alice.dispatch(msg='attention to all')
    
    print 'Done.'

Wednesday, December 14, 2011

Plotting with guiqwt

While it's been quiet on this blog, but behind the scenes I have been very busy trying to build an interactive spread scanner. To make one, a list of ingredients is needed:

gui toolkit: pyqt -check.
data aquisition: ibpy & tradingWithPython.lib.yahooData - check.
data container: pandas & sqlite - check.
plotting library: matplotlib - ehm... No.

After tinkering with matplotlib in pyqt for several days I must admit its use in interactive applications is far from optimal. Slow, difficult to integrate and little interactivity. PyQwt proven to work a little better, but it had its own quirks, a little bit too low-level for me.
But as it often happens with Python, somebody, somewhere has already written a kick-ass toolkit that is just perfect for the job. And it looks like guiqwt is just it. Interactive charts are just a couple of code lines away now, take a look at an example here: Creating curve dialog . For this I used guiqwt example code with some minor tweaks.

And of course a pretty picture of the result:

...If only I knew how to set dates on the x-axis....

Friday, November 4, 2011

How to setup Python development environment

If you would like to start playing with the code from this blog and write your own, you need to setup a development environment first. I've already put a summary of tools and software packages on the tools page and to make it even easier, here are the steps you'll need to follow to get up and running:

1. Install PythonXY. : this includes Python 2.7 and tools Spyder, Ipython etc.
2. Install Tortoise SVN. This is a utility that you need to pull the source code from Google Code
3. Install Pandas (time series library)

This is all you need for now.
To get the code, use 'Svn Checkout' windows explorer context menu that is available after installing Tortoise SVN. Checkout like this (change Checkout directory to the location you want, but it should end with tradingWithPython):

If all goes well, the most recent version of the files will be downloaded. I'll be writing more code and improving current one, you'll be able to stay in sync with my code by using 'svn update' context menu.

The final step is to launch Spyder (through pythonXY launcher or start menu) and add the directory just above the 'tradingWithPython' (in my example C:\Users\jev\Desktop') dir to python path . Do this with 'tools'->'PYTHONPATH manager'.
Ok, all done, now you can run the examples from \cookbok dir.

Friday, October 28, 2011

kung fu pandas will solve your data problems

I love researching strategies, but sadly I spend too much time on low-level work like filtering and aligning datasets. In fact, about 80% of my time is spent on this mind numbing work. There had got to be a better way than hacking all the filtering code myself, and there is!
Some time ago I've come across a data analysis toolkit pandas especially suited for working with financial data. After just scratching the surface of its capabilities I'm already blown away by what it delivers. The package is being actively developed by Wes McKinney and his ambition is to create the most powerful and flexible open source data analysis/manipulation tool available. Well, I think he is well on the way!

Let's take a look at just how easy it is to align two datasets:

from tradingWithPython.lib import yahooFinance

startDate = (2005,1,1)

# create two timeseries. data for SPY goes much further back
# than data of VXX
spy = yahooFinance.getHistoricData('SPY',sDate=startDate)
vxx = yahooFinance.getHistoricData('VXX',sDate=startDate)

# Combine two datasets
X = DataFrame({'VXX':vxx['adj_close'],'SPY':spy['adj_close']})

# remove NaN entries
X= X.dropna()
# make a nice picture
X.plot()

Two lines of code! ( this could be even fit to one line, but I've split it for readability)
Here is the result:

Man, this could have saved me a ton of time! But it still will help me in the future, as I'll be using its DataFrame object as a standard in my further work.

Monday, October 17, 2011

Tools & Cookbook

I've added two pages specifically to help new users to get started.
Tools: here you'll find all the info you need to set up a development environment.
Cookbook: Overview of recipies I've written. The code itself is hosted on Google Code.

Saturday, October 15, 2011

How to download a bunch of csv files

I'll be analyzing VIX futures data, so to start with, I need all the files from CBOE. They use a consistent format for file names, which is easy to generate. The following script creates a data directory and then downloads all csv files for the period 2004-2011. As you can see, not much coding is required, less than 30 lines!

from urllib import urlretrieve
import os

m_codes = ['F','G','H','J','K','M','N','Q','U','V','X','Z'] #month codes of the futures
dataDir = os.getenv("USERPROFILE")+'\\twpData\\vixFutures' # data directory

def saveVixFutureData(year,month, path):
    ''' Get future from CBOE and save to file '''
    fName = "CFE_{0}{1}_VX.csv".format(m_codes[month],str(year)[-2:])
    urlStr = "http://cfe.cboe.com/Publish/ScheduledTask/MktData/datahouse/{0}".format(fName)

    try:
        urlretrieve(urlStr,path+'\\'+fName)
    except Exception as e:
        print e

if __name__ == '__main__':
    if not os.path.exists(dataDir):
        os.makedirs(dataDir)

    for year in range(2004,2012):
        for month in range(12):
            print 'Getting data for {0}/{1}'.format(year,month)
            saveVixFutureData(year,month,dataDir)

    print 'Data was saved to {0}'.format(dataDir)

Friday, October 14, 2011

Interactive work with IPython

Pretty good screencast on python basics/
There is more of this good stuff which can be found here.

Sunday, September 25, 2011

Getting Started

First of all, I must admit that it took me quite some time even to consider switching from Matlab to Python. I've been working with Matlab for more than ten years, and I must say that it is an excellent tool for research. However, both in my professional and private work there where quite some cases where Matlab was not the best tool to do the job. This forced me to learn things like Labwindows and C# for gui programming, php for website building. Not to forget that Matlab is not a free (not as in speech, nor as in beer) product.
Still, the main reason for me trying to make the switch, is Python being a 'swiss knife of programming'. It can be used to make excellent GUIs with Qt, conduct research with SciPy, program dynamic websites and so on.

All posts on this blog will be based on the following tooling:

As a base distribution I'm using PythonXY, which is an excellent package containing (almost) everything one needs to do scientific programming with Python.
To connect to Interactive Brokers you'll also need IbPy.

Trading With Python course

If you are a trader or an investor and would like to acquire a set of quantitative trading skills you may consider taking the Trading With Python couse. The online course will provide you with the best tools and practices for quantitative trading research, including functions and scripts written by expert quantitative traders. You will learn how to get and process incredible amounts of data, design and backtest strategies and analyze trading performance. This will help you make informed decisions that are crucial for a traders success. Click here to continue to the Trading With Python course website

About

My name is Jev Kuznetsov, during daytime I am a researcher/engineer in a company that is involved in printing business. By night I am a trader.

I studied applied physics with specialization in pattern recognition and artificial intelligence. My daily work involves anything from quick algorithm prototyping in Matlab and other languages to hardware design & programming.

Since 2009 I have been using my technical skills in financial markets. Before coming to the conclusion that Python is the best tool available, I was working extensively in Matlab, which is covered on my other blog.

You can reach me at