Optimize Strategies in Backtrader

Once you have created a basic strategy and analysed it, the next logical step is to optimize it. Optimization is the process of testing different values for each parameter of a strategy to see which configuration provides the best returns. Note that not everyone agrees this leads to better results: it is easy to fall into the trap of overfitting the data.

Why Optimize?

The argument is that the markets are constantly evolving. We have bull markets, bear markets, periods of inflation, periods of deflation, volatile times and moments of serenity. If that was not enough, different instruments have different rhythms and different markets have different mindsets. This means parameters for one instrument in one market may not be optimal for another instrument in another market.

But Be Careful of Overfitting

When we optimize strategies, we need to be careful that we do not create parameters that only work for a single “moment in time”. It can be tempting to take the best results from optimizing and run with those parameters. However, if your data set covers only a brief period of time or only a certain market condition, you may find that the parameters with the best results only fit that moment in time.

The formal definition of overfitting is more academic. In statistics and machine learning, a statistical model or algorithm is fitted to training data so that it can be used to make predictions in the future. Overfitting occurs when the model or algorithm is too complex for the dataset. Complex in our sense means that the algorithm has been tweaked to such an extent that it only fits that data. An overfitted model tends to perform poorly when applied outside of the training data. In backtesting, we can think of our backtest data as the training data and our strategy as the algorithm.
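One common way to guard against this, which is not covered in the rest of this post, is to optimize on one part of the data and then check the chosen parameters on a later part that the optimizer never saw. The sketch below only computes the two date ranges. The function name `split_period` and the 70/30 split are my own assumptions for illustration, not part of Backtrader.

```python
from datetime import date, timedelta


def split_period(start, end, in_sample_fraction=0.7):
    """Split [start, end) chronologically into two date ranges.

    Hypothetical helper: the first range would be used for optimizing,
    the second for checking the chosen parameters on unseen data.
    """
    total_days = (end - start).days
    cutoff = start + timedelta(days=int(total_days * in_sample_fraction))
    return (start, cutoff), (cutoff, end)


# Same overall period as the examples in this post
in_sample, out_of_sample = split_period(date(2016, 1, 1), date(2017, 1, 1))

print('Optimize on:', in_sample[0], 'to', in_sample[1])
print('Verify on:  ', out_of_sample[0], 'to', out_of_sample[1])
```

Each pair of dates could then be passed as fromdate / todate to a separate data feed, one for the optimization run and one for the follow-up check.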

Background

The code in this post follows on from the code developed in the Backtrader: First Script post and forms part of the getting started series. If you are completely new to Backtrader and/or Python, I suggest starting here: Getting Setup: Python and Backtrader

The code

The code for this tutorial is going to be built over three examples. Each example will be accompanied by its own commentary and output.

Part 1 – Adding Parameters

Before we can optimize the code, we need to give the strategy some changeable parameters. If you look back at our previous code, you will see that we hard-coded the RSI period to 21. Hard coding means that the parameter is set in the code and cannot be changed later. In order to optimize, we need to make this parameter configurable when we load the strategy into cerebro.
'''
Author: www.backtest-rookies.com

MIT License

Copyright (c) 2017 backtest-rookies.com

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
'''

import backtrader as bt
from datetime import datetime

class firstStrategy(bt.Strategy):
    params = (
        ('period',21),
        )

    def __init__(self):
        self.rsi = bt.indicators.RSI_SMA(self.data.close, period=self.params.period)

    def next(self):
        if not self.position:
            if self.rsi < 30:
                self.buy(size=100)
        else:
            if self.rsi > 70:
                self.sell(size=100)


#Variable for our starting cash
startcash = 10000

#Create an instance of cerebro
cerebro = bt.Cerebro()

#Add our strategy
cerebro.addstrategy(firstStrategy, period=14)

#Get Apple data from Yahoo Finance.
data = bt.feeds.YahooFinanceData(
    dataname='AAPL',
    fromdate = datetime(2016,1,1),
    todate = datetime(2017,1,1),
    buffered= True
    )

#Add the data to Cerebro
cerebro.adddata(data)

# Set our desired cash start
cerebro.broker.setcash(startcash)

# Run over everything
cerebro.run()

#Get final portfolio Value
portvalue = cerebro.broker.getvalue()
pnl = portvalue - startcash

#Print out the final result
print('Final Portfolio Value: ${}'.format(portvalue))
print('P/L: ${}'.format(pnl))

#Finally plot the end results
cerebro.plot(style='candlestick')

Part 1 – Code Commentary

First of all, let’s remind ourselves of the code in our first script. This will allow us to easily see what has changed. A snippet showing the class declaration and __init__() method from the first script is below:
class firstStrategy(bt.Strategy):

    def __init__(self):
        self.rsi = bt.indicators.RSI_SMA(self.data.close, period=21)
In the example above, this has been changed to:
class firstStrategy(bt.Strategy):
    params = (
        ('period',21),
        )

    def __init__(self):
        self.rsi = bt.indicators.RSI_SMA(self.data.close, period=self.params.period)
Here we have added a “params” tuple. It contains other tuples that declare the strategy parameters.

What is a tuple? A tuple is a sequence of items that is fixed and cannot be changed or edited once created. In programming terms, objects that cannot be edited are called immutable and objects that can be edited are called mutable. You may come across these terms in other topics.

Inside the params tuple, I have a parameter ('period', 21). The first item, a string, is the name / reference for the parameter. The second item is the parameter's default value. Having a default value means you do not have to specify the parameter every time you run the strategy. If nothing is specified, it will run with the default setting. You can put as many parameters as you like in the params tuple. Just make sure you add each one as a tuple inside the main tuple (known as a nested tuple).

The strategy parameters can be accessed anywhere in the class, just like any other attribute (variable) of the instance. In our __init__() method, self.params.period is read and passed as the period keyword argument when adding the RSI indicator.
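To make the nested tuple and the idea of a default value more concrete, here is a small plain-Python sketch. It does not use Backtrader at all and is not how Backtrader implements params internally; it only mimics the idea of defaults being overridden by keyword arguments. The `upper` and `lower` entries are hypothetical extra parameters that do not exist in the strategy above.

```python
# A nested tuple of (name, default value) pairs
params = (
    ('period', 21),
    ('upper', 70),   # hypothetical extra parameter
    ('lower', 30),   # hypothetical extra parameter
)

# Tuples are immutable: trying to replace an item raises a TypeError
try:
    params[0] = ('period', 14)
except TypeError as err:
    immutable_error = str(err)
    print('Cannot edit a tuple:', immutable_error)

# Mimic defaults being overridden by the keyword arguments you pass in
defaults = dict(params)
overrides = {'period': 14}          # like period=14 in addstrategy()
effective = {**defaults, **overrides}
print(effective)
```

Only the parameter that was passed in changes. Everything else keeps its default.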

Calling the strategy

cerebro.addstrategy(firstStrategy, period=14)
The only thing that changes when we add the strategy into cerebro is that we now add a keyword argument for the parameter. As mentioned above, this is optional. Calling the strategy in this way will allow us to optimize it later.

Gotchas

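There are a couple of things to watch out for when adding parameters. The first is that every tuple inside the params tuple needs a comma following it, including the last one. In lists and dictionaries the comma after the last item is optional, so it is easy to leave out through habit. With a single parameter, however, leaving it out means the outer parentheses no longer create a tuple at all: Python treats them as plain grouping, and params ends up being the inner ('period', 21) tuple itself. If you type: (Incorrect)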
params = (
        ('period',21)
    )
Instead of: (Correct)
params = (
        ('period',21),
    )
You will receive a ValueError:
ValueError: too many values to unpack (expected 2)
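The behaviour behind this error can be reproduced in plain Python without Backtrader. The loop below is only a stand-in for whatever code reads the (name, value) pairs; it is an assumption for illustration, not Backtrader's actual source.

```python
wrong = (('period', 21))     # outer parentheses are just grouping
right = (('period', 21),)    # trailing comma makes a one-item tuple

print(wrong)        # the inner tuple itself
print(len(wrong))   # 2 items: 'period' and 21
print(len(right))   # 1 item: the ('period', 21) pair

# Reading (name, value) pairs works for the correct version...
for name, value in right:
    print(name, value)

# ...but fails for the wrong one, because the first "pair" is the
# string 'period', which has six characters to unpack instead of two.
try:
    for name, value in wrong:
        pass
except ValueError as err:
    message = str(err)
    print('ValueError:', message)
```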
In addition, be careful when adding your indicators in the __init__() method. If you forget to use a keyword argument, you can end up with a TypeError. If you type: (Incorrect)
def __init__(self):
        self.rsi = bt.indicators.RSI_SMA(self.data.close, self.params.period)
Instead of: (Correct)
def __init__(self):
        self.rsi = bt.indicators.RSI_SMA(self.data.close, period=self.params.period)
You will receive the following error:
TypeError: __init__() takes 1 positional argument but 2 were given
There is potential for confusion with this error. We added the indicator in the __init__() method of the strategy. However, the error is actually referring to the __init__() method of the indicator! You could end up spending time debugging the wrong thing.
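A toy example shows why the message points at a different __init__() than the one you are editing. The `KeywordOnlyIndicator` class below is a hypothetical stand-in that only accepts keyword arguments. It is not Backtrader's indicator code, but it produces the same kind of message.

```python
class KeywordOnlyIndicator:
    """Hypothetical stand-in: accepts settings as keyword arguments only."""

    def __init__(self, **kwargs):
        self.period = kwargs.get('period', 21)


class ToyStrategy:
    def __init__(self, use_keyword):
        if use_keyword:
            self.indicator = KeywordOnlyIndicator(period=14)
        else:
            # The mistake is on this line, but the error message will
            # name KeywordOnlyIndicator's __init__(), not ours.
            self.indicator = KeywordOnlyIndicator(14)


good = ToyStrategy(use_keyword=True)
print(good.indicator.period)

try:
    ToyStrategy(use_keyword=False)
except TypeError as err:
    message = str(err)
    print('TypeError:', message)
```

The count in the message includes self, which is why passing a single positional value is reported as 2 arguments given.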

Part 1 – Results:

(Image: Output showing RSI indicator with different parameter)

There we go. It is perhaps a little hard to see, but the RSI is using a look back period of 14 instead of the default 21.

Part 2 – Optimize

Now that we are able to initialize the strategy with different parameters, optimizing the code is pretty simple. We technically just need to replace the cerebro.addstrategy() line with:
#Add our strategy
cerebro.optstrategy(firstStrategy, period=range(14,21))
Cerebro will then run over the strategy for every period in the range given. However, the output would not be useful. If we want to be able to see which parameter worked best, we will need to add a new method to our strategy. The full code is below:

import backtrader as bt
from datetime import datetime

class firstStrategy(bt.Strategy):
    params = (
        ('period',21),
        )

    def __init__(self):
        self.startcash = self.broker.getvalue()
        self.rsi = bt.indicators.RSI_SMA(self.data.close, period=self.params.period)

    def next(self):
        if not self.position:
            if self.rsi < 30:
                self.buy(size=100)
        else:
            if self.rsi > 70:
                self.sell(size=100)

    def stop(self):
        pnl = round(self.broker.getvalue() - self.startcash,2)
        print('RSI Period: {} Final PnL: {}'.format(
            self.params.period, pnl))

if __name__ == '__main__':
    #Variable for our starting cash
    startcash = 10000

    #Create an instance of cerebro
    cerebro = bt.Cerebro()

    #Add our strategy
    cerebro.optstrategy(firstStrategy, period=range(14,21))

    #Get Apple data from Yahoo Finance.
    data = bt.feeds.YahooFinanceData(
        dataname='AAPL',
        fromdate = datetime(2016,1,1),
        todate = datetime(2017,1,1),
        buffered= True
        )

    #Add the data to Cerebro
    cerebro.adddata(data)

    # Set our desired cash start
    cerebro.broker.setcash(startcash)

    # Run over everything
    strats = cerebro.run()

Part 2 – Code Commentary

The eagle-eyed readers may notice there have been some deletions in addition to the new method (function) added to the strategy. First let’s take a look at the new method:
    def stop(self):
        pnl = round(self.broker.getvalue() - self.startcash,2)
        print('RSI Period: {} Final PnL: {}'.format(
            self.params.period, pnl))
Backtrader will loop through all the different parameter values before it arrives at the end of the script. In our previous example, we printed the account value and PnL (profit and loss) at the end of the script. If we left our print() statements there, we would not see the results of the individual runs. As a result, a stop() method is added to the strategy. This method is part of the bt.Strategy base class and we are simply overriding the logic within it. As a reminder, we inherit from bt.Strategy when creating our class. As the name suggests, this method is called when the strategy stops. This makes it ideal for printing the final profit or loss to the terminal once the test is finished.

Plotting

In addition to removing the print() statements at the end of the script, the call to cerebro.plot() has been removed. When we optimize, I recommend that you do not plot the output. At the time of writing, a new plot will be made after each loop of the strategy. You will then need to manually close it before the next run begins. If you have a lot of parameters, this can take a long time and become annoying fast.
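One detail of the optstrategy() call that is easy to misread: Python's range(14,21) stops before 21, so the periods tested are 14 through 20 and the default of 21 is not among them. The sketch below checks this and also shows how quickly the number of runs grows once a second parameter is optimized. The `upper_levels` list is a hypothetical second parameter that does not exist in the strategy above, and itertools.product is used here only to count combinations, not as a statement about Backtrader's internals.

```python
from itertools import product

# The values passed to optstrategy() in this post
periods = list(range(14, 21))
print(periods)
print('21 included?', 21 in periods)

# Hypothetical second parameter, e.g. an overbought threshold
upper_levels = [65, 70, 75, 80]

# Every pairing of period and threshold would need its own backtest
combinations = list(product(periods, upper_levels))
print('Number of runs:', len(combinations))
```

If you want the default period of 21 to be part of the test, the range would need to end at 22.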

Part 2 – Results

(Image: Terminal output showing PnL from optimization)

So it seems a period of 17 is the optimal setting for this time period. Interestingly, if your setting was off by just 2 (a period of 19), your results would have been drastically different!

Part 3 – Taking it a step further

The example above is OK, but in my opinion there is a problem. The results are not ordered, and you may want to do more than just print them. Imagine you have 3 parameters that can make 100+ combinations. It would be quite laborious and prone to error if you had to read the lines one by one. In this part, we will look at accessing the results after cerebro has finished running.

import backtrader as bt
from datetime import datetime

class firstStrategy(bt.Strategy):
    params = (
        ('period',21),
        )

    def __init__(self):
        self.startcash = self.broker.getvalue()
        self.rsi = bt.indicators.RSI_SMA(self.data.close, period=self.params.period)

    def next(self):
        if not self.position:
            if self.rsi < 30:
                self.buy(size=100)
        else:
            if self.rsi > 70:
                self.sell(size=100)

if __name__ == '__main__':
    #Variable for our starting cash
    startcash = 10000

    #Create an instance of cerebro
    cerebro = bt.Cerebro(optreturn=False)

    #Add our strategy
    cerebro.optstrategy(firstStrategy, period=range(14,21))

    #Get Apple data from Yahoo Finance.
    data = bt.feeds.YahooFinanceData(
        dataname='AAPL',
        fromdate = datetime(2016,1,1),
        todate = datetime(2017,1,1),
        buffered= True
        )


    #Add the data to Cerebro
    cerebro.adddata(data)

    # Set our desired cash start
    cerebro.broker.setcash(startcash)

    # Run over everything
    opt_runs = cerebro.run()

    # Generate results list
    final_results_list = []
    for run in opt_runs:
        for strategy in run:
            value = round(strategy.broker.get_value(),2)
            PnL = round(value - startcash,2)
            period = strategy.params.period
            final_results_list.append([period,PnL])

    #Sort Results List
    by_period = sorted(final_results_list, key=lambda x: x[0])
    by_PnL = sorted(final_results_list, key=lambda x: x[1], reverse=True)

    #Print results
    print('Results: Ordered by period:')
    for result in by_period:
        print('Period: {}, PnL: {}'.format(result[0], result[1]))
    print('Results: Ordered by Profit:')
    for result in by_PnL:
        print('Period: {}, PnL: {}'.format(result[0], result[1]))

Part 3 – Code Commentary

In this example, there are quite a few changes to the code. First of all, we have removed the stop() method that was added in the last example. We will be accessing all the values we need after cerebro has finished running. Another change that could be easy to miss if you are just copying and pasting the code is:
cerebro = bt.Cerebro(optreturn=False)
Here we have added a new parameter to the cerebro initialization. This setting changes what is returned by cerebro.run() at the end of the script.

In a normal script, cerebro.run() returns full strategy objects. These objects are created from the firstStrategy class we have written in the code. Strategy objects make everything that was available to cerebro during the test (indicators, data, analyzers, observers etc.) available after the test has finished. This means you have access to all the data and results.

When optimizing, however, cerebro.run() returns OptReturn objects by default. These are trimmed-down objects that only contain parameters and analyzers. The reason for this is to improve optimization speed. It is assumed that the important metrics needed to decide which parameters are best can be deduced from just the analyzers and parameters.

However, since the examples on this site have been printing the final profit, I would like to keep this convention for the final example. For this reason, the optreturn parameter must be set to False: the broker information (for profit / loss) is not part of an analyzer, so we need Cerebro to return full strategy objects. The rest of the interesting code in this example happens after cerebro has run.

Getting the data from a strategy object

# Run over everything
opt_runs = cerebro.run()

# Generate results list
final_results_list = []
for run in opt_runs:
    for strategy in run:
        value = round(strategy.broker.get_value(),2)
        PnL = round(value - startcash,2)
        period = strategy.params.period
        final_results_list.append([period,PnL])
Cerebro returns a list of strategies for each loop through the parameter list. In our case, there is only one strategy. However, since a nested list (list of lists) is returned, we still need to loop through the returned object twice to get to the information we need. Once we have the values we want, they can be appended to the final_results_list. This list can then be sorted.
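If the double loop is hard to picture, the sketch below rebuilds the same nested shape with stand-in objects so it can be run without Backtrader or any market data. The `fake_strategy` helper and the portfolio values are made up purely for illustration and are not results from the backtest in this post.

```python
from types import SimpleNamespace


def fake_strategy(period, value):
    """Hypothetical stand-in exposing only the attributes we read."""
    return SimpleNamespace(
        params=SimpleNamespace(period=period),
        broker=SimpleNamespace(get_value=lambda: value),
    )


startcash = 10000

# One inner list per optimization run, one strategy in each inner list
opt_runs = [
    [fake_strategy(14, 10150.0)],   # made-up final value
    [fake_strategy(15, 9900.5)],    # made-up final value
]

# Same double loop as above, written as a single list comprehension
final_results_list = [
    [strategy.params.period,
     round(strategy.broker.get_value() - startcash, 2)]
    for run in opt_runs
    for strategy in run
]

print(final_results_list)
```

The outer loop walks over the runs and the inner loop walks over the strategies inside each run, which is why two loops are needed even when each run holds a single strategy.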
#Sort Results List
by_period = sorted(final_results_list, key=lambda x: x[0])
by_PnL = sorted(final_results_list, key=lambda x: x[1], reverse=True)
If you are new to Python, this part may look a little complex. Our final_results_list is also a nested list. To sort it properly, we need to provide a sorting key. The key keyword argument needs to be passed a function. A lambda is a small one line function that allows us to use the sorting key. For more information, I have added some reference links for further reading at the end of this post.
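For readers who find the lambda hard to read, the short sketch below sorts a small list three equivalent ways: with a lambda, with a named function, and with operator.itemgetter from the standard library. The numbers are placeholders invented for the example, not results from the optimization.

```python
from operator import itemgetter

# [period, PnL] pairs; made-up placeholder values
results = [[16, 120.0], [14, -35.5], [15, 310.25]]

# 1. Lambda, as used in this post
by_period = sorted(results, key=lambda x: x[0])
by_pnl = sorted(results, key=lambda x: x[1], reverse=True)


# 2. The same key written as a normal named function
def pnl_of(result):
    return result[1]


by_pnl_named = sorted(results, key=pnl_of, reverse=True)

# 3. itemgetter(1) builds a function that returns item 1 of its argument
by_pnl_getter = sorted(results, key=itemgetter(1), reverse=True)

print(by_period)
print(by_pnl)
print(by_pnl == by_pnl_named == by_pnl_getter)
```

All three produce the same ordering, so which one to use is a matter of taste. Note that sorted() returns a new list and leaves the original untouched.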

Part 3 – Results

(Image: Ordered printing of optimization results)

There we go. This turned into a much longer post than I expected when I started. If you managed to make it to the end without skipping, I hope the content provided some benefit.

Reference Docs

  1. Backtrader optimization documentation: https://www.backtrader.com/docu/quickstart/quickstart.html?highlight=optimize#let-s-optimize
  2. Backtrader Cerebro documentation: https://www.backtrader.com/docu/cerebro.html
  3. Python sorting how to: https://docs.python.org/3/howto/sorting.html
  4. Python lambda documentation: https://docs.python.org/3/tutorial/controlflow.html#lambda-expressions
