Download Cryptocurrency Data with CCXT

CCXT is a cross-language library for interacting with crypto exchange APIs. At the time of writing, the library provides a single interface to over 100 cryptocurrency exchanges. In this post, we are going to use the library to build a tool that will allow us to download cryptocurrency data for backtesting. For more information on the library, visit CCXT’s GitHub page (referenced below). For those of you who prefer a quick overview, here is a snippet straight from the horse’s mouth:

It is intended to be used by coders, developers, technically-skilled traders, data-scientists and financial analysts for building trading algorithms on top of it.
Current feature list:

  • support for many exchange markets, even more upcoming soon
  • fully implemented public and private APIs for all exchanges
  • all currencies, altcoins and symbols, prices, order books, trades, tickers, etc…
  • optional normalized data for cross-exchange or cross-currency analytics and arbitrage
  • an out-of-the box unified all-in-one API extremely easy to integrate
  • works in Node 7.6+, Python 2 and 3, PHP 5.3+, web browsers

You can find out more about the project here:

Installing the Library

As with most modules, installing CCXT is made simple via pip. Just open up the command line and type:
pip install ccxt
If you happen to have python2 installed alongside python3 and want to install it for python3, simply replace pip with pip3.

Other Dependencies

If this is your first time visiting the site, it is assumed you know how to install Python 3 and run some basic scripts. If that is not you, try looking at the Getting Setup: Python and Backtrader post first.
Lastly, the script below also requires pandas to be installed. However, the code could be updated to not require it. The download tool only uses pandas to massage and export the data received. To install it, use the same pip command as above but replace ccxt with pandas.
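Before running anything, it can help to confirm that both modules are visible to the interpreter you intend to use. The snippet below is a minimal sketch using only the standard library; the helper name missing_modules is made up for this illustration.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Report which of the two dependencies still need a `pip install`.
missing = missing_modules(["ccxt", "pandas"])
if missing:
    print("Not installed yet:", ", ".join(missing))
else:
    print("ccxt and pandas are both available")
```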

Download Tool Overview

In the first iteration, the download tool shall be a simple command line tool. It shall allow simple arguments to be specified at runtime using the argparse module for flexibility. This means we can avoid altering the code every time we want to download different data. For a deeper look at argparse and how it can be used with Backtrader, see here: Using argparse to change strategy parameters
The download tool shall allow users to specify the following options:

  • Instrument (Currency Pair)
  • Timeframe (1m, 5m, 1 hour etc)
  • Exchange
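
As a rough sketch, the three options could be captured with argparse along these lines. The short flags -s, -e and -t match the usage examples later in the post; the long option names, the help texts and the default timeframe are assumptions made for this illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the three download options; `argv=None` falls back to sys.argv."""
    parser = argparse.ArgumentParser(description="Download OHLCV data with CCXT")
    parser.add_argument("-s", "--symbol", required=True,
                        help="Currency pair, e.g. NEO/ETH")
    parser.add_argument("-e", "--exchange", required=True,
                        help="Exchange id, e.g. binance")
    # The default below is an assumption, not taken from the original script.
    parser.add_argument("-t", "--timeframe", default="1d",
                        help="Candle size, e.g. 1m, 5m, 1h, 1d")
    return parser.parse_args(argv)


# Passing an explicit list makes the example independent of the real command line.
args = parse_args(["-s", "NEO/ETH", "-e", "binance", "-t", "1d"])
print(args.symbol, args.exchange, args.timeframe)
```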

Notice how there are no inputs for start and end dates? The tool shall download everything that is available using CCXT’s fetchOHLCV method. Once the data is downloaded, it shall be exported to a CSV file in the current folder. You may wish to tweak this and add another argparse argument to specify the output directory.
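
To make the export step concrete, here is a minimal sketch that writes candles to CSV. It swaps pandas for the standard library csv module so that it runs without extra dependencies, and it assumes each candle is a list of [timestamp in milliseconds, open, high, low, close, volume], which is the shape CCXT returns. The column names and the file naming scheme are hypothetical.

```python
import csv
import sys
from datetime import datetime, timezone

HEADER = ["Timestamp", "Open", "High", "Low", "Close", "Volume"]


def output_filename(exchange_id, symbol, timeframe):
    """Hypothetical naming scheme; '/' is dropped because it is a path separator."""
    return "{}-{}-{}.csv".format(exchange_id, symbol.replace("/", ""), timeframe)


def write_ohlcv_csv(rows, fileobj):
    """Write candles to an open text file, converting ms timestamps to UTC strings."""
    writer = csv.writer(fileobj, lineterminator="\n")
    writer.writerow(HEADER)
    for ts_ms, open_, high, low, close, volume in rows:
        stamp = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
        writer.writerow([stamp.strftime("%Y-%m-%d %H:%M:%S"),
                         open_, high, low, close, volume])


# Made-up sample candle, printed to the terminal instead of a file.
sample = [[1514764800000, 0.1, 0.2, 0.05, 0.15, 1000.0]]
print(output_filename("binance", "NEO/ETH", "1d"))
write_ohlcv_csv(sample, sys.stdout)
```

To write to disk instead, open the target with open(path, "w", newline="") and pass the file object in place of sys.stdout.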

The Code

The code is quite straightforward. After using argparse to capture our exchange, instrument and timeframe requirements, the bulk of the work is spent error checking before calling CCXT’s fetch_ohlcv method.
Speaking of error checking, not all exchanges support fetching of OHLCV data. Furthermore, not all exchanges support the same timeframes and currency pairs. Since there are over 100 exchanges supported at the time of writing, it is easy to imagine that there will be a lot of variance between exchanges. To avoid the frustration of bumping into error after error while trying to guess which timeframes and/or currency pairs are supported, the code performs a series of checks before attempting to download data. If it detects that something is not supported, it will notify the user and provide a list of alternative parameters where applicable.
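A sketch of what such checks might look like is shown below. It relies on the has, timeframes and symbols attributes that CCXT exchange objects expose (symbols is only populated once markets are loaded). The function name and the message wording are made up here, and a plain stand-in object is used so the example runs without network access.

```python
from types import SimpleNamespace


def check_request(exchange, symbol, timeframe):
    """Return a list of problems; an empty list means the request looks supported."""
    problems = []
    if not exchange.has.get("fetchOHLCV"):
        # Nothing else is worth checking if OHLCV data cannot be fetched at all.
        problems.append("{} does not support fetching OHLCV data".format(exchange.id))
        return problems
    timeframes = exchange.timeframes or {}
    if timeframe not in timeframes:
        problems.append("Timeframe {} is not supported, try one of: {}".format(
            timeframe, ", ".join(timeframes)))
    symbols = exchange.symbols or []
    if symbol not in symbols:
        problems.append("Symbol {} is not supported, try one of: {}".format(
            symbol, ", ".join(symbols)))
    return problems


# Stand-in for a CCXT exchange whose markets have already been loaded.
demo = SimpleNamespace(id="demo", has={"fetchOHLCV": True},
                       timeframes={"1h": "1h", "1d": "1d"},
                       symbols=["NEO/ETH", "BTC/USD"])
for problem in check_request(demo, "ATP/USD", "12h"):
    print(problem)
```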
Other than that, it is worth paying attention to the exchange.load_markets() call if you plan to expand upon the code or work with the library yourself. Admittedly, I spent a little too much time scratching my head, trying to figure out why every exchange I tested seemed to have empty markets, symbols and markets_by_id dictionaries. It turns out the answer is tucked away in the documentation: you need to call load_markets() before any data is available. This was a good reminder to read the documentation carefully! A little time spent reading can save a lot of time.
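The ordering matters, so here is a small sketch of it. The helper name load_symbols and the FakeExchange stand-in are invented for the example; with the real library you would pass an exchange instance such as ccxt.binance() instead.

```python
def load_symbols(exchange):
    """Load market metadata first, then return the sorted symbol list."""
    exchange.load_markets()
    return sorted(exchange.symbols)


class FakeExchange:
    """Stand-in that mimics the behaviour described above: no symbols until loaded."""
    symbols = None

    def load_markets(self):
        self.symbols = ["NEO/ETH", "BTC/USD"]
        return {symbol: {} for symbol in self.symbols}


exchange = FakeExchange()
print(exchange.symbols)        # nothing available before load_markets()
print(load_symbols(exchange))  # populated once markets are loaded
```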
Finally, if you are new to programming, the getattr() call on the ccxt module may be new to you. getattr() is a Python built-in function which allows you to “get an attribute” from an object using a string. In this case, we are getting the correct exchange object from the given argument string.
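For illustration, the same lookup works on any module. The sketch below first uses the standard math module, then wraps the pattern in a hypothetical make_exchange helper (not part of CCXT) that is exercised with a stand-in module so it runs without the library installed.

```python
import math
from types import SimpleNamespace

# getattr(obj, "name") is equivalent to writing obj.name directly.
sqrt = getattr(math, "sqrt")
print(sqrt is math.sqrt)


def make_exchange(module, exchange_id):
    """Look up a class on `module` by its string name and instantiate it."""
    try:
        exchange_class = getattr(module, exchange_id)
    except AttributeError:
        raise ValueError("Unknown exchange: {}".format(exchange_id)) from None
    return exchange_class()


class DemoExchange:
    id = "demo"


# Stand-in for the ccxt module; with the real library, pass `ccxt` itself.
fake_module = SimpleNamespace(demo=DemoExchange)
print(make_exchange(fake_module, "demo").id)
```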

Usage Examples

Please note that the following usage examples assume the script is saved as

NEO/ETH Binance Daily

python -s NEO/ETH -e binance -t 1d

Unsupported Timeframe

The example demonstrates the expected result when an unsupported timeframe is given as an argument:
python -s NEO/ETH -e bittrex -t 12h

Unsupported Symbol

The second example has a fair bit of output. It demonstrates the expected result when an unsupported symbol is given as an argument:
python -s ATP/USD -e bittrex -t 5m

Unsupported Exchange

The final example demonstrates the expected result when an exchange does not support fetching of OHLCV data:
python -s BTC/USD -e gdax -t 15m

Issues and Improvements

The code tries to be user-friendly in that it will tell the user when incorrect data is provided. Having said that, not all edge cases have been considered and there is definitely room for improvement in that regard.
In addition to this, the amount of data available for download on most exchanges is limited. CCXT does provide a since parameter for the fetchOHLCV() method, but I did not see any detailed description of it in the fetchOHLCV() documentation. Having said that, an explanation of a parameter with the same name does appear in the “Trades, Executions and Transactions” section as follows (Note: the parameter appears frequently, so I assume it has consistent functionality, but that is no guarantee!):

The second optional argument since reduces the array by timestamp, the third limit argument reduces by number (count) of returned items.

So taking that into account, it appears since should reduce the amount of data provided. However, it would be good to know for sure. If anyone pokes through the codebase and has an answer, I would be happy to hear it. Leave a comment below!
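
If since does behave as a starting timestamp in milliseconds (an assumption here, in line with the quote above), it could be used to page through history in several requests. The sketch below is one way to try that idea, not a confirmed recipe: the helper name, the default limit and the request cap are made up, and a stand-in exchange is used so the loop can be run offline.

```python
def fetch_all_ohlcv(exchange, symbol, timeframe, since, limit=500, max_requests=100):
    """Request candles repeatedly, moving `since` just past the last candle seen."""
    candles = []
    for _ in range(max_requests):
        batch = exchange.fetch_ohlcv(symbol, timeframe, since=since, limit=limit)
        # Keep only candles newer than what we already hold, to avoid duplicates.
        fresh = [c for c in batch if not candles or c[0] > candles[-1][0]]
        if not fresh:
            break  # nothing new came back, so stop asking
        candles.extend(fresh)
        since = candles[-1][0] + 1
    return candles


class FakeExchange:
    """Stand-in that returns candles at or after `since`, capped at `limit`."""

    def __init__(self, data):
        self.data = data

    def fetch_ohlcv(self, symbol, timeframe, since=None, limit=None):
        rows = [r for r in self.data if since is None or r[0] >= since]
        return rows[:limit]


# Ten made-up candles, one per second, fetched three at a time.
history = [[t, 1.0, 1.0, 1.0, 1.0, 1.0] for t in range(0, 10000, 1000)]
result = fetch_all_ohlcv(FakeExchange(history), "NEO/ETH", "1m", since=2000, limit=3)
print(len(result), result[0][0], result[-1][0])
```

Against a real exchange, each pass through the loop is a network request, so the exchange's rate limits would need to be respected.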
