Downloading and parsing Dukascopy tick data with Birt’s PHP scripts
The Dukascopy data is available on the web in its raw form as files that each span only one hour, so some tools are necessary to download and parse it. Before it was possible to get the data via any of the other methods, I wrote a series of scripts that I still use today for downloading the free tick data available from Dukascopy. I’m a fan of PHP’s simplicity, so I chose it for the scripts. They’re not commercial-quality code, but they work.
You can get the PHP script archive from the tick data downloads page.
You will find 4 scripts inside:
- A script for downloading the Dukascopy data, aptly named “download_dukascopy_data.php”. As a courtesy to Dukascopy, which graciously provides the data for free, the script does not attempt to download files that are already on your hard disk. However, it still requests missing files, so to avoid stressing their server please set the dates in the $currencies array at the beginning of the script to the date of your last download. The dates are standard Unix timestamps (epoch time, in essence the number of seconds since 01.01.1970). If you want to easily convert a regular date to a Unix timestamp, you can use Epoch Converter, a very easy to use online tool.
- A script for processing the downloaded data (process_dukascopy_data.php), which assumes that it is located in the same directory as the previous script and that the data was downloaded there. This one needs some parameters; run it without any for a description, or check out the next script.
- A small shell script that processes all the downloaded data, provided as a .bat file for Windows and a .sh file for Linux.
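The "skip what is already on disk" behaviour described for the download script can be sketched as follows. This is a minimal Python illustration, not the code of the PHP script: the function names and the directory layout in `local_path` are hypothetical, chosen only to show the idea of walking the range hour by hour and requesting just the hours with no local file.

```python
import os
from datetime import datetime, timedelta, timezone


def hourly_starts(start_ts, end_ts):
    """Yield the UTC start of each hour, from the hour containing
    start_ts up to (not including) end_ts."""
    current = datetime.fromtimestamp(start_ts, tz=timezone.utc).replace(
        minute=0, second=0, microsecond=0
    )
    end = datetime.fromtimestamp(end_ts, tz=timezone.utc)
    while current < end:
        yield current
        current += timedelta(hours=1)


def local_path(root, pair, hour):
    """Hypothetical on-disk layout: root/PAIR/YYYY/MM/DD/HH.bin.
    The real scripts may store files differently."""
    return os.path.join(
        root, pair, hour.strftime("%Y"), hour.strftime("%m"),
        hour.strftime("%d"), hour.strftime("%H") + ".bin",
    )


def missing_hours(root, pair, start_ts, end_ts):
    """Return the hours in the range that have no file on disk yet."""
    return [
        hour for hour in hourly_starts(start_ts, end_ts)
        if not os.path.exists(local_path(root, pair, hour))
    ]
```

Only the hours returned by `missing_hours` would need a request, which is why moving the start timestamp forward to your last download keeps the number of requests small.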
Windows download & convert to CSV how-to
Start by visiting the Windows PHP download section and fetching the latest binary version as a zip file.
Once you’ve done that, unpack it to c:\php\ and unpack the scripts from the script archive you downloaded into the same directory.
Rename c:\php\php.ini-development to c:\php\php.ini. If your folder does not contain a file named php.ini-development, use php.ini-dist or any other php.ini-something file you can find.
Edit c:\php\php.ini, search for
;extension=php_curl.dll
then remove the semicolon at the start of the line and add “ext/” in front of “php_curl.dll” so that it looks like this:
extension=ext/php_curl.dll
Save the file and exit.
If you run into a zip error and your PHP installation has an ext/php_zip.dll, apply the same change to the php_zip.dll line so that it reads extension=ext/php_zip.dll.
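If you would rather not edit the file by hand, the php.ini change can be sketched in a few lines of Python. This is only an illustration under the assumption that the line appears commented out as `;extension=<dll>`; the function name `enable_extension` is hypothetical, and a php.ini that uses a different form will simply be left untouched.

```python
import re


def enable_extension(ini_text, dll_name, ext_dir="ext/"):
    """Uncomment ';extension=<dll_name>' and prefix the DLL with ext_dir.

    Lines that do not match exactly this commented-out form are not changed.
    """
    pattern = re.compile(
        r"^[ \t]*;[ \t]*extension[ \t]*=[ \t]*" + re.escape(dll_name) + r"[ \t]*$",
        re.MULTILINE,
    )
    # A function replacement avoids any backslash processing in the new text.
    return pattern.sub(lambda match: "extension=" + ext_dir + dll_name, ini_text)


sample = ";extension=php_curl.dll\n;extension=php_zip.dll\n"
print(enable_extension(sample, "php_curl.dll"))
```

To apply it for real, you would read c:\php\php.ini into a string, pass it through the function, and write the result back, keeping a backup copy of the original file first.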
Head to the 7-Zip download page and get the command line version. Unpack it and put 7za.exe in the same directory (c:\php\).
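Click Start -> Run and type
cmd
then click OK (or alternatively type cmd and hit Enter in the Windows 7/Vista “search programs and files” box in the Start menu). Then change to the c:\php directory and start the download by typing
cd c:\php
php download_dukascopy_data.php
in the command window.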
Have a coffee. Have another coffee. Go to sleep. Go to work. Go to the gym. Go to a club. Wait some more. I’m not kidding, it takes ages. You can check the progress by watching the currency pair directories fill up. If you get any strange errors, run the download again once it has finished; it will only fetch the files that were missed the first time due to network errors.
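Watching the directories fill up can also be done with a small script. The sketch below is a Python illustration with a hypothetical function name; it counts the files beneath every subdirectory of a folder. It does not know which folders are currency pairs, so PHP’s own subfolders (such as ext) will show up in the result too.

```python
import os


def files_per_subdirectory(root):
    """Return {subdirectory name: number of files found anywhere beneath it}."""
    counts = {}
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry)
        if os.path.isdir(path):
            counts[entry] = sum(len(files) for _, _, files in os.walk(path))
    return counts


if __name__ == "__main__":
    # Counts for the current directory; run it from the download folder.
    for name, count in files_per_subdirectory(".").items():
        print(name, count)
```

Running it a few minutes apart and comparing the numbers gives a rough idea of how fast the download is progressing.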
If you only want to download some of the available currency pairs, you can edit download_dukascopy_data.php and change the array at the beginning of the file. You can reorder the pairs to change the download order or completely remove the pairs that you don’t want. The number next to each pair is the Unix timestamp at which to start downloading; if you wish to start at a later point in time (the default is the earliest date available), you can use epochconverter.com to get the timestamp for your chosen date.
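If you prefer to compute the timestamp locally instead of using an online converter, the conversion is short. The Python sketch below uses hypothetical function names and assumes you want midnight UTC on the chosen date; whether the download script interprets its timestamps as UTC is an assumption here, not something stated above.

```python
from datetime import datetime, timezone


def to_unix_timestamp(year, month, day):
    """Seconds since 01.01.1970 00:00 UTC for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


def from_unix_timestamp(timestamp):
    """Human-readable UTC date for a Unix timestamp, handy for checking the array."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


print(to_unix_timestamp(2012, 1, 1))    # 1325376000
print(from_unix_timestamp(1325376000))  # 2012-01-01 00:00:00
```

The second function is useful in the other direction: paste a number from the array into it to see which date a pair currently starts from.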
When the download is finished, assuming you wanted to get the EURUSD data up to 01.01.2012, you’d type
php process_dukascopy_data.php EURUSD 200702 201201 EURUSD.csv
and the output will be placed in EURUSD.csv.
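If you process several pairs or date ranges, it can help to build the command programmatically. The Python sketch below assumes, based only on the example above, that the two date arguments are a year and month written as YYYYMM; the function names are hypothetical, and running process_dukascopy_data.php without parameters remains the authoritative description of its arguments.

```python
import subprocess


def process_command(pair, start, end, output=None):
    """Build the argument list for process_dukascopy_data.php.

    start and end are (year, month) tuples, formatted as YYYYMM
    (a format inferred from the example command, not documented here).
    """
    return [
        "php",
        "process_dukascopy_data.php",
        pair,
        "%04d%02d" % start,
        "%04d%02d" % end,
        output or pair + ".csv",
    ]


def run_processing(pair, start, end, workdir):
    """Run the processing script in workdir; raises if php exits with an error."""
    subprocess.run(process_command(pair, start, end), cwd=workdir, check=True)


print(" ".join(process_command("EURUSD", (2007, 2), (2012, 1))))
# php process_dukascopy_data.php EURUSD 200702 201201 EURUSD.csv
```

Calling `run_processing("EURUSD", (2007, 2), (2012, 1), r"c:\php")` would then be equivalent to typing the command above in the c:\php folder.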
Alternatively, you can type
process.bat
which will batch process all the currency data. It’s mostly safe to ignore the error spam at this step. Note: if you use process.bat or process.sh, you might have to update the ending dates in them to get the full data range!
That should be it. If everything went fine, you should have your CSV files in the same c:\php folder and be ready to proceed with preparing your tick data for Metatrader 4.
Warning: make sure you have enough space on your hard disk. As of 2012, the downloaded tick files take up over 20 GB, and if you add up the size of the resulting CSV files you will be well past the 100 GB mark.
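A quick way to check the free space before starting is sketched below in Python. The function names are hypothetical, and the 100 GB threshold in the usage line is only a rough floor taken from the 2012 figures above; the data has presumably grown since then.

```python
import shutil


def free_gb(path):
    """Free space, in GB (1024**3 bytes), on the drive that holds path."""
    return shutil.disk_usage(path).free / 1024 ** 3


def has_room(path, required_gb):
    """True if the drive holding path has at least required_gb free."""
    return free_gb(path) >= required_gb


if __name__ == "__main__":
    # Checks the drive of the current directory; run it from the download folder.
    print("Free space: %.1f GB" % free_gb("."))
    print("Enough for a full download and conversion:", has_room(".", 100))
```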