The new beta version released today, 2.2.16, brings a lot of new translations: Russian, German, Chinese (traditional), Chinese (simplified), Czech and Slovak. Please do not hesitate to contact support if you see any problems with the translated strings. The Russian version in particular is in need of attention: although the translation is finished, the translator can no longer work on the project, so I am looking for somebody to take over. Please get in touch if you’re a native Russian speaker and you are interested.

Other than the newly available languages, the most notable new feature is the ability to download tick data from Rannforex, a relatively new Russian forex broker. There are also lots of small improvements and fixes. One is worth mentioning: a bug was identified and fixed that occurred in a particular configuration, namely delay slippage used with FXT files larger than 2 GB, in which case some ticks were skipped. This was likely to have a minimal (if any) effect on backtests/optimizations, but it did have an effect on some EAs and a couple of users actually ran into the issue.

This new beta version arrives somewhat late compared to the normal release pace. That is mostly because this version was also supposed to include downloading data from Gain Capital (forex.com), but I eventually had to give up on the idea completely after putting several weeks of work into it. Why? Because it’s simply the single most disorganized tick data repository that I’ve ever seen. I’m going to launch into a bit of a rant here, so feel free to stop reading at this point if you don’t care about the technical aspects. Here are some of the issues that I encountered while working on implementing Gain Capital tick data downloading:

  • The file & folder structure is inconsistent. For the first few years the data is packed per year, then it’s packed per month for a short period and finally it’s packed per week.
  • The week numbering looks more or less random. I haven’t been able to identify a consistent rule for when they consider week 1 of a month to begin in the previous month. Actually, I’ve identified a couple of such rules, but none of them applies throughout the whole repository; each only works for certain periods of time. It seems to me that whoever is doing the packing makes an ad-hoc decision each time, but then again I may be mistaken about this. Some months even start directly with week 2 and there is no week 1. Some months have a week 5 that goes into the next month while other months have a week 1 that starts in the previous month. Furthermore, at some point they must have forgotten to pack the data and then packed weeks 2-4 in a single archive instead of 3 separate archives, most likely because whoever did it was too lazy or too incompetent to break it up into 3 parts as it should have been.
  • Some archive files are incorrectly named: I’ve seen several named “WeeX” instead of “WeekX”.
  • The CSV field structure changes over the years.
  • The CSV file in the ZIP archive is sometimes missing the CSV extension.
  • The packing appears to be sloppy at times: I found ZIP archives that contained ZIP archives for other symbols, while some other archives contained tmp files.
  • The file encoding is inconsistent – some of the files are ASCII, some are UTF-16.
  • The time format also keeps changing: there are files that have 3 digits for the milliseconds, files that don’t have millisecond information at all and files that have 9 digits after the seconds, the last 6 of which are always 0.
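
To make the encoding and time format issues above concrete, here is a minimal Python sketch of the kind of normalization they call for. It is not how the Tick Data Suite handles anything: the function names are made up, and the `YYYY-MM-DD HH:MM:SS` timestamp layout is an assumption, since the actual layout varies between files.

```python
import codecs
import re
from datetime import datetime

# Assumed layout: "YYYY-MM-DD HH:MM:SS" plus an optional fractional part.
_TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})(?:\.(\d+))?$")


def decode_tick_bytes(raw: bytes) -> str:
    """Decode file contents that may be either ASCII or UTF-16."""
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return raw.decode("utf-16")  # the BOM selects the byte order
    if b"\x00" in raw[:64]:
        # No BOM but NUL bytes near the start: guess UTF-16 LE (assumption).
        return raw.decode("utf-16-le")
    return raw.decode("ascii")


def parse_tick_time(text: str) -> datetime:
    """Parse a timestamp with 0, 3 or 9 fractional digits."""
    match = _TS.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized timestamp: {text!r}")
    base, frac = match.groups()
    # Pad or truncate the fraction to 6 digits (microseconds).
    micros = int((frac or "").ljust(6, "0")[:6])
    parsed = datetime.strptime(base, "%Y-%m-%d %H:%M:%S")
    return parsed.replace(microsecond=micros)
```

Truncating to microseconds loses nothing for the 9-digit files, given that their last 6 digits are always 0.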

While many of these problems were not insurmountable, a lot of hardcoding would have been required. The major problem was that I found no way to determine the month & week archive for a given date. The only solution would have been to try downloading a couple of different weeks in order to identify the one to which the date belongs, but that would have been too much guesswork and I just can’t abide that.
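
To illustrate how much guesswork that probing involves, here is a rough Python sketch that lists the (year, month, week) archives that might contain a given date. The function name and the naive week-of-month formula are assumptions made purely for illustration; the repository follows no single rule, and this was never implemented.

```python
import calendar
from datetime import date


def candidate_archives(day: date) -> list[tuple[int, int, int]]:
    """Return (year, month, week) guesses for the archive holding `day`."""
    week = (day.day - 1) // 7 + 1  # naive week-of-month, 1..5
    guesses = [(day.year, day.month, w)
               for w in (week, week - 1, week + 1) if 1 <= w <= 5]

    # Early in the month: the day may be packed as week 5 of the previous month.
    if day.day <= 7:
        if day.month == 1:
            guesses.append((day.year - 1, 12, 5))
        else:
            guesses.append((day.year, day.month - 1, 5))

    # Late in the month: the day may be packed as week 1 of the next month.
    days_in_month = calendar.monthrange(day.year, day.month)[1]
    if days_in_month - day.day < 7:
        if day.month == 12:
            guesses.append((day.year + 1, 1, 1))
        else:
            guesses.append((day.year, day.month + 1, 1))

    return guesses
```

Even this simplified version yields three candidates per date, each of which would have to be downloaded and inspected, and it still doesn’t cover the misnamed files or the merged weeks 2-4 archive.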

Bottom line: Gain Capital, if you’re reading this, you should be ashamed. You’re a publicly traded company and you’re too cheap to invest in automating your tick data repository, plus someone to put some order in the existing data… Contract a 3rd party if you don’t have the internal resources; I doubt it would cost a fortune. Writing some scripts to put the existing data in some semblance of order shouldn’t take more than a couple of days.

For those interested, you can manually download tick data from the Gain Capital tick data repository and import it into the Tick Data Suite, but if you batch import you need to make sure that all the files use the same format (check the millisecond digits and the field order in particular).
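
If you’d rather not check the files by eye, a small script can group them by format before a batch import. The Python sketch below is not part of the Tick Data Suite. It assumes each CSV starts with a header line followed by data rows, and the column names in the usage example are made up.

```python
import re
from collections import defaultdict

# Matches HH:MM:SS with an optional fractional part.
_FRAC = re.compile(r"\d{2}:\d{2}:\d{2}(?:\.(\d+))?")


def format_signature(header: str, first_row: str) -> tuple:
    """Summarize a file's format as (field order, fractional digit count)."""
    fields = tuple(name.strip().lower() for name in header.split(","))
    match = _FRAC.search(first_row)
    digits = len(match.group(1) or "") if match else None
    return fields, digits


def group_by_format(samples: dict[str, tuple[str, str]]) -> dict:
    """Group file names by signature; more than one group means mixed formats."""
    groups = defaultdict(list)
    for name, (header, first_row) in samples.items():
        groups[format_signature(header, first_row)].append(name)
    return dict(groups)
```

A possible usage, with the header and first data row of each file read beforehand:

```python
samples = {
    "a.csv": ("time,bid,ask", "2011-01-03 00:00:01.123,1.1,1.2"),
    "b.csv": ("time,ask,bid", "2011-01-05 00:00:01,1.2,1.1"),
}
for signature, names in group_by_format(samples).items():
    print(signature, names)
```

Files that land in different groups should be imported in separate batches.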