Wonderful Inspiration for a Nasty Problem
Friday, January 29th, 2010

I have been working today to try and solve a problem caused by a new data source for the main web app I've been working on since I arrived at the Shop. All the other data sources I've had to use sent me a complete, consistent state in one packet. This meant that I was able to put the data into multiple database tables with foreign keys and have the 'acquired time' be a link from one table to another. This made it easy to see what arrived when. But the latest data source is more like an event-driven system, and it's not being so considerate.
This new guy sends packets of information at each level - and they are totally dissociated from one another. It's like they are destined for cells in a spreadsheet, where the user is supposed to know that the goal is the most recent data, even if it's not directly comparable and consistent.
For instance, there are about a dozen groups, and each group has from less than a dozen to upwards of 75 symbols in it. Each of these pieces of information is arriving at different times, and there's no guaranteed consistency between them. This includes the elements of the group and the group totals. It's meant to be "the best I know, with the least bandwidth". I can see the reason for the design, but it's very, very different from what we have for the remainder of the data sources. And this is causing me grief.
No longer can I look at the arrival time of the group totals and know that it represents the arrival time of all the elements in the group. Nope... they each have their own arrival times. So in order to get the complete state of this data source, the SQL to fetch it out has become far more complex and time-consuming. It was really getting to be annoying.
And then a co-worker said something that made it all click. "Just buffer it up".
Wonderfully simple. I don't know why I didn't think of it before. Well... probably because I was thinking of processing the data as it arrived, and we can't do that. What we can do is take the data as it arrives and buffer it, saving the arrival time as the 'generation time'. Then, when the group totals packet arrives, use that as the 'acquired time' and save it, along with all its components, to the database.
This makes it possible to see the individual arrival times (generation time), as well as link the data all together with the 'acquired time'. It made the "shotgun" data source fit into the existing mold that I'd created for the other data sources. Sure, it was simple, but it was something that hadn't occurred to me, and I was better for having listened to him and understood how to synchronize things.
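As a rough illustration, the "buffer it up" idea might look something like this in Python. All the names here are hypothetical (the post doesn't show any code): element packets are held in a per-group buffer, stamped with their own arrival time as the 'generation time', and when the group totals packet shows up, its arrival time becomes the shared 'acquired time' for the whole group, and everything is flushed to storage together.

```python
from collections import defaultdict
from datetime import datetime, timezone

class GroupBuffer:
    """Sketch of buffering dissociated element packets until the
    group-totals packet arrives, then flushing a consistent snapshot.
    (Hypothetical names; not the author's actual code.)"""

    def __init__(self, sink):
        # group -> {symbol: (generation_time, data)}
        self.pending = defaultdict(dict)
        # sink: callable that persists a list of rows, e.g. a DB writer
        self.sink = sink

    def on_element(self, group, symbol, data):
        # Stamp each element with its own arrival ('generation') time.
        gen_time = datetime.now(timezone.utc)
        self.pending[group][symbol] = (gen_time, data)

    def on_group_total(self, group, totals):
        # The totals packet's arrival becomes the 'acquired time'
        # shared by every buffered element in the group.
        acquired = datetime.now(timezone.utc)
        rows = [(group, symbol, gen_time, acquired, data)
                for symbol, (gen_time, data)
                in self.pending.pop(group, {}).items()]
        # Store the totals row under the same acquired time.
        rows.append((group, None, acquired, acquired, totals))
        self.sink(rows)
```

The nice property is that a single `acquired time` still links a whole group together (as with the older data sources), while the per-element `generation time` preserves when each piece actually arrived.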
In the end, I took about an hour to fix things up and it is working wonderfully. Very neat.