Archive for March, 2011

Still Trying to Find More Speed

Tuesday, March 8th, 2011

bug.gif

I've spent a few hours today trying to find even more speed in my exchange decoder. It's the core part of the ticker plants as it's the first component in the chain: the exchange datagrams come into this component, are converted into our own message format, and are sent downstream. The problem I'm seeing is that the time to get through this step is far too long, to my mind. I'm seeing times in the hundreds of milliseconds - and that's not right.

So I'm trying to find the problem. It's not easy because I can only do this during exchange hours, but even then, it's not obvious where the problem lies. I clearly need to do more work.

I'm concerned that it's in the OPRA decoding - that would be a tragic problem. Messing with that code could be really dangerous.

I Shouldn’t Have Watched House, M.D. Last Night

Tuesday, March 8th, 2011

TV.jpg

OK, I really like House, M.D. and have really enjoyed the last two seasons where he's kicked the vicodin and started being a little softer, almost like a Secret Santa of sorts. And I'll admit I get sucked into the suspension of disbelief for this series - I start to see these characters as people and not fiction... when he saves people, it's great. When it's funny, it's a scream. But when, like last night, it's sad for House, it's really sad. And it lingers.

I know I am just the kind of audience producers love. But they will lose me if the show "jumps the shark" and causes House to relapse into his old, self-destructive ways. I know it's all about the "drama", but this show isn't about the drama for House, but the drama that swirls around him. He's clearly tortured when he can't solve the patient's puzzle, no matter how much he says he doesn't care. To heap more on him seems, to me, to be too much.

I hope they resolve this very soon.

The Conversion from Decimals to Integers

Monday, March 7th, 2011

GeneralDev.jpg

This afternoon I've been working very hard to convert all the prices and decimal numbers in my ticker plant codebase from float values to uint32_t values with a given getDecimalMultiplier() on each message. This came up in a meeting regarding another group's use of the codebase - they currently don't use floating point numbers, but rather an integer and a multiplier. OK... I can fix that, and so I did.

First thing was realizing that a uint32_t was sufficient, as that would give me a 10,000 multiplier and values in excess of $400,000.00 when divided out. Good enough. Then I had to go into the code, replace all the values, and add constructors and methods that either take the floating point number and convert it to the proper integer, or take the integer and use it directly.

The next thing was to look at the conversion/sampling functions on the exchange data. A lot of these take an integer mantissa and a divisor code and generate a float. What I needed to do was to alter these, or make similar functional versions, where they would take the same arguments and generate the correct integer representation of the value - scaled by 10,000 (my new multiplier). Again, not really hard, but it's detail work - making sure you get all the conversions done and don't lose any precision in the process.

Next, I created getters and setters for the messages that allowed the user to get the integer or floating point value at their choice. The scheme I used was to say that getPrice() got the decimal number and getPriceAsInt() got the biased integer. Pretty simple, and I don't think I'm going to have a lot of confusion here, which is very important.

Finally, with nothing but a few float values remaining - and the getters and setters using float arguments, I decided it was better to do a complete conversion to double and get rid of any float values in the processing. It's cleaner, more easily dealt with at the chip level, better scale and accuracy -- it's just better.

With this, I have everything stored as integers, with the multiplier available to the clients, and even decimal getters if they don't want to hassle with the conversions themselves. It's about as clean as I can imagine making it.

At What Point Does the Whoop-Ass Come Out?

Monday, March 7th, 2011

cubeLifeView.gif

So I'm sitting here this morning thinking about all the emails I had to respond to this morning and wondering When can I pull out a can of Whoop-Ass? or just as effectively, When will someone pull out a can of Whoop-Ass? I mean really. Please. Now.

I'm still a bit of the "new guy" here, and I realize that there are a lot of things in every corporate culture that are really evolutionary in their entrenchment, and can't realistically be changed without major moves at the top. But I'm not thinking about that. I'm thinking about all the people that are fighting me - really fighting me, on the new ticker plant.

Sure, there are those that see it as an untested work, and worthy of suspicion. For those, I offer tests, and testing time for them to run their tests to see if it's working. I don't mind these people at all - in fact, I welcome them because they are the converts that will become my strongest allies. They will pick up the banner and run with it because they beat the crap out of it and proved to themselves that this is a good system, and it's now something they can treat as a provided service and not have to maintain themselves.

Then there are those that see their only value to the organization is the code they wrote a few months/years ago, and I'm talking about taking all that away from them. Maybe it's the team they have built up to help maintain the code. Mine would strip all that away from them as their team would no longer be needed to maintain this code. All these threats breed fear, and fear breeds anger. I see it, but I don't even pretend to understand it.

There are also the completely clueless, often fed information by the threatened, and that's a dangerous combo. I've had several conversations and untold emails from folks like this. They say things like "our needs are different", and "if you consider the efficiency, you'll see this is needed" - as if I've never even tried to understand their needs (I have - for months), and as if I'm completely unaware of the engineering issues involved (I'm not - I have been for years). It's best when it's all sugary-sweet in the tone you'd hear a parent of a young child use. Yeah... that's right, I'm your 4 year old writing this code and need a little help from Mommy and Daddy.

It's about professionalism, folks. Let's just all treat each other professionally. We don't have to agree on everything - we don't even have to agree on a lot of things. But we do have to agree on facts and numbers. If you show me test results, and I disagree with them, it's up to me to provide proof, typically in the form of better tests, that refute the data. Otherwise, I have to accept them. That's professionalism.

Likewise, if management (whoever that is) decides that we need to go in a certain direction, then that's what we all have to do. It's not a question of whether or not we like it... it's the job. Period. Done.

But it seems that things aren't quite this way. I'm sure the founders expect their instructions to be followed without delay, grief, or crud. And they should. And I'm not totally sure where things start to break down, but by the time they are at my level, it's clear that something very bad has happened to the chain of command and the idea of professionalism. I'm not hypersensitive to it, but I do think I shouldn't have to spend months dealing with people fighting me on topics that I've been asked to take on by my partners in this firm.

I'm not saying it's my ideas. It's their ideas, I'm just the instrument of change. Still... these people are doing all they can to stop this implementation. Now we've moved and I'm sitting right in the middle of the snake pit. I understand why the managers did this - they want me to effect the maximum change I can, but they aren't thinking that I'm getting sick and tired of this.

They're just going to burn me out if things don't change soon. Face it - how long are you interested in fighting the fight assigned to you with no real support from those that assigned it? I'll do it for a while, and then I'll just say "Do whatever. Don't care." and I'm all done.

Guess I'm closer to that than I thought.

Finally Realizing One Size Never Fits All

Friday, March 4th, 2011

GottaWonder.jpg

I originally designed my ticker plants to fit a specific client: the systems feeding the human traders. Eyeballs. There was no need to have everything up-to-date every millisecond - the human eye can't tell, and the systems don't update faster than a few times a second. It's just a waste. But what they do care about is that when they see the change, it's the latest data available. This means don't queue it up! You have to remember the order the ticks came in, but allow for updates to the data to replace the old with the new. This is commonly called conflation. It's a good thing for systems delivering data to humans.

But automated trading systems don't want this. They want every tick. They want it all as fast as possible. It's understandable - if you can make a machine able to see everything, then you have a much better chance of seeing opportunity and therefore making a profit. While I didn't design my ticker plants for these kinds of systems, several months ago, I was asked to make it work for these kinds of systems.

I've spent a lot of time trying to speed things up so that one system is capable of meeting the needs of both kinds of clients. It's been very difficult, and in a very real sense, what I've been doing is dumbing down my system to force the clients to handle every tick. If I could have done it, it would have been fantastic. But it really isn't possible. The compromises for one client are just too far from the compromises for the other.

So I finally had another little Ah! Ha! moment - Stop trying to make one size fit all. Elementary, but true, and an important understanding of really making something good for everyone.

If I made my ticker plants the way I started - for the 'slow' trading, and then had the 'fast' trading use an embedded ticker plant, then those that needed speed wouldn't even have to deal with a network hop. That's good. No serialization or deserialization. No worries about dropping packets from the server to the client. There are a lot of things that just "go away" when you decode and use the data in the same process.

I do this in my NBBO server - I have n exchange feeds all going into one NBBOEngine, and then sending it out to the clients. I don't take in the feed, process it, and then send it out - that'd take too long. I process the feed within the process space of the consuming application.

The resources to do this aren't horrible: two threads, less than a core, and some memory. All this can be dealt with very easily by adding a box or two, if necessary. These boxes could be the "servers" you turned off because you no longer need them. In any case, it's a very solvable problem.

In the end, those that need conflation get it, and those that don't want it, get the data in-process as fast as possible. It's really the best of both worlds as it doesn't make compromises for one client or another.

Google Chrome dev 11.0.686.3 is Out

Friday, March 4th, 2011

Seems there's another quick-fix for Google Chrome dev to bring it to 11.0.686.3 - this time for an autofill-related crash. Fair enough - it's nice that they're being this responsive, but if it's just a day, they could have held the original release and skipped these two updates. But maybe they had to release for political reasons.

Successful Tests with ZeroMQ – Time to Update

Thursday, March 3rd, 2011

ZeroMQ

I've had a very successful day testing ZeroMQ in my ticker plants with the updated parameters that had been hinted to me by a co-worker. It's not something I'd have thought to try, given that we're using OpenPGM - I thought the socket buffers were going to be controlled by OpenPGM, but I guess not.

In any case, if I create a socket and then set the send and receive buffers to 64MB and the peak speed to 500Mbps with a 100 msec recovery interval:

  // set the send and receive buffers to 64MB each
  static int64_t      __sndbuf = 67108864;
  static int64_t      __rcvbuf = 67108864;
  // have the maximum sending rate be 500Mbps
  static int64_t      __rate = 500000;
  // ...and the recovery interval 100 msec
  static int64_t      __recovery = 100;
 
  // create the socket...
  try {
    mSocket = new zmq::socket_t(*mContext, ZMQ_PUB);
    if (mSocket == NULL) {
      error = true;
      cLog.error("could not create the socket!");
    } else {
      // now let's set the parameters one by one...
      mSocket->setsockopt(ZMQ_SNDBUF, &__sndbuf, sizeof(__sndbuf));
      mSocket->setsockopt(ZMQ_RCVBUF, &__rcvbuf, sizeof(__rcvbuf));
      mSocket->setsockopt(ZMQ_RATE, &__rate, sizeof(__rate));
      mSocket->setsockopt(ZMQ_RECOVERY_IVL_MSEC, &__recovery,
                          sizeof(__recovery));
      // now let's connect to the right multicast group
      mSocket->connect(aURL.c_str());
    }
  } catch (zmq::error_t & e) {
    cLog.error("while creating the socket an exception was thrown!");
    if (mSocket != NULL) {
      delete mSocket;
      mSocket = NULL;
    }
  }

I've got a lot more testing to do, but these parameters really seem to help. Very nice.

The next step is to get the latest code from the GitHub git repo and try it. There are a ton of new features and lots of fixes which hopefully will clear up the last of the problems I'm seeing.

Erlang Ring Benchmark from Chapter 8 of J Armstrong Book

Thursday, March 3rd, 2011

erlang

I've been trying to learn erlang from the book Programming Erlang by J Armstrong, and one of the first real challenges was the exercises in chapter 8 where he challenges me to:

Write a ring benchmark. Create N processes in a ring. Send a message round the ring M times so that a total of N * M messages get sent. Time how long this takes for different values of N and M.

Write a similar program in some other programming language you are familiar with. Compare the results. Write a blog, and publish the results on the Internet!

One thing I think he missed in the problem statement is that a message passed from one node to another really should have a response sent by the receiver. In my design, I planned to have a 'ping' message sent, received, and a 'pong' message sent back. The receipt of the 'pong' message would be a no-op, but it needed to be received. Other than that, the design was just like the problem statement.

My solution to the problem is this:

  1. -module(ring).
  2. -export([start/0, test/2]).
  3.
  4. %% make the state container for the ring node
  5. -record(state, {next, origin, caller}).
  6.
  7. %% standard entry point for a 1000 node, 500 cycle test
  8. start() ->
  9.     test(1000, 500).
  10.
  11. %% make a synchronous message call to the pid and wait for response
  12. rpc(Pid, Request) ->
  13.     Pid ! {self(), Request},
  14.     receive
  15.         {Pid, Response} ->
  16.             Response
  17.     end.
  18.
  19. %% main messaging loop for all nodes in the ring
  20. loop(State) ->
  21.     receive
  22.         %% the head of the ring needs to know it's the origin
  23.         {From, {origin, Origin}} ->
  24.             From ! {self(), Origin},
  25.             loop(State#state{origin=Origin});
  26.
  27.         %% building the ring is a countdown of creations
  28.         {From, {build, Count}} when Count > 1 ->
  29.             Node = spawn(fun() -> loop(State) end),
  30.             rpc(Node, {build, Count-1}),
  31.             From ! {self(), Count},
  32.             loop(State#state{next=Node});
  33.         %% ...to the final node that circles back to the origin
  34.         {From, {build, Count}} ->
  35.             From ! {self(), Count},
  36.             loop(State#state{next=State#state.origin});
  37.
  38.         %% starting the test kicks it off and saves the caller
  39.         {From, {go}} ->
  40.             State#state.next ! {self(), {ping}},
  41.             loop(State#state{caller=From});
  42.
  43.         %% the ping needs to answer and then stop or continue
  44.         {From, {ping}} ->
  45.             From ! {self(), {pong}},
  46.             if
  47.                 State#state.origin =:= self() ->
  48.                     State#state.caller ! {self(), 1};
  49.                 true ->
  50.                     State#state.next ! {self(), {ping}}
  51.             end,
  52.             loop(State);
  53.         %% ...the response to a pong is to do nothing
  54.         {_, {pong}} ->
  55.             loop(State)
  56.     end.
  57.
  58. %% build a ring of 'N' nodes, and run through this 'M' times...
  59. test(Nodes,Cycles) ->
  60.     io:format("starting the build and exercise of the ring...~n"),
  61.     statistics(runtime),
  62.     statistics(wall_clock),
  63.     State = #state{},
  64.     Head = spawn(fun() -> loop(State) end),
  65.     rpc(Head, {origin, Head}),
  66.     rpc(Head, {build, Nodes}),
  67.     _ = [rpc(Head, {go}) || _ <- lists:seq(1,Cycles)],
  68.     {_, Runtime} = statistics(runtime),
  69.     {_, Walltime} = statistics(wall_clock),
  70.     U1 = Runtime * 1000 / (Nodes*Cycles),
  71.     U2 = Walltime * 1000 / (Nodes*Cycles),
  72.     io:format("total cpu=~pms ... ~pus/op and wall=~pms ... ~pus/op~n",
  73.               [Runtime, U1, Walltime, U2]).

There are several things I think are important watershed events in the code that really started to solidify my understanding of erlang. I think it's worth going over them to make sure it's easy to follow along.

There are Only Functions

Seems odd, but really the entire language is a series of functions. This may seem obvious to someone thinking Hey! It's a functional language, Bob! but it wasn't clear to me as I started this exercise. There are variables, but their scope is so limited that it's really just a series of function calls. If you want to build the structure of a ring, you have to have some idea of the head of the ring, the N-1 'other' nodes, and then loop it back to the head. This 'next' state is essential for a node, and it's not at all obvious where that's stored.

In truth, it's stored in the arguments to the loop() function. This was my first Ah! Ha! moment:

All state is maintained as function arguments.

Seems silly, but I wish he'd said that in the book. It sure would make things a lot easier. Think of a superconductor: you have state maintained in the "execution ring" of the typical loop() function. Once I got that, it was clear to stop trying my other methods.

State is Held in Records

Passing all this state-based data as arguments to functions gets ugly very fast. So the solution was to create records. Second Ah! Ha! moment:

State is conveniently held in records that are easily updated in parts.

This was major as it just isn't stated in the book that there's a reason for these records, and that state maintenance is it. They could really have said something and made it far easier to catch the major points.

Initializing Processes is a Method Call (or Two)

Because a process is created with the spawn() function, if you want it to refer to itself by anything other than the self() function, you have to send it a message. Lines 22-25 handle the message that's used to tell the Head of the ring that it is, in fact, the head of the ring. Since there's no state in the process other than what it maintains in a calling loop, you have to start that loop, and then "feed it" the data that it can "piece together" to form the complete state you want it to have.

This is more than a little complicated, because you really can have state in a process, but that state is really just held in a "ring" of looping calls like electrons in a superconductor. You have to set up the conditions under which they will flow, and then insert the data that flows.

I get it, but larger, more complex systems might be a real pain to keep straight. We'll have to see how things go.

Results

When I ran this test I got the following:

  29> c(ring).     
  {ok,ring}
  30> ring:start().
  starting the build and exercise of the ring...
  total cpu=2040ms ... 4.08us/op and wall=1692ms ... 3.384us/op
  ok
  31>

Now I haven't written my C++ equivalent - yet, but there's no way I'm not going to be able to beat this. First off, the CPU time is longer than the wall clock time? That makes no sense at first glance. I've double-checked the code, but yeah, it's longer - my best guess is that the runtime is summing CPU time across the SMP schedulers. Even so, 4 μsec/op is not all that fast for something this simple. Again, I'll have to write the C++ version and see, but I'm guessing I'll be able to beat this handily.

We'll see.

[3/14] UPDATE: I just made a C++ equivalent of this erlang code and it's not too bad. Yeah, it's about twice as long as the erlang code - in terms of number of lines, but it's clean, and it's got a lot more error checking than the erlang code does.

/**
 * ring.cpp - this is the C++ equivalent of the Armstrong Chapter 8
 *            exercise where you are supposed to make a ring of 'n'
 *            objects and have one fire another for a total of 'm' laps.
 */
//  System Headers
#include <stdint.h>
#include <iostream>
#include <sys/time.h>
 
//  Third-Party Headers
 
//  Other Headers
 
//  Forward Declarations
 
//  Public Constants
 
//  Public Datatypes
 
//  Public Data Constants
/**
 * These are the different messages that we're going to pass around
 * from Node to Node. They will be simple uint8_t values as they don't
 * need to be anything special.
 */
#define PING    0
#define PONG    1
 
 
 
/**
 * This is the node that will make up the ring. It's got a nice pointer
 * to the next Node in the ring and a few simple methods to make the
 * ring a little easier to build and use.
 */
class Node {
    public:
        // Constructors and Destructors
        Node() : mNext(NULL), mStopOnPing(false) { }
        ~Node() { }
 
        // Accessor Methods
        void setNext( Node *aNode ) { mNext = aNode; }
        void setStopOnPing( bool aFlag ) { mStopOnPing = aFlag; }
        bool stopOnPing() { return mStopOnPing; }
 
        // send the message to the target where it can respond
        bool send( Node *aTarget, uint8_t aMessage ) {
            bool        error = false;
            if (aTarget == NULL) {
                error = true;
            } else {
                error = !aTarget->onMessage(this, aMessage);
            }
            return !error;
        }
 
        // this method is called when a message is sent to this guy
        bool onMessage( Node *aSource, uint8_t aMessage ) {
            bool        error = false;
            switch (aMessage) {
                case PING:
                    if (((error = !send(aSource, PONG)) == false) &&
                        !mStopOnPing) {
                        error = !send(mNext, PING);
                    }
                    break;
                case PONG:
                    break;
                default:
                    error = true;
                    break;
            }
            return !error;
        }
 
        // this is a simple way to send a ping around the ring
        bool ping() {
            return send(mNext, PING);
        }
 
    private:
        // this is the next node in the ring - wrapping back around
        Node    *mNext;
        // ...lets me know if I need to stop on a PING (loop done)
        bool    mStopOnPing;
};
 
 
/**
 * This method just gives me a nice microseconds since epoch that I can
 * use for timing the operations.
 */
uint32_t snap() {
    struct timeval tp;
    gettimeofday(&tp, NULL);
    return (tp.tv_sec * 1000000) + tp.tv_usec;
}
 
 
/**
 * This is the main entry point that will build up the ring and then fire
 * it off 'm' times and then we'll see how fast it runs.
 */
int main(int argc, char *argv[]) {
    bool        error = false;
 
    // start off with the defaults for the program
    uint16_t    n = 1000;
    uint16_t    m = 500;
 
    // start the timer
    uint32_t    click = snap();
 
    // now, let's make the ring of the right size, holding onto the head
    Node    *head = NULL;
    if (!error) {
        std::cout << "Building the " << n << " element ring..."
                  << std::endl;
        if ((head = new Node()) == NULL) {
            error = true;
        } else {
            head->setStopOnPing(true);
        }
    }
    Node    *tail = head;
    for (uint16_t i = 0; !error && (i < (n - 1)); ++i) {
        Node    *newbie = new Node();
        if (newbie == NULL) {
            error = true;
            break;
        } else {
            tail->setNext(newbie);
            tail = newbie;
            tail->setNext(head);
        }
    }
 
    // now let's run it the right number of times
    if (!error) {
        std::cout << "Running the " << n << " element ring "
                  << m << " times..." << std::endl;
        for (uint16_t i = 0; i < m; ++i) {
            head->ping();
        }
    }
 
    // stop the timer
    if (!error) {
        click = snap() - click;
        std::cout << "Took " << click << " usec or "
                  << click*1000.0/(n*m) << " nsec/op"
                  << std::endl;
    }
 
    return (error ? 1 : 0);
}

When I run this guy, I get a much different runtime:

  peabody{drbob}23: c++ ring.cpp -o ring
  peabody{drbob}24: ring
  Building the 1000 element ring...
  Running the 1000 element ring 500 times...
  Took 16742 usec or 33.484 nsec/op
  peabody{drbob}25: 

So the time it took for C++ to do the work was 33.484 nsec/op, and the erlang took 3.384 μsec/op -- a difference of about 100x - in favor of C++. Yeah, it's that much different. I'm shocked, but only by the margin. I expected erlang to have the code-size advantage, but not by a factor of two. And I expected C++ to beat erlang in speed, but not by a factor of 100.

Wild stuff. Very neat.

Dealing With People – Not Easy… At All

Wednesday, March 2nd, 2011

cubeLifeView.gif

I've said it before to my kids, and to people I've worked with time and again: What makes you think you get to make rules here? It's a classic pattern: someone does something that gets considerable praise; they enjoy that, and the attention it brings; they start to exercise a little more power over their world, and it succeeds. This continues for a while, unchecked, and in the end the person has to be smacked down. This leads to hard feelings, etc., but in the end, if they can accept that they were at fault, then things can move forward.

I suppose I really can't blame the people, they see what they want, they grab it, don't see any consequences, and grab again. Power corrupts, and absolute power corrupts absolutely. They start to think of themselves as untouchable and begin to act that way. I suppose the only thing that keeps a person in check in these situations is some internal regulator that says Hey, they may think you're great, but I know the difference - don't get too cocky! But many people don't have that. Maybe even most don't.

But I do. I guess that's age. Maybe it's a belief in something larger than myself that's looking over us all and just laughing at us when we think we're "all that". I don't know, but I do know that it's hard to work with these people.

As I write this I've realized that in addition to not making the rules here, I also don't enforce the rules. I'm sure someone is sitting up there laughing at my frustration at trying to get through to these folks. He's probably laughing harder at me than at them.

I need to just see people as he sees them - funny. Liza made a cross-stitch of a fantastic fortune cookie I received:

Life is a tragedy for those that feel.
And a comedy for those that think.

Think more, Bob. It's all pretty funny if you can get a little distance on it.

Integrating Vim with Gist at GitHub

Wednesday, March 2nd, 2011

MacVim.jpg

This morning I expanded my world considerably by happening across a Vim plugin for access to Gist. This is one of the services that I've been amazed at for a long while. GitHub is simply amazing, and I really should just give them money because I love what they are doing and want to support their work. But gists, in particular, are exceptionally cool.

Sure, there are a lot of places where you can throw up text and then look at it. But GitHub is so clean and focused on what they are doing, it's a joy to use. So here's how I got it working:

First, follow the instructions on this page to download the plugin to your ~/.vim/plugin/ directory. You'll need to make a few additions to your ~/.vimrc file:

  let g:gist_clip_command = 'pbcopy'
  let g:gist_detect_filetype = 1
  let g:github_user = 'yourname'
  let g:github_token = '...big long hex number...'

and the instructions for getting your token are on the plugin page. Pretty simple stuff.

One thing I didn't like was the fact that when a new gist was downloaded, it was put into a split window. I don't like that. I have MacVim, and I open up new tabs and use them. So I changed the code in the plugin just a little. It was originally:

  if winnum != -1
    if winnum != bufwinnr('%')
      exe "normal \<c-w>".winnum."w"
    endif
    setlocal modifiable
  else
    exec 'silent split gist:'.a:gistid
  endif

and I changed line 299 to:

  if winnum != -1
    if winnum != bufwinnr('%')
      exe "normal \<c-w>".winnum."w"
    endif
    setlocal modifiable
  else
    exec 'silent edit gist:'.a:gistid
  endif

and everything worked just like I wanted it to. What an amazing little plugin for Vim! I can now edit, post, update, pull - all the things I'd like to be able to do on a gist, now from within Vim. What a treat.