Archive for the ‘Coding’ Category

Google Chrome dev 12.0.725.0 is Out

Wednesday, April 6th, 2011

Google Chrome

This morning I saw that Google Chrome dev 12.0.725.0 was out and the release notes say it all:

All

  • Updated V8 - 3.2.6.0
  • Spring cleaning in the code, lots of code cleanup and refactoring under the covers

It's nice to see the clean-out of old code. I'm sure there's a lot that's sitting there - like maybe the H.264 decoder... OK, enough of that. In any case, at least they got the icon back. Baby steps.

Finally Done with Broker Services

Tuesday, April 5th, 2011

I finally finished the erlang services for The Broker. I was able to get my ticker plants hitting it and all was running. It's been a lot of work, and not a lot of "new stuff" - face it, it's a re-write - but in the end, we have something that's a lot better than the old, and it's going to be paying dividends long into the future.

Good work.

Making Progress on The Broker

Monday, April 4th, 2011

Today I made some real progress on The Broker. It's still the login service and the configuration service, but I'm getting things together. It's all skeletoned out, so I know what I need to do, and what I need to make it work - in a minimal sense.

Things are looking a lot better today.

More Work on the Login Service

Friday, April 1st, 2011

Ringmaster

It's been another long day of meetings and trying to get work done. For the most part, I've been working on the new Broker, and its login and configuration services. The trick is currently the mongo-to-erlang interface, and it's proving to be powerful, but not entirely easy to use or well-documented. But it is workable, and that's enough.

Lots of work, but not a lot of code. But I'm getting better at erlang, and that's a big plus.

Creating the Broker’s Login Service

Wednesday, March 30th, 2011

Most of today has been spent working on the latest rewrite of The Broker - but this guy is the distributed erlang version where the service registry, distribution, routing, etc. are all handled by erlang. This guy will then be a little node on each server that uses or provides services, and it'll connect into a web of other such nodes and share its state with all of them so that we can load balance around the net, and provide some nice fault-tolerance.

Today I spent a lot of time working on the login service. There's a lot there because I have to get the LDAP authentication going and then deal with the mongo database - all from erlang. There's a good start, but I'm still very new to erlang, and this is just plain slow going.

Tricky Timing Bug with Inherited Threads

Wednesday, March 30th, 2011

This afternoon a co-worker pointed out a problem with a threaded app he was testing. It was using a data feed component I'd written, and it was causing seg faults when it was shutting down. He was having a hard time figuring out why it was crashing, and I was having a hard time figuring out what was causing the state to be reset.

The thread model I was using is a simple class that runs a process() method over and over catching exceptions, etc. until the process() method returns a "stop" flag. Pretty simple stuff. There's the ability for users of the thread object to tell it to stop:

  Thread::setTimeToDie(true);

and the next time it's ready to call process() it bails out and stops. Pretty simple. But it wasn't acting that way in the tests.
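As a rough illustration of that thread model, here is a minimal sketch. It is not the original class: only setTimeToDie() is a name from the code above, while start(), join(), the bool return from process() (false meaning "stop"), and the CountingThread subclass are all assumptions made for the example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for the Thread class described in the post.
class Thread {
public:
    // Safety net only: subclasses should stop and join in their own
    // destructor, before their members are torn down.
    virtual ~Thread() { setTimeToDie(true); join(); }

    void start() { mWorker = std::thread([this] { run(); }); }
    void join() { if (mWorker.joinable()) mWorker.join(); }
    void setTimeToDie(bool die) { mTimeToDie.store(die); }
    bool isTimeToDie() const { return mTimeToDie.load(); }

protected:
    // Assumed convention: return true to be called again, false to stop.
    virtual bool process() = 0;

private:
    void run() {
        // The flag is checked before every call to process().
        while (!mTimeToDie.load()) {
            try {
                if (!process()) break;   // process() asked to stop
            } catch (...) {
                // swallow the exception and go around again
            }
        }
    }

    std::atomic<bool> mTimeToDie{false};
    std::thread mWorker;
};

// Example subclass: counts process() calls, stops itself after `limit`.
class CountingThread : public Thread {
public:
    explicit CountingThread(int limit) : mLimit(limit) {}
    ~CountingThread() override { setTimeToDie(true); join(); }
    int calls() const { return mCalls.load(); }

protected:
    bool process() override { return ++mCalls < mLimit; }

private:
    const int mLimit;
    std::atomic<int> mCalls{0};
};

// process() returns the stop flag on its `limit`-th call.
int callsUntilStop(int limit) {
    CountingThread t(limit);
    t.start();
    t.join();
    return t.calls();
}

// Told to die before starting: process() is never called.
int callsWhenToldToDieFirst() {
    CountingThread t(1000);
    t.setTimeToDie(true);
    t.start();
    t.join();
    return t.calls();
}

void demo() {
    assert(callsUntilStop(3) == 3);
    assert(callsWhenToldToDieFirst() == 0);
}
```

The flag is a std::atomic<bool> in this sketch so that one thread can set it while another is reading it.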

I subclassed this Thread for my data feed class, and when I detected that the parent thread was to stop, I stopped some processing sub-threads. The structure is pretty simple - the main thread was handling supervision, a boost ASIO thread was handling the incoming data, and the processing sub-threads were taking that raw data and converting it (two steps) to be used downstream. What was supposed to happen was that the sub-threads were to detect when the parent Thread was to die, and then they themselves would die.

What appeared to be happening was that the sub-threads weren't getting the message. Or at least not getting it in time. Very odd. If I told the Thread to stop, the sub-threads didn't stop. If I told the Thread to stop, and then did a little shut-down processing, and finally told the Thread to stop again, then things shut down.

It appeared that there's a nasty timing problem here, and I didn't want to leave it at this, but there's no more time today. For now, it's working, but very oddly.

[3/31] UPDATE: Interesting point... this morning, on my walk to the train, I saw it. When the Thread stops, it resets the "time to die" flag to false. The sub-threads weren't seeing it because they weren't checking fast enough. The main thread died, reset its "die" flag, and the sub-threads just didn't think there was any reason to stop. The fix was easy - don't reset the flag until the start of the next thread. Easy fix.
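To make the difference concrete, here is a small sketch of the two behaviors side by side. It is not the real code: the Supervisor class, the ResetPolicy enum and the helper functions are hypothetical names invented for the example. OnExit mimics the bug (the flag is cleared as the thread stops) and OnStart mimics the fix (the flag is cleared only when the thread is started again).

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical: when the "time to die" flag is cleared back to false.
enum class ResetPolicy { OnExit, OnStart };

class Supervisor {
public:
    explicit Supervisor(ResetPolicy policy) : mPolicy(policy) {}
    ~Supervisor() { setTimeToDie(true); join(); }

    void start() {
        join();  // make sure any earlier run has finished
        // The fix: clear the flag at the start of the next run.
        if (mPolicy == ResetPolicy::OnStart) mTimeToDie.store(false);
        mWorker = std::thread([this] { run(); });
    }
    void join() { if (mWorker.joinable()) mWorker.join(); }
    void setTimeToDie(bool die) { mTimeToDie.store(die); }
    bool isTimeToDie() const { return mTimeToDie.load(); }

private:
    void run() {
        // Stand-in for the real supervision loop.
        while (!mTimeToDie.load()) std::this_thread::yield();
        // The bug: clearing the flag on the way out hides the stop
        // request from anything that has not looked at it yet.
        if (mPolicy == ResetPolicy::OnExit) mTimeToDie.store(false);
    }

    const ResetPolicy mPolicy;
    std::atomic<bool> mTimeToDie{false};
    std::thread mWorker;
};

// What a slow sub-thread would see if it only checked the parent's
// flag after the parent had already stopped.
bool stopStillVisibleAfterExit(ResetPolicy policy) {
    Supervisor parent(policy);
    parent.start();
    parent.setTimeToDie(true);
    parent.join();
    return parent.isTimeToDie();
}

// With the fix, a restart still begins with a clean flag.
bool restartClearsFlag() {
    Supervisor parent(ResetPolicy::OnStart);
    parent.start();
    parent.setTimeToDie(true);
    parent.join();
    parent.start();
    const bool cleared = !parent.isTimeToDie();
    parent.setTimeToDie(true);
    parent.join();
    return cleared;
}

void demo() {
    assert(!stopStillVisibleAfterExit(ResetPolicy::OnExit));   // bug
    assert(stopStillVisibleAfterExit(ResetPolicy::OnStart));   // fix
    assert(restartClearsFlag());
}
```

With OnExit, a sub-thread that checks late finds the flag already back at false and keeps running, which matches the symptom of having to ask the Thread to stop twice.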

Finishing up on The Broker

Tuesday, March 29th, 2011

Today I finished up my part of The Broker - the service registry and the service load (for load balancing) - and they were pretty simple modules. That's a good thing because I'm new to erlang, and the simpler the better. Even so, I was introduced to the erlang crash dumps, and picked my way through a few. Not easy, but once you know what to look for, it's not too bad. Certainly no worse than gdb.

Then I started work on converting the C++ code I had to use the new Broker. It was mostly a gutting as the code I had added for the direct dial was now useless, and that simplified the code quite a bit. Then I needed to add in the ability to send in the load numbers for each service, and I was in business.

I updated my Broker tests - the client and the service, and they now work nicely as well. It's a really good day. Lots of nice progress.

On the Disadvantages of Being a Lead Developer

Monday, March 28th, 2011

There are a lot of things to like about being a lead developer - you get to have a significant hand in the design of the work you're doing. You get to pick what you want to do, and farm out the rest to others. But there's a dark, ugly side as well - people. Yes, those devices that can't be debugged no matter how long you use gdb - people. And I'm reminded of it today in a very unpleasant way.

The lead on a project is expected to help the junior guys along. Help them learn the craft as it were. They will pick up habits, and the goal is that they pick up the good habits of the lead so that they are able to be self-sufficient in the workplace. Kind of like kids. Being in the middle of that cycle at home gives me keen insight into what's happening to me here at work.

I've got a clingy 5-year-old. At home we called it "the five-foot rule". The kids would never be more than five linear feet from us no matter what we were doing, where we were doing it, no matter how big the house was. They were right there. Always. In their early years we were everything - playmate, audience, judge, everything. The same is true at work, it seems.

He's telling me about the status of his work. He's telling me he'll be done soon. When he's done. That he needs to be busy. It's like a clingy 5-year-old. I can appreciate that he's trying very hard to be productive and contribute, but he also doesn't realize - like a 5-year-old doesn't realize - that there are completely different expectations for him than for me. He's supposed to learn, listen, and if he's really doing his job, be as light a load on me as possible.

But that's not how it's working out.

I'm going to have to slog through this for a while longer. I'm not sure how much longer, but a while at least. If he doesn't get better, then he's the learning disabled kid, and will continue to be a burden on me as long as he's here. That's not going to fly. But assuming he's going to get better, I just have to give him time. It's tough sometimes.

Building the Next Broker

Monday, March 28th, 2011

I've been talking about The Broker for a while, and it's been through at least five iterations/revisions/re-writes that I can remember, but we're in our sixth now. What I've seen, as I hit it the hardest of almost anyone, is that the java version built on Netty is just too complex. The design is reasonable, but even that's complex because of the direct dial interface we put on The Broker to remove the massive load some clients can generate. It just wasn't working. So we tried a new approach.

I mentioned distributed erlang, and asked why we didn't build the original erlang Broker using that. It seemed to handle all the registration/location service issues. It also handled the problem of a centralized service in general. When one client makes a request to one service, it's only those two nodes that will see the load - not anyone else. It's a really nice idea - extremely simple, and therefore, probably the best.

So my boss started on this next version. He and I worked on it today and we're down to just a few things that need to be finished: the service registry and the service load (for load balancing). I'll do those tomorrow and we'll be feature complete.

The beauty of this design is that we can add as many native services to the Broker as erlang modules in the distribution package. Each one is very simple, and is loaded on demand from the erlang runtime. This places us in the wonderful position of being able to add in the login service, the configuration service, and any other service that we want in the native erlang. Because this new design places a Broker node on each box running a service, this means that for the most used services, like login and configuration, we won't even go off-box to get the response.

This is going to have a huge impact on the reliability of the Broker. It's going to be able to be stopped/started on each node, as needed, it'll connect into the grid and share its state, and will be able to handle all the routine services on its own. This is a major win for me. I think it's going to make it great.

First Week of Crush Down – Not Looking Very Promising

Friday, March 25th, 2011

Well... the first week of the four-week "delivery cycle" of the new Greek Engine has come and gone, and I have to say, if we get done with it in the next three weeks, I'll be surprised. Maybe not off by much, but there's so much to do, and with two new guys on the team, one very junior, it's a lot of work on my part just to keep them from imploding.

On the other hand, I have to say that things are coming together. They are taking shape in the code, and things are getting written - just not fast enough, and the hardware was always going to be an issue. There's just not enough time to get all the hardware purchased, installed, built and running. It'll be close, if they hurry, but I've already found out that I'm going to be delayed at least a week on some hardware coming from another data center. Doesn't sound like much, but that's 25% slippage.

But progress has been made. We have better integration between the two projects - my ticker plant and the greek engine, and we at least know the sections of code we need to build in order to make this thing work. That's a big help. It's just a matter of time.