Networking Problems

For the past few days at work there has been an insidious problem that forces us to restart the server processes so they can clean themselves up. This happened three times yesterday - an all-time high. The problem is that I didn't design the networks, and certainly don't maintain them, so if there's a router problem, I'll never know it. I'm concerned that since this is a typical network problem, here one second and gone the next, it's likely not to get fixed until it gets horribly worse.

Then there's the home network, which is having problems this morning as well. Before I left for the train I found that I didn't seem to be getting out to the internet. So I restarted the NAT service and things seemed to be a little better, but I think there was a DNS issue, which makes simple tasks like browsing hard even though the real connectivity is there. We'll have to see when I get home, but I think it's probably cleared up by now.
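One way to separate "DNS is broken" from "the link is down" is to probe each one on its own. What follows is only a rough sketch, not what I actually ran that morning: the probe target 8.8.8.8 on TCP port 53, the hostname example.com, and the function names are all assumptions picked for illustration.

```python
import socket


def can_reach_ip(ip="8.8.8.8", port=53, timeout=3.0):
    """Open a TCP connection to a literal IP address (assumed target).

    Because the address is numeric, no DNS lookup is needed, so this
    tests raw connectivity only.
    """
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def can_resolve(hostname="example.com"):
    """Ask the system resolver for the hostname (assumed test name)."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def diagnose(reach_ip, resolve):
    """Turn the two probe results into a plain-English guess."""
    if reach_ip and resolve:
        return "connectivity and DNS both look fine"
    if reach_ip and not resolve:
        return "connectivity is up but DNS is failing"
    if not reach_ip and resolve:
        return "DNS answers (possibly cached) but the IP probe failed"
    return "no connectivity at all"
```

Calling `diagnose(can_reach_ip(), can_resolve())` on the affected machine gives a first guess. The "connectivity is up but DNS is failing" case is the one that matches browsing being painful while the link itself is fine.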

So today is one of those times when things are going poorly at work and the support load is rising a lot due to these problems. The code hasn't changed in months, the user base hasn't changed, the markets are changing but not like this... so what's causing the problem? Maybe it's not the network, but it's the only thing I can think of that's capable of causing these types of problems and that's out of our control to check and, if necessary, repair.

Having been in the network support scene for years - including leased lines, which are without doubt the worst things to support if you don't have an excellent carrier - I know that the infrastructure gets blamed for a lot of things that aren't really its fault. So I've tried to go easy on the complaints, but when a user is asking what you think the problem is, you have to give them the best knowledge you have. It's not a great situation to be in.

Then again, the guys that put this network together at work aren't exactly the sharpest in the land, either. When we moved in we had a terrible time getting the NICs on 100Mbps/half-duplex when the switches were supposed to be auto-sensing. For all the Linux workstations we had to have the switch port set to 100/full and then force the driver to match. It was a real pain to figure this out... lots of lost time on that one. So it really could be the network... I just wish it'd get cleared up.
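For what it's worth, on a Linux box the negotiated speed and duplex can be read straight out of sysfs, which makes this kind of mismatch easier to spot from the host side. This is a minimal sketch, assuming the standard `/sys/class/net/<iface>/speed` and `duplex` files are readable; the interface name `eth0` in the usage note and the helper names are placeholders, not anything from our actual setup.

```python
from pathlib import Path


def read_link_settings(iface, sysfs_root="/sys/class/net"):
    """Return (speed, duplex) as strings, or None for values that can't be read.

    The kernel may refuse the read (for example when the link is down),
    and the files don't exist at all on non-Linux systems.
    """
    base = Path(sysfs_root) / iface
    values = []
    for name in ("speed", "duplex"):
        try:
            values.append((base / name).read_text().strip())
        except OSError:
            values.append(None)
    speed, duplex = values
    return speed, duplex


def describe(speed, duplex):
    """Summarize the settings and flag half-duplex as worth a second look."""
    if speed is None or duplex is None:
        return "link settings unavailable"
    if duplex == "half":
        return f"{speed}Mb/s half-duplex: possible duplex mismatch"
    return f"{speed}Mb/s {duplex}-duplex"
```

Something like `describe(*read_link_settings("eth0"))` on each workstation shows quickly which ones came up at half-duplex. It only reports the host's side of the link, so the switch port still has to be checked separately.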