Interesting User Validation Problem
Friday, October 12th, 2007Today another developer stopped by with a problem I've seen several times before - how to handle user validation of percentages. Specifically, let's say a user needs to put in how to subdivide something - 10% to this, 20% to that, and 70% to the other. Breakdowns like this are easy - until they get into factional percentages and numbers that are, by their very nature, impossible to represent exactly in a computer.
Take 0.12 - it seems easy enough to look at, but try to add it to 0.74 and 0.14 and you're going to find that a double is not exactly 1.0. That's because the numbers are not exactly what we typed in and there are representational errors in the doubles that make it something like 1.00000001. So how to fix this?
One way, which I've done for many cases, is to have an ε something like 10-6 and check to see if the value you want is some ε from the target value by:
if (Math.abs(sum - target) <= 1.0e-6) {
// close enough to be considered equal
} else {
// not equal
}
which can be put into a method or function that makes it easy to call. But the idea is that if you get close enough, you're equal. The problem with this is that the ε that you should use is greatly dependent on the values you're summing, in this example. Say you have values like 10, 20, and 70 - in this case, ε should be 0.1 because there's no need to have it any smaller as all the numbers are integers. But if you had numbers like 10, 20.22, and 69.78 then you may need to have ε set to 0.001. The optimal value of ε seems to be directly related to the maximum precision of the data. Also, there's always the possibility that some oddball case will be just large enough a difference to not be 'equal' and then you're flagging a false positive. But there is an interesting way that this won't happen.
Look at the problem of the user input as if you were a third-grader. Break the problem down into the whole number and fractional components. This makes everything an integer, and adding integers is very easy indeed - with no rounding problems. What I mean to say is that if you had the set of numbers: 10, 20.22, and 69.78 then break the summation down into 10 + 20 + 69 + (00 + 22 + 78)/100.0. This means that you only have to look at the string representation of the numbers - separate out the parts to the left and right of the decimal, find the maximum precision to the right of the decimal (for scaling), and then go to town.
After we talked about this approach for a little bit, the developer went back and coded it up in no time. It's really very simple. But given that users can type in arbitrary precision in the numbers, it's important to sum them as exactly as possible, and using integers for the whole and fractional parts is really pretty sweet. There's no rounding (until you get to the final double) and by then, you're comparing it to the test value, and if it's another integer (in the case of percentages), then it'll be easy.
I giggled a bit thinking about this. It's not often that the best way of doing something is the way a third-grader would do it. But that's the point - not everything needs to be so overly complex and over-designed.