I've been working with the STL std::map for quite a while, but recently I use it as the core of a data structure where I wasn't finding an exact match - I was looking for a "lower bound" on the key to see where I needed to start looking for a match to the data. It's probably easier to start from the beginning. The data I'm dealing with is a std::map where the key is a uint32_t and the value is a boost::tuple of a uint32_t, another uint32_t, and a std::string. Like this:
typedef boost::tuple<uint32_t, uint32_t, std::string> Channel;
typedef std::map<uint32_t, Channel> ChannelMap;
Where the data looks something like this:
0x00000000 |
0x00000000, 0x01ffffff, "first" |
0x02000000 |
0x02000000, 0x02ffffff, "second" |
0x03000000 |
0x03000000, 0x03ffffff, "third" |
0x04000000 |
0x04000000, 0x04ffffff, "fourth" |
where the 'key' is the first value of the tuple, and the two numeric values in the tuple form an arithmetic range for a uint32_t. What I need to do is to take an arbitrary uint32_t value and find the string that it fits with. If it's less then the least range there's nothing, and if it's greater than the last, it's nothing.
The problem was that I was using STL's lower_bound() and assuming that it was going to give me what I thought was the "lower bound" of the value in the keyspace. But it doesn't. What lower_bound() returns is:
Finds the first element whose key is not less than the argument
Simply put, it find the key whose value is greater than or equal to the argument. So this is almost the "upper bound" in my book. But there's more.
The upper_bound() method returns:
Finds the first element whose key greater than the argument
which is no better. What I wanted was something like: the largest key less than or equal to the argument. Put that way, I'm not really surprised that I didn't find it. So how to I make it out of the methods I have?
What I needed to do was to look at values of both of these functions and try to make some sense of it. So I made the following test code:
#include <iostream>
#include <string>
#include <map>
#include <stdint.h>
int main(int argc, char *argv[]) {
std::map<uint32_t, uint32_t> m;
for (uint32_t i = 10; i <= 100; i += 10) {
m[i] = i + 1;
}
std::map<uint32_t, uint32_t>::iterator it;
// print out the entire map
for (it = m.begin(); it != m.end(); ++it) {
std::cout << "m[" << it->first << "] = " << it->second << std::endl;
}
// now check the lower_bound and upper_bound methods
it = m.lower_bound(45);
std::cout << "lower_bound(45): " << it->first << " == " << it->second << std::endl;
it = m.upper_bound(45);
std::cout << "upper_bound(45): " << it->first << " == " << it->second << std::endl;
it = m.lower_bound(50);
std::cout << "lower_bound(50): " << it->first << " == " << it->second << std::endl;
it = m.upper_bound(50);
std::cout << "upper_bound(50): " << it->first << " == " << it->second << std::endl;
it = m.upper_bound(45);
--it;
std::cout << "--upper_bound(45): " << it->first << " == " << it->second << std::endl;
it = m.upper_bound(50);
--it;
std::cout << "--upper_bound(50): " << it->first << " == " << it->second << std::endl;
return 0;
}
which returns:
m[10] = 11
m[20] = 21
m[30] = 31
m[40] = 41
m[50] = 51
m[60] = 61
m[70] = 71
m[80] = 81
m[90] = 91
m[100] = 101
lower_bound(45): 50 == 51
upper_bound(45): 50 == 51
lower_bound(50): 50 == 51
upper_bound(50): 60 == 61
From this, it seems like the two really aren't all that different - and they aren't. What's important to see, though, is that really the definition of greatest less than or equal to is something like one less than the one just greater. With that, I tried using the upper_bound() and then backing off one:
#include <iostream>
#include <string>
#include <map>
#include <stdint.h>
int main(int argc, char *argv[]) {
std::map<uint32_t, uint32_t> m;
for (uint32_t i = 10; i <= 100; i += 10) {
m[i] = i + 1;
}
std::map<uint32_t, uint32_t>::iterator it;
// print out the entire map
for (it = m.begin(); it != m.end(); ++it) {
std::cout << "m[" << it->first << "] = " << it->second << std::endl;
}
// now check the lower_bound and upper_bound methods
it = m.lower_bound(45);
std::cout << "lower_bound(45): " << it->first << " == " << it->second << std::endl;
it = m.upper_bound(45);
std::cout << "upper_bound(45): " << it->first << " == " << it->second << std::endl;
it = m.lower_bound(50);
std::cout << "lower_bound(50): " << it->first << " == " << it->second << std::endl;
it = m.upper_bound(50);
std::cout << "upper_bound(50): " << it->first << " == " << it->second << std::endl;
it = m.upper_bound(45);
--it;
std::cout << "--upper_bound(45): " << it->first << " == " << it->second << std::endl;
it = m.upper_bound(50);
--it;
std::cout << "--upper_bound(50): " << it->first << " == " << it->second << std::endl;
return 0;
}
which returns:
m[10] = 11
m[20] = 21
m[30] = 31
m[40] = 41
m[50] = 51
m[60] = 61
m[70] = 71
m[80] = 81
m[90] = 91
m[100] = 101
lower_bound(45): 50 == 51
upper_bound(45): 50 == 51
lower_bound(50): 50 == 51
upper_bound(50): 60 == 61
--upper_bound(45): 40 == 41
--upper_bound(50): 50 == 51
If I was careful and checked for the limits, I think I'd have something. So that's exactly what I did.
My final code for finding the string in the tuple looks something like this:
const std::string ZMQChannelMapper::getURL( const MessageMapCode aCode )
{
std::string url;
if (!mChannelMap.empty()) {
ChannelMap::iterator itr;
if (mChannelMap.size() == 1) {
// if there's only one... try it - we might get lucky
itr = mChannelMap.begin();
} else {
// find the high-end of the enclosing range in the table
itr = mChannelMap.upper_bound(aCode);
// if it's not at the ends, back off one to the start
if ((itr != mChannelMap.begin()) && (itr != mChannelMap.end())) {
--itr;
}
}
// from this starting point, check the range for a match
if (itr != mChannelMap.end()) {
Channel & tupleInfo = (*itr).second;
if ((tupleInfo.get<0>() <= aCode) &&
(aCode <= tupleInfo.get<1>())) {
url.append(getBaseURL());
url.append(tupleInfo.get<2>());
}
}
}
// all done - return what we have
return url;
}
It works, and it's OK, but it's clear why they didn't make something like this in STL - far too specialized. But I have it now.