Limitations on the STL std::map and Finding Keys

I've been working with the STL std::map for quite a while, but recently I use it as the core of a data structure where I wasn't finding an exact match - I was looking for a "lower bound" on the key to see where I needed to start looking for a match to the data. It's probably easier to start from the beginning. The data I'm dealing with is a std::map where the key is a uint32_t and the value is a boost::tuple of a uint32_t, another uint32_t, and a std::string. Like this:

  typedef boost::tuple<uint32_t, uint32_t, std::string> Channel;
  typedef std::map<uint32_t, Channel> ChannelMap;

Where the data looks something like this:

Key Value
0x00000000 0x00000000, 0x01ffffff, "first"
0x02000000 0x02000000, 0x02ffffff, "second"
0x03000000 0x03000000, 0x03ffffff, "third"
0x04000000 0x04000000, 0x04ffffff, "fourth"

where the 'key' is the first value of the tuple, and the two numeric values in the tuple form an arithmetic range for a uint32_t. What I need to do is to take an arbitrary uint32_t value and find the string that it fits with. If it's less then the least range there's nothing, and if it's greater than the last, it's nothing.

The problem was that I was using STL's lower_bound() and assuming that it was going to give me what I thought was the "lower bound" of the value in the keyspace. But it doesn't. What lower_bound() returns is:

Finds the first element whose key is not less than the argument

Simply put, it find the key whose value is greater than or equal to the argument. So this is almost the "upper bound" in my book. But there's more.

The upper_bound() method returns:

Finds the first element whose key greater than the argument

which is no better. What I wanted was something like: the largest key less than or equal to the argument. Put that way, I'm not really surprised that I didn't find it. So how to I make it out of the methods I have?

What I needed to do was to look at values of both of these functions and try to make some sense of it. So I made the following test code:

  #include <iostream>
  #include <string>
  #include <map>
  #include <stdint.h>
 
  int main(int argc, char *argv[]) {
    std::map<uint32_t, uint32_t>   m;
    for (uint32_t i = 10; i <= 100; i += 10) {
      m[i] = i + 1;
    }
    std::map<uint32_t, uint32_t>::iterator    it;
    // print out the entire map
    for (it = m.begin(); it != m.end(); ++it) {
      std::cout << "m[" << it->first << "] = " << it->second << std::endl;
    }
    // now check the lower_bound and upper_bound methods
    it = m.lower_bound(45);
    std::cout << "lower_bound(45): " << it->first << " == " << it->second << std::endl;
    it = m.upper_bound(45);
    std::cout << "upper_bound(45): " << it->first << " == " << it->second << std::endl;
 
    it = m.lower_bound(50);
    std::cout << "lower_bound(50): " << it->first << " == " << it->second << std::endl;
    it = m.upper_bound(50);
    std::cout << "upper_bound(50): " << it->first << " == " << it->second << std::endl;
 
    it = m.upper_bound(45);
    --it;
    std::cout << "--upper_bound(45): " << it->first << " == " << it->second << std::endl;
    it = m.upper_bound(50);
    --it;
    std::cout << "--upper_bound(50): " << it->first << " == " << it->second << std::endl;
 
    return 0;
  }

which returns:

  m[10] = 11
  m[20] = 21
  m[30] = 31
  m[40] = 41
  m[50] = 51
  m[60] = 61
  m[70] = 71
  m[80] = 81
  m[90] = 91
  m[100] = 101
  lower_bound(45): 50 == 51
  upper_bound(45): 50 == 51
  lower_bound(50): 50 == 51
  upper_bound(50): 60 == 61

From this, it seems like the two really aren't all that different - and they aren't. What's important to see, though, is that really the definition of greatest less than or equal to is something like one less than the one just greater. With that, I tried using the upper_bound() and then backing off one:

  #include <iostream>
  #include <string>
  #include <map>
  #include <stdint.h>
 
  int main(int argc, char *argv[]) {
    std::map<uint32_t, uint32_t>   m;
    for (uint32_t i = 10; i <= 100; i += 10) {
      m[i] = i + 1;
    }
    std::map<uint32_t, uint32_t>::iterator    it;
    // print out the entire map
    for (it = m.begin(); it != m.end(); ++it) {
      std::cout << "m[" << it->first << "] = " << it->second << std::endl;
    }
    // now check the lower_bound and upper_bound methods
    it = m.lower_bound(45);
    std::cout << "lower_bound(45): " << it->first << " == " << it->second << std::endl;
    it = m.upper_bound(45);
    std::cout << "upper_bound(45): " << it->first << " == " << it->second << std::endl;
 
    it = m.lower_bound(50);
    std::cout << "lower_bound(50): " << it->first << " == " << it->second << std::endl;
    it = m.upper_bound(50);
    std::cout << "upper_bound(50): " << it->first << " == " << it->second << std::endl;
 
    it = m.upper_bound(45);
    --it;
    std::cout << "--upper_bound(45): " << it->first << " == " << it->second << std::endl;
    it = m.upper_bound(50);
    --it;
    std::cout << "--upper_bound(50): " << it->first << " == " << it->second << std::endl;
 
    return 0;
  }

which returns:

  m[10] = 11
  m[20] = 21
  m[30] = 31
  m[40] = 41
  m[50] = 51
  m[60] = 61
  m[70] = 71
  m[80] = 81
  m[90] = 91
  m[100] = 101
  lower_bound(45): 50 == 51
  upper_bound(45): 50 == 51
  lower_bound(50): 50 == 51
  upper_bound(50): 60 == 61
  --upper_bound(45): 40 == 41
  --upper_bound(50): 50 == 51

If I was careful and checked for the limits, I think I'd have something. So that's exactly what I did.

My final code for finding the string in the tuple looks something like this:

  const std::string ZMQChannelMapper::getURL( const MessageMapCode aCode )
  {
    std::string     url;
 
    if (!mChannelMap.empty()) {
      ChannelMap::iterator  itr;
      if (mChannelMap.size() == 1) {
        // if there's only one... try it - we might get lucky
        itr = mChannelMap.begin();
      } else {
        // find the high-end of the enclosing range in the table
        itr = mChannelMap.upper_bound(aCode);
        // if it's not at the ends, back off one to the start
        if ((itr != mChannelMap.begin()) && (itr != mChannelMap.end())) {
          --itr;
        }
      }
      // from this starting point, check the range for a match
      if (itr != mChannelMap.end()) {
        Channel & tupleInfo = (*itr).second;
        if ((tupleInfo.get<0>() <= aCode) &&
            (aCode <= tupleInfo.get<1>())) {
          url.append(getBaseURL());
          url.append(tupleInfo.get<2>());
        }
      }
    }
 
    // all done - return what we have
    return url;
  }

It works, and it's OK, but it's clear why they didn't make something like this in STL - far too specialized. But I have it now.