It Seems Very Easy to Misuse Google Collections

I like reusable code as much as the next guy. I really do. For what I've seen of the Google Collections Java code, I like that too, though to be honest, I haven't seen all that much. But that's not to say that it's not possible - even easy, to create a mess with the Collections. In fact, it can make code very difficult to read. Take as an example, the code I ran into today:

  public Set<Trade> getTradesForReport(PositionService aPositionService) {
    // get all the trades from the service
    Set<Trade>  retval = aPositionService.getAllTrades();
    // now filter them on the trading parameter
    for (Iterator<Trade> iter = retval.iterator(); iter.hasNext(); ) {
      Trade   t = iter.next();
      if (!allowedUsers.contains(t.getUser())) {
        iter.remove();
      }
    }
    return retval;
  }

Java's definition of the initial Set<Trade> does not address the mutability of the Set. In fact, Java has no native immutability, and the way that Google achieved this is to make an Iterator that overrides the remove() method and throws an Exception.

That's dangerous. Not illegal for the language, but certainly dangerous.

If you look at NeXTSTEP/OPENSTEP/Cocoa on the Mac, the 'base class' is by definition immutable. You have to create a mutable version deliberately. This makes it clear what you are dealing with. The 'default' (base) behavior is to assume that you can't mess with it. Makes perfect sense.

But Java is the opposite. By default, all the objects, collections, etc. are mutable, and they handle immutability by removing the iterator() method, and using a different Enumerator. Look at the ConcurrentHashMap. No way to remove an item during a 'scan' of the data because the Enumerator doesn't have a remove() method. Period.

While I admire Google's work, it's not clear what they are doing. Say, in the example above, there were several position services, and some of them decided to return Google's ImmutableSet. Now I'm sunk. The compiler won't see that I can't do this, because as far as it's concerned, I can. I'd have to trap for the exception and reverse the logic - create a blank one and add those I wanted to it. But if I do that, I might as well use that logic for all cases:

  public Set<Trade> getTradesForReport(PositionService aPositionService) {
    // get all the trades from the service
    Set<Trade>  src = aPositionService.getAllTrades();
    // ...and make a place to put the good ones
    Set<Trade>  retval = new HashSet();
    // now filter them on the trading parameter
    for (Trade t : src) {
      if (allowedUsers.contains(t.getUser())) {
        retval.add(t);
      }
    }
    return retval;
  }

While this works in all cases, it doubles the references used, and that's not a good thing when you get into a tight memory application and garbage collection is already an issue.

What they should have done isn't clear. Face it, they wanted this to fit into the standard Collections in Java. But they are all mutable by default. No... the problem here lies with the coder that uses these. You need to be more explicit on the return types and explicitly say they are ImmutableSets. Then, the user/maintainer of the code can see what was intended.

Alas, such was not the case for me today. I had a production problem because some of the code returned a standard HashSet to the method signature and some returned a Google ImmutableSet. Unfortunately, this has been in production for several weeks, but it's just now getting hit. Lovely bug to catch and fix in a hurry.