Typed collection keys

Summary: Programs often need to store values of heterogeneous types in a single collection. The Typed Keys pattern defines keys that ‘know’ the type of the data they refer to, and that encapsulate the type information and the logic to convert values to and from the storage type.

Preamble

I seem to write on this theme a lot: how to leverage the type system of modern programming languages to reduce programming errors, and make application code less verbose. Well, this is another example of that, and it’s a technique I find myself using quite often in Java & C# to really simplify code, increase type-safety (i.e., reduce the opportunity for dumb errors) and make business logic more understandable.

The gist is that if you do the same thing every time you access particular keys/variables/lookup items, then you should encapsulate these actions within the key, rather than spreading them throughout the code.

It’s not an original idea, and we’ll point to some examples in a future post.

Motivating example:

Let’s say you have a map of strings to objects (in Java: Map<String, Object>), and you’re storing in it values of several different types:

‘sessionId’: Long
‘username’: String
‘loginDate’: Date object (representing a point-in-time)
‘userStaffId’: Integer (optional)

And let’s say that there could be a bunch of other such keys too—dozens or hundreds—it’s an open-ended list.

This situation occurs with session attributes, and configuration settings, and key-value databases.

If it is a small, relatively-fixed list of keys, the solution might be a wrapper class around the Map, with appropriately-typed getters and setters for each value. But the keys are an open-ended set. The code to retrieve a session-id might be:

long sessionId = (Long)map.get("sessionId");

Let’s imagine that we’re doing that a few times throughout the codebase.

And to set it:

map.put("sessionId", generateSessionId()); // generate the session id somehow

Already, there are a couple of problems with this code, so let’s address them one at a time.

The string literal

A simple (and obvious) one to start:

We’re repeating the same constant strings throughout the codebase. This is error-prone, and makes the code fragile & hard to change.

It’s error-prone—because a typo at any of the string literals will never be caught at compile time, but will cause the code to fail at runtime.

It’s fragile—because these errors are easy to introduce accidentally.

It’s hard to change—because changing any key-name requires changing it at many places throughout the code.
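To make the first of those points concrete, here is a minimal sketch of a mistyped key. The class and method names (LiteralKeyTypoDemo, lookupWithTypo) are my own, purely for illustration. The misspelled literal compiles without complaint, and the lookup simply misses:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of any real codebase.
class LiteralKeyTypoDemo {
    // Looks up a misspelled key; the compiler has no way to notice.
    static Object lookupWithTypo(Map<String, Object> map) {
        return map.get("sesionId"); // typo: should be "sessionId"
    }

    public static void main(String[] args) {
        Map<String, Object> map = new HashMap<>();
        map.put("sessionId", 42L);

        Object value = lookupWithTypo(map);
        System.out.println(value); // prints: null

        // The failure only surfaces later, when the missing value is unboxed:
        // long sessionId = (Long) value; // would throw NullPointerException
    }
}
```

Note that the failure shows up some distance from the typo itself, which is part of what makes this kind of bug tedious to track down.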

So the obvious first step is to define each of the key-names we’re using as string constants:

public class Keys {
    public static final String SESSION_ID = "sessionId";
    public static final String USERNAME = "username";
    // etc.
}

And we change our access code to make use of the string constants:

long sessionId = (Long)map.get(Keys.SESSION_ID);

That’s an improvement. It’s one that you’d expect most people to make. Fairly uncontroversial.

However, there is another bit of repetition & fragility in the code as it stands, and it’s not immediately obvious how to refactor this one:

The type of each key

The repetition is the cast-to-long (or whatever the type of the key is) at each access point. This repetition leads to the same issues as the literal strings: fragility, error-proneness & difficulty in changing the code.

It’s error-prone because a wrong cast at any access point will never be caught at compile time, but will cause the code to fail at runtime. For example, there’s no compile-time guarantee that the type we cast to when reading a value is the same type we wrote to the key.

It’s fragile—because these errors are easy to introduce accidentally.

It’s hard to change—because changing any key’s type could require changing it at many, many places throughout the code.

Additionally, the type is inherent to each key, yet the cast is written out explicitly at each access point, and so clutters the code.
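Here is a small sketch of the mismatched-cast problem, again with hypothetical names of my own (WrongCastDemo, wrongCastFailsAtRuntime). The value is written as a Long and read back as an Integer. The compiler accepts it, and the mistake only shows up when the line executes:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, for illustration only.
class WrongCastDemo {
    // Writes a Long, then reads it back with a mismatched cast.
    // Returns true if the read fails with a ClassCastException.
    static boolean wrongCastFailsAtRuntime() {
        Map<String, Object> map = new HashMap<>();
        map.put("sessionId", 42L); // stored as a Long
        try {
            Integer id = (Integer) map.get("sessionId"); // compiles fine
            return false;
        } catch (ClassCastException e) {
            return true; // the mismatch is detected only at runtime
        }
    }

    public static void main(String[] args) {
        System.out.println(wrongCastFailsAtRuntime()); // prints: true
    }
}
```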

The idea

So can we somehow encapsulate the string key name with the type of the variable…?

Perhaps something like:

class Key<T> {
    final String keyName;
    final Class<T> valueType;
    Key(String keyName, Class<T> valueType) {
        this.keyName = keyName;
        this.valueType = valueType;
    }
}

We can then construct a mechanism for getting and setting these variables:

final class KeyMapAccessor {
    // Java declares the type parameter before the return type.
    static <T> T getFrom(Key<T> key, Map<String, Object> map) {
        Object untypedValue = map.get(key.keyName);
        return key.valueType.cast(untypedValue);
    }

    static <T> void setIn(Key<T> key, Map<String, Object> map, T value) {
        map.put(key.keyName, value);
    }
}

If we then declare SESSION_ID as a constant of type Key<Long> (and statically import the accessor methods), we can get its value like this:

long sessionId = getFrom(SESSION_ID, map);

And set it like this:

setIn(SESSION_ID, map, generateSessionId());
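Putting the pieces together, a self-contained sketch might look like the following. The TypedKeyDemo class, the USERNAME key and the literal 42L (standing in for a real session-id generator) are my own assumptions for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// A key that carries both its name and the type of its value.
final class Key<T> {
    final String keyName;
    final Class<T> valueType;

    Key(String keyName, Class<T> valueType) {
        this.keyName = keyName;
        this.valueType = valueType;
    }
}

final class KeyMapAccessor {
    private KeyMapAccessor() {}

    // The only cast in the codebase lives here, driven by the key's type.
    static <T> T getFrom(Key<T> key, Map<String, Object> map) {
        return key.valueType.cast(map.get(key.keyName));
    }

    static <T> void setIn(Key<T> key, Map<String, Object> map, T value) {
        map.put(key.keyName, value);
    }
}

// Hypothetical demo class showing declaration and use of typed keys.
class TypedKeyDemo {
    static final Key<Long> SESSION_ID = new Key<>("sessionId", Long.class);
    static final Key<String> USERNAME = new Key<>("username", String.class);

    public static void main(String[] args) {
        Map<String, Object> map = new HashMap<>();
        KeyMapAccessor.setIn(SESSION_ID, map, 42L); // 42L stands in for a generated id
        KeyMapAccessor.setIn(USERNAME, map, "alice");

        long sessionId = KeyMapAccessor.getFrom(SESSION_ID, map);
        String username = KeyMapAccessor.getFrom(USERNAME, map);
        System.out.println(sessionId + " " + username); // prints: 42 alice
    }
}
```

One caveat with this sketch: reading an absent key into a primitive long would throw a NullPointerException on unboxing, so optional values are better read into the boxed type.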

We’ve accomplished two big improvements here:

  1. Removed the visual clutter of type conversions from the ‘getter’ code.
  2. Made accesses of the value consistently typesafe. (This applies to the getter and the setter.)

Just to drive that last point home, you would get an error at compile time if you wrote:

setIn(SESSION_ID, map, "string value, should be a long!");
// or
int sessionId = getFrom(SESSION_ID, map); // int ≢ long

If you ever needed to change the type of a keyed variable, for example, changing the session ID from a Long to a GUID, you’d change the declaration of SESSION_ID, and the compiler would point out all the places in the code that needed to change.

Alternatively

Another, and common, approach to achieving type-safety is to wrap the (untyped) map in a typed wrapper, and provide accessors for each of the values. For example:

public class SessionMap {
    private final Map<String, Object> underlyingMap;
    private static final String SESSION_ID = "sessionId";
    // etc

    public SessionMap(Map<String, Object> underlyingMap) { this.underlyingMap = underlyingMap; }

    public Long getSessionId() { return (Long)underlyingMap.get(SESSION_ID); }
    public void setSessionId(Long value) { underlyingMap.put(SESSION_ID, value); }
    // etc
}

That’s very viable, especially when the number of keys/variables is bounded and/or small.

However, as soon as the number of keys is open-ended—or the keys don’t belong together, and putting them all in the same class would entail mixing of concerns—this becomes a less practical solution.
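By contrast, typed keys can be declared wherever they logically belong. The sketch below assumes two unrelated features, with hypothetical names (LoginKeys, ReportKeys, PAGE_SIZE), each owning its own keys while sharing one map. The Key class and accessors are repeated inside the demo class only to keep the example self-contained:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo: keys from unrelated features share one untyped map.
class SeparateKeysDemo {
    static final class Key<T> {
        final String keyName;
        final Class<T> valueType;
        Key(String keyName, Class<T> valueType) {
            this.keyName = keyName;
            this.valueType = valueType;
        }
    }

    static <T> T getFrom(Key<T> key, Map<String, Object> map) {
        return key.valueType.cast(map.get(key.keyName));
    }

    static <T> void setIn(Key<T> key, Map<String, Object> map, T value) {
        map.put(key.keyName, value);
    }

    // Keys owned by the (assumed) login code.
    static final class LoginKeys {
        static final Key<Long> SESSION_ID = new Key<>("sessionId", Long.class);
    }

    // Keys owned by an (assumed) unrelated reporting feature.
    static final class ReportKeys {
        static final Key<Integer> PAGE_SIZE = new Key<>("reportPageSize", Integer.class);
    }

    public static void main(String[] args) {
        Map<String, Object> session = new HashMap<>();
        setIn(LoginKeys.SESSION_ID, session, 42L);
        setIn(ReportKeys.PAGE_SIZE, session, 25);

        long sessionId = getFrom(LoginKeys.SESSION_ID, session);
        int pageSize = getFrom(ReportKeys.PAGE_SIZE, session);
        System.out.println(sessionId + " " + pageSize); // prints: 42 25
    }
}
```

Neither feature needs to know about the other's keys, and no central wrapper class has to grow a getter and setter for each one.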

Summary & next steps

We’ve looked at the general model of associating type conversion logic and type information with a collection key name.

This technique can:

  • Increase code clarity, by moving type conversion code out of the business logic.
  • Increase code safety & robustness, by removing the fragile repetition of type conversion code throughout the codebase.
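The examples in this post only needed a cast, but the same shape works when the storage type differs from the value type and real conversion is needed. Below is a rough sketch, assuming a Map<String, String> as the store. The ConvertingKey class and its two conversion functions are hypothetical names of mine, not an established API:

```java
import java.util.Date;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical key that knows how to convert its value to and from a String.
final class ConvertingKey<T> {
    final String keyName;
    private final Function<T, String> toStorage;
    private final Function<String, T> fromStorage;

    ConvertingKey(String keyName,
                  Function<T, String> toStorage,
                  Function<String, T> fromStorage) {
        this.keyName = keyName;
        this.toStorage = toStorage;
        this.fromStorage = fromStorage;
    }

    // Returns null when the key is absent, otherwise the converted value.
    T getFrom(Map<String, String> map) {
        String raw = map.get(keyName);
        return raw == null ? null : fromStorage.apply(raw);
    }

    void setIn(Map<String, String> map, T value) {
        map.put(keyName, toStorage.apply(value));
    }
}

class ConvertingKeyDemo {
    static final ConvertingKey<Long> SESSION_ID = new ConvertingKey<Long>(
            "sessionId", v -> Long.toString(v), s -> Long.valueOf(s));

    // Assumption: dates are stored as epoch milliseconds in string form.
    static final ConvertingKey<Date> LOGIN_DATE = new ConvertingKey<Date>(
            "loginDate", d -> Long.toString(d.getTime()), s -> new Date(Long.parseLong(s)));

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        SESSION_ID.setIn(store, 42L);
        LOGIN_DATE.setIn(store, new Date(1000L));

        System.out.println(store.get("sessionId")); // prints: 42
        System.out.println(store.get("loginDate")); // prints: 1000
        System.out.println(SESSION_ID.getFrom(store) + 1); // prints: 43
    }
}
```

The business logic still sees only Long and Date; the string encoding is a detail of each key.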

There are a few other variations on the technique that I might look at in future blog posts.
