One way of defining modules is by what engineers have to know. If a program X consists of components X1 … Xn, then we can say Xi is a module if a programmer can modify it without learning “much” about any other module. A second, related measure is in terms of information hiding: a module essentially encapsulates some data and exports interfaces for manipulating that data.

This second definition seems quantifiable. Given a system state S, let Xi(S) be the state of component Xi: essentially all the data that Xi in state S will use to go to its next state. We need Xi(S) to be minimal; it must contain only the information that Xi uses to compute its next state, and nothing extra. For example, if we have a poorly designed component that is just a single variable with set and get methods, then Xi(S) is just the contents of the variable. Now if looking at Xj(S) for all the other components Xj lets us know which of them last wrote a value to Xi and what that value was, we can recreate Xi(S) from the other components. In other words, we see that Xi hides nothing.

If Xi did something as simple as guard its variable against values outside a certain range, then we could not always recreate Xi(S) from the other Xj(S) states, because those states do not tell us what the guard did with an out-of-range write. In that case, Xi is modular in the information hiding sense. We could then look at Xi and count how many bits are needed to reproduce that range information outside of Xi. That count of bits is, in a sense, a measure of “how modular” Xi can be considered to be. And then maybe it's worth doing the same exercise for the first definition of modularity.
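
The reconstruction argument can be made concrete with a toy model. The sketch below is only an illustration under assumptions that are not in the text above: every name (`PlainCell`, `GuardedCell`, `Writer`, `reconstruct`, `range_bits`) is hypothetical, the guard is assumed to clamp out-of-range writes rather than reject them, each other component is assumed to remember its own last write along with a tick, and the bit count is one crude choice (the bits needed to pick one interval out of all intervals over a finite domain).

```python
import math


class PlainCell:
    """Component Xi with bare get/set: it hides nothing."""

    def __init__(self):
        self.value = 0

    def set(self, v):
        self.value = v

    def get(self):
        return self.value


class GuardedCell(PlainCell):
    """Component Xi that clamps writes into [lo, hi] (assumed guard policy)."""

    def __init__(self, lo, hi):
        super().__init__()
        self.lo, self.hi = lo, hi

    def set(self, v):
        self.value = max(self.lo, min(self.hi, v))


class Writer:
    """Another component Xj; its state records what it last wrote and when."""

    def __init__(self):
        self.last_written = None
        self.last_tick = -1

    def write(self, cell, v, tick):
        cell.set(v)
        self.last_written, self.last_tick = v, tick


def reconstruct(writers):
    """Guess Xi(S) using only the states Xj(S) of the other components."""
    latest = max(writers, key=lambda w: w.last_tick)
    return latest.last_written


def run(cell, values):
    """Alternate writes between two writers; return (actual, reconstructed)."""
    writers = [Writer(), Writer()]
    for tick, v in enumerate(values):
        writers[tick % 2].write(cell, v, tick)
    return cell.get(), reconstruct(writers)


def range_bits(domain_size):
    """Bits needed to name one interval [lo, hi] over a domain of this size."""
    n_ranges = domain_size * (domain_size + 1) // 2
    return math.ceil(math.log2(n_ranges))


writes = [5, 300, 7, 999]
print(run(PlainCell(), writes))         # actual and reconstructed agree
print(run(GuardedCell(0, 100), writes)) # they differ: the range is hidden
print(range_bits(256))                  # bits to describe the hidden range
```

For the plain cell the reconstruction always matches, so the cell hides nothing. For the guarded cell it fails whenever the last write falls outside the range, and `range_bits` gives one rough figure for how much information an outside observer would need in order to close that gap.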

Speculation on modularity and information theory