Collaborative articulation of how abstraction and language are employed in the computational manifestation of numbers -- including analysis of the role of syntax, semantics, and meaning in the specification and use of software interfaces.
Technorati Tags: programming idioms, programming style, functional programming, object-oriented styles, oMiser, cybersmith
This post is intended to accomplish three things:
How Big Should a Function Be?
When I first saw the commandment to describe computational functions in around 5 lines, I thought it was a joke. With little consideration for the language or the situation, I found the injunction about keeping function descriptions small (as opposed to functional) simply bizarre. Yet it is from a very smart fellow, and I was stopped in my tracks. Although Martin concedes that this is situational, there is no guidance about that. I worry that budding cybersmiths will take this as gospel and a commandment not to be broken.
The echo chamber in the comments and the dismissal of those who objected to context-free generalizations are surprising. That supports my concern. There's too much fussiness over indentation, nesting of conditionals, and other layout considerations. This seemed to flout the principled sophistication of guidelines that I favor, such as the C++ Coding Standards of Sutter and Alexandrescu.
Breaking the Rules on Principle
Here's the test case that came to mind as soon as I read Martin's article.
The programming notation is Frugalese, an informal, unimplemented language that I use to illustrate functional-programming concepts. That's unimportant. It is important that it is neither Java nor C# and it is completely functional. Frugalese is an applicative notation for computation: the fundamental operation consists of carrying out a procedure described by one operand against the data represented by the second.
The operation ob.ap[f, x] defines the basic interpretive process of oMiser. The operand f is taken as a program and the operand x is taken as data which the program is to process. In oMiser these are both data structures called obs. The Frugalese example is an applicative expression of the oMiser interpreter's basic applicative operation. (That is, we are using an applicative-language notation to describe the implementation of an applicative-machine interpreter. There's no harm in that and there is no chicken-and-egg problem: the oMiser implementation used by oFrugal is not implemented in oFrugal or any higher-level Frugalese language. But ob.ap, below, is a faithful simulation, capturing the essential behavior and functionality.)
I choose this example because I am in the process of rethinking its formulation and how to present it in a direct way. It struck me that this definition illustrates the difficulties and trade-offs of arbitrarily choosing smaller function (procedure) definitions over larger ones.
In this example, there is only the one function, ob.ap, that is defined for use from elsewhere. There are 23 lines, not counting blank lines introduced for layout purposes:
The Principle Function
There are at least two functions defined above. The main one is ob.ap[f, x]. This definition is self-referential (signified by defrec), where the defined function is referenced within its definition at line 41. The operation of evaluating this function can lead to multiple recursive evaluations using the same function. The essence of the function is this simple structure:
But Wait, There's More
The function for the composed-form evaluation, eval(f), is defined in the following lines. Here's the key definition:
This function is being defined local to the fragment
That is, the function is only usable (and only defined for use) in conjunction with the expression eval f.
The where introduces a local definition and it introduces it right there. This establishes that this is the only place where that definition is required. This makes the dependency exactly clear in both directions: ob.ap[f, x] only requires eval in that one place and eval is only used in one way (other than the ways that eval is recursive).
Equally important is the fact that eval depends on the parameters of ob.ap[f, x]. This is at lines 23 and 25. This is another isolated dependency. It is global to eval yet local to the instance of ob.ap[f, x] in which line 17 is reached. In effect, the definition of eval(p) occurs dynamically in each instance of ob.ap[f, x] evaluation. That instance-specific definition is reused for all recursions of the defined eval.
This kind of "inner function" has been around since the introduction of the LISP and ALGOL 60 programming languages.
This use of the inner function, a recursive one at that, is very economical. It may not be the clearest, but it emphasizes locality and cohesion that are important for this particular algorithm.
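Since Frugalese is unimplemented, a rough Python analogue may help readers who want something runnable. It is only a sketch of the inner-function idea and not the oMiser interpreter: the names ap and ev, the markers "SELF", "ARG", and "AP", and the use of tuples as data are all assumptions made up for illustration.

```python
def ap(f, x):
    """Toy stand-in for an applicative step: f is the program, x is its data."""

    def ev(p):
        # ev is created afresh on every call to ap, and it reads f and x
        # from that call without taking them as parameters.
        if p == "SELF":
            return f
        if p == "ARG":
            return x
        if isinstance(p, tuple) and len(p) == 3 and p[0] == "AP":
            # Nested application: the new ap call gets its own ev, f, and x.
            return ap(ev(p[1]), ev(p[2]))
        if isinstance(p, tuple) and len(p) == 2:
            # A pair is evaluated component-wise by the same local ev.
            return (ev(p[0]), ev(p[1]))
        return p  # any other value stands for itself

    return ev(f)


print(ap(("ARG", "SELF"), 5))  # prints (5, ('ARG', 'SELF'))
```

The point of the sketch is the dependency structure: ev cannot be called from anywhere but ap, and each call of ap has its own ev bound to that call's f and x.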
Breaking Up Is Maybe Not So Hard To Do?
It is possible to break out the subfunctions used in defining ob.ap[f, x]. These considerations must be weighed:
To see what's involved, pull the definition of eval p outside of the definition of ob.ap[f, x]:
We have succeeded in pulling out ev[f, x] p in place of the internal eval p definition.
The additional parameters, [f, x], are required because we are no longer in any context where those variables are visible globally. In effect, it is necessary to replace eval everywhere in the definition of eval p with ev[f, x].
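The cost of pulling the evaluator out can be seen in a small Python sketch. As before, this is a made-up toy (the names ap and ev and the "SELF" and "ARG" markers are assumptions, not oMiser's representation), pared down to pairs only. Once ev is no longer nested, f and x have to travel with every call, including the recursive ones.

```python
def ev(f, x, p):
    # Lifted out of ap: f and x are no longer visible from an enclosing
    # scope, so they are passed explicitly on every recursive call.
    if p == "SELF":
        return f
    if p == "ARG":
        return x
    if isinstance(p, tuple) and len(p) == 2:
        return (ev(f, x, p[0]), ev(f, x, p[1]))
    return p  # any other value stands for itself


def ap(f, x):
    # ap is now a one-liner, but ev is exposed to every other caller.
    return ev(f, x, f)


print(ap(("ARG", ("SELF", 0)), 5))  # prints (5, (('ARG', ('SELF', 0)), 0))
```

The repeated f, x in each recursive call is exactly the redundancy that carries no variability: the values never change within one application.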
I don't want to write a redundant ev[f, x] for every eval in the original version. There is not meant to be any variability here, so I choose another approach. This involves defining a third function. There are many other ways to introduce that third function. My choice takes advantage of the fact that a let ... in form is equivalent to introduction of a function definition and application:
This refactoring works easily because ev[f, x] is a legitimate intermediate result that can be defined and used as an individual operand.
The use of evform(ev[f, x], ob.a p, ob.b p) in the definition of ev[f, x] p is very much like saying
This second form preserves locality at the cost of additional indentation and coupling directly into the superior function. At the same time, it avoids surfacing a function that is only needed in one place.
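The equivalence between a let ... in form and a function application can be sketched in Python with the same made-up toy. Here ev(f, x) returns a one-argument evaluator, standing in for ev[f, x] as an intermediate result, and evform receives that evaluator as an ordinary operand. What evform does in the actual Frugalese definition is not shown here; the pair-wise behavior below is purely an assumption for illustration.

```python
def evform(e, a, b):
    # e is the already-specialised evaluator; evform never sees f or x.
    # Calling evform(ev(f, x), a, b) plays the role of
    #     let e = ev[f, x] in (e a, e b)
    return (e(a), e(b))


def ev(f, x):
    # Returns the evaluator for this particular f and x.
    def ev_fx(p):
        if p == "SELF":
            return f
        if p == "ARG":
            return x
        if isinstance(p, tuple) and len(p) == 2:
            return evform(ev(f, x), p[0], p[1])
        return p  # any other value stands for itself

    return ev_fx


def ap(f, x):
    return ev(f, x)(f)


print(ap(("ARG", ("SELF", 0)), 5))  # prints (5, (('ARG', ('SELF', 0)), 0))
```

The specialised evaluator is built once per pair and named once inside evform, so the body does not repeat ev(f, x) for each component.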
Confessions and Compromise
It is intriguing that the form with definitions of three functions satisfies Martin's requirements.
Confession: When I explain the operation of ob.ap[f, x], I do so using auxiliary function definitions. I use more of them than the three factored out here. One important reason for additional functions is the prospect that some of them are extended as more features are added to the system.
Compromise: When I want to control the coupling among functions, it is important to have an explicit way to reflect that. Locality of auxiliary function definitions allows that. My compromise version would preserve the three functions for explanatory purposes in this locality-controlling form:
The use of where at the "top level" is a bit like saying private in the method declarations for a class. But we do not need to introduce classes, saving us from having to consider the interactions between access specifiers and inheritance.
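Python has no top-level where, but a loosely similar effect can be sketched without classes by defining the auxiliary functions inside a scope that returns only the public one. This is an analogy, not a rendering of the Frugalese feature; the helper name _define_ap and the toy evaluator are assumptions for illustration.

```python
def _define_ap():
    # Inside this scope the three definitions are flat siblings,
    # but only the function returned at the end is visible outside.
    def evform(e, a, b):
        return (e(a), e(b))

    def ev(f, x):
        def ev_fx(p):
            if p == "SELF":
                return f
            if p == "ARG":
                return x
            if isinstance(p, tuple) and len(p) == 2:
                return evform(ev(f, x), p[0], p[1])
            return p  # any other value stands for itself

        return ev_fx

    def ap(f, x):
        return ev(f, x)(f)

    return ap


ap = _define_ap()  # ev and evform are unreachable by name from here

print(ap(("SELF", "ARG"), 9))  # prints (('SELF', 'ARG'), 9)
```

The layout stays flat and readable, while the coupling is still explicit: nothing outside _define_ap can come to depend on ev or evform.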
More confession: I just made up this additional use of where as an "access" specifier, similar to the use of rec to impose self-reference in definitions. I invented it on the spot. (The internal form of the where clause has been known at least since the work of Burge and Landin [4, 5] and was used by Strachey.) I introduced this specifier flavor because I am jealous of the improved readability of the interdependent multi-function definition form.
I will pay a penalty in the complication of the Frugalese grammar and in explanation of how this use of where impacts the definitions in a definition list where it occurs.
This device is not going to ensure that Martin's injunction can or should be satisfied in all cases. I'm pleased that I can take advantage of that style without surrendering localization of interdependencies among function definitions. I'm going to use this Frugalese feature much more before I am satisfied that it carries the day well enough.
In testing Martin's guidelines against a specific example of my own, I found a case where following the rules slavishly would surrender other qualities of functional definitions. Since I have the privilege of defining the Frugalese notation, I am able to adopt a compromise approach that keeps auxiliary functions auxiliary while flattening the layout and the presentation. It remains to be determined whether this addition is a significant improvement overall.