The Miser Project  
privacy 
 
 
 

Collaborative articulation of how abstraction and language is employed in the computational manifestation of numbers -- including analysis of the role of syntax, semantics, and meaning in the specification and use of software interfaces.



Click for Blog Feed
Blog Feed

Recent Items
 
More Annotated Turing
 
Different (Universal) Computation Models
 
2007-11-14: SeaFunc Meet-Up
 
Miser Project: Repaving Kick-Off
 
Hark! Is That an Idiom That I See Before Me?
 
Whither Peano?
 
Miser Hacks II: A Hole to Bind Them
 
Petzold Annotates Turing!
 
Hazard Warning: Much Site Breakage I’ve just notic...
 
Tweaking Technorati

This page is powered by Blogger. Isn't yours?
  


visits to Miser Project pages

The nfoCentrale Blog Conclave
 
Millennia Antica: The Kiln Sitter's Diary
 
nfoWorks: Pursuing Harmony
 
Numbering Peano
 
Orcmid's Lair
 
Orcmid's Live Hideout
 
Prof. von Clueless in the Blunder Dome
 
Spanner Wingnut's Muddleware Lab (experimental)

nfoCentrale Associated Sites
 
DMA: The Document Management Alliance
 
DMware: Document Management Interoperability Exchange
 
Millennia Antica Pottery
 
The Miser Project
 
nfoCentrale: the Anchor Site
 
nfoWare: Information Processing Technology
 
nfoWorks: Tools for Document Interoperability
 
NuovoDoc: Design for Document System Interoperability
 
ODMA Interoperability Exchange
 
Orcmid's Lair
 
TROST: Open-System Trustworthiness

2008-01-07

 

Cybersmith: How Many Functions Am I Holding Up?

This post is intended to accomplish three things:

  1. Find a decent way to have Windows Live Writer produce pretty-printed code that appears cleanly on both blog pages and in my RSS feeds.
      
  2. Refine my definition of a function that has a critical role in the Miser Project.
      
  3. Object to a strange set of commandments about the size and other characteristics of "functions," exploring how to make them work in functional programming while recognizing the compromises involved.

How Big Should a Function Be?  When I first saw the commandment to describe computational functions in around 5 lines, I thought it was a joke.  With little consideration for the language or the situation, I found the injunction about keeping function descriptions small (as opposed to functional) simply bizarre.  Yet it is from a very smart fellow [1], and I was stopped in my tracks.  Although Martin concedes that this is situational, there is no guidance about that.  I worry that budding cybersmiths will take this as gospel and a commandment not to be broken.

The echo chamber in the comments, and the dismissal of those who objected to context-free generalizations is surprising.  That supports my concern.   There's too much fussiness over indentation, nesting of conditionals and other layout considerations.   This seemed to flaunt the principled sophistication of guidelines that I favor, such as the C++ Coding Standards of Sutter and Alexandrescu [2].

Breaking the Rules on Principle

Here's the test case that came to mind as soon as I read Martin's article.

The programming notation is Frugalese, an informal, unimplemented language that I use to illustrate functional-programming concepts.  That's unimportant.  It is important that it is neither Java nor C# and it is completely functional.  Frugalese is an applicative notation for computation: the fundamental operation consists of carrying out a procedure described by one operand against the data represented by the second.  

The operation ob.ap[f, x] defines the basic interpretive process of oMiser[3].  The operand f is taken as a program and the operand x is taken as data which the program is to process.  In oMiser these are both data structures called obs.  The Frugalese example is an applicative expression of the oMiser interpreter's basic applicative operation.  (That is, we are using an applicative-language notation to describe the implementation of an applicative-machine interpreter.  There's no harm in that and there is no chicken-and-egg problem: the oMiser implementation used by oFrugal is not implemented in oFrugal or any higher-level Frugalese language.  But ob.ap, below, is a faithful simulation, capturing the essential behavior and functionality.)

I choose this example because I am in the process of rethinking its formulation and how to present it in a direct way.  It struck me that this definition illustrates the difficulties and trade-offs of arbitrarily choosing smaller function (procedure) definitions over larger ones.

In this example, there is only the one function, ob.ap, that is defined for use from elsewhere.   There are 23 lines, not counting blank lines introduced for layout purposes:


1: defrec /* SELF-CONTAINED SINGLE FUNCTION DEFINITION */
2:
3: ob.ap[f, x] /* Apply the procedure defined by f to operand x */
4:
5: = if is-individual(f) /* Atomic (single-instruction) operation */
6:
7: then if f = ob_A then ob.a x /* a-part of x */
8:
9: elif f = ob_B then ob.b x /* b-part of x */
10:
11: elif f = ob_E then ob.e x /* enquote x */
12:
13: else ob.c(f, x) /* prefix f to x, etc. */
14:
15: elif is-singleton(f) then ob.a f /* quoted constant */
16:
17: else eval f /* Evaluate applicative form f */
18:
19: where rec eval p
20:
21: = if is-singleton(p) /* argument reference */
22:
23: then if p = ob_ARG then x
24:
25: elif p = ob_SELF then f
26:
27: else ob.a p /* or (quoted) literal */
28:
29: else let op = ob.a p,
30:
31: arg = ob.b p
32:
33: in if op = ob_C /* pairing form */
34: then ob.c( eval ob.a arg,
35: eval ob.b arg )
36:
37: elif op = ob_D /* comparison form */
38: then ob.d( eval ob.a arg,
39: eval ob.b arg )
40: /* applicative form */
41: else ob.ap[ eval op, eval arg ];


The Principle Function

There are at least two functions defined above.  The main one is ob.ap[f, x].  This definition is self-referential (signified by defrec), where the defined function is referenced within its definition at line 41.     The operation of evaluating this function can lead to multiple recursive evaluations using the same function. The essence of the function is this simple structure:


1: defrec
2:
3: ob.ap[f, x] /* Apply the procedure defined by f to operand x */
4:
5: = if is-individual(f)
6:
7: then /* do the simple cases for f being simple */


17: else eval f /* handle the case of f being a composed form */


But Wait, There's More

The function for the composed-form evaluation, eval(f), is defined in the following lines.  Here's the key definition:


1: defrec
2:
3: ob.ap[f, x] /* Apply the procedure defined by f to operand x */

...








19: where rec eval p /* Evaluate applicative form p */
20:
21: = if is-singleton(p) /* argument reference */
22:
23: then if p = ob_ARG then x
24:
25: elif p = ob_SELF then f
26:
27: else ob.a p /* or (quoted) literal */
28:
29: else /* cases for composite p */

41: ... ob.ap[ eval op, eval arg ];


This function is being defined local to the fragment





17: else eval f /* handle the case of f being a composed form */


That is, the function is only usable (and only defined for use) in conjunction with the expression eval f.  

The where introduces a local definition and it introduces it right there.  This establishes that this is the only place that definition is required.  This makes the dependency exactly clear in both directions: ob.ap(f, x) only requires eval in that one place and eval is only used in one way (other than the ways that eval is recursive). 

Equally important is the fact that eval depends on the parameters of ob.ap[f, x].  This is at lines 23 and 25.  This is another isolated dependency.  It is global to eval yet local to the instance of ob.ap[f, x] in which line 17 is reached.   In effect, the definition of eval(p) occurs dynamically in each instance of  ob.ap[f, x] evaluation.   That instance-specific definition is reused for all recursions of the defined eval

These kinds of "inner function" have been around since the introductions of the LISP and ALGOL 60 programming languages. 

This use of the inner function, a recursive one at that, is very economical.  It may not the clearest, but it emphasizes locality and cohesion that are important for this particular algorithm.

Breaking Up Is Maybe Not So Hard To Do?

It is possible to break out the subfunctions used in defining ob.ap[f, x].  These considerations must be weighed:

  • The contracted interface and definition is that of ob.ap[f, x].  Everything else is incidental and the definition can be replaced by anything that preserves the required behavior.  In one sense, this makes refactoring of the interior of the definition permissible.  But,
      
  • The exposure of additional interfaces creates a separation of concerns that is not required by the definition of ob.ap[f, x].  When a subfunction is no longer local, there is the problem of the maintenance of the dependency with ob.ap[f, x].
       
  • The extraction of a subfunction requires expansion of variability beyond that required by locally-incorporated subfunctions.  This makes the extracted subfunction more complex than the embedded one.

To see what's involved, pull the definition of eval p outside of the definition of ob.ap[f, x]:


1: defrec
2:
3: ob.ap[f, x] /* Apply the procedure defined by f to operand x */
4:
5: = if is-individual(f) /* Atomic (single-instruction) operation */
6:
7: then if f = ob_A then ob.a x /* a-part of x */
8:
9: elif f = ob_B then ob.b x /* b-part of x */
10:
11: elif f = ob_E then ob.e x /* enquote x */
12:
13: else ob.c(f, x) /* prefix f to x, etc. */
14:
15: elif is-singleton(f) then ob.a f /* quoted constant */
16:
17: else ev[f,x] f, /* Evaluate applicative form f */
18:
19: ev[f,x] p
20:
21: = if is-singleton(p) /* argument reference */
22:
23: then if p = ob_ARG then x
24:
25: elif p = ob_SELF then f
26:
27: else ob.a p /* or (quoted) literal */
28:
29: else ... /* UH OH */
30:


We have succeeded in pulling out ev[f, x] p in place of the internal eval p definition. 

The additional parameters, [f, x], are required because we are no longer in any context where those variables are visible globally.  In effect, it is necessary to replace eval everywhere in the definition of eval p with ev[f, x]

I don't want to write a redundant ev[f, x] for every eval in the original version.  There is not meant to be any variability here, so I choose another approach.  This involves defining a third function.  There are many other ways to introduce that third function.  My choice takes advantage of the fact that a let ... in form is equivalent to introduction of a function definition and application:


1: defrec /* INTER-DEPENDENT MULTIPLE-FUNCTION DEFINITION */
2:
3: ob.ap[f, x] /* Apply the procedure defined by f to operand x */
4:
5: = if is-individual(f) /* Atomic (single-instruction) operation */
6:
7: then if f = ob_A then ob.a x /* a-part of x */
8:
9: elif f = ob_B then ob.b x /* b-part of x */
10:
11: elif f = ob_E then ob.e x /* enquote x */
12:
13: else ob.c(f, x) /* prefix f to x, etc. */
14:
15: elif is-singleton(f) then ob.a f /* quoted constant */
16:
17: else ev[f,x] f, /* Evaluate applicative form f */
18:
19: ev[f,x] p
20:
21: = if is-singleton(p) /* argument reference */
22:
23: then if p = ob_ARG then x
24:
25: elif p = ob_SELF then f
26:
27: else ob.a p /* or (quoted) literal */
28:
29: else evform(ev[f,x], ob.a p, ob.b p),
30:
31: evform(eval, op, arg)
32:
33: = if op = ob_C /* pairing form */
34: then ob.c( eval ob.a arg,
35: eval ob.b arg )
36:
37: elif op = ob_D /* comparison form */
38: then ob.d( eval ob.a arg,
39: eval ob.b arg )
40: /* applicative form */
41: else ob.ap[ eval op, eval arg ];


This refactoring works easily because  ev[f, x] is a legitimate intermediate result that can be defined and used as an individual operand. 

The use of  evform(ev[f, x], ob.a p, ob.b p) in the definition of ev[f, x] p is very much like saying


29: else let eval = ev[f,x],
30: op = ob.a p,
31: arg = ob.b p
32:
33: in if op = ob_C /* pairing form */
34: then ob.c( eval ob.a arg,
35: eval ob.b arg )
36:
37: elif op = ob_D /* comparison form */
38: then ob.d( eval ob.a arg,
39: eval ob.b arg )
40: /* applicative form */
41: else ob.ap[ eval op, eval arg ];


This second form preserves locality at the cost of additional indentation and coupling directly into the superior function.  At the same time, it avoids surfacing a function that is only needed in one place.

Confessions and Compromise

It is intriguing that the form with definitions of three functions satisfies Martin's requirements[1]. 

Confession: When I explain the operation of ob.ap[f, x], I do so using auxiliary function definitions[3].  I use more of them than the three factored out here.  One important reason for additional functions is the prospect that some of them are extended as more features are added to the system.

Compromise: When I want to control the coupling among functions, it is important to have an explicit way to reflect that.  Locality of auxiliary function definitions allows that.  My compromise version would preserve the three functions for explanatory purposes in this locality-controlling form:


1: defrec /* SELF-CONTAINED WITH AUXILARY FUNCTION DEFINITIONS */
2:
3: ob.ap[f, x] /* Apply the procedure defined by f to operand x */
4:
5: = if is-individual(f) /* Atomic (single-instruction) operation */
6:
7: then if f = ob_A then ob.a x /* a-part of x */
8:
9: elif f = ob_B then ob.b x /* b-part of x */
10:
11: elif f = ob_E then ob.e x /* enquote x */
12:
13: else ob.c(f, x) /* prefix f to x, etc. */
14:
15: elif is-singleton(f) then ob.a f /* quoted constant */
16:
17: else ev[f,x] f, /* Evaluate applicative form f */
18:
19: where ev[f,x] p
20:
21: = if is-singleton(p) /* argument reference */
22:
23: then if p = ob_ARG then x
24:
25: elif p = ob_SELF then f
26:
27: else ob.a p /* or (quoted) literal */
28:
29: else evform(ev[f,x], ob.a p, ob.b p),
30:
31: where evform(eval, op, arg)
32:
33: = if op = ob_C /* pairing form */
34: then ob.c( eval ob.a arg,
35: eval ob.b arg )
36:
37: elif op = ob_D /* comparison form */
38: then ob.d( eval ob.a arg,
39: eval ob.b arg )
40: /* applicative form */
41: else ob.ap[ eval op, eval arg ];

The use of where at the "top level" is a bit like saying private in the method declarations for a class.  But we do not need to introduce classes, saving us from having to consider the interactions between access specifiers and inheritance.

More confession: I just made up this additional use of where as an "access" specifier similar to use of rec to impose self-reference in definitions.   I invented it on the spot.  (The internal form of where clause has been known at least since the work of Burge and Landin[4, 5] and was used by Strachey)  I introduced this specifier flavor because I am jealous of the improved readability of the interdependent multi-function definition form.

I will pay a penalty in the complication of the Frugalese grammar and in explanation of how this use of where impacts the definitions in a definition list where it occurs.

This device is not going to assure that Martin's injunction can or should be satisfied in all cases[1].     I'm pleased that I can take advantage of that style without surrendering localization of interdependencies among function definitions.   I'm going to use this Frugalese feature much more before I am satisfied that it carries the day well enough.

In testing Martin's guidelines against a specific example of my own, I found a way where following the rules slavishly would surrender other qualities of functional definitions.  Since I have the privilege of defining the Frugalese notation, I am able to adopt a compromise approach that keeps auxiliary functions auxiliary while flattening the layout and the presentation.  It remains to be determined whether this addition is a significant improvement overall.


[1] Robert C. Martin: How Big Should a Function Be?  Uncle Bob's Blatherings (web log), Object Mentor, 2007-10-29 (via Hans Martin Kern).
"Functions should be small. Very small. A well written function might be 4, 5, perhaps 8 lines long; but no longer.  ... Specifically, functions that process information as part of an algorithm should be very small."  The ob.ap[f, x] definition is definitely that of an algorithm that processes information, so it is not at all clear to me how that fact correlates with having to be small, to have a brief description.  Even though I manage to break out the subfunctions, I could not be satisfied without also enforcing their auxiliary nature.  
   "High level functions should be followed by lower level functions that should be followed by still lower level functions. The calls should all point downwards in the source file if possible."  This is certainly satisfied in the definition of ob.ap[f, x] in all of the variations shown.
   "Of course these rules cannot be followed 100% of the time, but they are good rules nonetheless. Our goal, as programmers, is to write programs that humans can understand and maintain." is a reasonable statement, but for its sequel that seems to suggest there is an obvious boundary between "good" and "bad":  "Remember, a product that works, but that has a bad internal structure is a bad product."   I suppose my view can be summarized as a willingness to support such practices along with a reluctance to enforce them.
  
[2] Herb Sutter and Andrei Alexandrescu: C++ Coding Standards -- 101 Rules, Guidelines, and Best Practices.  Addison-Wesley (Boston: 2005), ISBN 0-321-11358-6 pbk.
The principle that struck me in this context was "0.  Don't sweat the small stuff."  Some others that apply to this example are "5. Give one entity one cohesive responsibility," "10.  Minimize global and shared data," "18. Declare variables as locally as possible," and "20. Avoid long functions.  Avoid deep nesting" (with statement of the principle involved).  You can decide how well "6. Correctness, simplicity, and clarity come first."
   
[3] Dennis E. Hamilton: oMiser Sketch [latest], Miser Theory Notes page N020600f 0.05 2007-05-03.
I am in the process of shrinking and correcting much of this material, my motivation for finding a succinct statement of ob.ap(f, x).
    
[4] William H. Burge: Recursive Programming Techniques. Addison-Wesley (Reading, MA: 1975). ISBN 0-201-14450-8.
I rely on this approach to functional programming for many idioms and test cases for Frugalese.  The structure of Frugalese is different from ISWIM in several respects, although it is generally easy to translate between the notations.
     
[5] Peter J. Landin: The Next 700 Programming Languages. Comm. ACM 9, 3 (March 1966), pp. 157-164.
The refactoring performed between my variations of ob.ap[f, x] definition is based on a few of the rewriting cases described by Peter Landin in this paper.  Although I also use indentation heavily when writing Frugalese, that is not essential to the language (so far). 

 
Construction Structure (Hard Hat Area) You are navigating The Miser Project.

template created 2004-05-31-22:34 -0700 (pdt) by orcmid
$$Author: Orcmid $
$$Date: 10-04-30 21:00 $
$$Revision: 22 $