First learn computer science and all the theory.
Next develop a programming style.
Then forget all that and just hack.
(George Carrette)
Coding Style used in Smalltalk/X Classes
This document describes the coding style and conventions
used in Smalltalk/X's class library.
The author is aware that coding style is a very personal
matter and should not be enforced by dictators.
However, it is useful to follow some rules, to give other programmers
an easier entry into the system. Also, there are tools which extract
useful information and can format neat documents if you follow those rules
in your classes. Thus, when it is time to deliver documentation
for your project, a whole bunch of work is already done for free.
Experienced Smalltalk programmers may want to skip this document.
If you have any suggestions or additions on this topic, let me know.
I admit that some of Smalltalk/X's code does not follow those guidelines:
some code is very old and the author(s) of the code have matured as we all do.
Bad code is refactored whenever we encounter it.
In Smalltalk/X, every class contains a method category called
"documentation" in its class protocol.
You will find at least two methods named
#version and #documentation there:
The #version method
The #version method's comment consists of a single
version line - this is created automatically by the source code
control system (RCS or SCCS).
NEVER manually change this string in the browser
- ST/X depends on it,
in order to be able to extract a class's correct source
code from the source repository.
If you report an error (to eXept),
this string should be used to identify the exact version of the class.
The #documentation method
The #documentation method's comment describes the class, its uses
and (if of public interest) its instance and class variables.
In many classes, you will find an additional #examples method.
This will contain a comment giving typical uses
(often ready to select & doIt).
These methods consist of comments only; they are not meant to be executed.
(Actually, if evaluated, they return the receiver, since an empty method
is semantically equivalent to a '^ self' method.)
The
SystemBrowser
automatically shows the documentation text
(found either in the "documentation" method or in the class comment)
whenever a class is selected.
Thus, to be nice to other people browsing through the system,
you should add a short description of what your class is about in these
documentation methods.
Also, the document viewer can extract the text of a class's documentation
methods and present it nicely formatted - you get your documentation almost
for free if you stick to these conventions!
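As a rough sketch of these conventions, the class-side "documentation" category of a hypothetical Counter class might contain the two methods below. The class, its instance variable and the exact layout of the comment text are assumptions made for illustration; look at existing classes in the browser for the layout actually used in the library.

```smalltalk
documentation
"
    Instances of Counter (a hypothetical class) keep a running count.

    instance variables:
        count       <Integer>       the current value
"

examples
"
    select the lines below and evaluate them with printIt;
    this assumes that Counter implements #increment and #count.

        |c|
        c := Counter new.
        c increment.
        c count
"
```

Both methods consist of a comment only, so evaluating them simply returns the class.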
Do not worry about memory usage when creating documentation methods - simple methods
which return self (as empty methods do) all share a common piece of code,
so there will NOT be thousands of empty methods filling
up your memory.
(To be exact: there is a small overhead per method, caused
by the method object itself - not by the method's code.
However, for production code, all documentation methods can easily be
removed. Also, stc provides a command line argument to skip
all methods in the documentation category,
which allows building more compact class libraries.)
If you don't like the above, use the class's comment string, which also does
not eat up memory (it used to before release 2.10.3 of Smalltalk/X).
BTW: from the author's experience, you should not delay documentation too much.
Write it down as soon as possible - otherwise you may not find the time to
do so later - or you may simply forget to do it. Also, keep in mind that it may
take more time to add those comments later, since you may have to work out again
what is going on. From our experience, the later the documentation is written
in a project, the higher its cost.
If you plan to use the Smalltalk/X sourcecode manager,
every class should contain a #version method, which
must return the class's version string.
The actual format of that string is specific to the underlying
source code management mechanism
(for RCS or CVS, it is something
like: "$Header$";
for SCCS, it is: "%W% %H%").
These version strings will be expanded (by the source code management)
to the actual version;
for example, in the Array class, you will see something like:
version
^ '$Header: /cvs/stx/stx/doc/online/english/programming/codingStyle.html,v 1.27 2009/09/21 13:52:21 cg Exp $'
Notice:
Currently only a manager for CVS is provided - you have to write
your own manager classes (derived from AbstractSourceCodeManager)
if other base mechanisms (for example: SCCS) are to be used.
If such a version method is present, and the sourceCodeManager is enabled,
access to the repository is possible from the browser, and is done
automatically to retrieve a class's source (based on the actual
version of that class in the system).
When classes are checked in via the browser, these version methods
are automatically created (if not already present) - there is also a menu
function to create them manually.
In normal operation, the handling of those is transparent, and you can
safely forget about it (it is useful to know about it, anyway).
Every method should contain (at least) two comments:
- a comment giving a short description
this comment should be the very first comment in the method, and should be placed
between the selector & argument specification and the local variables declaration
or first statement.
It should give a short description of what this method does, and what the
arguments are to be used for.
Please use this type of comment, since ST/X provides special printout
features which allow you to create printed documentation automatically,
based on these comments (similar to javadoc).
See the SystemBrowser's printOut protocol functions - and the
online class documentation.
- a comment giving example uses of this method
this comment (or multiple comments) should be at the very end of the method,
and give some example(s) of how the method is used.
Where appropriate, failure examples should be given as well.
This comment allows select & printIt from the browser, without a need to
type in any expressions,
thus providing simple "HowTo examples" to readers of your code.
Example (from Collections enumeration protocol):
select:aBlock
"return a new collection
with all elements from the receiver,
for which the argument aBlock evaluates to true"
|newCollection|
newCollection := self species new.
self do:[:each |
(aBlock value:each) ifTrue:[
newCollection add:each
].
].
^ newCollection
"
#(1 2 3 4) select:[:e | e odd]
(1 to:10) select:[:e | e even]
"
Of course, you should give your variables and methods descriptive names.
You should do so in any programming language.
In Smalltalk, a common trick is to encode the expected type of a variable
in its name
(which you don't have to do in statically typed languages).
For example, names like "originPoint", "lineString"
or "collectedNames" make it totally clear
what the variables/arguments are used for.
By convention, global variables and class variables should start with
an upper case character - other variables and selectors with a lower case
character.
Think twice before using globals - usually there is no need for them.
Besides increasing code complexity (by introducing side effects),
use of globals may lead to conflicts if packages from different
programming teams are merged and both use the same global name.
Although the browser offers search functions for uses of globals,
you have to manually edit (and think about) the code in this case.
Avoid this by banning globals from your code.
In many situations, a global can be eliminated
by passing additional method arguments
(which may even be an advantage later,
offering more possibilities for reuse of a method).
Almost all globals can easily be
replaced by a private class variable, with access provided to
other parts via class methods.
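A minimal sketch of such a replacement, assuming a hypothetical class Settings with a class variable DefaultTimeout (both names are invented for illustration). The two methods live on the class side:

```smalltalk
defaultTimeout
    "return the shared timeout in seconds;
     the class variable is initialized lazily on first access
     (the value 30 is an arbitrary example)"

    DefaultTimeout isNil ifTrue:[
        DefaultTimeout := 30
    ].
    ^ DefaultTimeout

defaultTimeout:seconds
    "change the shared timeout for all clients"

    DefaultTimeout := seconds

    "
     Settings defaultTimeout
     Settings defaultTimeout:60
    "
```

Other code then asks "Settings defaultTimeout" instead of referring to a global, so the name cannot clash with another package's globals.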
You won't need too many comments in your methods
if the code is clean and straightforward.
Don't add comments just for the sake of commenting.
For example, a comment like:
sum := sum + 1. "add one to sum"
is stupid, and filling your methods with this kind of "information"
actually makes your code less readable.
(You may wonder why this is mentioned here:
we have seen departments where code "quality" was measured by counting comments,
which ended in people producing the above rubbish - only to make the code checkers happy.)
Also, if you think that a variable needs a comment stating its use,
think about changing the variable's name.
For example, the following code is a (stupid) example of a bad variable name:
why not give the variable that name right away? As in:
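A purely illustrative pair of fragments for this point; the variable names and the collection openFiles are hypothetical and do not come from the class library. The two fragments are alternatives, not one piece of code:

```smalltalk
"before: the name says nothing, so a comment has to explain it"
|n|     "n is the number of open files"

n := openFiles size.

"after: the name carries the information; the comment is no longer needed"
|numberOfOpenFiles|

numberOfOpenFiles := openFiles size.
```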
Similarly, if a group of statements needs an explanation, as in:
...
"read the data from file blabla"
...code to read data...
...and so on...
...
I would suggest that you extract those statements into a separate method,
name it according to what it does, and invoke that method:
...
self readDataFromBlaBlaFile.
...
Voila - the method's name is just as good as the original comment.
The above actually means that as your code becomes more readable and less cryptic,
fewer comments are needed.
That does not mean that your code should be completely uncommented;
a lot of Smalltalk PD code is floating around which is very hard to understand
because it does not provide a single informative comment.
This often makes it very hard to understand the overall structure of a framework
or application.
That said, use the following rules:
- if you use special tricks or uncommon constructs,
you should add a comment describing what is going on -
for yourself and for others.
- if a group of objects collaborate, write a class comment which gives a rough overview.
- give a hint on how to start up and shut down an application/server in some class comment.
- provide a set of unit tests to describe the public protocol.
In some methods (especially in included example code),
you may see comments explaining obvious things.
Comments like: " ... now open the view ..."
seem to violate the above arguments.
The reason for those comments is that we expect these code texts to be read mostly by newcomers,
so the code also serves as an introductory text.
Therefore, more than usual is often commented there.
The question of how code should be indented is a very subjective one,
and the discussion is often even a religious one.
For that reason, we will not give any recommendations here.
Instead, the two most commonly used styles are briefly described.
Take the one that you (and your friends) find the easiest to read.
- Lisp-style indent
In this style, all closing parentheses and brackets are lined up at the end,
for example:
foo
"this method performs some fooBar.
Sometimes even baz is done"
doingBar
ifTrue:[
self fooBar.
[doingBaz]
whileTrue:[
self baz].
self moreFooBar.
1 to:10 do:[:index |
1 to:10 do:[:index2 |
self doMore]]]
ifFalse:[
...
and so on]
This style of indentation is seen often in ST-80 code.
(the ST-80 formatter seems to automatically produce output in this format).
Some variations are possible, for example, you can put the ifTrue:
and whileTrue: right behind the receiver (as below).
- C style indent (used by the author)
This style is roughly based on the C indentation style used by
Kernighan & Ritchie.
For example:
foo
"this method performs some fooBar.
Sometimes even baz is done"
doingBar ifTrue:[
self fooBar.
[doingBaz] whileTrue:[
self baz
].
self moreFooBar.
1 to:10 do:[:index |
1 to:10 do:[:index2 |
self doMore
]
]
] ifFalse:[
...
and so on
]
Also, variations are possible; for example, the opening brackets of blocks
can be put onto a separate line, as in:
foo
"this method performs some fooBar.
Sometimes even baz is done"
doingBar ifTrue:
[
self fooBar.
[doingBaz] whileTrue:
[
self baz
].
self moreFooBar.
1 to:10 do:
[:index |
1 to:10 do:
[:index2 |
self doMore
]
]
]
ifFalse:
[
...
and so on
]
However, this seems to spread the code for even small methods quite a bit.
To many programmers, this makes the readability worse.
A simple rule: try to make your methods fit onto a page or screenful
(of course: not by putting all into oneliners ;-)).
Readability is usually better if you do not have to scroll when looking at a method's code.
Therefore, methods should be short.
On the other hand, don't break up a method into many short methods just
for this - find a useful compromise.
(having too many too small methods also often hinders readability)
Many other styles are possible; however, whichever you choose, follow these rules:
- use the style consistently
try not to mix styles.
Although there are some situations where you have to break out
of your indentation (for example, if lines become too long), you should
stick to a general style.
Also, for very simple if- or while-blocks, you may decide to put the block
right after the selector - within the same line.
- use indentation
Never write your code in a Fortran or Basic-style, without any
indentation. It makes your code almost impossible to read.
Don't think that you will never again look at some method and therefore don't
need to indent and/or comment it - experience shows: you or someone else will.
- the style should show the control flow
If you use a style which still requires comments like "end of while"
or "end of this or that if", rethink it. After all, indentation should
express exactly that, without the need for further explanations.
- don't argue about styles
I hate people arguing about indentation styles.
Let others use whatever they like - just as you want them to let you do your
work as you like.
However, if you work in a group and (for whatever reason) you have
to use a common style, discuss it in the group before you start to code.
The following is an incomplete list of recommendations:
- omitting a "^ self" to save some typing
Often, some alternative must return from a method after some other method is invoked.
A typical example is a guard, as in:
...
foo ifTrue:[
^ self doSomethingForFoo.
].
bar ifTrue:[
^ self doSomethingForBar.
].
...etc...
...
The problem with the above code is that from reading it, you do
not know whether the return value from "doSomethingForFoo" is really
wanted here, or if the author was simply a lazy typist and the real return value
is "self". So the reader has to navigate to the implementers of "doSomethingForFoo"
and look for the return value.
Except for the cases when the return value really counts,
you should always write:
...
foo ifTrue:[
self doSomethingForFoo.
^ self
].
bar ifTrue:[
self doSomethingForBar.
^ self
].
...etc...
...
Do not obfuscate the fact that you are not interested in the answer from
"doSomethingForFoo".
- reusing variables
Do not reuse a local variable - a good style is to assign only once (functional style).
Give your variables useful names. The code is written only once, but usually read
many times by yourself and others later.
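A small sketch of the difference, using only standard collection protocol; the collection names and the variable names are hypothetical. The two fragments are alternatives:

```smalltalk
"reused variable: 'tmp' means three different things in three lines"
|tmp|

tmp := names select:[:each | each notEmpty].
tmp := tmp collect:[:each | each asUppercase].
tmp := tmp size.
^ tmp

"assigned once: every name tells what the value is"
|nonEmptyNames upperCaseNames|

nonEmptyNames := names select:[:each | each notEmpty].
upperCaseNames := nonEmptyNames collect:[:each | each asUppercase].
^ upperCaseNames size
```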
- use "contains:", "includes:" etc. instead of "do:"-loops whenever possible
These methods not only save typing - they are also documenting what is done.
Do not write code like:
myMethod
someCollection
do:[:eachElement |
eachElement someCondition
ifTrue:[^ true]
].
^ false.
Instead, write it as:
myMethod
^ someCollection
contains:[:anElement | anElement someCondition].
Just read it aloud and you know why.
- use "contains:", "includes:" etc. instead of "detect:ifNone:" when checking for the presence of an element
The same argument as above applies here; often, "detect:ifNone:"
is used to check for the presence of an element - not to retrieve that element.
This should be avoided;
instead of:
(someCollection
detect:[:el | el someCondition]
ifNone:nil) notNil
ifTrue:[
...
].
write it as:
(someCollection contains:[:el | el someCondition])
ifTrue:[
...
]
Again, if in doubt, read the code aloud to yourself.
As an extreme example, look at:
(coll detect: [:b| b == fooBlock] ifNone: [nil]) isNil
ifFalse:[ ... ]
which is a combination of "double-negation" confusion AND "detect:-to-test-for-inclusion" confusion.
Everyone (even an experienced Smalltalker) has to read and think about
this carefully, only to figure out that it is nothing more than:
(coll includesIdentical:fooBlock)
ifTrue:[ ... ]
By the way: you can search for such code fragments using the SystemBrowser's "code-search" facility.
The above is found with the pattern:
(`@e1 detect: `@b1 ifNone: [nil]) isNil ifFalse: `@b2
Here is another example, which uses two nested loops:
| list foundElement |
list := someList asOrderedCollection.
list do: [:p|
(p blocks detect: [:b| b == someObjectToSearchFor] ifNone: [nil]) isNil
ifFalse: [foundElement := p]
].
foundElement isNil ifTrue:[ ^ false ].
^ true
Let us rewrite this in small steps into a more readable version;
the inner loop's statement is the one we saw in the previous example, so we can rewrite it to:
(p blocks contains: [:b| b == someObjectToSearchFor])
ifTrue: [foundElement := p]
which is the same as:
(p blocks includesIdentical:someObjectToSearchFor)
ifTrue: [foundElement := p]
The outer loop is simply another search, and the "do:" can be rewritten to:
foundElement := list
detect:[:p | (p blocks includesIdentical:someObjectToSearchFor) ]
ifNone:[nil].
As foundElement itself is not needed, the overall code can also be:
| list |
list := someList asOrderedCollection.
^ list
contains:[:p |
(p blocks includesIdentical:someObjectToSearchFor) ]
The last question we should ask is "why do we need a copy of the original collection for the search?".
So we can remove this as well and get the final code:
^ someList
contains:[:p |
(p blocks includesIdentical:someObjectToSearchFor) ]
which is both faster and requires less memory (due to the removed list copy).
And it certainly is much more readable!
- avoid using symbols as enum-values
Very often, symbols are used as enumerated values,
as in:
...
state == #foo ifTrue:[
...doSomethingForFoo...
].
...
state == #bar ifTrue:[
...doSomethingForBar...
].
...
Besides the fact that this code is not very object oriented and
should probably be done using a message send,
this code is very sensitive to typing errors, as neither the
compiler nor the runtime system has a chance to detect any typing mistakes.
For example, if you mistype #bar as #Bar somewhere, the corresponding if-statement
will never be executed and the bug is hard to find.
There are multiple ways to make the above safer:
- Use class variables instead of symbols
Some of the danger can be taken out of the code, by removing the possibility for
typing errors.
Add a few class variables, for example named "FooState" and "BarState",
and initialize them in the class's initialize method as:
initialize
FooState := #FooState.
BarState := #BarState.
Then, use those variables in the case code:
...
state == FooState ifTrue:[
...doSomethingForFoo...
].
...
state == BarState ifTrue:[
...doSomethingForBar...
].
...
Codewise, this does not make much of a difference - except
that you will be notified when you mistype one of the state names.
If state names have to be shared among classes or components,
either use a pool dictionary or (better?) use class access methods to
fetch the states from the other class.
- Use state-objects
The above being only a first step towards maintainability,
a much better solution, which also makes later extensions much simpler, is
to provide separate state objects (state classes) and move functionality into
them.
This will eventually remove all of these switches and is generally called "object oriented".
So as a first step, create some StateClass, and two subclasses named FooState and
BarState (these can of course be private classes).
Then, if there is any need to check for the state explicitly,
use testing protocol:
...
state isFooState ifTrue:[
...doSomethingForFoo...
].
...
state isBarState ifTrue:[
...doSomethingForBar...
].
...
However, once this is done,
the obvious next step is to move the above action into the state object
and get rid of the if;
if required, pass the receiver, as in:
...
state doXFor:self.
...
The BIG advantage of this becomes apparent when a new state is added.
BTW: this is a well-known pattern in OO-programming.
Notice that the Smalltalk language offers the added convenience that the class itself
can be used as a state object (which means that you don't have to care about singletons, instance creation,
identity or other problems).
In most other so-called "OO" languages, this is not possible; either because there is no such thing as
a class available at runtime, or because static (class-) methods are not inherited the way that
instance methods are.
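A rough sketch of this pattern, using the classes themselves as state objects. FooState, BarState and doXFor: follow the names used above; the "FooState class >> selector" headers are only a notation for "class-side method of", not file-in syntax, and the client selectors are assumed to exist:

```smalltalk
"class side of FooState"
FooState class >> doXFor:aClient
    aClient doSomethingForFoo

"class side of BarState"
BarState class >> doXFor:aClient
    aClient doSomethingForBar

"in the client: the state is simply one of the classes,
 and the former if-cascade shrinks to a single message send"
state := FooState.
state doXFor:self.
```

Adding a third state then means adding one more class which implements doXFor:; the client code which sends doXFor: stays untouched.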
- Message chains which follow the object structure
Do not hardwire knowledge about object relations into your program;
a typical example (which is very bad) is the following extract from
a piece of code found in the manchester archive:
...
aReadStream ioConnection input readWait.
^ aReadStream atEnd not
...
Obviously, the readStream object is some kind of socket-accessing
thingy, and the programmer wants to wait for something to arrive.
The bad thing about the above is that it hardwires the knowledge of
the (internal) layout of that socket-accessing thingy into the program.
(I.e. if the underlying stream implementation is ever changed to no longer use
an ioConnection slot,
or that ioConnection object no longer has an input slot,
your code is doomed.)
Another example is:
...
widget wrapper wrapper wrapper controller enable.
...
Here, the widget's internal hierarchy is reflected in the access path,
and the code will have a very hard time if the wrapper hierarchy ever changes.
These code fragments were written specifically for a ParcPlace system, but even ParcPlace
will not be able to change any internals without affecting this program.
Thus, there is a chance that the above code will not work without change in the next
ParcPlace version (we don't even think of porting it without change to another Smalltalk dialect).
A better solution would be to provide a #readWait in the class of the socket-accessing
object, and delegate things there,
and to provide an #enable in the widget, which delegates it to some controller.
This hides the object structure, allows ParcPlace to change things,
and allows ST/X's Socket class to behave like other SocketAccessors.
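The delegation for the first fragment could look roughly like the sketch below. The names ioConnection, input and readWait are taken from the quoted code; that the stream class holds ioConnection as an instance variable is an assumption:

```smalltalk
"in the class of the socket-accessing stream"
readWait
    "wait until input is available.
     The knowledge about the ioConnection and its input
     stays inside this class"

    ioConnection input readWait

"the caller no longer walks through the object structure"
aReadStream readWait.
^ aReadStream atEnd not
```

If the stream implementation changes later, only this one method has to be adapted.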
Of course, the above examples do not only apply to system classes -
the same is often found in user code (e.g. GUI code).
Copyright © 1995 Claus Gittinger Development & Consulting
<cg@exept.de>
Doc $Revision: 1.27 $ $Date: 2009/09/21 13:52:21 $