
Smalltalk basics

					To a hammer,
					everything looks like a nail.


Motivation

Smalltalk provides many features which are hard or impossible to implement in many other programming languages - namely closures, real reflection, dynamic late binding and a powerful integrated (I mean: really integrated) development environment.

Before describing the language and how you can create your own programs, we should explain a few basics - both to give you some background and to define the technical terms used in the documentation (and literature).

Keep in mind that this text is only a short introduction - for more detailed information, we recommend reading a standard textbook on the language
(-> 'literature').

Definitions, Nomenclature and Concepts

Objects

In Smalltalk, everything is about objects. Objects can be as simple as a number, a character or as complex as a graphic being displayed, windows, user dialogs or complete applications.
Typical applications consist of many objects (hundreds or thousands), each being specialized for and responsible for a particular functionality.

In contrast to hybrid systems like C++ or Java, everything is an object in Smalltalk; this includes integers, characters, arrays, classes and even the program's stack frames, which hold the local variables during execution.
In Smalltalk, there are no such things as "builtin" types or classes which have to be treated differently, or which do not behave exactly like other objects with respect to message sending, inheritance or debuggability.

Messages & Protocol

Objects communicate by sending messages to other objects.
To the outside world, any internals of an object are hidden - all interaction is only via messages.
The set of messages an object understands is called its message protocol, or protocol for short.
For example,
a number's protocol contains the +, -, * ... messages.
a string's protocol contains the asUppercase, asLowercase ... messages.
When an object receives a message, it is the object itself which decides how to react to the message - the smalltalk language does not attach any semantic meaning to it.
Therefore, theoretically, an object may add the "+" message to its protocol and perform an operation which has nothing to do with the mathematical concept of adding numbers.
In practice, this is never done, since it makes programs less understandable. However, it is useful to keep in mind that only the message's receiver is responsible for the outcome.
(As we will see, this is also the reason for the uncommon precedence rules in binary operations.)
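
As a small sketch of what "protocol" means in practice, the following expressions ask objects whether a given message is part of their protocol. It assumes the query message respondsTo:, which ST/X and most other dialects provide for all objects; #frobnicate is a made-up selector used only for illustration. Evaluate each line separately in a workspace (using printIt):

```smalltalk
3 respondsTo: #+                     "true: numbers understand +"
'hello' respondsTo: #asUppercase     "true: strings understand asUppercase"
3 respondsTo: #frobnicate            "false: a made-up selector, not in the protocol"
```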

On the other hand, it makes the system very flexible. For example, it is very easy to extend the numeric class hierarchy by additional things like Complex numbers, Matrices, Functional Objects etc.
All that is required for those new objects to be handled correctly is that they respond to some basic mathematical protocol for arithmetic, comparison etc. Existing mathematical code is usually not affected by such extensions, which makes Smalltalk one of the best environments for code reuse and sharing.

Classes & Instances

Since there are often many objects which share the same protocol, smalltalk groups similar objects into classes. Every object is an instance of some class, and instances of the same class have the same protocol (although each may have different internal state).
This does not imply that instances of different classes must implement different protocols - actually, there are classes in the system (numeric classes, collection classes) which implement a common protocol, but are implemented completely differently internally. This is called polymorphism.

Classes may have zero, one or many instances.
You may wonder how a class without instances could be useful - this will become clear when inheritance and abstract classes are described further down in this document.

Examples:
1, 99 and -8 are instances of the Integer class
1.0 and 3.14159 are instances of the Float class
'hello', 'foo' are instances of the String class
the buttons in a window are instances of the Button class
nil is the one and only instance of the UndefinedObject class
For the curious:
we say that smalltalk is a class based object oriented language. There are other languages around which are not based upon the concept of classes - the Self programming language, for example.
However, most object oriented languages (C++, Eiffel, Java and many others) are class based.
So, in Smalltalk every object is an instance of a class, and it's the class which describes the behavior of its instances (by defining the protocol and thereby defining how instances react to messages).
Smalltalk allows access to this class at runtime - this is called reflection.
Because Smalltalk is a pure object oriented language, this class thing is also an object and therefore responds to a set of messages (more on this below).
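
The following sketch shows this kind of reflection from a workspace: every object can be asked for its class, and the answer is itself an object which can be sent messages. The concrete classes given in the comments are those answered by ST/X; other dialects may answer a different concrete class for the same literal, so treat them as an assumption. Evaluate each line separately (using printIt):

```smalltalk
1 class                   "SmallInteger in ST/X, a subclass of Integer"
1 isKindOf: Integer       "true"
'hello' class             "String in ST/X"
nil class                 "UndefinedObject"
1 class name              "the class is an object too, and answers its name"
```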

Methods

When a message is sent to an object, a corresponding action has to be performed - technically, a piece of code must be executed. This piece of code is called a method.

Every class keeps a table which associates the name of the message (the so called message selector) to a method.
When a message is sent to an object, the class's method table (called "MethodDictionary") is searched for a corresponding entry and - if found - the associated method is invoked (more details below ...).

Since smalltalk is a pure object oriented language, this table is an object and accessible at execution time; it may even be modified during execution and allows objects to learn about new messages dynamically.
(Of course, the interactive programming environment heavily depends on this; for example, the browser is a tool which adds new items to this table when a method's new or changed code is to be installed...)
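
As a hedged illustration, a class can be asked at execution time whether its table contains an entry for a selector. The sketch assumes the query messages includesSelector: (looks only at the receiver's own table) and canUnderstand: (also considers inherited methods), as found in ST/X and most other dialects; it further assumes that printString is defined in class Object, which is the usual arrangement:

```smalltalk
Object includesSelector: #printString     "true if Object's own table has an entry"
Integer canUnderstand: #printString       "true: instances of Integer inherit it"
```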

Inheritance

In smalltalk (like in most other object oriented languages), classes are organized as a tree. Every class has a so called superclass, and is called a subclass of its superclass. (C++ programmers call these "baseclass" and "derived class" respectively).
Since the superclass may itself have a superclass, we get a superclass-chain, which (typically) ends in a class called "Object" (*).
In Smalltalk, a class can have only a single superclass (as opposed to C++, for example, where classes can inherit from multiple baseclasses) (**).

A class inherits all protocol as defined by its superclass(es) and may optionally redefine individual methods or provide additional protocol.

Therefore, a message send performs the following actions (***):

  1. the object's class is asked for its method table
  2. the table is searched for a method associated with the selector
  3. if found, the method's code is executed
  4. if not found, the previous steps are repeated with the superclass's table
  5. if not found and there is no superclass, an error is reported (see below)
Error reporting is done by packaging the failed message's selector and arguments into a so called "message" object and sending another message (#doesNotUnderstand:) to the receiver with the message object as argument.
This mechanism can be used to implement special error handling and recovery mechanisms.
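
The superclass chain which the lookup walks along can be made visible with a few lines of workspace code. This is only a sketch: it uses a local variable, a block and a whileTrue: loop (all described later), and it assumes ST/X's "Transcript showCR:" for output. The classes printed depend on the dialect; the chain ends after Object, whose superclass is nil.

```smalltalk
| cls |
cls := 1 class.                     "start with the class of the receiver"
[cls notNil] whileTrue: [
    Transcript showCR: cls name.    "print the class visited by the lookup"
    cls := cls superclass           "continue with the superclass"
]
```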

Footnotes:
(*)
For the curious:
Although most classes eventually do inherit from Object, there is no need to. Actually, it may occasionally make sense for a class to inherit from no class at all (i.e. to have no superclass). The effect is that instances of such classes do not inherit ANY protocol and will therefore trigger an error for all received messages.
This behavior is useful to implement advanced features, such as proxies (placeholders) for remote objects, message tracers, etc.
(**)
For the curious:
Support for multiple inheritance (MI) in C++ serves two purposes: first, to inherit private variables and operations (i.e. reuse) and second, to support polymorphism.
As multiple inheritance can make things very complicated and can lead to a number of problems, Smalltalk does not support it (there used to be an experimental implementation in early ST versions, but it was abandoned later).
Java provides interfaces which do solve the polymorphism issues of MI. Smalltalk does not require interfaces for that reason, as any message can be sent to any object (no matter what the class of the receiver is).
(***)
For the curious:
All smalltalk implementations use various tricks (caching) to avoid the above search (also called method lookup) if possible.
In most situations, the method's code which corresponds to a selector is reached quickly by an indirect function call.

Instance variables

An object may contain internal state (also called attributes). In smalltalk, this state usually consists of references to other objects (*).
The slots which hold those references are called instance variables (**).
(Some refer to them as "private variables".)

All instances of a class provide the same message protocol, but typically contain different internal state.
It is actually the class, which provides the definition of the protocol and amount of internal state of its instances.

Example,
'hi' and 'world' are both instances of the String class and respond to the same set of messages. But the internal state of the first string consists of the characters "h" and "i", whereas the second contains the characters "w", "o", "r", "l", "d".

An object's instance variables are only accessible via the protocol provided by the object - there is no way to access an object's internals except by sending messages to it.
This is true for every object - even for the strings in the example above.
There is no need for the sender of a message to actually know the class of the receiver - as long as it responds to the message and performs the appropriate action.

Example,
a string provides access to its individual characters via the 'at:' message. You could write an ExternalString class, which fetches characters from a file and returns them from this message. The sender of the 'at:' message would not be affected at all by this (except for a possible performance degradation ;-).
What is more important: as long as the required protocol is implemented, every program which used to work with instances of String will also work unchanged with instances of ExternalString - there is no need to change the program in any way; there is not even a need to recompile, rebuild or any other means of telling the system about this new class.
Such additions are even possible while the program is executing.
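
To illustrate that the sender does not need to know the receiver's class, here is a minimal sketch which sends the same 'at:' message to instances of three existing classes (a String, an Array and a Symbol) instead of the hypothetical ExternalString. Evaluate each line separately in a workspace (using printIt):

```smalltalk
'hello' at: 1             "$h, the first character of a String"
#(10 20 30) at: 1         "10, the first element of an Array"
#hello at: 1              "$h, a Symbol answers its characters as well"
```

In each case, it is the receiver which decides how the element is fetched; the sending code is the same.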

FootNotes:
(*)
For the curious:
other state which is not held in instance variables, and which is not a reference to another object, is the instance's size (for collections) and its hash key.
These are not accessible as instance variables - special protocol is provided to access them (#basicSize, #identityHash etc.).
(**)
For the curious:
technically, those references are mostly pointers to the referred object, with a few exceptions: smallIntegers keep the numeric value as a bit pattern, strings and others store raw bytes but simulate holding character objects to the outside world; finally, some Smalltalk implementations represent the nil-Object internally by a special NULL-pointer.

Metaclasses

Since smalltalk is a pure object oriented language, everything within the smalltalk world is an object - this implies that every object's behavior is determined by its class.
This is even true for classes themselves - to the smalltalk system, these are just like any other object and their protocol is specified by the class's class. These `class classes' are called Metaclasses.

Thus, when we send a message to some `normal' object, the corresponding class object provides the behavior - when some message is sent to a class object, the corresponding metaclass provides the behavior.
Technically, messages to classes are treated exactly the same way as messages to non-class objects: take the receiver's class, lookup the method in its method table, execute the method's code.

Since different metaclasses may provide different protocol for their class instances, it is possible to add or redefine class messages just like any other message.
As a concrete example, take instance creation which is done in Smalltalk by sending a "new"-message to a class.
In Smalltalk, there is no such thing as a built-in "new" (or any other built-in) instance creation message - the behavior of those instance creation (class) messages is defined exclusively by metaclass protocol.
Therefore, it is possible (and often done) to redefine the "new" method for special handling; for example singletons (classes which have only a single unique instance), caching and pooling (the "new" message returns an existing instance from a cache), tracing and many more are easily implemented by redefining class protocol.
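
The relationship between a class and its metaclass can be inspected from a workspace. The following is a small sketch using only standard reflection messages; the printed representations given in the comments are what ST/X typically shows and may differ slightly in other dialects:

```smalltalk
String class              "String class - the metaclass of String"
String class class        "Metaclass"
Array new: 3              "instance creation is an ordinary message sent to the class"
Set new                   "another class, another implementation of new"
```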

Abstract Classes

Abstract classes are classes which are not meant to be instantiated (i.e. no instances of them are to be created). Their purpose is to provide common functionality for their subclass(es).
In Smalltalk, the most obvious abstract class is the Object-Class, which provides a rich protocol useful for all kinds of objects (comparing, dependency mechanism, reflection etc.).

Smalltalk Language Syntax

To a newcomer, the smalltalk language syntax may look somewhat strange at the beginning; however, you will notice that the syntax is highly orthogonal and pretty simple compared to most other programming languages (except lisp ;-).
Interestingly, people who have not previously been exposed to languages such as C or C++ often find Smalltalk much more intuitive than hard core programmers do.

As we will see shortly, smalltalk programs only consist of messages being sent to objects.
Since even control structures (i.e. conditional evaluation, loops etc.) are conceptually implemented as messages, a common syntax is used in your programs both for the program's flow control and for manipulating objects.
Once you know how to send messages to an object, you also know how to write and use fancy control structures.

Smalltalk's power (and difficulty to learn) does not lie in the language itself, but instead in the huge protocol provided by the class library's objects.

Let's start with the language's building blocks ...

Spaces and program layout

The smalltalk syntax is format free (as opposed to the Fortran language, for example). Spaces and line breaks may be added to a smalltalk program without changing the meaning, except for the following:

Comments

In smalltalk a comment is anything enclosed in double-quotes ("). A comment ".." may span multiple lines.
Examples:
    "some comment"
    "this
     is
     a
     multiline comment"
    "
     another multiline comment
    "

As a language extension, ST/X also allows end-of-line comments; these are introduced by the character sequence "/ (doubleQuote-slash) and treat everything up to the end of the line as a comment:

    "/ this is an end-of-line comment

Literal Constants

Literal constants in a smalltalk source are processed by the compiler, which creates corresponding objects at compilation time.
This is in contrast to run-time created objects, which are typically created by some variant of the #new message sent to a class or the #copy message sent to some instance.

The following literal constants are allowed:

Identifiers

Identifiers (variable names) identify a variable. In smalltalk, a variable holds a reference to some object (technically, a pointer to some object - not an object's contents).
Variables come in various flavours - differing in their scope (i.e. the visibility) and their lifetime.
Among others, there are global variables, class variables, class-instance variables, instance variables, arguments and local variables.

Identifiers must start with a letter or an underscore character. The remaining characters may be letters, digits or the underscore character (*).
Examples:

By convention, the identifiers of global and class variables should start with an upperCase character.
Instance variables, arguments and local variables should start with a lowerCase character.

FootNotes:
(*)
Characters in variable names:
since not all smalltalk dialects allow underscore characters in a variable name, this may be disabled in ST/X, to support portability checking of your code.

For portability with some (VMS-)VisualWorks Smalltalk variants, a dollar character ($) can also be allowed inside an identifier as a compiler option.

Special Identifiers (builtin names)

Messages

A message consists of three parts: the receiver, the message name (called the selector) and optional arguments.

Unary Messages

Messages without arguments are called Unary Messages.
For example:
    1 negative
sends the message "negative" to the number 1, which is the receiver of the message.

Unary messages, like all other messages, return a result, which is simply another object.
In the above case, the answer from the "negative" message is a boolean object; in case of the number 1, false is returned.

Evaluate this in a workspace (using printIt);
try different receivers (especially: try a negative number).

Unary messages parse left to right, so, for example:

    1 negative not
first sends the "negative"-message to the number 1. Then, the "not"-message is sent to the returned value. The response of this second message is returned as the final value.
If you evaluate this in a workspace using printIt, the returned value will be true.

Try a few unary messages/expressions in a workspace:

    1 negated
    -1 negated
    -1 abs
    1 abs
    5 sqrt
    1 isNumber
    $a isNumber
    1 isCharacter
    $a isCharacter
    'someString' first
    'hello world' size
    'hello world' asUppercase
    'hello world' sort
    #( 17 99 1 57 13) sort
    1 class name
    1 class name asUppercase

Keyword Messages

This type of message allows for arguments to be passed with a message. A keyword message consists of one or more keywords, each followed by an argument.
Each keyword is simply a name followed by a colon; by convention, its first character should be lower case.
The arguments may be literal constants, variables or other message expressions (these must be grouped using parentheses if another keyword message's result is to be used as argument).
For instance, in the message
    5 between:3 and:8
"between:" and "and:" are the keywords, and the numbers 3 and 8 are the arguments. The object representing the number 5 is the receiver of the message.

The message's actual selector is formed by the concatenation of all individual keywords; in the above example, it is "between:and:".
This is different from both a "between:" and an "and:"-message, which often leads to beginners' errors.
(Of course, "between:and:" and "and:between:" are also different messages.)
In the browser, the corresponding method is listed under the name: "between:and:".
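
That "between:and:" is one single selector can be checked in a workspace. The sketch below assumes the messages numArgs (understood by symbols) and perform:with:with: (understood by all objects), which ST/X and most other dialects provide; evaluate each line separately (using printIt):

```smalltalk
#between:and: numArgs                         "2: one selector taking two arguments"
5 perform: #between:and: with: 3 with: 8      "true: the same as  5 between:3 and:8"
5 respondsTo: #between:and:                   "true"
```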

Keyword messages do parse left to right, but if another keyword follows a keyword message, the expression is parsed as a single message (taking the keywords' concatenation as selector).
Thus, the expression:

    a max: 5 min: 1
would send a "max:min:"-message to the object referred to by the variable "a".
This is not the same as:
    (a max: 5) min: 1
which first sends the "max:"-message to "a", then sends the "min:"-message to the result.
Try these in a workspace (don't fear the error...)

To avoid ambiguity, you must place parentheses around the inner message expression.

Try a few keyword messages/expressions in a workspace:

    1 max: 2
    1 min: 2
    (2 max: 3) between: 1 and: 3
    (1 max: 2) raisedTo: (2 min: 3)

Unary messages have higher precedence than keyword messages, thus

    9 max: 16 sqrt
evaluates to 9,
because it is evaluated as: "9 max: (16 sqrt)" which is "9 max: 4".
It is not "(9 max: 16) sqrt", which is "16 sqrt" and would give 4 as answer.

Binary Messages

A binary message takes exactly one argument. Its selector is formed from one or two non-alphanumeric special characters.
Some characters, such as braces, parentheses or the period cannot be used as binary selectors (*).
Binary messages are mostly used for messages which perform arithmetic operations - although this is not enforced by the system; i.e. no semantic meaning is known to the smalltalk compiler, and binary messages can be defined and used for any class.
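
As a brief sketch of binary messages which are not plain number arithmetic, here are a few binary selectors as implemented by the standard String, Array and Point classes. Evaluate each line separately in a workspace (using printIt):

```smalltalk
'abc' < 'abd'             "true: comparison of strings"
#(1 2) , #(3 4)           "#(1 2 3 4): concatenation of arrays"
3 @ 4                     "a Point with x = 3 and y = 4"
(3 @ 4) + (1 @ 1)         "the Point 4@5: points define their own +"
```

In every case, the receiver's class defines what the binary message does.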

An example of a binary message is the one which implements arithmetic addition for numeric receivers (it is implemented in the Number classes):

    1 + 5
This is interpreted as a message sent to the object 1 with the selector '+' and one argument, the object 5.

Binary messages parse left to right (like unary messages).
Therefore,

    2 + 5 * 3
results in 21, not 17.
(first, '+' is sent to 2, with 5 as argument. This first message returns 7.
Then, '*' is sent to 7, with 3 as argument, resulting in 21 being answered.)

To change the execution order or to avoid ambiguity you should place parentheses around:

    2 + (5 * 3)
Now, the execution order has changed and the new result will be 17.

Unary messages have higher precedence than binary messages, thus

    9 + 16 sqrt
evaluates as "9 + (16 sqrt)", not "(9 + 16) sqrt".

On the other hand, binary messages have higher precedence than keyword messages, thus

    9 + 16 max: 3 + 4
evaluates as "(9 + 16) max: (3 + 4)" which is "25 max: 7" and answers 25.
It is not the same as "9 + (16 max: 3) + 4" (which results in 29) or "((9 + 16) max: 3) + 4" (which in this case also results in 29).

Again, we highly recommend the use of parentheses - even when the default evaluation order matches the desired order; it makes your code much more readable, and helps beginners a lot.

Try a few binary messages/expressions in a workspace:

    1 + 2
    1 + 2 * 3
    (1 + 2) * 3
    1 + (2 * 3)
    -1 * 2 abs
    (-1 * 2) abs
    5 between:1 + 2 and:64 sqrt
    5 between:(1 + 2) and:(64 sqrt)

The second example above shows why parentheses are so useful: from reading the code, it is not apparent whether the evaluation order was intended or is wrong.
You will be happy to see parentheses when you have to debug or fix a program which contains a lot of numeric computations.
Here are a few more "difficult" examples:
    1 negated min: 2 negated
    1 + 2 min: 2 + 3 negated

Notes:
(*)
Binary Characters:
There is no real standard on which characters are actually allowed. For example, ST/X does allow for "#" or "!" to be used as binary selector, while other smalltalk implementations do not.
Also, ST/X allows up to three characters, while other smalltalk implementations only allow two.
For portable code, do not use more than 2 characters, and no characters other than:
"+" ,"-" , "*" , "/" , "\" , "," , "%" , "&" , "|" , "<" , ">" , "=" , "?".

In ST/X, the actual set of allowed characters can be queried from the system by evaluating (and printing) the expression "Scanner binarySelectorCharacters".

(**)
For the curious:
Technically, binary messages do not add any new functionality to the smalltalk language - they are just syntactic sugar and smalltalk could easily have been defined without them
(i.e. in a Lisp-style, using keyword messages like 'plus:', 'minus:' etc.)

Message syntax summary

For some (especially for C or Java programmers), smalltalk's message syntax might seem strange at first.
Interestingly, people with less programming experience seem to have fewer problems with this syntax - they often even find it more intuitive!

If you compare your favourite programming language against regular english, you will find smalltalk to be much more similar to plain english than most other programming languages.
For example, consider the order to a person called tom, to send an email message to a person called jane:
(assuming that tom, jane, theEmail refer to objects)

English                                          Smalltalk                                        C++ or Java
tom, send an email to jane.                      tom sendEmailTo: jane.                           tom.sendEmail(jane);
tom, send theEmail to jane.                      tom send: theEmail to: jane.                     tom.sendEmail(theEmail, jane);
tom, send theEmail to jane with subject: 'hi'.   tom send: theEmail to: jane withSubject: 'hi'.   tom.sendEmail(theEmail, jane, "hi");

Now, viewed from that angle, smalltalk's syntax looks less strange, and actually pretty much like plain english.
Smalltalk was in fact originally designed to be easily read by non-programmers.

Message examples & explanations

Here are a few message expressions as examples:
1 negated
sends "negated" to the number 1, which gives us a -1 (minus-one) as result.

1 negated abs
demonstrates left-to-right evaluation of unary messages; first sends "negated" to the number 1, which gives us an intermediate result of -1 (minus-one); then, the message "abs" is sent to it, giving us a final result of 1 (positive-one).

-1 abs negated
first sends "abs" to the number -1 (minus-one), which gives us a 1 (positive one) as intermediate result. Then this object gets a "negated" message.
The final return value is the number "-1" (minus-one).

1 + 2
that seems obvious, but is a message send in smalltalk: it sends the message "+" to the number 1, passing it the number 2 as argument. The returned object is 3.
Notice that, strictly speaking, the smalltalk language does not define or require that the performed operation is an addition; instead, this is defined by how numbers react to (i.e. implement) the "+" message.
However, programmers would have a hard time if this was not defined as "addition", so in general, messages in the smalltalk class libraries perform the action one would expect.

1 + 2 + 3
demonstrates left-to-right evaluation of binary messages; first, the message "+" is sent to the number 1, passing it the number 2 as argument. Then, another "+" message is sent to the intermediate result, passing the integer-object 3 as argument.

1 + 2 * 3
that is less obvious - however, from the above you should understand that left-to-right evaluation is always done in smalltalk (since the language does not define any arithmetic semantic for any message).
So, the outcome will be 9; not 7 as one would expect from mathematical precedence rules.

-1 abs + 2
demonstrates precedence rules when mixing unary and binary messages.
first sends "abs" to the number -1 (minus-one), then sends "+" to the result, passing 2 as argument.
The final return value is the number "3".

1 + -2 abs
demonstrates precedence rules when mixing unary and binary messages.
first sends "abs" to the number -2, then sends "+" to the number 1, passing the result of the first message as argument.
The final return value is the number "3".
Remember: unary messages have higher precedence than binary messages

-1 abs + -2 abs
demonstrates precedence rules when mixing unary and binary messages.
first sends "abs" to the number -1 (minus-one) and remembers the result. Then sends "abs" to the number -2 and passes this as argument of the "+" message to the remembered object.
The final return value is the number "3".

1 + 2 sqrt
demonstrates precedence rules when mixing unary and binary messages.
first sends "sqrt" to the number 2, then passes this as argument of the "+" message to the number 1.
The final return value is the number "2.41421".
Remember: unary messages have higher precedence than binary messages

(1 + 2) sqrt
first sends "+" to the number 1, passing 2 as argument. Then sends "sqrt" to the result.
The final return value is the number "1.73205".

1 min: 2
sends the "min:" (minimum) message to the number 1, passing 2 as argument.
The return value is the number "1" (the smaller one).

(1 max: 2) max: 3
first sends the "max:" (maximum) message to the number 1, passing 2 as argument. Then sends "max:" to the returned value, passing 3 as argument.
The final return value is the number "3" (the largest one).

(1 + 2 max: 3 + 4) min: 5 + 6
first sends "+" to the number 1 passing 2 as argument and remembers the result. Then, "+" is sent to the number 3, passing 4 as argument. Then, "max:" is sent to the remembered first result, passing the second result as argument. The result is again remembered. Then, "+" is sent to the number 5, passing 6 as argument. Finally, the "min:" message is sent to the remembered result from the first max: message, passing the result from the "+" message.
The final return value is the number "7".
Remember: binary messages have higher precedence than keyword messages

1 max: 2 max: 3
tries to send "max:max:" message to the number 1, passing the two arguments, 2 and 3.
Since numbers do not respond to a "max:max:" message, this leads to an error (message-not-understood).

This example illustrates why parentheses are highly recommended - especially with concatenated keyword messages.

'hello' at:1
sends the "at:" message to the string constant.
The return value is the character "h" (which displays itself prefixed by a $ dollar).

'hello' , ' world'
sends the "," binary message to the first string constant, passing another string as argument.
This message is implemented in the String class, and returns a concatenation of the receiver object and its argument.
The returned object is a new string, consisting of the characters 'hello world'.

'hello' , ' ' , 'world'
first sends the "," binary message to the first string constant, passing ' ' as argument. Then, the result gets another "," message, passing 'world' as argument.
The returned object is a new string, consisting of the characters 'hello world'.

#(10 2 15 99 123) min
sends the "min" unary message to an array object (in this case: a constant array literal). All collections respond to the min message, by searching for the smallest element, and returning it.

Statements

Multiple message expressions or assignments (see below) may be evaluated in sequence, by separating individual expressions with a '.' (period) character. For example:
    -1 negated.
    1 + 2.
first sends the "negated" message to -1 (minus one), ignoring the result. Then, the "+" message is sent to 1 (positive one), passing the number 2 as argument.

Notice that there is actually no need for a period after the last statement (it's a statement separator) - it does not hurt, though.
We will encounter more (useful) examples for multiple statements below.

Variables

In smalltalk, a variable holds a reference to some object - we say, a variable "is bound" to some object.
A variable may refer to any object - there is no limitation as to which type of object (i.e. the objects class) a variable may refer to.

Every variable is automatically initialized to nil when created.

For now, only some global variables and local variables are described (because we need them for more interesting examples); the other variable types will be described later.

Global Variables

Global variables hold references to objects which are of common interest; especially, most classes can be referred to via a global variable
(for the curious: it is possible to create anonymous classes, which cannot be referred to via a global variable).

Besides classes, a few other objects can be referred to via a global; the most interesting for now is the Transcript (the system's console window).

In general, from a software engineering point of view, the use of global variables for anything other than classes is considered to be bad style.
Making something globally visible is usually not really required - we recommend using class variables (see below) and provide access to those via access-methods.
Even a simple access to the Transcript as above leads to trouble when multiple transcript views are open.

That said (and kept in mind), being able to access the Smalltalk console via the Transcript is often very helpful: it allows you to send debugging and informative messages from the program.
For example:

    Transcript show: 'Hello world'
shows that greeting in the Transcript window, and
    Transcript cr
advances its text cursor to the next line.
There is also a combined message, as in:
    Transcript showCR: 'Hello world'
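
The Transcript messages combine well with the precedence rules described earlier. The following sketch assumes the ST/X messages shown above (show:, cr and showCR:) plus the standard printString message, which answers a string representation of any object. Evaluate all lines together (using doIt):

```smalltalk
Transcript show: 'The sum is: '.
Transcript show: (3 + 4) printString.     "unary printString is sent to the result 7"
Transcript cr.
Transcript showCR: 'Twice that: ' , (2 * (3 + 4)) printString
```

This shows "The sum is: 7" and "Twice that: 14" on two lines. In the last statement, the whole binary expression to the right of "showCR:" is evaluated first and passed as the argument, since binary messages have higher precedence than keyword messages.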

Workspace Variables

When executing (example-) expressions in a workspace, it is often helpful to be able to refer to an object via a globally known name (for example, to be able to send messages to it later).
For this, Smalltalk/X provides Workspace variables.
These behave much like global variables, in that their lifetime is not limited to a method or block execution or to a particular instance. However, they are only visible in the context of some workspace's doIt evaluation - they do not conflict with a global of the same name.

Workspace variables are created and destroyed via corresponding menu functions in the workspace window.

Local Variables

A local variable declaration consists of an opening '|' (vertical bar) character, a list of identifiers and a closing '|'. It must be located before any statement within a code entity (a block or method, which are described below).
For example:

    | foo bar baz |
declares 3 local variables, named 'foo', 'bar' and 'baz'.

A local variable's lifetime is limited to the time the enclosing context is active - typically, a method or a block (you will learn later what a block is).

Notice that when a piece of code is evaluated in a workspace window, the system generates an anonymous method and invokes it for execution. Therefore, a local variable declaration is also allowed with doIt-evaluation (the variable's lifetime will be the time of the execution).

Instance Variables

Instance variables are private to some object and their lifetime is the lifetime of the object.
We will come back to instance variables, once we have learned how classes are defined.

Assigning a Value to a Variable

A variable is bound to (i.e. made to refer to) an object by an assignment expression.
Assuming that "foo" and "bar" have been declared as variables before, you can assign a value with:
    foo := 1
or:
    bar := 'hello world'

This makes the variable refer to the object as specified by what is written after (to the right of) the assignment symbol. This may be either a literal (i.e. a constant), the value of some other variable, or the outcome of some message expression.
Multiple assignments are allowed, as in:
    foo := bar := baz := 1

Notice:
Beginners should be careful not to forget the colon character ":".
If you write "=" instead, that is a message send expression and means "is equal to" (i.e. it's a comparison operator).
Therefore,
    foo := baz = 1
would assign true or false to "foo", depending on whether "baz" is equal to 1 or not.

All variables are initially bound to nil.
(This is similar to the default initialization of fields in Java or C#; as opposed to C or C++, you will never get random values in a Smalltalk variable.)
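
As a small sketch of this (meant to be evaluated with doIt in a workspace; the variable name is arbitrary), a freshly declared local answers true to the isNil message until something is assigned to it:

    |foo|

    Transcript showCR:foo isNil printString.      "true - nothing assigned yet"

    foo := 42.
    Transcript showCR:foo isNil printString.      "false"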

Keep in mind, that only a reference to an object is stored into the variable, not the state of the object itself. This means, that multiple variables may refer to the same object.
For example:

    |var1 var2|

    "create an Array with 5 elements ... and assign it to var1"
    var1 := Array new:5.

    "and also to var2"
    var2 := var1.

    "change the 2nd element..."
    var1 at:2 put:1.

    Transcript showCR:var1.
    Transcript showCR:var2.
The previous example demonstrates that both var1 and var2 refer to the same array object; i.e. that in Smalltalk, a variable actually holds a reference to an object, and that more than one variable may refer to the same object.
Technically speaking: a variable holds a pointer to the object.

This is especially true with multiple assignments; so:

    foo := bar := 'hello'
binds both "foo" and "bar" to the same string-object.
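
To check whether two variables refer to the very same object, standard Smalltalk offers the identity comparison "==", in addition to the equality comparison "=" mentioned above. The following is only a sketch; the variable "baz" and the use of the copy message are introduced here for illustration:

    |foo bar baz|

    foo := bar := 'hello'.
    baz := 'hello' copy.

    Transcript showCR:(foo == bar) printString.   "true - one and the same object"
    Transcript showCR:(foo = baz) printString.    "true - equal contents"
    Transcript showCR:(foo == baz) printString.   "false - two distinct objects"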

Be careful when assigning to globals - do not (by accident) overwrite a reference to some other object:

    Array := nil
To prevent beginners from doing harm to the system, ST/X checks for this situation and gives a warning.
However, other Smalltalk systems may silently perform the assignment and leave you with an unusable system.
Keep in mind that this danger is also responsible for Smalltalk's flexibility: it allows you to replace almost any existing class with your own; in that concrete example, you could write your own Array-class and use that instead.

As a general rule:
do not assign to global variables - it's usually a sign of bad design if you have to (as you will see below, there are other variable types which can be used in most situations).

Knowing about variables, we can try more interesting messages:

Ask the Float class for the π (pi) constant:

    Float pi
Ask the Transcript object to raise its top view:
    Transcript topView raise
Ask the Transcript object to flash its view:
    Transcript flash
Ask the WorkspaceApplication class to create a new instance and open a view for it:
    WorkspaceApplication open
Declare a local variable, assign a value and display it on the transcript window:
    |foo|

    Transcript show:'foo is initially bound to: '.
    Transcript showCR:foo.

    foo := -1.
    Transcript show:'foo is now bound to: '.
    Transcript showCR:foo.

    foo := foo + 2.
    Transcript show:'foo is now bound to: '.
    Transcript showCR:foo.
Remember, that a variable may refer to any object.
Thus, the following is legal (although not considered good style):
    |foo|

    foo := -1.
    Transcript show:'foo is: '.
    Transcript show:foo.
    Transcript cr.

    foo := 'hello'.
    Transcript show:'foo is now: '.
    Transcript show:foo.
    Transcript cr.

Cascade Message Expressions

Sometimes, it is useful to send multiple messages to the same receiver.
For example, to add elements to a freshly created collection, you could write:
    | coll |

    coll := Set new.    "/ create an empty Set-collection
    coll add:'one'.
    coll add:'two'.
    coll add:3.
A cascade expression sends another message (possibly with arguments) to the previous receiver.
The following cascade is semantically equivalent to the above albeit a bit shorter:
    | coll |

    coll := Set new.    "/ create an empty Set-collection
    coll add:'one'; add:'two'; add:3.
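The value of a cascade is the value returned by its last message. Since add: returns the added element rather than the collection, a common idiom is to end the cascade with the yourself message, which simply returns the receiver. A minimal sketch of that idiom:

    | coll |

    "without the final yourself, coll would be bound to 3 (the last element added)"
    coll := Set new add:'one'; add:'two'; add:3; yourself.

    Transcript showCR:coll size printString.      "3"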

Blocks

Blocks are one of the most powerful features of the Smalltalk language.
A block represents a piece of executable code as an object. This can be stored in a variable, passed around as argument or returned as value from a method - just like any other object.
When required, the block can be evaluated at any later time, which results in the execution of the block's statement(s).

For C/C++ and Java programmers:
As a first approximation, regard a block as a reference to an anonymous function, which can be defined without a name, passed to other objects and eventually executed. However, blocks are more powerful, as they have access to variables of their defining context.

For Lispers/Schemers:
Blocks are lambdas with access to their static enclosing environment(s) !

A block is defined simply by enclosing its statements in brackets, as in:

    | someBlock |

    someBlock := [  Transcript flash ].
Later, when the block has to be evaluated (i.e. its statements executed), send it the "#value" message:
    ...
    someBlock value.
    ...
Blocks may be defined with 0 (zero) or more argument(s).
A block with argument(s) is defined by giving the formal argument identifiers after the opening bracket - each preceded by a colon character. The list is finished by a vertical bar.
For example:
    |someBlock|

    ...
    someBlock := [:a | Transcript showCR:a ].
    ...
defines a block which expects (exactly) one argument.
To evaluate it, send it the "#value:" message, passing the desired argument object.
For example, the above block can be evaluated as:
    someBlock value:'hello'
(here, a string-object is passed as argument).

Blocks can be defined to expect multiple arguments, by declaring each formal argument preceded by a colon. For evaluation, a message of the form "#value:...value:" with a corresponding number of arguments must be used.
For example, the block:

    |someBlock|

    ...
    someBlock := [:a :b :c |
			Transcript show:a.
			Transcript show:' '.
			Transcript show:b.
			Transcript show:' '.
			Transcript show:c.
			Transcript cr
		  ].
    ...
can be evaluated with:
    someBlock value:1 value:2 value:3

When evaluated, the return value of the message is the value of the block's last expression.
I.e.:

    |someBlock|

    ...
    someBlock := [:a :b :c | a + b + c].
    ...
    Transcript showCR:(someBlock value:1 value:2 value:3).
    ...
displays "6" on the Transcript window, and:
    |someBlock result|

    ...
    someBlock := [:a :b :c | a + b + c].
    ...
    result := someBlock value:1 value:2 value:3.
    ...
assigns 6 to the variable 'result'.
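
Blocks also keep access to the variables of the context in which they were defined, as mentioned above. A minimal sketch (the names "counter" and "increment" are made up for illustration): every evaluation of the block modifies the same variable of the enclosing code:

    |counter increment|

    counter := 0.
    increment := [ counter := counter + 1 ].

    increment value.
    increment value.

    Transcript showCR:counter printString.        "2"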

Blocks have many nice applications: for example, a GUI button's action can be defined using blocks, a timer may be given a block for later execution, a batch processing queue may use a queue of block-actions and a SortedCollection may use a block to specify how elements are to be compared.
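
As a sketch of the last of these applications, the following uses the standard sortBlock: instance creation message of SortedCollection to keep strings ordered by their length (the variable name and the strings are arbitrary):

    |names|

    "the two-argument block decides whether a may come before b"
    names := SortedCollection sortBlock:[:a :b | a size <= b size].
    names add:'three'; add:'one'; add:'eleven'.

    "enumerates in sorted order: one, three, eleven"
    names do:[:each | Transcript showCR:each].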

However, the most striking application of blocks is in defining control structures (like if, while, repeat, loops etc.) as known in other languages.
Recall that the above description of the Smalltalk language did not describe any syntax for control-flow - the reason is simple: there is none.
Instead, all program control is defined by appropriate message protocol; mostly in the Boolean and Block classes.

Conditional execution (if)

Conditional execution is defined by the ifTrue: / ifFalse: protocol as implemented by the boolean objects bound to the globals "true" and "false": these correspond to the if-then and if-then-else statements in traditional languages.

So, to test a variable and send some message to the Transcript window, you can write:

    ...
    (someVariable > 0) ifTrue:[ Transcript showCR:'yes' ].
    ...
Of course, you may change the indentation to reflect the program flow;
this is what a C-Hacker (like I used to be) would write:
    ...
    (someVariable > 0) ifTrue:[
	(someVariable < 10) ifTrue:[
	    Transcript showCR:'between 1 and 9'
	] ifFalse:[
	    Transcript showCR:'positive'
	]
    ] ifFalse:[
	Transcript showCR:'zero or negative'
    ].
    ...
and that is how a Lisper (and many Smalltalkers) would write it:
    ...
    (someVariable > 0)
	ifTrue:
	    [(someVariable < 10)
		ifTrue:
		    [Transcript showCR:'between 1 and 9']
		ifFalse:
		    [Transcript showCR:'positive']]
	ifFalse:
	    [Transcript showCR:'zero or negative'].
    ...
Because the above constructs are actually message sends (NOT statement syntax), they do also return a value when invoked. Thus, some Smalltalkers or Lispers would probably prefer a more functional style, as in:
    ...
    Transcript showCR:
	((someVariable > 0)
	    ifTrue:
		[(someVariable < 10)
		    ifTrue:['between 1 and 9']
		    ifFalse:['positive']]
	    ifFalse:
		['zero or negative']).
    ...
Which one you prefer is mostly a matter of style, and you should use the one which is more readable - sometimes, deeply nested expressions can become quite complicated and hard to read.
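
A short sketch of using the returned value directly (the variable names are arbitrary): here the result of the if-message is assigned to a variable, much like a conditional expression in other languages:

    |a b max|

    a := 3.
    b := 7.

    "the value of the evaluated block becomes the value of the whole expression"
    max := (a > b) ifTrue:[a] ifFalse:[b].

    Transcript showCR:max printString.            "7"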

As a final trick, noticing the fact that every object responds to the #value-message, and that the #if-messages actually send #value to one of the alternatives and return that, you may even encounter the following coding style sometimes (notice the non-block args of the inner ifs):

    ...
    Transcript showCR:
	((someVariable > 0)
	    ifTrue:
		[(someVariable < 10)
		    ifTrue:'between 1 and 9'
		    ifFalse:'positive']
	    ifFalse:
		'zero or negative').
    ...
The above "trick" should (if at all) only be used for constant if-arguments and only when using the if for its value. With message-send arguments, both alternatives would be evaluated, which is probably not the desired effect.
Warning:
It is a common beginner's error to forget that the above are really messages to some object and that the argument(s) of an if-message ought to be blocks.
Therefore, except for the above "trick", it is usually an error to use round parentheses instead of brackets.
(The if-expression would evaluate both alternatives and use the condition to choose the returned value.)

Looping (while)

While-loops are defined in the Block class. For example:
    |someVar|

    someVar := 1.
    [someVar < 10] whileTrue:[
	Transcript showCR:someVar.
	someVar := someVar + 1.
    ]
Warning:
It is a common beginner's error to forget that the above are really messages to some object and that the receiver of a while-message ought to be a block.
Therefore, it is an error to use round parentheses instead of brackets.
(I.e. "(someVar < 10)" would return a boolean, which does not implement the while messages.)
A nice use of this (and a demonstration of how powerful blocks are) is when the condition block is not static as in the above example, but passed in as an argument to some looping code. For example:
    condition := [ something evaluating to a Boolean ].
    ...

    condition whileTrue:[
	...
    ]
If while-loops are used that way, the condition is typically passed in as an argument or configured in some instance variable.
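
A concrete sketch of such a loop (the names "remaining" and "condition" are made up for illustration; in real code the condition block would typically come from elsewhere):

    |remaining condition|

    remaining := 3.
    condition := [ remaining > 0 ].

    "the condition block is re-evaluated before each iteration; shows 3, 2 and 1"
    condition whileTrue:[
	Transcript showCR:remaining printString.
	remaining := remaining - 1.
    ]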

The above while-loops check the condition at the beginning - i.e. if the condition block evaluates to false initially, the loop-block is not executed at all.

The Block class also provides looping protocol for condition checking at the end (i.e. where the loop-block is executed at least once):

    [
	...
	loop statements
	...
    ] doWhile: [ ...condition... ]
and also:
    [
	...
	loop statements
	...
    ] doUntil: [ ...condition... ]
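As a sketch of the difference to a while-loop (assuming #doWhile: behaves as described above; the variable name is arbitrary), the loop-block below is executed once, although the condition is false right from the start:

    |n|

    n := 10.
    [
	Transcript showCR:n printString.
	n := n + 1.
    ] doWhile:[ n < 10 ]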

Endless Loop (forever)

An endless loop is normally not what the programmer wants, except for server processes (which handle incoming requests) or iterative calculations. Such loops can be written as an endless loop, which is left (if at all) by other means (typically by terminating the process).

Of course, an obvious way to write an endless loop is:

    [true] whileTrue:[
	...
	endless loop statements
	...
    ]
However, to document the programmer's intention, it is better to use one of the explicit endless loop constructs (#loop or #repeat), as in:
    [
	...
	endless loop statements
	...
    ] loop

Looping over elements of a collection (enumerating)

All collection classes (Array, Set, Dictionary etc.) provide messages to enumerate their elements and evaluate a given block for each of them.
The most useful of those enumeration messages is #do:. For example, enumerating an array's elements is easily done as in:
    |anArray|

    anArray := #( 'one' 'deux' 'drei' 'quatro' 5 6.0 ).
    anArray do:[:eachElement | Transcript showCR:eachElement ].
Of course, again you should indent the code to reflect control flow; with C-style indentation the code looks like this:
    |anArray|

    anArray := #( 'one' 'deux' 'drei' 'quatro' 5 6.0 ).
    anArray do:[:eachElement |
	Transcript showCR:eachElement
    ].

Looping over a range of numbers

Of course, the traditional (C and Java) loop styles, where a range of numbers is enumerated, are also available in Smalltalk:
    |anArray|

    anArray := #( 'one' 'deux' 'drei' 'quatro' 5 6.0 ).

    1 to: 6 do: [:idx |
	Transcript showCR: (anArray at: idx)
    ].
or, with an increment,
    |anArray|

    anArray := #( 'one' 'deux' 'drei' 'quatro' 5 6.0 ).

    1 to: 6 by: 2 do: [:idx |
	Transcript showCR: (anArray at: idx)
    ].
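The end value need not be hard-coded, and the increment may be negative. As a sketch, the following asks the array for its size instead of writing 6, and then walks through the same indices in reverse order:

    |anArray|

    anArray := #( 'one' 'deux' 'drei' 'quatro' 5 6.0 ).

    "front to back, for whatever size the array has"
    1 to: anArray size do: [:idx |
	Transcript showCR: (anArray at: idx)
    ].

    "back to front, using a negative increment"
    anArray size to: 1 by: -1 do: [:idx |
	Transcript showCR: (anArray at: idx)
    ].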
However, no real Smalltalk programmer would use #to:do: to enumerate a collection's elements.
There are many, many useful enumeration messages provided in the collection classes, and we highly recommend that you have a look at them.
Open a browser, and look at the implementation of #reverseDo:, #collect:, #detect:, #select:, #findFirst: etc.

Hint:
It is very common for beginners to use simple #do-loops or even #while-loops with indexing to enumerate elements for element searching or processing.
Please do have a look at the full enumeration protocol and browse for uses of them. It really helps, saves code and avoids bugs. In addition, many of the enumeration messages are implemented in a much more efficient way than naive loop code would be. Do not reinvent the wheel !
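
To give a first impression, here is a sketch using a few of the standard enumeration messages on a small array of numbers (the array contents are arbitrary; #detect:ifNone: and #inject:into: are standard collection messages not named above):

    |numbers|

    numbers := #( 3 8 1 12 5 ).

    "a new collection with each element doubled"
    Transcript showCR:(numbers collect:[:each | each * 2]) printString.

    "only the elements greater than 4"
    Transcript showCR:(numbers select:[:each | each > 4]) printString.

    "the first element greater than 4 (here: 8), or nil if there is none"
    Transcript showCR:(numbers detect:[:each | each > 4] ifNone:[nil]) printString.

    "the sum of all elements (here: 29)"
    Transcript showCR:(numbers inject:0 into:[:sum :each | sum + each]) printString.

    "enumerate back to front"
    numbers reverseDo:[:each | Transcript showCR:each printString].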

Class Library

Now, we have reached a point where we realize that the key to becoming a Smalltalker lies in the knowledge of the system's class library. Although this is true for all big programming systems, it is even more true for Smalltalk, since even control structures and looping are implemented by message protocol as opposed to being a syntax feature.

No programming is possible if you don't know the protocol of the classes in the system, or at least part of it.
To give you a starting point, we have compiled a list of the most useful messages as implemented by various classes in the "list of useful selectors" document.

Beginners may want to print this list and use it as a reference card.
A rough overview of the most common classes and their typical use is found in the "Basic Classes Overview". Please read this document now.


Continue in "Playing with objects".


Copyright © Claus Gittinger Development & Consulting
Copyright © eXept Software AG

<cg@exept.de>

Doc $Revision: 1.26 $ $Date: 1999/10/14 13:13:06 $