To a hammer, everything looks like a nail.
Before you ask "why would I need this feature - I don't need that in C/Java/TCL/..." (for example, when confronted with metaclasses or closures), think for a second about the above phrase and read on.
Before describing the language and how you can create your own programs, we should explain a few basics - both to give you some background and to define the technical terms used in the documentation (and literature).
Keep in mind that this text is only a short introduction;
we recommend reading a standard textbook on the language for more detailed
information
(-> 'literature').
In contrast to hybrid systems like C++ or Java, everything is an object
in Smalltalk; this includes integers, characters, arrays, classes
and even the program's stack frames, which hold the local variables
during execution.
In Smalltalk, there are no such things as ``builtin'' types or classes
which are treated differently.
+, -, * ... messages.
asUppercase, asLowercase ... messages.
On the other hand, it makes the system very flexible; for example,
it is very easy to extend the numeric class hierarchy by additional
things like Complex numbers, Matrices, Functional Objects etc.
All that is required for those new objects to be handled correctly is
that they respond to some basic mathematical protocol for arithmetic,
comparison etc.
Classes may have zero, one or many instances.
You may wonder how a class without instances could be
useful - this will become clear when inheritance and abstract
classes are described further down in this document.
Integer class
Float class
String class
Button class
UndefinedObject class
Every class keeps a table which
associates the name of the message (the so-called message selector) with a method.
When a message is sent to an object, the class's method table (called "MethodDictionary")
is searched for a corresponding entry and - if found - the associated
method is invoked (more details below ...).
Since Smalltalk is a pure object-oriented language,
this table is an object and accessible at execution time;
it may even be modified during execution,
which allows objects to learn about new messages dynamically.
(Of course, the interactive programming environment heavily depends on
this;
for example, the browser is a tool which adds new items to this table when
a method's new or changed code is to be installed...)
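As a small illustration of this run-time accessibility, the following workspace sketch asks an object and a class about their protocol. It assumes the widely available messages respondsTo: and includesSelector: (neither is introduced in this text) and uses the Transcript object, which is described further down; #fooBarBaz is a made-up selector.

```smalltalk
"ask an object whether it understands a message"
Transcript showCR: (3 respondsTo: #negated) printString.      "true"

"#fooBarBaz is a made-up selector, used only for illustration"
Transcript showCR: (3 respondsTo: #fooBarBaz) printString.    "false"

"ask a class whether its own method table has an entry for a selector
 (inherited methods are not considered by this query)"
Transcript showCR: (Object includesSelector: #fooBarBaz) printString.   "false"
```

Evaluate the lines in a workspace and watch the Transcript window.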
A class inherits all protocol as defined by its superclass(es) and may optionally redefine individual methods or provide additional protocol.
Therefore, a message send performs the following actions (**):
#doesNotUnderstand:) to
the receiver with the message object as argument.
Object, there is no need to.
Actually, it may occasionally make sense for a class
to inherit from no class at all (i.e. to have no superclass).
The effect is that instances of such classes do not inherit ANY protocol
and will therefore trigger an error for all received messages.
All instances of a class provide the same message protocol,
but typically contain different internal state.
It is actually the class which
defines the protocol and the amount of internal
state of its instances.
String class
and respond to the same set of messages. But the internal state of the first
string consists of the characters "h" and "i", whereas the second contains
the characters "w", "o", "r", "l", "d".
An object's instance variables are only accessible via protocol
which is provided by the object - there is no way to access an object's
internals except by sending messages to it.
This is true for every object - even for the strings in the example above.
There is no need for the sender of a message to actually know the class of
the receiver - as long as it responds to the message and performs the
appropriate action.
'at:' message. You could write
an ExternalString class, which fetches characters from a file
and returns them from this message.
The sender of the 'at:' message would not be affected at all by this
(except for a possible performance degradation ;-).
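The following workspace sketch illustrates the point with classes that already exist: the same at: message is sent to a string, an array and a symbol, and the sending code does not care which class the receiver belongs to. The variable name is made up; blocks and the do: message are explained further down.

```smalltalk
| things |
"three receivers of different classes, all of which respond to at:"
things := Array with: 'hello' with: #(10 20 30) with: #sym.

"the sender does not need to know the class of each element"
things do: [:each |
    Transcript showCR: (each at: 1) printString
].
```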
#basicSize, #identityHash etc.).
Thus, when we send a message to some `normal' object, the corresponding class
object provides the behavior - when some message is sent to a class object,
the corresponding metaclass provides the behavior.
Technically, messages to classes are treated exactly the same way as
messages to non-class objects: take the receiver's class, look up the method in its
method table, execute the method's code.
Since different metaclasses may provide different protocol for their class
instances, it is possible to add or redefine class messages just like any other
message.
As a concrete example, take instance creation which is done in Smalltalk
by sending a "new"-message to a class.
In Smalltalk, there is no such thing as a built-in "new" (or any other built-in)
instance creation message
- the behavior of those instance creation (class) messages is defined exclusively by metaclass protocol.
Therefore, it is possible (and often done) to redefine the "new" method for special handling;
for example singletons (classes which have only a single unique instance), caching and pooling
(the "new" message returns an existing instance from a cache), tracing and many more are easily
implemented by redefining class protocol.
Abstract Classes
more to come
As we will see shortly,
Smalltalk programs consist only of messages being sent to objects.
Since even control structures
(i.e. conditional evaluation, loops etc.)
are conceptually implemented as messages,
a common syntax is used in your programs both for
the program's flow control and for manipulating objects.
Once you know how to send messages to an object,
you also know how to write and use fancy control structures.
Smalltalk's power (and difficulty to learn) does not lie in the language itself, but instead in the huge protocol provided by the class library's objects.
Let's start with the language's building blocks ...
"some comment"
"this
is
a
multiline comment"
"
another multiline comment
"
As a language extension, ST/X also allows end-of-line comments;
these are introduced by the character sequence "/ (doubleQuote-slash) and treat everything
up to the end of the line as a comment:
"/ this is an end-of-line comment
#new
message sent to a class or the #copy message sent to some instance.
The following literal constants are allowed:
Integer constants:
6, -1, 12345678901234567890
8r0777, 16r80000000000, 16rAFFE, 2r0111000
Float) constants:
1.234, 1e10, 1.5e15
16r10.1) are allowed, but should not be used
with a radix > 14.
Boolean constants:
true, false
UndefinedObject constant:
nil
Character constants from the ascii 8-bit character set:
$c
String constants:
'foo' or 'a long string constant'
Symbol constants:
#bar or #'foo bar baz'
Array constants:
#(1 2 $b)
#(1 #two #(3 4) #( #(5 6) 7) ).
#(1 two (3 4) ( (5 6) 7) )
ByteArray constants:
#[0 1 2 3 4]
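To get a feeling for the radix notation and for symbols, the following lines can be evaluated in a workspace. This is only a sketch; it uses the Transcript object and the printString and == (identity) messages, none of which have been introduced at this point.

```smalltalk
"radix integers are ordinary integers; only the notation differs"
Transcript showCR: 16rAFFE printString.      "45054"
Transcript showCR: 8r0777 printString.       "511"
Transcript showCR: 2r0111000 printString.    "56"

"two symbol constants with the same characters refer to the identical object"
Transcript showCR: (#bar == #bar) printString.   "true"

"literal arrays may hold elements of different classes"
Transcript showCR: (#(1 2 $b) at: 3) printString.   "$b"
```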
Identifiers must start with a letter or an underscore character.
The remaining characters may be letters, digits or the underline character (*).
Examples:
foo
aVeryLongIdentifier
anIdentifier_with_underline_characters
For portability with some (VMS-)VisualWorks Smalltalk variants, a dollar character ($) can also be allowed inside an identifier as a compiler option.
nil
true and false
self
super
thisContext
here
Since "here" is a Smalltalk/X language extension,
its builtin-ness is less strict than that of the other special variables:
if a variable named "here" is defined, here will refer
to that variable;
otherwise, it refers to the receiver (with different lookup semantics).
1 negative
sends the message "negative" to the number 1, which is the receiver of the
message.
Unary messages, like all other messages, return a result,
which is simply another object.
In the above case, the answer from the "negative" message is a boolean object;
in case of the number 1, false is returned.
Evaluate this in a
workspace
(using printIt);
try different receivers
(especially: try a negative number).
Unary messages parse left to right, so, for example:
1 negative not
first sends the negative message to the number 1.
Then, the not message is sent to the returned value.
The response of this second message is returned as the final value.
If you evaluate this in a workspace using printIt,
the returned value will be true.
Try a few unary messages/expressions in a workspace:
1 negated
-1 negated
-1 abs
1 abs
5 sqrt
1 isNumber
$a isNumber
1 isCharacter
$a isCharacter
'someString' first
'hello world' size
'hello world' asUppercase
'hello world' sort
#( 17 99 1 57 13) sort
1 class name
1 class name asUppercase
5 between:3 and:8
"between:" and "and:" are the keywords,
and the numbers 3 and 8 are the arguments.
The object representing the number 5 is the receiver of the message.
The message's actual selector is formed by the concatenation of all individual
keywords; in the above example, it is "between:and:".
This is different from both a "between:" and an "and:"
message, which often leads to beginners' errors.
(Of course, "between:and:" and "and:between:"
are also different messages.)
In the browser, the corresponding method is listed
under the name: "between:and:".
Keyword messages do parse left to right,
but if another keyword follows a keyword message, the expression is parsed as
a single message (taking the keywords' concatenation as selector).
Thus, the expression:
a max: 5 min: 1
would send a "max:min:" message to the object referred to by the variable
"a".
This is not the same as:
(a max: 5) min: 1
which first sends the "max:" message to "a",
then sends the "min:" message to the result.
Try these in a
workspace
(don't fear the error...)
To avoid ambiguity, you must place parentheses around the message which is to be sent first.
Try a few keyword messages/expressions in a workspace:
1 max: 2
1 min: 2
(2 max: 3) between: 1 and: 3
(1 max: 2) raisedTo: (2 min: 3)
Unary messages have higher precedence than keyword messages,
thus
9 max: 16 sqrt
evaluates to 9,
because it is evaluated as: "9 max: (16 sqrt)" which is "9 max: 4".
It is not "(9 max: 16) sqrt", which is "16 sqrt" and would give 4 as answer.
An example of a binary message is the one which implements arithmetic addition
for numeric receivers (it is implemented in the Number classes):
1 + 5
This is interpreted as a message sent to the object 1 with the selector '+'
and one argument, the object 5.
Binary messages
parse left to right (like unary messages).
Therefore,
2 + 5 * 3
results in 21, not 17.
(First, '+' is sent to 2, with 5 as argument. This first message returns 7.
Then, '*' is sent to 7, with 3 as argument, resulting in 21 being answered.)
To change the execution order or to avoid ambiguity you should place parentheses around:
2 + (5 * 3)
Now, the execution order has changed and the new result will be 17.
Unary messages have higher precedence than binary messages, thus
9 + 16 sqrt
evaluates as "9 + (16 sqrt)", not "(9 + 16) sqrt".
On the other hand, binary messages have higher precedence than
keyword messages, thus
9 + 16 max: 3 + 4
evaluates as "(9 + 16) max: (3 + 4)" which is "25 max: 7" and answers 25.
It is not the same as "9 + (16 max: 3) + 4" (which results in 29) or
"((9 + 16) max: 3) + 4" (which in this case also results in 29).
Again, we highly recommend the use of parentheses - even when the default evaluation order matches the desired order; it makes your code much more readable, and helps beginners a lot.
Try a few binary messages/expressions in a workspace:
1 + 2
1 + 2 * 3
(1 + 2) * 3
1 + (2 * 3)
-1 * 2 abs
(-1 * 2) abs
5 between:1 + 2 and:64 sqrt
5 between:(1 + 2) and:(64 sqrt)
The second example above shows why parentheses are so useful:
from reading the code, it is not apparent whether the evaluation
order was intended or is wrong.
You will be happy to see parentheses when you have to debug
or fix a program which contains a lot of numeric computations.
Here are a few more "difficult" examples:
1 negated min: 2 negated
1 + 2 min: 2 + 3 negated
In ST/X, the actual set of allowed characters can be queried from the system
by evaluating "Scanner binarySelectorCharacters".
If you compare your favourite programming language
against regular English,
you will find Smalltalk to be much more similar to plain English
than most other programming languages.
For example, consider the order to a person called tom,
to send an email message to a person called jane
(assuming that tom, jane, theEmail refer to objects):
| English | Smalltalk | C++ or Java |
|---|---|---|
| tom, send an email to jane. | tom sendEmailTo: jane. | tom.sendEmail(jane); |
| tom, send theEmail to jane. | tom send: theEmail to: jane. | tom.sendEmail(theEmail, jane); |
| tom, send theEmail to jane with subject: 'hi'. | tom send: theEmail to: jane subject: 'hi'. | tom.sendEmail(theEmail, jane, "hi"); |
1 negated
sends the message "negated" to the number 1, which gives
us a -1 (minus-one) as result.
1 negated abs
first sends the message "negated" to the number 1, which gives
us an intermediate result of -1 (minus-one);
then, the message "abs" is sent to it, giving us
a final result of 1 (positive-one).
-1 abs negated
first sends the message "abs" to the number -1 (minus-one), which gives
us a 1 (positive one) as intermediate result. Then this object
gets a "negated" message.
1 + 2
sends the message "+" to the number 1, passing it
the number 2 as argument. The returned object is 3.
"+" message.
1 + 2 + 3
first, the message "+" is sent to the number 1, passing it
the number 2 as argument. Then, another "+" message is sent to
the intermediate result, passing the integer-object 3 as argument.
1 + 2 * 3
-1 abs + 2
first sends the message "abs" to the number -1 (minus-one), then sends "+"
to the result, passing 2 as argument.
1 + -2 abs
first sends the message "abs" to the number -2, then sends "+"
to the number 1, passing the result of the first message as argument.
-1 abs + -2 abs
first sends the message "abs" to the number -1 (minus-one) and remembers the result.
Then sends "abs" to the number -2 and passes this as argument
of the "+" message to the remembered object.
1 + 2 sqrt
first sends the message "sqrt" to the number 2, then passes this as argument
of the "+" message to the number 1.
(1 + 2) sqrt
first sends the message "+" to the number 1, passing 2 as argument.
Then sends "sqrt" to the result.
1 min: 2
sends the "min:" (minimum)
message to the number 1, passing 2 as argument.
(1 max: 2) max: 3
first sends the "max:" (maximum)
message to the number 1, passing 2 as argument. Then sends "max:"
to the returned value, passing 3 as argument.
(1 + 2 max: 3 + 4) min: 5 + 6
first sends the message "+" to the number 1, passing 2
as argument, and remembers the result.
Then, "+" is sent to the
number 3, passing 4 as argument.
Then, "max:" is sent to the remembered first result,
passing the second result as argument. The result is again
remembered.
Then, "+" is sent to the number 5, passing
6 as argument.
Finally, the "min:" message is sent to the
remembered result from the first max: message, passing
the result from the "+" message.
1 max: 2 max: 3
sends the "max:max:"
message to the number 1, passing the two arguments, 2 and 3.
Since numbers do not respond to a "max:max:" message,
this leads to an error (message-not-understood).
This example illustrates why parentheses are highly recommended - especially with concatenated keyword messages.
'hello' at:1
sends the "at:"
message to the string constant.
'hello' , ' world'
sends the ","
binary message to the first string constant, passing another string as argument.
'hello' , ' ' , 'world'
first sends the ","
binary message to the first string constant, passing ' ' as argument.
Then, the result gets another "," message, passing 'world' as
argument.
#(10 2 15 99 123) min
sends the "min"
unary message to an array object (in this case: a constant array literal).
All collections respond to the min message by searching for the smallest
element and returning it.
Every variable is automatically initialized to nil when created.
For now, only some global variables and local variables are described (because we need them for more interesting examples); the other variable types will be described later.
Besides classes, a few other objects can be referred to via a global; the most interesting for now is:
Transcript
show:something
cr
showCR:something
show: followed by cr.
flash
Workspace variables are created and destroyed via corresponding menu functions in the workspace window.
A local variable declaration consists of an opening '|' (vertical bar) character,
a list of identifiers and a closing '|'.
It must be located before any statement within a code entity (a block or method, which are described below).
For example:
| foo bar baz |
declares 3 local variables, named 'foo', 'bar' and 'baz'.
A local variable's lifetime is limited to the time the enclosing context is active - typically, a method or a block (you will learn later what a block is).
Notice that when a piece of code is evaluated in a workspace window, the system generates an anonymous method and invokes it for execution. Therefore, a local variable declaration is also allowed with doIt-evaluation (the variable's lifetime will be the time of the execution).
Instance variables are private to some object and their lifetime is
the lifetime of the object.
We will come back to instance variables, once we have learned how classes are defined.
foo := 1
or:
bar := 'hello world'
This makes the variable refer to the object as specified
after the assignment symbol - which may either be a literal (i.e. a constant),
the value of some other variable, or the outcome of some message expression.
Keep in mind that only a reference to an object is stored into the variable,
not the state of the object itself.
This means that multiple variables may refer to the same object.
For example:
|var1 var2|
"/ ask the Array class for a new instance
"/ with 5 elements ...
"/ ... and assign it to var1
var1 := Array new:5.
"/ and also to var2
var2 := var1.
"/ now show what those vars refer to in the transcript
Transcript show:'var1 refers to: '.
Transcript show:var1.
Transcript cr.
Transcript show:'var2 refers to: '.
Transcript show:var2.
Transcript cr.
"/ change the 2nd element ...
Transcript show:'now changing the second element...'.
Transcript cr.
var1 at:2 put:1.
"/ show what those vars refer to
Transcript show:'now, var1 refers to: '.
Transcript show:var1.
Transcript cr.
Transcript show:'now, var2 refers to: '.
Transcript show:var2.
Transcript cr.
The previous example demonstrates
that both var1 and var2 refer to the same array object,
i.e. that in Smalltalk, a variable actually holds a reference to an object,
and that more than one variable may refer to the same object.
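For contrast, here is a small sketch in which one variable refers to a copy instead of the original object. It assumes only the #copy message mentioned earlier plus the identity test ==, which is not introduced in this text; the variable names are made up.

```smalltalk
| original alias duplicate |
original := Array new: 3.
alias := original.             "a second reference to the same array"
duplicate := original copy.    "a new array with the same elements"

original at: 1 put: 'changed'.

Transcript showCR: (alias at: 1) printString.             "'changed'"
Transcript showCR: (duplicate at: 1) printString.         "nil"
Transcript showCR: (alias == original) printString.       "true"
Transcript showCR: (duplicate == original) printString.   "false"
```

The change is visible through alias, because it refers to the same object, but not through duplicate.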
Multiple assignments are possible, for example:
foo := bar := 'hello'
binds both 'foo' and 'bar' to the same string-object.
Be careful when assigning to globals - do not (by accident) overwrite
a reference to some other object:
Float := nil
To prevent beginners from
doing harm to the system, ST/X checks for this situation
and gives a warning - however, other (Smalltalk-) systems may silently
perform the assignment and leave you with an unusable system.
As a general rule:
do not assign to global variables - it's usually a sign of
bad design if you have to (as you will see below, there are other variable
types which can be used in most situations).
Knowing about variables, we can try more interesting messages:
Ask the Float class for the π constant:
Float pi
Ask the Transcript object to flash its view:
Transcript flash
Ask the WorkspaceApplication class to create a new instance and open
a view for it:
WorkspaceApplication open
Declare a local variable, assign a value and display it on the transcript
window:
|foo|
foo := -1.
Transcript show:'foo is now bound to: '.
Transcript show:foo.
Transcript cr.
foo := foo + 2.
Transcript show:'foo is now bound to: '.
Transcript show:foo.
Transcript cr.
Remember that a variable may refer to any object.
Thus, the following is legal (although not considered good style):
|foo|
foo := -1.
Transcript show:'foo is: '.
Transcript show:foo.
Transcript cr.
foo := 'hello'.
Transcript show:'foo is now: '.
Transcript show:foo.
Transcript cr.
| coll |
coll := Set new. "/ create an empty Set-collection
coll add:'one'.
coll add:'two'.
coll add:3.
A cascade expression sends another message (possibly with arguments) to
the previous receiver.
| coll |
coll := Set new. "/ create an empty Set-collection
coll add:'one'; add:'two'; add:3.
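Note that the value of a cascade is the result of its last message. In the usual collection classes, add: answers the added element rather than the collection, so the message yourself is commonly appended when the collection itself is wanted. The following sketch of this idiom is an addition to the text above; the variable names are made up.

```smalltalk
| last coll |
"without yourself, the value of the cascade is the result of the last add:"
last := Set new add: 'one'; add: 'two'; add: 3.

"with yourself, the value of the cascade is the Set"
coll := Set new add: 'one'; add: 'two'; add: 3; yourself.

Transcript showCR: last printString.        "3"
Transcript showCR: coll size printString.   "3"
```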
-1 negated.
1 + 2.
first sends the 'negated' message to -1 (minus one), ignoring the result.
Then, the '+' message is sent to 1 (positive one), passing the number 2
as argument.
For non-Smalltalkers:
As a first approximation, regard a block as a reference to
an anonymous function, which can be defined without a name, passed
to other objects and eventually executed.
For Lispers:
It's a lambda!
A block is defined simply by enclosing its statements in brackets,
as in:
| someBlock |
someBlock := [ Transcript flash ].
Later, when the block is to be evaluated,
send it the #value message:
...
someBlock value.
...
Blocks may be defined with argument(s).
A block with argument(s) is defined by giving the formal argument identifiers
after the opening bracket - each prepended by a colon-character. The list
is finished by a vertical bar.
For example:
|someBlock|
...
someBlock := [:a | Transcript showCR:a ].
...
defines a block which expects (exactly) one argument.
To evaluate it, send it the value: message, passing the desired
argument object.
For example, the above block can be evaluated as:
someBlock value:'hello'
(here, a string-object is passed as argument).
Blocks can be defined to expect multiple arguments, by declaring each
formal argument preceded by a colon. For evaluation, a message of the form
value:...value: with a corresponding number of arguments must
be used.
For example, the block:
|someBlock|
...
someBlock := [:a :b :c |
    Transcript show:a.
    Transcript show:' '.
    Transcript show:b.
    Transcript show:' '.
    Transcript show:c.
    Transcript cr
].
...
can be evaluated with:
someBlock value:1 value:2 value:3
When evaluated, the return value of the message is the value of the
block's last expression.
I.e.:
|someBlock result|
...
someBlock := [:a :b :c | a + b + c].
...
result := someBlock value:1 value:2 value:3.
...
assigns 6 to the variable 'result'.
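Because a block is an ordinary object, it can be stored in a variable, used from within another block and evaluated any number of times with different arguments; each evaluation answers the value of the block's last expression. A minimal sketch with made-up variable names:

```smalltalk
| square sumOfSquares |
"a one-argument block; its value is the value of its last expression"
square := [:x | x * x ].

"a two-argument block which evaluates the first block twice"
sumOfSquares := [:a :b | (square value: a) + (square value: b) ].

Transcript showCR: (square value: 7) printString.                  "49"
Transcript showCR: (sumOfSquares value: 3 value: 4) printString.   "25"
```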
Blocks have many nice applications: for example, a GUI button's action can be defined with a block, or a timer may be given a block for later execution.
However, the most striking application of blocks is in defining control
structures (like if, while, repeat, loops etc.) as known in other languages.
Recall that the above description of the Smalltalk language did not
describe any syntax for control-flow - the reason is simple: there is none.
Instead, all program control is defined by appropriate message protocol,
mostly in the Boolean and Block classes.
ifTrue: / ifFalse:
protocol as implemented by the boolean objects true and false:
ifTrue: aBlock
ifFalse: aBlock
ifTrue:trueBlock ifFalse: falseBlock
ifFalse: falseBlock ifTrue:trueBlock
So, to compare two variables and send some message to the Transcript
window, you can write:
...
(someVariable > 0) ifTrue:[ Transcript showCR:'yes' ].
...
Of course, you may change the indent to reflect the program flow;
this is what a C-Hacker (like I used to be) would write:
...
(someVariable > 0) ifTrue:[
    (someVariable < 10) ifTrue:[
        Transcript showCR:'between 1 and 9'
    ] ifFalse:[
        Transcript showCR:'positive'
    ]
] ifFalse:[
    Transcript showCR:'zero or negative'
].
...
and that is how a Lisper (and many Smalltalkers) would write it:
...
(someVariable > 0)
    ifTrue:[
        (someVariable < 10)
            ifTrue:[
                Transcript showCR:'between 1 and 9']
            ifFalse:[
                Transcript showCR:'positive']]
    ifFalse:[
        Transcript showCR:'zero or negative'].
...
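Since ifTrue:ifFalse: is an ordinary message, it also returns a result: the value of whichever block was evaluated. It can therefore be used inside an expression, for example on the right side of an assignment. A minimal sketch with made-up variable names:

```smalltalk
| someVariable description |
someVariable := 5.

"the value of the evaluated block becomes the value of the whole expression"
description := (someVariable > 0)
                    ifTrue:[ 'positive' ]
                    ifFalse:[ 'zero or negative' ].

Transcript showCR: description.    "shows: positive"
```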
whileTrue: loopBlock
whileFalse: loopBlock
whileTrue
whileFalse
|someVar|
someVar := 1.
[someVar < 10] whileTrue:[
Transcript showCR:someVar.
someVar := someVar + 1.
]
Warning:
"(someVar < 10)" would return a boolean, which does
not implement the while messages).
The above while-loops check the condition at the beginning - i.e. if the condition block evaluates to false initially, the loop-block is not executed at all.
The Block class also provides loop methods which evaluate the condition at the end, i.e. where the loop-block is executed at least once.
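One portable way to get such a check-at-the-end loop is the argument-less whileTrue message, with the condition placed as the last expression of the receiver block. The following is only a sketch of that idiom with a made-up variable name; the dedicated loop methods of the Block class are not shown here.

```smalltalk
| n |
n := 10.
"the body runs once although the condition is false from the start"
[
    Transcript showCR: n printString.    "shows 10"
    n := n + 1.
    n < 10                               "loop condition, checked at the end"
] whileTrue.
```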
do: aOneArgBlock
|anArray|
anArray := #( 'one' 'deux' 'drei' 'quatro' 5 6.0 ).
anArray do:[:el | Transcript showCR:el ].
Of course, again you should indent the code to reflect control flow;
with C-style indentation the code looks as:
|anArray|
anArray := #( 'one' 'deux' 'drei' 'quatro' 5 6.0 ).
anArray do:[:el |
    Transcript showCR:el
].
There are many, many useful enumeration messages provided in the collection
classes, and we highly recommend that you have a look at them.
#collect:,
#detect:, #select:, #findFirst: etc.
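To give an impression of these messages, here is a short workspace sketch using collect:, select:, detect: and inject:into: on a literal array. The variable name is made up, and the printed form of the resulting collections may differ slightly between Smalltalk dialects, so no output is given for them.

```smalltalk
| numbers |
numbers := #(3 1 4 1 5 9 2 6).

"collect: answers a new collection with the block's value for each element"
Transcript showCR: (numbers collect:[:each | each * each]) printString.

"select: answers the elements for which the block answers true"
Transcript showCR: (numbers select:[:each | each even]) printString.

"detect: answers the first element for which the block answers true"
Transcript showCR: (numbers detect:[:each | each > 4]) printString.    "5"

"inject:into: accumulates a value over all elements (here: the sum)"
Transcript showCR: (numbers inject: 0 into:[:sum :each | sum + each]) printString.   "31"
```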
|anArray|
anArray := #( 'one' 'deux' 'drei' 'quatro' 5 6.0 ).
1 to: 6 do: [:idx |
Transcript showCR: (anArray at: idx)
].
No real Smalltalk programmer would do this to enumerate a collection's elements, though.
Now we have reached a point where we realize that the key to becoming a Smalltalker lies in the knowledge of the system's class library. Although this is true for all big programming systems, it is even more true for Smalltalk, since even control structures and looping are implemented by message protocol as opposed to being a syntax feature.
No programming is possible if you don't
know the protocol of the classes in the system, or at least part of it.
To give you a starting point, we have compiled a
list of the most useful messages as implemented by
various classes in the
``list of useful selectors''
document.
A rough overview of the most common classes and their typical use is found in the "Basic Classes Overview". Please, read this document now.
Copyright © Claus Gittinger Development & Consulting
Copyright © eXept Software AG
<cg@exept.de>