Files processed by the stc smalltalk-to-c compiler are usually
generated by filing out class code directly with the
SystemBrowser,
or indirectly, by checking some class into the source code repository
(also via the SystemBrowser) and then checking it out into a directory via
a "cvs update"
or "cvs checkout"
command.
Of course, since they are regular files, you can alternatively use
any text editing tool to edit and manipulate these files, working
in the traditional edit-compile-link mode if you prefer.
In this mode, think of the file as describing one class;
comparable to programming in C++ or similar languages.
Stc file format
Files compiled by stc must be in Smalltalk's
fileout format
(i.e. the file consists of smalltalk expressions,
separated by '!'-characters).
'!'-characters within the text have to be doubled;
this need for doubling also applies to exclamation marks within comments and strings.
Since ST/X replaces doubled '!'-characters by a single '!' when filing in,
you will see only single '!'-characters in the browser.
Be careful when editing a source file using
the fileBrowser or another editor.
Notice that the SystemBrowser takes care of this
doubling when classes are filed out - but the fileBrowser does not, since it
treats smalltalk source code files just like any other text file.
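As a small illustration of the doubling rule, this is roughly what a method containing exclamation marks looks like inside a fileout file. The class name Greeter and the method are hypothetical and only serve as an example:

```smalltalk
!Greeter methodsFor:'greeting'!

hello
    "Show a greeting!!  (the exclamation mark in this comment is doubled in the file)"

    Transcript showCR:'Hello world!!'
! !
```

After filing this in, the browser shows a single '!' in both the comment and the string; the doubled form exists only in the file.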
Currently, stc can only compile files which contain either a single class definition (with optional private classes), or a "methods-only file", which contains methods, but no class definition.
The source syntax for compiled smalltalk implements a subset of the messages used to create/manipulate classes and methods. Expressions other than those listed below are not supported.
The first expression in a "class-definition file" must be a class-definition expression;
a "methods-only file" may only consist of method definitions (i.e. methodsFor-expressions).
Class definition
("class-definition files" only)
The stc compiler accepts the following (and only those) class definition
expressions:
superclass subclass:#class
instanceVariableNames:'instVar1 instVar2...'
classVariableNames:'classVar1 classVar2...'
poolDictionaries:'sharedPool1 sharedPool2...'
category:'some-category'
to define class as a subclass of superclass.
Class variables are visible in both class- and instance methods of the defining class and of every subclass (and subclasses of subclasses). Class variables are shared (unless redefined) - meaning that access is to the same physical memory "slot" from both the defining class and all subclasses. You can think of class variables as globals with limited accessibility: only the defining class and its subclasses 'see' them.
See below for class instance variables, which are class private (i.e. each class provides its own physical "slot").
There are classes (UndefinedObject and SmallInteger) which CANNOT be subclassed.
for the curious:
the reason is that instances of these are no real objects, but are marked
by a special tag-bit or object-pointer value.
Thus these instances do not have a class field in memory.
This makes it impossible for the VM (= virtual machine or runtime-system)
to know the class of such a sub-instance.
For some classes, the instance variable layout cannot be changed.
for the curious:
these are especially Object, SmallInteger and all classes which are also
known by the VM and/or the compiler. The reasons are:
for Object:
since there are some classes which inherit from Object,
and which are not represented by pointers (i.e. UndefinedObject and
SmallInteger). Since these cannot have instance variables, none of
their superclasses may define any instance variables either.
This means that all classes between Object and SmallInteger
(i.e. Magnitude, ArithmeticValue, Number and Integer) are also not
allowed to have instance variables.
for the built-in classes: (Actually, the following is also true for the classes mentioned above)
all classes known by the VM (i.e. Float, SmallInteger, Character,
Array, String, Method, Block, Class, Metaclass etc.) must have a layout
as compiled into the VM. Since the VM accesses these instance
variables (and is not affected by a class change) it would use
wrong offsets when accessing an instance of such a changed class.
Since instance variables are inherited, this also affects all
superclasses of the above listed classes.
You will get an error-notification if you try to change such a
class within the browser.
The "poolDictionaries:" argument must be an empty string.
Technically, class variables are implemented as globals with a special name constructed as:
ClassName:ClassVarName
However, you should not have to care about or depend on this, except for the fact that class variables are visible when inspecting the Smalltalk dictionary and can be accessed easily from C-functions as globals (named "ClassName_ClassVarName").
Do not depend on any specific implementation of class variables; the current implementation may change without notice. Actually, it is planned to separate classVariables from Smalltalk globals in future ST/X versions and use multiple dictionaries within the VM.
superclass variableSubclass:#class
instanceVariableNames:'instVar1 instVar2...'
classVariableNames:'classVar1 classVar2...'
poolDictionaries:'sharedPool1...'
category:'some-category'
to define class as a subclass of superclass with indexed instance
variables even if superclass had no indexed instance variables. An
error will be generated, if the superclass is a variableByte- or
variableWord class.
superclass variableByteSubclass:#class
instanceVariableNames:'instVar1 instVar2...'
classVariableNames:'classVar1 classVar2...'
poolDictionaries:'sharedPool1...'
category:'some-category'
to define class as a subclass of <superclass> with indexed instance
variables which are byte-valued (0 .. 255) integers.
An error will be generated, if the superclass is a variable class (i.e. has indexed instances) AND it has NO byte valued elements.
superclass variableWordSubclass:#class
instanceVariableNames:'instVar1 instVar2...'
classVariableNames:'classVar1 classVar2...'
poolDictionaries:'sharedPool1...'
category:'some-category'
to define class as a subclass of superclass with indexed instance
variables which are word-valued (0 .. 16rFFFF) integers
(i.e. unsigned shorts in c-world).
It is an error if superclass has non-word indexed instance variables.
superclass variableFloatSubclass:#class
instanceVariableNames:'instVar1 instVar2...'
classVariableNames:'classVar1 classVar2...'
poolDictionaries:'sharedPool1...'
category:'some-category'
or:
superclass variableDoubleSubclass:#class
instanceVariableNames:'instVar1 instVar2...'
classVariableNames:'classVar1 classVar2...'
poolDictionaries:'sharedPool1...'
category:'some-category'
to define class as a subclass of superclass with indexed instance
variables which are shortfloat- or doublefloat-valued floating-point numbers
(i.e. floats and doubles in c-world).
Float- and DoubleArrays were added to support 3D graphic packages (i.e. GL), which use arrays of float internally to represent matrices and vectors. They provide much faster access to their elements than the alternative using byteArrays and floatAt:/doubleAt: access methods.
Also, storage is much more dense than in arrays, since they store the values directly instead of pointers to the float objects.
A 1000-element floatArray will need 1000*4 + OHDR_SIZE = 4012 bytes, while a 1000-float-element array needs 1000*4 + OHDR_SIZE + 1000*(12+8) = 24012 bytes (each float itself requires 8 bytes plus a 12-byte header).
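The size comparison can be reproduced with a few workspace expressions. This is only a sketch, assuming a 12-byte object header (OHDR_SIZE), 4-byte pointers and 4-byte short floats as in the figures above; the variable names are chosen for illustration:

```smalltalk
| ohdrSize n floatArrayBytes arrayBytes |

ohdrSize := 12.     "assumed object header size in bytes"
n := 1000.          "number of elements"

"FloatArray: one header plus n 4-byte float values stored inline"
floatArrayBytes := (n * 4) + ohdrSize.

"Array: one header plus n 4-byte pointers, plus n separate Float objects
 (a 12-byte header and 8 bytes of data each)"
arrayBytes := (n * 4) + ohdrSize + (n * (ohdrSize + 8)).

Transcript showCR:floatArrayBytes printString.   "4012"
Transcript showCR:arrayBytes printString.        "24012"
```

Under these assumptions the plain Array needs about six times the storage of the FloatArray.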
superclass variableLongSubclass:#class
instanceVariableNames:'instVar1 instVar2...'
classVariableNames:'classVar1 classVar2...'
poolDictionaries:'sharedPool1...'
category:'some-category'
or:
superclass variableSignedWordSubclass:#class
instanceVariableNames:'instVar1 instVar2...'
classVariableNames:'classVar1 classVar2...'
poolDictionaries:'sharedPool1...'
category:'some-category'
or:
superclass variableSignedLongSubclass:#class
instanceVariableNames:'instVar1 instVar2...'
classVariableNames:'classVar1 classVar2...'
poolDictionaries:'sharedPool1...'
category:'some-category'
to define class as a subclass of superclass with
long (32 bit integers in the range 0 .. 16rFFFFFFFF), signed short
(i.e. -16r8000 .. 16r7FFF) or signed long (-16r80000000 .. 16r7FFFFFFF)
indexed instance variables.
These types were added for easier bulk data exchange with C language functions.
They are not used in smalltalk.
Be aware that indexable classes with float, double, signedWord, long
and signedLong elements may NOT be available in other smalltalk implementations.
Using them may make your application non-portable to other systems
(however, these can be simulated by subclassing ByteArray and redefining the
access methods).
Class comment
("class-definition files" only)
A class comment may be defined with an expression of the form:
ClassName comment:'some string'
An alternative to using a comment is to define class methods under the category "documentation",
consisting of comments only.
Empty methods do not use ANY code space in ST/X,
and have the positive effect
of not eating up data space in the smalltalk executable (which the comment does).
Class instance variables
("class-definition files" only)
A class may have class instance variables; these MUST be declared before
the first class method is declared. The declaration has the form:
ClassName class instanceVariableNames:'string with varNames'
Do not confuse class variables with class instance variables.
Only one such class-instance-variable definition is allowed per input file.
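To make the difference between class variables and class instance variables concrete, here is a minimal sketch of a class-definition file. The class Counter, its category and the variable names are hypothetical; the ? message is the nil-default operator listed among the inlined messages in this document:

```smalltalk
Object subclass:#Counter
    instanceVariableNames:''
    classVariableNames:'SharedCount'
    poolDictionaries:''
    category:'Examples'
!

Counter class instanceVariableNames:'ownCount'
!

!Counter class methodsFor:'counting'!

bump
    "SharedCount is a single slot shared with all subclasses;
     ownCount is a separate slot in each class"

    SharedCount := (SharedCount ? 0) + 1.
    ownCount := (ownCount ? 0) + 1.
    ^ ownCount
! !
```

If a subclass (say SubCounter) is defined in another file, then evaluating "Counter bump" and "SubCounter bump" once each would leave SharedCount at 2, while each class's own ownCount would be 1.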
Method definition
The expressions following the class definition are to be
method definitions of the form:
!ClassName methodsFor:'method-category'!
aMethod
...
!
aMethod
...
!
...
lastMethodInCategory
...
! !
or
!ClassName class methodsFor:'category'!
aClassMethod
...
!
...
lastClassMethodInCategory
...
! !
"class-definition files" may only contain method definitions for the class defined in the class definition.
"method-only files" may contain methods for any class - but no class definitions.
Instance methods and class methods may be in any order.
To allow compilation of classes filed out from ENVY, stc
also recognizes the selectors privateMethodsFor: and publicMethodsFor:.
In addition, the special selector ignoredMethodsFor:
tells stc to ignore all following methods up to an empty chunk.
There is a limit on the number of arguments with which
methods can be defined and messages can be sent (currently 15).
This limit will be removed eventually, allowing an arbitrary number
of arguments.
Other limits are:
For very complicated expressions (especially when these are generated automatically), the temporary limit could be reached in theory. In practice, so far no Smalltalk code (available PD programs and alpha testers' application code) has ever hit those limits.
Since most terminals cannot display the Smalltalk assignment character
'<-' (backarrow as one character with the same ascii-code as '_'),
the scanner also accepts the character sequence ":=" (colon-equal)
to express assignment.
This is compatible with similar extensions found in other Smalltalk
implementations. Of course, the '_' is also accepted.
Use ':='; support for '_' may be removed in later versions.
Also, Smalltalk/X, like newer Smalltalk-80 versions, allows
underscores in identifiers - no longer treating them as assignment.
Although not defined in the book, Smalltalk-80 expressions seem to require
(blank) characters to separate tokens
(i.e. "Point origin: point1 corner: point2").
Smalltalk/X does not need these
(i.e. "Point origin:point1 corner:point2" is fine).
I do not know at the moment whether this causes any problems when porting
Smalltalk/X code to other Smalltalk implementations
(if required, the fileOut-methods may have to be changed to add blanks).
Assignment and init-expressions
In contrast to Smalltalk-80's fileIn format (where any expression is allowed),
expressions other than the above must be of the form:
Smalltalk at:#name put:constant
(constant may be any integer, float, string, symbol, true, false or nil)
or of the form:
classname initialize
(classname must be the name of the class defined in this source-file)
These expressions allow globals to be set to a predefined value at startup
and/or class initialization.
Example:
...
Smalltalk at:#MyVariable put:true !
...
Example Class
Point subclass:#Point3D
instanceVariableNames:'z'
classVariableNames:''
poolDictionaries:''
category:'Graphics-Primitives'
!
Point3D comment:'
this class defines a point in 3-dimensional space
'!
!Point3D class methodsFor:'instance creation'!
x:newX y:newY z:newZ
"answer a new point with coordinates newX, newY and newZ"
^ ((self basicNew) x:newX y:newY) z:newZ
! !
!Point3D methodsFor:'accessing'!
z
"Answer the z coordinate"
^ z
!
z:newZ
"set the z coordinate"
z := newZ
! !
!Point3D methodsFor:'printing'!
printString
"answer my printString"
^ super printString , '@' , z printString
! !
Semantic Details
The following only lists non obvious semantic details - for a description
of the smalltalk language, please refer to standard literature.
Evaluation order
Expression arguments and receiver are evaluated left to right,
starting with the receiver (with exceptions as described below).
Smalltalk is an eager evaluating language - that is, all arguments
are evaluated before the message send - even if not used by the called code.
Lazy evaluation can be simulated partially by using blocks as arguments,
or by special code (see the LazyValue class and its documentation).
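The difference between an eagerly evaluated argument and a block argument can be sketched with standard messages only. The following workspace code is an illustration; the variable name count is arbitrary:

```smalltalk
| count |

count := 0.

"Eager: the argument expression is evaluated before #& is sent,
 so count is incremented although the receiver is false"
false & [count := count + 1. true] value.

"Deferred: the block is passed unevaluated; #and: evaluates it only
 if the receiver is true, so count is not incremented here"
false and:[count := count + 1. true].

Transcript showCR:count printString.    "1"
```

Only the first expression runs the side effect, because its argument is evaluated before the send regardless of whether the called code needs it.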
Side effects
If any argument of an expression has a side effect on an instance variable,
and the expression uses that instance variable, it is NOT DEFINED
whether the original or the modified value of that instance variable is used.
For example, given:
Object subclass:#SomeClass
    instanceVariableNames:'i'
    ...
i:aNumber
    i := aNumber
!
increment
    i := i + 1.
    ^ i
!
undefinedBehavior
    ^ self increment + i
!
undefinedBehavior2
    ^ i + self increment
!
test
    Transcript showCR:'undefinedBehavior returns: '
        , (SomeClass new i:0) undefinedBehavior printString.
    Transcript showCR:'undefinedBehavior2 returns: '
        , (SomeClass new i:0) undefinedBehavior2 printString.
!
in the #undefinedBehavior method, the value used for i
in the #+ message may or may not be the incremented value.
Warning:
never depend on the particular behavior of a smalltalk or compiler;
the semantics here are not defined. Even in ST/X, the behavior may differ
between versions, or between the incremental and batch compiler
(actually, in the current ST/X version, the incremental compiler returns
2 for the first, and 1 for the second method.
In contrast, stc generated code returns 2 for both).
Notice:
this is to be considered a bug, because it conflicts with the evaluation order as defined above
(although code which depends on it is bad coding style anyway).
In practice, there have been no problems due to this in the past.
#become:
The behavior of your program is undefined if instance variables
of the receiver are accessed in a method
after a #become: message was sent to the receiver.
If the #become: changed the receiver into some other
object with fewer or no instance variables, even a nonrecoverable
fatal error may occur. Otherwise, the access will be to the
corresponding instance variable slot as defined by the other class.
For example, the following may lead to unexpected behavior
(or even a nonrecoverable fatal error):
Object subclass:#SomeClass
    instanceVariableNames:'i'
    ...
badMethod1
    i := 0.
    self become:somethingElse.
    ^ i
!
badMethod2
    self become:somethingElse.
    i := 0.
!
Notice:
ST/X typically falls into a segmentation violation exception,
which can be caught by an appropriate exception handler.
The use of message sends to access the instance variables
removes the above danger:
Object subclass:#SomeClass
    instanceVariableNames:'i'
    ...
i
    ^ i
!
i:newValue
    i := newValue
!
fixedMethod1
    self i:0.
    self become:somethingElse.
    ^ self i
!
fixedMethod2
    self become:somethingElse.
    self i:0.
!
Literal array of a method
Stc generated code does not (currently) access the literal array; instead,
the literal array of a method is created for the debugger (to find senders)
only. Modifying the literal array (which is bad coding style anyway) has
no effect on machine compiled code.
In contrast, bytecode-interpreted methods use the values found in the literal array. A modified literal array will change the behavior of the method. This modified behavior is not reflected in the methods source code.
And finally, the dynamic compiler (JITTER) generates code which accesses literals inline (i.e. it takes the literal array's contents at compilation time and creates inline constant accesses). Thus, JITTED code behaves like statically compiled code, in that changing the literal array does not affect the execution. (However, since the system may choose to flush its dynamic code cache and recompile later, the changed literal array will affect the execution then.)
For these reasons, we highly recommend keeping the literal arrays untouched
(experts may modify them, but have to ensure that the method gets recompiled, by converting a
statically compiled method into a dynamic one, and by flushing the code cache entry for this method if
it is a dynamically compiled one).
Builtin methods
For a number of message sends, both the stc- and the incremental compiler
create inline code which performs the function without doing any message send.
Redefinition of any method listed below will have no effect on your
program; also, tracing and breakpointing of these methods is not possible
(since they are never executed).
In theory, many more methods could be inlined; the current set represents a compromise between performance (inlined code is much faster) and flexibility (inlined methods cannot be redefined/traced).
In general, only methods for which a changed semantic would make the system unusable anyway, are inlined. With the stc compiler, the degree of inlining can be further controlled by command line arguments.
Inlined messages:
ifTrue:[ifFalse:] [ ... ]
ifFalse:[ifTrue:] [ ... ]
    with bytecode interpretation, the receiver is checked for
    being either true or false, and an errorSignal is raised if not.
    STC compiled and just-in-time generated code simply compares the
    receiver against true or false, showing undefined behavior if the
    receiver is not a boolean
    (i.e. "foo ifTrue:" is compiled as "foo == true ifTrue:"
    and "foo ifFalse:" is compiled as "foo ~~ true ifTrue:").
When debugging programs, you may want to disable just-in-time compilation, to have the system check for non-boolean receivers and detect those error situations.
We are aware of the fact, that this different behavior is bad, and are still looking for an easy fix (which does not cost performance and does not blow up the generated code too much).
[ any ] whileTrue: [ ... ]
[ any ] whileFalse: [ ... ]
    as above for the block's value
timesRepeat:[]
to:aSmallInteger do:[]
to:aSmallInteger by:aSmallInteger do:[]
+ aSmallInteger
- aSmallInteger
* aSmallInteger
// aSmallInteger
bitAnd:aSmallInteger
bitOr:aSmallInteger
negated
    arguments are checked for being smallIntegers
    and the expression is evaluated without sending the message.
    Depending on the compiler's optimization settings, this may also be
    done partially for float or mixed float & smallInteger operands.
at:aSmallInteger
at:aSmallInteger put:anObject
    the array access is performed inline, if the index is within the bounds and it is likely that the receiver is an array.
at:aSmallInteger
at:aSmallInteger put:anObject
    the string access is performed inline, if the index is within the bounds and it is likely that the receiver is a string.
class
isMemberOf:
    direct access to the object's (hidden) class slot
==
~~
    an identity compare produces true or false without a message send
isNil
notNil
    an identity compare against nil is generated
perform:aMessage
    inlined like any other message send, if the argument is a constant symbol
yourself
    no message send is generated - the receiver is directly evaluated
? anObject
    the receiver is evaluated and compared against nil. If non-nil, the result is the receiver - otherwise the argument.
Character space
Character tab
Character value:aSmallInteger
    no message - the space-Character constant is directly returned.
    This is also done for tab, cr and a few other
    common character constants.
SmallInteger maxVal
    no message - the maximum SmallInteger constant is directly returned.
    This is also done for minVal, maxBits and maxBytes.
= anObject
    an identity test is performed first; if the objects are identical, no equality test is performed and true is generated by inline code.
~= anObject
    an identity test is performed first; if the objects are identical, no equality test is performed and false is generated by inline code.
< anObject
    if the arguments are SmallIntegers, the comparison is done inline.
    Otherwise, a regular message send is generated.
    The same is done for the other relational operators.
Extensions to Smalltalk-80 (Blue book version)
Compiler directives
Comments of the form:
"{ something ... }"
are recognized by the stc-compiler as directives. Since directives are
hidden within comments, these will be ignored by
other Smalltalk systems, making ST/X sources transferable to
other Smalltalks.
"{ Line: n }"
tells stc that line-numbering should continue with line n.
Line numbers in following warning- and error-messages will be relative
to n.
"{ Symbol: aSymbolString }"
tells stc that a primitive wants to access a symbol.
Stc includes a definition for that symbol and generates code to create the
symbol at startup time; within the primitive, the symbol can be referred to
by a C-conforming name as described in
``How to write inline C code''.
Symbols can also be created using the (slower) _MKSYMBOL() function at
runtime. This also allows C-Strings to be converted to symbols
(example:
in the XWorkstation-class where keypress-characters are converted to
symbols like #Home, #Down etc.).
This directive is no longer needed and may not be supported in future versions.
Use the @symbol-mechanism,
since it relieves you of the need to know about name translations.
"{ Class: className }"
after an instance-, class-, or local-variable declaration tells
stc that this variable will always be assigned an object of class
className.
Various optimizations in the code are possible if the type of an object is known (especially for simple types such as "SmallInteger", "Character", "Point" or "String").
Currently everything but SmallInteger, Float and Point-definitions in method
local declarations are ignored by the compiler.
Even with these type declarations, the compiler still generates code which checks assignments for correct typing (i.e. an assignment of a float to a SmallInteger-typed variable will generate a runtime error).
With the improvements of the type-tracker and optimizations performed in stc,
this feature seems now much less useful in many situations
- especially, when considering the limited reusability of the generated code.
(see benchmark results of sieve/sieveWithInteger, atAllPut/atAllPut2 etc.
some show very small differences between the untyped and typed versions)
We recommend using type hints only in performance critical code, for fully debugged code.
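A type hint on a method-local variable might look as follows. This is a sketch only: the class Summer and the method are hypothetical, and the exact placement of the directive (directly after the variable name inside the declaration bars) is an assumption based on the description above:

```smalltalk
!Summer methodsFor:'examples'!

sumUpTo:n
    "answer 1 + 2 + ... + n; sum is declared to always hold a SmallInteger"

    |sum "{ Class: SmallInteger }" |

    sum := 0.
    1 to:n do:[:k | sum := sum + k].
    ^ sum
! !
```

Since assignments are still checked, a value which is not a SmallInteger (for instance, a sum which no longer fits into the SmallInteger range) would be expected to lead to a runtime error rather than to silently wrong results.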
Sometimes, finer control (i.e. over individual methods) is needed. Comments of
the form:
"{ Pragma: keyword }"
instruct stc to change its code generation strategy for a single method.
Keyword must (currently) be one of:
"+optspace"
"+optspeed" or "+optSpeed"
"+optmath" or "+optMath"
"+inlinemath" or "+inlineMath"
"+inlinemath2" or "+inlineMath2"
For example:
"although the whole class is compiled '+optinline',
the following class-initialization method is compiled for space,
since it is only called once ..."
initialize
"{ Pragma: +optspace }"
....
....
!
Changing compilation to "+optspace" is useful for methods which are
seldom called (such as class-initialization methods, which are usually invoked
only once during startup) or error reporting methods,
which are only invoked for abnormal events.
The effect of pragmas can be turned off with the "-noPragmas" stc
command line argument - with this option, optimizations are under control
of command line arguments only.
Currently, not all possible trigonometric functions generate inline code with the inlineMath options - there may be more in the future if there is a need.
"{ NameSpace: nameSpaceID }"
declares the namespace into which the following class is to be installed.
It must precede any class definition message in the source file
(i.e. it should be located somewhere at the file's beginning).
The current project's defaultNameSpace is used if no namespace directive is present in a loaded sourceFile.
"{ Package: 'package-identifier' }"
defines a package identifier, which is attached to all methods and classes
which are defined in that file.
This is mostly useful if individual methods for existing (Smalltalk-)
classes are to be filed in, and you want those to be easily identified later.
For example, the "tgen" package adds a few methods to the Object and
Array classes. In order to identify those later (i.e. find them
quickly for removal), the change file contains a line defining a package
identifier of "tgen"; therefore, all of the redefined methods
get this as their package identifier.
Thus, you can later use the ProjectView's "browse" menu item
to open a browser on all those methods.
The current project's defaultPackage identifier is used
if no package directive is present in a loaded sourceFile.
Multiple Namespaces
Especially when filing in third party code or working in a big team, you may encounter name conflicts between class names. These conflicts are very inconvenient, since (without nameSpaces) you have to manually browse those files (before filing in) and change all names - which is especially tedious, since the systemBrowser cannot be used for this.
To allow reasonable handling of this case, Smalltalk/X provides (starting with rel3.1) multiple namespaces, which effectively allow two or more classes with the same name to reside in one image/executable.
By default, all classes are defined in the Smalltalk namespace.
A different namespace can be selected either with a directive of the form:
"{ NameSpace: NamespaceIdentifier }"
at the beginning of the ST-source file,
or with an stc command line argument:
stc ... -nameSpace=NamespaceIdentifier ... file.st
Both are equivalent, and tell stc
that all globals defined in this module are not to be
entered into the default nameSpace 'Smalltalk',
but instead into a space called NamespaceIdentifier.
Also, globals used within the compiled class are first searched for in
NamespaceIdentifier, THEN in the Smalltalk nameSpace.
NamespaceIdentifier must be a single identifier starting with an
upper-case letter; underscores are allowed, but spaces or other non-alphanumeric characters
are not
(i.e. since a global variable will be created for it, it must
be a valid global variable identifier).
Smalltalk
.
Example:
You get some code which defines a class "Button",
which should not conflict with the builtin Button class.
To allow both classes to reside in one image, either load it into
some (say) "MyWidgets" namespace using the fileBrowser,
or stc-compile it with:
stc -c -NMyWidgets filename.st
The class defined by the module will then NOT conflict with (i.e. overwrite)
the existing Button.
The loaded class will not even be visible in the Smalltalk dictionary.
However, classes within the same nameSpace may refer to the new class
as Button.
In rare cases, it may be necessary to access globals from different
namespaces within one module. Consider the above case (Button in
MyWidgets), and you need access to the original Button from within that
module.
To access the original Button from within the module, you can either
use the explicit:
Smalltalk at:#Button
or use the (nonstandard) construct:
Smalltalk::Button
To access the new Button from other modules, use either:
MyWidgets at:#Button
or the (nonstandard):
MyWidgets::Button
For compatibility with VA and VW5.x, the dot-notation:
MyWidgets.Button
is also supported when filing in code.
Notice:
The following "using"-directive is not yet released (in vsn 3.1)
(it is currently being evaluated and tested).
If you don't want to change the sourcecode, you can also define the
namespaces to use for searching in a line as:
"{ Using: name1 name2 ... nameN }"
at the beginning of the source file, or with an stc command line
argument (if you don't want to modify the file):
stc ... -Uname1 -Uname2 ... -UnameN filename.st
The names given define the nameSpaces to search for globals, in the
given order. Thus a line:
"{ Using: MyWidgets }"
will force searching for globals in the MyWidgets nameSpace first,
THEN in the standard Smalltalk nameSpace;
thus the name Button refers to MyWidgets::Button automatically.
Where portability matters, prefer the explicit access (i.e. "Smalltalk at:#Button"
or "MyWidgets at:#Button"),
since this is compatible with other smalltalk implementations
(i.e. it can be simulated using poolDictionaries or changing some methods).
Consider an application which needs to add a #foo method
to an existing base system class. If this method is only required
within that application, AND the creation of instances of that class
is under that application's control, the following trick encapsulates
the added method in a nice way:
"{ NameSpace: MyNameSpace }"
NameOfSystemClass subclass:#NameOfSystemClass
...
...
...
<added foo method here>
All instance creations of "NameOfSystemClass" (by code
within that nameSpace) will now
create instances of the modified subclass - which inherits and therefore
mimics the original class's behavior except for the added foo-method.
There is no need to add foo to the main class.
Of course, the above has its limitations, in that subclasses of the original
baseclass are not affected by the new foo method - which could also be called
a feature, since those classes are completely protected
from any changes done in the private version ...
Private classes
Starting with rel2.11, ST/X allows classes to be declared as
being private (i.e. owned) by some other class. These private classes are
not visible to the outside of the owning class - there may even be a
globally known class with the same name.
Private classes help in organizing large projects in that
additional information is hidden and name conflicts are avoided.
Certain restrictions apply to Private classes:
Like regular classes, private classes are created by a class definition
expression. Additional variants of the subclass-creation messages are
provided for private classes:
superclass subclass:#class
instanceVariableNames:'instVar1 instVar2...'
classVariableNames:'classVar1 classVar2...'
poolDictionaries:'sharedPool1...'
privateIn:OwningClass
or:
superclass variableSubclass:#class
instanceVariableNames:'instVar1 instVar2...'
classVariableNames:'classVar1 classVar2...'
poolDictionaries:'sharedPool1...'
privateIn:OwningClass
and so on ...
(notice the additional privateIn: keyword argument.)
Within the owning class, any reference to class refers to the
private class - even if some other (global) class with that name exists.
A global class with the same name can be referred to as
"Smalltalk::class",
or - if you prefer portable code -
with "Smalltalk at:#class".
Technically, a special classVariable is created, but its visibility is
limited to the owning class.
I.e. private classes are hidden from subclasses of the owning class.
This visibility is still to be evaluated and could be changed in future
versions.
If a private class is to be referenced by a subclass, use access methods
in the owners class protocol (which is better style, anyway).
Use the SystemBrowser's "new private class" item in its class list menu to get a template for private class creation.
Private classes and namespaces use the same basic mechanism - a namespace
is actually a dummy class, providing a home for its classes.
Therefore, you should avoid any conflicts between namespace names and class
names.
Recommendations:
In contrast to the nameSpace mechanism and fileIn format described above,
private classes use a slightly different definition format, which
is NOT backward compatible with systems that do not
support this feature.
We therefore recommend not using private classes for your projects,
but using namespaces instead, if you ever plan to port your application to other
smalltalk systems.
Conflicts within a namespace are much easier
to avoid than overall conflicts, and the added encapsulation provided by
having classes absolutely private is often not needed.
Of course, since ST/X's system classes are probably never of any interest to other system vendors, these can and do make use of private classes ;-)
Limitations:
A class may not be a subclass of one of its private classes
(technically, this constellation is possible to create in the browser,
but is not possible to fileIn).
Local ('here'-) sends
In some situations, it is strictly necessary that a send goes to a locally
defined method. For example, many private methods are not supposed to be
redefined by subclasses. In standard smalltalk, there is no way for the implementor
of a class to make certain that his own methods are called by selfSends, if
other programmers use this class as an (abstract) superclass and create subclasses
based on it.
To offer some safety in this situation, Smalltalk/X offers an extension to the
standard smalltalk language, the so called hereSend.
It is used just like a superSend, using the (new) pseudovariable "here"
as the receiver. The semantics of the hereSend are much like those of a superSend.
However, while a superSend starts the method lookup in the superclass of the
class which contains the method,
hereSends start it in the class containing the method.
(A normal selfSend starts it in the class of the receiver - independent of
where the method is defined.)
Also, remember that hereSends are a special ST/X feature. Code using them will probably not be portable to other smalltalk implementations.
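The difference between the lookup rules described above can be sketched as follows. This is only an illustration: Base and Derived, their category and their selectors are hypothetical names, and both classes are shown together for brevity (for stc, each class would go into its own file).

```smalltalk
"hypothetical classes, for illustration only"
Object subclass:#Base
    instanceVariableNames:''
    classVariableNames:''
    poolDictionaries:''
    category:'Examples'
!

!Base methodsFor:'examples'!

helper
    ^ 'Base helper'
!

viaSelf
    "selfSend: the lookup starts in the class of the receiver"
    ^ self helper
!

viaHere
    "hereSend: the lookup starts in Base, the class containing this method"
    ^ here helper
! !

"hypothetical subclass which redefines helper"
Base subclass:#Derived
    instanceVariableNames:''
    classVariableNames:''
    poolDictionaries:''
    category:'Examples'
!

!Derived methodsFor:'examples'!

helper
    ^ 'Derived helper'
! !
```

Given the semantics described above, "Derived new viaSelf" should reach the redefined helper in Derived, while "Derived new viaHere" should still reach the helper defined in Base.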
thisContext markForXXX
to a method's code.
The supported pragmas are:
<exception: #raise>
marks the context of the method as an exception-raising method.
This has the same effect as adding the statement
"thisContext markForRaise"
to the beginning of the method.
<exception: #handle>
marks the context of the method as an exception-handling method.
This has the same effect as adding the statement
"thisContext markForHandle"
to the beginning of the method.
<exception: #unwind>
marks the context of the method as an unwind-handling method.
This has the same effect as adding the statement
"thisContext markForUnwind"
to the beginning of the method.
<context: #return>
marks the context of the method as a possible target of a return message -
i.e. the context will be created such that it will allow returning from
it via the #return message
(as in "thisContext sender sender return").
Primitive Definitions
Definitions and declarations common to
all primitive code in all methods can be placed into a single global
primitive definition section.
Typically, C-include or C-define statements and/or type declarations are placed in these.
A primitive definition is defined with:
!className class primitiveDefinitions!
%{
    ... anything you like ...
%}
! !
The contents of this chunk will be remembered internally and included whenever
methods which contain primitive code are to be compiled.
Additional C-functions must be declared in a primitiveFunctions chunk,
which is not included when individual methods are compiled (otherwise, you would
get linkage errors, due to multiple definitions of the same function).
Finally, C-variables are to be declared in a primitiveVariables chunk.
The SystemBrowser's class menu includes items to show those
primitive definitions (for example, see the definitions in ExternalStream).
Resource Definitions
Methods may be marked as resource-accessing-methods by adding
a definition like:
<resource #resource>
or:
<resource #resource ( list of additional symbols ) >
Both of the above forms have no semantic meaning - except for the methods
being marked specially. This marking makes those methods
easier (and especially: quicker) to locate, without a need to
scan the source code of all methods.
The launcher provides a menu item (in the classes .. menu),
to quickly search for specific resource accesses and open a browser on them.
For example, all methods which depend on the keyboard mapping are marked
in ST/X as:
<resource #keyboard>
If present, a resource definition must be at the very top of a method's code - before any local variable definition or statement (but after the method's argument specification).
Trick:
Resource definitions can also be used to mark methods for yourself, or
for your project management;
a definition like:
<resource #toBeFixed>
or:
<resource #toBeReviewed>
may help you to locate those methods easily later.
Do not use resource definitions for things which are common to ALL
of your methods (i.e. never automatically generate resources
containing your name, date or other version information).
Such information should be recorded in a method comment (ST/X already provides
a mechanism to do this automatically: the HistoryManager, which
can be enabled via the launcher's settings menu, does exactly this for you).
If many methods are marked with a resource tag, the fast search will
degrade into a slow overall search.
Method Annotations
Methods can get additional attributes via annotations.
The syntax is similar to a resource or pragma definition, with an arbitrary keyword:
<keyword: arguments...>
The annotation can be extracted from a method via the
#hasAnnotations, #annotations and similar messages.
Also, searching or other operations (such as marking menu methods) are possible using annotations.
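As a rough sketch of how an annotation might be attached and read back: the class MyClass, the selector doSomething and the keyword reviewedBy: are made up for this illustration, and compiledMethodAt: is assumed to be the accessor used to fetch the method object.

```smalltalk
!MyClass methodsFor:'examples'!

doSomething
    "reviewedBy: is an arbitrary, hypothetical annotation keyword"
    <reviewedBy: 'someone'>

    ^ self
! !
```

The annotation could then be queried in a workspace with expressions such as "(MyClass compiledMethodAt:#doSomething) hasAnnotations" or "(MyClass compiledMethodAt:#doSomething) annotations".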
Standard conventions
For the ST/X system, we use the following conventions
when marking a method with a resource symbol:
Sending via #perform: is always possible, since this is equivalent
to a self-send (thus, even private and protected methods can be reached via #perform:).
Late note:
We did not find this feature very useful (although many ex-C programmers asked for it in the first place),
and are probably not going to further support it in new browser versions.
Lexical Stuff
Some extensions to Smalltalk as described in the blue-book were made by
ParcPlace up to OW4.1.
Some of these extensions are also available in ST/X.
ByteArray literals
Literal ByteArrays are created by enclosing the elements in #[ .. ].
The elements must be in the range 0 .. 255.
Example:
x := #[ 1 2 3 4 ].
masks := #[ 2r10000000
2r01000000
2r00100000 ]
Underline in identifiers
The underline character is treated like a letter when encountered in an identifier.
This extension was added to ST-80 with the introduction of rel4.
Notice that the underline character was parsed as an assignment token in older
ST-80 versions, which results in "var1_var2" being parsable
both as a single identifier and as an assignment statement.
Currently, ST/X parses the above as an identifier iff no space characters
are contained in the construct. I.e. "var1_var2" will parse as
a single identifier, while "var1 _ var2" parses as an assignment.
This is compatible with most oldStyle code, but may lead to trouble if
spaces are missing; for example, the following code fragment
(found in the Squeak smalltalk system)
parses incorrectly, if the underline option is not turned off:
|foo|
foo_ 10.
The old-style assignment is supported to allow old Smalltalk code to be
loaded; however, it is recommended not to use the underline character as an assignment
operator and to convert old code to use the new syntax.
Future Smalltalk versions may no longer support this (backward compatible) construct.
The degenerate identifier consisting of an underline alone is only
allowed within a keyword-message selector; i.e. the following is
legal: "self _:1 _:2 _:3", and compiles to a #_:_:_: message send.
For portable code, you should not use this, since not all other smalltalk implementations
allow it.
Non alphanumeric characters in symbols
Usually symbols are defined as #xxx,
where xxx consists of a letter followed
by letters or digits.
There are also keyword and binary symbol literals,
such as: #at:put:, #at: or #+.
Symbols with other characters can be specified by enclosing them in single
quotes, where the first quote must immediately follow the '#'-character.
Example:
#'a symbol with spaces' - spaces
#'123' - starts with a digit
#'hello_world' - underscore
Symbols with unprintable characters must be created at runtime,
by sending #asSymbol to an appropriate string.
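A minimal workspace sketch of this, assuming the bell character (code 7) as the unprintable character; the variable name is arbitrary:

```smalltalk
|sym|

"build the string at runtime, since the bell character cannot be typed into a literal"
sym := (String with:$a with:(Character value:7)) asSymbol.

"sym is now a two-character symbol whose second character is unprintable"
sym size printNL.
```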
Empty local variable declaration
The list of local variables may be empty, as in:
myMethod
    | |
    ....
the same is true for blocks:
x := [:a | ] - as in-the-book
x := [:a | | | ] - with empty locals
Empty methods
A totally empty method is legal; it is equivalent to a simple ^ self.
Thus:
myMethod1
!
myMethod2
    | |
!
myMethod3
    | aLocal |
!
myMethod4
    "only a comment"
!
myMethod5
    ^ self
!
all behave identically (returning self).
Special 'constants' as Array literals
Smalltalk/X allows "nil", "true" and "false"
to be used in literal arrays.
Thus it is possible to declare an array as:
#('string1' 'string2' nil 1 1.2 false true wow)
Within an array literal, both simple identifiers AND identifiers prefixed by
the #-character are accepted and define a symbol within that literal.
However, if a symbol named 'nil', 'true' or 'false' is required as an array
element (i.e. not the value), it MUST be preceded by a #-character, as in:
#(1 2 3 #nil #true #false true)
In the above example, the 5th element will be the symbol true, while
the last element will be the object true.
(Which - for your confusion - is the object bound to the symbol true :-)
'Double' constants
Although Smalltalk/X does not differentiate between Floats and Doubles
as Smalltalk-80 does (i.e. short floats vs. double-floats),
float constants with a trailing "d" are accepted. However, these
literals will be compiled in any case into an ST/X Float object (which
is the equivalent to a Double in ST-80).
This will be changed in an upcoming version.
End-of-line comments
Smalltalk/X allows special comments,
which start with the character sequence:
"/ (double-quote followed by slash)
and are treated as a comment to the end of the source-line.
I.e. everything up to the end-of-line is ignored, even if it contains another
comment, or comment closing character.
Within string constants, this character sequence is ignored (i.e. not a comment).
Notice, that this feature is NOT compatible to other ST versions; code containing these to-end-of-line comments will not compile on other Smalltalks.
However, it simplifies porting of existing code to ST/X, since parts of the code can be easily commented out, by adding "/ to the beginning of each such line.
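The rules above can be illustrated with a short workspace fragment (ST/X only; the variable names are arbitrary):

```smalltalk
|x s|

x := 1.                     "/ ignored up to the end of the line, even with "quotes" inside
"/ x := 2.                  this whole statement is commented out
s := 'not "/ a comment'.    "/ inside a string constant the sequence has no special meaning

x printNL.
s printNL.
```

Since the second assignment is commented out, x keeps the value 1.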
Stc allows subclasses to define instance variables with the same name
as already defined in superclasses. Normally, doing so is not a good idea
and is discouraged. However, in certain situations (i.e. only a binary of the
subclass is available or you do not want to or may not change the source),
allowing this makes sense.
The flag "-errorInstVarRedef" tells stc
to output a warning instead of an error,
and continue with the compilation.
A typical use for this flag is when you want to port a class from some other
smalltalk implementation, which includes an instance variable conflict due to
a different internal implementation of one of the class's superclasses in the
original smalltalk vs. Smalltalk/X.
With this flag, this new class will access its own instance variable under
that name (which was obviously the original intention when the class was
written).
This flag should be used only when porting (unmodifiable) code to ST/X -
new classes should follow the rules.
Lowercase vs. uppercase
Normally it is required (by convention - not by language syntax) that
all globals and class-variable names start with an upper case character,
while instance variables and method/block args & vars start with a lower case
character. By default, stc will stop compilation with an error if these rules
are not followed. The compiler flags "-errorLowerGlobal" and
"-errorUpperLocal" turn these into warning messages
(even those warnings can be turned off ...).
These flags should only be
used when porting (unmodifiable) code to ST/X - new classes should
follow the rules.
The 'here' pseudovariable
Smalltalk/X supports another type of send beside the normal
'self' and 'super' sends: the 'here'-send.
To make this extension compatible with existing code,
'here' is only recognized as the pseudoVariable if no other variable named
here is defined in the compilation scope.
Thus, if any instance-, local or argument variable exists with a
name of 'here',
the compiler will produce code for a normal send - not creating 'here'-sends.
Read the above section on the semantics and use of 'here'-sends.
Extended Binary Operators
Starting with release 4.1.3,
binary operators may consist of up to 3 special characters
(the Blue Book specified a maximum of 2 characters).
Thus, it is now possible to define messages named: #'<=>', #'==>',
#'===' or even #':=:'.
Binary operators may be constructed from 1 to 3 characters from the following characterSet:
- + * / \
= < > ~
& | @ #
, ? ! % :
excluded is, of course, the assignment: #':=', and multiple hash characters
(for backward compatibility, ## is interpreted as the hash symbol itself).
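As an illustration, a three-character operator could be defined in a methods-only file like the sketch below. The choice of Number as the receiver class, the category name and the three-way-comparison semantics are assumptions made for this example only; a given system may already define such a selector.

```smalltalk
!Number methodsFor:'examples'!

<=> aNumber
    "hypothetical three-way comparison: answers -1, 0 or 1"

    self < aNumber ifTrue:[^ -1].
    self > aNumber ifTrue:[^ 1].
    ^ 0
! !
```

With this definition, an expression such as "3 <=> 5" is parsed as a single binary message send with the selector #'<=>'.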
Unicode String- and Character Literals
Starting with release 5.2, unicode is allowed in string- and character literals.
CharacterArrays will now be instances of String, Unicode16String or Unicode32String,
depending on the highest codePoint present in the string.
The string classes have been enhanced to both handle Unicode (isNationalLetter, isUppercase,
asUppercase etc.) and to perform automatic conversion as required.
(For example, when concatenating 8 bit and 16bit strings).
Notice that, although ST/X does handle 32 bit strings, both the X11 and the windows
display interfaces may be still limited to 16bit strings at the time of this
writing. Therefore, we recommend not going beyond a codePoint of 16rFFFF.
The external source code file format is now utf8.
For backward compatibility, ST/X marks utf8-encoded files by writing
an encoding pragma:
"{ Encoding: utf8 }"
near the beginning of generated sourceFiles, and detects utf8 encoded
files by the presence of the "encoding:" string somewhere near the beginning of
the file.
If no such pragma is found in a sourceFile, the file is assumed to be
iso8859-1 (i.e. latin1) encoded.
Tools which read or write external files (i.e. the bytecode compiler, the external stc compiler, Workspace and FileBrowser) look for and honor this pragma.
Please note, that this format is backward compatible to other (non-utf8) smalltalks,
and it is still possible to fileIn ST/X source files into Squeak, VisualWorks etc.
This is actually even possible if non-ascii characters are present in String literals,
as these would appear in the target system as funny strings, which could (in theory) be still
utf8 decoded (manually in the browser, or at runtime or automatically during fileIn).
Sorry, but portability is lost if non-ascii Character literals are present in the filedOut code
- these will lead to a syntax error when filed into a non utf8 smalltalk system.
We therefore recommend NOT using non-ascii character literals; instead, use Strings wherever possible,
and use a "(Character value:xxx)" construct (which is evaluated at compile-time by the ST/X compilers)
when required.
Notice that the Smalltalk language has only been extended for String- and Character
literals.
Non-ascii letters/digits are still NOT allowed for message selectors, variable-
and class names etc.
This was done on purpose - allowing this would probably make the code less readable
(and also much less portable).
Additional checks performed are:
!Someclass methodsFor:'foo and bar'!
foo
|var|
var ifTrue:[
... do something ...
].
!
or:
bar
|var1 var2|
var1 := var2 + 1
!
If compiled using the incremental bytecode compiler, the above methods
will lead to a runtime error (doesNotUnderstand),
while stc refuses to compile these right away.
!Someclass methodsFor:'foo and bar'!
foo
||
... do something ...
!
stc will report a syntax error.
In the Blue Book, it is not specified if the above is
valid or not. Currently, the incremental bytecode compiler accepts it
(to allow easier fileIn of alien code), while stc refuses
to compile this.
UndefinedObject
SmallInteger
True
False
Float
(restriction removed with release 2.10.5)
Symbol subclass may not be
seen as true symbols in many places; subclasses of String
will return an instance of String when asked to copy,
convert etc.
In general, be very careful in subclassing any of:
Float
String
Symbol
Context & BlockContext
Method & Block
Late note:
Some restrictions and strange behavior were removed with release 2.10.5.3;
now, you can subclass Context, Method, Block and Behavior
AND have these objects
be treated correctly by the VM's runtime system
(i.e. accept and treat them like other codeObjects and classObjects respectively).
It is planned for such features to be at least partially supported in future versions.
ifTrue:, ifFalse:, whileTrue:, whileFalse:,
timesRepeat:, to:do: and to:by:do:.
For to:do: and to:by:do:,
this bug will show up only for Integer arguments
where stc can deduce Integer types at compile time.
(anObject xxx) foo; bar; baz
while
anObject foo; bar; baz
will be ok.
!MyClass class primitiveDefinitions!
%{
struct abc {
int field1;
char field2;
};
%}
! !
!MyClass methodsFor:'foo'!
method
|local1 field2|
...
will lead to an error, since the name field2
is used both in a c-structure
and as a method local.
This may also happen with other C-names (i.e. typedefs,
structure names, enum values etc.). Care should be taken, since these
name conflicts may also be due to some #define in an included
C header file.
Compiling code with such conflicts will usually lead to errors in the C-compilation phase. Since stc does not parse (and understand) the structure of primitive code, it will not notice this conflict.
perform:withArguments:.
A suggested workaround is to create some collection and put local values into that.
There is (currently) a limit of 31 temporaries, leading to a maximum expression nesting of 31 (since for every nesting level, one such temporary is needed).
The compiler reuses temporaries as much as possible, so this limit is hardly ever reached - if it is, rewrite the complicated expression, using method locals as explicit temporaries.
This has been partially fixed with release 2.10.6:
LargeInteger constants with radix 2, 8, 10 and 16 are now supported,
up to a maximum value of 2^1023-1
Stc cannot currently generate LargeInteger constants.
Versions before 2.10.2 did not even detect overflow in integer constants,
silently generating wrong code.
Stc versions after 2.10.2 will quit compilation with an error.
You have to make sure that your integer constants fit into 31 bits
(including the sign-bit, this gives 30bits of absolute value).
Thus, the following code will lead to a compilation error:
|v|
v := 16r12345678. "ok, fits into 31 bits"
v printNL.
v := 16r87654321. "not ok, does not fit into 31 bits"
v printNL.
The built-in incremental compiler DOES handle large integer constants
correctly; the above only applies to stc-compilation.
Add a class variable (such as MYCONST) and initialize it in the
class's #initialize method from a string.
I.e. instead of:
...
x := 12345678901234567890.
...
use:
...
classVariableNames:'MYCONST'
...
initialize
MYCONST := '12345678901234567890' asInteger.
...
...
x := MYCONST.
...
Starting with release 5.3, SharedPools are implemented as classes whose
class variables are imported and visible by other classes.
The pools are defined as subclasses of SharedPool,
and the values should be set in the sharedPool's #initialize method.
See OpenGLConstants as an example.
As a side effect of the implementation (in the current 5.3 release), any class's set of class variables can be imported by another class as a sharedPool. Do not depend on this, as this feature may be removed without notice in future versions.
myDict at:name
Initialize the dictionary in the classes' initialize method using:
myDict at:name1 put:value.
...
myDict at:nameN put:value.
...
"
commented out method definition
"
!
...
instead, you have to include the chunk separator ('!') in the comment:
...
"
commented out method definition
!
"
...
This is of course incompatible with the smalltalk fileOut format definition
and will be fixed in later stc versions.
Copyright © 1995 Claus Gittinger Development & Consulting
<cg@exept.de>