Difference between revisions of "SCM Instruction"

m (Removed strange and incorrect remark about starting strings with spaces, and fixed a few typos.)
(Major rework)
Line 1: Line 1:
{{This|This article deals with the technical information on the opcode format. For the opcodes documentation see [[list of opcodes]]}}
+
{{This|This article deals with the technical information on the SCM instruction format. For a list of opcodes along with descriptions, see [[list of opcodes]].}}
Each script instruction is represented by a number called '''opcode''' or [[Wikipedia:Opcode|operation code]] which is implemented using an [[Wikipedia:UINT16|16 bit unsigned integer]]. By this number the game engine identifies an action to perform. Say, an opcode [[0001]] tells to wait for amount of time, [[0003]] shakes the camera, [[0053]] creates a player, etc.  
+
Each script instruction is represented by a number called an '''opcode''' (short for "[[Wikipedia:Opcode|operation code]]"), which is a [[Wikipedia:UINT16|16-bit unsigned integer]]. The opcode tells the game which operation to perform. For example, the opcode [[0001]] tells the game to wait for a certain amount of time, [[0003]] shakes the game camera and [[0053]] creates a player.
  
 
__TOC__
 
__TOC__
  
== Format ==
+
== Instruction Format ==
  
Initially the instructions are written in a human-readable form. For example, a ''wait'' instruction could look like this:
+
Initially the instructions are written as text. For example, a <code>wait</code> instruction could look like this:
 
  wait 0
 
  wait 0
where the word ''wait'' is a compiler-dependent representation of the opcode 0001 and the number <code>0</code> is a single parameter for this particular instruction. When a mission script is assembled, the instructions are written back in raw byte form:
+
where <code>wait</code> represents the opcode <code>0001</code> and the number <code>0</code> is an argument passed with the instruction. Different compilers may use different words to represent each opcode. When the code is compiled, the instruction is converted to raw bytes:
  
  {{hint|0100|opcode}}&nbsp;{{hint|04|Data type}}&nbsp;{{hint|00|Parameter value}}
+
  {{hint|01 00|Opcode}}&nbsp;{{hint|04|Data type}}&nbsp;{{hint|00|Argument}}
  
* First part is the opcode number in a [[Wikipedia:Endianness|little-endian]] format.
+
* The first two bytes (<code>01 00</code> hex) are the opcode bytes in [[Wikipedia:Endianness|little-endian]] order.
* Second part is the [[#Data types|data type]]
+
* The third byte (<code>04</code> hex) is the code for the [[#Data types|data type]].{{Ref|05B6|[*]}} <code>0x4</code> is the code for a signed byte.
* Third part is the parameter value
+
* The final byte (<code>00</code> hex) is the argument value. This is often more than one byte: the size is determined by the data type.
  
As it has been said, an opcode is UINT16 number. It means the minimum opcode is [[0000]] and maximum opcode is 0xFFFF. However due to a specific of the [[SCM language]], any numbers above 0x7FFF denote negative [[Conditional statement|conditional opcodes]]. The original unmodded game supports a way smaller amount of opcodes (maximum [[0A4E]] for [[San Andreas]]), but there are tools adding new ones, most notably [[CLEO|CLEO Library]].
+
While the opcode bytes are always present, there may not always be arguments passed, so there may be no argument bytes.
  
There could be zero or more instruction parameters following the opcode number{{Ref|05B6|[*]}}.
+
Although an unsigned 16-bit integer (<code>uint16</code>) can hold values in the range <code>0x0000</code> - <code>0xFFFF</code>, the range of used opcodes in unmodded games is much smaller. For example, [[San Andreas]] has opcodes going up to <code>0x0A4E</code>. There are tools that add more instructions to the game, most notably the [[CLEO|CLEO Library]].  
  
=== Parameters ===
+
Any values over the ''signed'' 16-bit integer maximum value (<code>0x7FFF</code>) denote negative [[Conditional statement|conditional opcodes]], which suggests that the opcodes are not always treated as unsigned integers by the game.
The game engine knows amount of parameters for each instruction (1 for [[0001]], 2 for [[0004]], 13 for [[014B]], etc). If the script contains other number of parameters it causes a crash.  
 
  
The value of a parameter could be one of following types:
+
<!--
 +
Changed "Parameters" to "Arguments" because arguments are the values passed, and parameters are the declared variables to which the arguments are assigned.
 +
By this rule, the following section discusses arguments more than it does parameters.
 +
-->
 +
== Arguments ==
 +
Each instruction takes a certain number of arguments. If a script passes an incorrect number of arguments, the game will crash.
 +
 
 +
An argument could be one of the following types:
 
* Immediate values
 
* Immediate values
** integer numbers
+
** Integer (<code>1</code>, <code>324</code>)
** floating-point numbers
+
** Float (<code>0.43</code>, <code>672.0</code>)
** [[#Strings|fixed-length strings]]
+
** Strings (<code>"abc"</code>, <code>""</code>)
** {{icon|sa}} [[#Strings|variable-length strings]]
+
*** [[#Strings|Fixed-length strings]]
 +
*** [[#Strings|Variable-length strings]] {{icon|sa}}
 
* Variables
 
* Variables
** global variables
+
** Global variables
** local variables
+
** Local variables
* {{icon|sa}} {{icon|lcs}} {{icon|vcs}} [[#Arrays|arrays]]
+
* [[#Arrays|Arrays]] {{icon|sa}} {{icon|lcs}} {{icon|vcs}}
  
A concrete type of the value is determined by a single byte written before it{{ref|vcstr|[*]}}. This byte is called a data type. The purpose of it is to tell to the game how much bytes to read and how to treat it.
+
=== Concrete data types ===
 +
Each of the types listed above can be represented in a number of ways in compiled code for various reasons:
 +
* Integer values can be signed or unsigned, and there are also different sizes of integer that the game can use.
 +
* There are multiple types of string that allow different numbers of characters.
 +
* Variables use different type codes based on the type of value that they are referencing.
  
==== Data types ====
+
The concrete type of an argument is determined by a single byte before the value bytes{{ref|vcstr|[*]}}. This byte tells the game what value is coming next so that it knows how many bytes to read, and how to treat the value once read.
A '''[[wikipedia:Data type|data type]]''' is a classification of identifying the type of a value. Types commonly used in GTA include integer numbers, floating-point numbers, and strings. The amount of data allowed to be stored is limited by the type and size of the data. For integers there are two ways to represent the data, signed and unsigned, whereas floating-point values are always signed. A signed data range includes negative numbers while unsigned do not include negatives. The following list shows the types and sizes of data in bytes.
 
  
 
{|class="mw-collapsible mw-collapsed wikitable"
 
{|class="mw-collapsible mw-collapsed wikitable"
!Data type<br/>(hex)
+
!Type code<br/>(hex)
!Arg.<br/>length
+
!Value length<br/>(bytes)
!Target<br/>game
+
!Support
 
!Description&nbsp;
 
!Description&nbsp;
 
|-
 
|-
!colspan=6|Typified
 
 
|-valign="top"
 
|-valign="top"
 
|00||0||{{icon|3}} {{icon|vc}} {{icon|sa}}||End of argument list (EOAL, [[004F]] or [[0913]] and similar){{ref|partype0|[*]}}
 
|00||0||{{icon|3}} {{icon|vc}} {{icon|sa}}||End of argument list (EOAL, [[004F]] or [[0913]] and similar){{ref|partype0|[*]}}
Line 148: Line 157:
 
m_pScriptPC += 6;</source>
 
m_pScriptPC += 6;</source>
 
|-
 
|-
!colspan=6|Untypified
+
!colspan=6|No type code
 
|-valign="top"
 
|-valign="top"
 
|N/A||8||{{icon|3}} {{icon|vc}}||Immediate 8-byte string{{ref|vcstr|[*]}}
 
|N/A||8||{{icon|3}} {{icon|vc}}||Immediate 8-byte string{{ref|vcstr|[*]}}
Line 159: Line 168:
 
|}
 
|}
  
Depending of the preceeding data type, the parameter value compiled in two bytes <code>{{hint|02 00|Little-endian order}}</code>, could be treated either as the global variable <code style="color:blue">$2</code>, or a local variable <code style="color:blue">2@</code> or a number of <code style="color:maroon">2</code>. The data type allows the game to determine the correct parameter meaning.
+
Type codes for [[Liberty City Stories]] and [[Vice City Stories]] are very different from other games:
 
+
* In some instances, the type code itself denotes the argument value. For example:
Data types for [[Liberty City Stories]] and [[Vice City Stories]] are much different. First of all, many data types itself denote an immediate value. For example, data type 01 is a value of 0, data type 02 the value 0.0, etc. Floating-point values are packed (1, 2 or 3 bytes of length instead of the [[Wikipedia:IEEE_754-1985#Single-precision_32-bit|common 4]]). Some data types itself are somewhat the identifier of a variable.
+
** <code>0x1</code> represents the integer value 0
 +
** <code>0x2</code> represents 0.0
 +
* The type code can sometimes denote a variable.
 +
* Floating point values are packed (8, 16 or 24 bits as opposed to the more common [[Wikipedia:IEEE_754-1985#Single-precision_32-bit|32 bits]]).
  
 
{|class="mw-collapsible mw-collapsed wikitable"
 
{|class="mw-collapsible mw-collapsed wikitable"
!Data type<br/>(hex)
+
!Type code<br/>(hex)
!Arg.<br/>length
+
!Value length<br/>(bytes)
!Target<br/>game
+
!Support
 
!Description&nbsp;
 
!Description&nbsp;
 
|-
 
|-
Line 270: Line 282:
 
{{Incomplete}}
 
{{Incomplete}}
  
==== Integer numbers ====
+
=== Integer numbers ===
  
An '''[[wikipedia:Integer (computer science)|integer]]''' is a number without a decimal or fractional component.
+
An [[wikipedia:Integer (computer science)|integer]] is a number without a decimal or fractional component.
  
 
{| class="wikitable"
 
{| class="wikitable"
!rowspan="2"| Size<br>({{hint|bytes|Each byte is 4 bits}}) ||colspan="6"| Range
+
!rowspan="2"| Size<br>(bytes) ||colspan="6"| Range
 
|-
 
|-
 
! || Signed || Name || || Unsigned || Name
 
! || Signed || Name || || Unsigned || Name
Line 286: Line 298:
 
|}
 
|}
  
==== Floating-point numbers ====
+
=== Floating-point numbers ===
A '''[[wikipedia:Floating point|floating point]]''' is a number with a decimal component and can store extremely large or small numbers while sacrificing significant digits. This is achieved by internally using exponents in scientific notation.
+
A [[wikipedia:Floating point|floating point]] number is a number with a decimal component.
  
 
{| class="wikitable"
 
{| class="wikitable"
! Size<br>({{hint|bytes|Each byte is 4 bits}}) || Range || Name
+
! Size<br>(bytes) || Range || Name
 
|-
 
|-
 
| 4 || ±1.1754944×10<sup>-38</sup> to ±3.4028234×10<sup>38</sup> || [[wikipedia:Single precision floating-point format|SINGLE]], FLOAT
 
| 4 || ±1.1754944×10<sup>-38</sup> to ±3.4028234×10<sup>38</sup> || [[wikipedia:Single precision floating-point format|SINGLE]], FLOAT
 
|}
 
|}
  
==== Strings ====
+
=== Strings ===
A ''string'' is a sequence of characters not treated as numbers. Those include {{hint|letters|A..Z}}, {{hint|numbers|0..9}}, and other symbols like _ or @. Strings can start with any character (spaces included).
+
A [[wikipedia:String (computer science)|string]] is a piece of text. Strings can include letters, numbers and symbols.
  
There are two kinds of strings.
+
There are two kinds of string:
 +
* Fixed-length. This is the most common type of string and has been used since [[GTA 3]]. The string length is fixed. When compiled these strings occupy a certain number of bytes (8 or 16) even if the text is actually shorter (any unused bytes are filled with [[Wikipedia:null terminator|null values]]).
 +
* Variable length (SA only). Variable-length strings are encoded as a single byte specifying the length followed by the string character bytes. These strings are not null-terminated. The maximum length depends on the instruction{{Ref|longstringslimits|[*]}}. The longest in the original game is 40 characters.
  
{{note|str8}} A '''fixed-length string''' or a null-terminated string. This is the most common type and has been used since [[GTA 3]]. The string length is fixed. When compiled these strings occupy 8 bytes of a [[SCM]] file even if they are actually shorter (the rest is filled with [[Wikipedia:null terminator|null bytes]]).
 
 
{{note|sa16}} San Andreas introduces data type 15 for strings containing up to 15 characters. They behave the same as 8 byte strings, but always occupy 16 bytes in a SCM file. These strings are only supported by [[Sanny Builder]].
 
 
{|{{Prettytable}}
 
{|{{Prettytable}}
!width="250px" align="left"|String
+
!width="250px" align="left"|String value
!width="250px" align="left"|Equivalent in SCM
+
!width="250px" align="left"|Compiled bytes
 +
|-
 +
| <code>"MAIN"</code>||{{hint|09|Type code}}&nbsp;&nbsp;&nbsp;{{hint|4D 41 49 4E 00 00 00 00|ASCII values of string characters}}
 
|-
 
|-
| <span style="color:red">'MAIN'</span>||{{hint|09|Data type}}&nbsp;&nbsp;&nbsp;{{hint|4D 41 49 4E 00 00 00 00|ASCII code of string characters}}
+
| <code>"MODDING"</code>||{{hint|09|Type code}}&nbsp;&nbsp;&nbsp;{{hint|4D 4F 44 44 49 4E 47 00|ASCII values of string characters}}
 
|-
 
|-
| <span style="color:red">'MODDING'</span>||{{hint|09|Data type}}&nbsp;&nbsp;&nbsp;{{hint|4D 4F 44 44 49 4E 47 00|ASCII code of string characters}}
+
| <code>"SAVE_OUR_SOULS!"</code>||{{hint|0F|Type code}}&nbsp;&nbsp;&nbsp;{{hint|53 41 56 45 5F 4F 55 52 5F 53 4F 55 4C 53 21 00|ASCII values of string characters}}
 
|-
 
|-
| <span style="color:red">'SAVE_OUR_SOULS!'</span>||{{hint|0F|Data type}}&nbsp;&nbsp;&nbsp;{{hint|53 41 56 45 5F 4F 55 52 5F 53 4F 55 4C 53 21 00|ASCII code of string characters}}
+
| <code>"Variable length string"</code>||{{hint|0E|Type code}}&nbsp;&nbsp;&nbsp;{{hint|16|Length (22 decimal)}}&nbsp;&nbsp;&nbsp;{{hint|56 61 72 69 61 62 6C 65 20 6C 65 6E 67 74 68 20 73 74 72 69 6E 67|ASCII values of string characters}}
 
|}
 
|}
{{note|str16}} A '''variable-length string'''. This type was first introduced in San Andreas. Maximum length depends on the instruction{{Ref|longstringslimits|[*]}} (the longest parameter ever read has got 40 characters).
 
 
{{Incomplete}}
 
 
==== Arrays ====
 
{{note|varray}} Native ''[[Wikipedia:Array|arrays]]'' support was introduced in GTA SA, however there were different implementations of [[VC_Arrays|arrays in Vice City]]. In SA SCM arrays are assembled as 2 ''UINT16''s, 1 ''INT8'' and a ''UINT8'':
 
2b - UINT16 - array offset{{ref|arrayoffset|[*]}}
 
2b - UINT16 - array index{{ref|arrayindex|[*]}}
 
1b - INT8  - array size
 
1b - UINT8  - array properties
 
 
{{note|arrayoffset}} An array offset is basically a variable number. If it's a global array, the offset is a global variable index from which the array begins. For example, if the global array offset is 150 (<code>{{hint|96 00|Little-endian order}}</code>) it means that the first element of the array is <span style="color:blue">$150</span>, the second one is <span style="color:blue">$151</span>, etc. Same valid for the local arrays (offset is a local variable index). <!-- global variables are multiplied by 4 -->
 
  
{{note|arrayindex}} An array index is a variable number (global or local one) that holds the value of array index. For example, if array index is 3 (<code>{{hint|03 00|Little-endian order}}</code>), the game will read either global variable <span style="color:blue">$3</span> or local variable <span style="color:blue">3@</span> depending on the array properties (see below). This variable holds the number which is array element ID to work with. For example, if the array index is <span style="color:blue">$3</span>, and <span style="color:blue">$3</span> holds number <span style="color:maroon">5</span>, the game will read 5th element of the array.
+
=== Arrays ===
 +
In GTA SA, an array is a static reference to a group of successive variables. Although there is no actual representation of an array ''value'' as a concrete type, array accesses do have type codes. When an "array" is passed as an argument to an instruction, what is actually passed is a specific variable in that array.
  
===== Properties =====
+
This section describes the format of array accesses in GTA SA. Vice City also has arrays, but [[VC_Arrays|in a different format]].
Array properties describe the data type of each array element, held by the first 7 bits of the reference field, plus a flag which signals if the array was declared in a global scope, as the most significant bit indicates:
 
  
<source lang="cpp" style="margin-top: 8px">enum eArrayElementType
+
<source lang="cpp">
{
+
// Example: not real code.
ELEMTYPE_INT,
+
// Applies only to GTA: SA.
ELEMTYPE_FLOAT,
+
struct ArrayAccess {
ELEMTYPE_TEXT_LABEL,
+
    enum ElementType : uint8_t {
ELEMTYPE_TEXT_LABEL16
+
        Int,
 +
        Float,
 +
        String8,
 +
        String16
 +
    };
 +
   
 +
    // Offset of first variable in the array.
 +
    uint16_t startOffset;
 +
   
 +
    // Index being accessed ("array[index]").
 +
    // This can be a local variable ("array[someLocalVar]") or a global variable ("array[someGlobalVar]").
 +
    uint16_t index;
 +
   
 +
    // Array length.
 +
    int8_t length;
 +
   
 +
    // Array element type. 7 bits.
 +
    ElementType type : 7;
 +
   
 +
    // Determines whether the index is a global variable (true) or a local variable (false).
 +
    // Only 1 bit.
 +
    bool globalIndex : 1;
 
};
 
};
 +
</source>
  
struct ArrayProperties
+
The array index is held in a variable whose value should be greater than or equal to 0 and less than the array length. Either a global or a local variable can serve as the index variable; the <code>globalIndex</code> field tells the game which kind was used. The first element in the array is at index <code>0</code>, and the last is at <code>length - 1</code>.
{
 
unsigned char m_nElementType : 7;
 
unsigned char m_bIsIndexGlobalVar : 1;
 
};</source>
 
  
 
{|{{Prettytable}}
 
{|{{Prettytable}}
Line 355: Line 374:
  
 
== Notes ==
 
== Notes ==
{{note|vcstr}} In [[GTA 3]], [[Vice City]] and [[Liberty City Stories]] short strings (8 bytes) have no data type preceeding it. If the byte does not fit data type range (00-06 for GTA 3 and VC), it's recognized as a beginning of a string and next 8 bytes are read.
+
{{note|vcstr}} In [[GTA 3]], [[Vice City]] and [[Liberty City Stories]], short strings (8 characters) have no type code. If the first byte of an argument does not fall within the type code range (<code>0x0</code> - <code>0x6</code> for GTA 3 and VC), it is treated as the first character of a string and the remaining 7 bytes of the string are read.
  
{{note|partype0}} Some instructions have variable amount of parameters. The most known one is [[004F]] that creates a new [[thread]] and passes arguments to it. The number of such parameters could vary, so the special data type denotes the end of parameters.
+
{{note|partype0}} Some instructions have a variable number of parameters. One such instruction is [[004F]], which creates a new [[thread]]. To allow extra script setup, the instruction can take a variable number of arguments. The game uses the special type code <code>00</code> to mark the end of such argument lists.
  
The maximum amount of parameters for any instruction is 16 for GTA 3 and VC, 32 for SA, LCS and VCS. However, those that admit an undefined amount of arguments can pass 18 parameters for GTA 3 and VC, 34 for SA, 106 for LCS and VCS (this information still needs confirmation).
+
For non-variadic instructions, the maximum number of parameters is 16 for GTA 3 and VC and 32 for SA, LCS and VCS. Variadic instructions can be passed a maximum of 18 arguments in GTA 3 and VC; 34 for SA; and 106 for LCS and VCS, though these values are unconfirmed.
  
{{note|05B6}} {{Icon|SA}} Opcode [[05B6]] is a special instruction that defines a table. Immediately after opcode number the stream of data (128 bytes) follows, without a data type.
+
{{note|05B6}} In {{Icon|SA}}, [[05B6]] is a special instruction that defines a table. Immediately after the opcode, a 128-byte stream of data follows, without a type code.
  
 
{{note|longstringslimits}} {{GTAF|post|261006|3940262|Post by Seemann describing limits for the long strings in SA}}
 
{{note|longstringslimits}} {{GTAF|post|261006|3940262|Post by Seemann describing limits for the long strings in SA}}

Revision as of 11:42, 18 August 2020

This article deals with the technical information on the SCM instruction format. For a list of opcodes along with descriptions, see list of opcodes.

Each script instruction is represented by a number called an opcode (short for "operation code"), which is a 16-bit unsigned integer. The opcode tells the game which operation to perform. For example, the opcode 0001 tells the game to wait for a certain amount of time, 0003 shakes the game camera and 0053 creates a player.

Instruction Format

Initially the instructions are written as text. For example, a wait instruction could look like this:

wait 0

where wait represents the opcode 0001 and the number 0 is an argument passed with the instruction. Different compilers may use different words to represent each opcode. When the code is compiled, the instruction is converted to raw bytes:

01 00 04 00
  • The first two bytes (01 00 hex) are the opcode bytes in little-endian order.
  • The third byte (04 hex) is the code for the data type.[*] 0x4 is the code for a signed byte.
  • The final byte (00 hex) is the argument value. This is often more than one byte: the size is determined by the data type.

While the opcode bytes are always present, there may not always be arguments passed, so there may be no argument bytes.
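The byte layout described above can be illustrated with a minimal sketch. The names `DecodedWait`, `readU16LE` and `decodeWait` are hypothetical and are not taken from the game or from any compiler; the sketch only handles the single case shown above (one argument with type code `04`).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical holder for an instruction with one 8-bit immediate argument.
struct DecodedWait {
    std::uint16_t opcode;   // e.g. 0x0001
    std::uint8_t typeCode;  // e.g. 0x04 (immediate 8-bit signed int)
    std::int8_t argument;   // e.g. 0
};

// Reads a 16-bit little-endian value: low byte first, high byte second.
std::uint16_t readU16LE(const std::vector<std::uint8_t>& bytes, std::size_t pos) {
    return static_cast<std::uint16_t>(bytes[pos] | (bytes[pos + 1] << 8));
}

// Decodes "opcode, type code, 1-byte value". Only meaningful for type code 0x04;
// the caller must supply at least four bytes.
DecodedWait decodeWait(const std::vector<std::uint8_t>& bytes) {
    DecodedWait out{};
    out.opcode = readU16LE(bytes, 0);
    out.typeCode = bytes[2];
    out.argument = static_cast<std::int8_t>(bytes[3]);
    return out;
}
```

Passing the four bytes `01 00 04 00` to `decodeWait` yields opcode `0x0001`, type code `0x04` and argument `0`.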

Although an unsigned 16-bit integer (uint16) can hold values in the range 0x0000 - 0xFFFF, the range of used opcodes in unmodded games is much smaller. For example, San Andreas has opcodes going up to 0x0A4E. There are tools that add more instructions to the game, most notably the CLEO Library.

Any values over the signed 16-bit integer maximum value (0x7FFF) denote negative conditional opcodes, which suggests that the opcodes are not always treated as unsigned integers by the game.
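One way to read the statement above is that the top bit of the 16-bit value acts as a "negated" flag and the remaining 15 bits identify the opcode. The sketch below works under that assumption; `OpcodeParts` and `splitOpcode` are hypothetical names, and the game's actual handling may differ.

```cpp
#include <cstdint>

// Hypothetical split of a raw 16-bit opcode value.
struct OpcodeParts {
    std::uint16_t id;  // opcode number with the top bit cleared
    bool negated;      // true when the raw value is above 0x7FFF
};

// Assumption: bit 15 marks a negated condition, bits 0-14 hold the opcode number.
OpcodeParts splitOpcode(std::uint16_t raw) {
    return OpcodeParts{static_cast<std::uint16_t>(raw & 0x7FFF), (raw & 0x8000) != 0};
}
```

Under this reading, a raw value of `0x8038` would be the negated form of opcode `0038`.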

Arguments

Each instruction takes a certain number of arguments. If a script passes an incorrect number of arguments, the game will crash.

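An argument could be one of the following types:

  • Immediate values
    • Integer (1, 324)
    • Float (0.43, 672.0)
    • Strings ("abc", "")
      • Fixed-length strings
      • Variable-length strings (San Andreas)
  • Variables
    • Global variables
    • Local variables
  • Arrays (San Andreas, Liberty City Stories, Vice City Stories)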

Concrete data types

Each of the types listed above can be represented in a number of ways in compiled code for various reasons:

  • Integer values can be signed or unsigned, and there are also different sizes of integer that the game can use.
  • There are multiple types of string that allow different numbers of characters.
  • Variables use different type codes based on the type of value that they are referencing.

The concrete type of an argument is determined by a single byte before the value bytes[*]. This byte tells the game what value is coming next so that it knows how many bytes to read, and how to treat the value once read.

Type code (hex) | Value length (bytes) | Support | Description
00 0 GTA III Vice City San Andreas End of argument list (EOAL, 004F or 0913 and similar)[*]
01 4 GTA III Vice City San Andreas Immediate 32-bit signed int
scriptParam.m_iIntValue = *(int *)m_pScriptPC;
m_pScriptPC += 4;
02 2 GTA III Vice City San Andreas Global integer/floating-point variable
scriptParam.m_usGlobalOffset = *(unsigned short *)m_pScriptPC;
m_pScriptPC += 2;
03 2 GTA III Vice City San Andreas Local integer/floating-point variable
scriptParam.m_sLocalVar = *(short *)m_pScriptPC;
m_pScriptPC += 2;
04 1 GTA III Vice City San Andreas Immediate 8-bit signed int
scriptParam.m_iIntValue = *(char *)m_pScriptPC++;
05 2 GTA III Vice City San Andreas Immediate 16-bit signed int
scriptParam.m_iIntValue = *(short *)m_pScriptPC;
m_pScriptPC += 2;
06 2 GTA III Immediate 16-bit fixed-point (see remark)
scriptParam.m_fFloatValue = (float)(*(short *)m_pScriptPC) / 16.0f;
m_pScriptPC += 2;
06 4 Vice City San Andreas Immediate 32-bit floating-point
scriptParam.m_fFloatValue = *(float *)m_pScriptPC;
m_pScriptPC += 4;
07 6 San Andreas Global integer/floating-point array[*]
scriptParam.m_usGlobalOffset = *(unsigned short *)m_pScriptPC;
scriptParam.m_sArrayIndexVar = *(short *)(m_pScriptPC + 2);
scriptParam.m_ucArraySize = *(unsigned char *)(m_pScriptPC + 4);
scriptParam.m_arrayProperties = *(ArrayProperties *)(m_pScriptPC + 5);
m_pScriptPC += 6;
08 6 San Andreas Local integer/floating-point array[*]
scriptParam.m_sLocalVar = *(short *)m_pScriptPC;
scriptParam.m_sArrayIndexVar = *(short *)(m_pScriptPC + 2);
scriptParam.m_ucArraySize = *(unsigned char *)(m_pScriptPC + 4);
scriptParam.m_arrayProperties = *(ArrayProperties *)(m_pScriptPC + 5);
m_pScriptPC += 6;
09 8 San Andreas Immediate 8-byte string[*]
strcpy(scriptParam.m_szTextLabel, (char *)m_pScriptPC);
m_pScriptPC += 8;
0A 2 San Andreas Global 8-byte string variable
scriptParam.m_usGlobalOffset = *(unsigned short *)m_pScriptPC;
m_pScriptPC += 2;
0B 2 San Andreas Local 8-byte string variable
scriptParam.m_sLocalVar = *(short *)m_pScriptPC;
m_pScriptPC += 2;
0C 6 San Andreas Global 8-byte string array[*]
scriptParam.m_usGlobalOffset = *(unsigned short *)m_pScriptPC;
scriptParam.m_sArrayIndexVar = *(short *)(m_pScriptPC + 2);
scriptParam.m_ucArraySize = *(unsigned char *)(m_pScriptPC + 4);
scriptParam.m_arrayProperties = *(ArrayProperties *)(m_pScriptPC + 5);
m_pScriptPC += 6;
0D 6 San Andreas Local 8-byte string array[*]
scriptParam.m_sLocalVar = *(short *)m_pScriptPC;
scriptParam.m_sArrayIndexVar = *(short *)(m_pScriptPC + 2);
scriptParam.m_ucArraySize = *(unsigned char *)(m_pScriptPC + 4);
scriptParam.m_arrayProperties = *(ArrayProperties *)(m_pScriptPC + 5);
m_pScriptPC += 6;
0E 1 + (n - 1) San Andreas Immediate variable-length string[*] (non null-terminated)
char cStrLength = *(char *)m_pScriptPC++;
strncpy(scriptParam.m_szTextLabel, (char *)m_pScriptPC, cStrLength);
memset(&scriptParam.m_szTextLabel[cStrLength], '\0', ucMaxLength - cStrLength);
m_pScriptPC += cStrLength;
0F 16 San Andreas Immediate 16-byte string[*]
strcpy(scriptParam.m_szTextLabel, (char *)m_pScriptPC);
m_pScriptPC += 16;
10 2 San Andreas Global 16-byte string variable
scriptParam.m_usGlobalOffset = *(unsigned short *)m_pScriptPC;
m_pScriptPC += 2;
11 2 San Andreas Local 16-byte string variable
scriptParam.m_sLocalVar = *(short *)m_pScriptPC;
m_pScriptPC += 2;
12 6 San Andreas Global 16-byte string array[*]
scriptParam.m_usGlobalOffset = *(unsigned short *)m_pScriptPC;
scriptParam.m_sArrayIndexVar = *(short *)(m_pScriptPC + 2);
scriptParam.m_ucArraySize = *(unsigned char *)(m_pScriptPC + 4);
scriptParam.m_arrayProperties = *(ArrayProperties *)(m_pScriptPC + 5);
m_pScriptPC += 6;
13 6 San Andreas Local 16-byte string array[*]
scriptParam.m_sLocalVar = *(short *)m_pScriptPC;
scriptParam.m_sArrayIndexVar = *(short *)(m_pScriptPC + 2);
scriptParam.m_ucArraySize = *(unsigned char *)(m_pScriptPC + 4);
scriptParam.m_arrayProperties = *(ArrayProperties *)(m_pScriptPC + 5);
m_pScriptPC += 6;
No type code
N/A 8 GTA III Vice City Immediate 8-byte string[*]
strcpy(scriptParam.m_szTextLabel, (char *)m_pScriptPC);
m_pScriptPC += 8;
N/A 128 San Andreas Immediate 128-byte string
strcpy(scriptParam.m_szString, (char *)m_pScriptPC);
m_pScriptPC += 128;
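The "Value length" column of the table above can be summarised as a lookup. This is a minimal sketch for the San Andreas type codes only; `saValueLength` is a hypothetical helper, and the `-1` sentinel is an arbitrary choice for codes whose length is not fixed or not listed.

```cpp
#include <cstdint>

// Returns the number of value bytes that follow a San Andreas type code,
// as listed in the table above. Returns -1 for 0x0E (the length is given by
// the first value byte) and for codes that are not in the table.
int saValueLength(std::uint8_t typeCode) {
    switch (typeCode) {
        case 0x00: return 0;              // end of argument list
        case 0x01: return 4;              // immediate 32-bit int
        case 0x02: case 0x03: return 2;   // global / local variable
        case 0x04: return 1;              // immediate 8-bit int
        case 0x05: return 2;              // immediate 16-bit int
        case 0x06: return 4;              // immediate 32-bit float (VC and SA)
        case 0x07: case 0x08: return 6;   // global / local array
        case 0x09: return 8;              // immediate 8-byte string
        case 0x0A: case 0x0B: return 2;   // 8-byte string variable
        case 0x0C: case 0x0D: return 6;   // 8-byte string array
        case 0x0F: return 16;             // immediate 16-byte string
        case 0x10: case 0x11: return 2;   // 16-byte string variable
        case 0x12: case 0x13: return 6;   // 16-byte string array
        default: return -1;
    }
}
```

A decoder could use such a function to skip over an argument it does not need to interpret.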

Type codes for Liberty City Stories and Vice City Stories are very different from other games:

  • In some instances, the type code itself denotes the argument value. For example:
    • 0x1 represents the integer value 0
    • 0x2 represents 0.0
  • The type code can sometimes denote a variable.
  • Floating point values are packed (8, 16 or 24 bits as opposed to the more common 32 bits).
Type code (hex) | Value length (bytes) | Support | Description
Typified
00 0 Liberty City Stories Vice City Stories End of argument list (EOAL)
01 0 Liberty City Stories Vice City Stories Immediate 8-bit signed integer constant 0
scriptParam.m_iIntValue = 0;
02 0 Liberty City Stories Vice City Stories Immediate 8-bit floating-point constant 0.0
scriptParam.m_fFloatValue = 0.0f;
03 1 Liberty City Stories Vice City Stories Immediate 8-bit packed floating-point
unsigned int uiUnpackedFloat = *(unsigned char *)m_pScriptPC++ << 24;
scriptParam.m_fFloatValue = *(float *)&uiUnpackedFloat;
04 2 Liberty City Stories Vice City Stories Immediate 16-bit packed floating-point
unsigned int uiUnpackedFloat = *(unsigned short *)m_pScriptPC << 16;
scriptParam.m_fFloatValue = *(float *)&uiUnpackedFloat;
m_pScriptPC += 2;
05 3 Liberty City Stories Vice City Stories Immediate 24-bit packed floating-point
unsigned int uiUnpackedFloat
    = (*(unsigned short *)m_pScriptPC << 16)
    | (*(unsigned char *)(m_pScriptPC + 2) << 8);
scriptParam.m_fFloatValue = *(float *)&uiUnpackedFloat;
m_pScriptPC += 3;
06 4 Liberty City Stories Vice City Stories Immediate 32-bit signed integer
scriptParam.m_iIntValue = *(int *)m_pScriptPC;
m_pScriptPC += 4;
07 1 Liberty City Stories Vice City Stories Immediate 8-bit signed integer
scriptParam.m_iIntValue = *(char *)m_pScriptPC++;
08 2 Liberty City Stories Vice City Stories Immediate 16-bit signed integer
scriptParam.m_iIntValue = *(short *)m_pScriptPC;
m_pScriptPC += 2;
09 4 Liberty City Stories Vice City Stories Immediate 32-bit floating-point
scriptParam.m_fFloatValue = *(float *)m_pScriptPC;
m_pScriptPC += 4;
0A n + NUL Vice City Stories Immediate null-terminated string[*]
strcpy(scriptParam.m_szTextLabel, (char *)m_pScriptPC);
m_pScriptPC += strlen((char *)m_pScriptPC) + 1;
Untypified (script variables)
0A..0B 1 Liberty City Stories Local timers (TIMERA, TIMERB)
scriptParam.m_sLocalVar = *(unsigned char *)m_pScriptPC++ + 0x5E;
0B..0C 1 Vice City Stories Local timers (TIMERA, TIMERB)
scriptParam.m_sLocalVar = *(unsigned char *)m_pScriptPC++ + 0x5D;
0C..6B 1 Liberty City Stories Local integer/floating-point variable
scriptParam.m_sLocalVar = *(unsigned char *)m_pScriptPC++ - 0x0C;
0D..6C 1 Vice City Stories Local integer/floating-point variable
scriptParam.m_sLocalVar = *(unsigned char *)m_pScriptPC++ - 0x0D;
6C..CB 3 Liberty City Stories Local integer/floating-point array
scriptParam.m_sLocalVar = *(unsigned char *)m_pScriptPC - 0x6C;
scriptParam.m_sArrayIndex = *(unsigned char *)(m_pScriptPC + 2);
scriptParam.m_ucArraySize = *(unsigned char *)(m_pScriptPC + 3);
m_pScriptPC += 3;
6D..CC 3 Vice City Stories Local integer/floating-point array
scriptParam.m_sLocalVar = *(unsigned char *)m_pScriptPC - 0x6D;
scriptParam.m_sArrayIndex = *(unsigned char *)(m_pScriptPC + 2);
scriptParam.m_ucArraySize = *(unsigned char *)(m_pScriptPC + 3);
m_pScriptPC += 3;
CC..E5 2 Liberty City Stories Global integer/floating-point variable[*]
unsigned short usBigEndianWord = *(unsigned short *)m_pScriptPC - 0x00CC;
scriptParam.m_sGlobalVar = (short)((usBigEndianWord << 8) | (usBigEndianWord >> 8));
m_pScriptPC += 2;
CD..E5 2 Vice City Stories Global integer/floating-point variable[*]
unsigned short usBigEndianWord = *(unsigned short *)m_pScriptPC - 0x00CD;
scriptParam.m_sGlobalVar = (short)((usBigEndianWord << 8) | (usBigEndianWord >> 8));
m_pScriptPC += 2;
E6..FF 4 Liberty City Stories Vice City Stories Global integer/floating-point array
unsigned short usBigEndianWord = *(unsigned short *)m_pScriptPC - 0x00E6;
scriptParam.m_sGlobalVar = (short)((usBigEndianWord << 8) | (usBigEndianWord >> 8));
scriptParam.m_sArrayIndex = *(unsigned char *)(m_pScriptPC + 2);
scriptParam.m_ucArraySize = *(unsigned char *)(m_pScriptPC + 3);
m_pScriptPC += 4;
N/A 8 Liberty City Stories Immediate 8-byte string
strcpy(scriptParam.m_szTextLabel, (char *)m_pScriptPC);
m_pScriptPC += 8;

^ This type was introduced in VCS due to the presence of string variables.

^ Given the data type range limit the largest global variable in LCS is $6655, in VCS $6399.
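The limits in the note above follow from the decoding shown in the table: the type byte, minus the first code of the global variable range, becomes the high byte of the variable number, and the following byte becomes the low byte. A minimal sketch, taking the table at face value (`decodeGlobalVar` is a hypothetical name):

```cpp
#include <cstdint>

// firstTypeCode is the start of the global variable range:
// 0xCC for Liberty City Stories, 0xCD for Vice City Stories.
// typeByte must lie within that game's range (up to 0xE5).
int decodeGlobalVar(std::uint8_t typeByte, std::uint8_t nextByte, std::uint8_t firstTypeCode) {
    int high = typeByte - firstTypeCode;  // 0..25 for LCS, 0..24 for VCS
    return (high << 8) | nextByte;
}
```

With the largest type byte `E5` and a following byte of `FF`, this gives (0xE5 - 0xCC) × 256 + 255 = 6655 for LCS and (0xE5 - 0xCD) × 256 + 255 = 6399 for VCS.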

None of the data types above have been tested in a decompiling process yet, so they still need practical confirmation.
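Taking the table's pseudocode at face value, the 8-bit and 16-bit packed floats keep only the most significant bytes of a 32-bit float and treat the missing low bytes as zero. The sketch below assumes `float` is a 32-bit IEEE 754 value; the function names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Reinterprets a 32-bit pattern as a float without violating aliasing rules.
float bitsToFloat(std::uint32_t bits) {
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// 8-bit packed float: the single byte is the top byte of the 32-bit pattern.
float unpackFloat8(std::uint8_t packed) {
    return bitsToFloat(static_cast<std::uint32_t>(packed) << 24);
}

// 16-bit packed float: the 16-bit value is the top half of the 32-bit pattern.
float unpackFloat16(std::uint16_t packed) {
    return bitsToFloat(static_cast<std::uint32_t>(packed) << 16);
}
```

For example, the 16-bit value `0x3F80` expands to the bit pattern `0x3F800000`, which is 1.0, and the 8-bit value `0x40` expands to `0x40000000`, which is 2.0.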

This section is incomplete. You can help by fixing and expanding it.

Integer numbers

An integer is a number without a decimal or fractional component.

Size (bytes) | Signed range | Signed name | Unsigned range | Unsigned name
1 | -128 to 127 | INT8, CHAR | 0 to 255 | UINT8, BYTE
2 | -32,768 to 32,767 | INT16, SHORT | 0 to 65,535 | UINT16, WORD, USHORT
4 | -2,147,483,648 to 2,147,483,647 | INT32, LONG | 0 to 4,294,967,295 | UINT32, DWORD, ULONG

Floating-point numbers

A floating-point number is a number with a decimal component.

Size (bytes) | Range | Name
4 | ±1.1754944×10^-38 to ±3.4028234×10^38 | SINGLE, FLOAT
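The ranges in the integer and floating-point tables above can be checked against the C++ standard library. The sketch below assumes a platform where float is a 4-byte IEEE 754 single-precision type, which holds for the common desktop compilers. The function names are hypothetical. Note that the lower bound in the floating-point table is the smallest normalized value; subnormal values closer to zero are not covered here.

```cpp
#include <cstdint>
#include <limits>

// Compile-time checks of the integer ranges listed in the table.
static_assert(std::numeric_limits<std::int8_t>::min() == -128, "INT8 minimum");
static_assert(std::numeric_limits<std::int8_t>::max() == 127, "INT8 maximum");
static_assert(std::numeric_limits<std::uint8_t>::max() == 255, "UINT8 maximum");
static_assert(std::numeric_limits<std::int16_t>::min() == -32768, "INT16 minimum");
static_assert(std::numeric_limits<std::int16_t>::max() == 32767, "INT16 maximum");
static_assert(std::numeric_limits<std::uint16_t>::max() == 65535, "UINT16 maximum");
static_assert(std::numeric_limits<std::int32_t>::min() == -2147483647 - 1, "INT32 minimum");
static_assert(std::numeric_limits<std::int32_t>::max() == 2147483647, "INT32 maximum");
static_assert(std::numeric_limits<std::uint32_t>::max() == 4294967295u, "UINT32 maximum");

// Assumption: float is a 4-byte IEEE 754 single-precision value.
static_assert(sizeof(float) == 4, "float is expected to be 4 bytes");
static_assert(std::numeric_limits<float>::is_iec559, "float is expected to be IEEE 754");

// Smallest positive normalized float (about 1.1754944e-38).
inline float SmallestNormalFloat() { return std::numeric_limits<float>::min(); }

// Largest finite float (about 3.4028234e38).
inline float LargestFloat() { return std::numeric_limits<float>::max(); }
```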

Strings

A string is a piece of text. Strings can include letters, numbers and symbols.

There are two kinds of string:

  • Fixed-length. This is the most common type of string and has been used since GTA 3. The string length is fixed. When compiled these strings occupy a certain number of bytes (8 or 16) even if the text is actually shorter (any unused bytes are filled with null values).
  • Variable length (SA only). Variable-length strings are encoded as a single byte specifying the length followed by the string character bytes. These strings are not null-terminated. The maximum length depends on the instruction[*]. The longest in the original game is 40 characters.
String value Compiled bytes
"MAIN" 09   4D 41 49 4E 00 00 00 00
"MODDING" 09   4D 4F 44 44 49 4E 47 00
"SAVE_OUR_SOULS!" 0F   53 41 56 45 5F 4F 55 52 5F 53 4F 55 4C 53 21 00
"Variable length string" 0E   16   56 61 72 69 61 62 6C 65 20 6C 65 6E 67 74 68 20 73 74 72 69 6E 67
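The rows of the table above can be reproduced with a short encoder. This is a minimal C++ sketch, not the code of any particular compiler. The type codes 09, 0F and 0E are read off the table (GTA SA). The function names are hypothetical, and truncating over-long text to the field width is an assumption made only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Type codes as they appear in the table above (GTA SA).
constexpr std::uint8_t kTypeString8 = 0x09;
constexpr std::uint8_t kTypeString16 = 0x0F;
constexpr std::uint8_t kTypeVarString = 0x0E;

// Fixed-length string: type code, then exactly `width` bytes.
// Unused bytes are filled with nulls. Text longer than `width` is
// truncated here (an assumption made for this sketch).
inline std::vector<std::uint8_t> EncodeFixedString(const std::string& text,
                                                   std::uint8_t typeCode,
                                                   std::size_t width) {
    std::vector<std::uint8_t> out;
    out.push_back(typeCode);
    for (std::size_t i = 0; i < width; ++i) {
        out.push_back(i < text.size() ? static_cast<std::uint8_t>(text[i]) : 0);
    }
    return out;
}

// Variable-length string: type code, one length byte, then the characters
// without a null terminator. Assumes the text is at most 255 bytes long.
inline std::vector<std::uint8_t> EncodeVariableString(const std::string& text) {
    std::vector<std::uint8_t> out;
    out.push_back(kTypeVarString);
    out.push_back(static_cast<std::uint8_t>(text.size()));
    for (char c : text) {
        out.push_back(static_cast<std::uint8_t>(c));
    }
    return out;
}
```

For example, EncodeFixedString("MAIN", kTypeString8, 8) yields the bytes of the first row, and EncodeVariableString("Variable length string") yields the last row, with 16 (hexadecimal) as the length byte.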

Arrays

In GTA SA, an array is a static reference to a group of successive variables. There is no actual representation of an array value as a concrete type, although array accesses do have type codes. When an "array" is passed as an argument to an instruction, what is actually passed is a specific variable in that array.

This section describes the format of array accesses in GTA SA. Vice City also has arrays, but in a different format.

// Example: not real code.
// Applies only to GTA: SA.
struct ArrayAccess {
    enum ElementType : uint8_t {
        Int,
        Float,
        String8,
        String16
    };
    
    // Offset of first variable in the array.
    uint16_t startOffset;
    
    // Index being accessed ("array[index]").
    // This can be a local variable ("array[someLocalVar]") or a global variable ("array[someGlobalVar]").
    uint16_t index;
    
    // Array length.
    int8_t length;
    
    // Array element type. 7 bits.
    ElementType type : 7;
    
    // Determines whether the index is a global variable (true) or a local variable (false).
    // Only 1 bit.
    bool globalIndex : 1;
};

The array index is a variable whose value should be greater than or equal to 0 and less than the array length. Both global and local variables can be used as index variables; the purpose of the globalIndex field is to tell the game which kind has been used. The first element in the array is at index 0, and the last is at length - 1.

Array Equivalent in SCM
$150(3@,6f) 07   96 00 03 00 06 01
10@(9@,5s) 0D   0A 00 09 00 05 02
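The two rows above can be decoded by reading the six bytes that follow the type code. The sketch below is illustrative C++ and not the game's code. It assumes little-endian 16-bit fields, the element type in the low 7 bits of the last byte, and the global-index flag in the highest bit. That bit placement is consistent with both examples but is an assumption here. The names DecodedArrayAccess and DecodeArrayAccess are hypothetical.

```cpp
#include <cstdint>

// Decoded form of an array access (GTA SA only).
struct DecodedArrayAccess {
    std::uint16_t startOffset;  // offset of the first variable in the array
    std::uint16_t index;        // variable used as the index
    std::int8_t length;         // array length
    std::uint8_t elementType;   // 0 = int, 1 = float, 2 = string8, 3 = string16
    bool globalIndex;           // true if the index is a global variable
};

// `bytes` points at the six bytes that follow the array type code.
inline DecodedArrayAccess DecodeArrayAccess(const std::uint8_t* bytes) {
    DecodedArrayAccess a;
    // 16-bit fields are stored little-endian (low byte first).
    a.startOffset = static_cast<std::uint16_t>(bytes[0] | (bytes[1] << 8));
    a.index = static_cast<std::uint16_t>(bytes[2] | (bytes[3] << 8));
    a.length = static_cast<std::int8_t>(bytes[4]);
    // Assumed layout: low 7 bits hold the type, the top bit is the flag.
    a.elementType = static_cast<std::uint8_t>(bytes[5] & 0x7F);
    a.globalIndex = (bytes[5] & 0x80) != 0;
    return a;
}
```

Applied to the bytes 96 00 03 00 06 01 from the first row, this gives start offset 150, index variable 3, length 6 and element type 1 (float) with a local index, which matches $150(3@,6f).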

Notes

^ In GTA 3, Vice City and Liberty City Stories, short strings (8 characters) have no type code. If the first byte of an argument does not fall within the data type range (0x00 to 0x06 for GTA 3 and VC), it is treated as the first character of a string, and the remaining 7 bytes of the string are then read.

^ Some instructions have a variable number of parameters. One such instruction is 004F, which creates a new thread. To allow extra script setup, the instruction can take a variable number of arguments. The game uses a special data type to mark the end of such argument lists.

For non-variadic instructions, the maximum number of parameters is 16 for GTA 3 and VC and 32 for SA, LCS and VCS. Variadic instructions can be passed a maximum of 18 arguments in GTA 3 and VC; 34 for SA; and 106 for LCS and VCS, though these values are unconfirmed.

^ In San Andreas, 05B6 is a special instruction that defines a table. A 128-byte stream of data follows immediately after the opcode, without a type code.

^ GTAForums: Post by Seemann describing limits for the long strings in SA

See also