
speech.dat is a file that handles ped speech in the RAGE Engine games.

File Versions

File version is typically indicated by the file extension.

Grand Theft Auto IV & Max Payne 3: Version 0 (no header), .dat
Grand Theft Auto V: Version 4 (0x04000000), .dat4
Red Dead Redemption 2: Version 14 (0x0E000000), .dat14

Version 0 (GTA IV & MP3)

The structure of Grand Theft Auto IV's and Max Payne 3's speech.dat file is relatively simple. It is split into four chunks: speech durations, speech entries, speaker entries, and the path nametable.

Speech Durations

The duration of speech variations is used ONLY for DJ speech on radio stations. It is used alongside the data found within audio containers: if the duration is longer than the time thresholds within the audio container, that particular entry variation will not play over the song.

4b - INT32 - entries count
1b - entry array

Structure for each array entry:

1b - INT8 - variation duration

The value stored in the speech file is the duration in seconds of each individual speech sound, rounded to one decimal place and multiplied by 10. For example, a speech sound GENERAL_03 with a duration of 9.72 seconds is represented as 98 (0x62), which implies that the value is rounded up rather than to the nearest tenth. Since the duration of each variation is stored in a single byte (at most 255 when read as unsigned, i.e. 25.5 seconds), it is not advised to have speech audio longer than 24 seconds.
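The conversion described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names are hypothetical, rounding up is an assumption inferred from the 9.72 to 98 example, and the byte is assumed to be read as unsigned.

```python
import math

MAX_DURATION_BYTE = 0xFF  # assumption: the byte is read as unsigned


def encode_duration(seconds: float) -> int:
    """Convert a duration in seconds to the single-byte value (tenths of a second)."""
    if seconds < 0:
        raise ValueError("duration cannot be negative")
    # round() first to remove floating-point noise, then round up.
    # Rounding up is an assumption based on the 9.72 s -> 98 example.
    tenths = math.ceil(round(seconds * 10, 6))
    if tenths > MAX_DURATION_BYTE:
        raise ValueError(f"{seconds} s does not fit in a single byte")
    return tenths


def decode_duration(value: int) -> float:
    """Convert the stored byte back to seconds."""
    return value / 10
```

With these assumptions, encode_duration(9.72) gives 98 (0x62), matching the GENERAL_03 example.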

Speech Entries

The main block that handles speech data.

4b - INT32 - entries count
14b - entry array

Structure for each array entry:

4b - INT32 - nametable entry id
4b - INT32 - speech duration array offset. Used for DJ speech only; must be 0xFFFFFFFF for regular speech
4b - INT32 - name hash
2b - INT16 - number of variations

The name hash is the JOAAT Hash of the sound, without the numbered suffix. For example, FINDTHETIME_02 will become simply FINDTHETIME, without the numbers. Number of variations is where the name numbering becomes relevant. If the speech audio container has FINDTHETIME_01 and FINDTHETIME_02, you will only need a single FINDTHETIME entry with the number of variations set to 2. For DJ speech specifically, each variation will require a duration as covered previously.
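As a rough sketch of how the name hash and number of variations could be derived from a list of sound names: the helper names below are hypothetical, and lowercasing the input before hashing is an assumption (it is the usual convention for JOAAT hashes in these games, but it is not stated above).

```python
import re
from collections import Counter


def joaat(text: str) -> int:
    """Jenkins one-at-a-time hash, 32-bit. Lowercasing is an assumption."""
    h = 0
    for byte in text.lower().encode("ascii"):
        h = (h + byte) & 0xFFFFFFFF
        h = (h + (h << 10)) & 0xFFFFFFFF
        h ^= h >> 6
    h = (h + (h << 3)) & 0xFFFFFFFF
    h ^= h >> 11
    h = (h + (h << 15)) & 0xFFFFFFFF
    return h


def base_name(sound_name: str) -> str:
    """Strip a trailing numbered suffix such as _02."""
    return re.sub(r"_\d+$", "", sound_name)


def count_variations(sound_names):
    """Map each base-name hash to its number of variations."""
    counts = Counter(base_name(name) for name in sound_names)
    return {joaat(name): n for name, n in counts.items()}
```

For a container holding FINDTHETIME_01 and FINDTHETIME_02, count_variations returns a single key, the hash of FINDTHETIME, with the value 2.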

Note: It is important that the array is sorted according to the speaker entries, i.e. all speech entries must be in 'blocks' for each speaker.

Speaker Entries

This block is responsible for mapping "speakers", i.e. the characters to whom a voice belongs. It is extremely important that it is set correctly for scripted speech, as misconfigured speaker data will prevent the line from playing correctly in game.

4b - INT32 - entries count
10b - entry array

Structure for each array entry:

4b - INT32 - speech entry array offset
4b - INT32 - speaker name hash
2b - INT16 - number of speech entries

The speech entry array offset and number of speech entries are why sorting the speech entries is important. An offset of 0x00000000 starts mapping the speaker at the first array entry and maps every subsequent entry until the specified number of speech entries is reached. The speaker hash is once again JOAAT and can be anything, as long as the correct speaker name is used when scripting the conversation. DJ speech follows a very specific format: DJ + (radio name) + (audio container name: general/intro/time/to), joined with underscores, for example DJ_VCFM_INTRO.
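The mapping from a speaker to its block of speech entries can be illustrated as below. This is only a sketch: the Speaker class and function names are hypothetical, and the speech entry array offset is assumed to be an index into the speech entry array rather than a byte offset.

```python
from dataclasses import dataclass


@dataclass
class Speaker:
    first_entry: int  # speech entry array offset, treated as an index (assumption)
    name_hash: int    # JOAAT hash of the speaker name
    entry_count: int  # number of speech entries


def entries_for(speaker, speech_entries):
    """Return the contiguous block of speech entries that belongs to a speaker."""
    start = speaker.first_entry
    end = start + speaker.entry_count
    if end > len(speech_entries):
        raise ValueError("speaker range runs past the end of the speech entry array")
    return speech_entries[start:end]


def dj_speaker_name(radio: str, container: str) -> str:
    """Build a DJ speaker name: DJ + radio name + audio container name."""
    return f"DJ_{radio}_{container}".upper()
```

For example, a speaker with an offset of 1 and 2 speech entries maps to the second and third entries of the array, and dj_speaker_name("VCFM", "intro") produces DJ_VCFM_INTRO.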

Path Nametable

This final block stores information regarding the audio container paths.

4b - INT32 - entries count
4b - nametable offset array
4b - nametable array

Structure for nametable offset array entry:

4b - INT32 - nametable array offset 

Structure for nametable array entry:

string - path of the audio container; entries are null separated

The nametable entry id in speech entries points to a specific offset array entry.
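Putting the four chunks together, a Version 0 file could be read roughly as follows. This is a sketch under several assumptions that the layout above does not confirm: all integers are little-endian and unsigned, the four chunks follow each other with no padding, the nametable offsets are relative to the start of the string data, and the string data runs to the end of the file. The function and class names are hypothetical.

```python
import struct
from dataclasses import dataclass


@dataclass
class SpeechEntry:
    nametable_id: int     # index into the nametable offset array
    duration_offset: int  # 0xFFFFFFFF for regular speech
    name_hash: int
    variations: int


def parse_speech_dat(data: bytes):
    pos = 0

    def read(fmt):
        # Unpack one little-endian record at the cursor and advance past it.
        nonlocal pos
        values = struct.unpack_from(fmt, data, pos)
        pos += struct.calcsize(fmt)
        return values

    # Speech durations: entries count, then one byte per variation.
    (count,) = read("<I")
    durations = list(data[pos:pos + count])
    pos += count

    # Speech entries: entries count, then 14-byte records.
    (count,) = read("<I")
    speech = [SpeechEntry(*read("<IIIH")) for _ in range(count)]

    # Speaker entries: entries count, then 10-byte records
    # (speech entry array offset, speaker name hash, number of speech entries).
    (count,) = read("<I")
    speakers = [read("<IIH") for _ in range(count)]

    # Path nametable: entries count, offset array, then null-separated strings.
    (count,) = read("<I")
    offsets = read(f"<{count}I")
    blob = data[pos:]  # assumption: strings run to the end of the file
    paths = []
    for offset in offsets:
        end = blob.index(b"\0", offset)
        paths.append(blob[offset:end].decode("ascii"))

    return durations, speech, speakers, paths
```

The path for a given speech entry is then paths[entry.nametable_id], which mirrors how the nametable entry id selects an offset array entry.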

Version 4 (GTA V)

This section is incomplete. You can help by fixing and expanding it.

Version 14 (RDR3)

This section is incomplete. You can help by fixing and expanding it.


The following tools can be used to edit speech.dat files.