Difference between revisions of "CSF File Format"

(corrected by VK(tm))
(Styling updates, corrections based on executable analysis)

Revision as of 18:16, 23 July 2011

CSF files hold stringtables for RA2/YR (also for Generals/ZH and probably others).
For more information about what a CSF file is, go to the CSF page.

On this page you will find a guide to how the format is built up.

The Header

The header of a CSF file is 0x18 bytes long.
It is built up like this:

Offset  Type     Description
0x0     char[4]  " FSC"
                 CSF header identifier.
                 If this is not " FSC", the game will not load the file.
0x4     DWORD    CSF Version
                 The version number of the CSF format.
                 RA2, YR, Generals, ZH and the BFME series use version 3.
                 Nox uses version 2.
                 Nothing is known about the actual difference between the versions.
                 Thanks to Siberian GRemlin for providing this information (see http://www.ppmsite.com/forum/viewtopic.php?p=130667#130667)!
0x8     DWORD    NumLabels
                 The total number of labels in the stringtable.
0xC     DWORD    NumStrings
                 The total number of string pairs in the stringtable.
                 (A string pair is made up of a Unicode Value and an ASCII ExtraValue. A label can contain more than one such pair, but only the first pair's Value is ever actually used by the game.)
0x10    DWORD    (unused)
                 This field is not used by the game.
                 If you are writing a program that reads CSF files, you can store an extra information tag here.
0x14    DWORD    Language
                 The language value for this stringtable.
                 See the Language section below for a list.
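The header layout above can be read with a few lines of C++. This is only a minimal sketch: the names CsfHeader, readDword and parseCsfHeader are made up for illustration, and it assumes that DWORDs are stored least-significant byte first, which this page does not state explicitly.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Illustrative struct; the fields mirror the header table (offsets 0x4 to 0x14).
struct CsfHeader {
    std::uint32_t version;
    std::uint32_t numLabels;
    std::uint32_t numStrings;
    std::uint32_t unused;
    std::uint32_t language;
};

// Reads a DWORD stored least-significant byte first (assumed byte order).
inline std::uint32_t readDword(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0])
         | static_cast<std::uint32_t>(p[1]) << 8
         | static_cast<std::uint32_t>(p[2]) << 16
         | static_cast<std::uint32_t>(p[3]) << 24;
}

// Fills `out` and returns true if the buffer holds a full 0x18-byte header
// that starts with " FSC"; returns false otherwise.
inline bool parseCsfHeader(const unsigned char* data, std::size_t size, CsfHeader& out) {
    if (size < 0x18 || std::memcmp(data, " FSC", 4) != 0) {
        return false;
    }
    out.version    = readDword(data + 0x4);
    out.numLabels  = readDword(data + 0x8);
    out.numStrings = readDword(data + 0xC);
    out.unused     = readDword(data + 0x10);
    out.language   = readDword(data + 0x14);
    return true;
}
```

A caller would load the file into a byte buffer, pass its start and size to parseCsfHeader, and continue reading labels at offset 0x18.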


Language

The language DWORD can have the following values (others will be recognized as "Unknown"):

 0 = US (English)*
 1 = UK (English)
 2 = German*
 3 = French*
 4 = Spanish
 5 = Italian
 6 = Japanese
 7 = Jabberwockie
 8 = Korean*
 9 = Chinese*
>9 = Unknown

* RA2/YR has been released in this language.
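For tools that display the language, the list above maps directly onto a lookup table. The function name csfLanguageName is hypothetical; the strings are copied from the list.

```cpp
#include <cstdint>

// Returns the language name for a header language value.
// Values above 9 are reported as "Unknown", as described above.
inline const char* csfLanguageName(std::uint32_t language) {
    static const char* const names[] = {
        "US (English)", "UK (English)", "German", "French", "Spanish",
        "Italian", "Japanese", "Jabberwockie", "Korean", "Chinese"
    };
    return language < 10 ? names[language] : "Unknown";
}
```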

Labels

After the header, the label data follows.

A label can be considered an entry in the stringtable (e.g. "GUI:OK" is a label).
Each label has a name (ASCII string, e.g. "NAME:MTNK") and zero or more string pairs. As mentioned above, a string pair is made up of a Unicode Value (e.g. "Grizzly Tank") and an ASCII ExtraValue (no example in the original ra2.csf/ra2md.csf, not used by the game).

Now let's come to how the data is stored in the CSF file:

Label header

The label data begins with a label header, which is built up like this:

Offset  Type                   Description
0x0     char[4]                " LBL"
                               Label identifier.
                               If this is not " LBL", the game will not recognize the following data as label data and reads the next 4 bytes instead.
0x4     DWORD                  Number of string pairs
                               The number of string pairs associated with this label. The usual value is 1.
0x8     DWORD                  LabelNameLength
                               The size of the label name that follows.
0xC     char[LabelNameLength]  LabelName
                               A string without a terminating zero that is exactly as long as the DWORD at 0x8 says. If the name is longer, the rest is cut off.

The first label in ra2md.csf can be found at 0x18.
Note: Spaces, tabs and line breaks are stripped from the label's name, so they cannot be used in it.
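Reading a label header from a byte buffer could look roughly like this. It is a sketch under assumptions: CsfLabelHeader, readDword and parseLabelHeader are illustrative names, DWORDs are assumed to be stored least-significant byte first, and a bad identifier is simply rejected rather than skipped the way the game does.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Illustrative result type for one label header.
struct CsfLabelHeader {
    std::uint32_t numPairs;  // DWORD at 0x4
    std::string name;        // LabelName, not zero-terminated in the file
    std::size_t byteSize;    // bytes consumed: 0xC + LabelNameLength
};

// Reads a DWORD stored least-significant byte first (assumed byte order).
inline std::uint32_t readDword(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0])
         | static_cast<std::uint32_t>(p[1]) << 8
         | static_cast<std::uint32_t>(p[2]) << 16
         | static_cast<std::uint32_t>(p[3]) << 24;
}

// Parses the label header at `data`. Returns false if the identifier is not
// " LBL" or if the buffer is too short for the declared name length.
inline bool parseLabelHeader(const unsigned char* data, std::size_t size, CsfLabelHeader& out) {
    if (size < 0xC || std::memcmp(data, " LBL", 4) != 0) {
        return false;
    }
    const std::uint32_t nameLength = readDword(data + 0x8);
    if (nameLength > size - 0xC) {
        return false;
    }
    out.numPairs = readDword(data + 0x4);
    out.name.assign(reinterpret_cast<const char*>(data + 0xC), nameLength);
    out.byteSize = 0xC + static_cast<std::size_t>(nameLength);
    return true;
}
```

The byteSize field tells the caller where the first string pair of the label starts.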

Values

Directly after the label header, the value data (string pairs) follows.
This is how it is built up:

Offset                 Type                    Description
0x0                    char[4]                 " RTS" or "WRTS"
                                               Identifier.
                                               " RTS" means that there is no Extra Value for this label.
                                               "WRTS" means that the data for the Extra Value follows after the Value data.
                                               Everything else is invalid.
0x4                    DWORD                   ValueLength
                                               The length, in characters, of the Unicode string (the Value) that follows.
0x8                    byte[ValueLength*2]     Value
                                               The encoded Value of the label.
                                               Note that this is ValueLength*2 bytes long, because the Value is a Unicode string, i.e. every character is a word instead of a byte.
                                               To decode the Value to a Unicode string, apply a bitwise NOT to every byte of the value data (equivalently, subtract each byte from 0xFF). See the "Decoding the value" section for an example.
0x8+ValueLength*2      DWORD                   ExtraValueLength
                                               The length of the Extra Value string that follows.
                                               This field and the following one only exist if the identifier is "WRTS" and not " RTS".
0x8+ValueLength*2+0x4  char[ExtraValueLength]  ExtraValue
                                               Like the label name, a string without a terminating zero that is exactly as long as ExtraValueLength says. If it is longer, the rest is cut off.
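Putting the table together, one string pair could be parsed along these lines. This is a sketch, not the game's own code: CsfStringPair, readDword and parseStringPair are illustrative names, and it assumes that DWORDs and the 16-bit characters of the Value are both stored least-significant byte first.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Illustrative result type for one string pair.
struct CsfStringPair {
    std::u16string value;    // decoded Value
    std::string extraValue;  // empty for " RTS" pairs
    std::size_t byteSize;    // total bytes consumed by this pair
};

// Reads a DWORD stored least-significant byte first (assumed byte order).
inline std::uint32_t readDword(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0])
         | static_cast<std::uint32_t>(p[1]) << 8
         | static_cast<std::uint32_t>(p[2]) << 16
         | static_cast<std::uint32_t>(p[3]) << 24;
}

// Parses one string pair at `data`. Returns false on an invalid identifier
// or if the buffer is too short for the declared lengths.
inline bool parseStringPair(const unsigned char* data, std::size_t size, CsfStringPair& out) {
    if (size < 0x8) return false;
    const bool hasExtra = std::memcmp(data, "WRTS", 4) == 0;
    if (!hasExtra && std::memcmp(data, " RTS", 4) != 0) return false;

    const std::uint32_t valueLength = readDword(data + 0x4);
    if (valueLength > (size - 0x8) / 2) return false;

    std::size_t pos = 0x8;
    out.value.clear();
    for (std::uint32_t i = 0; i < valueLength; ++i, pos += 2) {
        // Undo the encoding: bitwise NOT of each byte, then join the two bytes.
        const unsigned char lo = static_cast<unsigned char>(~data[pos]);
        const unsigned char hi = static_cast<unsigned char>(~data[pos + 1]);
        out.value.push_back(static_cast<char16_t>(lo | (hi << 8)));
    }

    out.extraValue.clear();
    if (hasExtra) {
        if (size - pos < 4) return false;
        const std::uint32_t extraLength = readDword(data + pos);
        pos += 4;
        if (extraLength > size - pos) return false;
        out.extraValue.assign(reinterpret_cast<const char*>(data + pos), extraLength);
        pos += extraLength;
    }
    out.byteSize = pos;
    return true;
}
```

For example, the Value "OK" is stored as the bytes B0 FF B4 FF, because 0x4F ('O') becomes 0xB0, 0x4B ('K') becomes 0xB4, and each zero high byte becomes 0xFF.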

Decoding the value

To decode the value to a Unicode string, apply a bitwise NOT to every byte of the value data (equivalently, subtract each byte from 0xFF).
An example in C++:

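// ValueData holds the raw value bytes; ValueLength is the character count read at 0x4.
int ValueDataLength = ValueLength << 1;  // two bytes per character
for(int i = 0; i < ValueDataLength; ++i) {
  ValueData[i] = ~ValueData[i];  // bitwise NOT of each byte
}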
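Because a bitwise NOT undoes itself, a program that writes CSF files can encode a value with the same operation. The sketch below uses the "subtract from 0xFF" form to show that both give the same bytes. The function name encodeValue is hypothetical, and the byte order of each 16-bit character (low byte first) is an assumption.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Turns a Unicode string into encoded value data (two bytes per character).
inline std::vector<unsigned char> encodeValue(const std::u16string& text) {
    std::vector<unsigned char> bytes;
    bytes.reserve(text.size() * 2);
    for (std::size_t i = 0; i < text.size(); ++i) {
        const unsigned int unit = text[i];
        // 0xFF - b yields the same result as ~b for a single byte b.
        bytes.push_back(static_cast<unsigned char>(0xFF - (unit & 0xFF)));         // low byte
        bytes.push_back(static_cast<unsigned char>(0xFF - ((unit >> 8) & 0xFF)));  // high byte
    }
    return bytes;
}
```

Running the decoding loop above over the returned bytes gives back the original characters.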