DerelictStudios Forums
jonwil

Details Of The Format Of The Manifest/bin Files


Thanks to Scorpio9a for his major contribution to this description:

 

The manifest file starts with a 48 byte header, as follows:

struct ManifestHeader {

int version;

int filesignature;

int manifestsignature;

int itemcount;

int binsize;

int unknown1;

int unknown2;

int unknown3;

int extraentriessize;

int includenamessize;

int streamnamessize;

int sourcenamessize;

};

 

filesignature always matches with the first 4 bytes of the .imp, .relo and .bin files

manifestsignature is always 0xF54ABCCF

itemcount is the count of how many items are in the file

binsize is the size of the .bin file

extraentriessize is the size of the extra entries table

includenamessize is the size of the include names table

streamnamessize is the size of the stream names table

sourcenamessize is the size of the source names table

If any of the size fields are zero, it means that this file has no table (e.g. no extra entries table).
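As a rough illustration of the layout above, the header can be read with Python's struct module. This is only a sketch: it assumes little-endian fields (the byte order is not stated in this post) and reads them as unsigned so the signature compares cleanly. parse_manifest_header is a hypothetical helper name, not part of any existing tool.

```python
import struct

# Field names follow the struct above. Little-endian is an ASSUMPTION.
HEADER_FIELDS = (
    "version", "filesignature", "manifestsignature", "itemcount",
    "binsize", "unknown1", "unknown2", "unknown3",
    "extraentriessize", "includenamessize", "streamnamessize", "sourcenamessize",
)
HEADER_STRUCT = struct.Struct("<12I")  # 12 fields x 4 bytes = 48 bytes
MANIFEST_SIGNATURE = 0xF54ABCCF


def parse_manifest_header(data: bytes) -> dict:
    """Unpack the 48-byte manifest header into a dict (hypothetical helper)."""
    if len(data) < HEADER_STRUCT.size:
        raise ValueError("manifest is shorter than the 48-byte header")
    header = dict(zip(HEADER_FIELDS, HEADER_STRUCT.unpack_from(data, 0)))
    if header["manifestsignature"] != MANIFEST_SIGNATURE:
        raise ValueError("unexpected manifestsignature")
    return header
```

A size field of zero in the returned dict means the corresponding table is absent.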

 

Following the manifest header are <itemcount> manifest item headers, each of which is 44 bytes in size:

struct ManifestItemHeader {

int filenamehash1;

int filenamehash2;

int filenamehash3;

int filenamehash4;

int extraentryoffset;

int unknown1;

int streamnameoffset;

int sourcenameoffset;

int streamsize;

int unknown2;

int extraentrysize;

};

If this entry has a matching .cdata file, the filenamehash fields contain the 4 numbers that appear in the .cdata filename.

extraentryoffset is an offset relative to the start of the extra entry table (if it is zero, it means this entry has no extra entry)

streamnameoffset is an offset relative to the start of the stream name table

sourcenameoffset is an offset relative to the start of the source name table

streamsize is the size of the stream in the .bin file (if it is zero, then there is no data in the .bin file for this entry)

extraentrysize is the size of the extra entry for this entry.
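The item headers can be walked in the same way. A minimal sketch, assuming little-endian unsigned fields and that the item headers start immediately after the 48-byte manifest header; parse_item_headers is a made-up name for illustration.

```python
import struct

# Field names follow the struct above. Little-endian is an ASSUMPTION.
ITEM_FIELDS = (
    "filenamehash1", "filenamehash2", "filenamehash3", "filenamehash4",
    "extraentryoffset", "unknown1", "streamnameoffset", "sourcenameoffset",
    "streamsize", "unknown2", "extraentrysize",
)
ITEM_STRUCT = struct.Struct("<11I")  # 11 fields x 4 bytes = 44 bytes
HEADER_SIZE = 48                     # size of the manifest header


def parse_item_headers(manifest: bytes, itemcount: int) -> list:
    """Return one dict per manifest item header (hypothetical helper)."""
    items = []
    for index in range(itemcount):
        offset = HEADER_SIZE + index * ITEM_STRUCT.size
        values = ITEM_STRUCT.unpack_from(manifest, offset)
        items.append(dict(zip(ITEM_FIELDS, values)))
    return items
```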

 

Following the manifest headers is the extra entries table. The purpose of this table is unknown.

 

After that comes the include names table which consists of entries starting with the byte 1 followed by a null terminated string. This table contains references to other manifest files that are to be "included" by this manifest file.
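Reading the include names table could look roughly like this. The sketch assumes the entries are packed back to back with no padding and that the names are plain ASCII; neither is confirmed above. The file names in the usage line are invented sample data.

```python
def parse_include_names(table: bytes) -> list:
    """Split the include names table into strings (hypothetical helper).

    Each entry is assumed to be: the byte 1, then a null-terminated string.
    """
    names = []
    pos = 0
    while pos < len(table):
        if table[pos] != 1:
            raise ValueError(f"expected marker byte 1 at offset {pos}")
        end = table.index(b"\x00", pos + 1)  # raises ValueError if unterminated
        names.append(table[pos + 1:end].decode("ascii", errors="replace"))
        pos = end + 1
    return names


# Invented sample data, not taken from a real manifest:
print(parse_include_names(b"\x01static.manifest\x00\x01global.manifest\x00"))
```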

 

Following the include names table is the stream names table which is a simple list of null terminated strings. Each stream name starts with the stream type then a : then the stream name.
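Since each item header carries a streamnameoffset, a name can be looked up directly by offset and split at the first colon. A small sketch, assuming ASCII names; read_stream_name is a hypothetical helper and the sample names are invented.

```python
def read_stream_name(table: bytes, offset: int) -> tuple:
    """Return (stream_type, stream_name) for the string at the given offset."""
    end = table.index(b"\x00", offset)  # strings are null terminated
    text = table[offset:end].decode("ascii", errors="replace")
    stream_type, _, name = text.partition(":")  # split at the first ':'
    return stream_type, name


# Invented sample table with two entries:
sample = b"Texture:ui_logo\x00AudioFile:voice_01\x00"
print(read_stream_name(sample, 0))
```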

 

Following the stream names table is the source names table. The source file for a given stream is the XML file which caused this stream to be created (3D models have XML w3x model files as the source file, and even textures and audio files have various XML files as the source file).
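Given the header sizes, the position of each table can be worked out by simple addition. The sketch below assumes the four tables follow the item headers back to back, in the order described, with no padding between them (that is an assumption, not something verified here). table_offsets is a hypothetical helper that takes a dict keyed by the header field names.

```python
HEADER_SIZE = 48  # manifest header
ITEM_SIZE = 44    # one manifest item header


def table_offsets(header: dict) -> dict:
    """Map each table name to (file offset, size); a size of 0 means no table."""
    offsets = {}
    pos = HEADER_SIZE + header["itemcount"] * ITEM_SIZE
    for name in ("extraentries", "includenames", "streamnames", "sourcenames"):
        size = header[name + "size"]
        offsets[name] = (pos, size)
        pos += size
    return offsets
```

An item's streamnameoffset would then be added to the start of the stream names table to find its string.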

 

The .bin file starts with the 4 byte signature (which matches the filesignature in the manifest file), followed by the raw stream data for each stream.
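A quick sanity check that a .bin file belongs to a given manifest might look like this. It is a sketch only: little-endian is assumed, and check_bin is a made-up helper that compares the signature and the total file size against the manifest header values.

```python
import struct


def check_bin(bin_data: bytes, filesignature: int, binsize: int) -> bool:
    """True if the .bin signature and size match the manifest header values."""
    if len(bin_data) < 4:
        return False
    (signature,) = struct.unpack_from("<I", bin_data, 0)  # ASSUMED little-endian
    return signature == filesignature and len(bin_data) == binsize
```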

 

Coming up in the next post, details of individual stream types and their layout. (at least those that have been worked out)


First, here is a header description that will be used later:

struct StreamHeader {

int unknown; //unknown, always zero

int size; //the actual size of the stream header

int datasize; //the size of the stream data that follows the header

};

 

On to the stream formats. First up we have AptAptData which is basically a StreamHeader followed by the same data that is in a *.apt file in the BFME UI files.

Next up we have AptConstData which is basically a StreamHeader followed by the same data that is in a *.const file in the BFME UI files.

AptDatData is a StreamHeader followed by the same data that is in a *.dat file in the BFME UI files.

AptGeometryData is a StreamHeader followed by 4 bytes of unknown purpose followed by the data that is in a *.ru file in the BFME UI.

Texture data is a StreamHeader followed by a DDS file.
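For the simple stream types above, stripping the StreamHeader is enough to get at the embedded file. A sketch, assuming little-endian fields and that the payload starts "size" bytes from the beginning of the stream; split_stream and looks_like_dds are hypothetical names. The "DDS " magic is the standard four-byte start of a DDS file.

```python
import struct


def split_stream(stream: bytes) -> tuple:
    """Return (unknown, payload) for a stream that begins with a StreamHeader."""
    unknown, header_size, data_size = struct.unpack_from("<3I", stream, 0)
    # ASSUMPTION: the payload begins header_size bytes into the stream.
    payload = stream[header_size:header_size + data_size]
    return unknown, payload


def looks_like_dds(payload: bytes) -> bool:
    """Check for the 'DDS ' magic at the start of a texture payload."""
    return payload[:4] == b"DDS "
```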

 

AudioFile data has a .cdata file that goes with it which is the actual audio data. Use the filenamehash fields to identify which one.

The format of the data is as follows:

first comes 4 bytes that are always zero

then comes the size of the subtitle string

followed by the offset to the subtitle string

followed by an unknown 4 byte value that has been reported to be the "time" of the audio (in seconds? ms? samples?)

next up comes the 4 byte sample rate of the audio (always 48000 in the files that I have seen). Following that comes 12 bytes of unknown purpose.

After that comes the subtitle string which is a reference presumably to a subtitle for this audio file.
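The AudioFile fields can be pulled apart along these lines. Several things here are assumptions rather than findings: that the size and offset fields are 4 bytes each, that everything is little-endian, and that the subtitle string sits directly after the 32 fixed bytes (the sketch does not try to interpret the offset field). parse_audio_info is a hypothetical name.

```python
import struct

# zero, subtitle size, subtitle offset, "time", sample rate, 12 unknown bytes
AUDIO_INFO = struct.Struct("<5I12s")  # 32 bytes; field widths are ASSUMED


def parse_audio_info(stream: bytes) -> dict:
    """Unpack the fixed AudioFile fields and the subtitle bytes that follow."""
    zero, sub_size, sub_offset, time_value, sample_rate, unknown = (
        AUDIO_INFO.unpack_from(stream, 0)
    )
    start = AUDIO_INFO.size  # ASSUMPTION: string follows the fixed fields
    return {
        "zero": zero,
        "subtitle_size": sub_size,
        "subtitle_offset": sub_offset,
        "time": time_value,
        "sample_rate": sample_rate,
        "unknown": unknown,
        "subtitle": stream[start:start + sub_size],
    }
```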

 

The .cdata file (at least the ones I have been looking at) has this format:

2 bytes, 04 and 00

then 2 bytes for the sample rate

then 4 bytes for the "time"

then 4 bytes that is the size of the raw audio data that follows

then 4 bytes for the "time" again (no clue why it's repeated like that)

The format of the actual audio data is unknown; however, I can say for certain that it is NOT MP3 audio. It is most likely either uncompressed audio in some format or audio compressed using something simple.
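The 16-byte .cdata header described above could be read like this. The byte order is an assumption (little-endian), and the meaning of the second byte and of "time" is still unknown, so the field names are placeholders. parse_cdata_header is a hypothetical helper.

```python
import struct

# 04, 00, sample rate (2 bytes), "time", data size, "time" again
CDATA_HEADER = struct.Struct("<BBHIII")  # 16 bytes; little-endian is ASSUMED


def parse_cdata_header(data: bytes) -> dict:
    """Unpack the .cdata header and slice out the raw audio bytes."""
    tag, second, sample_rate, time1, data_size, time2 = (
        CDATA_HEADER.unpack_from(data, 0)
    )
    start = CDATA_HEADER.size
    return {
        "tag": tag,
        "second_byte": second,
        "sample_rate": sample_rate,
        "time": time1,
        "data_size": data_size,
        "time_repeat": time2,
        "audio": data[start:start + data_size],  # still undecoded
    }
```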


It would be nice if you actually credited all the people who contributed to these findings. Also, I don't like the fact that you're asking people to go open source whenever they use your C implementation of Ref decompression. You can't do that, since it was reverse engineered in the first place, and if you think my point is invalid, I will post my asm code, pasted inline into some C function. Just my 2 cents.

 

btw: update your post with this too:

http://www.derelictstudios.net/forums/inde...pic=15020&st=20

Edited by kmx

His code is released under the GPL. If you use his code, you have to open the source on your application. If you want to reimplement his code, you are free to do so, and you are under no obligation. You can't use the GPL to cover a method, but you can use it to cover a specific implementation. Just because what it does is reverse engineered does not change that he is the copyright owner of his code and can license it however he'd like. Wine is just a reverse engineering of the Win32 API, but it is licensed under the LGPL, which isn't quite as restrictive as the GPL, but it is still open source.


Ok, a little update on sound files (those with a 16-byte header)

 

04 TAG

00/04 channels?

BB80 sample rate in Hz (0xBB80 = 48000)

0000XXXX sample count

0000YYYY data size

0000XXXX sample count

..

data

 

Here's the sample decoding function:

 

Decode_sample:

eax = sound info block

[eax+34] = current block/sample data ptr
[eax+38] = current block/sample number

[ecx] = sound data ptr
[ecx+4] = sound block/sample count

00886154:   push    ebx
     mov    ebx, esp
     push    ecx
     push    ecx
     (...)
; mov data start ptr to edx
     mov    edx, esi
     movaps    xmm7, oword ptr    [edi]
     movaps    xmm6, oword ptr    [edi+10h]
; add sample data size (3c)
     add    edx, 3Ch
     mov    edi, [ebp+var_18]

sample_decode_loop:; CODE XREF: sub_886154+293j
     pxor    mm0, mm0
; mov data to mm0
     punpcklbw mm0, qword ptr [esi]
     (decode...)
; mov decoded data to edi buf, xmm3=left, xmm5=right channel
     movlps    qword ptr [edi+100h], xmm5
     movhps    qword ptr [edi+180h], xmm5
     movlps    qword ptr [edi], xmm3
     movhps    qword ptr [edi+80h], xmm3
     add    edi, 8
     add    esi, 4
; check if done
     cmp    esi, edx
     jnz    loc_886345

; sample decoded!
008863ED:  emms
     (...)
0088659B:  retn

Edited by kmx


The function works like this:

 

stream = array;

init() {
 curr_sample_ptr = sound_data_ptr;
 curr_sample_num = sample_count;
}

decode_stream {
 init();
 while curr_sample_num != 0 {
   stream.write(Decode_sample(curr_sample_num, curr_sample_ptr)) //stream actually does not exist in the *real world*, its just played in runtime, so to make proper decryption this must be implemented.
   stream.seek(0x200, from_current_pos)
   curr_sample_ptr+=0x4C
   curr_sample_num-=0x80
 }
}

Decode_sample(curr_sample_num, curr_sample_ptr): array {

 sample = array[0x200]
 sample_ptr = 0
 vol_array = CreateVolumeArray(curr_sample_ptr)  //reads 0x10 bytes from ptr
 curr_sample_ptr += 0x10
 sample_end = curr_sample_ptr + 0x3C

 While curr_sample_ptr != sample_end {
   sample[sample_ptr] = MMX_Extract_sample_qdword(vol_array * curr_sample_ptr[0])
   
   curr_sample_ptr += 4
   sample_ptr += 8
 }
 return sample;
}

 

 

Sorry for the crap code, it's super pseudo. Anyway, in other words:

Input data from the file is divided into 0x4C byte blocks, and each of them is decoded into a 0x200 byte block (2*80h for the left channel, 2*80h for the right channel). The first 10h bytes of each block are a multiplier, which I called "volume"; the remaining 0x3C bytes are decoded into sample data. The number in the header which I called "sample count" is divided by 80h to give the real sample (block) count, plus 1 if the remainder mod 80h is not zero.
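The block arithmetic described here can be sketched as follows. This only covers the bookkeeping (how many 0x4C byte blocks there are and how each splits into the 10h "volume" part and the 3Ch data part); it does not attempt the actual MMX decode, which is not spelled out in this thread. block_count and split_blocks are hypothetical names.

```python
BLOCK_SIZE = 0x4C         # encoded bytes per block
VOLUME_SIZE = 0x10        # leading "volume" multiplier bytes
SAMPLES_PER_BLOCK = 0x80  # header "sample count" is divided by this


def block_count(sample_count: int) -> int:
    """sample_count / 80h, rounded up when the remainder is nonzero."""
    return (sample_count + SAMPLES_PER_BLOCK - 1) // SAMPLES_PER_BLOCK


def split_blocks(audio: bytes, sample_count: int) -> list:
    """Return (volume_bytes, data_bytes) pairs, one per encoded block."""
    blocks = []
    for index in range(block_count(sample_count)):
        block = audio[index * BLOCK_SIZE:(index + 1) * BLOCK_SIZE]
        if len(block) < BLOCK_SIZE:
            break  # truncated input: stop rather than guess
        blocks.append((block[:VOLUME_SIZE], block[VOLUME_SIZE:]))
    return blocks
```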

Edited by kmx


It's not encryption, it's actually some kind of compression (probably something fairly simple given what it's doing).

I didn't call it decryption in my last 2 posts, I called it a "decoding" process, which means it's unpacking 0x4c into 2*0x80


It's all there in asm code, check the first post for offsets.

The function first does the init stuff: it fills the table pointed to by param1 with curr_sample_ptr and curr_sample_count (param1 +34h / +38h); the second param points to sound_data_ptr and sample_count (param2 +0 / +4).

 

So it's:

-check if the init was already done (it's run only once per sound stream)

* init

-take 10h and make Volume table

-take 3Ch and go into loop, which uses MMX opcodes in order to produce proper sound sample (80h per channel * 2 = 200h)

Edited by kmx


update for manifest item info

struct ManifestItemHeader {
int filenamehash1;
int filenamehash2;
int filenamehash3;
int filenamehash4;
int extraentryoffset;
int extraentrycount;
int streamnameoffset;
int sourcenameoffset;
int streamsize;
int reloentrysize;
int impentrysize;
};
extraentrysize = extraentrycount*8
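With that relation, the bytes of an item's extra entries can be sliced out of the extra entries table. A tiny sketch; extra_entry_bytes is a hypothetical helper, and what the 8-byte records contain is still unknown.

```python
EXTRA_ENTRY_SIZE = 8  # extraentrysize = extraentrycount * 8


def extra_entry_bytes(table: bytes, extraentryoffset: int, extraentrycount: int) -> bytes:
    """Return the raw extra entry records for one item (empty if count is 0)."""
    size = extraentrycount * EXTRA_ENTRY_SIZE
    return table[extraentryoffset:extraentryoffset + size]
```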

Edited by Froniki


[Screenshot attachment: fuckinmanifestbm7.th.jpg]

:huh: Here is a screenshot of a small module for my Final Kane program, which reads manifests based on JonWil's, Scorpio9a's & Froniki's descriptions. It's still in a very-very experimental stage and is VERY far from complete. :unsure:

But it already does something =))) :rolleyes:

You can at least see if an entry has Cdata associated, etc.

Hope I'll finish it someday =)

Anyway my big thanks to JonWil, Scorpio9a and Froniki :D

Edited by EvilAlex


Due to lack of support, I'm leaving this forum... I mean, when you go into nit-picking, at least do some research and look for all the guys who worked on this. I find it disturbing that after contributing to literally all topics/formats here I wasn't credited even once, heh. It's not like I'm whining, I don't really care that much. Bye

Also:

 

NO, you CAN'T release reverse-engineered code under GPL (crediting yourself as owner), it's simply illegal. Try doing that with something popular, something that could draw attention of the legal owners of the code - wish you good luck in court then.

Edited by kmx


A few examples I can think of:

WINE

Samba

Open Office

XCC Mixer

http://www.chillingeffects.org/reverse/faq.cgi#QID198

Sorry, but I don't see any reason why you can't apply the GPL to code that has been reverse engineered. If it is decompiled or otherwise there may be an issue, but if you create your own implementation you are the copyright owner, and are able to license it how you please.

Edited by mastermind


I've been checking these .manifest/.bin/.imp/.relo files, and all of the packages that have the .version file (with the content '_common') have a different manifest header structure than the ones without it.

So, here's a possible header structure for the packages that don't have the .version file:

 

 

struct ManifestHeader {
int version;
int filesignature;
int manifestsignature;
int itemcount;
int binsize;
int unknown1;
int relosize;
int impsize;
int extraentriessize;
int includenamessize;
int streamnamessize;
int sourcenamessize;
};

 

 

And, on most of them, unknown1 = binsize. Maybe, in these cases, they don't use any kind of compression. I don't know, I'm not good at decoding file formats... but I'm trying...

 

 

 

 

 

Edit after reading kmx's posts: First of all, if I manage to get this thing working in OS BIG Editor, you, Scorpio9a, Franki, jonwil and whoever else contributed to this topic will be credited for your help.

 

Also, I've made many open source programs that use and show how to decode and encode private file formats from Electronic Arts. The same applies to Olaf Van der Spek, who made XCC Utilities. EA is fully aware of what we both did, and they have all our personal information to sue us anytime they want. They've never sued me or him. We both even went to Los Angeles to visit their studio. They paid for almost everything, including a 5-star hotel, and treated us very well, even Louis Castle (EALA's vice president). Ah... I met Phoib there when I went, so Phoib is a good witness ;). So, don't worry about it.

Edited by Banshee


Guys, let's not lose focus here by bickering over copyrights.

Granted, they are important and great care should be taken to respect the contribution of everyone involved, but that's not the good way to make progress.

 

Some of you will remember Will Sutton, the guy who made the first versions of the voxel editor. I had the luck to cooperate with him on a 3dsmax script some 5 years ago. Although I have lost track of Will (he married, moved and got another job etc...) I do recall one important lesson he had learned and shared with me:

For his first version he anxiously guarded his code in order to protect his copyright. As a professional programmer, he thought he was doing the right thing. Then his HD died and he lost all his work, leaving it in "beta stage" with no chance of improvement. The second time he started coding, he released his work as "open source" (available to everyone), and when he left the modding scene, others took over the work and further perfected the software. The same goes for the early stage work Godwin and I did on the voxel.

 

By opening the work to everyone, you assure the continuity of the modding tools into the future. By not doing so and being overprotective, all your work will be obsolete the moment you stop taking an interest in modding CNC.

The excellent work you guys are doing here should be aimed in the first place to help a modding community and not to profile yourselves in a personal manner. There are other, far more professional ways to do that...

 

So, just respect each other’s copyrights and contributions as you would like others respect your work too. That's all there is, frankly.

See it more like a friendly and sporty competition among programmers to get this thing nailed, but please; don’t turn it into a mud throwing party… pretty please?

 

 

signed ,

Grandpa ;)


Well, what I really meant is that taking an exact copy of the given code (I can hardly call the asm code of one function retyped into C an "own implementation"), releasing it under the GPL and crediting yourself as owner is a bad thing to do. With file formats such as BIG, MIX etc., you don't have to look into the copyrighted code or disassemble the application to make the unpacker, it's too simple. With decompression it's a bigger issue, and if it were a big corp releasing an open source GPL, reversed version of some other big corp's compressor, then the lawsuit would be huge, trust me. EA is nice to you since it actually helps them, that's it. If they were losing money over it they would sue you too, that's a fact.

Edited by kmx

By the way, when it comes to crediting people, I credit everyone who supplied information to me and those who they tell me were also involved, and I do respect EVERYBODY'S CONTRIBUTION. I SIMPLY can NOT credit THE ENTIRE WORLD.

There is a good little Russian proverb for such a case:

"There is no judgement for "I don't know"".

Pronounced: "Na net suda net" in Russian.

So if I accidentally miss someone, or if I simply don't know of someone's contributions, then I am sorry, but please, be more patient, as I am human too, not some telepath, and I cannot get information out of thin air.... :unsure:


Hi,

 

I've been kind of lurking on this forum for the last few days - first playing about with an extractor for the big4 files, now playing about with a program for dealing with manifest/bin files. I've got something that seems vaguely usable, but I'm not sure if this is ground you guys have already explored yet or not.

 

Basically, it extracts entries out of the .bin files.

 

Attached is the program. I'll throw the source up if people want to see it (it will take me a little bit of time to clean it up to a non-embarrassing level).

 

Yes, it piggy-backs on much information talked about in this forum - thanks to all who have been participating.

 

Because of the limitations placed on the extensions of uploaded files by the forum, you'll need to replace the extension with ".rar".

 

--booto

BinOpener.a_rar_file


Try this Bin Unpacker, made by ai_enabled for me yesterday.

I don't think it's very useful, because

it doesn't use the relo (relocation) and imp (imports) files.

 

USE: CnC3_BIN_Unpacker.exe xxxx.bin

needs manifest file

 

PS: it extracts by virtual filenames and groups

PPS: it needs .bin files that are not RefPack compressed

CnC3_BIN_Unpacker.zip_

Edited by Froniki


Updated some things in the bin/manifest/imp/relo extractor/navigator I've been working on. Extraction doesn't always work; sometimes the data exists in the cdata files or a map file (I assume) rather than in the .bin file. If the displayed 'TotalSize' is within a few bytes of the displayed 'Size' field, it 'should' extract okay. It displays the values associated with a particular stream from the relo/imp file (as far as I can tell; it seems okay, and each has a final entry of 0xffffffff, like an end-of-list marker).
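Reading such a terminated list could look like the sketch below. It assumes the relo/imp values are 32-bit little-endian integers and that 0xffffffff really is an end-of-list marker; both are guesses based on the observation above. read_value_list is a hypothetical name.

```python
import struct

END_MARKER = 0xFFFFFFFF  # ASSUMED end-of-list marker


def read_value_list(data: bytes, offset: int = 0) -> list:
    """Collect 32-bit values from offset until the marker or end of data."""
    values = []
    pos = offset
    while pos + 4 <= len(data):
        (value,) = struct.unpack_from("<I", data, pos)  # ASSUMED little-endian
        if value == END_MARKER:
            break
        values.append(value)
        pos += 4
    return values
```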

 

I've added type/extension detection for most of the resource types. I haven't bothered with resource types where there only seems to be one instance; those streams appear under the 'Unknown Data' node.

 

Also, the relo (relocation?) values for a particular stream always seem to be less-than-or-equal-to the size of the stream... any ideas?

 

--booto

BinOpener.a_rar_file
