jonwil, posted March 4, 2007:

Thanks to Scorpio9a for his major contribution to this description.

The manifest file starts with a 48-byte header, as follows:

    struct ManifestHeader {
        int version;
        int filesignature;
        int manifestsignature;
        int itemcount;
        int binsize;
        int unknown1;
        int unknown2;
        int unknown3;
        int extraentriessize;
        int includenamessize;
        int streamnamessize;
        int sourcenamessize;
    };

filesignature always matches the first 4 bytes of the .imp, .relo and .bin files.
manifestsignature is always 0xF54ABCCF.
itemcount is the number of items in the file.
binsize is the size of the .bin file.
extraentriessize is the size of the extra entries table.
includenamessize is the size of the include names table.
streamnamessize is the size of the stream names table.
sourcenamessize is the size of the source names table.

If any of the size fields is zero, the file has no such table (e.g. no extra entries table).

Following the manifest header are <itemcount> manifest item headers, each of which is 44 bytes in size:

    struct ManifestItemHeader {
        int filenamehash1;
        int filenamehash2;
        int filenamehash3;
        int filenamehash4;
        int extraentryoffset;
        int unknown1;
        int streamnameoffset;
        int sourcenameoffset;
        int streamsize;
        int unknown2;
        int extraentrysize;
    };

If this entry has a matching .cdata file, the filenamehash fields contain the 4 numbers in the .cdata filename.
extraentryoffset is an offset relative to the start of the extra entries table (if it is zero, this entry has no extra entry).
streamnameoffset is an offset relative to the start of the stream names table.
sourcenameoffset is an offset relative to the start of the source names table.
streamsize is the size of the stream in the .bin file (if it is zero, there is no data in the .bin file for this entry).
extraentrysize is the size of the extra entry for this entry.

Following the manifest item headers is the extra entries table. The purpose of this table is unknown.
After that comes the include names table, which consists of entries that each start with the byte 1 followed by a null-terminated string. This table contains references to other manifest files that are to be "included" by this manifest file.

Following the include names table is the stream names table, which is a simple list of null-terminated strings. Each stream name starts with the stream type, then a ":", then the stream name.

Following the stream names table is the source names table. The source file for a given stream is the XML file which caused this stream to be created (3D models have XML w3x model files as the source file, and even textures and audio files have various XML files as the source file).

The .bin file has the 4-byte signature (which matches the filesignature in the manifest file) followed by the raw stream data for each stream.

Coming up in the next post: details of individual stream types and their layout (at least those that have been worked out).
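The table order described in the post above can be turned into file offsets with a little arithmetic. Here is a minimal sketch in C, not anyone's actual tool code. It assumes the fields are stored little-endian (the post does not say) and that the tables follow each other with no padding. The names `ManifestLayout`, `manifest_layout` and `read_u32le` are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define MANIFEST_HEADER_SIZE 48u          /* 12 fields * 4 bytes */
#define MANIFEST_ITEM_SIZE   44u          /* 11 fields * 4 bytes */
#define MANIFEST_SIGNATURE   0xF54ABCCFu  /* manifestsignature from the post */

/* Read one 32-bit field. ASSUMPTION: little-endian byte order. */
static uint32_t read_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* File offsets of each part of the manifest (hypothetical helper type). */
typedef struct {
    uint64_t items;         /* first ManifestItemHeader */
    uint64_t extraentries;  /* extra entries table */
    uint64_t includenames;  /* include names table */
    uint64_t streamnames;   /* stream names table */
    uint64_t sourcenames;   /* source names table */
    uint64_t end;           /* one past the last table byte */
} ManifestLayout;

/* Returns 0 on success, -1 if the buffer is too short, the signature
 * does not match, or the tables would run past the end of the buffer. */
static int manifest_layout(const uint8_t *buf, size_t len, ManifestLayout *out)
{
    if (len < MANIFEST_HEADER_SIZE)
        return -1;
    if (read_u32le(buf + 8) != MANIFEST_SIGNATURE)
        return -1;

    uint64_t itemcount = read_u32le(buf + 12);

    out->items        = MANIFEST_HEADER_SIZE;
    out->extraentries = out->items + itemcount * MANIFEST_ITEM_SIZE;
    out->includenames = out->extraentries + read_u32le(buf + 32);
    out->streamnames  = out->includenames + read_u32le(buf + 36);
    out->sourcenames  = out->streamnames + read_u32le(buf + 40);
    out->end          = out->sourcenames + read_u32le(buf + 44);

    return out->end <= len ? 0 : -1;
}
```

A table whose size field is zero simply ends up with the same offset as the table after it, which matches the "no table" case in the post.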
jonwil, posted March 4, 2007:

First, here is a header description that will be used later:

    struct StreamHeader {
        int unknown;   // unknown, always zero
        int size;      // the actual size of the stream header
        int datasize;  // the size of the stream data that follows the header
    };

On to the stream formats.

AptAptData is a StreamHeader followed by the same data that is in a *.apt file in the BFME UI files.
AptConstData is a StreamHeader followed by the same data that is in a *.const file in the BFME UI files.
AptDatData is a StreamHeader followed by the same data that is in a *.dat file in the BFME UI files.
AptGeometryData is a StreamHeader followed by 4 bytes of unknown purpose, followed by the data that is in a *.ru file in the BFME UI files.
Texture data is a StreamHeader followed by a DDS file.

AudioFile data has a .cdata file that goes with it, which is the actual audio data. Use the filenamehash fields to identify which one. The format of the stream data is as follows:
first come 4 bytes that are always zero,
then the size of the subtitle string,
then the offset to the subtitle string,
then an unknown 4-byte value that has been reported to be the "time" of the audio (in seconds? ms? samples?),
then the 4-byte sample rate of the audio (always 48000 in the files that I have seen),
then 12 bytes of unknown purpose,
then the subtitle string, which is presumably a reference to a subtitle for this audio file.

The .cdata file (at least the ones I have been looking at) has this format:
2 bytes, 04 and 00,
then 2 bytes for the sample rate,
then 4 bytes for the "time",
then 4 bytes that are the size of the raw audio data that follows,
then 4 bytes for the "time" again (no clue why it's repeated like that).

The format of the actual audio data is unknown; however, I can say for certain that it is NOT MP3 audio.
It is most likely uncompressed audio in some format, or it is compressed using something simple.
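To make the StreamHeader description above concrete, here is a small sketch in C that finds the payload of a stream and checks whether it looks like a DDS file, as the Texture stream is said to be. It rests on two assumptions that the post does not confirm: the fields are little-endian, and the payload begins `size` bytes from the start of the stream. The function names are invented for this example. The "DDS " magic is the standard first four bytes of a DDS file.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t unknown;   /* always zero in the files seen so far */
    uint32_t size;      /* size of the stream header */
    uint32_t datasize;  /* size of the data that follows the header */
} StreamHeader;

/* ASSUMPTION: fields are little-endian. */
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fills *hdr and returns a pointer to the first payload byte, or NULL
 * if the sizes do not fit in the buffer.
 * ASSUMPTION: the payload starts hdr->size bytes into the stream. */
static const uint8_t *stream_payload(const uint8_t *stream, size_t len,
                                     StreamHeader *hdr)
{
    if (len < 12)
        return NULL;
    hdr->unknown  = rd32(stream);
    hdr->size     = rd32(stream + 4);
    hdr->datasize = rd32(stream + 8);
    if (hdr->size < 12 || hdr->size > len)
        return NULL;
    if (hdr->datasize > len - hdr->size)
        return NULL;
    return stream + hdr->size;
}

/* A DDS file begins with the four bytes 'D', 'D', 'S', ' '. */
static int payload_is_dds(const uint8_t *payload, size_t n)
{
    return n >= 4 && memcmp(payload, "DDS ", 4) == 0;
}
```

If the second assumption turns out to be wrong (for example if `size` is always 12 and some stream types carry extra bytes before the payload, as AptGeometryData does), only `stream_payload` needs to change.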
kmx, posted March 4, 2007 (edited March 4, 2007):

It would be nice if you actually credited all the people who contributed to those findings. Also, I don't like the fact that you're asking people to go open source whenever they use your C implementation of Ref decompression. You can't do that, since it was reverse engineered in the first place, and if you think my point is invalid, I will post my asm code pasted inline into a C function. Just my 2 cents.

BTW, update your post with this too: http://www.derelictstudios.net/forums/inde...pic=15020&st=20
mastermind, posted March 4, 2007:

Quoting kmx: "Would be nice if you actually credit all ppl who contributed in those findings [...]"

His code is released under the GPL. If you use his code, you have to open the source of your application. If you want to reimplement his code, you are free to do so, and you are under no obligation. You can't use the GPL to cover a method, but you can use it to cover a specific implementation. Just because what it does is reverse engineered does not change that he is the copyright owner of his code and can license it however he'd like. Wine is just a reverse engineering of the Win32 API, but it is licensed under the LGPL, which isn't quite as restrictive as the GPL, but it is still open source.
jonwil, posted March 4, 2007:

Yes, credits to everyone else who has helped with C&C3 reverse engineering so far :)
kmx, posted March 4, 2007 (edited March 4, 2007):

OK, a little update on sound files (those with a 16-byte header):

    04        TAG
    00/04     channels?
    BB80      sample rate (0xBB80 = 48000 Hz)
    0000XXXX  sample count
    0000YYYY  data size
    0000XXXX  sample count
    ..        data

Here's the sample decoding function:

    Decode_sample:
    eax      = sound info block
    [eax+34] = current block/sample data ptr
    [eax+38] = current block/sample number
    [ecx]    = sound data ptr
    [ecx+4]  = sound block/sample count

    00886154:
        push    ebx
        mov     ebx, esp
        push    ecx
        push    ecx
        (...)
        ; mov data start ptr to edx
        mov     edx, esi
        movaps  xmm7, oword ptr [edi]
        movaps  xmm6, oword ptr [edi+10h]
        ; add sample data size (3c)
        add     edx, 3Ch
        mov     edi, [ebp+var_18]
    sample_decode_loop:                 ; CODE XREF: sub_886154+293j
        pxor    mm0, mm0
        ; mov data to mm0
        punpcklbw mm0, qword ptr [esi]
        (decode...)
        ; mov decoded data to edi buf, xmm3=left, xmm5=right channel
        movlps  qword ptr [edi+100h], xmm5
        movhps  qword ptr [edi+180h], xmm5
        movlps  qword ptr [edi], xmm3
        movhps  qword ptr [edi+80h], xmm3
        add     edi, 8
        add     esi, 4
        ; check if done
        cmp     esi, edx
        jnz     loc_886345
        ; sample decoded!
    008863ED:
        emms
        (...)
    0088659B:
        retn
kmx, posted March 4, 2007 (edited March 4, 2007):

The function works like this:

    stream = array;

    init() {
        curr_sample_ptr = sound_data_ptr;
        curr_sample_num = sample_count;
    }

    decode_stream {
        init();
        while curr_sample_num != 0 {
            // the stream does not actually exist in the *real world*, it's just
            // played at runtime, so to make proper decryption this must be implemented
            stream.write(Decode_sample(curr_sample_num, curr_sample_ptr))
            stream.seek(0x200, from_current_pos)
            curr_sample_ptr += 0x4C
            curr_sample_num -= 0x80
        }
    }

    Decode_sample(curr_sample_num, curr_sample_ptr): array {
        sample = array[0x200]
        sample_ptr = 0
        vol_array = CreateVolumeArray(curr_sample_ptr)  // reads 0x10 bytes from ptr
        curr_sample_ptr += 0x10
        sample_end = curr_sample_ptr + 0x3C
        while curr_sample_ptr != sample_end {
            sample[sample_ptr] = MMX_Extract_sample_qdword(vol_array * curr_sample_ptr[0])
            curr_sample_ptr += 4
            sample_ptr += 8
        }
        return sample;
    }

Sorry for the crap code, it's super pseudo. Anyway, in other words: the input data from the file is divided into 0x4C-byte blocks, and each one of them is decoded into a 0x200-byte block (2*80h left channel, 2*80h right channel). The first 10h bytes of each block are a multiplier which I called "volume"; the rest, 0x3C bytes, is decoded into sample data. The number in the header which I called "sample count" is divided by 80h, which gives the real sample (block) count (+1 if the remainder mod 80h is not zero).
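The block arithmetic from the post above, written out as a small C sketch. It only covers the sizes (how many 0x4C-byte blocks there are and how large the input and decoded output would be); the actual sample decoding is not attempted here. The constant and function names are made up for this example, the numbers come straight from the post.

```c
#include <stdint.h>

enum {
    SAMPLES_PER_BLOCK   = 0x80,  /* header sample count is divided by 80h */
    BLOCK_VOLUME_BYTES  = 0x10,  /* the "volume" multiplier table */
    BLOCK_DATA_BYTES    = 0x3C,  /* packed sample data */
    ENCODED_BLOCK_BYTES = BLOCK_VOLUME_BYTES + BLOCK_DATA_BYTES,  /* 0x4C */
    DECODED_BLOCK_BYTES = 0x200  /* 2*80h left + 2*80h right */
};

/* Number of 0x4C-byte blocks: sample_count / 80h, plus one more block
 * if the remainder is not zero. */
static uint32_t audio_block_count(uint32_t sample_count)
{
    return sample_count / SAMPLES_PER_BLOCK +
           (sample_count % SAMPLES_PER_BLOCK != 0);
}

/* Bytes of encoded block data expected after the 16-byte header. */
static uint64_t audio_encoded_bytes(uint32_t sample_count)
{
    return (uint64_t)audio_block_count(sample_count) * ENCODED_BLOCK_BYTES;
}

/* Bytes produced once every block is expanded to 0x200 bytes. */
static uint64_t audio_decoded_bytes(uint32_t sample_count)
{
    return (uint64_t)audio_block_count(sample_count) * DECODED_BLOCK_BYTES;
}
```

Comparing `audio_encoded_bytes(sample_count)` against the "data size" field of the 16-byte header would be one cheap way to test whether this reading of the format holds for a given file.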
jonwil, posted March 4, 2007:

It's not encryption, it's actually some kind of compression (probably something fairly simple given what it's doing).
kmx, posted March 4, 2007:

Quoting jonwil: "Its not encryption, its actually some kind of compression (probably something fairly simple given what its doing)."

I didn't call it decryption in my last 2 posts, I called it a "decoding" process, which means it's unpacking 0x4C bytes into 2*0x80.
Count von Phoib, posted March 4, 2007:

Kmx, please use tags for your code, it makes the reading a bit easier :)
jonwil, posted March 4, 2007:

OK, so what do CreateVolumeArray and MMX_Extract_sample_qdword look like then?
kmx, posted March 4, 2007 (edited March 4, 2007):

It's all there in the asm code, check the first post for offsets. The function first does the init stuff: it fills the table pointed to by param1 with curr_sample_ptr and curr_sample_count (param1 +34h / +38h). The second param points to sound_data_ptr and sample_count (param2 +0 / +4).

So it's:
- check if the init was already done (it's run only once per sound stream); if not, init
- take 10h bytes and make the volume table
- take 3Ch bytes and go into the loop, which uses MMX opcodes to produce a proper sound sample (80h per channel * 2 = 200h)
Froniki, posted March 4, 2007 (edited March 4, 2007):

Update for the manifest item info:

    struct ManifestItemHeader {
        int filenamehash1;
        int filenamehash2;
        int filenamehash3;
        int filenamehash4;
        int extraentryoffset;
        int extraentrycount;
        int streamnameoffset;
        int sourcenameoffset;
        int streamsize;
        int reloentrysize;
        int impentrysize;
    };

extraentrysize = extraentrycount*8
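Putting Froniki's updated field names together with the earlier description of the stream names table, reading one item could look roughly like the C sketch below. Little-endian fields are an assumption, and `ManifestItem`, `manifest_item_read`, `extra_entry_bytes` and `stream_name_split` are names invented for this example only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* In-memory copy of one 44-byte ManifestItemHeader (updated names). */
typedef struct {
    uint32_t filenamehash[4];
    uint32_t extraentryoffset;
    uint32_t extraentrycount;
    uint32_t streamnameoffset;
    uint32_t sourcenameoffset;
    uint32_t streamsize;
    uint32_t reloentrysize;
    uint32_t impentrysize;
} ManifestItem;

/* ASSUMPTION: fields are little-endian. */
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* p must point at 44 readable bytes. */
static void manifest_item_read(const uint8_t *p, ManifestItem *it)
{
    for (int i = 0; i < 4; i++)
        it->filenamehash[i] = rd32(p + 4 * i);
    it->extraentryoffset = rd32(p + 16);
    it->extraentrycount  = rd32(p + 20);
    it->streamnameoffset = rd32(p + 24);
    it->sourcenameoffset = rd32(p + 28);
    it->streamsize       = rd32(p + 32);
    it->reloentrysize    = rd32(p + 36);
    it->impentrysize     = rd32(p + 40);
}

/* extraentrysize = extraentrycount * 8 */
static uint32_t extra_entry_bytes(const ManifestItem *it)
{
    return it->extraentrycount * 8u;
}

/* Splits a "type:name" entry of the stream names table. Returns a
 * pointer to the name part and stores the length of the type part in
 * *type_len. Returns NULL if the offset is out of range, the string is
 * not terminated inside the table, or it contains no ':'. */
static const char *stream_name_split(const char *table, size_t table_size,
                                     uint32_t offset, size_t *type_len)
{
    if (offset >= table_size)
        return NULL;
    const char *s = table + offset;
    const char *end = memchr(s, '\0', table_size - offset);
    if (end == NULL)
        return NULL;
    const char *colon = memchr(s, ':', (size_t)(end - s));
    if (colon == NULL)
        return NULL;
    *type_len = (size_t)(colon - s);
    return colon + 1;
}
```

The same lookup works for sourcenameoffset against the source names table, minus the ':' split.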
EvilAlex, posted March 4, 2007 (edited March 4, 2007):

:huh: Here is a screenshot of a small module for my Final Kane program, which reads manifests based on JonWil's, Scorpio9a's and Froniki's descriptions. It's still in a very, very experimental stage and is VERY far from complete. :unsure: But it already does something =))) You can at least see if an entry has Cdata associated, etc. Hope I'll finish it one day =) Anyway, my big thanks to JonWil, Scorpio9a and Froniki :D
kmx, posted March 5, 2007:

Due to lack of support, I'm leaving this forum... I mean, when you go into nit-picking, at least do some research and look for all the guys who worked on this. I find it disturbing that after contributing to literally all topics/formats here I wasn't credited even once. Heh, it's not like I'm whining, I don't really care that much. Bye
kmx, posted March 5, 2007 (edited March 5, 2007):

Quoting mastermind: "His code is released under the GPL. If you use his code, you have to open the source on your application. [...]"

Also: NO, you CAN'T release reverse-engineered code under the GPL (crediting yourself as owner), it's simply illegal. Try doing that with something popular, something that could draw the attention of the legal owners of the code. I wish you good luck in court then.
mastermind, posted March 5, 2007 (edited March 5, 2007):

A few examples I can think of: WINE, Samba, Open Office, XCC Mixer.

http://www.chillingeffects.org/reverse/faq.cgi#QID198

Sorry, but I don't see any reason why you can't apply the GPL to code that has been reverse engineered. If it is decompiled or otherwise there may be an issue, but if you create your own implementation you are the copyright owner, and are able to license it how you please.
Banshee, posted March 5, 2007 (edited March 5, 2007):

I've been checking these .manifest/.bin/.imp/.relo files, and all of the packages that have the .version file (with the content '_common') have a different manifest header structure than the ones without it. So, here's a possible header structure for the packages that don't have the .version file:

    struct ManifestHeader {
        int version;
        int filesignature;
        int manifestsignature;
        int itemcount;
        int binsize;
        int unknown1;
        int relosize;
        int impsize;
        int extraentriessize;
        int includenamessize;
        int streamnamessize;
        int sourcenamessize;
    };

And, on most of them, unknown1 = binsize. Maybe, in these cases, they don't use any kind of compression. I don't know, I'm not good at decoding file formats... but I'm trying...

Edit after reading kmx's posts: First of all, if I manage to get this thing working in OS BIG Editor, you, Scorpio9a, Froniki, jonwil and whoever else contributed to this topic will be credited for your help. Also, I've made many open source programs that use, and show how to decode and encode, private file formats from Electronic Arts. The same applies to Olaf van der Spek, who made XCC Utilities. EA is fully aware of what we both did, and they have all our personal information to sue us anytime they want. They've never sued me, nor him. We both even went to Los Angeles to visit their studio. They paid almost everything for us, including a 5-star hotel, and treated us very well, even Louis Castle (EALA's vice president). Ah... I met Phoib there when I went, so Phoib is a good witness ;). So, don't worry about it.
flyby, posted March 5, 2007:

Guys, let's not lose focus here by bickering over copyrights. Granted, they are important and great care should be taken to respect the contribution of everyone involved, but that's not a good way to make progress.

Some of you will remember Will Sutton, the guy who made the first versions of the voxel editor. I had the luck to cooperate with him on a 3dsmax script some 5 years ago. Although I have lost track of Will (he married, moved and got another job etc...), I do recall one important lesson he had learned and shared with me: for his first version he anxiously guarded his code in order to protect his copyright. As a professional programmer, he thought he was doing the right thing. Until his HD died and he lost all his work, leaving it in "beta stage" with no chance of improvement. The second time he started coding, he did put his work out as "open source" (available to everyone), and when he left the modding scene, others took over the work and further perfected the software. Same with the early stage work Godwin and I did on the voxel.

By opening the work to everyone, you assure the continuity of the modding tools into the future. By not doing so and being overprotective, all your work will be obsolete the moment you stop taking an interest in modding CNC. The excellent work you guys are doing here should be aimed in the first place at helping a modding community and not at profiling yourselves in a personal manner. There are other, far more professional ways to do that...

So, just respect each other’s copyrights and contributions as you would like others to respect your work too. That's all there is, frankly. See it more like a friendly and sporty competition among programmers to get this thing nailed, but please, don’t turn it into a mud-throwing party… pretty please?

signed, Grandpa ;)
kmx, posted March 5, 2007 (edited March 5, 2007):

Well, I really meant that taking an exact copy of the given code (I can hardly call the asm code of one function retyped into C an "own implementation"), releasing it under the GPL and crediting yourself as owner is a bad thing to do. With file formats such as BIG, MIX etc., you don't have to look into the copyrighted code or disassemble the application to make the unpacker, it's too simple. With decompression it's a bigger issue, and if it was a big corp releasing an open source GPL reversed version of some other big corp's compressor, then the lawsuit would be huge, trust me. EA is nice to you since it actually helps them, that's it. If they were losing some money over it they would sue you too, that's a fact.
EvilAlex, posted March 5, 2007:

Quoting flyby: "Guys, let's not loose focus here by bickering over copyrights. [...]"

B.t.w. when it comes to crediting people, I credit everyone who supplied information to me and those who they tell me were also involved, and I do respect EVERYBODY'S CONTRIBUTION. I SIMPLY can NOT credit THE ENTIRE WORLD. There is a good little Russian proverb for such a case: "There is no judgement for 'I don't know'". Pronounced "Na net suda net" in Russian. So if I accidentally miss someone, or if I simply don't know of someone's contributions, then I am sorry, but please be more patient, as I am human too, not some telepath, and I can not get info out of thin air.... :unsure:
booto, posted March 5, 2007:

Hi, I've been kind of lurking on this forum for the last few days, first playing about with an extractor for the big4 files, now playing about with a program for dealing with manifest/bin files. I've got something that seems vaguely usable, but I'm not sure if this is ground you guys have already explored or not. Basically, it extracts entries out of the .bin files.

Attached is the program. I'll throw the source up if people want to see it (it will take me a little bit of time to clean it up to a non-embarrassing level). Yes, it piggy-backs on much of the information talked about in this forum. Thanks to all who have been participating.

Because of the limitations placed on the extensions of uploaded files by the forum, you'll need to replace the extension with ".rar".

--booto

BinOpener.a_rar_file
Froniki, posted March 5, 2007 (edited March 5, 2007):

Try this Bin Unpacker, made by ai_enabled for me yesterday. I don't think it is that useful, because it does not use the relo (relocation) and imp (imports) files.

USE: CnC3_BIN_Unpacker.exe xxxx.bin (needs the manifest file)

PS: it extracts by virtual filenames and groups.
PPS: it needs .bin files without RefPack compression.

CnC3_BIN_Unpacker.zip_
booto, posted March 8, 2007:

Updated some things in the bin/manifest/imp/relo extractor/navigator I've been working on.

Extraction doesn't always work; sometimes the data exists in the cdata files or a map file (I assume) rather than in the .bin file. If the displayed 'TotalSize' is within a few bytes of the displayed 'Size' field, it 'should' extract okay.

It displays the values associated with a particular stream from the relo/imp file (as far as I can tell; it seems okay, each has a final entry of 0xffffffff like an end-of-list marker).

I've added type/extension detection for most of the resource types. I haven't bothered for resources where there only seems to be one instance of the type. These streams appear under the 'Unknown Data' node.

Also, the relo (relocation?) values for a particular stream always seem to be less than or equal to the size of the stream... any ideas?

--booto

BinOpener.a_rar_file
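booto's two observations about the relo/imp data (a final 0xffffffff entry, and relo values never larger than the stream size) are easy to check mechanically. The C sketch below assumes the per-stream data is a flat list of 32-bit little-endian values, which the post implies but does not state. The function names are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define LIST_END_MARKER 0xFFFFFFFFu  /* final entry seen in relo/imp lists */

/* ASSUMPTION: values are 32-bit little-endian. */
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Counts the values that come before the end marker.
 * Returns -1 if no marker is found inside the given bytes. */
static long count_list_entries(const uint8_t *data, size_t len)
{
    long n = 0;
    for (size_t off = 0; off + 4 <= len; off += 4, n++) {
        if (rd32(data + off) == LIST_END_MARKER)
            return n;
    }
    return -1;
}

/* Returns 1 if every value before the marker is <= streamsize,
 * 0 if some value is larger, -1 if no marker is found. */
static int relo_values_in_range(const uint8_t *data, size_t len,
                                uint32_t streamsize)
{
    for (size_t off = 0; off + 4 <= len; off += 4) {
        uint32_t v = rd32(data + off);
        if (v == LIST_END_MARKER)
            return 1;
        if (v > streamsize)
            return 0;
    }
    return -1;
}
```

If the values really are byte offsets into the stream that need patching at load time, which is what the name "relocation" would suggest, a result of 1 for every stream would be consistent with that guess, though it would not prove it.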
DetoNato, posted March 8, 2007:

Hello booto, your BinOpener tells me that the file vcl50.bpl is missing. Do you know which program I have to install to use your tool?