Decompiling C and C++ programs in Linux
Decompiling is the process of generating source code from a compiled binary (e.g. file.c out of a.out). I used the Mocha decompiler on a previous Java project, when the customer gave us only the class files instead of the Java source code. We decompiled the classes, generated source code from them, and studied the structure and logical flow of the project. I must say it was a tedious project.
Now suppose you have the source code, compile it, and later extract source code back from the binary. How will the two versions differ? Let us see.
For this purpose I am using the well-known decompiler Boomerang. It is available for both Windows and Linux. For Linux, download it from http://boomerang.sourceforge.net/. It depends on a separate libgc, so download that from the same site too and copy it to the /lib directory. It also depends on libexpat; just create a link to the existing expat library in /lib and it won't complain again.
My C program to test decompilation with Boomerang is:
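/**************************************************/
/* Program to check the characteristics of malloc */
/*                                                */
/**************************************************/
#include <stdio.h>
#include <stdlib.h>

int *fun(void)
{
    int *a;
    a = (int *) malloc(sizeof(int));
    free(a);
    /* Note: the block is freed before the pointer is returned, so   */
    /* using it in main is undefined behaviour. It is kept as is     */
    /* because this is the program that was actually decompiled.     */
    return a;
}

int main()
{
    int *j;
    j = fun();
    *j = 5;
    printf("%d\n", *j);
    return 0;
}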
Then compile it as
cc test.c
Now we have the much awaited a.out. Then run Boomerang on a.out. You will see something like
./boomerang a.out
Boomerang alpha 0.3 13/June/2006
setting up transformers…
loading…
Warning: dynamic symbol table hack used!
decoding entry point…
decoding anything undecoded…
finishing decode…
found 2 procs
decompiling…
decompiling entry point main
considering main
considering fun
decompiling fun
decompiling main
generating code…
output written to ./output/a
completed in 0 secs.
Go to output/a and you will find another test.c:
// address: 0x80483d9
int main(int argc, char **argv, char **envp) {
    int local7; // r24
    local7 = fun();
    *(int*)local7 = 5;
    printf("%d\n", 5);
    return 0;
}
// address: 0x80483b4
fun() {
    int local5; // r24
    local5 = malloc(4);
    free(local5);
    return local5;
}
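Well, it is almost the same, apart from the missing header files and a few simple changes: the variables have generated names, the pointers are typed as int, sizeof(int) appears as the constant 4, fun has lost its return type, and printf is passed the constant 5 instead of *j. But it does exactly what the original source code was meant to do.
I will definitely call it a 90% success.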
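The remaining 10% is mostly hand cleanup. As a rough sketch only (not something Boomerang produces), here is how the decompiled listing might be tidied into code that compiles cleanly. The names fun_cleaned and run_cleaned are hypothetical, and I have assumed the intent was simply to allocate an int, store 5 in it and print it, so unlike the listing this version frees the block only after its last use.

```c
#include <stdio.h>
#include <stdlib.h>

/* Cleaned-up counterpart of the decompiled fun(): the return type and
 * the pointer type are restored by hand. Unlike the listing, it does
 * NOT free the block before returning; the caller frees it instead. */
int *fun_cleaned(void)
{
    int *p = malloc(sizeof(int));   /* the listing shows malloc(4)      */
    return p;                       /* may be NULL if allocation failed */
}

/* Cleaned-up counterpart of the decompiled main() body.
 * Returns the stored value, or -1 if the allocation failed. */
int run_cleaned(void)
{
    int *j = fun_cleaned();         /* the listing shows: int local7    */
    int value;

    if (j == NULL)
        return -1;

    *j = 5;                         /* the listing: *(int*)local7 = 5   */
    value = *j;                     /* the listing passes the constant 5 */
    printf("%d\n", value);
    free(j);                        /* released only after the last use */
    return value;
}
```

Calling run_cleaned() from a main function prints 5, just like the original program was meant to, but without touching freed memory.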