How to get the source code of a software/program

Greetings!

This is my first post here in the CodeGuru Forums, and I'm still a newbie to coding and programming.
Sometimes I come across a program and don't understand how part of it (or all of it) works. So I wondered whether it might be possible to "decompile" a program to get its source code, just as you can view the source of a webpage. Is that possible at all? Is there any software designed for that purpose? If not, how could I build one?

Re: How to get the source code of a software/program

Well, it does make sense: they probably don't want anyone to steal their code, and most software that isn't open source doesn't come with the code.
If the EULA doesn't specify that it's forbidden to «tamper with it», then would there be any problem?

Re: How to get the source code of a software/program

I guess not, but even so there is no easy way to reverse engineer a program written in C++ or any other compiled language. What you will get is something very similar to running the application in a debugger that shows the assembly code. You can easily see what that looks like by attaching a debugger to a running process and hitting break.

Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are, by
definition, not smart enough to debug it.
- Brian W. Kernighan

Re: How to get the source code of a software/program

Should I assume that you just want to know how to do this for curiosity's sake? You do not want to steal anyone's code or do anything malicious, right? You just want to peek under the "hood"? All right then. It is relatively easy to disassemble a program, that is, to turn the machine code into assembly code, but it is much more difficult to turn that assembly code into a higher-level language with variable and function names that are meaningful to humans. When a program is compiled without debug symbols, the names of its variables and internal functions are lost. They are replaced by addresses that convey no information by themselves.
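
To make "turning machine code into assembly code" concrete, here is a minimal sketch in Python. It is a toy, not a real disassembler: it decodes only a handful of one- and two-byte 32-bit x86 instructions, the load address 0x401000 is an arbitrary assumption, and the sample bytes were hand-assembled for this example.

```python
# Toy disassembler for a handful of 32-bit x86 encodings (illustration only).
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def disassemble(code, base=0x401000):
    """Return (address, text) pairs; `base` is an arbitrary, assumed load address."""
    listing = []
    i = 0
    while i < len(code):
        op = code[i]
        size = 1
        if 0x50 <= op <= 0x57:                      # push r32
            text = "push " + REGS[op - 0x50]
        elif 0x58 <= op <= 0x5F:                    # pop r32
            text = "pop " + REGS[op - 0x58]
        elif op == 0x89 and i + 1 < len(code) and code[i + 1] >= 0xC0:
            modrm = code[i + 1]                     # mov r32, r32 (register form)
            text = "mov %s, %s" % (REGS[modrm & 7], REGS[(modrm >> 3) & 7])
            size = 2
        elif op == 0x90:
            text = "nop"
        elif op == 0xC3:
            text = "ret"
        else:
            text = "db 0x%02x" % op                 # byte this toy does not decode
        listing.append((base + i, text))
        i += size
    return listing

# Prologue and epilogue of an empty function, hand-assembled for this example.
machine_code = bytes.fromhex("5589e55dc3")
for address, text in disassemble(machine_code):
    print("%08x  %s" % (address, text))
```

The listing contains only registers and addresses. Whatever the function was called in the original source appears nowhere in these bytes, which is why going from here back to readable C++ is the hard part.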

Nevertheless, with a thorough understanding of both assembly and high-level programming, and enough free time on your hands, there are things you can do. For example, I sometimes use Notepad to view C++ code I have written without having to run the C++ development environment. However, the default tab stop in Notepad is eight characters, which makes my code harder to read. Using white hat hacking techniques, I was able to change the tab stop to four characters, the same as in the C++ development environment. I basically added the equivalent of two lines of high-level code to the executable file. This hack involved coding in machine language.

Another example was when I wanted to know how to set the wallpaper of my desktop through code. By looking at the MS Paint executable, I discovered the name of the Windows API function I needed to call. I then read the documentation for this function and was able to implement it in my own code.
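
One simple way to spot API names inside an executable is to scan it for runs of printable text, since imported function names are stored as plain ASCII. The sketch below does that in Python. The sample bytes are made up for illustration, and SystemParametersInfoA appears in them only as an assumption about which wallpaper-related function might turn up; the post does not say which function was actually found.

```python
import re

def ascii_strings(data, min_len=6):
    """Return runs of printable ASCII that are at least `min_len` bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]

# Made-up bytes standing in for a slice of an executable's import data.
sample = b"\x00\x01USER32.dll\x00\x8b\xffSystemParametersInfoA\x00\x03hi\x00"

for name in ascii_strings(sample):
    print(name)
```

For a real file you would read the bytes with open(path, "rb").read() and pass them to ascii_strings. Expect a lot of noise in the output; the interesting names are usually the ones that look like documented API functions.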

The best use I have found for hacking programs is hacking my own executable files when they crash. When memory gets corrupted, seeing the relationship between various memory locations in the compiled code can be very insightful. Also, it is possible to insert debugging code that will not accidentally "fix" the problem by moving it to some other place.

Re: How to get the source code of a software/program

Originally Posted by Coder Dave

Now, what do you have in mind?

It took me a good five minutes to fully understand what you were saying there. No offense. Well, I have neither the knowledge nor the free time, so I guess I will have to develop my skills much further first. I assumed I wasn't the first one to have this idea and that some software for it had already been developed, but it turns out that is not the case.

How would you disassemble a program? It seems complicated to me...

And no, I respect the work of coders and programmers, and I know their livelihoods depend on that code, so I'm not going to take that away from them. So no, not a bit of malicious or thieving intent.

Re: How to get the source code of a software/program

Originally Posted by wkwkwkwk1

How would you disassemble a program? It seems complicated to me...

If you have Microsoft Visual Studio installed and an application crashes due to an access violation or an invalid machine instruction, you will be asked whether you want to start the debugger that comes with Visual Studio. This debugger has a rudimentary disassembler that will let you see what assembly language looks like. Using a book about x86 machine language, I wrote my own disassembler that shows more in-depth information and breaks the machine code out into functions by following the program flow from the beginning. This is definitely not a task for a beginner. I come from the Apple II generation, when you had to write programs in machine language just so they would run in real time. Compilers were hard to come by for Apple II computers, but the machines did have a built-in disassembler.
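
The idea of splitting machine code into functions by following the program flow can be sketched in a few lines of Python. This is not the disassembler described above, only a toy version of the general approach (often called recursive traversal): start at the entry point, decode forward, and treat every call target as the start of another function. It understands just a few x86 opcodes, and it assumes a function ends at its first ret, which real code does not guarantee.

```python
import struct

def instruction_length(code, i):
    """Length of the instruction at offset i, for the tiny subset handled here."""
    op = code[i]
    if op == 0xE8:                   # call rel32
        return 5
    if op == 0x89:                   # mov r/m32, r32 (register form assumed)
        return 2
    return 1                         # push/pop/nop/ret, or a byte we do not know

def find_functions(code, entry=0):
    """Follow call targets from `entry`; return the sorted function start offsets."""
    functions = set()
    pending = [entry]
    while pending:
        start = pending.pop()
        if start in functions or not 0 <= start < len(code):
            continue
        functions.add(start)
        i = start
        while i < len(code):
            op = code[i]
            if op == 0xE8 and i + 5 <= len(code):
                # The call target is relative to the address of the next instruction.
                rel = struct.unpack_from("<i", code, i + 1)[0]
                pending.append(i + 5 + rel)
            if op == 0xC3:           # ret: treated as the end of this function
                break
            i += instruction_length(code, i)
    return sorted(functions)

# Hand-assembled sample: a function at 0 calls one at 7, which calls one at 16.
sample = bytes.fromhex("e802000000" "c3" "90" "55" "e803000000" "5d" "c3" "90" "c3")
print(find_functions(sample))
```

Note that the padding bytes between functions are never decoded, because nothing in the program flow reaches them. That is the main advantage over simply decoding the file from top to bottom.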

I do not know of any publicly available software that does what you want, but you are not the first one to think of this.

Re: How to get the source code of a software/program

Thank you for the information. So I guess I will have to dig a lot deeper into coding. What programming language do you suggest for a start? I have dabbled in C++ and Python, but I know very little of either.

Re: How to get the source code of a software/program

There have been disassemblers around for as long as I've been in the business, but I don't know how good they are today. Anyway, I think it's safe to assume that you don't have to start from the raw binary yourself unless you really want to. The task of translating optimized assembly into something higher-level, however, is very hard. If you really want to go further with this, I recommend that you check what SourceForge has to offer. Downloading something from another source might be pretty unsafe, since this is in the gray zone of what's legal.

Re: How to get the source code of a software/program

General comment --
The reason there are no books for sale on how to break into people's houses is that the author could become liable if a reader then broke into one. The same could be said of breaking into software.

It sounds like you (wkwkwkwk1) haven't done much, if any, programming. If that is the case, then starting your education by cracking and hacking existing code is not the way to go. That is like trying to build an aircraft carrier before you've even learned how to build a basic fishing boat. There is a ton of code (millions if not hundreds of millions of lines) available on the internet and on this site. There is a lot of public code to review if your objective is to learn.

In fact, there is so much good open code out there that the idea of cracking or hacking code in order to learn seems... bogus.

Re: How to get the source code of a software/program

I understand what you are saying. Now that I'm starting my holidays, I will have more time to dig deeper into programming.
By the way, what does "bogus" mean? I'm from Portugal, so my English is quite limited.

Do you have any suggestions about what would be a good programming language to start with? I was thinking about Python.

Re: How to get the source code of a software/program

I have never used Python, so I do not know anything about it. I mainly use C++. It sits somewhere in the middle as programming languages go. It is not tedious like assembly language, but at the same time it allows you to work directly with pointers and type casts. Other higher-level languages are more restrictive. In addition to learning a programming language, I recommend learning some of the theory behind computer science. Specifically, I recommend learning about data structures and algorithms. You can do a lot more if you create your own data structures instead of always relying on someone else's code.

Now, the reason I know so much about looking into others' code is that, as I mentioned before, I grew up programming Apple II computers. I was completely self-taught until I got into college. There were not many books about programming back in those days, so I learned a lot from looking at how existing programs worked. I am the type who learns by example. Today, however, there are a ton of books on programming in all sorts of languages. There is also the internet, which has lots of valuable sources, including this website. Most of what you will want to learn is already documented, but not everything... You will still need to experiment.

Re: How to get the source code of a software/program

Thank you for the information. I have already tried C++ several times and given up. Now I have been trying Python, and it's going fairly smoothly. I think that's down to the tutorial itself. And no, it's not «bogus», but I know it's easy to think so. I would, too.

Re: How to get the source code of a software/program

It's the Java SE Development Kit (JDK) 7 you want to use, and if you download it bundled with NetBeans you also get a nice development environment.

Source code for learning purposes is almost useless in isolation. It works well only as part of a thorough explanation. Or when you know already exactly what you're looking for. Otherwise it's very hard to absorb what went into a piece of code just by staring at it. It's like trying to become a Rembrandt or a Picasso by running their paintings through a copying machine.

Besides, decompilation has its limits. The original source can seldom be fully recovered, because a lot usually gets lost in compilation. To continue the painting analogy: what you end up with is a blurred black-and-white version of the crisp, colorful original.
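
A small Python experiment shows this kind of loss, even in a much milder setting than compiling C++. The sketch below parses a snippet into Python's syntax tree and turns the tree back into source with ast.unparse (this assumes Python 3.9 or later). The function area and its comments are made up for the example.

```python
import ast

# Made-up source with comments that explain the author's intent.
source = '''
def area(width, height):
    # Rectangles only; callers must pass positive numbers.
    return width * height   # no rounding on purpose
'''

tree = ast.parse(source)          # comments are dropped at this stage
recovered = ast.unparse(tree)     # regenerate source from the tree
print(recovered)
```

The regenerated code still works and still has its names, but every comment is gone. A native compiler discards far more than this (local names, types, the original control structure), so a decompiler has much less to rebuild from.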

Re: How to get the source code of a software/program

The only way to learn to program is to actually write programs. If you have the source code for something, you still need to try it to see whether it does what you think it does. Often you will need to make small changes to the code to get it to work, because the documentation may have been written for a different version of the compiler, with slightly different libraries, different settings, or whatnot. Also, your use of the code may rest on different assumptions. Experimentation is the key.