Many languages are adopting a model of just-in-time compiling. Do you know how this affects you? This episode will discuss the advantages and disadvantages of just-in-time compiling.
Instead of interpreting the source code line by line, or compiling everything up front, just-in-time compiling gives you benefits from both approaches and then some.
There are some gotchas to be careful of though. For example, you might think that because you compile your source code first and distribute only an intermediate language, your source code is safe within your company. But in many cases the first compilation organizes your code, and with all the extra information that is included to make sure your code is behaving well, you can actually end up with intermediate code that is easier to understand than your original.
Listen to the full episode for more about just-in-time compiling, or read the full transcript below.
Transcript
Let’s say some new microprocessor with some new instructions just came out last year. Not all of your customers will have upgraded yet so what do you do? You could have special versions of your application for the old and new processors but managing that and expecting the customer to understand is tough. Even if your customer understands what you’re doing, will the customer be able to figure out which version of your software to install? And what if the customer does upgrade their processor after installing the old version of your software? Will your software know that it can now take advantage of the new instructions?
Flexibility and the ability to run your application on a variety of platforms is really good but there has traditionally been a cost with this. In order to get the most flexibility, you need to interpret your source code. But interpreting the source code on the target platform means that the source code must be available. And many companies don’t want their source code to be publicly available.
For a long time, the only answer to this was to compile the source code to a common and well supported set of microprocessor instructions. This is not so bad really. I mean, sure, the customer still has to make sure to install a Windows version of your software on a PC, or an OS X version on a Mac, or a Linux version on Ubuntu. But most of the time, this can be handled automatically for the user when the user navigates to your download page. Your download page can detect the user’s operating system and provide the correct link. The customer gets a compiled version of your application that runs fast.
But what if there was a way to compile your source code not directly to the microprocessor itself but to something that another program could understand? This intermediate step would be independent of any specific microprocessor and platform. It would need to go through a second compilation before it could actually run. This second compilation could wait until it was about to run and would have the specific information about all the instructions available. You might even get benefit not just from additional microprocessor instructions but from more advanced libraries or newer techniques in software that were unknown previously.
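To make the idea of waiting until run time a little more concrete, here is a minimal C++ sketch. It is only an analogy and not a real just-in-time compiler: the program ships with two versions of the same routine and picks one once it knows what the machine can do. Everything here is made up for illustration, including the `MachineInfo` struct, its `hasNewInstructions` flag, and the function names. The "new" version just handles four numbers per loop pass to stand in for code that would use newer instructions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical description of the customer's machine. A real runtime
// would detect this itself; here the caller fills it in by hand.
struct MachineInfo {
    bool hasNewInstructions;
};

// Plain version that works everywhere.
long sumBasic(const std::vector<int>& values) {
    long total = 0;
    for (int v : values) total += v;
    return total;
}

// Stand-in for a version tuned to a newer processor: it adds four
// values per loop pass, then finishes any leftovers one at a time.
long sumTuned(const std::vector<int>& values) {
    long total = 0;
    std::size_t i = 0;
    for (; i + 4 <= values.size(); i += 4) {
        total += static_cast<long>(values[i]) + values[i + 1] +
                 values[i + 2] + values[i + 3];
    }
    for (; i < values.size(); ++i) total += values[i];
    return total;
}

using SumFn = long (*)(const std::vector<int>&);

// The decision is made on the customer's machine, not on the developer's.
SumFn chooseSum(const MachineInfo& machine) {
    return machine.hasNewInstructions ? sumTuned : sumBasic;
}
```

Calling `chooseSum(MachineInfo{true})` hands back the tuned routine and `chooseSum(MachineInfo{false})` hands back the basic one. Both give the same answer. A just-in-time compiler goes further than this because it generates the machine code on the spot instead of choosing between versions that were prepared ahead of time.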
This solves many problems actually. I mentioned that some companies don’t like the idea of releasing their source code. With this system, the source code can remain safely within the company and only the result of compiling the source code to the intermediate language needs to be sent to the customer. However, you do need to consider that the intermediate language is quite well formed and might actually be easier to understand than your original source code.
There are ways to make the intermediate language more confusing by using a program called an obfuscator. An obfuscator mixes up the code that gets sent to your customers, making it difficult for anybody to understand what’s going on. Just be aware that, like any engineering decision, there’s usually a tradeoff. Making your code harder for somebody to reverse engineer also makes it harder for you to debug problems.
Anytime your code, either in its original source form or in this new intermediate language form, needs to be interpreted or compiled, there needs to be some other software running on your customer’s computer to do this work.
This is called a runtime.
The runtime might come with the computer operating system or it might need to be downloaded and installed as a separate required step. Either way, it’s extra code that needs to be loaded and running alongside your program.
You’ll need to consider this runtime if you take this approach because the runtime can also have its own versions. It’s possible that your customer could update their runtime and break your program. Or some customer might just refuse to install your application because it needs a runtime version that they’re just not willing to install.
If this seems like a lot of extra roadblocks, you’re right.
But there are two other benefits that we haven’t thought about yet. They’re both subtle and worth the trouble all by themselves. And together, they make C# a real powerhouse. The benefits are worth the extra time it takes to install your program, worth the extra time it takes to first start the runtime before your program can begin, and they’re even worth all the extra versioning.
You see, customers will put up with some amount of slowness and in fact might not even notice any slowdown at all due to the second compilation step. But they’ll definitely take notice of a crash. It may only take one crash to cause customers to stop using your application and post negative reviews. The negative reviews will then influence other potential customers to go somewhere else.
A language like C# helps you avoid bugs that can lead to crashes by identifying problems with your code before you release your product to the world. It does this by including extra information in the intermediate language that the runtime can use to double check to make sure you’re doing things as expected.
Let’s take our game of turning left, right, jumping and stepping as an example. We said that the number of steps would be limited to 0 to 5. If this were compiled with C++ and your source code had a bug that tries to take 6 steps, then the C++ compiler won’t always be able to catch this mistake and your program will run with 6 steps. This might work, or it might cause some hard-to-find bug that slips through your testing only to be discovered by your customers. The C# runtime, though, when it looks at the intermediate language code, will be able to see clearly that you are trying to take 6 steps when only 5 are allowed. This is still a bug.
Just going through another round of compilation will not magically fix your mistakes. But this bug will result in a notification to the rest of your program that something is wrong right away. You can write code to deal with this in such a way that your customer has a better experience than losing half a day’s worth of work.
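Here is a minimal sketch of that difference, written in C++ so both behaviors can sit side by side. It is not how the C# runtime is implemented. I'm using the checked `at()` function on `std::array` as a stand-in for the kind of check a managed runtime does for you. The names `kDistanceForSteps`, `distanceFor`, and `describeMove` are made up for this example, and so are the distance values.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical table for the game: one entry for each allowed step
// count, 0 through 5. The distances are made-up values.
const std::array<int, 6> kDistanceForSteps = {0, 10, 20, 30, 40, 50};

// An unchecked lookup, kDistanceForSteps[steps], with steps == 6 is
// undefined behavior in C++. It may appear to work or it may not.
// The checked lookup below verifies the index first and throws
// std::out_of_range when the index is outside 0 to 5.
int distanceFor(std::size_t steps) {
    return kDistanceForSteps.at(steps);
}

// The caller is told about the problem right away and can respond
// with something friendlier than a crash.
std::string describeMove(std::size_t steps) {
    try {
        return "moved " + std::to_string(distanceFor(steps));
    } catch (const std::out_of_range&) {
        return "invalid move: steps must be 0 to 5";
    }
}
```

With this sketch, `describeMove(2)` gives back "moved 20" and `describeMove(6)` gives back the invalid move message. The bug of asking for 6 steps is still there, but the rest of the program finds out about it immediately and gets a chance to handle it.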
And the other benefit is more for you than the customer. Many types of programs are just plain faster to develop in C# than in C++. This is due in large part to the enormous libraries that come with C# that are really designed to blend together well. Other languages including C++ have large libraries available too but they’re usually designed by separate organizations and companies. This means when you are working with these different libraries that you need to switch how you think and how you write your code to fit each library. With C#, they all resemble each other so much that you can get more done in a shorter time.
I remember one project I worked on where we had tasks estimated to take up to a week each. And there were a lot of them. We switched this portion of the project to C# and found that we could complete each week-long task in just a few hours. We saved months of development time.
So to review, a native compilation results in a small and fast application that can run all by itself. It’s specific to a particular microprocessor and platform, so it tends to target common, well known instructions and may not always be able to take full advantage of the customer’s computer. Your program can even be small enough to run just fine in microcontrollers used in robotics. And it doesn’t normally need to be obfuscated.
Fully interpreted languages need the actual source code in order to run as well as the interpreter. The interpreter could be built into the customer’s browser or some other software that the customer does not even realize is being used to run your software. Your application will run slower because each instruction needs to be translated or interpreted while the program is running and the result of the interpretation is normally not saved anywhere. So if your program does the same thing again, then it will likely need to interpret the same instructions again.
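The cost of interpreting the same thing over and over can be sketched in a few lines of C++. This is a toy and not a real interpreter. The one-letter commands for left, right, jump, and step are made up for this example, and so are all the names, including the `g_translations` counter that tracks how many times a piece of source gets translated.

```cpp
#include <string>
#include <vector>

// Hypothetical move set from the game: left, right, jump, step.
enum class Move { Left, Right, Jump, Step };

// Counts how often a character of source gets translated.
// A global is used only to keep the sketch short.
int g_translations = 0;

// Translate one source character into a move. Anything that is not
// 'L', 'R', or 'J' is treated as a step to keep the example small.
Move translate(char c) {
    ++g_translations;
    switch (c) {
        case 'L': return Move::Left;
        case 'R': return Move::Right;
        case 'J': return Move::Jump;
        default:  return Move::Step;
    }
}

// Apply one move to a position counter. Only steps and jumps change it.
void apply(Move m, int& position) {
    if (m == Move::Step) position += 1;
    if (m == Move::Jump) position += 2;
}

// Interpreter style: translate every character again on every run.
int runInterpreted(const std::string& source, int runs) {
    int position = 0;
    for (int r = 0; r < runs; ++r)
        for (char c : source) apply(translate(c), position);
    return position;
}

// Translate-once style: save the translated moves and reuse them.
int runTranslatedOnce(const std::string& source, int runs) {
    std::vector<Move> moves;
    for (char c : source) moves.push_back(translate(c));
    int position = 0;
    for (int r = 0; r < runs; ++r)
        for (Move m : moves) apply(m, position);
    return position;
}
```

Running the four commands "LSJR" three times ends at the same position either way. The interpreter style translates 12 characters to get there, while the translate-once style translates only 4, because it saved the result of the first pass.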
Just-in-time compilation provides all the benefits of compiled code with a slight delay as the intermediate language needs to be compiled before it can run. The second compilation is more efficient because the initial compilation does most of the work of trying to figure out what you wrote in your source code and turns that into a standard intermediate form. This second compilation, as well as the additional verification to help find bugs, also needs extra software just like the interpreted languages. The runtime for all this tends to be much larger than a simple interpreter. It will take longer to install, it will take longer to load, and it will use up a lot more resources on the customer’s computer to hold the runtime in memory alongside your program.
As computers become more powerful with extra memory and faster hard drives, the extra burden caused by a just-in-time compiler runtime will become less of an issue. And as internet speeds become faster, the time it takes to download a new runtime also becomes less of an issue.
The future does look like it’s moving more towards just-in-time compilation. But at the same time, there will always be a place for small and fast programs that don’t need a runtime and that can squeeze into places that a runtime would never fit.