140: Name Mangling and Overloaded Methods.

You normally don’t have to worry about name mangling. But you should know what it is.

This sounds a lot worse than it is. Name mangling is a technique that compiler vendors use to give overloaded methods unique names. It’s also sometimes called name decoration. You’ll likely see strange names that don’t match what you might expect when debugging or when trying to understand linker errors.

Before C++ or other languages that allow you to create methods with the same name, each method had to have a unique name. It’s easy to take for granted this ability to name methods with the same name. At some point though, the compiler needs to know exactly which method will be called. And the linker also needs to be able to identify one method from another. We can’t use a simple method name for this when that’s not unique.

What most languages do is enforce that the method name along with other factors such as the parameter types is unique. At least that’s what we see when programming. But behind the scenes, the compiler renames methods by using combinations of the method name, special characters, the parameter types, and possibly the calling convention, return type, class name, and other factors. It’s not a very readable name. And there’s no single standard that all compilers follow. They’re all different, sometimes even between versions of the same compiler.

Sometimes the name mangling is standardized. This is usually done for simpler languages like C so that code from different compiler vendors can make operating system calls. Yes, even the C language, which doesn’t allow method overloading, can benefit from name mangling to distinguish how a method should be called. This is called the method’s calling convention.

Listen to the episode, or read the full transcript below, for more details including why compiler vendors don’t just agree on a standard naming system.

Transcript

This sounds a lot worse than it is. Name mangling is a technique that compiler vendors use to give overloaded methods unique names. It’s also sometimes called name decoration.

You’ll likely see strange names that don’t match what you might expect when debugging or when trying to understand linker errors.

Before C++ or other languages that allow you to create methods with the same name, each method had to have a unique name.

Let’s say that you’re writing an adventure game and want to allow the hero to be healed. There are different ways to heal the hero and two of them might be through the use of bandages or magic. Both methods heal the hero but need different parameters. You couldn’t have two methods both called heal so instead you needed to create a healByBandage method that takes a bandage parameter and another healByMagic method that takes a spell.

This works but you have to come up with unique names for each method. Sometimes the names aren’t as easy as this. Maybe several parameters are involved. Do you really want another method called healByBandageAndOintment?
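Here is a minimal sketch of what that looks like without overloading. The Hero, Bandage, and Spell types and their fields are invented for this example and are not from any real game.

```cpp
// Hypothetical game types, made up for this sketch.
struct Hero { int health = 50; };
struct Bandage { int strength = 10; };
struct Spell { int power = 25; };

// Without overloading, every way of healing needs its own unique name.
void healByBandage(Hero& hero, const Bandage& bandage) {
    hero.health += bandage.strength;
}

void healByMagic(Hero& hero, const Spell& spell) {
    hero.health += spell.power;
}
```

Every new combination of parameters would need yet another name, which is where this approach starts to get awkward.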

And then there are constructors, which you can’t just rename however you want. Depending on the language, a constructor usually has to match the class name. Creating a class that’s flexible and can be created with a default constructor, a copy constructor, and several other overloaded constructors would not be possible without method overloading.
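As a rough illustration in C++, here is a hypothetical Hero class with three constructors. They all have to be named Hero, so the only thing that can tell them apart is their parameter lists. The member names and default values are assumptions made for this sketch.

```cpp
#include <string>
#include <utility>

// Hypothetical class for illustration only.
class Hero {
public:
    // Default constructor: no parameters.
    Hero() : name_("Unnamed"), health_(100) {}

    // Overloaded constructor: takes a name and a starting health.
    Hero(std::string name, int health)
        : name_(std::move(name)), health_(health) {}

    // Copy constructor: takes another Hero.
    Hero(const Hero& other) = default;

    const std::string& name() const { return name_; }
    int health() const { return health_; }

private:
    std::string name_;
    int health_;
};
```

Writing `Hero a;`, `Hero b("Aria", 80);`, and `Hero c(b);` picks the default, overloaded, and copy constructors in that order.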

It’s easy to take for granted this ability to name methods with the same name. At some point though, the compiler needs to know exactly which method will be called. And the linker also needs to be able to identify one method from another. We can’t use a simple method name for this when that’s not unique.

What most languages do is enforce that the method name along with other factors such as the parameter types is unique. This means that because the healByBandage method takes a bandage parameter and the healByMagic method takes a spell parameter, they’re already unique through their different parameter types. The method name can now be just heal for both versions.
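With overloading, the same idea can be sketched in C++ like this. Again, the Hero, Bandage, and Spell types are hypothetical and exist only for the example.

```cpp
// Hypothetical game types, made up for this sketch.
struct Hero { int health = 50; };
struct Bandage { int strength = 10; };
struct Spell { int power = 25; };

// Both functions are named heal. The compiler tells them apart
// by the type of the second parameter.
void heal(Hero& hero, const Bandage& bandage) {
    hero.health += bandage.strength;
}

void heal(Hero& hero, const Spell& spell) {
    hero.health += spell.power;
}
```

The caller just writes heal and passes either a Bandage or a Spell. The compiler selects the matching version from the argument type.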

At least that’s what we see when programming. But behind the scenes, the compiler renames the simple heal method to maybe something like heal $ 1 bandage $. The name will be some combination of the method name, special characters, the parameter types, and possibly the calling convention, return type, class name, and other factors. It’s not a very readable name. And there’s no single standard that all compilers follow. They’re all different, sometimes even between versions of the same compiler.
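To make the idea concrete, here is a toy mangler written in C++. The scheme is invented for illustration and loosely follows the made-up name mentioned above: the method name, a dollar sign, the number of parameters, then each parameter type followed by a dollar sign. No real compiler uses this scheme, and toyMangle is a hypothetical helper.

```cpp
#include <string>
#include <vector>

// Toy scheme, NOT a real compiler's mangling:
//   name + "$" + parameter count + (type + "$") for each parameter
std::string toyMangle(const std::string& name,
                      const std::vector<std::string>& paramTypes) {
    std::string mangled = name + "$" + std::to_string(paramTypes.size());
    for (const auto& type : paramTypes) {
        mangled += type + "$";
    }
    return mangled;
}
```

Under this toy scheme, a heal method taking a bandage becomes heal$1bandage$ and one taking a spell becomes heal$1spell$, so the linker would see two different names. For comparison, GCC and Clang on most non-Windows platforms follow the Itanium C++ ABI, where a free function heal(Bandage) ends up as _Z4heal7Bandage.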

This is the basic explanation of method name mangling. I’ll provide more insight right after this message from our sponsor.

I keep using gaming examples because, well, writing games is fun and it makes programming fun too. How would you like for me to show you exactly how to write a real game from start to finish? I’m organizing a 5-day workshop that will show you everything you need to go from a complete beginner to writing your own game. This will be a graphical 2D side scroller and I’ll show you how to get your game character running, jumping, and climbing through multiple levels. You’ll learn fundamental programming techniques first-hand beyond anything that I can describe in this podcast. Let me know if you’re interested by texting gameweek as a single word to the short number 44222 and I’ll give you all the details. There’s a really good offer that’ll be available for a limited time and there’s only room for 20 students. So if you want to take your programming beyond a hobby and learn how to make a real game that you can sell online, then text gameweek to the number 44222. I know that some phones don’t have access to short numbers like this. Or maybe you’re listening from outside the United States. Just go to takeupcode.com/contact and send me a message.

Alright, back to name mangling. I mentioned that the method parameter types must be unique. Let’s say for a moment that the heal method doesn’t need bandages or spells but the specific number of health points to heal and that you want another heal method to take the number of hours to rest. These are both integer types and even though they might mean very different things, you can’t overload a method based on the names of the parameters. The types have to be different. You also can’t overload a method based on the return value type. The compiler just looks at the parameter types and for some languages such as C++, it can also use the constness of class methods to determine uniqueness. Different languages may have different rules for what makes one method unique from another.

If you do want to use similar parameter types, then you’ll just have to give your methods different names.
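Here is a small C++ sketch of both rules. The Hero class, its member names, and the rate of 5 health points per hour of rest are all assumptions made for this example.

```cpp
#include <string>

// Hypothetical class for illustration only.
class Hero {
public:
    // These two overloads differ only in constness. The compiler picks
    // the const version when name() is called on a const Hero.
    std::string& name() { return name_; }
    const std::string& name() const { return name_; }

    // Both of these would take a single int, so they cannot share a name:
    //   void heal(int healthPoints);
    //   void heal(int hoursOfRest);  // error: same parameter types
    // Giving them different names resolves the clash.
    void healByPoints(int healthPoints) { health_ += healthPoints; }
    void healByRest(int hoursOfRest) { health_ += hoursOfRest * kHealthPerHour; }

    int health() const { return health_; }

private:
    static constexpr int kHealthPerHour = 5;  // assumed rate for this sketch
    std::string name_ = "Unnamed";
    int health_ = 50;
};
```

Another option, not shown here, is to wrap each int in its own small type so that the parameter types really are different.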

You might wonder why the compiler vendors don’t just form a committee and agree on a standard. The problem is that how methods are named is just one aspect of interoperability. How class methods are arranged in vtables, how exceptions are thrown and pass through method calls while unwinding the stack, and other aspects too such as padding and memory layout all determine if code that’s compiled with one compiler will work with code compiled with a different compiler. Most languages are not designed to support this level of interoperability. So just agreeing on how method names should be formed could actually make things worse.

If you ever need to interact with other languages or with code built by different compilers, then name mangling is not the only thing you’ll want to avoid.

Maybe you want to allow other developers to extend your application through plugins. This is an excellent way to allow fans to build custom features into your application that maybe you don’t have time or the resources to build yourself.

The plugins will need some way to interact with your application, usually by calling methods between the application and the plugins. You’re not going to be able to control what development tools are used to build the plugins. So you need to avoid any advanced features of your language and stick to something like the C language, provided your platform has a standard for it such as what Microsoft worked out with several compiler vendors.

You can do this by defining a small set of methods that will be used as the communication point between the application and the plugins. When you declare these few methods to use the C language, the compiler may still change the names but will do so in an agreed manner.
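In C++, declaring methods to use the C language is done with extern "C". The sketch below shows what a tiny plugin boundary might look like. The function names plugin_api_version and plugin_heal are hypothetical, and anything platform-specific, such as marking the functions for export from a shared library, is left out.

```cpp
// Hypothetical plugin entry points, invented for this sketch.
// extern "C" gives these functions C language linkage, so their
// linker names carry no C++ parameter type information.
extern "C" {

// Lets the application check which version of the interface a plugin uses.
int plugin_api_version() {
    return 1;
}

// Only plain C types cross the boundary: no classes, no exceptions.
int plugin_heal(int currentHealth, int amount) {
    return currentHealth + amount;
}

}  // extern "C"
```

One consequence of C linkage is that these functions can’t be overloaded, so each one needs a unique name again, just like in the days before overloading.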

Or you can use some other form of communication such as named pipes or a web server where you’re not really making method calls but just sending data.

Or you can use a binary standard such as COM. I’ll explain these alternate topics in future episodes.
