How do you design your application so it scales well to a big size? Scaling needs to be verified early in the design to prevent costly mistakes that usually appear later. You can scale in many ways. The number of users, the amount of data, and the size of the code are common ones. Avoid hard limits in the code and leave room to grow. Test with large amounts of data and many users, even if they’re not real.
This episode describes some design decisions I made recently to let my game handle a large number of game objects.
Listen to the episode for more details or read the full transcript below.
Transcript
By the time you notice that your program will not scale, it might be too late for simple fixes.
There’s a balance here that will come with experience. But we usually only learn from experience when we fail. What kinds of failures am I talking about and how can you notice them before they become bigger?
When we’re in school learning how to program or reading a book or even watching an online video, the goal is usually to teach some specific concept that you can use later. There are lots of little things you should be considering, but these would only get in the way of learning the main idea. So they’re left out of the topic.
There are just too many opportunities to get lost in the details when learning something new. The problems you’ll be learning about and solving will be small and very specific to the idea.
Let’s say that you’re first learning how to count. It might be okay to use your fingers at first. The problems you’ll be working with are designed for this. Things like 2 plus 3 are easy to visualize with fingers. Even some subtraction can be done with fingers.
But try to scale the problems so they’re bigger with sums in the thousands or millions and you run into difficulty.
The same thing happens in programming. If you need to keep track of items, then you might want to use a vector. It’s easy to push new items onto a vector. And you have simple ways to find things and remove them from vectors.
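As a rough illustration of those vector operations, here is a minimal C++ sketch. The helper names addItem, hasItem, and removeItem are made up for this example and the items are assumed to be plain strings.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: push a new item onto the end of the vector.
void addItem(std::vector<std::string>& items, const std::string& name) {
    items.push_back(name);
}

// Linear search: looks at each item in turn until a match is found.
bool hasItem(const std::vector<std::string>& items, const std::string& name) {
    return std::find(items.begin(), items.end(), name) != items.end();
}

// Removes the first matching item. Returns true if something was removed.
bool removeItem(std::vector<std::string>& items, const std::string& name) {
    auto it = std::find(items.begin(), items.end(), name);
    if (it == items.end()) {
        return false;
    }
    items.erase(it);
    return true;
}
```

Both the find and the remove walk the vector one item at a time, which is fine for small collections and is the cost that matters once the collection grows.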
But you need to know when to use a vector in any type of real software application. There are times when it’s the absolute best choice. And I’m not just talking about when the sizes are small.
It can scale very well to large solutions when used properly.
Because it’s so simple, it’s like using your fingers. So a lot of books and classroom lectures and videos will use it when explaining another topic. I have a book right now that I’m using to get ideas for the game library that I’m working on.
Finding something in an unsorted vector means that you have to examine each item one by one to see if it’s the one you want. So this book came up with the idea to use a bitmask first just to know if a particular item exists in the vector. A bitmask lets you quickly test if a binary bit is set to one and if so, then use that as a signal that the item you want is somewhere in the vector. This helps to avoid searching through the whole vector only to end up empty-handed. It’s better to know quickly and avoid the search if it’s not there.
What happens if the bitmask says the item should exist in the vector? Then the code will have to start visiting each item to see if it matches.
Beyond the time needed to check each item, there’s another scalability problem with this design.
Because the book doesn’t use an expandable bitmask, the design has a hard limit on the number of items it can support. Depending on the size of the integer used for the mask, that limit is either 32 or 64 items.
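The book’s actual code isn’t reproduced here, so the following C++ sketch is only a guess at the general shape of a bitmask placed in front of a vector. The Kind enum, the Item struct, and the ItemList class are all hypothetical names, and a 32-bit mask is assumed.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical item kinds. Every kind needs a fixed, well-known bit
// position, so adding a new kind means editing this shared list.
enum class Kind : std::uint32_t { Position = 0, Velocity = 1, Sprite = 2 };

struct Item {
    Kind kind;
    int value;
};

class ItemList {
public:
    // A 32-bit mask has room for at most 32 distinct kinds.
    static constexpr unsigned kMaxKinds = 32;

    void add(const Item& item) {
        items_.push_back(item);
        mask_ |= bitFor(item.kind);
    }

    // Quick test: is the bit for this kind set to one?
    bool mightContain(Kind kind) const {
        return (mask_ & bitFor(kind)) != 0;
    }

    // Skips the search when the bit is clear. Otherwise it still has to
    // visit items one by one. The returned pointer is only valid until
    // the next call to add.
    const Item* find(Kind kind) const {
        if (!mightContain(kind)) {
            return nullptr;
        }
        for (const Item& item : items_) {
            if (item.kind == kind) {
                return &item;
            }
        }
        return nullptr;
    }

private:
    static std::uint32_t bitFor(Kind kind) {
        return std::uint32_t{1} << static_cast<std::uint32_t>(kind);
    }

    std::uint32_t mask_ = 0;
    std::vector<Item> items_;
};
```

The mask only saves time when the answer is no. The limit of 32 comes from the width of the integer, not from the vector.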
To me, this is a bigger limitation that will prevent an application from growing bigger. A vector can always be swapped for a different data structure. But the use of this bitmask has a broader impact on the overall design of an application.
It requires that each item have a known bit in the bitmask. It makes it harder to add new items because each new item needs to become well known to all the other items.
I decided to take a different approach and register new items instead. The registered items will not have a bit to go in a bitmask.
So this design allows for many more items, however many are needed. This changes how the code is designed because there’s a whole process around registration now.
It’s the type of change that can be done easily early on but would require a lot of effort later.
I mentioned at the beginning that this is a balance. I’ve also seen it taken too far. I remember one response I saw several years ago to an interview coding question. The candidate was working with text and needed to be able to tell the difference between vowels and consonants. The proposed solution used a full hash table to hold the five vowels, a, e, i, o, and u.
Never mind that it left out y. This is a bigger and more elaborate solution than required. It’s not like the English language gets new vowels. I don’t remember the vowels ever changing.
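For comparison, a fixed set this small can be handled with a plain switch. This is just one possible sketch, not the answer the interview expected. The function name isVowel is made up, and it treats y as a consonant, which is an assumption.

```cpp
// Returns true for a, e, i, o, and u in either case.
// Assumption: y is not counted as a vowel here.
bool isVowel(char c) {
    switch (c) {
        case 'a': case 'e': case 'i': case 'o': case 'u':
        case 'A': case 'E': case 'I': case 'O': case 'U':
            return true;
        default:
            return false;
    }
}
```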
Now, I did use a hash table to quickly find items in the game library I’m working on. This is what I meant by being able to swap a vector for a hash table. The effect on the design is small when just considering the vector vs. hash table but bigger when you consider the other changes this small change allows. Getting rid of the bitmask and taking a registration approach should allow the game library to grow much bigger than a bitmask would allow.
It also reduces the complexity of the overall solution.
Sure, a hash table is a little harder and more complicated to use than a simple vector. And a bitmask is easier and simpler than registering items. But I think the biggest advantage will be in how new items can be added with no knowledge of other items. And I don’t have to go back to older items to make them aware of new items.
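The episode doesn’t show the game library’s code, so here is only a minimal C++ sketch of what registering items in a hash table could look like. The Registry class and its member names are hypothetical, and items are assumed to be identified by a string name.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical registry: items register themselves by name at run time
// and receive the next free id. There is no constant for a maximum count.
class Registry {
public:
    // Registers the name if it is new and returns its id.
    // Registering the same name again returns the existing id.
    std::size_t registerItem(const std::string& name) {
        auto result = ids_.emplace(name, ids_.size());
        return result.first->second;
    }

    // Hash lookup instead of visiting every item one by one.
    bool isRegistered(const std::string& name) const {
        return ids_.find(name) != ids_.end();
    }

    std::size_t count() const {
        return ids_.size();
    }

private:
    std::unordered_map<std::string, std::size_t> ids_;
};
```

A new item only needs to call registerItem with its own name. It doesn’t need a reserved bit, and nothing that was registered earlier has to change.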
So when you’re designing your solutions, look for hard limits and look for places where code knows about other code. If you find yourself creating a constant to track the maximum number of something, ask yourself what will happen if you eventually need more.
And as you write or change code in one place, if you find that you have to visit several other places to make allowances, then stop to look for ways that you can shift things around so that these changes can be done separately.
Do this even when you think it’s no big deal and still easy to manage. Because eventually, the problem will get bigger and then you’ll feel stuck in a design that’s starting to feel like a lot of work. And when you try scaling even bigger yet, you might really wish you had made a change way back when it was a minor inconvenience.