A game that only has one character with one quest and with one shop that sells one item isn’t going to be very fun. That’s why you need to be able to work with collections of items. You have choices and this episode continues more than a week long exploration of collection types available. You’ll learn when to use each type of collection and why. Up today is another tree called a left-child right-sibling tree. I’ll explain what these trees are and then give you some guidance on when to use them.

Here’s what a left-child right-sibling tree might look like when holding some folders. This example shows a root node “C:” with two children, “Program files” and “Windows”. The “Program Files” folder has its own folder for “Office”. A real file system will have a lot more folders but it will follow this basic pattern shown below:

root -> node(C:)

node(C:).leftChild -> node(Program Files)

node(Program Files).parent -> node(C:)
node(Program Files).leftChild -> node(Office)
node(Program Files).rightSibling -> node(Windows)

node(Office).parent -> node(Program Files)

node(Windows).parent -> node(C:)

Listen to the full episode or you can also read the full transcript below.


Are you familiar with how to arrange your folders and files on your computer? You might have a single root folder on a Mac or Linux or several drive letters on Windows that are root folders. Inside a root folder, you can have other folders and files. And inside a regular folder, you can have other folders and files. This is a hierarchy. It starts with a single node and can branch out quickly, or go deep, or both. You might have just a few folders at the root level but then have some folder that contains thousands of folders or files.

In a situation like this, you’re more concerned about the structure than sorting. That doesn’t mean that finding files is not important. You probably still want some other data structure to keep track of where files are overall. But you need a tree that can let you place a folder or a file in an exact place. This place may not have anything in common with other items nearby but that doesn’t matter. If the user wants to place a file inside some folder, then that’s where it needs to go. We can’t use a binary tree because it places items where it needs to based on some comparison.

Maybe you could have a folder class that contains pointers for child folders. But how many child folder pointers do you need? You don’t know so this needs to be a collection. Should you put child folders and files together in this collection? Probably not because folders need to have the ability to contain other child items while files don’t. If both folders and files were in the same collection, then you would need some common interface for both of them. It’s better to keep child folders and child files in separate collections.

How about using a couple vectors. One for folders and one for files? This would definitely allow you to keep track of a collection of folders and files at any given folder.

Maybe this is exactly what you need. You can get very creative with just some vectors and your own classes. But the problem with this is that the overall structure switches back and forth between vectors and your own classes. If you later find that you need a similar structure for different classes, then you’ll need to implement all this all over again. And if you have code that’s designed to work with your folder and vector structure, then it will also need to be adapted to work with the new class and vector structure.

What if you could create a collection type that allows you to represent a hierarchical folder structure and that would work with any class; not just folders? We’ll keep the files in the folder class because they’re not really part of the hierarchy structure. Although, some files might act like folders such as compressed files. Maybe you do want a common interface for both folders and files after all.

As a side note, I hope you’re picking up how I’m going about analyzing this problem. I’m thinking about structure and behavior and how things are similar and how they’re different. I’m thinking about common behavior that might be useful in other situations. All these things go into your software designs. It’s okay if you don’t think of everything. Software changes. Do you think that Windows always had the ability to navigate into compressed files as if they were a mini file system? This was added later. A little thought now will go a long way with your designs. Just don’t overthink a problem so much that you either never get started or you come up with a design that’s so complicated that nobody could ever hope to understand it.

I’ll explain how you can implement your own tree collection type that meets our file system needs right after this message from our sponsor.

( Message from Sponsor )

What you want is a collection type that resembles a tree but that allows you to place items in specific nodes of your own choosing. Once you have this, you can put your folder class instances in the tree or some other future class with similar needs.

What I’m going to describe here is a type of tree called a left-child right sibling tree and it works like this.

You’ll have your tree class that has a pointer to the root node. So far, it’s just like any other tree class. The real work happens in the tree node class. This is the class that’s designed just for this collection type and allows you to keep your class such as your folder class focused on what it does best. Let the left-child right-sibling tree node class handle the children and siblings.

The first thing is the node class has a parent pointer just like in the binary tree node class. It also has two other pointers and one of them is the left child pointer just like the binary tree. The similarity stops here though. This left child pointer is not based on some comparison. It just points to the first child. What about the second child, or the third? If you think about all of the children in a line, then they form a list. Some of them may have children of their own but let’s not worry about that just yet. Just focus on one set of children. We get to this set of children through the parent’s left-child pointer which points to the first child. Once we get to the first child, then that first child can point to the second child, and the second child can point to the third child. All of the children are siblings of each other. That’s the purpose of the right-sibling pointer. It just points to the next sibling to the right. If there’s only one child or this is the last child in the list, then the right-sibling pointer will be null.

Any of the children in the list can have children of their own and if so, the child uses its left-child pointer to identify the first of its children.

This forms a hierarchy where each node can have as many children as needed. All it needs to do is point to the first child and the children take it from there.

Each node points directly to its parent. this is needed because while nodes are aware of who their next sibling is, they have no idea of which node was the first child. This is just like a single linked list. If you want to go back, you have to start all over again. Having a pointer to the parent allows you to get back to the first child in the list if you need to.

What kind of performance will this collection give you? Since this collection is all about user defined structure and has no rules of its own about how to arrange nodes, then looking for a particular node doesn’t give us anything we can use to rule out other nodes. We need to start looking at the first child, checking if it has a left child and if so either remember this or go ahead and start searching this left child right away. We end up needing to look through the collection one item at a time so this will be a Big-O of N operation. In many ways it’s just like a big single linked list that’s been broken up into smaller single linked lists.

If you were to combine this collection with a binary tree, then you could have the binary tree store items by name or number or whatever you need that then point to the appropriate node in the left-child right-sibling tree. this would allow you to find items quickly in the binary tree and keep their user-defined arrangement in the left-child right-sibling tree. The tradeoff for this approach is that you need to maintain two collections now which will use more memory. But it will allow you to use each collection for what it’s best used for.