113: Data Types: Arrays.

You’ll often need multiple variables of the same type and while you can sometimes just create separate variables with their own names, what if you don’t know ahead of time how many will be needed?

Even if you do know how many variables are needed, sometimes there are just too many to give each one its own name. If you need a hundred bools, do you really want to work with them individually as doorOpen01, doorOpen02, etc.? No, because this makes your code brittle and completely tied to that specific scenario. How many buildings do you know that have exactly 100 doors?

You want your code to be able to work with multiple doors and know if they’re open or closed. But you don’t want to tie it to any specific number. By making your code independent, it can then work on buildings with no doors at all up to giant palaces with thousands of doors.

Arrays let you work with multiple values of the same data type. You can still access each element individually. You just don’t need to give each one its own name that the compiler knows about and must track.

You can also declare multidimensional arrays that have 2 or more indexes.

How many items are in the array or can be in the array is very important. But it’s not always part of the type. There are some programming languages like C++ where this is part of the type. So an array of 3 ints is a completely different data type than an array of 4 ints. But even C++ can allow some unwanted flexibility here. Listen to the full episode or you can also read the full transcript below.

Transcript

Arrays let you work with multiple values of the same data type. You can still access each element individually. You just don’t need to give each one its own name that the compiler knows about and must track.

There’s a lot you can do with arrays once you understand how to use them. If you need to read some data from a file into memory, then an array can be very useful. Episode 39 already discusses arrays as a type of collection. Refer to that episode for more information. This episode will fill in some gaps not covered by episode 39. Listening to this episode alone won’t be enough for you to understand arrays. But I also don’t want to repeat concepts that are already explained very well in other episodes.

This episode focuses on the array as a data type. When you declare an array, you need to specify what type the array will hold. This could be an array of bools. Or an array of long integers. Or an array of chars. An array of chars is special and will be the topic of the next episode. In some languages, this is all you need to have a string.

You can also declare multidimensional arrays that have 2 or more indexes.
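As a minimal sketch of a two dimensional array in C++, here is a small building with 2 floors and 3 doors per floor. The names kFloors, kDoorsPerFloor, doorOpen, and countOpenDoorsInBuilding are made up for this illustration.

```cpp
#include <cstddef>

// Hypothetical building: 2 floors with 3 doors on each floor.
const std::size_t kFloors = 2;
const std::size_t kDoorsPerFloor = 3;

// The first index selects the floor, the second selects the door.
const bool doorOpen[kFloors][kDoorsPerFloor] = {
    {true, false, true},   // floor 0
    {false, false, true},  // floor 1
};

// Visits every element using one loop per index.
int countOpenDoorsInBuilding()
{
    int open = 0;
    for (std::size_t floor = 0; floor < kFloors; ++floor)
    {
        for (std::size_t door = 0; door < kDoorsPerFloor; ++door)
        {
            if (doorOpen[floor][door])
            {
                ++open;
            }
        }
    }
    return open;
}
```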

How many items are in the array or can be in the array is very important. But it’s not always part of the type. There are some programming languages like C++ where this is part of the type. So an array of 3 ints is a completely different data type than an array of 4 ints. But even C++ can allow some unwanted flexibility here. One way to make sure that you always get a type with a specific size is to create a class to wrap up the array.
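Here is a small sketch of both points in C++. The static_assert shows that the size is part of the type. The function shows one likely example of the flexibility mentioned above: a parameter written as an array of 3 ints is treated as a plain pointer, so an array of any size can be passed. The name firstValue is hypothetical.

```cpp
#include <type_traits>

// The element count is part of a built-in array's type.
static_assert(!std::is_same<int[3], int[4]>::value,
              "int[3] and int[4] are different types");

// The 3 in the brackets is ignored here: this parameter is really a
// const int*, so the compiler accepts arrays of other sizes too.
int firstValue(const int values[3])
{
    return values[0];
}
```

Calling firstValue with an array of 4 ints compiles without any complaint, which is exactly the kind of mismatch you may want the compiler to catch.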

The C++ language has the ability to create templates that take a value as one of the template arguments. You could create your own class type based on a template that holds values in an array and then treat an instance of the class containing 3 ints as a different type than one containing 4 ints. This is not built directly into the language, but it is a supported use of the language.

Let’s say you wanted to create a class to manage multiple points. These could be 2 dimensional points on graph paper. Or 3 dimensional points in space. Or 4 or even more dimensional points in some physics simulation program. By making these separate and individual types, you can then declare methods that require a single parameter of that specific number of dimensions. This allows you to avoid passing arrays to methods because you’re just passing a single parameter and at the same time enlist the compiler to help make sure that nobody passes you two points when you expect three. Without this ability, you would either have to check at runtime if the dimensions match and if not, then return an error. Or you would have to pass an array which again normally requires you to check how many items are in the array.
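The idea could be sketched like this, assuming a hypothetical class template called Point that takes the number of dimensions as a template argument. The method sumOfCoordinates is also made up and only exists to show a parameter that requires exactly 3 dimensions.

```cpp
#include <cstddef>

// Hypothetical class template: N is a value passed as a template argument,
// so Point<2> and Point<3> are separate, unrelated types.
template <std::size_t N>
class Point
{
public:
    double& operator[](std::size_t index) { return coords[index]; }
    const double& operator[](std::size_t index) const { return coords[index]; }

    static constexpr std::size_t dimensions() { return N; }

private:
    double coords[N] = {};  // the wrapped array, zero-initialized
};

// Accepts only 3 dimensional points. Passing a Point<2> fails to compile,
// so no runtime check of the dimensions is needed.
double sumOfCoordinates(const Point<3>& point)
{
    return point[0] + point[1] + point[2];
}
```

The standard library's std::array is built on the same idea of making the size a template argument.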

There’s still more to learn about arrays as data types that I’ll explain right after this message from our sponsor.

( Message from Sponsor )

Most of the time, you’ll probably find the built-in support for arrays to be a bit limiting and difficult to use. You should have access to standard classes that come with your language libraries that give you all the benefits of an array but are easier and safer to use. This is especially beneficial when you’re working with a variable number of items.

In this case, your data type is no longer an array. You’re working with instances of these standard classes. Deep inside the standard classes may be arrays but that’s part of the implementation of the class.

I rarely have to work with raw arrays directly. They are still useful when I want to declare some data right in the source code. Since this is fixed data that won’t change, a simple array is a good choice.

When you have an array like this, you’ll usually want to declare the array and its values at the same time. The C++ language lets you do this by putting the values in curly braces. You can even leave off the size of the array and the compiler will figure it out for you. In other words, if you declare an array of bools called doorState and then initialize it with true, true, and false, then the compiler knows to create an array big enough to hold three bools.
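In code, the doorState example might look like this. The const is an assumption based on the data being fixed.

```cpp
// The size is left out of the brackets; the compiler counts the
// three initializers and creates an array of exactly three bools.
const bool doorState[] = {true, true, false};

// The deduced array occupies the space of three bools.
static_assert(sizeof(doorState) == 3 * sizeof(bool),
              "doorState holds three bools");
```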

You can even find out from within your code how many bools are in the array. Now, you might wonder why you would want to do this. You already know there are three bools, right? Well, you could access those bools as doorState at index zero, doorState at index one, and doorState at index 2, which, by the way, is already much better than having three separate bool variables called doorState01, doorState02, and doorState03. But instead of accessing the indexes by specific values, it’s better yet to create a loop and use a single integer index that you increment each time through the loop. The only problem is that you need to know how many bools are in the array.

What you could do is declare your loop to start at index zero and stop at index 2. But what happens when you realize that you only need two doors instead of three and then go into the code and remove one of the bools from the array data? If you don’t also change the loop to now stop at index 1, then it’s going to keep trying to loop through three doors. It will read past the end of the array, and your program could crash or misbehave.

You want to avoid situations like this where making a change requires changing two or more places in the code. There are just too many opportunities for this design to bite you. Instead, make your loop figure out for itself how many items are in the array. If you’re using one of the standard classes that wrap up arrays or provide some other collection type, then they should have a nice property or method called length, count, or size that you can use to find out how many items are present.
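As one example of a standard class, here is a sketch using std::vector from the C++ standard library, which reports its item count through size(). The function name countOpen is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// The loop asks the vector how many items it holds, so adding or
// removing doors never requires a change to this function.
int countOpen(const std::vector<bool>& doorState)
{
    int open = 0;
    for (std::size_t i = 0; i < doorState.size(); ++i)
    {
        if (doorState[i])
        {
            ++open;
        }
    }
    return open;
}
```

This works the same way for a building with no doors at all or one with thousands.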

But the built-in array doesn’t have this information. In this case, only the compiler knows. You just need to know the proper way to ask the compiler how many items are in the array. There is an operator in C++ called sizeof that will tell you how many bytes something occupies. This is close, but if we use sizeof on the doorState array, it’ll return three only because the array holds items that are each a single byte long. Sizeof actually returns the number of bytes for the whole thing. If the array held ints instead of bools, then sizeof would return 12 on a typical platform because there are three ints, each four bytes long. To get the number of items in an array, we need to divide the sizeof the array itself by the sizeof one of the elements in the array. It doesn’t matter which one since an array can only hold items of the same type. But usually, code divides by the sizeof the first item at index zero. For an array of 3 ints, this would mean 12 divided by 4, which gives you 3, the number of ints.
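Here is a minimal sketch of that division, using a made-up array of ints called doorWidths. The names doorWidthCount and totalDoorWidth are also hypothetical.

```cpp
#include <cstddef>

// Hypothetical fixed data: three door widths.
const int doorWidths[] = {30, 32, 36};

// Total bytes in the array divided by the bytes in one element.
// With 4 byte ints this is 12 / 4, but the result is 3 either way.
const std::size_t doorWidthCount = sizeof(doorWidths) / sizeof(doorWidths[0]);

// The loop bound follows the array data automatically, so removing
// a value from doorWidths needs no other change.
int totalDoorWidth()
{
    int total = 0;
    for (std::size_t i = 0; i < doorWidthCount; ++i)
    {
        total += doorWidths[i];
    }
    return total;
}
```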

The last thing I wanted to explain involves passing arrays as method parameters. If you have a method that needs an array, then pay attention to the specific rules for how your language does this. You might end up passing the array as a pointer to the first item. This is useful for dynamic arrays that you allocate at runtime since you don’t know how many items to expect in the array. In this case, you’ll likely also need to pass another parameter to specify how many items are in the array. You won’t be able to use the sizeof operator to determine the size of the array when all you have is a pointer to the first item. That’s because a pointer is not an array, and sizeof will just give you the size of the pointer itself.

You can use the name of an array when calling a method that takes a pointer or an unsized array because the name of an array is automatically converted to a pointer to the first item in the array.
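A sketch of this pattern in C++ could look like the following. The function name countOpenDoors and its parameter names are assumptions for illustration.

```cpp
#include <cstddef>

// Takes a pointer to the first bool plus a separate count.
// Inside this function, sizeof(doors) would be the size of a pointer,
// not the size of the caller's array, so the count must be passed in.
int countOpenDoors(const bool* doors, std::size_t count)
{
    int open = 0;
    for (std::size_t i = 0; i < count; ++i)
    {
        if (doors[i])
        {
            ++open;
        }
    }
    return open;
}
```

A caller that still has the real array can write countOpenDoors(doorState, sizeof(doorState) / sizeof(doorState[0])), passing the array by name and letting the compiler supply the count.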

What if you want to make sure that a method is always passed an array of 3 bools, no more and no less? Then you won’t be able to declare the method to take a pointer to bools or an unsized array of bools. Both of those will allow callers to pass your method arrays of different lengths. If you’re using C++, you can declare the method parameter to be a reference to an array of 3 bools or a pointer to an array of 3 bools. This is not common C++ code and the syntax can get a bit tricky. Make sure to check with your language because there could be seldom used capabilities like this in C++ or your language could be very different. For example, in C#, arrays have more information such as their size available to you and the rules are different. In C#, you’ll just have to check the length and throw an error if it’s wrong.
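For reference, the C++ syntax might look like this sketch. The function names allClosed and anyOpen are hypothetical. The parentheses around the parameter name are required. Without them, the declaration means something else.

```cpp
// Reference to an array of exactly 3 bools.
// Passing an array of 2 or 4 bools, or a plain pointer, will not compile.
bool allClosed(const bool (&doors)[3])
{
    return !doors[0] && !doors[1] && !doors[2];
}

// Pointer to an array of exactly 3 bools.
// The caller passes the address of the whole array, such as &doorState.
bool anyOpen(const bool (*doors)[3])
{
    return (*doors)[0] || (*doors)[1] || (*doors)[2];
}
```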
