Setting Up Arrays and Using Them
And Before We Get Started (Again) . . .
As mentioned in the previous topic, there are two ways to access and execute array operations. The first form you observed in the previous topic is more commonly used simply because it is easier. However, there are other actions you can take with pointers in the C language, and using them with arrays is an easy way to get started with understanding them. And, as you will see below, there are a couple of cool tricks that can be done with pointer-created arrays that you can't do with bracket-created arrays. Also, as previously mentioned, using both array tactics will make you a better programmer. Note that this page is going to look a lot like the previous page with the same examples. That is purposeful so you can compare the two tactics to each other as needed. But don't skip it; there are differences.
Using Pointer Arithmetic
The fundamental arrays in the C programming language are simple to declare and define. You define the name of the array as you have done with variables before, and then you tell the computer how many cells or locations you want the array to have, in this case using a technique called dynamic memory allocation. It is a little more complicated than the bracket form, but not by much.
int *integerArrayPtr = (int *)malloc( 10 * sizeof( int ) );
double *doubleArrayPtr = (double *)malloc( 10 * sizeof( double ) );
char *characterArrayPtr = (char *)malloc( 50 * sizeof( char ) );
Before discussing how this works, remember that literal numbers should not be used in a program, not even to define arrays. The above should really look like the following.
// in global constant area of class
const int NUM_INTEGERS = 10;
const int NUM_DOUBLES = 10;
const int NUM_CHARACTERS = 50;
// declared at the beginning of a function
int *integerArrayPtr
= (int *)malloc( NUM_INTEGERS * sizeof( int ) );
double *doubleArrayPtr
= (double *)malloc( NUM_DOUBLES * sizeof( double ) );
char *characterArrayPtr
= (char *)malloc( NUM_CHARACTERS * sizeof( char ) );
Note that C++ has a slightly different way to do this, shown here:
// in global constant area of class
const int NUM_INTEGERS = 10;
const int NUM_DOUBLES = 10;
const int NUM_CHARACTERS = 50;
// declared at the beginning of a function
int *integerArrayPtr = new int[ NUM_INTEGERS ];
double *doubleArrayPtr = new double[ NUM_DOUBLES ];
char *characterArrayPtr = new char[ NUM_CHARACTERS ];
The first thing to note is that the above is known as dynamic memory allocation. This means that the programmer is requesting memory that has not already been set aside for this particular function. There is more to discuss about this later in this topic, but for now, just know that you are asking the system to provide you with some memory you can use.
Now you can look over all that is happening here. Note that the exact same thing happens when you use the bracket operations but it is done outside of the programmer's view, and there is one other difference which will be discussed later. The following will identify and explain each part of the dynamic memory allocation process.
- To start with, a pointer variable is declared (int *integerArrayPtr). There are two uses for the asterisk ( * ) character. The first use is to tell the compiler that the identifier that follows the asterisk is to be declared as a pointer. Remember from a previous topic that a pointer can simply point to data in memory.
- Next up, the malloc ("memory allocation") function actually makes the request to the operating system for the memory that is needed. However, the malloc function doesn't know what kind of data will be stored, so it returns a generic pointer (void *). The cast (int *) converts that result to the appropriate pointer type. The cast is required in C++; in C the conversion happens automatically, but the cast is commonly written to make the intent visible.
- Finally, the malloc function needs to know exactly how much memory to allocate. This is implemented by showing the number of elements needed, and the size of each element, which depends on the type of data being used. The parameter that is passed to the malloc function is NUM_INTEGERS * sizeof( int ), which gives the total number of bytes requested.
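The three parts above can be put together in one small function. This is only a sketch: the name createIntArray is hypothetical, and the checks are an addition not discussed above. They rely on the fact that malloc returns NULL when the request cannot be satisfied.

```c
#include <stdlib.h>

// hypothetical helper, not part of the course library
// requests memory for the given number of integers
int *createIntArray( int capacity )
{
    int *arrayPtr;

    // a request for zero or fewer elements makes no sense here
    if( capacity <= 0 )
    {
        return NULL;
    }

    // declare the pointer, cast the result,
    // and ask for capacity * sizeof( int ) bytes
    arrayPtr = (int *)malloc( capacity * sizeof( int ) );

    // arrayPtr is NULL if the memory could not be provided;
    // the caller should test for that before using the array
    return arrayPtr;
}
```

A caller would write something like int *integerArrayPtr = createIntArray( NUM_INTEGERS ); and compare the result to NULL before storing anything through it.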
One thing to notice about the above actions is that instead of using the integerArray identifier, the integerArrayPtr (i.e., the extension Ptr is added to the end of the identifier) is used. While it is not an absolute law, it is a good practice to distinguish between array identifiers and pointer identifiers even if they do the same thing. As you will see later, pointers do have a couple extra tricks up their sleeves so it is a good idea to make clear which is which in your programs.
Again, the result of your above definitions is the same as it was using the bracket operations. In the case of the first example, an array named integerArrayPtr is created. Then, just like when you defined variables, a data storage "box", called an element (much like the "bucket" metaphor when you first learned about variables), is created. However, since you have indicated that you want ten of these integer data types, nine more "boxes" are created after the first one. If you define your array with ten in the malloc parameter, you will have ten boxes available in which to store integer data types.
The array of "boxes" that you have created can be represented as shown below.
The array is shown with "stuff" in it when it is first created. The technical terminology for the stuff found in variables and arrays when they are first created is garbage, a term you have seen before. Some languages, like C, just allow garbage to be left in arrays when they are created, but others may zero them out if they are numerical, or place other standard values in each element of some arrays. That said, you should never trust a language to conduct this action. First, you want to get in the habit of conducting your own initialization so you can move over to other languages, and second, sometimes "automatic" things in programming languages just don't work.
This is a good place to note the difference between the terms size or length, and the term capacity. An array may be declared with a capacity of 100 integers or doubles, etc., but until you actually place appropriate or valid data in the array, the size or length is zero. Once you do place data into the array, the size or length is considered to be the number of valid items with which you are working, not the capacity of the array. You must be careful to maintain the distinction between these two quantities. It is not common to be using all of the elements in an array, so your array size or length will almost always be less than your array's capacity. As a note, the word size is commonly used to mean capacity in some other languages, and commonly on the Internet. This is unfortunate since it blurs the difference but there are some things you have to learn to live with.
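One way to keep size and capacity separate is to store them in two different variables and update the size only when valid data is added. The following is a minimal sketch under that assumption. The function name appendValue and its parameters are hypothetical, and the size is passed by address so the function can update it.

```c
#include <stdbool.h>

// hypothetical helper: adds a value after the last valid item
// returns false if the array is already filled to capacity
bool appendValue( int *arrayPtr, int *size, int capacity, int value )
{
    // refuse to write past the capacity of the array
    if( *size >= capacity )
    {
        return false;
    }

    // store at the first unused element, then count it as valid
    *( arrayPtr + *size ) = value;
    *size = *size + 1;

    return true;
}
```

Here capacity never changes after the array is created, while size grows from zero as values are stored.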
The following is an initialized array. For the moment, assume that the programmer took some action to implement the initialization, so your array will now look like the example shown below where all the values have been set to zero.
Always remember that arrays are not guaranteed to initialize themselves. For the safety of your programs, assume until otherwise told that your arrays start with garbage in them.
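A simple way to conduct your own initialization is to loop across every element and assign a starting value. The sketch below uses pointer arithmetic to do this. The name clearIntArray is hypothetical and is used only for illustration.

```c
// hypothetical helper: sets every element of the array to zero
void clearIntArray( int *arrayPtr, int capacity )
{
    int index;

    // visit each element by its offset from the first element
    for( index = 0; index < capacity; index++ )
    {
        *( arrayPtr + index ) = 0;
    }
}
```

The C standard library also provides calloc, which returns memory already set to zero, but writing the loop yourself keeps the habit of explicit initialization.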
Whether your array has been initialized or not, once it is defined, you can store data in it. To implement this process, you need two things identified above. First, you need a location in which to place your data. The numerical value of an array location or element is called an index.
Secondly, you need the data itself. Since you have declared this particular array as an integer array, you are only allowed to store integers in it. In order to do this, you indicate the following to the computer, as shown here.
*( integerArrayPtr + 5 ) = 49;
As you recall, this same operation was integerArray[ 5 ] = 49; back in bracket land, and the result is identical. The action has placed the value 49 in the sixth element of the array. As mentioned above, one of the uses for the asterisk ( * ) operator was to indicate when you are declaring a pointer variable. What you see here is the second use of the same operator, but in this case, it is saying "The value at:". Formally, this is called dereferencing, but if you just think of it as saying "The value at:" it will help you understand the action that is occurring.
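If you want to convince yourself that the two forms reach the same element, a small check like the following can help. It is a sketch only, and the function name sameElement is hypothetical. It stores a value with pointer arithmetic, reads it back with brackets, and also compares the two addresses.

```c
#include <stdbool.h>

// hypothetical check: writes with pointer arithmetic,
// reads with brackets, and compares the results
bool sameElement( int *arrayPtr, int index, int value )
{
    // "the value at" the address index elements past the start
    *( arrayPtr + index ) = value;

    // the bracket form must see the same value at the same address
    return arrayPtr[ index ] == value
           && &arrayPtr[ index ] == arrayPtr + index;
}
```

Calling sameElement( integerArrayPtr, 5, 49 ) would store 49 at offset five and report whether both forms agree.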
Notice that since you placed it in the element labeled five, which is the offset from the first element, you have actually placed it in the sixth location. Your array of "boxes" would now look like the following.
To get the value back, you basically invert the process as you did with bracket operations. For example if you had defined a double value called rootValue, you could implement the following process.
double rootValue;
// get square root of array element
// function: sqrt
rootValue = sqrt( *( integerArrayPtr + 5 ) );
The library math function called sqrt is again used here. rootValue would be assigned the value seven because it is the square root of the value 49. Each element of the array acts just like an int, or integer variable, all its own. It is just easier to handle quantities of values by providing one name, and then using index locations to get the values back.
Remember that you can make an array of any data type, and hold values of that data type, as needed. The following provides some examples. The constant MAX_VALUES is used to define the number of elements. This is an abstraction because you may not always know what MAX_VALUES is. There is no harm in this because you can work with whatever number of elements are declared. Remember that as you study lists or groups of data, you may study lists of 10, 25, or 1,000 (or more) values.
However, to be a programmer, you must be able to manage a list of any number of items. For this reason, many times the number of items will be abstracted to N or n, as mentioned previously in this reference. Since N/n can represent any number of items, you can learn how to manage lists of any size. We commonly use N as the abstraction for "some number of values", but we always use self-documenting identifiers such as MAX_VALUES to declare our array capacities.
The following is a simple example of using an array of doubles.
// Example of array of doubles
double *dArrayPtr = (double *)malloc( MAX_VALUES * sizeof( double ) );
*( dArrayPtr + 4 ) = 4.55;
*( dArrayPtr + 6 ) = 3.77;
*( dArrayPtr + 8 ) = *( dArrayPtr + 4 ) * *( dArrayPtr + 6 );
// the ninth element in dArray now holds 17.1535
The following is a simple example of using an array of Booleans.
// Example of Boolean value array
bool *bArrayPtr = (bool *)malloc( MAX_VALUES * sizeof( bool ) );
*bArrayPtr = true; // no need for offset
*( bArrayPtr + 1 ) = true;
*( bArrayPtr + 2 ) = false;
if( *( bArrayPtr + 1 ) )
{
// output message
// function: printString, printEndline
printString( "The second element of bArray is true" );
printEndline();
}
The following is a simple example of using a character array.
// Example of character array
char *cArrayPtr = (char *)malloc( MAX_VALUES * sizeof( char ) );
*cArrayPtr = 'a';
*( cArrayPtr + 1 ) = 'b';
*( cArrayPtr + 2 ) = 'c';
*( cArrayPtr + 3 ) = 'd';
*( cArrayPtr + 4 ) = 'e';
*( cArrayPtr + 5 ) = 'f';
// output lunch message
// function: printString, printCharacter, printEndline
printString( "I think I will eat at the " );
printString( " " );
printCharacter( *( cArrayPtr + 2 ) );
printEndline();
printString( " " );
printCharacter( *cArrayPtr );
printEndline();
printString( " " );
printCharacter( *( cArrayPtr + 5 ) );
printEndline();
printString( " " );
printCharacter( *( cArrayPtr + 4 ) );
printEndline();
The result of the above character program segment is shown below.
I think I will eat at the c
a
f
e
Now that you have seen how to declare, assign to, and retrieve from an array using pointers, you have one condition left to consider with this tool. As you know, array elements can be passed to functions exactly the same way as other variables; passing whole arrays is even easier but will require a little bit of discussion.
Start with passing array elements to a function. Suppose you have a function that adds two numbers together and returns their sum, called addEm, as shown here.
int addEm( int valOne, int valTwo ) // header
Only the prototype is provided so you can see that it is just like any other function with normal parameters. So, when you need to add two values in an array, it works exactly as you might expect. Consider the following.
// usage
*( numbers + 5 ) = 10;
*( numbers + 15 ) = 20;
sum_1_2 = addEm( *( numbers + 5 ), *( numbers + 15 ) );
// sum_1_2 will now hold the value 30
Now an array can be passed to a function either with brackets (int intArray[]) or as a pointer (int *intArrayPtr). Remember that as function parameters these are exactly the same thing, so in most cases the compiler does not care. Consider the following function that adds all the values within the array and returns their sum.
int addEmAllUp( const int *intArray, int sizeOfArray ) // header
And now consider how this is used.
// call function with array and its size
// function: addEmAllUp
overallSum = addEmAllUp( ages, size );
Whoa! This looks exactly the same as the previous topic when passing an array into a function. That's because it is the same. With or without brackets, an array name passed to a function is treated as a pointer to the first element, so it is passed exactly the same way.
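Only the header of addEmAllUp is given above. One possible body, written with pointer arithmetic, is sketched here. The loop and the local variable names are assumptions, not the original implementation.

```c
// one possible body for the header shown above;
// const promises that the function will not change the array
int addEmAllUp( const int *intArray, int sizeOfArray )
{
    int index;
    int sum = 0;

    // add "the value at" each offset from the start of the array
    for( index = 0; index < sizeOfArray; index++ )
    {
        sum = sum + *( intArray + index );
    }

    return sum;
}
```

The same function accepts an array created with brackets or one created with malloc, since both arrive as the address of the first element.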
But wait, there's more!
And it is an important difference. As was mentioned in the previous topic, bracket-created arrays are local in scope. That means that just like any other variable, they are only available in the function where they were created. Pointer-created arrays are a little different. When malloc is used to create dynamic memory (i.e., the array), the memory remains allocated until you put it away. The allocated memory even remains if the pointer being used is somehow pointed at something else and no longer at the array. If this happens, the memory is still there but there is no way to access it anymore so it is effectively lost. Since it remains allocated but unavailable, this is called a memory leak, and as you might expect, it is a bad thing.
That said, it isn't really a big deal to handle this situation. Here is the code that will put the allocated memory back where it belongs (like your Mom always told you to do).
// assume memory was allocated to the dArrayPtr in C
// deallocate memory
// function: free
free( dArrayPtr );
Or in C++,
// assume memory was allocated to the dArrayPtr as an array in C++
// deallocate memory
// operator: delete []
delete [] dArrayPtr;
Note that if the memory was allocated as an array with new, as in the C++ declarations earlier on this page, then it must be deallocated as an array using the two brackets ([]).
That is all there is to it. When the free function or delete operator is called, using the pointer associated with the dynamically allocated memory, the operating system happily takes the memory back and no one is the wiser. To be clear, programmers must take this action because memory leaks can bring down a program . . . or a system. And that's a really bad thing.
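The memory leak described earlier happens when a pointer is aimed at something new before the old memory is released. The sketch below shows the safe order of operations in C. The function name replaceIntArray is hypothetical and is used only for illustration.

```c
#include <stdlib.h>

// hypothetical helper: trades an old array for a new one
// without leaking the old memory
int *replaceIntArray( int *oldArrayPtr, int newCapacity )
{
    // give the old memory back first; after this call
    // the old address must not be used again
    // (calling free with NULL is allowed and does nothing)
    free( oldArrayPtr );

    // only now is it safe to point at newly allocated memory;
    // the new elements hold garbage until they are initialized
    return (int *)malloc( newCapacity * sizeof( int ) );
}
```

A caller would write arrPtr = replaceIntArray( arrPtr, 20 ); so that the only remaining pointer refers to the new memory.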
If you are thinking about this difference, you are probably wondering if you could use the array somewhere other than in the function in which you created it. You are right to ask this question, because you can use the array outside the function in which it was created. However, a clear distinction needs to be made here.
As you have read here previously, you can create an array, pass it into a function as a parameter, and if the array parameter was changed inside the function, the data in the array will also have been changed. Consider the following.
void loadEmUp( int array[], int size )
{
// initialize function/variables
int index;
// for number of items (size)
for( index = 0; index < size; index++ )
{
// assign value to array
array[ index ] = index + 1;
}
}
This function is called as follows:
int *intArr = (int *)malloc( numVals * sizeof( int ) );
loadEmUp( intArr, numVals );
// next code operations
When the above function has completed, the array inside the function would have 1 2 3 4 5 in it if size were 5. However, the array in the calling function (i.e., outside the function loadEmUp) would also have 1 2 3 4 5. This works because:
- the array intArr was created outside the called function (i.e., loadEmUp); and
- the function was called with intArr as an argument; intArr is an array name, and therefore a pointer, and therefore an address in memory where the data is stored; and
- when passing arrays as parameters without the keyword const, the array can be modified inside the function because it is changed via the array name/memory address that was passed into the function
But what would you do if the array was created inside some other function and not outside, prior to calling the function? The answer is here:
int *loadEmUp( int size )
{
// initialize function, variables
int *localArrPtr = (int *)malloc( size * sizeof( int ) );
int index;
// loop across given number of items
for( index = 0; index < size; index++ )
{
// assign a value to the array
localArrPtr[ index ] = index + 1;
}
// return array pointer
return localArrPtr;
}
Note that the array is created inside the function. If this were done using brackets, the array would be local to the function and not usable once the function terminated. However, with the array created as dynamic memory, it will continue to be available -- that's called persisting -- even after the function has done its job. For that reason, the pointer to that array can be returned to the calling function. Again remember that all that would be returned from the function would be the address of that location in memory so there is no overhead in moving all the array data from one function to another. You can see this since the return quantity of the function is int *.
And it would be called as follows:
// initialize function/variables
int size = 10;
int *arrPtr;
// some code prior to the function call
// load array with values
arrPtr = loadEmUp( size );
// some code following the function call
// the array pointed at by arrPtr can now be used
// anywhere else in the program
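Because the array persists after loadEmUp returns, the calling code becomes responsible for freeing it. The following sketch shows one complete round trip. loadEmUp is repeated from above so the example stands on its own, the check for NULL is an addition, and the caller sumOfLoaded is a hypothetical name.

```c
#include <stdlib.h>

// repeated from above, with a check for a failed allocation
int *loadEmUp( int size )
{
    int *localArrPtr = (int *)malloc( size * sizeof( int ) );
    int index;

    if( localArrPtr == NULL )
    {
        return NULL;
    }

    for( index = 0; index < size; index++ )
    {
        localArrPtr[ index ] = index + 1;
    }

    return localArrPtr;
}

// hypothetical caller: uses the returned array, then frees it
int sumOfLoaded( int size )
{
    int *arrPtr = loadEmUp( size );
    int index;
    int sum = 0;

    // nothing to add up if the memory was not provided
    if( arrPtr == NULL )
    {
        return 0;
    }

    for( index = 0; index < size; index++ )
    {
        sum = sum + arrPtr[ index ];
    }

    // the caller owns the memory, so the caller puts it away
    free( arrPtr );

    return sum;
}
```

With a size of 10, the array holds 1 through 10 and the function returns 55.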
Wrapping Up Part 2, Arrays using pointers
So, there are differences between the two ways of creating and using arrays. The actual use can be exactly the same if you want it to be. In fact, from here to the end of this chapter, brackets will be primarily used for accessing and manipulating data just to keep things simple. That said, you need to be completely familiar with how to handle data management with pointers. These will be used later on with an even cooler data structure called a linked list, and you don't want to miss out on that.