One-Dimensional Array Concepts
Stepping Up Your Game
You have now completed the core learning necessary to program computers. If you look back at the previous topics, you will see that, in one way or another, you have learned to implement all of the fundamental concepts of Computer Science. This chapter is therefore less dedicated to learning how to program, and more dedicated to applying your programming and Computer Science skills to accepting, processing, and otherwise managing quantities of data.
With this chapter, you will be taking the first steps into learning about data and its manipulation and management as the goal of your programming processes. Being able to store a set of names and addresses, or student names and grades, or driver's license numbers is what makes the world go around. As you work your way through this chapter, take your time to gather as much as you can. As has been the case with previous chapters, when you continue on with larger and more complicated data structures, they will all be built on what you learn here.
Managing Data in Bulk
Up to this point in your programming, you have been defining and using individual variables to store and manipulate data. If you had to find the average of three test scores, you would create three variables to hold the test scores, one to hold the number of test scores, one to hold the sum of the scores, and one to hold the average of the scores, as shown below.
double scoreOne, scoreTwo, scoreThree;
double sum, average;
int numScores;
This setup works well enough for a problem of this size, but what happens if you need twenty or thirty test scores, or the scores of twenty or thirty students? Typing scoreOne, ..., scoreFifteen, ..., scoreThirty is going to be tedious and bulky, and if you have to do the same thing for each item, there is a likelihood, actually almost a guarantee, that you will make a mistake or five as you type in all the processing and manipulation of the data.
The data (i.e., the score variables) given above is said to be homogeneous, or all of the same type. In other words, you have a need to store a whole list of test scores that will all have the same kind of information stored in the same kind of data type. In this case, you have a need to store a series of test scores that may have fractional components, so you will want a packaged group of doubles.
If you think of arrays as organized groups of things, you can get an idea of what arrays mean in programming. In large communications facilities, such as airport control towers, weather facilities, and satellite tracking stations, it is very common to have antenna arrays. An antenna array is simply an organized group of antennas. Each antenna will have its own location, and each antenna will have its own characteristics, such as the frequency at which it transmits or receives signals. It would be a normal day's operation for a supervisor to give the following instructions to her employee: "Switch to the antenna at location seven; that one is a low frequency antenna".
Another, less technological example would be the storage shelves in a shoe store. A supervisor's instruction might be: "Go to shelf five to find the size eight and a half shoes". As mentioned above, arrays are organized groups of data. You must know the location of the data or thing if you want to get access to it, and you must know what kind of information you are looking for. If you look around, you will see that arrays are used all the time in everyday life.
Pointing at the Data
To understand the fundamentals of arrays, you need to look at the fundamentals of data storage (i.e., memory) again. Think of the following table as the memory of your computer. As mentioned previously, this is not exactly the way data is stored in a computer, but it is a reasonable representation for purposes of the present discussion. As you will notice, this representation takes the form of a simple spreadsheet, as shown here.
|   | A | B | C | D | E | F | G | H |
|---|---|---|---|---|---|---|---|---|
| 1 | --- | --- | --- | --- | --- | --- | --- | --- |
| 2 | --- | --- | --- | --- | --- | --- | --- | --- |
| 3 | --- | --- | --- | --- | --- | --- | --- | --- |
| 4 | --- | --- | --- | --- | --- | --- | --- | --- |
Now, if you think of the spreadsheet (i.e., the computer memory) with data in it, it might look like the following.
|   | A | B | C | D | E | F | G | H |
|---|---|---|---|---|---|---|---|---|
| 1 | (name) | (age) | (pay) | --- | --- | --- | --- | --- |
| 2 | --- | --- | --- | --- | --- | --- | --- | --- |
| 3 | --- | --- | --- | --- | --- | --- | --- | --- |
| 4 | --- | --- | --- | --- | --- | --- | --- | --- |
Remember that the name of the variable is not actually stored at the data location. Rather, between the compiled code and the operating system, the name of the variable (i.e., the identifier) is associated with the memory location. For this representation, the names will help you understand the memory usage.
Next, if you wanted to create a bunch of data such as a series of ages, you would have to declare separate memory locations for each one, as shown here.
|   | A | B | C | D | E | F | G | H |
|---|---|---|---|---|---|---|---|---|
| 1 | (name) | (age) | (pay) | (age1) | (age2) | (age3) | (age4) | (age5) |
| 2 | --- | --- | --- | --- | --- | --- | --- | --- |
| 3 | --- | --- | --- | --- | --- | --- | --- | --- |
| 4 | --- | --- | --- | --- | --- | --- | --- | --- |
As mentioned previously, creating ages for a whole class of students is really not viable, especially if you have 200 or 300 students in your class, and you need to store the ages of every student.
The solution is an array, but the question is how do you make all the memory locations without declaring all the variable names? And the answer to that is easy. You don’t. You create one name for the list of data values, such as ages in this case, and you tell the computer which one of the data values you want by giving it an offset from the first location.
Then the next obvious question is how does one find the first location? It turns out that the answer to that question is easy as well. You point at it. The final question might be, how do you point at it, and the answer to that one is you use what in the C programming language is called a pointer.
A pointer is an identifier you actually declare to, well, point at some location in memory; the pointer is a variable that stores the address of the memory location to which it points. Consider the following modification to the data storage shown previously.
|   | A | B | C | D | E | F | G | H |
|---|---|---|---|---|---|---|---|---|
| 1 | (name) | (age) | (pay) | (ages) | --- | --- | --- | --- |
| 2 | --- | --- | --- | --- | --- | --- | --- | --- |
| 3 | --- | --- | 25 | 15 | 67 | 52 | 31 | --- |
| 4 | --- | --- | --- | --- | --- | --- | --- | --- |
Individual identifiers are no longer associated with each data value but, as should be obvious, each value still requires its own memory location. Notice also that a single identifier, ages, has been used for the whole data set. The identifier ages is a pointer, and as you can see in the table, rather than holding any actual data, it holds the "address", C3, of the first element. An element is exactly one data value within a given array, and the array holds a specified number of elements.
From there, if you want to access, for example, the third element in the array, you just tell the computer. There are two ways to do this; both are shown here and will be discussed in later topics.
// go to the first element of the array, identified by the pointer,
// move over two elements from there to get to the third value,
// and in this case, assign it to some other value
otherValue = ages[ 2 ];
OR
otherValue = *( ages + 2 );
That is really all there is to it. You can now declare an array that can hold a large number of specific values as if they were individual variables, you can store them at those memory locations, and you can retrieve them from those memory locations. Here is a very simple example of using array elements.
// Using the bracket operator
// enter values to an array
payRates[ 0 ] = 10.75; // first element in array, distance 0 from the pointer
payRates[ 1 ] = 11.50; // second element in array, distance 1 from the pointer
payRates[ 2 ] = 15.35; // third element in array, distance 2 from the pointer
// calculate average pay rate
averagePayRate = ( payRates[ 0 ] + payRates[ 1 ] + payRates[ 2 ] ) / 3;
OR
// Using pointer arithmetic
// enter values to an array
*payRates = 10.75; // first element in array, distance 0 from the pointer
*( payRates + 1 ) = 11.50; // second element in array, distance 1 from the pointer
*( payRates + 2 ) = 15.35; // third element in array, distance 2 from the pointer
// calculate average pay rate
averagePayRate = ( *payRates + *( payRates + 1 ) + *( payRates + 2 ) ) / 3;
C arrays are called zero-based arrays, because the first element is number 0. Other programming languages support arrays that start with 1, so the first element is number 1, the second is number 2, and so on. Either of these strategies can be used, but zero-based numbering makes more sense once you understand the real process of pointing to an element in the array, and then using an offset from that pointer to get to any of the other locations.
Once again, if you want to get the first value from the ages array, you go to the location of the pointer (i.e., C3), and then you go exactly zero spaces over from that location to get the age (i.e., you don't move from the pointer location); the code representation would be ages[ 0 ] or just *ages (more about that asterisk later). If you want to get the third value from the ages array, you go to the location of the pointer (i.e., C3), then move over two spaces, using the code representation ages[ 2 ] (or *( ages + 2 )), and so on.
Much more will be provided relating to the different ways to declare and use arrays, and how to pass array quantities to functions, and back. However, before you get into all the syntax and operations, make sure you understand these fundamental concepts of arrays.