Offering Some Choices
Pointer Return vs Pass by Reference
In the previous sections, various ways of adding and removing nodes have been provided. Some code was provided but with linked lists there really isn't that much to do for any of the operations, so some code was left for students to figure out from the pictures. And as stated in the previous sections, that is how programmers really come to learn linked lists. Pictures to Code, and Code to Pictures.
Whenever a linked list is updated, there is always a chance that the linked list head pointer will be modified. This section will provide a little support for calling add, remove, and other functions, and it will also discuss how to get the updated head pointer back from a function if or when it has been modified.
Remember that no matter what other actions are taken with linked lists, if the head pointer is lost, the linked list is lost. Programmers must continuously strive not to experience this inconvenience.
Returning the Head Pointer
This part is not difficult. As mentioned, when a node is added or removed, or a linked list is modified in any other way, there is a fair chance that the head pointer will be modified or updated. The calling function, and program in general, must always have access to the head pointer after any change, so this is critical.
However, many functions manage this by simply returning the head pointer. Consider the following examples.
// add function, called
headPtr = addNode( headPtr, newData );
// remove function, called
headPtr = removeNode( headPtr, removedData );
Both functions accept the head pointer as a parameter along with the data that is to be added or removed. If a linked list is copied, the pointer to the new list would also be returned. Again, pretty much any modification of a linked list may potentially change the head pointer, and this change must be provided back to the calling function.
Thus, for any of these actions, it is assumed that the head pointer is fully up to date when it is passed in as a parameter, it will be used inside the function to conduct the activities, and the possibly updated head pointer will be returned to the calling function, as shown in this example.
NodeType *addNode( NodeType *headPtr, int newData )
{
// actions for adding the node, using,
// and possibly modifying the head pointer
// head pointer is returned at the end
return headPtr;
}
Nothing tricky here. Just make sure the special cases, like an empty list, or a first or second node modification, or any others, are treated by the code, and then provide the updated head pointer.
Returning the Head Pointer with a Double Pointer
(aka PassByReference)
This part is a little more tricky but it's not that difficult as long as you have your trusty scratch paper nearby to draw out the pointer actions. Consider the situation where a programmer wants to search for a particular data item in a linked list and remove it. And consider the possibility that the item might, or might not, be in the list.
The programmer wants to write the function in such a way that it returns a Boolean result of the removal. In other words, if the data is found and removed, the function would return true; if the data is not found, the function would return false. This sounds like a really good idea but (again) there is a reasonable liklihood that the linked list head pointer would be modified as part of the remove node action.
So how does the programmer return both the Boolean AND the potentially updated head pointer? And remembering that all C functions implement pass by copy, how does one get a potentially modified pointer back as a parameter?
The first answer is to use a double pointer. This is exactly what it sounds like; a pointer to a pointer.
However, before jumping into the code for this, consider what a double pointer does, and how it would work as a parameter. Here is an example of what the header of this function would look like.
bool removeNode( NodeType **headPtrAddress, int dataToBeRemoved );
Next, here is what the pointers would look like in memory.
A |
B |
C |
D |
|
1 |
--- |
headPtrAddress |
--- |
--- |
2 |
--- |
--- |
headPtr |
--- |
3 |
First LL Node |
--- |
--- |
--- |
4 |
--- |
--- |
--- |
--- |
That's not so bad. As stated, a double pointer is a pointer that points to another pointer which then points to some kind of data, which in this case is the first node in the linked list. That said, students are unlikely to get comfortable with this concept unless they draw this picture and trace the action themselves. Be advised.
The (double pointer) headPtrAddress is passed into the function and for purposes of this discussion, is stored in memory location B1. This pointer points to another location in memory (C2) which is the actual headPtr, and which is a single pointer. This pointer will point to the location in memory where the first linked list node is at A3. In a nutshell, headPtrAddress is a pointer to a pointer to a node.
So now what happens in the function? The first answer will be to simply show the function using comments. The following is not quite complete yet but will show the actions that must occur.
bool removeNode( NodeType **headPtrAddress, int dataToBeRemoved )
{
// use the single pointer as the local head pointer
// this is found by dereferencing the double pointer
// search for the item to be removed
// if it is found
// conduct the removal operations
// return true
// otherwise, assume item not found, return false
}
And what is happening here? How does this work? Take another look at the memory, provided again here for reference.
A |
B |
C |
D |
|
1 |
--- |
headPtrAddress |
--- |
--- |
2 |
--- |
--- |
headPtr |
--- |
3 |
First LL Node |
--- |
--- |
--- |
4 |
--- |
--- |
--- |
--- |
The single pointer (at C2) does everything it needs to as the head pointer. This includes the possibility that it gets changed. For example, if the first node is removed, the head pointer would now point to the second node in the linked list. This might be at D4 and the memory map would now look like the following.
A |
B |
C |
D |
|
1 |
--- |
headPtrAddress |
--- |
--- |
2 |
--- |
--- |
headPtr |
--- |
3 |
--- |
--- |
--- |
--- |
4 |
--- |
--- |
--- |
First LL Node |
So far, this doesn't look like a problem. But remember that the C language only allows pass by copy and the head pointer address must remain unchanged so it will still be correct after the function completes.
No problem. The original double pointer is NOT changed. It is still at the same location in memory (B1) and more importantly, it is still pointing at the same location where the actual head pointer is (C2). It has not been modified in any way.
Now the head pointer at C2 has been changed but this value/pointer is not the parameter. No harm, no foul. When the double pointer is passed back to the calling function as a parameter, it will hold the up to date head pointer location.
Pretty Cool, if not a little (and seriously, just a little) complicated.
The second answer needs to go a little deeper to make the syntax workable.
To be able to get to the double pointer's single pointer, the double pointer must be dereferenced), meaning the data "pointed at" by the double pointer, which happens to be the head pointer, is determined. The dereference operation looks like the following.
*headPtrAddress
This operation is literally saying "the data pointed at by the headPtrAddress" and the data being pointed at happens to be the actual head pointer. In this case, headPtrAddress is a pointer to, or holds the address of, headPtr, which is C2; and the dereferenced *headPtrAddress holds the data at the location (C2) it is pointed at, which is D4 in the most recent table/graphic.
The problem with dereferencing the double pointer is that the programmer will have to use this syntax everywhere. For example to identify the first node's data, the programmer will be required to use the dereferenced double pointer as follows.
(*headPtrAddress)->data = someVal;
As noted, the parentheses will be required everywhere the double pointer is used as a single pointer due to operator precedence rules related to both pointer and dereference operators. This does work just fine, but it can get a little messy. Here is the smarter way to implement this.
bool removeNode( NodeType **headPtrAddress, int dataToBeRemoved )
{
// initialize the local head pointer
NodeType *localHeadPtr;
// use the single pointer as the local head pointer
// this is found by dereferencing the double pointer
localHeadPtr = *headPtrAddress;
// search for the item to be removed
// if it is found
// conduct the removal operations using the local head pointer
// once completed
// reassign the double head pointer to the local head pointer
*headPtrAddress = localHeadPtr;
// return true
// otherwise, assume item not found, return false
}
Once again, pretty straightforward. While there is complexity, it is not too messy to think through.
How To Call It
While all of the above is nice to know, the programmer still needs to know how to call the removeNode function. And this just means a little further "thinking through" action. Remember that by using a double pointer, the first pointer is pointing at, or providing the address of the second pointer. That is how to think about it.
The head pointer of the linked list (or whatever data type) should be created as it normally would in the calling function. This action would look like the following in the previously provided memory map if the pointer were initialized to NULL.
A |
B |
C |
D |
|
1 |
--- |
--- |
--- |
--- |
2 |
--- |
--- |
headPtr |
--- |
3 |
--- |
--- |
--- |
--- |
4 |
--- |
--- |
--- |
--- |
Or, it would look like the following if the linked list was already pointing at data.
A |
B |
C |
D |
|
1 |
--- |
--- |
--- |
--- |
2 |
--- |
--- |
headPtr |
--- |
3 |
First LL Node |
--- |
--- |
--- |
4 |
--- |
--- |
--- |
--- |
However, when the function is called with the "address of" operator (&), the head pointer will not be passed into the function. Instead a pointer to the head pointer will be passed, and it will look like that which you observed earlier in this section.
Note that while the head pointer and the first and any other linked list nodes were created prior to the function call, and continue to exist (this is called persistence), the head pointer address will be local to this function and will go out of scope when the function completes. This is acceptable because the program does not need the address of the head pointer after the function call; it just needs the head pointer.
It is happenstance that the address of the head pointer is located at B1; it doesn't matter but the compiler had to locate the address somewhere so that is where it happened to end up.
A |
B |
C |
D |
|
1 |
--- |
headPtrAddress |
--- |
--- |
2 |
--- |
--- |
headPtr |
--- |
3 |
First LL Node |
--- |
--- |
--- |
4 |
--- |
--- |
--- |
--- |
The pointer's address must be passed to the function, meaning that the "address of" the pointer (i.e., the pointer to the pointer) must be passed into the function, as shown here.
boolResult = removeNode( &localHeadPtr, dataToRemove );
Note that the compiler will require this syntax (i.e., it will show errors if it is not done this way) so that will help the programmer develop the appropriate code.
This is a good example of where thinking about what is to be done and getting the idea in English (or your own language) will lead directly to knowing what to do in the code. Keep thinking.
Important Side Note: While the above is a slightly imperfect model presentation of memory, the model does accurately represent the fact that the memory does not change when the program is running. As previously mentioned, the head pointer address will come into scope while removeNode is running, and it will go out of scope when the function has completed, but the head pointer itself will always be available to the program before, during, and after the removeNode (or any other function) call. That keeps it accessible and that is the goal here.
Pointers to Pointers
When the smoke all clears, double pointers are not terribly difficult to understand or use. That said, very few people figure them out by just talking about them. Programmers who are going to use double pointers must understand what is and what is not changing in a given pointer usage situation, especially if it is passed to a function as a parameter.
Don't be intimidated by this. Do make sure you understand the layers being applied. And then sit back and bask in the glory.