Functions and Stacks in C
The general form of a function in C is:
The body contains statements that define what the function does. Within a function, we can have comments, variable declarations, variable assignments, control flow loops, etc.For a statement that has conditional logic, if the following line only has one statement, you can omit the curly braces {}
:
Declarations
Whilst there is no requirement to declare a function before defining it, it can sometimes be useful. We declare the function similar to defining, but omit the body and possibly the parameter names, e.g., int max(int num1, int num2);
can also be written as int max(int, int);
, as there is no actual reference to the variables that is needed in a declaration, it just makes it easier to read.
Local Variables
When calling a function, we might assume that the variables we call it with are the ones used in the function, e.g., void add(int a) { a += 1; };
we would expect would increment a
, then we could use this function to manipulate variables...
Unfortunately, (or fortunately?) this is not how it works. When we call a function with some local variables, a copy of the variable is created on the stack frame for that function call. When we call add()
, it takes a copy of a
then works on the local copy. Any changes within that function are then destroyed.
Recursion and the Stack
Recall that a simple iterative definition of the factorial function can be created:
Implemented in C, this would look as such:
This definition allows us to call the function, create a stack frame for it, then recursively call factorial()
. This all goes well until a point, when the stack grows too far down, and begins to overflow. We can fix this using tail recursion, and indeed the compiler would handle this for us:
// Main function
int factorial(unsigned int n) {
return factorial_aux(n, 1);
}
// Aux recursive function
int factorial_aux(unsigned int n, unsigned int acc) {
if(n<=1)
return acc;
return factorial_aux(n-1, acc*n);
}
// Transformed into this by compiler
int factorial_aux(unsigned int n, unsigned int acc) {
loop:
if(n<=1)
return acc;
acc *= n--;
goto loop;
}
Call-by-Value vs Call-by-Reference
Call-by-Value
Call-by-value is the way that all the function calls listed thus far have operated. This is where passing the arguments to the function copies the value of the argument into the function. Changes inside the function to these variables affect the variable that has been copied onto the stack, but not the variable that was passed in by the caller function.
The only way for a function to affect the caller in this way would be through a return value. Take, for example, our simple add()
example. The only way for us to update the a
outside the function would be:
#include <stdio.h>
// Original implementation, doesn't work
void add(int a) { a += 1; };
// New implementation, returning modified a
int add(int a) { return a += 1; };
// Simple driver program
void main() {
int a = 1;
printf("%d", a);
a = add(a);
printf("%d", a);
}
This has an issue in that we can only modify one value, and then can't return anything else. Now comes good old call-by-reference!
Call-by-Reference
Instead of passing the value into the function, we pass the address the variable is stored at, known as the pointer, to then access the actual variable outside the function call. Therefore, any changes inside the function will affect the variables outside the function.
To help with this, C introduced the dereferencing operator *
and the referencing operator &
. We have sort of already touched on these in main(int argc, char** argv)
. The *
operator allows us to get the value of the pointer at that address, and the &
operator allows us to pass the address of the value into a function.
We can therefore redefine the simple add
function, such that it takes a pointer to the variable and manipulates that pointer:
#include <stdio.h>
// New function using pointers
void add(int* a) {
*a += 1;
}
// Updated driver program
void main() {
int a = 1;
printf("%d", a);
add(&a); // Pass in the address to a
printf("%d", a);
}
The lecture slides have an alternative with swap(int* a, int* b)
, which takes the value of a
and swaps it with the value of b
. Without pointers, this would be impossible.
Pointers
What we have just introduced in the call-by-reference section are what are known as pointers. At the low level, we have a pointer type, which simply stores the memory address we are referencing. Pointers are a very powerful part of the C language, and are one of the key features that allow it to be so useful for low level operations.
Arithmetic
Just like other types, pointers can be manipulated. They are quite clever in this way, as they allow us to increment and decrement in terms of the size in bytes of the address that they reference.
The address at (pa+i)
, where pa
is a pointer to a
, given a
has type T*
would simply be (pa) + [i* sizeof(T)]
. From this, we have *(pa+i) == (*pa)[i]
. Remember that when we are working in C, we use the logical memory layout, unlike the underlying memory on the computer which often employs ASLR (for security), paging, and wildly different physical addresses.
Null Pointers
Another important concept to remember when writing C is to assign pointers to NULL
initially, such that the pointer points to memory address 0
of the program. We do this as part of error handling. For example, if malloc
fails, it will point to NULL
, and as good programmers, we should check to see if the pointer is null.
Also, in C, recall that if()
evaluates to true for a nonzero value, so if the pointer references 0
, then we fail the conditional.