Upon completion of this lesson, you will be able to:
This lesson presumes the content of these lessons:
While difficult to master, pointers provide C programmers with additional and essential flexibility. They are a fundamental data type in C (and, of course, C++). Pointers provide a mechanism to manipulate memory dynamically, support dynamic data structures, and enable access to hardware, which makes them essential for systems programming.
The concepts about pointers covered in this lesson apply equally to C and C++.
In a computer, memory holds program instructions and data. This type of memory is generally called RAM, or Random Access Memory. Every location in physical or virtual memory is one byte (8 bits). Each location has a numeric address; addresses start at 0 and end at the largest address the computer’s CPU can handle. Older CPUs (and those used even today in appliances) often have a 16-bit address space, which means they can address memory from 0 to 2^16 - 1 (= 65,535), for a total of 65,536 bytes = 64 KB [1]. A CPU with a 32-bit address space has a program counter and an address register that can address memory from 0 to 2^32 - 1 (= 4,294,967,295), for a total of 2^2 × 2^30 = 4,294,967,296 bytes = 4 GB. A computer with a 32-bit address space can therefore have at most 4 GB of main memory in which to load and execute programs and hold data. Most modern CPUs (ca. 2023) have 64-bit address spaces, so they can address far more than that: in principle up to 2^64 bytes of RAM, or 16 EB (exabytes).
In a C program, memory is divided into four segments:
We can get the address of any location in any one of these segments.
When we declare a variable in a program, it means that we ask the compiler to reserve a “chunk” or block of consecutive bytes in memory. For example, in the code fragment below, we allocate several variables of different types. Depending on the type, different amounts of memory are allocated. The segment in which the memory is allocated depends on how we declare it.
The number of bytes allocated for a variable is type and system dependent. To find the exact number of bytes allocated on the system on which the program is running, use the sizeof operator as demonstrated below (sizeof yields a value of the unsigned integer type size_t, which printf() prints with %zu):
#include <stdio.h>
int main(int argc, char* argv[])
{
    // sizeof yields a size_t, so the matching format specifier is %zu
    printf("num bytes for int = %zu\n", sizeof(int));
    return 0;
}
For the example above, since all variables are declared within a function (main() in this case), the allocation is automatically done from the program’s stack. This memory is allocated upon entry to the function and automatically deallocated upon exit from the function.
The table below summarizes commonly used types in C programming. The column for “Format Specifier” is useful for printing or displaying variables using printf().
Type | Size (bytes) | Format Specifier |
---|---|---|
int | at least 2, usually 4 | %d, %i |
char | 1 | %c |
float | usually 4 | %f |
double | usually 8 | %f, %lf |
short int | at least 2, usually 2 | %hd |
unsigned int | at least 2, usually 4 | %u |
long int | at least 4, usually 4 or 8 | %ld, %li |
long long int | at least 8 | %lld, %lli |
unsigned long int | at least 4 | %lu |
unsigned long long int | at least 8 | %llu |
signed char | 1 | %c, %hhd |
unsigned char | 1 | %c, %hhu |
long double | at least the size of double; commonly 8, 12, or 16 | %Lf |
An array is a contiguous block of memory. It can be allocated either in the static segment or from the stack if declared as a variable, or dynamically using malloc(). In most expressions, the name of an array is converted to a pointer to its first element and can be used like a pointer. This implies that passing an array to a function behaves like call-by-reference: only the pointer is copied to the call stack, not the array, so any modification of the array in a function modifies the actual array that was passed (careful!).
Arrays can be allocated as global variables, local variables, or dynamically. The access is the same in all cases; the difference is only from which segment the memory is allocated, when the memory allocation occurs, and when the memory is deallocated.
For a globally allocated array (defined outside the scope of any function), the memory is part of the static data segment, not the text segment, and is allocated when the program starts. It remains allocated until the program exits. A global array of integers named a, for example, is accessible everywhere after its declaration.
For a locally allocated array (defined within the scope of a function), allocation occurs upon entry to the function and the memory is deallocated upon exit from the function.
For a dynamically allocated array, the allocation occurs when the allocation function (most commonly malloc()) is called. The memory is deallocated (freed or released) when free() is called. It is also automatically deallocated when the program exits.
The code fragment below shows each of the methods:
Initializing an array with a loop requires knowing the array bounds. C has no run-time environment that checks array bounds, so accessing elements beyond the allocated range does not produce an error message; the behavior is undefined and typically results in bugs such as corrupted memory or crashes.
In C, a local variable (including a local array) is not initialized automatically; its “value” is whatever is in memory at the time of allocation. Global and static variables are set to zero if no initializer is given. So, to avoid bugs, initialize every variable explicitly when, or right after, it is defined.
int i = 0; // initialized in the definition
int main(void)
{
    char a; // declared but not initialized
    a = '\0'; // explicit initialization by assignment
    return 0;
}
Like other local variables, arrays defined inside a function are not initialized in C. Using such an array without first initializing it is a common mistake; perhaps you are lucky and on your system the memory locations of the array always contain 0, but you cannot rely on that. You must initialize the array. The most common approach is to use a loop, as shown below, but small arrays can also be initialized by specifying default values.
#define ARRAY_SIZE 32 // no '=' and no ';' in a #define
int main(void)
{
    char a[ARRAY_SIZE]; // allocated but not initialized
    for (int i = 0; i < ARRAY_SIZE; ++i) {
        a[i] = '\0'; // set every element to a known value
    }
    return 0;
}
The example above demonstrates the use of a preprocessor directive to define a constant value. Such constants are resolved at compile time and no memory is allocated for them: the preprocessor substitutes the value 32 wherever the constant ARRAY_SIZE appears in the code. Also note that a #define directive must not contain an equals sign or end with a semicolon. If it does, the extra characters are substituted along with the value, which may or may not cause a compiler error and can lead to strange and hard-to-find bugs.
Do NOT put a semicolon character at the end of #define statements.
An array can also be initialized with default values during definition as the example below illustrates. Note that the array definition does not contain a dimension as the compiler can infer the size of the array from the default values.
Watch and follow along with the narrated demonstration below that illustrates how closely pointers and arrays are related and how C provides access to elements. The instructor uses the [repl.it] interactive development environment, although you can use any C/C++ compiler such as clang on macOS, Visual C++, or gcc.
Watch and follow along with the narrated tutorial below that summarizes the concepts of this lesson through a code walk-through.
Pointers are variables that hold an address in memory. They are essential for programming in C as they facilitate call-by-reference for arguments to functions and support dynamic memory allocation. Furthermore, the name of an array can be used as a pointer to its first element. However, dynamic memory allocation and pointers are common sources of bugs in C programs and must be used with care. Checking code prior to deployment and thorough code reviews are essential because C does not have a run-time environment in which “dangling” pointers or illegal memory references can be handled safely; in C, they cause either memory corruption or program crashes. Careless use of dynamic memory and pointers can also open the door to security intrusions, particularly “buffer overflow” exploits.
A pointer is a variable that stores the memory address of another variable. In C, pointers are used to access and manipulate data stored in memory.
Declaring Pointers:
A pointer is declared by using an asterisk (*) before the variable name in the declaration. For example:
int *ptr;
The above declaration creates a pointer ptr that can store the address of an integer variable.
Initializing Pointers:
Pointers are initialized using the address of the variable they point to. The address of a variable can be obtained using the & operator. For example:
int num = 10;
int *ptr = &num;
The above code initializes the pointer ptr to store the address of the variable num.
Accessing Data using Pointers:
The value stored at the memory location pointed to by a pointer can be accessed using the * operator. For example:
int num = 10;
int *ptr = &num;
printf("Value of num: %d", *ptr);
The above code prints the value of num using the pointer ptr.
Passing Pointers as Arguments:
Pointers can be passed as arguments to functions. This allows the function to access and modify the data stored at the memory location pointed to by the pointer. For example:
#include <stdio.h>
void increment(int *ptr)
{
    (*ptr)++; // increment the value that ptr points to
}
int main(void)
{
    int num = 10;
    int *ptr = &num;
    increment(ptr);
    printf("Value of num after increment: %d", num);
    return 0;
}
The above code increments the value of num by 1 using the pointer ptr passed as an argument to the increment function.
In conclusion, pointers are an important concept in C programming and are used to access and manipulate data stored in memory.
[1] In computing, kilo is not 1000 but 2^10 = 1024; mega is not one million but 1024 kilobytes, so 2^20 = 1,048,576; and giga is not one billion but 1024 megabytes, so 2^30 = 1,073,741,824. Colloquially we often revert to the decimal system, but not when talking about computer systems or computer architecture.