Objectives

Upon completion of this lesson, you will be able to:

  • define pointer variables
  • declare different types of pointers
  • dynamically allocate memory
  • employ alternative memory management mechanisms
  • distinguish between call-by-value and call-by-reference
  • pass or return data to and from functions by reference
  • understand how arrays relate to pointers
  • explore strings and how pointers are used to support them
  • find common errors and mistakes with pointers

Prerequisites

This lesson presumes the content of these lessons:

Overview

While difficult to master, pointers provide C programmers with additional and essential flexibility. They are a fundamental data type in C (and, of course, C++). Pointers provide a mechanism to dynamically manipulate memory, enhance support for dynamic data structures, and enable access to hardware, which makes them essential for systems programming.

The concepts covered in this lesson apply equally to C and C++.

Computer Memory

In a computer, memory holds program instructions and data. This type of memory is generally called RAM, or Random Access Memory. Every addressable location in physical or virtual memory holds one byte (8 bits). Each location has a numeric address that starts at 0 and ends at the highest address the computer’s CPU can handle. Older CPUs (and those used even today in appliances) often have a 16-bit address space, which means that those CPUs can address memory from 0 to 2^16 - 1 = 65,535, for a total of 65,536 bytes or 64 KB (see footnote 1). A CPU with a 32-bit address space has a program counter and an address register that can address memory from 0 to 2^32 - 1 = 4,294,967,295, for a total of 2^2 × 2^30 = 4,294,967,296 bytes or 4 GB. A computer with a 32-bit address space can therefore have at most 4 GB of main memory in which to load and execute programs and hold data. Most modern CPUs (ca. 2023) have 64-bit address spaces, so they can address far more than that: in principle up to 2^64 bytes of RAM, or 16 EB (exabytes).

Program Memory

In a C program, memory is divided into four segments:

  • text segment holding the program instructions
  • static segment holding static/global variables
  • heap segment for dynamically allocated memory
  • stack segment for automatically allocated memory

We can get the address of any location in any one of these segments.

Variables

When we declare a variable in a program, it means that we ask the compiler to reserve a “chunk” or block of consecutive bytes in memory. For example, in the code fragment below, we allocate several variables of different types. Depending on the type, different amounts of memory are allocated. The segment in which the memory is allocated depends on how we declare it.

int main(int argc, char* argv[])
{
  char a;     // allocates one byte
  int i;      // typically allocates four bytes
  
}

The number of bytes allocated for a variable is type and system dependent. To find the exact number of bytes allocated on the system on which the program is running, use the operator sizeof as demonstrated below. sizeof yields a value of the unsigned type size_t, which is printed with the format specifier %zu:

#include <stdio.h>

int main(int argc, char* argv[])
{
  printf("num bytes for int = %zu\n", sizeof(int));
  return 0;
}

Automatic Variables

For the example above, since all variables are declared within a function (main() in this case), the allocation is automatically done from the program’s stack. This memory is allocated upon entry to the function and automatically deallocated upon exit from the function.

Size of Types

Types

The table below summarizes commonly used types in C programming. The column for “Format Specifier” is useful for printing or displaying variables using printf().

Type                     Size (bytes)                                  Format Specifier
int                      at least 2, usually 4                         %d, %i
char                     1                                             %c
float                    usually 4                                     %f
double                   usually 8                                     %f or %lf
short int                at least 2, usually 2                         %hd
unsigned int             at least 2, usually 4                         %u
long int                 at least 4, often 8                           %ld, %li
long long int            at least 8                                    %lld, %lli
unsigned long int        at least 4                                    %lu
unsigned long long int   at least 8                                    %llu
signed char              1                                             %c
unsigned char            1                                             %c
long double              at least as large as double; often 8, 12, 16  %Lf

Arrays vs Pointers

An array is a contiguous block of memory. It can be allocated in the static segment or on the stack if declared as a variable, or dynamically on the heap using malloc(). In most expressions, the name of an array is converted to a pointer to its first element and can be used like a pointer (exceptions include sizeof and the & operator, which act on the whole array). This implies that passing an array to a function behaves like call-by-reference: only the address of the first element is passed and the array is not copied to the call stack, so any modification of the array in a function modifies the actual array that was passed (careful!).

Allocating Arrays

Arrays can be allocated as global variables, local variables, or dynamically. The access is the same in all cases; the difference is only from which segment the memory is allocated, when the memory allocation occurs, and when the memory is deallocated.

For a globally allocated array (defined outside the scope of any function), the memory is part of the static segment and is allocated when the program starts. It remains allocated until the program exits. The code fragment below defines an array of integers. The array variable a is accessible everywhere after this declaration.

#include <stdio.h>

int a[32];     // global array of 32 integers

int main(void)
{
  
}

For a locally allocated array (defined within the scope of a function), allocation occurs upon entry to the function and the memory is deallocated upon exit from the function.

For a dynamically allocated array, the allocation occurs when the allocation function (most commonly malloc()) is called. The memory is deallocated (freed or released) when free() is called. It is also automatically deallocated when the program exits.

The code fragment below shows each of the methods:

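#include <stdio.h>
#include <stdlib.h>

int a[32];     // global array of 32 integers (static segment)

int main(void)
{
  int b[32];                          // local array of 32 integers (stack)
  int *c = malloc(32 * sizeof(int));  // dynamic array of 32 integers (heap)
  if (c == NULL) return 1;            // allocation can fail

  free(c);                            // release the heap memory
  return 0;
}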

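To initialize an array with a loop requires knowing the array bounds. C has no run-time environment that checks array bounds, so accessing elements beyond the allocated range is not reported as an error. It is undefined behavior that typically corrupts adjacent memory or crashes the program, and it results in bugs that are hard to find.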

Initializing Arrays

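In C, a local (automatic) variable, including an array, is not initialized when it is defined, and its “value” is whatever happens to be in memory at the time of allocation. Global and static variables are set to zero at program start unless an initial value is given. So, to avoid bugs, every variable must be given a value before it is read.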

int i = 0;    // global, initialized at definition

int main(void)
{
  char a;    // declared but not initialized

  a = '\0';  // value assigned after the definition
}

Like other local variables, arrays defined inside a function are not initialized in C. Forgetting to initialize an array and then using it is a common mistake; perhaps you are lucky and on your system the memory locations of the array always contain 0, but you cannot rely on that. You must initialize the array. The most common approach is to use a loop, but small arrays can also be initialized by specifying default values.

The example below initializes an array with a loop.

#define ARRAY_SIZE 32

int main(void)
{
  char a[ARRAY_SIZE];    // allocated but not initialized

  for (int i = 0; i < ARRAY_SIZE; ++i) {
    a[i] = '\0';
  }

  return 0;
}

The example above demonstrates the use of a preprocessor directive to define a constant value. Such constants are resolved before compilation and no memory is allocated for them: the preprocessor substitutes the text 32 wherever ARRAY_SIZE appears in the code. The directive takes no equals sign and no semicolon. Anything after the name, including a stray = or ;, becomes part of the substituted text, which may or may not cause a compiler error and can lead to strange and hard-to-find bugs.

Do NOT put a semicolon character at the end of #define statements.

An array can also be initialized with default values during definition as the example below illustrates. Note that the array definition does not contain a dimension as the compiler can infer the size of the array from the default values.

int bufSizes[] = {32, 64, 128, 256, 1024};

int main(void)
{
  // ...

  return 0;
}

Demo: Arrays and Pointers

Watch and follow along with the narrated demonstration below that illustrates how pointers and arrays are related and how C provides access to elements. The instructor uses the [repl.it] interactive development environment, although you can use any C/C++ compiler such as clang on macOS, Visual C++, or gcc.

Summary

Pointers are variables that hold an address in memory. They are essential for programming in C as they facilitate call-by-reference for arguments to functions and support dynamic memory allocation. Furthermore, the name of an array is, in most expressions, converted to a pointer to its first element. However, dynamic memory allocation and the use of pointers are common sources of bugs in C programs and must be handled with care. Checking code prior to deployment and thorough code reviews are essential, as C does not have a run-time environment where “dangling” pointers or illegal memory references can be handled safely. In C, they either corrupt memory or crash the program. Careless use of dynamic memory and pointers can also open the door to security intrusions, particularly “buffer overflow” exploits.

ChatGPT-3 Generated Text

A pointer is a variable that stores the memory address of another variable. In C, pointers are used to access and manipulate data stored in memory.

Declaring Pointers:

A pointer is declared by using an asterisk (*) before the variable name in the declaration. For example:

int *ptr;

The above declaration creates a pointer ptr that can store the address of an integer variable.

Initializing Pointers:

Pointers are initialized using the address of the variable they point to. The address of a variable can be obtained using the & operator. For example:

int num = 10;
int *ptr = &num;

The above code initializes the pointer ptr to store the address of the variable num.

Accessing Data using Pointers:

The value stored at the memory location pointed to by a pointer can be accessed using the * operator. For example:

int num = 10;
int *ptr = &num;
printf("Value of num: %d", *ptr);

The above code prints the value of num using the pointer ptr.

Passing Pointers as Arguments:

Pointers can be passed as arguments to functions. This allows the function to access and modify the data stored at the memory location pointed to by the pointer. For example:

#include <stdio.h>

void increment(int *ptr)
{
  (*ptr)++;
}

int main(void)
{
  int num = 10;
  int *ptr = &num;
  increment(ptr);
  printf("Value of num after increment: %d\n", num);
  return 0;
}

The above code increments the value of num by 1 using the pointer ptr passed as an argument to the increment function.

In conclusion, pointers are an important concept in C programming and are used to access and manipulate data stored in memory.


All Files for Lesson 8.110



  1. In computing, kilo is not 1000 but 2^10 = 1024; mega is not one million but 1024 kilobytes, so 2^20 = 1,048,576; and giga is not one billion but 1024 megabytes, so 2^30 = 1,073,741,824. Colloquially we often revert to the decimal system, but not when talking about computer systems or computer architecture.