This is the first article in the 'Talking about C pointer' series. In it, I'll discuss some basic concepts: what pointers are, how to use them, and a few issues around multi-level pointers and pointers with 'const'.
In the Von Neumann architecture, both data and code are stored in main memory. The CPU uses an address to locate the data or code it needs; once found, the data is fetched into registers, ready to be computed on. In high-level languages such as C/C++, Java and C#, a variable has two attributes: an address and a value. In general, a pointer is an object that stores the address of another object in main memory. In other words, a pointer is a 'special variable' that refers to another entity (a variable, a structure or a function), and the only thing stored in this 'special variable' is a memory address.
During my undergraduate studies, I was always taught that 'C/C++ has pointers while other high-level languages don't'. Actually, that's not quite right. Every high-level language uses pointers under the hood, because the existence of pointers is tied to the addressing modes found in almost every ISA. The difference is that C/C++ exposes pointers to programmers so that they can manipulate memory directly, while other languages hide them for the sake of simplicity and safety.
Now that we're equipped with the concept, let's go over pointers in C. A pointer is a special variable in C; its size is typically 4 bytes on a 32-bit machine and 8 bytes on a 64-bit machine.
Now, let's take a look at a simple program.
/* test.c */
#include <stdio.h>
int main(void) {
  int a = 5;
  int *p = &a;
  // %p expects a void *, so each pointer is cast before printing
  printf("&p = %p\n", (void *)&p); // address of p
  printf(" p = %p\n", (void *)p); // value of p
  printf("&a = %p\n", (void *)&a); // address of a
  printf("*p = 0x%x\n", *p); // value stored at the address that p holds
  printf(" a = 0x%x\n", a); // value of a
  return 0;
}
Let's compile this program with GCC:
gcc -m32 -o test test.c
The output is (on x86_64 Linux, compiled with the -m32 option):
&p = 0xffc6abb8
p = 0xffc6abb4
&a = 0xffc6abb4
*p = 0x5
a = 0x5
As we just illustrated, the pointer p is a variable just like a. The type of a is int, so it stores an integer, while the type of p is int*, so it stores the address of an integer. When p points to a, p stores the address of a rather than the value of a. In C, the unary operator '*' dereferences a pointer, giving access to the object it points to. This simple example shows clearly what a pointer is in C and how to use it: p == &a and *p == a. Both p and a sit on the stack; in this run the compiler happened to place them next to each other, so their addresses differ by 4 (the size of int).
Pointers and One-dimensional Arrays
An array is a data structure that stores variables of the same type consecutively, both logically and physically. A pointer to that element type can point into the array and visit every element, and we can also alter the values in a through p:
int a[5] = {1, 2, 3, 4, 5}; // a is an integer array with 5 elements
int *p = a; // p is a pointer to int that points to the first element of array a
For array a, the name 'a' has a double meaning: it names the array, and in most expressions it is also converted to the address of the first element. That means that although a and &a have the same address value, their types differ. a refers to the first element a[0] (type int*), so a+1 advances by sizeof(a[0]) = 4 bytes (the size of int) in this case, while &a refers to the whole array (type int(*)[5]), so &a+1 advances by sizeof(a) = 4*5 = 20 bytes (the size of the whole array). Therefore, the type of a pointer determines the size of the data it points to, and with it the step of pointer arithmetic.
Pointer Assignment
When assigning pointers, the types on both sides should be the same. Otherwise, the compiler either rejects the assignment or performs an implicit conversion (usually with a warning). Now let's look at an example:
/* test.c */
#include <stdio.h>
int main(void) {
  int a[5] = {1, 2, 3, 4, 5}; // a is an integer array with 5 elements
  int *q = &a; // assign the address of the whole array to q (the types here are different!)
  // %p expects a void *, so each pointer is cast before printing
  printf("&a = %p\n", (void *)&a);
  printf("&a+1 = %p\n", (void *)(&a+1));
  printf("q = %p\n", (void *)q);
  printf("q+1 = %p\n", (void *)(q+1));
  return 0;
}
Let's compile this program with GCC:
gcc -m32 -o test test.c
The output is (on x86_64 Linux, compiled with the -m32 option):
&a = 0x2e609a10
&a+1 = 0x2e609a24
q = 0x2e609a10
q+1 = 0x2e609a14
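We assign &a to q, so &a == q, but &a+1 != q+1. When we do the assignment, the type of &a and the type of q are not the same, so the compiler performs an implicit conversion (GCC diagnoses this as an incompatible pointer type, and newer GCC releases may reject it unless an explicit cast is added). &a is the address of the whole array, and its type is int(*)[5] (this is an array pointer; I'll discuss it in the next talk), while q's type is int*. Thus, counted in bytes, &a+1 is sizeof(a) = 20 past &a, while q+1 is sizeof(int) = 4 past &a.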
Multi-level Pointers
Pointers can also be chained through multiple levels of indirection:
int a = 5;
int *b = &a; // *b = a = 5
int **c = &b; // **c = *b = a = 5
With a two-level pointer, the outer pointer (c) stores the address of the inner pointer (b), and the inner pointer stores the address of the variable you want to refer to (a).
Pointers with 'const'
'const' can be applied to a pointer when it is declared:
const char * p // the value that p points to cannot be altered through p
char * const p // p cannot be made to point to any other address
const char * const p // neither p nor *p can be altered
Now let's look at an example:
/* test.c */
#include <stdio.h>
int main() {
  char a[10] = "abcde1234";
  char s[10] = "4321edcba";
  const char *b = a;
  char *const c = a;
  const char *const d = a;
  b = s; // correct
  *c = 'A'; // correct
  printf("b = %s\n", b);
  printf("c = %s\n", c);
  printf("d = %s\n", d);
  //b[0] = 'A'; // wrong
  //c = s; // wrong
  //d[0] = 'A'; // wrong
  //d = s; // wrong
  return 0;
}
Let's compile this program with GCC:
gcc -m32 -o test test.c
The output is (on x86_64 Linux, compiled with the -m32 option):
b = 4321edcba
c = Abcde1234
d = Abcde1234
If we uncomment the four wrong assignments, GCC reports errors:
test.c: In function ‘main’:
test.c:17:7: error: assignment of read-only location ‘*b’
b[0] = 'A'; // wrong
^
test.c:18:4: error: assignment of read-only variable ‘c’
c = s; // wrong
^
test.c:19:7: error: assignment of read-only location ‘*d’
d[0] = 'A'; // wrong
^
test.c:20:4: error: assignment of read-only variable ‘d’
d = s; // wrong
Conclusion
This post reviewed the basic concepts of C pointers. The existence of pointers has everything to do with the addressing modes in the Instruction Set Architecture. A pointer is itself a variable; it stores the address of another object and can therefore refer to it. The type of a pointer is quite important: assignment between different pointer types causes either a compile error or an implicit conversion. This is important to understand before we move on to the next talk: pointer to array, array of pointers and two-level pointers.