Monday, 29 December 2014

C++ Templates: 'Name Lookup', 'ADL' and 'Argument Deduction' (Part 1)

What's in a name ?
A lot, and a lot more if you care about programming in C++. 'names' are something, which a programmer uses all around a code. It might be for declaring variables or functions or namespaces or user defined types and what not.

To quote from 'C++ Templates: The Complete Guide' book (from here will use short hand notation of CTCG):



Names are a fundamental concept by means of which a programmer can refer to previously constructed entities.

So, when you are compiling your code with your favorite compiler, whenever it encounters a name, it needs to look-up and identify which entity is being referred to when the programmer is using this name.  

For example, see this piece of code:

typedef int T;
T * x;                 // (2) Here, 'T' is a type, so 'x' is a pointer of type 'T'
int y = 1, z = 2;
y * z;                 // (4) Here, 'y' and 'z' are variable names, so 
                       //     compiler considers it as a multiplication 
                       //     operation
As you can see, line numbers (2) and (4) kind of looks the same if we just consider it as a 'English text'. But, for a compiler, when it reaches line (2), it does a look-up of name 'T'. 

From the look-up, it understands that, it is a typedef for an integer type. So, from compilers perspective, it is a variable declaration. Now, as to what would the compiler do for line(4) is an easy guess, it again does a look-up for names 'y' and 'z' and considers it as multiplication operation since the names corresponds to variable names. So, it's based upon the 'context' of the name that the compiler understands the expression.




Types of Name
By now, as you are already aware that the compiler does a look-up of the symbols it has encountered till now while parsing or tokenizing the code, it must not come as a surprise that there are different types of name based on how the look-up is done. 

Since, C++ also has template feature, it also adds to the types of names.
To make things less boring, I will spend some time creating nice block diagram and all (I have been using 'gliffy' for this purpose for a long time):

This is not it! There is a whole bunch of name taxonomy which you can find in CTCG book. Since all of them are not required for this post, I have just elided it.


Qualified Names: [From the book] A name is a qualified name if the scope to which it belongs to is explicitly denoted using a scoperesolution operator (::) or a member access operator (. or ->) . For example, 'this->count' is a qualified name, but 'count' is not.


Special attention must be paid to the example, where it is mentioned that both 'this->count' and 'count' which is a part of the earlier expression are termed as name. Though they are implicitly equivalent (if count is a member of class), but 'count' is unqualified for the compiler and 'this->count' is qualified. [Read the definition again now]



Unqualified Names: A name which is not a qualified name is an unqualified name. How simple is that!


An example:
#include <iostream>

void print_func1() {
    ::std::cout << "Qualified std lookup" << ::std::endl;
}

void print_func2() {
    std::cout << "Unqualified std lookup" << std::endl;
}

int main() {
    print_func1();
    print_func2();

    return 0;
}
Lets make this as simple as possible:
1. Compiler encounters '::std::cout'  in function 'print_func1'. Now, since 'std' is prefixed by the scope resolution operator (::), it is directly linked to the global namespace.
As for , 'cout', it is already qualified since it is written with its enclosing namespace name prefixed as 'std::cout'.

The same story with the name '::std::endl'. Both 'std' and 'endl' are qualified names.


2. Compiler encounters 'std::cout' in function 'print_func2'. In this case, 'std' is not prefixed with any scope operator, hence the compiler needs to do an 'unqualified' look-up to find 'std', which it finds out under iostream header file.

But, 'cout' and 'endl' are qualified in this case as well as they are prefixed with the namespace they are enclosed with.


Dependent Names: A name that depends upon a template parameter. So, any name, qualified or unqualified but has a template parameter becomes dependent.


Nondependent Name:  A name that does not fit the above description is called nondependent name.


Example:

struct simple_pod {
    int val;
};

template <typename T>
struct complex_pod {
    T val;
};

template <typename T>
void some_func() {
    simple_pod tmp1;       // Non dependent name
    complex_pod<T> tmp2;   // Dependent name. Lookup postponed till compiler 
                               // instantiates it for a used type
}


Name Look-up
If you have reached till here (great going!!), that means we have now a fair understanding of 'names' in C++ and how the compiler looks up to find a relation with a name. Now, we will get into the details of it.
For understanding the name look-up in detail, I would recommend one to visit THIS page.

In this section, I am planning to cover below look-up techniques:
1. Qualified name look-up
2. Unqualified name look-up
3. Argument Dependent look-up

Qualified Name Look-up:
[Better have a look at the definition of Qualified names again]
From CTCG, Qualified names are looked up in the scope implied by the qualifying construct. If the scope is a class, then base classes may also be looked up.
(Imp) However, enclosing scopes are not considered when looking up qualifying names. This is done for 'unqualified lookup'.

Example (from CTCG):

int x;

class B {
public:
    int i;
};

class D : public B {
};

void (D* pd) {
    pd->i = 42;   // finds B::i as per the definition
}

Here, the name 'i' is qualified, since it is denoted using the member access operator. But, at the same time, the name 'pd' is unqualified.


Unqualified Name Look-up:

This is the longest to explain with many scenarios and also Argument Dependent look-up (ADL) is a branch out of this. I would be discussing ADL as a separate section altogether.

If you keep following things in mind, it would be really simple to understand it:
1. Imagine that the compiler is reading your source code and trying to make some sense out of it.
2. When a compiler hits a particular name for which it has to do a look-up, it has information of symbols which it has read of till now. It doesn't know anything which you have defined later in the source code. Therefore, the look-up is always performed towards the top from the point where the compiler found the name.
3. Generally, the look-up is done from the inner-most scope to the global scope. It means that, the name is first looked up at the local scope, for eg: inside the function body where the name was found, if it is not found there, it will be looked up at the global scope.

Consider this example taken from cppreference  article:

namespace A {
   namespace N {
       void f();
       int i=3; // This is third choice (if not second), innermost namespace of function f()
    }
    int i=4;    // This is fourth choice (if not third), last user defined namespace for function f()
}
 
int i=5;  // This is fifth choice (if not fourth), global scope.
 
void A::N::f() {
    int i = 2;      // This is second choice (if not first) , function body scope
    while(true) {
       int i = 1;   // This is first choice, innermost scope.
       std::cout << i; // <= Consider Compiler is here. Look-up 'i'
    }
}

In the comments, I have explained how the unqualified look-up would have accepted the definition in the absence of others.

Friend function(s) comes up with a slightly different kind of rule, but I will skip that in this post.


For complete details, you can go to THIS site.


In the next post, we will be discussing something more interesting but based upon the facts developed here, Argument Dependent Lookup (ADL) / Koenigs look-up.

(To be Continued...)

2 comments: