Tuesday, 13 January 2015

C++ Templates: 'Name Lookup', 'ADL' and 'Argument Deduction' (Sentinel)

Finally, we are at the final part of this series. Hope you enjoyed the reading till now. 
If you have not had the chance to go over the previous two articles, I will link you to those:

Part 1
Part 2


Template Argument Deduction

Just like other topics, I can only dream about knowing all the nooks and crooks of argument deduction in the template world. But, what I can put forth here is what every experienced C++ programmer must know about it.  

Most of the things I will be writing about is directly influenced by what is already explained in "C++ Templates: The Complete Guide" book. It is highly recommended for everyone who makes use of templates in their C++ code to read through it and always keep it at your hands distance.

Why Argument Deduction ?

A simple template code which one might have written to understand template would have looked like this:
#include <iostream>
#include <string>
#include <typeinfo>

template <typename T>
void func(T arg) {
   std::cout << "func called with arg " << typeid(T).name() << std::endl;
}

int main() {
    func<int>(5);
    func<char>('a');
    return 0;
}


Now, assume you have below function, which you need to call many times and at many places in your code:
template<typename T1, typename T2, typename T3>
void someCommonFunction(T1& a, T2 b, T3 c) {
....
}

...
someCommonFunction<std::string, char, int>(str, 'b', 5);

This is what you would be writing every where! Too much verbosity!!
This is where argument deduction comes in. It deduces the types auto-magically thereby relieving us from the burden of mentioning the intended template types everywhere.

Also, a very important point to remember is that, template type deduction works only for function templates and member function templates, it is not applicable at the class level.


Let's cover some basics...

Look at the below example:
template <typename T>
T max (const T& a, const T& b) {
    return a < b ? b : a;
}

int max_val = max(10, 11);

Lets now see how compiler might be thinking while deducing the types:
1. It encounters the first parameter as 10 and considers its data type as 'int'. So, the parameterized type 'T' of function template max is considered 'int'  as one of the option.

2. Now, it checks the data type of the second argument, which is also 'int'. This matches with the previous conclusion done in step #1 and also fits perfectly as the argument type of the function template based upon the paremeter type i.e both the arguments should be of same type.
Hence, in this case, type deduction succeeded.

3. Let's now consider the max call like below:
int max_val = max(10, 10.1);

4. For the above case, step #1 will tentatively deduce 'T' as 'int'. But, for the second argument, 'T' will be deduced as 'double'. This second deduction does not conform with the first deduction and also w.r.t the parameter type of the function template i.e. both argument needs to be of the same type.
Hence, for this case, argument deduction rightfully fails and the compiler looks for other overloaded function with the same name 'max' to match it with the actual call.


Type Deduction Scenarios

There are various scenarios depending upon the argument type of the function template on which deduction of type T is dependent. These are:
1. Argument type is either a pointer or reference.
2. Argument type is pass by value.
3. [C++11 specific] Argument type is a forwarding reference. (We will not be covering this now.)

NOTE: Examples in this section are influenced by what is provided in 'Effective Modern C++' by Scott Meyers.

Case 1. When argument type is either reference or pointer.
Remember that, if the argument type of the callee is a reference or a pointer, then, the pointer or the reference is ignored. 
For example:
template <typename T>
void func(T& param) {   // See that argument is a reference type
}

int x = 27;
const int cx = x;
const int& rx = x;

f(x);  // 'x' is int; 'T' is int; Argument type is 'int&'

f(cx); // 'cx' is 'const int'; 'T' is 'const int'; Arg type is 'const int&'

f(rx); // 'rx' is 'const int&'; 'T' is 'const int'; Arg type is 'const int&'

Following the comments, it should be pretty straightforward to grasp what is happening here.
The important point to remember here is that:
When we pass a const object to a reference parameter, the object remains as a const i.e. const-ness (also volatile) of the object becomes part of the type deduced for T.
Let's see the same example again with a small change to the argument type:
template <typename T>
void func(const T& param) {
}

int x = 27;
const int cx = x;
const int& rx = x;

f(x);  // x is int; T is int; Arg type is const int&

f(cx); // cx is const int; T is int; Arg type is const int&

f(rx); // rx is const int&; T is int; Arg type is const int&
If param was a pointer ( or a pointer to const ) instead of a reference, things would have been the same. The note above regarding the const-ness (also volatile) of the object also holds in case of the pointer.


Case 2. When argument is passed by value
When the parameter is passed by value, a completely new object is created at the target destination. This has some major changes in the rules regarding c-v (const - volatile) qualifier which we saw for reference/pointer types.

As before, if the callers expression type is a reference, ignore the reference part. If the callers expression type has c-v qualifiers, then they are stripped off.

Example:
template <typename T>
void func(T param) {
}

int x = 27;
const int cx = x;
const int& rx = x;

f(x); // x is int; T is int; Arg type is int

f(cx);// cx is const int; T and Arg type are int

f(rx);// rx is const int& ; T and Arg type are int

As can be seen from the above example that the c-v qualifiers are stripped off when passed by value.


Decaying

Case 1. Decaying of Array into a pointer
In many contexts, an array decays into a pointer to its first element. This is what we know right from C. But, this decaying also takes place in case of function templates accepting parameter by-value!

Example:
const char name[] = "Arun"; // Type of name is const char[5]

const char* pname = name; // array decays to a pointer

template <typename T>
void func(T param) {
}

f(name); // T is const char*, name is const char[5]

Now, the confusing part, if we modify the function template to accept argument by-reference instead of by-value, the deduced type for T is the actual type of the array i.e const char[5], and the type of func's parameter is const char (&) [5].



Use of type deduction

Have a look at boost source or STL implementation, they are every where. It is one of the very basic requirements to understand how template works after all.
It is used in type traits, boost function etc.

What I will be showing here is a pretty useless example, but will cover how in general type traits are written and also shows a peek into boost function traits

Beware, following code is only for those who can bend their mind :) 
#include <iostream>

// Type trait helper function
// Determines at compiler time whether type is signed integer or not
template <typename T> struct is_integer { enum {value = false}; };
template <> struct is_integer<int> { enum {value = true}; };

// Base structure of function trait
template <typename Func> struct function_traits {};

// Partially specialized function_trait
// to accept a function signature
template <typename R>
struct function_traits<R (void)> {
    typedef R result_type;
};

// Partially specialized function_trait 
// to accept a function pointer
template<typename R>
struct function_traits<R (*) (void)> {
    typedef R result_type;
};


int test_func() {
    std::cout << "This is a test function" << std::endl;
    return 42;
}

template <typename Fun>
typename function_traits<Fun>::result_type 
my_function(Fun& f) {
    return f();
}

int main() {
    if (is_integer<function_traits<float(void)>::result_type>::value == true) {
        std::cout << "Return type of function is integer" << std::endl;
    } else {
        std::cout << "Return type is not integer!!" << std::endl;
    }

    int (*fptr)() = test_func;

    if (is_integer<function_traits<int (*)(void)>::result_type>::value == true) {
        std::cout << "Return type of function is integer" << std::endl;
    } else {
        std::cout << "Return type of the function is not integer!!" << std::endl;
    }

    std::cout << my_function(test_func) << std::endl;

    return 0;
}

For those who reached at this point of the page in 5 seconds, have patience, it's not that difficult! Just go one line at a time and you will get it for sure.

Basically, we are just identifying the return type of the function based upon the signature, boost function traits does  a lot more than that.

Disabling Type Deduction

Yes, you can do that as well. With C++, getting things done is never an issue, the only ugly part that can be is the code itself. :)
Recently, I got bogged down by the question, why 'std::forward' cannot use type deduction ?
As always, I found the answer in THIS SO post. That's when the use of 'identity type trait' stuck me. So simple, yet so powerful. 
Lets dig in to an example right away...
#include <iostream>
#include <string>

template <typename T>
void func_deduct(T& val) {
    std::cout << "With deduction: " << val << std::endl;
}

// Identity type trait
template <typename T>
struct identity {
    typedef T type;
};

template <typename T>
void func_nodeduct(typename identity<T>::type& val) {
    std::cout << "No deduction: " << val << std::endl;
}

int main() {
    std::string str("Will it work?");
    //func_nodeduct(str);                // Uncomment this and face the wrath of compiler
    func_nodeduct<std::string>(str);

    return 0;
}


As to why deduction does not work, in short, it is because template type T in function 'func_nodeduct' appears in 'nondeduced' context.

From the Standard 14.8.2.4/4:
The nondeduced contexts are:
  • The nested-name-specifier of a type that was specified using a qualified-id.
  • A type that is a template-id in which one or more of the template-arguments is an expression that references a template-parameter.
When a type name is specified in a way that includes a nondeduced context, all of the types that comprise that type name are also nondeduced. However, a compound type can include both deduced and nondeduced types. [Example: If a type is specified as A<T>::B<T2>, both T and T2 are nondeduced. Likewise, if a type is specified as A<I+J>::X<T>IJ, and T are nondeduced. If a type is specified as void f(typename A<T>::B, A<T>), the T in A<T>::B is nondeduced but the T in A<T> is deduced. ]

There are lots more to be discussed about argument deduction, but I will call it a day. One can lookup here to understand it in more detail.

Saturday, 3 January 2015

C++ Templates: 'Name Lookup', 'ADL' and 'Argument Deduction' (Part 2)

This post as the name suggests is a follow up of my previous post . I would recommend to go through it once before we start digging into 'Argument Dependent Lookup' (ADL from here on) or 'Koenig Lookup'.

Let's start with a surprise story from Andrew Koenig himself which I had come across in  Dr. Doob's journal:
One of the comments on my article last week noted that argument-dependent lookup in C++ is often called "Koenig lookup". I didn't invent it, but unfortunately, I don't know who did, so I don't know where the credit -- or blame -- is really due.

Back to Hello, World! 
No, not "hello world" again! But wait, this has a lot to do with ADL! ADL is what that makes your C++ hello world program so simple! Just stay with me.

Here we go:
#include <iostream>

int main() {
    std::cout << "Hello, World!" << std::endl;
    return 0;
}

This simple program has a lot of C++ intricacies abstracted, in the end making the program extremely simple to comprehend. 

So, let's dissect and understand what is going on behind the scenes for this little innocuous piece of code. For the starters, how many function arguments and function calls do you see happening/appears will happen in line 4 ?
  1. If you answered "What! There are no function calls happening here", welcome to C++, enjoy this article and be ready to get shocked.
  2. Did you say "Two function calls and Three arguments" ?. Close enough, but not quiet there.
  3. But, if you answered "Three function calls and Three arguments", you may or may not be right. 
The answer can be either option 2 or option 3. It really depends upon the argument type.

What is std::cout ?

'std::cout' is an object of 'ostream' class. These objects can write sequences of characters and represent other kinds of data.

From Standard 27.4.1 [iostream.objects]
#include <ios>
#include <streambuf>
#include <istream>
#include <ostream>

namespace std {
  extern istream cin;
  extern ostream cout;
  extern ostream cerr;
  extern ostream clog;

  extern wistream wcin;
  extern wostream wcout;
  extern wostream wcerr;
  extern wostream wclog;
}

What is '<<' ?

It is an overloaded function for operator <<. For our case this is a member function of class ostream and can also be a regular function outside of class ostream but inside the namespace 'std' in which case it accepts 'ostream' class as its argument.
We will get more into it shortly.

What is std::endl ?

'std::endl' is a function template declared as:
template <class charT, class traits>
basic_ostream<charT,traits>& endl(basic_ostream<charT,traits>& os);
This is the third function which gets called after its passed as a parameter in the second call to the function operator <<.

The Real "Hello World"

Brace yourselves, waiting for you is an ugly code.

#include <iostream>

int main() {
    std::cout.operator <<("Hello, World!").operator <<(std::endl);
    return 0;
}
[ We have seen worse, haven't we? :) ]

"But, but there are just 2 function calls and 2 arguments !?". No, there are 3 function calls (do not forget 'std::endl' which is a function and gets called later) and 2 arguments for this case.

Consider another example:
#include <iostream>

namespace std_c {

    // Forward declare
    class Ex;

    std::ostream& operator << (std::ostream&, const Ex&);

    class Ex {
    public:
        Ex(int val = 0): val_(val) {}
        friend std::ostream& operator << (std::ostream& out, const Ex& obj);
    private:
        int val_;
    };

    std::ostream& operator << (std::ostream& os, const Ex& obj) {
        os << obj.val_;
        return os;
    }
};

int main() {
    std_c::Ex tmp(42);
    std::cout << tmp << std::endl;

    return 0;
}

How do you think this is working ?
Questions :
  1. For sure ostream class or cout object does not have a operator << member function that can understand how to print our custom class 'Ex'. So, its certain that a code similar to what we have shown for 'Hello world' won't work here. So how does it print it?
  2. Inside the 'main' routine, we have not mentioned any scope for function '<<', it is unqualified name. How did it end up calling the one inside namespace std_c 
Let's dive into ADL to find answers for the above questions.

Welcome to the world of ADL!

ADL stands for 'Argument dependent lookup'. As the name suggests, it is a type of lookup based upon the type of arguments. Hence, ADL is only applicable for function name lookup. To be more precise, ADL is only applicable for lookup of unqualified function names based upon the arguments of the function.
Though it's still bit incomplete, but we will get there.

Where does the lookup happen ?

From cppreference:
The lookup for the unqualified function names are done in the namespaces of their arguments in addition to the scopes and namespaces considered by the usual unqualified lookup.
Does that strike something ? It says, the compiler also looks up to find a suitable function in the namespace of the arguments of the function !

For our last example, this is how the code looks like:
std_c::operator << (std::cout, std_c::Ex).operator << (std::endl);

Wow! It's calling the operator << funtion under namespace 'std_c' !! 

How did that happen ?
  1. First, compiler does an unqualified lookup for function 'operator <<'. The unqualified lookup gives the algorithm with the member function of ostream class or cout object.
  2. The function given out by unqualified lookup is rejected as it cannot take the arguments we are passing due to type mismatch.
  3. Now, ADL kicks in.
  4. It checks the arguments passed to 'operator <<'.  They belong to namespaces 'std' and 'std_c' resp.
  5. Compiler finds a 'operator <<' function under namespace std. But, that also gets rejected due to type mismatch. How could it, Ex is a user defined class.
  6. Now, compiler will look under the namespace 'std_c' since that's what our namespace is for the second argument.
  7. This definition of 'operator <<' fits the bill perfectly. So, this function is chosen.
  8. If any access specifier checks are applicable, then its done.

General (read 'Golden') Rules for ADL:

1. ADL is only applicable for unqualified function names.
2. ADL is not applicable if argument types are built-in types.
3. ADL kicks in only when unqualified lookup fails OR a template function or member function is found. [The second statement is purely from my observation. Please correct me if I am wrong here]
4. (From cppreference) If the lookup set produced by usual unqualified lookup contains any 
     of the following:
a) a declaration of a class member
b) a declaration of a function at block scope (that's not a using-declaration)
c) any declaration that is not a function or a function template (e.g. a function object or another variable whose name conflicts with the name of the function that's being looked up)
     Then the argument-dependent lookup is not considered.
5. If both unqualified lookup and ADL returns a set of applicable definitions, then overload resolution is applied to find out the best matching declaration.
6. Designing code with over dependence on ADL will hit you with force(Either physically or mentally).
7. For all other details look here.

Examples:
In this section, I will present some more examples to demonstrate ADL and some other cool tricks.
Compiler used: g++ 4.8.2 with C++11 flag enabled


Example 1:
Lets see an overly complicated example of comparing values of two objects:


template <typename T>
const T& max(const T& a, const T& b) {
    return a < b ? b : a;
}

namespace Ctx {
    class Object {
    public:
        Object(int val): val_(val) {}
        bool operator < (const Object& b) {
            std::cout << "Insidious called" << std::endl;
            return this->val_ < b.val_;
        }
    public:
        int val_ = 0;
    };

    template<typename T>
    bool operator < (const T& a, const T& b) {
        std::cout << "operator overload called" << std::endl;
        return a.val_ < b.val_;
    }
};

using namespace Ctx;

int main() {
    Object a(10), b(11);
    const Object& res = max(a, b);
    return 0;
}

Analysis:
What the code does:
1. It has a function-template called max, which receives 2 arguments of same type and compares it straightway like integers.
2. To make the max function work with objects, we have to define operator < function for those classes.
3. Just for the sake of demonstration, I have created two operator < functions, one of them as a member function and the second one inside the namespace.
4. When the max function is called, based upon the lookup, the most eligible operator < function will get called.

So, when you run this program, you expect to see "Insidious called" in the output. And that is what should happen based upon what we have seen till now.

But, I am always getting "operator overload called" as the output ! Why would that be ? It should be an easy-peasy job for the unqualified lookup, no ? Well, this is what I thought and it took me some time to find out the cause.

Inside the max function, operator < is called on an const object, so I have to make the member function const as well.

Change the line as below for the member function and you will see the expected result:
bool operator < (const Object& b) const {

Now, run it again and you should see the output as "Insidious called".

So, we saw how ADL saved the day for us in the first case (or did it just save the day and made future of the code uncertain ? ).


Example 2:
Now, let's see one more similar but bit more complicated example. The lookup flow would be similar to what we saw in example 1, but in this case we will get to see overload resolution kick in as well.


namespace test {

    template<typename T>
    struct Any {
        T val_;
    };

    class Ops {
      public:
        Ops():val_(0){}
        template <typename T>
        Ops& operator<< (const T& val) {
            std::cout << "Called member function" << std::endl;
            this->val_ = val;
            return *this;
        }
      private:
        int val_;
    };

    template <template<typename> class E,  typename T>
    Ops& operator<< (Ops& op, const E<T>& val) {
        std::cout << "Global function" << std::endl;
        return op;
    }
}

int main() {
    test::Ops op;
    int k = 9;
    test::Any<int> a;

    op << a;

    return 0;
}

Analysis:
Pretty much similar to what we have seen till now, only in this case, since the member function is templated, both unqualified lookup and ADL returns a set of definitions.

Unqualified lookup gives out the member function.
ADL gives out the second function.

But, it turns out that, the outside function provided by ADL better matches with the arguments passed, hence the output of the code is "Global function".

I had asked this question sometime back in Stack Overflow, you can see the answer here.



Example 3:
Lets see an example on how ADL works with friend functions.


class Ex {
public:
    friend void foo(const Ex& obj) {
        std::cout << "foo() called" << std::endl;
    }
};

int main() {
    foo(Ex());
    // (foo)(Ex()); // This will give error: "‘foo’ was not declared in this scope"
}

So, it appears that even friend functions are resolved correctly using ADL.
As per the standard:

friend declarations (11.3) may introduce a (possibly not visible) name into the enclosing namespace.
That means, in our example, function foo is visible inside the scope of class (class is also a namespace), thus unqualified look-up will fail to find foo, but ADL can find it since the argument of the function is the class in which we have defined our friend function.



Example 4:
Now, for the cool trick that I was talking about. It appears that there is a way to not to do ADL! One just has to write the function name in round brackets while calling it.

Here is an example:
class GlobalScope {
};

namespace Inner {
    class InnerScope: public GlobalScope { // GlobalScope Identified by unqualified lookup
    };

    void test_func(const InnerScope& param) {
        std::cout << "Inner scope test function" << std::endl;
        return;
    }
};

void test_func(const GlobalScope& param) {
    std::cout << "Global scope test function" << std::endl;
    return;
}

int main() {
    Inner::InnerScope obj1;
    test_func(obj1);  // Should call the one inside Inner due to ADL,
                      // since that is perfect match compared to the one
                      // in global scope


    (test_func)(obj1); // Disables ADL

    GlobalScope obj3;
    test_func(obj3);

    return 0;
}


Try it out for yourselves!

With this, I hope ADL would be pretty much clear. Remember the general rules and you should be just fine.

In the next part, which would be the final one as per the title, we will discuss about template argument deduction.

(To be continued....)