Saturday, 3 January 2015

C++ Templates: 'Name Lookup', 'ADL' and 'Argument Deduction' (Part 2)

This post as the name suggests is a follow up of my previous post . I would recommend to go through it once before we start digging into 'Argument Dependent Lookup' (ADL from here on) or 'Koenig Lookup'.

Let's start with a surprise story from Andrew Koenig himself which I had come across in  Dr. Doob's journal:
One of the comments on my article last week noted that argument-dependent lookup in C++ is often called "Koenig lookup". I didn't invent it, but unfortunately, I don't know who did, so I don't know where the credit -- or blame -- is really due.

Back to Hello, World! 
No, not "hello world" again! But wait, this has a lot to do with ADL! ADL is what that makes your C++ hello world program so simple! Just stay with me.

Here we go:
#include <iostream>

int main() {
    std::cout << "Hello, World!" << std::endl;
    return 0;
}

This simple program has a lot of C++ intricacies abstracted, in the end making the program extremely simple to comprehend. 

So, let's dissect and understand what is going on behind the scenes for this little innocuous piece of code. For the starters, how many function arguments and function calls do you see happening/appears will happen in line 4 ?
  1. If you answered "What! There are no function calls happening here", welcome to C++, enjoy this article and be ready to get shocked.
  2. Did you say "Two function calls and Three arguments" ?. Close enough, but not quiet there.
  3. But, if you answered "Three function calls and Three arguments", you may or may not be right. 
The answer can be either option 2 or option 3. It really depends upon the argument type.

What is std::cout ?

'std::cout' is an object of 'ostream' class. These objects can write sequences of characters and represent other kinds of data.

From Standard 27.4.1 [iostream.objects]
#include <ios>
#include <streambuf>
#include <istream>
#include <ostream>

namespace std {
  extern istream cin;
  extern ostream cout;
  extern ostream cerr;
  extern ostream clog;

  extern wistream wcin;
  extern wostream wcout;
  extern wostream wcerr;
  extern wostream wclog;
}

What is '<<' ?

It is an overloaded function for operator <<. For our case this is a member function of class ostream and can also be a regular function outside of class ostream but inside the namespace 'std' in which case it accepts 'ostream' class as its argument.
We will get more into it shortly.

What is std::endl ?

'std::endl' is a function template declared as:
template <class charT, class traits>
basic_ostream<charT,traits>& endl(basic_ostream<charT,traits>& os);
This is the third function which gets called after its passed as a parameter in the second call to the function operator <<.

The Real "Hello World"

Brace yourselves, waiting for you is an ugly code.

#include <iostream>

int main() {
    std::cout.operator <<("Hello, World!").operator <<(std::endl);
    return 0;
}
[ We have seen worse, haven't we? :) ]

"But, but there are just 2 function calls and 2 arguments !?". No, there are 3 function calls (do not forget 'std::endl' which is a function and gets called later) and 2 arguments for this case.

Consider another example:
#include <iostream>

namespace std_c {

    // Forward declare
    class Ex;

    std::ostream& operator << (std::ostream&, const Ex&);

    class Ex {
    public:
        Ex(int val = 0): val_(val) {}
        friend std::ostream& operator << (std::ostream& out, const Ex& obj);
    private:
        int val_;
    };

    std::ostream& operator << (std::ostream& os, const Ex& obj) {
        os << obj.val_;
        return os;
    }
};

int main() {
    std_c::Ex tmp(42);
    std::cout << tmp << std::endl;

    return 0;
}

How do you think this is working ?
Questions :
  1. For sure ostream class or cout object does not have a operator << member function that can understand how to print our custom class 'Ex'. So, its certain that a code similar to what we have shown for 'Hello world' won't work here. So how does it print it?
  2. Inside the 'main' routine, we have not mentioned any scope for function '<<', it is unqualified name. How did it end up calling the one inside namespace std_c 
Let's dive into ADL to find answers for the above questions.

Welcome to the world of ADL!

ADL stands for 'Argument dependent lookup'. As the name suggests, it is a type of lookup based upon the type of arguments. Hence, ADL is only applicable for function name lookup. To be more precise, ADL is only applicable for lookup of unqualified function names based upon the arguments of the function.
Though it's still bit incomplete, but we will get there.

Where does the lookup happen ?

From cppreference:
The lookup for the unqualified function names are done in the namespaces of their arguments in addition to the scopes and namespaces considered by the usual unqualified lookup.
Does that strike something ? It says, the compiler also looks up to find a suitable function in the namespace of the arguments of the function !

For our last example, this is how the code looks like:
std_c::operator << (std::cout, std_c::Ex).operator << (std::endl);

Wow! It's calling the operator << funtion under namespace 'std_c' !! 

How did that happen ?
  1. First, compiler does an unqualified lookup for function 'operator <<'. The unqualified lookup gives the algorithm with the member function of ostream class or cout object.
  2. The function given out by unqualified lookup is rejected as it cannot take the arguments we are passing due to type mismatch.
  3. Now, ADL kicks in.
  4. It checks the arguments passed to 'operator <<'.  They belong to namespaces 'std' and 'std_c' resp.
  5. Compiler finds a 'operator <<' function under namespace std. But, that also gets rejected due to type mismatch. How could it, Ex is a user defined class.
  6. Now, compiler will look under the namespace 'std_c' since that's what our namespace is for the second argument.
  7. This definition of 'operator <<' fits the bill perfectly. So, this function is chosen.
  8. If any access specifier checks are applicable, then its done.

General (read 'Golden') Rules for ADL:

1. ADL is only applicable for unqualified function names.
2. ADL is not applicable if argument types are built-in types.
3. ADL kicks in only when unqualified lookup fails OR a template function or member function is found. [The second statement is purely from my observation. Please correct me if I am wrong here]
4. (From cppreference) If the lookup set produced by usual unqualified lookup contains any 
     of the following:
a) a declaration of a class member
b) a declaration of a function at block scope (that's not a using-declaration)
c) any declaration that is not a function or a function template (e.g. a function object or another variable whose name conflicts with the name of the function that's being looked up)
     Then the argument-dependent lookup is not considered.
5. If both unqualified lookup and ADL returns a set of applicable definitions, then overload resolution is applied to find out the best matching declaration.
6. Designing code with over dependence on ADL will hit you with force(Either physically or mentally).
7. For all other details look here.

Examples:
In this section, I will present some more examples to demonstrate ADL and some other cool tricks.
Compiler used: g++ 4.8.2 with C++11 flag enabled


Example 1:
Lets see an overly complicated example of comparing values of two objects:


template <typename T>
const T& max(const T& a, const T& b) {
    return a < b ? b : a;
}

namespace Ctx {
    class Object {
    public:
        Object(int val): val_(val) {}
        bool operator < (const Object& b) {
            std::cout << "Insidious called" << std::endl;
            return this->val_ < b.val_;
        }
    public:
        int val_ = 0;
    };

    template<typename T>
    bool operator < (const T& a, const T& b) {
        std::cout << "operator overload called" << std::endl;
        return a.val_ < b.val_;
    }
};

using namespace Ctx;

int main() {
    Object a(10), b(11);
    const Object& res = max(a, b);
    return 0;
}

Analysis:
What the code does:
1. It has a function-template called max, which receives 2 arguments of same type and compares it straightway like integers.
2. To make the max function work with objects, we have to define operator < function for those classes.
3. Just for the sake of demonstration, I have created two operator < functions, one of them as a member function and the second one inside the namespace.
4. When the max function is called, based upon the lookup, the most eligible operator < function will get called.

So, when you run this program, you expect to see "Insidious called" in the output. And that is what should happen based upon what we have seen till now.

But, I am always getting "operator overload called" as the output ! Why would that be ? It should be an easy-peasy job for the unqualified lookup, no ? Well, this is what I thought and it took me some time to find out the cause.

Inside the max function, operator < is called on an const object, so I have to make the member function const as well.

Change the line as below for the member function and you will see the expected result:
bool operator < (const Object& b) const {

Now, run it again and you should see the output as "Insidious called".

So, we saw how ADL saved the day for us in the first case (or did it just save the day and made future of the code uncertain ? ).


Example 2:
Now, let's see one more similar but bit more complicated example. The lookup flow would be similar to what we saw in example 1, but in this case we will get to see overload resolution kick in as well.


namespace test {

    template<typename T>
    struct Any {
        T val_;
    };

    class Ops {
      public:
        Ops():val_(0){}
        template <typename T>
        Ops& operator<< (const T& val) {
            std::cout << "Called member function" << std::endl;
            this->val_ = val;
            return *this;
        }
      private:
        int val_;
    };

    template <template<typename> class E,  typename T>
    Ops& operator<< (Ops& op, const E<T>& val) {
        std::cout << "Global function" << std::endl;
        return op;
    }
}

int main() {
    test::Ops op;
    int k = 9;
    test::Any<int> a;

    op << a;

    return 0;
}

Analysis:
Pretty much similar to what we have seen till now, only in this case, since the member function is templated, both unqualified lookup and ADL returns a set of definitions.

Unqualified lookup gives out the member function.
ADL gives out the second function.

But, it turns out that, the outside function provided by ADL better matches with the arguments passed, hence the output of the code is "Global function".

I had asked this question sometime back in Stack Overflow, you can see the answer here.



Example 3:
Lets see an example on how ADL works with friend functions.


class Ex {
public:
    friend void foo(const Ex& obj) {
        std::cout << "foo() called" << std::endl;
    }
};

int main() {
    foo(Ex());
    // (foo)(Ex()); // This will give error: "‘foo’ was not declared in this scope"
}

So, it appears that even friend functions are resolved correctly using ADL.
As per the standard:

friend declarations (11.3) may introduce a (possibly not visible) name into the enclosing namespace.
That means, in our example, function foo is visible inside the scope of class (class is also a namespace), thus unqualified look-up will fail to find foo, but ADL can find it since the argument of the function is the class in which we have defined our friend function.



Example 4:
Now, for the cool trick that I was talking about. It appears that there is a way to not to do ADL! One just has to write the function name in round brackets while calling it.

Here is an example:
class GlobalScope {
};

namespace Inner {
    class InnerScope: public GlobalScope { // GlobalScope Identified by unqualified lookup
    };

    void test_func(const InnerScope& param) {
        std::cout << "Inner scope test function" << std::endl;
        return;
    }
};

void test_func(const GlobalScope& param) {
    std::cout << "Global scope test function" << std::endl;
    return;
}

int main() {
    Inner::InnerScope obj1;
    test_func(obj1);  // Should call the one inside Inner due to ADL,
                      // since that is perfect match compared to the one
                      // in global scope


    (test_func)(obj1); // Disables ADL

    GlobalScope obj3;
    test_func(obj3);

    return 0;
}


Try it out for yourselves!

With this, I hope ADL would be pretty much clear. Remember the general rules and you should be just fine.

In the next part, which would be the final one as per the title, we will discuss about template argument deduction.

(To be continued....)

2 comments:

  1. Updated with an example involving friend functions.

    ReplyDelete
  2. Final part of the series is out now. http://templated-thoughts.blogspot.in/2015/01/c-templates-name-lookup-adl-and_13.html

    ReplyDelete