Sunday, 20 September 2015

Slicing in C++

For the last few days, I have been playing around with Go language. I was basically doing the algorithms part so that I could in parallel write a concise yet beautiful code in its c++11 version as well (but you know thats quite ironic..). So, it was pretty much useless thing, but I wanted to do it anyways just for the fun part. That means, don't blow me away if you don't see me using any standard library functions for achieving something which I am intending to do.


Slices in Go

If you have not yet crossed your path with Go (I suggest you do), slices are basically your 'std::vector' counterpart in Go. Now there is a huge difference technically, but it is still a data structure which you would choose in Go if you are to store elements in memory contiguously.

Slices in Go abstracts away the underlying representation of the container actually storing the elements. But, it is usually an array, because thats what you want to store elements in contiguous memory location.

You can get down to its basics by reading through the below link:

Now what can you do with this slices ? Well, lets see this below example:

func edit_slice(elems []int) {
    for idx, _ := range elems {
        elems[idx] += 3
    }
}

func main() {
    elems := []int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
    edit_slice(elems[len(elems)/2:])
}

This is a pretty simple code which does nothing but increment each element from the last half by 3. It need not have been the last half, it could have been any 'range' (There is a reason why I have marked it as bold. Will come to that soon).

Now, lets see how someone would do that in c++11:
#include <vector>
#include <algorithm>

void edit_elems_1(std::vector<int>& vec,
                  std::vector<int>::iterator it1,
                  std::vector<int>::iterator it2)
{
    for (; it1 != it2; it1++) {
        *it1 += 3;
    }
}

void edit_elems_2(std::vector<int>& vec)
{
    std::for_each(vec.begin() + vec.size()/2, vec.end(),
                  [&](int& elem) { elem += 3; } );
}


int main() {
    std::vector<int> vec{1,2,3,4,5,6,7,8,9,10};
    edit_elems_1(vec, vec.begin() + vec.size()/2, vec.end());
    edit_elems_2(vec);
    return 0;
}

This time 'std::for_each' came to our rescue, otherwise it looks terrible ('edit_elems_1'). Even the 'std::for_each' is a bit harsh on the eyes compared to Go version.

Ranges in C++


But, this is not just me who is complaining about the inflexibility of 'iterators' in C++. Introducing the 'Range' concept is becoming more and more vocal in c++ world.

I believe it all started with the man 'Andrei Alexandrescu'. You can read through his 15 pages article to understand what he needs to convey:
http://www.informit.com/articles/article.aspx?p=1407357

Andrei implemented this concept in 'D' language. But our c++ experts wont accept defeat that easily, do they? We have got 'boost ranges' in answer to that. There is a series of posts by the author itself ('Eric Neibler') at his own weblog ('http://ericniebler.com/category/ranges/').

The implementation for the ranges in c++ seems to be very complex for me to comprehend right now. But there obviously has to be an easier way to implement what we can do in Go (or even python for that matter, but python is getting complex..huh, I will leave that for another talk).

So, what I have here is a POC for making use of slices in C++. There are many limitations in my design currently:
1) Not at all production ready. Its just a POC.
2) Some design issues. Adding new containers is bit contrived.
3) Many functionalities available in Go not yet supported.
4) Proper exception handling. With little more work it can be improved.
5) I am still not happy with the overall design. Did things in a hurry.

That said, I might not be continuing with this effort unless I really feel an urge to finish it off. So, If you guys think it would be cool or you think you have a much better way to do it, then surely let me know in the comments section.

Slices in C++

Lets just take a dive and see how an implementation of a naive binary search would look in Go and in C++ with my slice implementation.

Go binary search:
func binary_search_rec(arr []int, find int) bool {
    mid := len(arr) / 2

    if arr[mid] == find {
        return true
    }

    if mid == 0 {
        return false
    }

    if find > arr[mid] {
        return binary_search_rec(arr[mid+1:len(arr)], find)
    } else {
        return binary_search_rec(arr[0:mid], find)
    }
}

C++ binary search - with slices:
#include "slice.hpp" // My slice header

template <typename C>
bool binary_search(slice<C> slice, typename C::value_type val)
{
    int mid = len(slice)/2 ; 
                             
    if (slice[mid] == val) return true;
    if (mid == 0) return false;

    if (val > slice[mid]) return binary_search(slice(mid+1, end), val);
    else return binary_search(slice(beg,mid), val);
}

I am sure you can just take a guess and how it would look like without slices. So does this look anything close to 'cool' to you ? This technique can be used at all places where we want to operate on only a part of the data set at a time. Eg: QuickSort etc.

Also, note that the C++ version above just works well with a std::vector or std::string or std::array (Not testes, but should work), but thats not the gist of this comparison, its the terseness.

Besides these trivial examples, there are lots of cases where we need to have just the view of the container subset. For eg, while implementing string algorithms like pattern matching, you will always need to have a view of the substring of the text or whatever. The current available option is to call 'substr' method of string which will return you a copy of the string, now thats expensive. Otherwise, pass the start and end iterator which can be error prone.


Under the hood

The slice class adds a very thin wrapper around the container on which you want to create the slice. One thing it does currently is to derive from the base class i.e if you are creating a slice for a vector, then the slice<C> class is derived from vector. Here template parameter 'C' is the container itself.

Below example shows how you create a slice:

    std::vector<int> vec{1,2,3,4,5,6,7,8,9,10};

    // all elements
    auto vslice = make_slice(vec, 0, vec.size());
    assert(len(vslice) == 10);

    vslice = vslice(beg, 8); // slice of first 8 elements


1. 'make_slice' is a template function which creates a new slice for you.
2. 'operator()' is overloaded to provide the slicing functionality.
3. Two or more slices created by copy construction or by assignment shares the same container underneath since the slice just holds the reference to the container.
4. slice does not own the container. In the above case vector 'vec' is controlled by the user. If vec gets destroyed, no slice must be used after that.
5. 'len' is a global function for getting the length of the slice. Same as 'len' function in Go.

Source Code Location : https://github.com/arun11299/Slices-in-Cpp

The source is currently in a .cc file instead of a header (I am just too lazy sometimes). You can find few more examples on using slices in it.

Compilation : It should work just fine if you have enabled C++11. I have tested it on both Clang and g++ 4.8.


Sunday, 19 April 2015

Moving forward....

In the previous post, I tried my best explaining about 'move' semantics in C++11. In this post I am planning to cover up remaining things, namely:
1) Forwarding References
2) Reference Collapsing Rules
3) Forwarding Constructors
4) Type Limiting Forwarding Constructors
And as many examples as possible ! Also, all examples were compiled with g++ 4.8.2 and run on Ubuntu 14.02.

Since this post is a follow up on my previous post (the subject line does not indicate that), I would recommend to read through that before jumping here.

Link to previous post : Who moved my object ? No, not std::move!

Let's get started!!

Forwarding References

Now, since we are pretty much comfortable with R-Value references, lets go through some very simple examples and guess (it wouldn't be a guess if you know :) ) what would happen.

class Elf {
};

void take_elf(Elf&& elf) {  //..... (1)
    // do elf stuffs
}

template <typename T>
void templated_elf(T&& param) { // .. (6)
    // do elf stuffs 
}

int main() {
    Elf&& legolas = Elf();    // .... (2)  
    
    Elf&& dobby = legolas;    // .... (3)
    
    Elf rl_dobby = legolas;   // .... (4)

    take_elf(rl_dobby);     // .... (5)

    templated_elf(rl_dobby);// .... (7) 

    return 0;
}

Lets go through the code as per the numbering and find out issues as we go by them. Everything is being done with/on class Elf.

1. It is a function taking an r-value reference of type class Elf. Looks good.

2. Variable 'legolas' is an r-value reference and we are assigning it with an unnamed object or a temporary object of type class Elf. This is what the purpose of r-value reference is, i.e. binding with temporaries. So, all looks fine here.

[Important]
3. Here too, variable 'dobby' is an r-value reference, but in this case we are assigning it with 'legolas' instead of a temporary Elf object. The difference is that, a temporary variable is not an r-value reference . So, once a temporary is bound to an r-value reference for eg: 'legolas', it behaves just like an l-value from there on. That means, 'legolas' is an r-value reference at its point of definition at (2), but from there on, it behaves just like an l-value.
Thus, this statement would result in an compilation error (type mismatch), since we are trying to assign an l-value ('legolas') to an r-value reference which is 'dobby'.

4. If (3) is clear, this statement should appear as a regular assignment statement and there is nothing more to that. We are assigning 'legolas' to 'rl_dobby'. Since both are l-values here, there is no compilation error.

5. This is similar to (3). We are trying to call function 'take_elf' which expects an r-value reference as parameter with an l-value 'rl_dobby'. Same reasons mentioned for (3) holds true here as well.
So, this will result in compilation error (type mismatch). You would see the same error, if the call parameter is replaced to 'legolas'.

6. Welcome to 'Forwarding References'. You might now ask, what is the big difference ? It looks exactly like 'take_elf' except the fact that it is templated. Lets answer this in the next section.


Welcome to Forwarding References


So, your question was "what's the big difference? It's just templated", wasn't it ? Well, what makes it special is the template itself.

Had it been a regular template function, i.e. 'T' or 'T&' or 'T*' instead of 'T&&', it would have behaved just like its explicit type counterpart (mostly), except for the fact that the template version accepts all data types (the signature part).

But a template r-value reference or a forwarding reference (T&&) does not exhibit the same behavior as that of its explicit type counterpart for eg: 'take_elf'. The difference is that 'T&&' behaves like an r-value reference i.e binds with rvalues/ temporaries but they can also behave like an l-value !
This duality allows them to bind with both r-values and l-values.

Thus, statement (7) in the code is valid and does not result in compilation error like it did for (3). It is because of the dual nature of forwarding reference (T&&).

Besides this, they can also bind to const, volatile and both const-volatile objects making them 'greedy functions'. It is so greedy that we have a separate section on it.

We have already seen one such example of forwarding reference in my previous post, where a vector was being passed to a function 'do_work_on_thread' which used a forwarding reference. (Well, that example was slightly incorrect. There is an update beneath that section).

How to write a forwarding reference ?
Quoting from Scott Meyer's 'Effective Modern C++':
"""
For a reference to be universal, type deduction is necessary, but it's not sufficient. The form of the reference declaration must also be correct, and that form is quite constrained. It must be precisely "T&&" 
""" 
NOTE: Here, universal reference means forwarding reference. Initially, it was Scott Meyers who identified the need to have a different terminology for such references and he named it as 'universal' reference. Later, the standard committie standardized it as 'forwarding reference'.

Based upon the above, something like below is not a forwarding reference:

template <typename T>
void func(std::vector<T>&& arg) {...}

But this is a forwarding reference:
auto&& var = Test();


Reference Collapsing Rules

There are basically four rules associated with reference collapsing and is only applicable in case of forwarding references. These are a set of deduction rules which must be remembered while working with forwarding references.










The easiest way to remember is that, only when there are four '&', it collapses into an r-value reference, for all other cases it collapses into an l-value reference. :)

Below example depicts the first two rules:

#include <iostream>

template <typename T>
void func(T& val) {
    val += 10; // For the sake of simplicity,
               // hoping its always int :D
}

int main() {
    int a = 10; 
    int& ans = a;
    func(a);
    std::cout << a << " " << ans << std::endl; // prints 20 20

    int&& ans2 = std::move(a);
    func(a);
    std::cout << ans2 << std::endl; // prints 30

    return 0;
}


Below example depicts the remaining 2 rules which are more important.

#include <iostream>

template <typename T>
void func(T&& param) {
    param += 10; // Assuming its int
}

int main() {
    int a = 10; 

    func(a);
    std::cout << a << std::endl; // prints 20, Rule no 3

    func(std::move(a)) ; // Rule no 4
    return 0;
}


As you would have got it by now, it's because of these rules when an l-value is passed to a forwarding reference, it accepts it as a reference.

Now, you could ask, why these rules are required at all ? These rules are required for 'perfect forwarding' to work, which we will see in the next section.

Perfect Forwarding


Why is such a thing there in the first place ? Well, it is because of the dual nature of forwarding references which we had seen earlier. Due to that, basically two different types converged to become an l-value reference inside the function body. After that, there is no way (without std::forward and some template magic) to determine what was the exact type passed into the function.

Consider below example:
void do_work(??) { // What type of argument ?
}


template <typename T>
void wrapper(T&& args) {
    // do some work on argss
    do_work(args);
}

So, here you are. Inside function 'wrapper' when you call 'do_work', you want to copy 'args' if it was passed as l-value or pass it as r-value if it was passed as r-value. Without 'std::forward', you could have done some template magic (check previous blog post for the source) to get the same effect, but that is error prone and I am not even sure if it would be a practical solution.
Let's see how std::forward solves this issue.

std::forward

Like std::move, std::forward is also a function template that does nothing but casting. But unlike std::move which does an unconditional cast, std::forward does a conditional cast.

This is how the implementation of std::forward looks like in g++ 4.8.2 (Made it visually appealing).

template <typename T>
T&&  forward(typename std::remove_reference<T>::type& t) noexcept
{ 
    return static_cast<T&&>(t); 
}

template <typename T>
T&&  forward(typename std::remove_reference<T>::type&& t) noexcept
{   
      return static_cast<T&&>(t);
}

Basically, the first function is enough for std::forward to work and from my experimentation also, the second overload never got called. So not sure why the second overload is there. In our examples, we will consider std::forward as the first function only.

Now, lets get back to our previous example and try to solve it by using std::forward.

class Test {
};

void do_work(const Test& t) { 
    Test a(t);
    //.....
}

void do_work(Test&& t) {
    // .......
}


template <typename T>
void wrapper(T&& args) {
    // do some work on argss
    do_work(std::forward<T>(args));
}

We solved the problem by creating 2 overloads of the 'do_work' function. First one accepting the argument by reference and the second one accepting the argument as r-value reference.
It's the job of the std::forward to cast it to the correct type and thereby call the correct overload.

Let's see how that is done. Consider the case of an l-value first:
Test t;
wrapper(t);

// Inside wrapper std::forward is called with template
// type as T.
// Since an l-value was passed, due to reference collapsing rules
// std::forward$lt;T> would essentially be std::forward<Test>

//So std::forward would be instantiated like

Test& &&  forward(typename std::remove_reference<Test&>::type& t) noexcept
{ return static_cast<Test& &&>(t); }

// Applying reference collapsing rules on it
Test& forward(Test& t) noexcept
{ return static_cast<Test&>(t); }

So, as you see for the above case, 'std::forward' casts it to an l-value reference and thereby calling the first 'do_work' overload.

Lets see the same for an r-value reference:
Test t;
wrapper(std::move(t));

// Inside wrapper, since an r-value was passed, due to reference collapsing rules
// std::forward<T> would essentially be std::forward<Test&&>

// So std::forward would be instantiated like this
Test&& && forward(Test&& && t) noexcept {
    return static_cast<T&& &&>(t);
}

//Applying reference collapsing rules
Test&& forward(Test&& t) noexcept {
    return static_cast<T&&>(t);
}

As you can see, this time it correctly cast it to an r-value reference, thereby making it call the second overload of the 'do_work' function.
This is the reason why std::forward is called conditional cast.


With that we finish with the basics of forwarding references and std::forward. The thing to remember is that, move is called with r-value references and forward is called with forwarding references.

Where to go from here ?

Back to your life :)
Take a break. Once you are ready to face fresh challenges, go through below resources:
1) http://thbecker.net/articles/rvalue_references/section_01.html
2) Scott Meyers "Effective Modern C++". Some examples in this post are influenced by his explanation.
3) What we have covered are the basics. There are more to move semantics based on its usage. For eg: Move constructor and its fallacies.
4) Know about move only types like std::unique_ptr, std::thread etc and how to work with them.



Sunday, 12 April 2015

Who moved my object ? No, not std::move!

It's been quite a long time since I really wanted something to blog about. So, I thought why not write up something on the new features introduced in C++11 which I found difficult to grasp or come in terms with initially.
It was not the lambda expression or the auto type deduction or the new c++11 threading memory model, but something much easier (Yes, easy because I understand it now ;) ), the move semantics and everything related to it.
The intention of this post is to ease your way into understanding move semantics and its 'not- so-obvious' fallacies.


Temporaries reloaded!


Anyone who has been working with C++ for sometime knows what I want to say here. And you would be right, there is nothing new with the temporaries as such in the new standard. What has been added is a way by which creation of temporaries does not add overhead to the run-time performance of your c++ code (and I know you want it as fast as possible, don't you?)
Let us consider an example code:

void my_func(const std::string& name) {
    names_vec.push_back(name);
}

std::string name("Galadriel");
my_func(name);                    // Passing lvalue
my_func(std::string("Gandalf"));  // Passing temporary string object
my_func("Goku");                  // Passing string literal

This looks like a 'good enough' piece of code. Passing temporaries like this will work since its lifetime would be extended since it gets bound to the const object.
And when doing 'push_back' inside the function, 'name' will get copied for the first two cases, whereas for the third case (string literal), a temporary string object needs to be created before copying it inside the 'names_vec' container.
Huh! Was that the best we could do ? Yeah, kind of in C++98/C++03. Hell no! for C++11. We can do much better in C++11, thanks to move semantics.

Welcome to move semantics


With C++11, a new type for temporary variables was born or standardized. This is called as 'R-Value Reference' , represented as 'T&&' (where T is any built-in or user defined type).
These R-value reference types gets bound to the temporaries or to other R-value references.

void my_func(std::string&& name) {
    names_vec.push_back(name);
}

my_func(std::string("Vegeta"));
std::string lval("lvalue");
my_func(lval);                // Compile error! my_func cannot accept lvalue

You see, how the temporary string object got accepted by our new function signature but got rejected at compile time for the lvalue! What I have shown you here is one way by which R-value reference type accepts parameters i.e as temporaries.

Now, the question is, how would I make 'my_func' accept my lvalue? Here is where 'std::move' comes into picture.

What does std::move do ? 



A hell lot of stuff you might be thinking, but no! All it does is a casting operation! See it for yourselves.
Below is the implementation of std::move in g++ 4.8.2 (Cleaned up a bit for easier understanding):

  template <typename T>
  typename std::remove_reference<T>::type&&
    move(T&& t) 
    { return static_cast<typename std::remove_reference<T>::type&&>(t); }

NOTE: 'std::remove_reference' is a type trait which as the name suggests provides the actual type passed in but without the reference. This is required because of the 'reference-collapsing' rule which we would be discussing later, but for the current context, it can be ignored.

If you ignore the 'remove_reference' type trait, the function template boils down to an unconditional cast to an R-value reference! That means, 'std::move' does not 'move' anything, to move or not is actually decided by the compiler. Doing an 'std::move' is just an request to the compiler to do move optimization.

So, I repeat the question, what would do to make lvalue to be accepted by 'my_func' ? Yes, you move the lvalue which would cast it to an r-value.

Where is the optimization ?



This is the meaty part of the discussion. In simple terms, what move means is to point to a different memory location where the required object is placed or allocated. So, instead of copying the object to a new memory location, just point it to its current location. This is obviously a lot more efficient than copying!
This can be shown easily by the below diagram:



As you can see from the diagram, moving vector 'vl' does not require copying the entire vector. 
But, the same is the case if we had passed it by reference or const reference, what is so special here ? To answer this question, lets look at another slightly bigger example (Read comments in code to understand):

#include <iostream>
#include <utility>
#include <vector>
#include <algorithm>

template<typename T>
void do_work_on_thread(T&& ip_vec) {
    std::for_each(ip_vec.begin(), ip_vec.end(), 
                    [](int v) { std::cout << v << std::endl; } );
    return;
}

void accept_vec(std::vector<int> vec) {       // First complete copy
    std::cout << "Accept vec non optimized" << std::endl;
    // do some work on the input vector
    // .....

    // Asynchronously work on the vector by feeding 
    // it to a thread.
    do_work_on_thread(vec);                   // Second complete copy
    return;
}

void accept_vec2(std::vector<int>&& vec) {    // No copy, vector just moved
    std::cout << "Accept vec optimized" << std::endl;
    // do some work on the input vector
    // .....

    // Asynchronously work on the vector by feeding
    // it to a thread.
    do_work_on_thread(std::move(vec));        // No copy, vector just moved
    return;
}

int main() {
    std::vector<int> v = {1,2,3,4,5,6,7,8,9};
    accept_vec(v);              // Old style
    accept_vec2(std::move(v));  // New move optimized style
    return 0;
}

Being a bit more smart, one copy can be avoided in the 'old style', but still one copy is required for the life time guarantees of the vector. Whereas with move, no copy is required at all!

UPDATE:
In the comments section, Pavel raised a question on where the second copy is happening in case of 'accept_vec' function. And he is right on target, there is no second copy happening there. The vector gets passed as a reference (Because of reference collapsing rules which we will probably discuss in next post).
So, the problem has become just more sever for me. For the 'accept_vec' case, I need to pass the vector by value to 'do_work_on_thread' since the vector is going to be handled asynchronously and I need to guarantee its life time and also at the same time, it should not pass by value if I am moving it.

Well, it seems like it can be achieved with some template magic.Here it goes:

#include <iostream>
#include <utility>
#include <type_traits>

class Test {
public:
    Test() {
        std::cout << "Cons" << std::endl;
    }   
    Test(const Test& other) {
        std::cout << "Copy Cons" << std::endl;
    }   
    Test(Test&& other) {
        std::cout << "Move cons" << std::endl;
    }   
    void operator= (const Test& other) {
        std::cout << "Assignment op" << std::endl;
    }   
    void operator= (Test&& other) {
        std::cout << "Move assignment op" << std::endl;
    }
};

template <typename T, bool U>
struct detail {
        // Constructor copies it to arg1
        detail(const T& param): arg1(param) { 
                std::cout << "Generic detail" << std::endl; 
        }
        typename std::remove_reference<T>::type arg1;
}; 

// Template specialization for detail when T is not a reference
// Here I am just assuming it to be an r-value reference, whereas
// in real scenario one should handle for other types as well.
template <typename T>
struct detail<T, false>  {
        // Constructor just moves the param to arg1
        detail(T& param): arg1(std::move(param)) { 
                std::cout << "Specialized detail" << std::endl; 
        }
        typename std::remove_reference<T>::type arg1;
};

/* Assume that this function
* will process 'arg' asynchronously.
* That means, fwd_ref must own the arg and its
* life time must not be controlled by any other frame
*/
template <typename T>
void fwd_ref(T&& arg) {
    std::cout << "Called fwd_ref" << std::endl;
    std::cout << std::is_reference<T>::value << std::endl;
    detail<T, std::is_reference<T>::value> d(arg);
    return;
}

int main() {
    Test t;
    fwd_ref(std::move(t)); // pass as R-value reference
    fwd_ref(t);            // Pass by value
    return 0;
}


So, in the above code, the magic happens in the 'detail' struct, which copies the constructor argument when it receives a reference, and move the argument when it is 'not-a' reference (which we consider as move for simplicity.
One thing to remember here is that there are other simpler ways to achieve the same with some design change. The only thing that made it complicated is the use of 'forwarding reference' assuming it to be used along with thread.


Move Constructors and Move Assignment


We all know that, if a class is managing a memory resource, it is important to provide the class with a copy constructor, assignment operator and destructor. Support for move operation in user defined classes is done via 'move constructors' and 'move assignment'. So, the 'Big three' becomes 'Big five' in C++11.
Lets look at an example right away(Would not be using smart pointers, apologies for that) [Compiled with g++ 4.8.2]:

#include <iostream>
#include <utility>
                  
class MemoryMgr {
public:
    MemoryMgr(size_t size):mp_block(new uint8_t[size]()),
                           m_blk_size(size) {
    }
        
    MemoryMgr(const MemoryMgr& other) {
        std::cout << "Copy cons" << std::endl;
        if (this == &other) return;
        mp_block = new uint8_t[other.m_blk_size]();
        m_blk_size = other.m_blk_size;
        std::copy(other.mp_block, other.mp_block + other.m_blk_size,
                  mp_block);
    }
/*
    MemoryMgr& operator=(MemoryMgr other) {
        std::cout << "Ass op" << std::endl;
        std::swap(other.mp_block, mp_block);
        std::swap(other.m_blk_size, m_blk_size);

        return *this;
    }   
*/ 
   MemoryMgr(MemoryMgr&& other) {
        std::cout << "Move copy cons" << std::endl;
        mp_block = other.mp_block;
        m_blk_size = other.m_blk_size;
        
        other.mp_block = nullptr;
        other.m_blk_size = 0;
    }

    MemoryMgr& operator=(MemoryMgr&& other) {
        std::cout << "Move assignment op" <<  std::endl;
        if (this == &other) return *this;

        delete[] mp_block;
        mp_block = other.mp_block;
        m_blk_size = other.m_blk_size;

        other.mp_block = nullptr;
        other.m_blk_size = 0;
    }


    virtual ~MemoryMgr() noexcept {
        delete[] mp_block;
        m_blk_size = 0;
    }

private:
    uint8_t* mp_block = nullptr;
    size_t m_blk_size = 0;
};


int main() {
    MemoryMgr obj1(100);
    MemoryMgr obj2(std::move(obj1)); // Move constructor will be called
                                     // NOTE: Once moved, obj1 should not be used again inside main

    MemoryMgr obj3(40);
    obj3 = std::move(obj2);          // Move assignment will be called. Much efficient than regular assignment op
                                     // NOTE: Once moved, obj2 should not be used again inside main

    return 0;
}

One can go through the example and check how the implementation is done for move constructor and move assignment operator. It should be very much clear that, the move operations are much more efficient and faster than the regular copy constructor and assignment operator.

NOTE: I have commented out the assignment operator, since the compiler was finding it ambiguous with the mover assignment operator. This is because, we are accepting the argument by value, which can accept r-value references as well. This can be corrected by making the assignment operator take input parameter by const reference. I leave this to the reader as an exercise :)


Not everything is green !


I would be very skeptic if everything looked so easy and straight-forward in C++. So, lets get down to the quirkiness.

1. Move is more efficient for head based containers/objects
Though all containers in the standard library support move operation, but not for all containers moving would be as cheap as for showed for std::vector. For eg, take the case for std::array, in which the container elements are directly inside std::array object.
Therefore, for this case, the move operation needs to be done for all the elements inside the array. If there is no move constructor provided for the object held by the array, then the time taken is as good as for the copy operation.

2. Once moved, the object state is valid but in unspecified state
Consider the below example:
    std::vector v1 = {1, 2, 3, 4};
    assert( v1.size() == 4 );            // Assertion is valid

    std::vector v2 = std::move(v1);
    assert( v2.size() == 4 );            // Assertion is valid
  
    assert( v1.size() == 4 );            // Assertion is INVALID, size() for v1 may or may not be 4

    v1 = v2;                             // v1 can be assigned to new vector


As can be seen from the example, once moved, it is incorrect to rely on the state of the moved object, for eg: size(), empty(), etc.
The moved object can be assigned with other similar objects sice that does not require any precondition checks.

3. Move constructor and Move assignment operators are not generated by default in many cases
Unlike copy constructor and assignment operator, compiler does not always (*) generate default move constructor and default assignment operator when not provided by the user.

NOTE(*): Not for all cases default constructor or default copy constructor or default assignment operator are implicitly created by the compiler.

a. Default move operations are generated only if user has not explicitly provided implementation for copy constructor or assignment operator or destructor.
b. Default move operations are not generated if the user has provided an implementation for copy constructor or assignment operator or destructor. In this case compiler thinks that something different  or extra needs to be done for moving the resource(s) managed by the class.

4. Do not move objects in return statement
Consider below example:
// Return by Copy version
Object create() {
    Object obj;
    return obj;
}


// Move Version
Object create() {
    Object obj;
    return std::move(obj);
}

Which one do you think would be more efficient, copy version or the move version ? The heading would have given it away anyways :) But, for the sake of explanation, its the copy version that would be more efficient. This is because of NRVO (Named return value optimization), which every decent compiler does (mostly with optimization flags turned on). If RVO/NRVO kicks in, the compiler can completely elide copy and move construction.

With NRVO, the object would be created directly at the destination, requiring no copy at all. This is called as 'copy ellision'.

With the move version, compiler does not do NRVO/RVO, but one has to depend upon the move constructor of Object.

But, NRVO/RVO is not applicable at all times, for eg: the return statements are provided under if-else statements! In this wouldn't it be better to move rather than depend on compiler ?? Cool down, as per the standard, if copy-ellision is not applicable, the 'std::move' is implicitly applied. :)

UPDATE (From Scott Meyers Modern effective C++) : NRVO/RVO optimization requires that we return the local object from the function. But, this is not the case if we return the moved object. The moved object is basically a reference to the object, not the object itself.

Code demonstrating the same [Compiled with g++ 4.8.2]:
#include <iostream>
#include <array>

class Object {
public:
    Object() {
        std::cout << "Cons" << std::endl;
    }   
    Object(const Object& other) {
        std::cout << "Copy Cons" << std::endl;
    }   
    Object(Object&& other) {
        std::cout << "Move cons" << std::endl;
    }   
    void operator= (const Object& other) {
        std::cout << "Assignment op" << std::endl;
    }   
    void operator= (Object&& other) {
        std::cout << "Move assignment op" << std::endl;
    }   
};

std::array<Object, 2> get_array_no_move() {
    std::cout << "Enter get_array" << std::endl;
    std::array<Object, 2> arr;
    std::cout << "Array init" << std::endl;
    arr[0] = Object();
    arr[1] = Object();
    std::cout << "Leaving" << std::endl;
    
    return arr;
}


std::array<Object, 2> get_array() {
    std::cout << "Enter get_array" << std::endl;
    std::array<Object, 2> arr;
    std::cout << "Array init" << std::endl;
    arr[0] = Object();
    arr[1] = Object();
    std::cout << "Leaving" << std::endl;
    
    return std::move(arr);
}

int main() {
    std::array<Object, 2> res = get_array();
    std::cout << "Got res" << std::endl;

    std::array<Object, 2> res2 = get_array_no_move();
    return 0;
}

If you look at the output (after compiling with -O2 optimization flag), it would be clearly visible that get_array_no_move does not have any move constructors called while returning the array object, but the move constructor gets called in case of get_array function. This is because for get_array_no_move the copy and move constructor got elided because of NRVO.

5. Move does not work with const objects
Obviously, how can a constant object move!! Well, the technical reason is that, doing a move may or may not modify the source object, which is against the requirement of const objects.
For this reason, the below example will compile, but you cannot reap the benefits of move:
class Test {
public:
    Test(const std::string& name): m_name(std::move(name)){
    }
private:
    std::string m_name;
};


Phew! So far so good. I was planning to continue with Forwarding references and  Perfect Forwarding in this same post, but now I think it would be better to have them in a separate post else someone would sue me for causing mental depression.

Will update the post with the link to the next post. Till then, wait-for-it.

Update : Part two of the series  

Sunday, 22 February 2015

Fun with Template Meta-programming, enable_if_all

It is true that sometimes ignorance is bliss...sometimes. In my case ignorance or to be more specific, lack of curiosity over this weekend to "just try it out once" lead me to do a lot of crazy stuff with template meta-programming.

What I wanted
While working on my side project work, I came across with the need of having a class with forwarding constructor. Now, people familiar with C++11 would know how greedy a forwarding constructor can be, it's so greedy that it starts taking up the role of copy constructor (for non const object) until we limit it by SFINAE (Substitution Failure Is Not An Error). 

One can lookup here to know what I am talking about.

So, back to my problem. I am sitting here with a need to have a forwarding constructor taking variable template parameters or variadic templates, but also have my copy constructor work for me without any issues. The known solution for such cases is to use some kind of SFINAE or to be more specific use 'std::enable_if'. But, here I am working with variadic templates instead of just known number of template parameters. At this point, I just made one heck of an 'assumption' that 'std::enable_if' won't work with variadic templates and immediately (out of too much excitement) sat down to implement my own generic 'enable_if_all' type trait. Oh, boy! that was embarassing !

What was the big idea ?
The idea was to create a type trait that would take any other type_trait structures (C++11 has got so many of them, check it out here) which accepts 1 or 2 template parameter in general along with bunch variable number of other types on which the type_trait would be applied to get the final result, which is either 'true' or 'false', just like 'std::enable_if'.

But, after a lot of attempts I figured out many things that won't work with template meta-programming. Finally I had to settle for a very naive solution and at the same time figured out that 'std::enable_if' also works with variadic templates and also with type_traits requiring more that 1 template parameter (for eg. std::is_constructible, std::is_same).

In the process, I discovered a missing feature that if present would have been so beautiful and powerful. Currently, there are two ways by which you can pass a template class as a template parameter:

1. Along with the type, i.e as a complete type.

template <typename T>
class Test {
};

template <class ClsT>
class Composer {
};

int main() {
    Composer<Test<int>> cmp;
    return 0;
}

2. Without the type information i.e as template template parameter


template <typename T>
class Test {
};

template <template <typename> class ClsT, typename T>
class Composer {
private:
    ClsT<T> cmp_;
};

int main() {
    Composer<Test, int> cmp;
    return 0;
}

Now, what would have been more beautiful is if we had something like template type 'explosion'. Something like below:

template <typename A, typename B> class Test {};

template <template <typename T1, typename T2> class ClsT>
class Composer {
};


In this case, we would be able to make use of template parameters T1 and T2 as well. But, currently c++ does not allow that. Had it worked, I would be still ignorant about 'std::enable_if'.

What I have ?
Nothing much to be proud showing off, but some important lessons learnt (which I may forget over a period of time :)) related to template meta-programming. It is surely a different language within C++. It might be harsh to untrained eyes, but a real beauty to someone who understands it. Hope it won't be killed by 'auto' in future c++ standards.

Here is my version of 'enable_if_all'. Note that, a lot of things are still missing in it.

#include <type_traits>

template <typename Type>
struct SType {
    typedef Type type;
};

template <template <typename...> class ValidatnFn, typename... Types>
struct enable_if_all;

template <template <typename> class ValidatnFn, typename LastT>
struct enable_if_all<ValidatnFn, LastT> {
    static bool const value = ValidatnFn<LastT>::value;
};

// This is for type_trait checks taking in only single template parameter
template <template <typename> class ValidatnFn, typename FirstT, typename... RestT>
struct enable_if_all<ValidatnFn, FirstT, RestT...> {
    static bool const value = ValidatnFn<FirstT>::value & enable_if_all<ValidatnFn, RestT...>::value ;
};

template <template <typename, typename> class ValidatnFn, typename FirstT, typename LastT>
struct enable_if_all<ValidatnFn, FirstT, LastT> {
    static bool const value = ValidatnFn<typename FirstT::type, LastT>::value;
};

// This is for type_trait checks taking in 2 template parameters
template <template <typename, typename> class ValidatnFn, typename FirstT, typename SecT, typename... RestT>
struct enable_if_all<ValidatnFn, FirstT, SecT, RestT...> {
    static bool const value = ValidatnFn<typename FirstT::type, SecT>::value & enable_if_all<ValidatnFn, FirstT, RestT...>::value ;
};

If you are eyes are itching, please pay a visit to doctor!  :)

Tuesday, 13 January 2015

C++ Templates: 'Name Lookup', 'ADL' and 'Argument Deduction' (Sentinel)

Finally, we are at the final part of this series. Hope you enjoyed the reading till now. 
If you have not had the chance to go over the previous two articles, I will link you to those:

Part 1
Part 2


Template Argument Deduction

Just like other topics, I can only dream about knowing all the nooks and crooks of argument deduction in the template world. But, what I can put forth here is what every experienced C++ programmer must know about it.  

Most of the things I will be writing about is directly influenced by what is already explained in "C++ Templates: The Complete Guide" book. It is highly recommended for everyone who makes use of templates in their C++ code to read through it and always keep it at your hands distance.

Why Argument Deduction ?

A simple template code which one might have written to understand template would have looked like this:
#include <iostream>
#include <string>
#include <typeinfo>

template <typename T>
void func(T arg) {
   std::cout << "func called with arg " << typeid(T).name() << std::endl;
}

int main() {
    func<int>(5);
    func<char>('a');
    return 0;
}


Now, assume you have below function, which you need to call many times and at many places in your code:
template<typename T1, typename T2, typename T3>
void someCommonFunction(T1& a, T2 b, T3 c) {
....
}

...
someCommonFunction<std::string, char, int>(str, 'b', 5);

This is what you would be writing every where! Too much verbosity!!
This is where argument deduction comes in. It deduces the types auto-magically thereby relieving us from the burden of mentioning the intended template types everywhere.

Also, a very important point to remember is that, template type deduction works only for function templates and member function templates, it is not applicable at the class level.


Let's cover some basics...

Look at the below example:
template <typename T>
T max (const T& a, const T& b) {
    return a < b ? b : a;
}

int max_val = max(10, 11);

Lets now see how compiler might be thinking while deducing the types:
1. It encounters the first parameter as 10 and considers its data type as 'int'. So, the parameterized type 'T' of function template max is considered 'int'  as one of the option.

2. Now, it checks the data type of the second argument, which is also 'int'. This matches with the previous conclusion done in step #1 and also fits perfectly as the argument type of the function template based upon the paremeter type i.e both the arguments should be of same type.
Hence, in this case, type deduction succeeded.

3. Let's now consider the max call like below:
int max_val = max(10, 10.1);

4. For the above case, step #1 will tentatively deduce 'T' as 'int'. But, for the second argument, 'T' will be deduced as 'double'. This second deduction does not conform with the first deduction and also w.r.t the parameter type of the function template i.e. both argument needs to be of the same type.
Hence, for this case, argument deduction rightfully fails and the compiler looks for other overloaded function with the same name 'max' to match it with the actual call.


Type Deduction Scenarios

There are various scenarios depending upon the argument type of the function template on which deduction of type T is dependent. These are:
1. Argument type is either a pointer or reference.
2. Argument type is pass by value.
3. [C++11 specific] Argument type is a forwarding reference. (We will not be covering this now.)

NOTE: Examples in this section are influenced by what is provided in 'Effective Modern C++' by Scott Meyers.

Case 1. When argument type is either reference or pointer.
Remember that, if the argument type of the callee is a reference or a pointer, then, the pointer or the reference is ignored. 
For example:
template <typename T>
void func(T& param) {   // See that argument is a reference type
}

int x = 27;
const int cx = x;
const int& rx = x;

f(x);  // 'x' is int; 'T' is int; Argument type is 'int&'

f(cx); // 'cx' is 'const int'; 'T' is 'const int'; Arg type is 'const int&'

f(rx); // 'rx' is 'const int&'; 'T' is 'const int'; Arg type is 'const int&'

Following the comments, it should be pretty straightforward to grasp what is happening here.
The important point to remember here is that:
When we pass a const object to a reference parameter, the object remains as a const i.e. const-ness (also volatile) of the object becomes part of the type deduced for T.
Let's see the same example again with a small change to the argument type:
template <typename T>
void func(const T& param) {
}

int x = 27;
const int cx = x;
const int& rx = x;

f(x);  // x is int; T is int; Arg type is const int&

f(cx); // cx is const int; T is int; Arg type is const int&

f(rx); // rx is const int&; T is int; Arg type is const int&
If param was a pointer ( or a pointer to const ) instead of a reference, things would have been the same. The note above regarding the const-ness (also volatile) of the object also holds in case of the pointer.


Case 2. When argument is passed by value
When the parameter is passed by value, a completely new object is created at the target destination. This has some major changes in the rules regarding c-v (const - volatile) qualifier which we saw for reference/pointer types.

As before, if the callers expression type is a reference, ignore the reference part. If the callers expression type has c-v qualifiers, then they are stripped off.

Example:
template <typename T>
void func(T param) {
}

int x = 27;
const int cx = x;
const int& rx = x;

f(x); // x is int; T is int; Arg type is int

f(cx);// cx is const int; T and Arg type are int

f(rx);// rx is const int& ; T and Arg type are int

As can be seen from the above example that the c-v qualifiers are stripped off when passed by value.


Decaying

Case 1. Decaying of Array into a pointer
In many contexts, an array decays into a pointer to its first element. This is what we know right from C. But, this decaying also takes place in case of function templates accepting parameter by-value!

Example:
const char name[] = "Arun"; // Type of name is const char[5]

const char* pname = name; // array decays to a pointer

template <typename T>
void func(T param) {
}

f(name); // T is const char*, name is const char[5]

Now, the confusing part, if we modify the function template to accept argument by-reference instead of by-value, the deduced type for T is the actual type of the array i.e const char[5], and the type of func's parameter is const char (&) [5].



Use of type deduction

Have a look at boost source or STL implementation, they are every where. It is one of the very basic requirements to understand how template works after all.
It is used in type traits, boost function etc.

What I will be showing here is a pretty useless example, but will cover how in general type traits are written and also shows a peek into boost function traits

Beware, following code is only for those who can bend their mind :) 
#include <iostream>

// Type trait helper function
// Determines at compiler time whether type is signed integer or not
template <typename T> struct is_integer { enum {value = false}; };
template <> struct is_integer<int> { enum {value = true}; };

// Base structure of function trait
template <typename Func> struct function_traits {};

// Partially specialized function_trait
// to accept a function signature
template <typename R>
struct function_traits<R (void)> {
    typedef R result_type;
};

// Partially specialized function_trait 
// to accept a function pointer
template<typename R>
struct function_traits<R (*) (void)> {
    typedef R result_type;
};


int test_func() {
    std::cout << "This is a test function" << std::endl;
    return 42;
}

template <typename Fun>
typename function_traits<Fun>::result_type 
my_function(Fun& f) {
    return f();
}

int main() {
    if (is_integer<function_traits<float(void)>::result_type>::value == true) {
        std::cout << "Return type of function is integer" << std::endl;
    } else {
        std::cout << "Return type is not integer!!" << std::endl;
    }

    int (*fptr)() = test_func;

    if (is_integer<function_traits<int (*)(void)>::result_type>::value == true) {
        std::cout << "Return type of function is integer" << std::endl;
    } else {
        std::cout << "Return type of the function is not integer!!" << std::endl;
    }

    std::cout << my_function(test_func) << std::endl;

    return 0;
}

For those who reached at this point of the page in 5 seconds, have patience, it's not that difficult! Just go one line at a time and you will get it for sure.

Basically, we are just identifying the return type of the function based upon the signature, boost function traits does  a lot more than that.

Disabling Type Deduction

Yes, you can do that as well. With C++, getting things done is never an issue, the only ugly part that can be is the code itself. :)
Recently, I got bogged down by the question, why 'std::forward' cannot use type deduction ?
As always, I found the answer in THIS SO post. That's when the use of 'identity type trait' stuck me. So simple, yet so powerful. 
Lets dig in to an example right away...
#include <iostream>
#include <string>

template <typename T>
void func_deduct(T& val) {
    std::cout << "With deduction: " << val << std::endl;
}

// Identity type trait
template <typename T>
struct identity {
    typedef T type;
};

template <typename T>
void func_nodeduct(typename identity<T>::type& val) {
    std::cout << "No deduction: " << val << std::endl;
}

int main() {
    std::string str("Will it work?");
    //func_nodeduct(str);                // Uncomment this and face the wrath of compiler
    func_nodeduct<std::string>(str);

    return 0;
}


As to why deduction does not work, in short, it is because template type T in function 'func_nodeduct' appears in 'nondeduced' context.

From the Standard 14.8.2.4/4:
The nondeduced contexts are:
  • The nested-name-specifier of a type that was specified using a qualified-id.
  • A type that is a template-id in which one or more of the template-arguments is an expression that references a template-parameter.
When a type name is specified in a way that includes a nondeduced context, all of the types that comprise that type name are also nondeduced. However, a compound type can include both deduced and nondeduced types. [Example: If a type is specified as A<T>::B<T2>, both T and T2 are nondeduced. Likewise, if a type is specified as A<I+J>::X<T>IJ, and T are nondeduced. If a type is specified as void f(typename A<T>::B, A<T>), the T in A<T>::B is nondeduced but the T in A<T> is deduced. ]

There are lots more to be discussed about argument deduction, but I will call it a day. One can lookup here to understand it in more detail.

Saturday, 3 January 2015

C++ Templates: 'Name Lookup', 'ADL' and 'Argument Deduction' (Part 2)

This post as the name suggests is a follow up of my previous post . I would recommend to go through it once before we start digging into 'Argument Dependent Lookup' (ADL from here on) or 'Koenig Lookup'.

Let's start with a surprise story from Andrew Koenig himself which I had come across in  Dr. Doob's journal:
One of the comments on my article last week noted that argument-dependent lookup in C++ is often called "Koenig lookup". I didn't invent it, but unfortunately, I don't know who did, so I don't know where the credit -- or blame -- is really due.

Back to Hello, World! 
No, not "hello world" again! But wait, this has a lot to do with ADL! ADL is what that makes your C++ hello world program so simple! Just stay with me.

Here we go:
#include <iostream>

int main() {
    std::cout << "Hello, World!" << std::endl;
    return 0;
}

This simple program has a lot of C++ intricacies abstracted, in the end making the program extremely simple to comprehend. 

So, let's dissect and understand what is going on behind the scenes for this little innocuous piece of code. For the starters, how many function arguments and function calls do you see happening/appears will happen in line 4 ?
  1. If you answered "What! There are no function calls happening here", welcome to C++, enjoy this article and be ready to get shocked.
  2. Did you say "Two function calls and Three arguments" ?. Close enough, but not quiet there.
  3. But, if you answered "Three function calls and Three arguments", you may or may not be right. 
The answer can be either option 2 or option 3. It really depends upon the argument type.

What is std::cout ?

'std::cout' is an object of 'ostream' class. These objects can write sequences of characters and represent other kinds of data.

From Standard 27.4.1 [iostream.objects]
#include <ios>
#include <streambuf>
#include <istream>
#include <ostream>

namespace std {
  extern istream cin;
  extern ostream cout;
  extern ostream cerr;
  extern ostream clog;

  extern wistream wcin;
  extern wostream wcout;
  extern wostream wcerr;
  extern wostream wclog;
}

What is '<<' ?

It is an overloaded function for operator <<. For our case this is a member function of class ostream and can also be a regular function outside of class ostream but inside the namespace 'std' in which case it accepts 'ostream' class as its argument.
We will get more into it shortly.

What is std::endl ?

'std::endl' is a function template declared as:
template <class charT, class traits>
basic_ostream<charT,traits>& endl(basic_ostream<charT,traits>& os);
This is the third function which gets called after its passed as a parameter in the second call to the function operator <<.

The Real "Hello World"

Brace yourselves, waiting for you is an ugly code.

#include <iostream>

int main() {
    std::cout.operator <<("Hello, World!").operator <<(std::endl);
    return 0;
}
[ We have seen worse, haven't we? :) ]

"But, but there are just 2 function calls and 2 arguments !?". No, there are 3 function calls (do not forget 'std::endl' which is a function and gets called later) and 2 arguments for this case.

Consider another example:
#include <iostream>

namespace std_c {

    // Forward declare
    class Ex;

    std::ostream& operator << (std::ostream&, const Ex&);

    class Ex {
    public:
        Ex(int val = 0): val_(val) {}
        friend std::ostream& operator << (std::ostream& out, const Ex& obj);
    private:
        int val_;
    };

    std::ostream& operator << (std::ostream& os, const Ex& obj) {
        os << obj.val_;
        return os;
    }
};

int main() {
    std_c::Ex tmp(42);
    std::cout << tmp << std::endl;

    return 0;
}

How do you think this is working ?
Questions :
  1. For sure ostream class or cout object does not have a operator << member function that can understand how to print our custom class 'Ex'. So, its certain that a code similar to what we have shown for 'Hello world' won't work here. So how does it print it?
  2. Inside the 'main' routine, we have not mentioned any scope for function '<<', it is unqualified name. How did it end up calling the one inside namespace std_c 
Let's dive into ADL to find answers for the above questions.

Welcome to the world of ADL!

ADL stands for 'Argument dependent lookup'. As the name suggests, it is a type of lookup based upon the type of arguments. Hence, ADL is only applicable for function name lookup. To be more precise, ADL is only applicable for lookup of unqualified function names based upon the arguments of the function.
Though it's still bit incomplete, but we will get there.

Where does the lookup happen ?

From cppreference:
The lookup for the unqualified function names are done in the namespaces of their arguments in addition to the scopes and namespaces considered by the usual unqualified lookup.
Does that strike something ? It says, the compiler also looks up to find a suitable function in the namespace of the arguments of the function !

For our last example, this is how the code looks like:
std_c::operator << (std::cout, std_c::Ex).operator << (std::endl);

Wow! It's calling the operator << funtion under namespace 'std_c' !! 

How did that happen ?
  1. First, compiler does an unqualified lookup for function 'operator <<'. The unqualified lookup gives the algorithm with the member function of ostream class or cout object.
  2. The function given out by unqualified lookup is rejected as it cannot take the arguments we are passing due to type mismatch.
  3. Now, ADL kicks in.
  4. It checks the arguments passed to 'operator <<'.  They belong to namespaces 'std' and 'std_c' resp.
  5. Compiler finds a 'operator <<' function under namespace std. But, that also gets rejected due to type mismatch. How could it, Ex is a user defined class.
  6. Now, compiler will look under the namespace 'std_c' since that's what our namespace is for the second argument.
  7. This definition of 'operator <<' fits the bill perfectly. So, this function is chosen.
  8. If any access specifier checks are applicable, then its done.

General (read 'Golden') Rules for ADL:

1. ADL is only applicable for unqualified function names.
2. ADL is not applicable if argument types are built-in types.
3. ADL kicks in only when unqualified lookup fails OR a template function or member function is found. [The second statement is purely from my observation. Please correct me if I am wrong here]
4. (From cppreference) If the lookup set produced by usual unqualified lookup contains any 
     of the following:
a) a declaration of a class member
b) a declaration of a function at block scope (that's not a using-declaration)
c) any declaration that is not a function or a function template (e.g. a function object or another variable whose name conflicts with the name of the function that's being looked up)
     Then the argument-dependent lookup is not considered.
5. If both unqualified lookup and ADL returns a set of applicable definitions, then overload resolution is applied to find out the best matching declaration.
6. Designing code with over dependence on ADL will hit you with force(Either physically or mentally).
7. For all other details look here.

Examples:
In this section, I will present some more examples to demonstrate ADL and some other cool tricks.
Compiler used: g++ 4.8.2 with C++11 flag enabled


Example 1:
Lets see an overly complicated example of comparing values of two objects:


template <typename T>
const T& max(const T& a, const T& b) {
    return a < b ? b : a;
}

namespace Ctx {
    class Object {
    public:
        Object(int val): val_(val) {}
        bool operator < (const Object& b) {
            std::cout << "Insidious called" << std::endl;
            return this->val_ < b.val_;
        }
    public:
        int val_ = 0;
    };

    template<typename T>
    bool operator < (const T& a, const T& b) {
        std::cout << "operator overload called" << std::endl;
        return a.val_ < b.val_;
    }
};

using namespace Ctx;

int main() {
    Object a(10), b(11);
    const Object& res = max(a, b);
    return 0;
}

Analysis:
What the code does:
1. It has a function-template called max, which receives 2 arguments of same type and compares it straightway like integers.
2. To make the max function work with objects, we have to define operator < function for those classes.
3. Just for the sake of demonstration, I have created two operator < functions, one of them as a member function and the second one inside the namespace.
4. When the max function is called, based upon the lookup, the most eligible operator < function will get called.

So, when you run this program, you expect to see "Insidious called" in the output. And that is what should happen based upon what we have seen till now.

But, I am always getting "operator overload called" as the output ! Why would that be ? It should be an easy-peasy job for the unqualified lookup, no ? Well, this is what I thought and it took me some time to find out the cause.

Inside the max function, operator < is called on an const object, so I have to make the member function const as well.

Change the line as below for the member function and you will see the expected result:
bool operator < (const Object& b) const {

Now, run it again and you should see the output as "Insidious called".

So, we saw how ADL saved the day for us in the first case (or did it just save the day and made future of the code uncertain ? ).


Example 2:
Now, let's see one more similar but bit more complicated example. The lookup flow would be similar to what we saw in example 1, but in this case we will get to see overload resolution kick in as well.


namespace test {

    template<typename T>
    struct Any {
        T val_;
    };

    class Ops {
      public:
        Ops():val_(0){}
        template <typename T>
        Ops& operator<< (const T& val) {
            std::cout << "Called member function" << std::endl;
            this->val_ = val;
            return *this;
        }
      private:
        int val_;
    };

    template <template<typename> class E,  typename T>
    Ops& operator<< (Ops& op, const E<T>& val) {
        std::cout << "Global function" << std::endl;
        return op;
    }
}

int main() {
    test::Ops op;
    int k = 9;
    test::Any<int> a;

    op << a;

    return 0;
}

Analysis:
Pretty much similar to what we have seen till now, only in this case, since the member function is templated, both unqualified lookup and ADL returns a set of definitions.

Unqualified lookup gives out the member function.
ADL gives out the second function.

But, it turns out that, the outside function provided by ADL better matches with the arguments passed, hence the output of the code is "Global function".

I had asked this question sometime back in Stack Overflow, you can see the answer here.



Example 3:
Lets see an example on how ADL works with friend functions.


class Ex {
public:
    friend void foo(const Ex& obj) {
        std::cout << "foo() called" << std::endl;
    }
};

int main() {
    foo(Ex());
    // (foo)(Ex()); // This will give error: "‘foo’ was not declared in this scope"
}

So, it appears that even friend functions are resolved correctly using ADL.
As per the standard:

friend declarations (11.3) may introduce a (possibly not visible) name into the enclosing namespace.
That means, in our example, function foo is visible inside the scope of class (class is also a namespace), thus unqualified look-up will fail to find foo, but ADL can find it since the argument of the function is the class in which we have defined our friend function.



Example 4:
Now, for the cool trick that I was talking about. It appears that there is a way to not to do ADL! One just has to write the function name in round brackets while calling it.

Here is an example:
class GlobalScope {
};

namespace Inner {
    class InnerScope: public GlobalScope { // GlobalScope Identified by unqualified lookup
    };

    void test_func(const InnerScope& param) {
        std::cout << "Inner scope test function" << std::endl;
        return;
    }
};

void test_func(const GlobalScope& param) {
    std::cout << "Global scope test function" << std::endl;
    return;
}

int main() {
    Inner::InnerScope obj1;
    test_func(obj1);  // Should call the one inside Inner due to ADL,
                      // since that is perfect match compared to the one
                      // in global scope


    (test_func)(obj1); // Disables ADL

    GlobalScope obj3;
    test_func(obj3);

    return 0;
}


Try it out for yourselves!

With this, I hope ADL would be pretty much clear. Remember the general rules and you should be just fine.

In the next part, which would be the final one as per the title, we will discuss about template argument deduction.

(To be continued....)