Tutorial on tag dispatching

Posted in Uncategorized on December 15, 2014 by Crazy Eddie

Update: Stephan T. Lavavej (aka STL) provided some feedback: “Overloading on true_type/false_type is still tag dispatching – indeed, it is the simplest, most common form.” I thought I had good reasons for distinguishing situations in which the type is pre-tagged by a trait or typedef as being “tag dispatching” from times when you need to calculate the tag, or when the tag isn’t really documenting a concept but is just an on/off kind of thing. I now think those reasons were flawed, and that any technique in which you tag a type and then use that tag as an overload parameter is “tag dispatching”. There is a difference between these methods obviously, but not anything that distinguishes a separate technique. I’ve left the language in this article alone though.

There have been some confused notions passed around recently that lead me to think there needs to be more information about probably one of the simplest, most powerful metaprogramming techniques in C++: tag dispatching. The reason tag dispatching is powerful is that it leverages the language and the compiler to do work for you so that you don’t have to. It’s a technique whereby you use overload resolution rules to decide between otherwise ambiguous functions.

Concepts and tags

The first part to understand with regard to tag dispatching is the idea of “concepts”. Concepts are not yet a formal part of the C++ language, but they’re integral to discussing many of the components of the standard library–especially those parts that came from the STL.

There’s a good description of concepts over in the boost documentation. There they define a concept as:

A concept is a set of requirements consisting of valid expressions, associated types, invariants, and complexity guarantees. A type that satisfies the requirements is said to model the concept. A concept can extend the requirements of another concept, which is called refinement.

As that link discusses, there are different aspects to a concept. There are the legal expressions available for use on the types that implement the concept. This is the kind of thing that can be analyzed with introspection techniques–in C++ these often center around SFINAE. For this part of the concept we might ask, given a value ‘x’ of type ‘T’ that implements the concept, whether ‘++x’ is a legal expression.
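As a hedged sketch of what such introspection can look like (the trait name here is my own, nothing standard), a check for ‘++x’ might be written as:

#include <type_traits>
#include <utility>

// Primary template: assume the expression is not available.
template < typename T, typename = void >
struct has_pre_increment : std::false_type {};

// Chosen when '++x' is a valid expression for an lvalue of type T.
template < typename T >
struct has_pre_increment<T, decltype(++std::declval<T&>(), void())>
  : std::true_type {};

struct no_increment {};

static_assert(has_pre_increment<int>::value, "int supports ++x");
static_assert(!has_pre_increment<no_increment>::value, "no operator++ here");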

Other parts of a concept, though, can’t really be analyzed with C++ metaprogramming. The invariants, for example, can’t be asserted at the static level. They can and should be asserted during debug builds, but most of the time that’s the only place where one could get the necessary information to check them. Complexity guarantees also can’t be statically analyzed and are mostly impractical to assert in the product–you’re basically stuck with testing, if you even have that.

In C++ we address all this using tagging. We tag a type with something that can be accessed at compile time and that tells us what concept the type implements. This unfortunately means we depend on the author of the type to tag it correctly, but there’s quite literally nothing we can do about that. We might be accused of not writing robust code, but we’re no more beholden to the developer here than when we assume someone won’t subclass an interface and then disobey that interface’s contract. Some languages, such as Rust, try to solve this problem; C++ does not, so there’s really no solution to it here.

In C++ we tag types with other types. These tag types are just empty classes that will eventually be compiled out of the program. They may inherit from other tags, which will be explained in the next section, but they’re not polymorphic so they carry no RTTI information. We also pass them around only by value, and in ways such that their values are never actually inspected at run time–this allows the compiler to remove them from the final object code.

InputIterator is an example of a concept. The standard library provides the tag for this concept by defining input_iterator_tag. We gain access to this tag through iterator_traits, which contains an internal typedef called iterator_category. The standard also defines a default version of the iterator_traits template that tries to access the iterator_category within its parameter. This means that we can tag our custom iterators either by specializing this template or by just tagging the iterator internally like so:

struct custom_iterator
{
  using iterator_category = std::input_iterator_tag;
  // much more stuff...
};

The reason an external traits template is required is that pointers are also iterators, and we can’t tag them in this way. Thus the standard defines a couple of partial specializations of iterator_traits for pointers. Here’s the non-const version:

template < typename T >
struct iterator_traits<T*>
{
  using iterator_category = std::random_access_iterator_tag;
  // ... other useful stuff...
};

Even when this is not necessary, though, it can be beneficial to provide a metafunction or traits template to retrieve the tag.
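One possible convenience helper (the name is mine, not anything standard) is a simple alias that pulls the category out of iterator_traits so call sites don’t have to spell the whole thing out:

#include <iterator>

// Retrieve the concept tag for any iterator, pointers included.
template < typename Iterator >
using category_of = typename std::iterator_traits<Iterator>::iterator_category;

// e.g. category_of<int*> is std::random_access_iterator_tag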

Concept refinement

Concepts can refine other concepts similar to how classes can inherit other classes. When a concept refines another concept it means that types implementing the refining concept also implement the refined concept. So for example concept ‘A’ might be refined by concept ‘B’. Say ‘A’ has the requirement that ‘x.foo()’ be a legal expression that results in a type convertible to bool. Concept ‘B’ can require additional expressions and can put further requirements on the result of ‘x.foo()’, saying it requires type bool rather than something convertible to bool for example–since bool is convertible to bool this is fine.

ForwardIterator is an example of a concept that refines another. It refines InputIterator and the main thing that it adds is the ability to iterate over the same range multiple times. This part of the refinement is an essential difference between an InputIterator and a ForwardIterator but it changes nothing about the static interface of the types that implement either. Using introspection alone you’d not be able to decide if an iterator has this ability or not–only tags solve this. Another thing it does is override the ‘*i++’ expression to return a reference. This obeys InputIterator’s version because a reference to ‘value_type’ is convertible to ‘value_type’. This is actually a rather unfortunate and short-sighted part of the standard library because it means filter and transform iterators don’t fit very well into the existing tag structure.

In C++ we represent the refinement relationship using inheritance in our tags. Thus in the standard library the tag for ForwardIterator is the aptly named forward_iterator_tag and it inherits from input_iterator_tag. This allows us to write tag-dispatched functions that specialize at the refinement level we want, catching all below. The advance function makes a good example to explain this because if ‘n’ is some value greater than 1 we want it to just jump directly to that new place given a random access iterator (an iterator for which ‘i += n’ is a valid expression).

Tag dispatching

A tag dispatched function is a function that catches types at a generic level, retrieves a tag labeling the type as implementing the different target concepts, and then uses that tag in an argument for implementation functions. Here’s a version of advance to explain:

// This version actually works more like 'next' in the standard library.
namespace detail_ {

template < typename ForwardIterator >
ForwardIterator advance(ForwardIterator it, int n, std::forward_iterator_tag)
{
    assert(n > -1 && "Can only move a forward iterator forward!");
    while (n--) ++it;
    return it;
}

template < typename RandomAccessIterator >
RandomAccessIterator advance(RandomAccessIterator it, int n, std::random_access_iterator_tag)
{
    it += n;
    return it;
}

}

template < typename Iterator >
Iterator advance(Iterator it, int n)
{
    using tag = typename std::iterator_traits<Iterator>::iterator_category;
    // We deliberately don't support InputIterators here, so check against the forward tag
    // and give the failure a readable message.
    static_assert(std::is_base_of<std::forward_iterator_tag, tag>::value,
                  "Can only advance ForwardIterators and their refinements.");

    return detail_::advance(it, n, tag{});
}

It’s actually debatable whether we want to have that static_assert in the entry function–I purposefully made the implementation terrible to raise this concern. We might want to allow people to provide an overload for that base iterator concept. In general though you don’t want to do this–you’re writing your dispatch functions to be as complete as they’ll ever be and you don’t want to try being too generic. YAGNI. Without the static_assert you’ll get what is known far and wide as “template vomit”, which is where C++ gets its reputation for very long error messages. Here it probably wouldn’t be all that deep.

The way this code works depends on tags that have direct correlation to the concepts being implemented. RandomAccessIterator is a refinement of BidirectionalIterator, which is in turn a refinement of ForwardIterator. Furthermore we may soon have ContiguousIterator, which will be a refinement of RandomAccessIterator. The tags directly correlate to these concepts and their refinement relations through inheritance:

struct input_iterator_tag {};
struct forward_iterator_tag : input_iterator_tag {};
struct bidirectional_iterator_tag : forward_iterator_tag {};
struct random_access_iterator_tag : bidirectional_iterator_tag {};

// in C++ 17 maybe
struct contiguous_iterator_tag : random_access_iterator_tag {};

// Orthogonal concept that doesn't refine any of the others and isn't refined by them
struct output_iterator_tag {};

C++ has a rule regarding function overloading that requires the compiler to choose the closest match. This means that when we pass a value of type bidirectional_iterator_tag, which due to inheritance is convertible to forward_iterator_tag, the compiler will pick the ForwardIterator version of our overload set. However, if instead random_access_iterator_tag or any subclass of it is passed in, the RandomAccessIterator version will be chosen.
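To see the dispatch in action, here’s a quick usage sketch (assuming the definitions above); the calls are qualified simply to keep std::advance from being dragged in through ADL:

#include <list>
#include <vector>

void demo()
{
    std::list<int>   l{1, 2, 3, 4};
    std::vector<int> v{1, 2, 3, 4};

    // list iterators are bidirectional, so their tag converts to forward_iterator_tag
    // and the looping overload is picked.
    auto li = ::advance(l.begin(), 2);

    // vector iterators are random access, so the 'it += n' overload is picked.
    auto vi = ::advance(v.begin(), 2);

    (void)li; (void)vi;
}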

We accept the tag by value because we don’t care about slicing and because passing by value gives the compiler a better opportunity to optimize that parameter out of the program.

It’s important that these relationships are consistent, otherwise people that use them may run into problems. The purpose of these tags is to filter input types to our functions down to the interface that function depends on. We break that if our tags don’t correlate directly with the concepts we’re claiming to implement.

Getting it wrong

In his technique comparison article, Mr. Fultz compares different methods to resolve generic overloads. Unfortunately the entire article is colored by his desired outcome, the last method. Especially problematic is the tag dispatching version because it confounds tags, which should be representing concepts, with a particular overload ordering. Here is his set of tags:

struct base_tag {};
struct sequence_tag : base_tag {};
struct range_tag : sequence_tag {};

These tags are meant to represent static-if logic rather than concepts, yet they still correlate with concepts. The range_tag is meant to represent dynamic sequences like std::vector, sequence_tag is for static sequences like std::tuple, and base_tag is the default, which assumes “streamable”. The goal, explained by the author elsewhere, is to implement logic like the following:

template < typename T >
void print(T const& t)
{
  if (is_range<T>) range_print(t);
  else if (is_sequence<T>) sequence_print(t);
  else stream_print(t);
}

This rather misses the point of tag dispatching though, which is to leverage the compiler to decide this stuff for you. The problems here can be seen if you consider something he claims about his tag deciding function: “Now even though this logic is little complicated it only needs to be written once.” One would assume he means we could use this logic for dispatching other functions, but if I used it for that I might write something like so:

namespace detail_ {

template < typename T >
void foo(T const& t, base_tag);

template < typename T >
void foo(T const& t, sequence_tag);

}

template < typename T >
void foo(T const& t)
{
  using tag = typename get_print_tag<T>::type;
  detail_::foo(t, tag{});
}

It doesn’t really matter how the detail parts are written: if I pass a std::vector to foo then bad things are going to happen. I should be able to depend on it behaving correctly, but I can’t. Granted, it’s going to error out at compile time given these particular concepts, but it’s not exactly going to be pretty. To avoid the template vomit phenomenon I’d have to write something like this:

template < typename T >
void foo(T const& t, sequence_tag)
{
    static_assert(is_sequence<T>::value, "Passing off a non-sequence as a sequence you silly sod!");
}

This shouldn’t be necessary.

Fixing it

So let’s see if we can fix the issues. The first thing we should do is make sure the tags correspond with concepts correctly. We have two orthogonal concepts and one that’s just totally unrelated to the other two:

  1. Sequence: a static sequence
  2. Range: a dynamic sequence
  3. Streamable: something we can use the stream operator on

The first two are mutually incompatible. There’s no such thing as a static, dynamic sequence–a container is one or the other but not both. The third is entirely unrelated to either of the others. A type that is either a Sequence or a Range could be Streamable or not. Our corrected tags are then:

struct streamable_tag {};
struct range_tag {};
struct sequence_tag {};

At this point we can see that the problem doesn’t really decompose into something that tag dispatching can get a purchase on. There’s nothing at all for the compiler to decide. Underneath these different kinds of concept there is certainly room for tag dispatching. Ranges, for example, are one of a variety of different kinds within a hierarchy of concepts. Sequences also come in different kinds like ForwardSequence or BidirectionalSequence. All of these types will have different tags associated with them that can be dispatched upon, but they’ll not be attached to the same traits metafunction or typedefs as in the orthogonal hierarchy.

We can still do something very much like tag dispatching though by creating the static if/else logic that Mr. Fultz was after:

// Technically we're just defaulting here, so we shouldn't tag the type as streamable
// when we don't actually know that it is.
struct try_streaming_tag {};

template < typename T >
struct select_print_tag
  : mpl::eval_if
    <
      is_range<T>
    , mpl::identity<range_tag>
    , mpl::if_
      <
        is_sequence<T>
      , sequence_tag
      , try_streaming_tag
      >
    >
{};

Note that eval_if is used for the outer if because we don’t want the inner one to be evaluated if the first check is true. As far as whether we do range first or sequence first, it doesn’t matter–we could go either way, and since they’re orthogonal and unrelated no call would change. What does matter though is the last one, because we want to use the range or sequence overloads if possible and default to the stream. Since ranges and sequences can be streamable there’s an ambiguity that must be resolved, and we’re picking the non-stream overloads first. This is also why the whole issue can’t be quickly solved by simple enable_if checks (sequence vs. range could be, but the streaming bit throws a wrench in).

This seems like it’s pretty verbose, and it is, but we can now use this metafunction and the tags without screwing up:


// Forward declaration so the element-wise calls below can find the single-argument print.
template < typename T >
void print(T const& t);

template < typename T >
void print(T const& t, try_streaming_tag) { std::cout << t; }

template < typename T >
void print(T const& t, range_tag) { for (auto const& v : t) print(v); }

template < typename T >
void print(T const& t, sequence_tag)
{
  fusion::for_each(t, [](auto const& v) { print(v); });
}

template < typename T >
void print(T const& t)
{
  using tag = typename select_print_tag<T>::type;
  print(t,tag{});
}

No artificial, arbitrary, and ill-conceived inheritance hierarchy for the tags is necessary here. The only reason for tags to inherit from each other is if they are reflecting concepts that refine each other. This allows the base-class resolution to trigger the use of overloads for the base-concept. We don’t need or even desire that here.

It’s hard to justify using tags like this. We’re not really benefiting from them at all. We still have to teach the compiler how to select the tag, so we’re not re-using the compiler’s logic. It does seem more legible though than either the “function overloading” or “conditional overloading” examples from the original blog. The former because that version has to repeat the logic in the select metafunction above for each function (with some sharing between them, which IMO just makes it harder to understand), and the latter because it’s a little bit macro heavy. The selection logic is also more focused this way, but that’s not really usual for tag dispatching (where the selection logic isn’t even in code but in the compiler).

Another method, arguably better, would be to use std::true_type and std::false_type as tags and just have is_xxx parameters–almost tag dispatching but not quite. It would work like so:


template < typename T, typename Ignore >
void print(T const& t, std::true_type, std::false_type, Ignore)
{
  range_print(t);
}

template < typename T, typename Ignore >
void print(T const& t, std::false_type, std::true_type, Ignore)
{
  seq_print(t);
}

template < typename T >
void print(T const& t, std::false_type, std::false_type, std::true_type)
{
  std::cout << t;
}

// Tick traits don't appear to return the standard bool types so we need some translation
template < typename T >
struct test_ : mpl::if_<T, std::true_type, std::false_type> {};

template < typename T >
using test = typename test_<T>::type;

template < typename T >
void print(T const& t)
{
  print(t, test<is_range<T>>{}, test<is_sequence<T>>{}, test<is_streamable<T>>{});
}

There are some key differences here. First of all, if there ever were a Sequence that was also a Range we’d get a compile failure because there’s no overload to accept it. This is probably desirable because it would point this out rather than silently defaulting one way or the other. Second, the if/else logic required by both my former example and all of Mr. Fultz’s examples is gone. This version truly puts that effort on the compiler to figure out; we just give it the patterns and where to get the constraints…and we didn’t need tags at all. Finally, non-streamable types that don’t fit the other two concepts don’t wait until they’re inside the stream overload to abort compilation. There’s no overload for that situation either.
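Neither sketch in this section compiles on its own, because is_range, is_sequence, and is_streamable come from the Tick library used in the original article. Purely as a hedged illustration (these are stand-ins, not Tick’s definitions), such traits might look like:

#include <iterator>
#include <ostream>
#include <tuple>
#include <type_traits>
#include <utility>

// Range: anything std::begin/std::end accept.
template < typename T, typename = void >
struct is_range : std::false_type {};

template < typename T >
struct is_range<T, decltype(std::begin(std::declval<T const&>()),
                            std::end(std::declval<T const&>()), void())>
  : std::true_type {};

// Sequence: spelled out for the static sequences we care about here.
template < typename T >
struct is_sequence : std::false_type {};

template < typename... Ts >
struct is_sequence<std::tuple<Ts...>> : std::true_type {};

template < typename A, typename B >
struct is_sequence<std::pair<A, B>> : std::true_type {};

// Streamable: 'os << t' is a valid expression.
template < typename T, typename = void >
struct is_streamable : std::false_type {};

template < typename T >
struct is_streamable<T, decltype(std::declval<std::ostream&>() << std::declval<T const&>(), void())>
  : std::true_type {};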

Just overload it when you can!

There are a couple of branches from the original version that I didn’t address.

The first is the variant version. This one is actually not necessary and does nothing interesting: variant already knows how to print itself because it overloads the stream operator.

The second is for std::string. This one is important because unless we do something we’ll use the range version of the print function. The way I’ve written it we’d get the same output, but it would be done the slow way, and it’s entirely plausible that a real implementation would put commas between range items–something we’d not want to happen to strings. The way to deal with this, though, is blazingly simple and requires nearly zero special consideration:

void print(std::string const& str) { std::cout << str; }

Concluding remarks

There are a lot of reasons to prefer tag dispatching, and the similar construct I explained here, over heavy use of SFINAE and/or deep macros. Neither of these latter tools should be discarded, but both introduce complexities that should be avoided when possible. There’s a lot that can be done with tags, and things like tags, to simplify code. It’s important though that when doing so you don’t try to outsmart the compiler, which can lead you to making poor decisions. Function overloading, with or without tags, can be a very powerful tool.

Sane C++

Posted in Uncategorized on October 9, 2014 by Crazy Eddie

I’ve created a second C++ blog that will be more geared toward practical, everyday programming. I’ll continue using this one as a place to discuss more advanced or, “just because I can,” topics.

Crazy Eddie’s Sane C++

The first post went up today. It is a brief introduction to unit testing using Boost.Test.

Introduction to unit testing

From the about:

This blog is a complement to my other blog, Crazy Eddie’s Crazy C++. There I focus on more advanced subjects, like template metaprogramming, that are not going to be what you want to be writing all the time (or at all) because it can be tough to maintain. In this blog I focus on more practical–boring if you will–subjects you can and should use in your everyday C++ development. In Crazy C++ you can find tricks that respond to hard problems or just exercise the language; here I intend all content to be accessible and usable.

I assume some familiarity with the C++ language. Perhaps you’ve taken a college course or two, or maybe you’ve read a book and written a few exercise programs. You might even be interning or in your first few years of professional software development. Perhaps some of my posts here will give the occasional insight to an experienced software developer well versed in C++, but for the most part I intend to address those people who wish to go from barely knowing anything to being able to write maintainable C++ code. Expect to see a lot of simple idioms and pattern application.

You can find code examples for most posts in this blog at my github: Sane CPP.

Near future installments will include subjects like pimpl, NVPI, RAII…and whatever.

On another issue: I am currently on sabbatical but am on the lookout for potential employers as well. My linkedin profile serves as a resume; feel free to contact me if you have a position I could be interested in. In the meantime I hope to be more active in both of these blogs.

Pimplcow

Posted in Uncategorized on September 13, 2014 by Crazy Eddie

This article briefly describes a simple little extension of the pimpl idiom that I’ve called “pimplcow”. It’s not a unique idiom; I know at least one other developer/group that has done the same thing. I invented it independently though, and at least for a while thought it uniquely mine. Perhaps I can lay claim to the name :p

The basic gist of the idiom can be shown with shared_ptr:


struct object
{
    int get() const;
    void set(int);

    object();
    object(object const&);
    object& operator = (object);

private:
    struct impl;
    std::shared_ptr<impl> pimpl_;

    impl * pimpl();
    impl const* pimpl() const;
};

The declaration of the class isn’t really anything new or interesting except perhaps for the pimpl() function and the shared_ptr. The pimpl() function is actually useful outside of this idiom as it allows you to easily copy the constness of your object into the constness of your impl, which generally should match but which a plain pointer member won’t propagate on its own. The implementation though is where things get a touch interesting:


struct object::impl { int i; };

object::object() : pimpl_(new impl()) {} // or make_shared
object::object(object const& other) : pimpl_(other.pimpl_) {} // shallow copy!
object & object::operator = (object other) { pimpl_.swap(other.pimpl_); return *this; }

int object::get() const { return pimpl()->i; }
void object::set(int i) { pimpl()->i = i; }

// Now for the tricky parts...

// For const operations we just return the pimpl we have.
object::impl const* object::pimpl() const { return pimpl_.get(); }

// For non-const operation we assume the object is about to be modified and we now copy the pimpl...maybe.
object::impl * object::pimpl()
{
    if (!pimpl_.unique()) pimpl_.reset(new impl(*pimpl_));
    return pimpl_.get();
}

This now implements the copy-on-write optimization in a VERY easy way. In fact, if you write your pimpl-based classes in a simple, consistent way, you end up having to change only two functions to switch between pimplcow and simple pimpl:


object::object(object const& other) : pimpl_(new impl(*other.pimpl_)) {}

object::impl * object::pimpl() { return pimpl_.get(); }
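To make the copy-on-write behavior concrete, here’s a small usage sketch against the pimplcow version above (the assertions just document what happens):

#include <cassert>

void demo()
{
    object a;
    a.set(1);

    object b(a);          // shallow copy: both objects share one impl
    assert(b.get() == 1);

    b.set(2);             // non-const pimpl() sees the impl is shared and clones it first
    assert(a.get() == 1); // value semantics are preserved; 'a' is untouched
    assert(b.get() == 2);
}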

A further optimization recognizes that `shared_ptr` is actually a rather heavyweight object when you know for certain you’re using shared-ownership semantics, as is the case in pimplcow. So instead you use an intrusive reference-count pointer such as `boost::intrusive_ptr`.
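The original doesn’t show that variant, but a rough, hedged sketch of the moving parts might look like the following; the count member and hook functions are illustrative, the only real requirement being that Boost can find intrusive_ptr_add_ref/intrusive_ptr_release for the pointee:

#include <boost/intrusive_ptr.hpp>
#include <atomic>

struct object::impl
{
    int i = 0;
    std::atomic<long> refs{0}; // the count now lives inside the impl itself
};

inline void intrusive_ptr_add_ref(object::impl* p)
{
    p->refs.fetch_add(1, std::memory_order_relaxed);
}

inline void intrusive_ptr_release(object::impl* p)
{
    if (p->refs.fetch_sub(1, std::memory_order_acq_rel) == 1) delete p;
}

// The member then becomes boost::intrusive_ptr<impl> pimpl_; and the non-const
// pimpl() checks pimpl_->refs == 1 instead of pimpl_.unique() before cloning.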

This construct is really simple and can save a lot of memory in some situations. It can also greatly reduce CPU use if your objects would otherwise be copied a lot but not changed. Yet further you can use value semantics instead of shared pointers to optimize against copying. It’s much, MUCH simpler to think about and use value types than shared pointers when the only thing you’re trying to do is avoid copies. Finally, in these conditions you must already be paying the cost of atomic increment/decrement in the reference count, which is the one thing you need to worry about here.

To close, a bit about the reference count. You need to be wary of, and measure, the effect of the atomic operations needed to increment and decrement the reference count for the shared pointer. If you’re making and destroying huge numbers of copies without ever calling a non-const member (and thus without ever doing the real copy), you can end up with a LOT of contention between CPUs. The pimplcow idiom is thread safe if you use a thread-safe reference counter, but in some cases it may actually be cheaper to make a real copy than to defer that copy behind a reference count. You’ll also want to avoid taking your parameters by value when you don’t need to, because doing so injects this atomic (or mutex-protected) reference-count manipulation, and that can indeed be a heavy cost in some cases.

Final close note: You can use this same sort of pattern to implement a flyweight quite easily. Use the usual pimplcow versions of copy and pimpl(), but additionally search a flyweight manager for an existing identity for the pimpl after you modify it. I have implemented both of these idioms in a sort of pimpl framework library in my github experiments repository: https://github.com/crazy-eddie/experiments/tree/master/performance – I make no claim that this experiments repository will be in a usable state when/if you choose to clone it. In fact I know right now that the pimplvector stuff is completely broken.

Final, final note: This idiom/pattern is indeed thread safe, but only if values are not shared among threads. So long as each thread has its own “copy” of the value you need not worry about any race conditions. This makes the idiom easy to implement and use. If, on the other hand, you need to share a single copy of a value among multiple threads, then you need to add further synchronization. This is not really unusual and so I don’t consider it a flaw, but you do need to be aware of it, because of course bad things will happen if you use it incorrectly and cause a race condition.

A case against naming conventions based on type

Posted in Uncategorized with tags on May 17, 2013 by Crazy Eddie

Many places I have worked have naming conventions that are, in my opinion, poorly conceived. Naming conventions really should be limited to deciding between camel case and pascal case and to saying, simply, “Give it a name that explains what it is supposed to be for.” Unfortunately a lot of places go much further and do things like putting ‘m_’ before member variables or having different naming conventions for different classes of entities (Pascal case for classes, underscores for functions, camel case for variables, ‘_t’ after typedefs, etc…). These kinds of conventions miss the point of generic and object-oriented programming and thus miss the point of C++ in a great way.

The issue is that what kind of thing an entity is doesn’t really matter so long as a certain interface is obeyed. Take for example putting the ‘_t’ postfix on typedefs, which many developers and firms do. What happens when someone comes along and decides that this type would be better implemented as a class of its own rather than as a typedef of something else? Since typedefs in C++ are weak this is not a surprising or all too rare thing to do. Now, in order to keep from breaking clients everywhere, we need to follow the convention for naming classes and then add a typedef so that the ‘_t’ name still exists. This is silly.

Another example comes up with regard to functors. Functors are classes whose instances look and behave like functions that may have state. If you’ve specified that variables must be named one way and functions another, then you’ve got a problem here, because conceptually this entity is both a variable and a function! If you replace a function with a functor someplace then again you are in the unfortunate position of having to violate the naming conventions or do something silly like writing a function that calls the functor just so both names exist.
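A purely illustrative sketch of the problem (the names and the convention are made up for the example):

#include <iostream>
#include <string>

// What used to be a free function has grown state and become a functor.
struct line_logger
{
    int count = 0;
    void operator()(std::string const& msg)
    {
        std::cout << ++count << ": " << msg << '\n';
    }
};

line_logger logLine;                    // named as a variable? as a function? it's both

void client() { logLine("hello"); }     // call sites still read like an ordinary function call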

The C++ standard library has a pretty good C++ naming convention.  It uses one convention for almost everything with the caveats that template parameters are named differently (and this is important because of dependent name issues) and so are macros.  Both of these things require special consideration when using so it’s worthwhile to make them stand out.

In C++ types are strong at the time of compilation, but in the source they are interchangeable to a great degree.  So long as something obeys an interface, calling code shouldn’t have to change when you completely change the type some entity is.  Naming conventions that force you to change names based on type hinder this ideal.

CUDA without Visual Studio

Posted in Uncategorized on February 11, 2013 by Crazy Eddie

So I have a solid-state drive that I put my system on. It is only 55G in size and Windows 7 itself takes up over half of that; that’s before installing ANYTHING. Visual Studio takes up around 9G of drive space, and though you can ask it to install on another drive, that only moves 1G…the rest INSISTS on being on your C drive.

I wanted to try out CUDA development though, and I don’t currently have a Linux install in anything but a VM (that will be remedied eventually). The CUDA development kit requires VS.

It is possible, though, to get just the VS compiler by downloading the SDK. The SDK can mostly be installed on another drive. Though there is stuff that insists on being on your system drive, it’s more like .5G rather than 6+. Even stuff that VS installs there and won’t put anywhere else can be put on another drive this way. Of course, you won’t have the debugger or the IDE…but in a pinch it works.

The next thing I had to argue with was getting the CUDA compiler to recognize the configuration. I got a weird error saying it couldn’t find configuration “null” in the amd64 directory within the bin of the SDK install. That is easily fixed by creating a file called “vsvars64.bat” in that directory that has a single line in it: “call drive:\path_to\SetEnv.cmd /x64”. The ‘path_to’ part will depend on where you installed the SDK.

Then you have to close that cmd window and start it again for some reason–I’m talking about the “Windows SDK Command Prompt” from the start menu. After this you can set the PATH to include your cudapath\bin and cudapath\open64\bin. You need to be able to run `nvcc` and `nvopencc`.

Once all this was done I was able to compile a basic cuda program from the command line with: `nvcc --cl-version 2010 --use-local-env file.cu`.

This took hours of pain, Google research, and forum posting. Hopefully the next person in my shoes can find this and it helps. I look forward to hearing success stories and otherwise.

“Universal design principle”

Posted in Uncategorized on January 17, 2013 by Crazy Eddie

In this video the presenter attributes what he calls the Universal design principle to Scott Meyers, and credits him with the name as well. I can’t find another source on this, but I have seen the principle passed around by others such as Robert C. Martin. The principle is, “Make interfaces hard to use incorrectly and easy to use correctly.” The presenter had a picture of a pestle and cup; the idea being that it’s really easy to use and can’t be used any other way.

That got me thinking and for the first time I think I believe that this principle is the wrong way to think about it. Why? Because being the kind of gutter thinker I am I immediately thought of other uses for that pestle…some places it could be stuck. I suppose this would be considered the “wrong” use and so it doesn’t follow the principle but wait!

What if you WANT to use it that way?

So I think that perhaps a more useful, a more defensive way of thinking about this principle is to say that, “The use of an interface in any way that works is ‘correct’.” In other words, regardless of the intention of the interface designer, if it can be used in a particular way then it’s a correct use of that interface. It then follows that it’s the duty of the designer to make sure that his or her intentions MUST be followed because they’re the only use that the interface supports.

We then get a sort of inversion of the principle that I think more clearly defines the intention of the principle: to make sure your interfaces are adequately specified and that implementations protect themselves from violations of that interface…BUT…your implementation must also handle correctly any correct use your interface allows. That way if someone decides they want to stick it where the sun doesn’t shine…then either it should not fit, or it at least shouldn’t break.

SFINAE — but sometimes it is.

Posted in Uncategorized on January 1, 2013 by Crazy Eddie

As I was writing up the code for my last article on bind expressions I actually learned something I had not expected and did not know before. As I often say, “Teaching is one of the best ways to learn,” and well, here’s a great example.

When I initially wrote the bind helper function I only had the one set:

template < typename Fun >
binder<typename return_of<Fun>::type, Fun, empty_list> bind(Fun f);

// further definitions for 1, 2, and 3 argument functions...

I then had a helper metafunction that would derive the return type:

template < typename T >
struct return_of { typedef typename T::type type; };

// one of many...
template < typename R >
struct return_of<R(*)()> { typedef R type; };

This worked just fine for function pointers and for functors with a nested type typedef. The problem came later when I decided to add member function compatibility. This requires building an extra object to convert the calling convention from member-pointer syntax to function syntax. Here is one such function overload:

template < typename R
         , typename P1, typename A1
         , typename P2, typename A2 >
binder_t< R
        , mf1<R,P1,P2>
        , boost::fusion::vector<A1,A2> >
bind(R(P1::*f)(P2),A1 a1, A2 a2)
{
	typedef mf1<R,P1,P2> F;
	typedef boost::fusion::vector<A1,A2> L1;
	return binder_t<R, F, L1>(F(f), L1(a1,a2));
}

The assumption I had at this point, previous to compiling and after writing 20 lines of repetitive overloads, was that the substitution of the member function pointer type in the previous overloads would cause a failure that the compiler would ignore because of SFINAE. Yeah…that’s not how it works.

There are a couple of ways to consider why SFINAE will not apply here. The first is to look at 14.8.2 in the standard, note that the deduction failures listed there are the only ones SFINAE can catch, and see that what’s going on here, instantiating something within another template that causes an error, is not in that list.

The more direct reason though is this: During the deduction of the types for the bind function we are instantiating another template. That template goes through its own deduction process and passes! There is no syntactic error caused by substituting a member function pointer into the template parameters for return_of. The deduction process is finished here and that template can be instantiated. Remember also that in C++ templates are only instantiated as they are needed. It is perfectly legal to instantiate a template class that has members that would cause syntax errors so long as you never use those members. So this process is finished and we are no longer within the realm where SFINAE even applies in the language with regard to return_of. We then instantiate type within this template and it has one…so again we can’t trigger SFINAE here because the type name exists! Now the instantiation phase happens for type and of course that results in the gibberish code R(T::*)(??)::type. This now is a syntax error, an ill-formed piece of code and we are well past the point of SFINAE, which happens only during deduction.
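Here’s a minimal, self-contained sketch of the gotcha (the names are mine). The failure happens while instantiating a class template used in the signature, which is outside the “immediate context” of the substitution, so it’s a hard error rather than a discarded overload:

// Plays the role of return_of.
template < typename T >
struct lookup { typedef typename T::type type; };

// Like the original bind: the return type drags lookup<T> into the signature.
template < typename T >
typename lookup<T>::type pick(T);

// The overload we might hope would be chosen instead.
template < typename T >
int pick(T, ...);

struct plain {}; // no nested 'type'

// int i = pick(plain{}); // hard error inside lookup<plain>, not SFINAE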

There are a few ways you can fix this issue. The first is what I chose in the article code: use T::type directly in the signature of my function so that it’s discovered during deduction, where “substitution” actually occurs. This of course required that I write alternate overloads for function pointers.

Another way is to write return_of so that it won’t try to define a nested type member if T doesn’t have one:

#include <boost/mpl/has_xxx.hpp>

BOOST_MPL_HAS_XXX_TRAIT_DEF(type)

template < typename T, bool = has_type<T>::value >
struct return_of {};

template < typename T >
struct return_of<T, true> { typedef typename T::type type; };

Now what happens in the original definition of bind is that the actual substitution going on is return_of<T>, and then we try to access type within the result, which fails. This is a substitution failure, and it is not an error–the overload is simply discarded.

Finally, one more way to fix this issue is to force the lookup of T::type to occur during the deduction phase of return_of itself, while the template parameter is being substituted in for T:

template < typename T, typename Type = typename T::type>
struct return_of { typedef Type type; };

Now the substitution failure is happening within the return_of deduction, which must complete in order to finish deducing bind. This substitution failure is also “not an error”, and the whole works is discarded as a viable overload for bind.

It’s an interesting little gotcha. Some people I’ve talked to claim that what I did was legal in C++03, but only due to unclear language in the standard that has since been clarified. Others claim that the standard already defines what I first did as ill-formed. I tend to believe the latter as it seems to coincide with my interpretation, but either way you certainly cannot depend on it working, and if it does you should simply treat that as a fluke. If you have code that relies on this working, get to work replacing it with whichever of the above best fits the problem. If you’re just learning how to use SFINAE to do type introspection and function overloading…keep this in mind and be armed with a better understanding of the mechanism.

Until next time, happy coding!