Compile time strings with constexpr

Posted in C++, cpp on October 17, 2014 by Crazy Eddie

In Using strings in C++ template metaprograms, Abel Sinkovics and Dave Abrahams describe a set of macros to create strings for metaprogramming in C++. Since C++11 though there is another way, and with C++14 it’s made easier. This article describes that method.

I’m using C++14 features here. Namely not needing postfix decltype in auto functions. My compiler seems to have some issues with auto (it’s not the latest version) so I’m not using it everywhere I could, but where I do I’m not doing the whole -> decltype(expression) thing. It cleans up the code a LOT and is a lot simpler. You should be able to do this in C++11 though by adding those bits where necessary.

With this technique you get a string type that can be used is much easier ways for many of the same purposes. It can be used at both runtime and compile-time and at least in the cases I’ve tried and compilers I have used this version can be used with much longer strings. For example in previous attempts to implement fizzbuzz as a metaprogram I was able to only acheive very low numbers before reaching the limit of template parameters for my compiler. In the sample code for this article I acheived all 100 entries.

Prerequisite utilities: index generation

For many of the functions in this construct we need a variadic template pack of all legal index values in an array. On the stackoverflow C++ Lounge wiki a technique is described that allows this. I altered it slightly:

template < int ... Nums >
struct indices
    template < int I >
    static constexpr auto append(boost::mpl::int_<I>)
        return indices<Nums...,I>();

constexpr indices<> make_indices(boost::mpl::int_<0>)
    return indices<>();

template < int I >
constexpr auto make_indices(boost::mpl::int_<I>)
    using prev = boost::mpl::int_<I-1>;
    return make_indices(prev()).append(prev());

You can use these constructs to create a type that contains all indices for an array, and then you can use this in a pack expression to retrieve that list of indices as a variadic template pack and then use that to initialize an array.

Basic operations

Before moving on to the more interesting stuff I will briefly describe the fundamental definitions to get started with:

template < int Size >
struct string
    using data_type = char const (&) [Size+1];

    constexpr int size() const
        return Size;

    constexpr data_type c_str() const
        return arr;

    // ...
    char arr[Size+1];

The most interesting thing here is the definition of data_type. For a great many things we could simply return char const* but in many cases it helps us a lot to get the raw array type out of that function. It will decay into a pointer when necessary. The rest of this stuff is pretty self explanitory: we define constexpr functions to retrieve the size and the actual string, and a place to put it.

The array needs to be Size+1 so that it can store the null terminator and thereby behave like a C string. This could be ommitted if you never wish to use it as a c-string, but in many ways it’s easier to enable that form of access.


Building this thing then becomes the first tricky part we run into. We want to be able to build this thing out of a c-style string. We also want to be able to construct a zero size string without parameters.

template < int Size >
struct string
    // ...

    constexpr string(char const (&str) [Size+1])
        : string(str, detail_::make_indices(boost::mpl::int_<Size>()))

    constexpr string() : arr{} {}

    char arr[Size+1];

    template < int I, int ... Indices >
    constexpr string(char const (&str) [I], detail_::indices<Indices...>)
        : arr{str[Indices]..., '\0'}

Line 6 here is where interesting things start. Normally you’d not take a c-style string as an array but instead as a pointer to char. Here though we want the size of the array and in C++ strings in quotes are actually static arrays; they have a size we can fetch through the type system. To fetch that size we simply make it a template parameter. The rest is funky syntax to take an array by reference.

On line 7 we use the new delegating constructor syntax to defer to a private constructor that uses the indexing trick. So we call make_indices as the second argument to that constructor.

Line 15 begins the other side of the indexing trick. Here we say that we expect a pack of integers. On line 16 we accept the type in which that pack will expand, thus giving us the pack as part of the deduction system. Finally on line 17 we finish the indexing trick and use it to initialize our internal array.

The default constructor should probably only be available when Size==0 but I’m being lazy here. You could use enable_if to fix this or create a specialization for string<0>.

The final thing here though is that it would be nice to have a helper function so we don’t need to count characters and specify the size. It’s pretty darn easy:

template < int Size >
constexpr auto make_string(char const (&cstr) [Size])
    return string<Size-1>(cstr);

Already we have a lot of power. We can define constexpr strings quite easily and we can demonstrably use them in places that require constant expressions:

    constexpr auto s0 = make_string("hello");
    BOOST_CHECK_EQUAL(s0.c_str(), "hello");

    using c = boost::mpl::char_<s0.c_str()[2]>;
    using s = boost::mpl::int_<s0.size()>;

    BOOST_CHECK_EQUAL(c::value, 'l');
    BOOST_CHECK_EQUAL(s::value, 5);


I’m going to skip over the definitions for push_front and push_back and instead just talk about append. It’s the more thorough example, leaving the other two fairly easy to understand once you can follow through append.

Like the constructor we need to use the indexing trick:

template < int Size >
struct string
    // ...
    template < int I >
    constexpr string<Size+I-1> append(char const (&cstr)[I]) const
        return append( detail_::make_indices(boost::mpl::int_<Size>())
                     , detail_::make_indices(boost::mpl::int_<I-1>())
                     , cstr );
    // ...
    char arr[Size+1];

    template < int ... ThisIndices
             , int ... ThatIndices
             , int I >
    constexpr string<Size+I-1> append( detail_::indices<ThisIndices...>
                                     , detail_::indices<ThatIndices...>
                                     , char const (&cstr) [I] ) const
        char const newArr[] = {arr[ThisIndices]..., cstr[ThatIndices]..., '\0'};
        return string<Size+I-1>(newArr);

It’s kind of funky looking but there’s no reason to pass on arr because we already have it. So we pass on the indices to reference both arrays and a reference to the array we want to append. We can’t just create a class variable or type to hold pre-created indices because the pack part is essential to the indexing trick and that is only available if we make it part of the function signature.

The interesting part here is on line 23 where the new c-string is created by merging the two inputs. Using the indexing trick again here twice–we can have two packs because they resolved inside clear boundaries–we just use array initialization to do so, just as with the constructor.

There’s an alternative to the newArray method. It requires a constructor that takes individual chars and it must be a variadic template. Because of this requirement I decided I didn’t much care for that constructor and so did it this way. You can see the difference in the github change history for string.hpp.

That’s basically the whole bit. It’s nice to have operators though:

template < int I0, int I1>
constexpr string<I0+I1> operator + (string<I0> const& s0, string<I1> const& s1)
    return s0.append(s1);

And now we have compile-time string processing! Check out the unit test:

    constexpr auto s0 = '[' + make_string("hello") + ' ' + make_string("world") + ']';

    BOOST_CHECK_EQUAL(s0.c_str(), "[hello world]");

That test uses some additional operations (push_front and push_back) that are not significantly different from the append operation.


There are a few limitations to this construct. Using them in constexpr scope will eventually hit a wall of recursion limits. I’ve found though that this is much, much greater than the MPL string.

Another limitation is that even in constexpr functions parameters are never constexpr. Thus you can’t do anything like this:

template < int I >
constexpr void fun(string<I> str)
    using illegal = boost::mpl::char_<str.c_str()[0]>;

A final limitation is that different strings of the same length are not distinct types. This construct would therefor not be sufficient to implement the keys in my json library. Furthermore, because of the aforementioned issue where function parameters are never constexpr, there’s really no great way to turn them into types. Anything that might would necessarily follow the same sort of structure as Abel Sinkovics and Dave Abrahams’s version–namely it would need macros. However, for the purposes discussed in that article, such as creating a compile-time regex generator, this constexpr string type would be sufficient I’d think.


So, why would you use such a thing? I have no really great plan to. This was done more or less because I could and because the methods used are interesting. I’ve not initiated anything new here, but I’ve tied together a few different techniques. This string type also might be practically useful for something like the tasks mentioned in Abel Sinkovics and Dave Abrahams’s article.

As a test I did implement fizzbuzz as a constexpr string generator. I indeed hit a recursion limit wall but was able to work around it by making the string slightly shorter. It thus goes all the way from 1-100, unlike my previous attempts to create one with the MPL string where the upper limit was closer to 20 and required increasing some numeric defines before including MPL. It might be possible to remove some recursion by mixing constexpr and template metaprogramming–for example implement the index trick more like was done on the C++ Lounge site.

Obviously to make this construct practically useful in a non-painful way you’d want to add many omitted functions, the most glaring of which is an index operator.

The code for this blog is available on my github if you’ve not noticed already.

Sane C++

Posted in Uncategorized on October 9, 2014 by Crazy Eddie

I’ve created a second C++ blog that will be more geared toward practical, everyday programming. I’ll continue using this one as a place to discuss more advanced or, “just because I can,” topics.

Crazy Eddie’s Sane C++

The first post went up today. It is a brief introduction to unit testing using Boost.Test.

Introduction to unit testing

From the about:

This blog is a compliment to my other blog, Crazy Eddie’s Crazy C++. There I focus on more advanced subjects, like template metaprogramming, that are not going to be what you want to be writing all the time (or at all) because it can be tough to maintain. In this blog I focus on more practical–boring if you will–subjects you can and should use in your everyday C++ development. In Crazy C++ you can find tricks that respond to hard problems or just exercise the language; here I intend all content to be accessible and usable.

I assume some familiarity with the C++ language. Perhaps you’ve taken a college course or two, or maybe you’ve read a book and written a few exercise programs. You might even be interning or in your first few years of professional software development. Perhaps some of my posts here will give the occasional insight to an experienced software developer well versed in C++, but for the most part I intend to address those people who wish to go from barely knowing anything to being able to write maintainable C++ code. Expect to see a lot of simple idioms and pattern application.

You can find code examples for most posts in this blog at my github: Sane CPP.

Near future installments will include subjects like pimpl, NVPI, RAII…and whatever.

On another issue: I am currently on sabbatical but am on the lookout for potential employers as well. My linkedin profile serves as a resume; feel free to contact me if you have a position I could be interested in. In the meantime I hope to be more active in both of these blogs.


Posted in Uncategorized on September 13, 2014 by Crazy Eddie

This article briefly describes a simple little extension of the pimpl idiom that I’ve called “pimplcow”.  It’s not a unique idiom, I know at least one other developer/group that has done the same thing.  I invented it independently though and at least for a while thought it uniquely mine.  Perhaps I can lay claim to the name :p

The basic gist of the idiom can be shown with shared_ptr:

struct object
    int get() const;
    void set(int);

    object(object const&);
    object& operator = (object);

    struct impl;
    std::shared_ptr<impl> pimpl_;

    impl * pimpl();
    impl const* pimpl() const;

The declaration of the class isn’t really anything new or interesting except perhaps for the pimpl function and the shared_ptr. The pimpl() function is actually useful outside of this idiom as it allows you to easily copy the constness of your object into the constness of your impl, which generally should match but pointers in C++ change things a bit. The implementation though is where things get a touch interesting:

struct object::impl { int i; };

object::object() : pimpl_(new impl()) {} // or make_shared
object::object(object const& other) : pimpl_(other.pimpl_) {} // shallow copy!
object & object::operator = (object other) { pimpl_.swap(other.pimpl_); }

int object::get() const { return pimpl()->i; }
void object::set(int i) { pimpl()->i = i; }

// Now for the tricky parts...

// For const operations we just return the pimpl we have.
object::impl const* object::pimpl() const { return pimpl_.get(); }

// For non-const operation we assume the object is about to be modified and we now copy the pimpl...maybe.
object::impl * object::pimpl()
    if (!pimpl_.unique()) pimpl_.reset(new impl(*pimpl_));
    return pimpl_.get();

This now implements the copy-on-write optimization in a VERY easy way. In fact, if you write your pimpl based classes in a sort of simple to follow standard way you end up having to change only two functions to switch between pimplcow and simple pimpl:

object::object(object const& other) : pimpl_(new impl(*other.pimpl_)) {}

object::impl * object::pimpl() { return pimpl_.get(); }

A further optimization recognizes that `shared_ptr` is actually a rather heavy-weight object when you know for certain you’re using shared pointer semantics, such as in the case of pimplcow. So instead you use an invasive reference count pointer such as `boost::intrusive_ptr`.

This construct is really simple and can save a lot of memory in some situations. It can also greatly reduce CPU use if your objects would otherwise be copied a lot but not changed. Yet further you can use value semantics instead of shared pointers to optimize against copying. It’s much, MUCH simpler to think about and use value types than shared pointers when the only thing you’re trying to do is avoid copies. Finally, in these conditions you must already be paying the cost of atomic increment/decrement in the reference count, which is the one thing you need to worry about here.

To close a bit about the reference count. You need to be weary of and measure the effect of atomic operations needed to increment and decrement the reference count for the shared pointer. If you’re making huge numbers of copies and/or destroying many values when you’ve not called a non-const member and thus doing the real copy, you can end up with a LOT of contention between CPUs. The pimplcow idiom is thread safe if you use a thread-safe reference counter, but in some cases it may actually be cheaper to make a real copy than to defer that copying and use a reference count. You’ll also want to avoid taking your parameters by value when you don’t need to because this will inject this atomic (or mutex lock) protected reference count manipulation, and that can indeed be a heavy cost in some cases.

Final close note: You can use this same sort of pattern to implement a flyweight quite easily. Use the usual pimplcow versions of copy and pimpl(), but additionally do a search into a flyweight manager to find an existing identity for the pimpl after you modify it. I have implemented both of these idioms in a sort of pimpl framework library in my github experiments repository: – I make no claim that this experiments repository will be in a useable state when/if you chose to clone it. In fact I know right now that the pimplvector stuff is completely broken.

Final, final note: This idiom/pattern is indeed thread safe, but only if values are not shared among threads. So long as each thread has its own “copy” of the value you need not worry about any race conditions. This makes the idiom easy to implement and use. If you on the other hand need to share single copies of a value in multiple threads then you need to add further syncronozitation. This is not really unusual and so I don’t consider it a flaw, but you do need to be aware of it because of course bad things will happen if you use it incorrectly and cause a race condition.

Beginning Template Meta-Programming: Introducing meta-functions

Posted in Template Metaprogramming on June 28, 2013 by Crazy Eddie

I intended this to be an article about “arbitrary binding” in template meta-programming, but I found that there was much I needed to introduce in order to do so and the article was becoming quite lengthy without actually providing much.  Thus I’ve decided to begin a series on template meta-programming that will introduce the topic and expand into more complex subjects, partial application being one of them.

Some history – expression templates

Meta-programming in C++ was pretty much a total accident.  The template system was defined without meta-programming in mind.  For many years in C++ development, and unfortunately still not as rare as it should be, developers were afraid of templates and in some cases avoided them entirely.  More than one team would have coding standards that dictated that no developer use templates.  The problem was that many compilers did not implement templates very well and in fact there are still issues with some of the more commonly used compilers.  If you chose to go down the road of becoming efficient at Template Meta-Programming (TMP) you are sure to run into a few compiler bugs.

Even so though, some brave souls embraced templates and came up with some very interesting constructs.  The STL is but one of the more fascinating examples.  Many of the ideas introduced there indicated that there was much potential for expressive power in templates.  One very early example of techniques that hinted at TMP though are expression templates.  If you search through the Internet for this term you’ll find a great many excellent articles and papers.  Expression templates are a powerful concept that can be used to reduce the amount of temporaries created during calculations of complex objects such as matrices.  The idea is that operators, such as addition, don’t actually perform addition at all but instead create an object that implements the functions that perform that task.  This is in some ways hinting at lazy evaluation except that (until C++11 changed the meaning of auto) nobody in their right mind wants to declare the type that would be necessary to store that lazy evaluator in a variable.  Instead, the act of assignment or access would perform all the calculations in the complete expression leaving out that which is unnecessary, encoding the evaluation of each cell into a single expression, and eliminating the need for temporary objects.  What follows is a very brief example (the rest of the operators are left as an exercise):

// represents a delayed addition operation.
// Represents a result but isn't itself one.
template < typename Right
         , typename Left >
struct matrix_addition
    matrix_addition( expr_matrix const& lh
                   , expr_matrix const& rh)
        : m1(lh)
        , m2(rh)

    int get_value(int x, int y) const
        return m1.get_value(x,y) 
             + m2.get_value(x,y)

    operator expr_matrix() const // perform the addition and return the result.
        expr_matrix tmp;
        for (size_t i = 0; i < 3; ++i)
            for (size_t j = 0; j < 3; ++j)
                tmp.assign_value(i,j, get_value(i,j));
        return tmp;

    expr_matrix const& m1;
    expr_matrix const& m2;

// override '+' to return addition result representation.
template < typename Right 
         , typename Left >
matrix_addition<Right,Left> operator + (Right const& rh, Left const& lh)
    return matrix_addition<Right,Left>(rh,lh);

    expr_matrix m1;
    expr_matrix m2;


    // Creates a full result through implicit casting.
    expr_matrix result = m1 + m2;


    // The only additions performed here are on [0,0].  None of the others are performed.
    // A complete summation of the matrixes used is not done.
    BOOST_CHECK((m1 + m2).get_value(0,0) * 2 == (m1 + m2 + result).get_value(0,0));

Expression templates are an example of an embedded Domain Specific Language (DSL). They turn expressions involving the matched types and operator overloads into a set of recursively defined types that represent an abstract syntax tree.

Our first “pseudo” meta-function

One of the first examples of a metafunction generally introduced is the factorial function.  I’m going to call this a pseudo metafunction because it doesn’t match any defined protocol for creating a consistent language for template metaprogramming.  As with functional programming in general, this compile time factorial calculator is recursive in nature.  In something like Haskel or Scala you would use pattern matching on the argument to invoke a different calculation for the argument should it be general or should it be your base case.  In TMP you do this with template specialization:

template < size_t I > // general case
struct pre_tmp_fact
    // recursively invoke pre_tmp_fact
    // and multiply the result with the current value.
    enum { value = I * pre_tmp_fact<I-1>::value };

template < > // base case specialization
struct pre_tmp_fact<0>
    // 0! == 1
    enum { value = 1 };

    BOOST_CHECK(pre_tmp_fact<5>::value == 5 * 4 * 3 * 2);

Defining a protocol

In C++ Template Metaprogramming, Abrahams and Gurtovoy define a protocol for implementing a consistent language for template metaprogramming in C++.  That protocol involves three basic definitions:

A meta-function is a template that takes types as parameters and ‘returns’ a type in the form of an internal type typedef.
Value Meta-Function
Value meta-functions are metafunctions that represent values. They have all the characteristics of a meta-function and also an internal value member. They are “identity meta-functions” in that they return themselves as their type. If you create instances of them they are generally convertible to the underlying raw value type they represent, with the appropriate value of course.
Meta-Function Class
Since a meta-function is not a type it cannot be passed as a parameter to another meta-function. Therefor an additional concept, that of meta-function class, needs to be defined. It is a type that has an internal meta-function called apply.

Our first meta-function

With the above in mind, we can recreate our factorial metafunction using these concepts.  In this article I’ll implement very basic, minimally functional integer wrapper meta-functions.  In the future I will use the Template Metaprogramming Library (MPL), which is part of the boost library set.  It’s worth reviewing here, for instruction purposes, but it’s generally easy to use what other people have built and tested than reinventing the wheel.  This time though, we’ll reinvent the wheel, we just won’t add all the spokes and will not true it:

// integer value wrapper metafunction.
template < int I >
struct int_
    typedef int_<I> type;
    enum : int { value = I };

    // Has a value...
	BOOST_CHECK(int_<5>::value == 5);
	// Is an identity metafunction...
	BOOST_CHECK(int_<5>::type::value == 5);
	// This recursion is OK because the compiler won't instantiate until requested.
	// Can go on forever.
	BOOST_CHECK(int_<5>::type::type::value == 42);

// This is not how boost's MPL does it!

// Forward declaration of metafunction
template < typename T >
struct prev;

// Specialize for int_ wrappers.
template < int I >
struct prev<int_<I>>
    typedef int_<I-1> type;
    enum { value = int_<I-1>::value };

    BOOST_CHECK(prev<int_<5>>::value == 4);

// General case.
template < typename T >
struct tmp_fact
    typedef tmp_fact<T> type;
    enum { value = T::value * tmp_fact<typename prev<T>::type>::value };

// Specification for base case.
template < >
struct tmp_fact<int_<0>>
    : int_<1> // You can subclass the result sometimes, to be preferred.

    BOOST_CHECK(tmp_fact<int_<5>>::value == 5 * 4 * 3 * 2);

Moving forward

If you are interested in continued study of template meta-programming in C++ then I suggest purchasing the book, C++ Template Metaprogramming.  I will also be continuing forward with some other instructional material in future blogs, including implementing named parameters and partial evaluation (arbitrary binding) in templates.  This has been but a rudimentary look at the very basics that will be needed by future blog entries.

I might suggest also playing with the above code and trying to improve it. There are a great many ways in which it can be. The factorial metafunction for example could use tail recursion and could even use subclassing all the way through. It is much easier to read and work on a metafunction that can be defined in terms of other metafunctions through simple inheritance, so unless you must do otherwise to leverage lazy evaluation (any time you can avoid explicitly refering to type within a metafunction you’re doing lazy evaluation) it should be preferred. Often this requires decomposition of more complex metafunctions into smaller, simpler ones. This is of course, like with your usual code, never a bad thing.

GotW #91

Posted in C++11, GotW on May 31, 2013 by Crazy Eddie

I can’t seem to post to Herb Sutter’s blog, my comments just never appear and ones I’ve made are gone.  Maybe he has me blocked for some mysterious reason.  So I thought I’d put my answer to the problems here.

Anyone that wants to write solid C++ should be reading his blog.  I suggest subscribing.

So, my answer to GotW 91 is:

#1 – Some extra locking and reference counting will take place. A new small object will be created on the stack. Not much will happen actually. The locking is the greatest concern.

#2 – I don’t believe it matters until threads are introduced. If you call this function via a thread spawning though you want to be sure that you do so by value. Doing it by reference can create a race condition in which you reduce the reference count while the pointer is being used or copied by the other side. When that happens it could be that you’re the last one holding the pointer and then the other side gets to have it’s resource yanked out from under it. This is bad voodoo.

#3 has made me want to come up with a set of smart pointer types that secure and ensure correct ownership semantics. In this case you’d take a pointer (by value) of the kind that specifies how you’ll use it. If only within scope then you’d specify that with type. It’s a pretty tough problem though…at least for me.

The answer to the question is one of three (given the types in the question):

1. Use reference – when in doubt, use a reference rather than pointer. If you don’t need a pointer, don’t use one.
2. Use unique_ptr by value – You intend to steal ownership of the pointer and you want this ownership transfered away from client code immediately.
3. The parameter could be null but you don’t want to steal it.  Use a raw pointer (for now at least…if I ever solve the above idea then you may have a smart pointer to use instead)

A case against naming conventions based on type

Posted in Uncategorized with tags on May 17, 2013 by Crazy Eddie

Many places I have worked have naming conventions that are, in my opinion, poorly conceived.  Naming conventions really should be limited to deciding between camel case, pascal case, and simply, “Give it a name that explains what it is supposed to be for.”  Unfortunately a lot of places go much further and do things like putting ‘m_’ before member variables, having different naming conventions for different classes of entities (Pascal for classes, underscore for functions, camel for variables, _t after typedefs, etc…).  These kinds of conventions miss the point of generic and object oriented programming and thus miss the point of C++ in a great way.

The issue is that what kind of thing an entity is doesn’t really matter so long as a certain interface is obeyed.  Take for example putting the “_t” postfix on typedefs, which many developers and firms do.  What happens now when someone comes along and decides that this type would be better implemented as a class of its own rather than a typedef of something else.  Since typedefs in C++ are weak this is not a surprising or all to rare thing to do.  Now in order to keep from breaking clients everywhere we need to follow the convention for naming classes and then add a typedef so that the “_t” name exists.  This is silly.

Another example comes up with regard to functors.  Functors are classes thats’ instances look and behave like functions that may have state.  If you’ve specified that variables must be named one way, and functions another, then in fact you’ve got a problem here because conceptually this variable is both a variable and a function!  If you replace a function with a functor someplace then again you are in the unfortunate position of having to violate naming conventions or do something silly like make a function to call the functor just so both names exist.

The C++ standard library has a pretty good C++ naming convention.  It uses one convention for almost everything with the caveats that template parameters are named differently (and this is important because of dependent name issues) and so are macros.  Both of these things require special consideration when using so it’s worthwhile to make them stand out.

In C++ types are strong at the time of compilation, but in the source they are interchangeable to a great degree.  So long as something obeys an interface, calling code shouldn’t have to change when you completely change the type some entity is.  Naming conventions that force you to change names based on type hinder this ideal.

CUDA without Visual Studio

Posted in Uncategorized on February 11, 2013 by Crazy Eddie

So I have a solid-state drive that I put my system on. It is only 55G in size and Windows 7 itself takes up over half of that, that’s before installing ANYTHING. Visual Studio takes up around 9G of drive space and though you can ask it to install on another drive, that only moves 1G…the rest INSISTS on being on your C drive.

I wanted to try out cuda development though and I don’t currently have a Linux install in anything but a VM (that will be remedied eventually). The CUDA development kit requires VS though.

It is possible though to get the compiler for VS only by downloading the SDK. The SDK can mostly be installed on another drive. Though there is stuff that insists on being on your system drive, it’s more like .5G rather than 6+. Even stuff that VS installs there and won’t anywhere else can be put on another drive this way. Of course, you won’t have the debugger or the idea…but in a pinch it works.

The next thing I had to argue about is getting the CUDA compiler part to recognize the configuration. I got a weird error saying it couldn’t find configuration “null” in the amd64 directory within bin of the SDK install. That is easily fixed by creating a file called “vsvars64.bat” in that directory that has a single line in it: “call drive:\path_to\SetEnv.cmd /x64″. The ‘path_to’ part will depend on where you installed the SDK.

Then you have to close that cmd window and start again for some reason–talking about the “Windows SDK Command Prompt” from the start menu. After this you can set the PATH to include your cudapath\bin and cudapath\open64\bin. You need to be able to run `nvcc` and `nvopencc`.

Once all this was done I was able to compile a basic cuda program from the command line with: `nvcc –cl-version 2010 –use-local-env`.

This took hours of pain, google research and forum posting. Hopefully the next person in my shoes can find this and it helps. Look forward to hearing success stories and otherwise.


Get every new post delivered to your Inbox.

Join 26 other followers