C++11


This is an old revision of this page, as edited by Rjwilmsi (talk | contribs) at 22:02, 24 September 2007 (Typo & format fix , typos fixed: refering → referring using AWB). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.


C++0x is the planned new standard for the C++ programming language. It is intended to replace the existing C++ standard, ISO/IEC 14882, which was published in 1998 and updated in 2003. These predecessors are informally known as C++98 and C++03. The new standard will include several additions to the core language and will extend the C++ standard library, incorporating most of the C++ Technical Report 1 libraries, most likely with the exception of the numerical libraries. Since the standard is not yet finalized, this article may not reflect the most recent state of C++0x. The up-to-date state of the next C++ standard is published on the ISO C++ committee website. The most recent report, N2336, was published in July 2007.

The ISO/IEC JTC1/SC22/WG21 C++ Standards Committee aims to introduce the new standard in 2009 (hence the standard that is today called C++0x will become C++09), which means that the document must be ready for ratification by the member states of ISO in 2008. To be able to finish on schedule, the Committee decided to focus its efforts on the solutions introduced up until 2006 and to ignore newer proposals [1].

Programming languages such as C++ use an evolutionary process to allow programmers to program faster, more elegantly, and in a manner that produces maintainable code. This process inevitably raises compatibility issues with existing code, which has happened occasionally during the C++ development process. However, according to the announcement made by Bjarne Stroustrup (inventor of the C++ language and member of the committee), the new standard will be almost 100% compatible with the current standard [2].

Candidate changes for the impending standard update

As mentioned, the modifications for C++ will involve both the core language and the standard library.

In the development of every utility of the new standard, the committee has applied some directives:

  • Maintain stability and compatibility with C++98 and possibly with C;
  • Prefer introduction of new features through the standard library, rather than extending the core language;
  • Prefer changes that can evolve the programming technique;
  • Improve C++ to facilitate systems and library design, rather than to introduce new features only useful to specific applications;
  • Increase type safety by providing safer alternatives to current, unsafe techniques;
  • Increase performance and the ability to work directly with hardware;
  • Provide proper solutions for real world problems;
  • Implement the “zero-overhead” principle (additional support required by some utilities must be used only if the utility is used);
  • Make C++ easy to teach and to learn without removing any utility needed by expert programmers.

Attention to beginners is important, because they will always comprise the majority of computer programmers, and because many beginners do not intend to extend their knowledge of C++, limiting themselves to operate in the fields in which they are specialized [1]. Additionally, considering the vastness of C++ and its usage (including areas of application and programming styles), even the most experienced programmers can become beginners in a new programming paradigm.

Extensions to the C++ core language

The main focus of the C++ committee is the development of the language core. The presentation date of C++0x depends on the progress of this part of the standard.

C++ is often criticized for the unsafe management of data types. Even with this standard, C++ cannot become completely safe in type management (like Java), because this would involve the elimination of uninitialized pointers and the consequent loss of identity of the whole language. Despite this, many people insist that C++ include some mechanism for the safe management of pointers. For this reason, the new standard will provide support for smart pointers, but only through the standard library.

Areas where the C++ core will be significantly improved are multithreading support, generic programming support, and more flexible constructor and initialization mechanisms.

Multitasking utilities

The C++ standard committee plans to introduce some tools for multiprocessing and multithreaded programming.

At this stage, full support for multiprocessing appears too dependent on the operating system used, and too complex to be resolved only through an extension of the core language. The common understanding is that multiprocessing support should be provided through a high-level library rather than a low-level one (with potentially dangerous synchronization primitives): the Committee does not want to push programmers toward a potentially dangerous standard library in place of a secure but non-standardized one.

In the next standard some utilities for multi-threaded programming will be added to C++, while development of a thread library remains a lower priority for the committee [1].

Parallel execution

Although the proposal has yet to be defined, a mechanism that implements the cobegin and coend construct using threads is likely to be introduced in the next standard (or in a later one).

For the time being, the keyword used for the new notation is not important. Instead, it is important to know how simple it will be to implement parallel executable instruction blocks.

active
{
  // First block.
  {
    // ...
  }
  // Second block.
  for( int j = N ; j > 0 ; j-- )
  {
    // ...
  }
  // Third block.
  ret = function(parameter) ;
  // Other blocks.
  // ...
}

All blocks are executed in parallel. After the execution of every block has ended, the program continues with a single thread.
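
The behaviour described above can be approximated with library threads. The following is only a sketch: it assumes a library that provides std::thread in <thread> and a compiler that accepts lambda expressions, neither of which is part of the proposal text, and run_blocks_in_parallel is a hypothetical name.

```cpp
#include <cassert>
#include <thread>

// Runs three independent blocks in parallel and returns their combined result.
int run_blocks_in_parallel(int n)
{
  assert(n >= 0);
  int first = 0, second = 0, third = 0;  // each block writes only its own variable

  // First block.
  std::thread block1([&first] { first = 1; });
  // Second block.
  std::thread block2([&second, n] {
    for (int j = n; j > 0; j--)
      second += j;
  });
  // Third block.
  std::thread block3([&third, n] { third = n * 2; });

  // The equivalent of "coend": wait for every block, then continue on one thread.
  block1.join();
  block2.join();
  block3.join();

  return first + second + third;
}
```

Each block touches only its own variable, so no further synchronization is needed before the joins.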

Asynchronous function call

Another proposal that has not yet been fully defined, and that could be introduced in the next standard, is a mechanism for asynchronous function-like calls based on the concept of a future.

The syntax will probably be similar to the following example:

int function( int parameter ) ;
// Calls the function and immediately returns.
IdThreadType<function> IdThread = future function(parameter) ;
// Waits for the function result.
int ret = wait IdThread ;
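
The future and wait keywords above are proposal syntax only. A library-based sketch of the same call-then-wait pattern is given here, assuming a library that provides std::async and std::future in <future> (names not taken from the text above); slow_square and compute_async are hypothetical.

```cpp
#include <cassert>
#include <future>

// Hypothetical worker standing in for "function" in the example above.
int slow_square(int parameter)
{
  return parameter * parameter;
}

// Starts the call asynchronously, then waits for and returns its result.
int compute_async(int parameter)
{
  // Launch the call; control returns immediately with a handle to the result.
  std::future<int> pending = std::async(std::launch::async, slow_square, parameter);
  assert(pending.valid());  // the handle refers to a result until get() is called

  // ... the caller could do unrelated work here ...

  // Block until the worker has finished, then fetch its return value.
  return pending.get();
}
```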

Thread-local storage

In a multi-threaded environment, every thread must often have unique variables. This already happens for the local variables of a function, but it does not happen for global and static variables.

A new storage class specifier, in addition to the existing ones (extern, static, register, auto and mutable), has been proposed for the next standard: thread-local storage. The new specifier would be called simply thread (at the moment the official documentation uses the name __thread).

A thread object is similar to a global object. Global objects have a lifetime that covers the whole execution of the program, while thread objects have a lifetime limited to a single thread. At the end of thread execution the thread objects are rendered inaccessible. Like any other static-duration variable, a thread object can be initialized using a constructor and destroyed using a destructor.
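
A minimal sketch of one-copy-per-thread behaviour follows. It assumes the specifier is spelled thread_local and that std::thread is available in <thread>; both are assumptions, since the text above names the specifier thread or __thread. The function names are hypothetical.

```cpp
#include <cassert>
#include <thread>

// One independent copy of this counter exists for each thread.
thread_local int calls_on_this_thread = 0;

// Increments the calling thread's own counter and returns its new value.
int record_call()
{
  return ++calls_on_this_thread;
}

// Makes two calls on a helper thread and returns the count that thread observed.
int calls_seen_by_helper_thread()
{
  int seen = 0;
  std::thread helper([&seen] {
    record_call();
    seen = record_call();  // second call on the helper's own copy
  });
  helper.join();
  assert(seen == 2);  // the helper started from its own zero-initialized copy
  return seen;
}
```

Calls made on the main thread before or after the helper runs do not affect the helper's count, and vice versa.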

Atomic operations

Often, a thread needs to perform a task without being interrupted. For example, a thread might require exclusive access to a global variable or real-time access to a peripheral.

To perform such an atomic operation, a new keyword atomic has been proposed:

atomic
{
  // Atomic operations.
  ...
}

The volatile modifier

The modifier volatile informs the compiler that a variable’s value can be modified at any moment. In other words, accesses and modifications to a variable with this modifier must go directly to the variable's location in memory, and cannot take advantage of temporary storage in a CPU register. Furthermore, compilers cannot strip seemingly redundant reads from or writes to volatile variables. Other code segments and external hardware may attempt to read or write the variable's value concurrently. In the absence of the volatile keyword, compilers assume that a variable will maintain its value unless it receives an assignment in the current scope, allowing the generally desirable optimization known as dead code elimination.

An earlier proposal for C++0x included using the volatile keyword for communication between threads. Under this proposal, the compiler would have guaranteed thread-safety for concurrent reading and writing of volatile objects. This would, in effect, cause compilers to emit thread-synchronization code for accessing or modifying co-modified objects. However, this solution appears inappropriate if applied to complex objects and would have left no way to express the current semantics. The meaning of volatile will not change; instead, specialised atomic types and operations will be available.
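
What a specialised atomic type could look like is sketched below, assuming the std::atomic template in <atomic> and std::thread in <thread>; these names are assumptions and do not come from the text above. The counter and function names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Shared counter; each increment is an indivisible read-modify-write.
std::atomic<int> hits(0);

// Adds `count` increments to the shared counter.
void add_hits(int count)
{
  for (int i = 0; i < count; ++i)
    hits.fetch_add(1);
}

// Runs two writers concurrently and returns the final counter value.
int count_concurrently(int per_thread)
{
  assert(per_thread >= 0);
  hits.store(0);
  std::thread first(add_hits, per_thread);
  std::thread second(add_hits, per_thread);
  first.join();
  second.join();
  return hits.load();
}
```

With a plain int, or a volatile int under its current meaning, the two writers could lose updates; with the atomic type no increment is lost.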

Class features

Classes are an important feature of C++. The C++0x changes to class features focus on making class hierarchies easier to work with and less fragile.

Constructor delegation

In standard C++, constructors are not allowed to call other constructors - each constructor must construct all of its class members itself, which often results in duplicate initialization code.

C++0x will allow constructors to call other peer constructors (known as delegation). This will allow constructors to utilize another constructor's behavior with a minimum of added code. Other languages, such as Java, provide this.

The syntax will be:

class SomeType
{
  int number;

public:
  SomeType(int newNumber) : number(newNumber) {}
  SomeType() : SomeType(42) {}
};

This comes with a caveat: C++03 considers an object to be constructed when its constructor finishes executing, but C++0x will consider an object constructed once any constructor finishes execution. Since multiple constructors will be allowed to execute, this will mean that each delegate constructor will be executing on a fully-constructed object of its own type. Derived class constructors will execute after all delegation in their base classes is complete.
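
A slightly larger sketch of delegation, using a hypothetical Connection class, shows several constructors funnelling into one target constructor so that member initialization is written only once:

```cpp
#include <cassert>

class Connection
{
  int port_;
  int retries_;

public:
  // Target constructor: the only place where members are initialized.
  Connection(int port, int retries) : port_(port), retries_(retries)
  {
    assert(retries >= 0);
  }

  // Delegating constructors: each reuses the target, then may run its own body.
  explicit Connection(int port) : Connection(port, 3) {}
  Connection() : Connection(80) {}

  int port() const { return port_; }
  int retries() const { return retries_; }
};
```

A delegating constructor cannot also list member initializers; the delegation must be the only entry in its initializer list.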

Constructor inheritance

In standard C++, when a derived class is created from a base class, the derived class has an entirely separate set of constructors. The derived class may call base class constructors and forward parameters to them, but the user must create each constructor separately. Since this is merely rote work, it can be prone to causing user error.

C++0x will allow a class to specify that constructors will be forwarded. This means that the C++0x compiler will generate code to perform the forwarding of the call. Note that this is an all-or-nothing feature; either all of that base class's constructors are forwarded or none of them are. Also, note that there are restrictions for multiple inheritance, such that you cannot forward from two classes that use constructors with the same signature. Nor can you later declare a constructor with a signature that matches a forwarded constructor.

The syntax will be as follows:

class BaseClass
{
public:
  BaseClass(int iValue);
};

class DerivedClass : public BaseClass
{
public:
  using BaseClass::BaseClass;
};
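
As a usage sketch with hypothetical Shape and Polygon classes, every constructor of the base becomes available for the derived class through a single using-declaration:

```cpp
#include <cassert>
#include <string>

class Shape
{
  std::string name_;
  int sides_;

public:
  explicit Shape(const std::string& name) : name_(name), sides_(0) {}
  Shape(const std::string& name, int sides) : name_(name), sides_(sides)
  {
    assert(sides >= 0);
  }

  const std::string& name() const { return name_; }
  int sides() const { return sides_; }
};

class Polygon : public Shape
{
public:
  // Make every Shape constructor usable to construct a Polygon.
  using Shape::Shape;
};
```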

Member initializers

In standard C++, members of classes are initialized only in constructors. C++0x will allow the following syntax for initializing members:

class SomeClass
{
private:
  int iValue = 5;
};

Any constructor will initialize iValue with 5 if the constructor does not override the initialization with its own, as follows:

class SomeClass
{
public:
  SomeClass() {}
  explicit SomeClass(int iNewValue) : iValue(iNewValue) {}

private:
  int iValue = 5;
};

The empty constructor will initialize iValue as the class states, but the constructor that takes an int will initialize it to the given parameter.
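
The same rule applies member by member. In this sketch, with a hypothetical Settings class, one constructor overrides a single default and leaves the other untouched:

```cpp
#include <cassert>
#include <string>

class Settings
{
public:
  Settings() {}  // uses both defaults
  explicit Settings(int volume) : volume_(volume)  // overrides volume_ only
  {
    assert(volume >= 0);
  }

  int volume() const { return volume_; }
  const std::string& theme() const { return theme_; }

private:
  int volume_ = 5;
  std::string theme_ = "light";
};
```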

Allow 'sizeof' to work on members of classes without an explicit object

In standard C++, the sizeof operation can be used on types and objects. But it cannot be used to do the following:

struct SomeType { OtherType member; };

sizeof(SomeType::member); //Does not work.

This should return the size of OtherType. C++03 does not allow this, so it is a compile error. C++0x will allow it.
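
A small sketch with a hypothetical Header struct follows. It assumes a compiler that implements this change and also the static_assert declaration, which is used here only to state the expected sizes:

```cpp
#include <cstddef>

struct Header
{
  char tag[4];
  double payload;
};

// No Header object is needed to ask for the size of one of its members.
const std::size_t tag_size = sizeof(Header::tag);
const std::size_t payload_size = sizeof(Header::payload);

static_assert(sizeof(Header::tag) == 4, "tag is an array of four chars");
static_assert(sizeof(Header::payload) == sizeof(double),
              "payload has the size of its declared type");
```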

Defaulting/deleting of standard functions on C++ objects

In standard C++, the compiler will provide, for objects that do not provide for themselves, a default constructor, a copy constructor, a copy assignment operator operator=, and a destructor. As mentioned, the user can override these defaults by defining their own version. C++ also defines several global operators (such as operator, and operator new) that work on all classes, which the user can override.

The problem is that there are very few controls over the creation of these defaults. Making a class inherently non-copyable, for example, requires declaring a private copy constructor and copy assignment operator and not defining them. Any code that tries to copy such an object will then fail with a compiler or linker error. However, this is not an ideal solution.

Further, in the case of the default constructor, it is useful to be able to tell the compiler explicitly to generate it. The compiler will not generate a default constructor if the object is defined with any constructors. This is useful in many cases, but it is also useful to be able to have both a specialized constructor and the compiler-generated default.

C++0x will allow the explicit use, or disuse, of these standard object functions. For example, the following type explicitly declares that it is using the default constructor:

struct SomeType
{
  SomeType() = default; //The default constructor is explicitly stated.
  SomeType(OtherType value);
};

Alternatively, certain features can be explicitly disabled. For example, the following type is non-copyable:

struct NonCopyable
{
  NonCopyable & operator=(const NonCopyable&) = delete;
  NonCopyable(const NonCopyable&) = delete;
  NonCopyable() = default;
};

A type can be made impossible to allocate with operator new:

struct NonNewable
{
  void *operator new(std::size_t) = delete;
};

This object can only ever be allocated as a stack object or as a member of another type. It cannot be directly heap allocated without non-portable trickery. (Since placement new is the only way to call a constructor on user-allocated memory and this use has been forbidden as above, the object cannot be properly constructed.)
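
The effect of these declarations can be checked at compile time. The sketch below assumes the type traits in <type_traits> and the static_assert declaration, neither of which is described above; Token is a hypothetical type.

```cpp
#include <cassert>
#include <type_traits>

struct Token
{
  Token() = default;  // keep the compiler-generated default constructor
  explicit Token(int id) : id_(id) { assert(id >= 0); }
  Token(const Token&) = delete;  // copying is a compile-time error
  Token& operator=(const Token&) = delete;

  int id() const { return id_; }

private:
  int id_ = 0;
};

// The traits report what the declarations above allow.
static_assert(std::is_default_constructible<Token>::value,
              "default constructor is available");
static_assert(!std::is_copy_constructible<Token>::value,
              "copy constructor is deleted");
static_assert(!std::is_copy_assignable<Token>::value,
              "copy assignment is deleted");
```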

Explicit conversion operators

Standard C++ added the explicit keyword as a modifier on constructors to prevent single-argument constructors from being used as implicit type conversion operators. However, this does nothing for actual conversion operators. For example, a smart pointer class may have an operator bool() to allow it to act more like a primitive pointer: if it includes this conversion, it can be tested with if(smart_ptr_variable) (which would be true if the pointer was non-null and false otherwise). However, this allows other, unintended conversions as well. Because C++ bool is defined as an arithmetic type, it can be implicitly converted to integral or even floating-point types, which allows for mathematical operations that are not intended by the user.

In C++0x, the explicit keyword can now be applied to conversion operators. As with constructors, it prevents the operator from being used in implicit conversions.
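
A minimal sketch with a hypothetical IntHandle wrapper: the test in an if statement still compiles, while implicit conversions to bool or int do not. The <type_traits> checks and nullptr are assumed to be available and are not part of the description above.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical minimal pointer wrapper, for illustration only.
class IntHandle
{
  int* target_;

public:
  explicit IntHandle(int* target) : target_(target) {}

  // Usable in conditions such as if (handle), but not in arithmetic.
  explicit operator bool() const { return target_ != nullptr; }
};

// Returns 1 if the handle is non-null, 0 otherwise.
int is_set(const IntHandle& handle)
{
  if (handle)  // contextual conversion to bool is still allowed
    return 1;
  return 0;
}

// Implicit conversions, e.g. "int n = handle + 1;", no longer compile.
static_assert(!std::is_convertible<IntHandle, bool>::value,
              "no implicit conversion to bool");
static_assert(!std::is_convertible<IntHandle, int>::value,
              "no implicit conversion to int");
```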

Operator overloading

C++0x will allow the overloading of the following additional operators:

  • static_cast<>
  • dynamic_cast<>
  • const_cast<>

Additionally, the operators which could only be defined as class members ([], (), operator=, and operator->) may now be defined globally.

The new and delete operators can be defined globally such that they operate on a single class, thus allowing the class-overloading of these operators without changing the definition of the class.

Initializer lists

Standard C++ borrows the initializer list concept from C. The idea is that a struct or array can be created by giving a list of arguments in the order of the members' definitions in the struct. These initializer lists are recursive, so an array of structs or a struct containing other structs can use them. This is very useful for static lists or just for initializing a struct to a particular value. C++ has constructors, which can replicate the initialization of an object. But that alone does not replace all of the utility of this feature. Standard C++ allows this on structs and classes, except that these objects must conform to the Plain Old Data (POD) definition; non-POD classes cannot use initializer lists, nor can useful C++-style lists like std::vector and boost::array.

C++0x will allow any object to be able to construct itself from an initializer list. This is provided through the definition of a "sequence constructor". This is a constructor of the form:

class SequenceClass
{
public:
  SequenceClass(std::initializer_list<int> list);
};
 

This will allow SequenceClass to be constructed from a sequence of integers, as such:

SequenceClass someVar = {1, 4, 5, 6};

Notice that this does not allow for struct-like initialization. A sequence constructor takes a sequence that is all of the same type; this is for array-style initialization. Arrays of classes, as in a std::vector<SomeType> would be as follows:

std::vector<SomeType> someVar = {SomeType(), SomeType(4, 5), SomeType(6,7)};

Arrays of arrays can be initialized by defining the constructor to take std::initializer_list<std::initializer_list<TheType>>.
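
A sketch of a complete sequence constructor, using a hypothetical Samples class that stores the list in a std::vector, assuming the <initializer_list> header:

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

class Samples
{
  std::vector<int> values_;

public:
  // Sequence constructor: accepts any number of ints in braces.
  Samples(std::initializer_list<int> list) : values_(list.begin(), list.end()) {}

  std::size_t count() const { return values_.size(); }

  // Sum of all stored values.
  int total() const
  {
    int sum = 0;
    for (std::size_t i = 0; i < values_.size(); ++i)
      sum += values_[i];
    return sum;
  }
};
```

The list itself is only a view of the braced values, which is why the constructor copies them into its own container.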

Modification to the definition of plain old data

In standard C++, a struct must follow a number of rules in order for it to be considered a plain old data (POD) type. There are good reasons for wanting a number of types to fit this definition, as doing so causes implementations to produce object layouts that are compatible with C, as well as allowing the use of initializer lists. However, the list of rules in C++03 is very strict, more so than is necessary to gain the benefits of POD types.

C++0x will relax several rules with regard to the POD definition.

A class/struct is considered a POD if it is trivial, standard-layout, and it has no non-static members that are not PODs. A trivial class or struct is defined as one that:

  1. Has a trivial default constructor. This may use the aforementioned default constructor syntax (SomeConstructor() = default;).
  2. Has a trivial copy constructor, which may use the default syntax.
  3. Has a trivial copy assignment operator, which may use the default syntax.
  4. Has a trivial destructor, which may not be virtual.

A standard-layout class or struct is defined as one that:

  1. Has only non-static data members that are of standard-layout type
  2. Has the same access control (public, private, protected) for all non-static members
  3. Has no virtual functions
  4. Has no virtual base classes
  5. Has only base classes that are of standard-layout type
  6. Has no base classes of the same type as the first defined non-static member
  7. Either has no base classes with non-static members, or has no non-static data members in the most derived class and at most one base class with non-static members. In essence, there may be only one class in this class's hierarchy that has non-static members.
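
The two properties can be tested independently. The sketch below assumes the std::is_trivial and std::is_standard_layout traits in <type_traits>, which are not mentioned above, and uses three hypothetical structs, each of which satisfies a different combination of the rules:

```cpp
#include <type_traits>

struct Point  // trivial and standard-layout, so a POD
{
  int x;
  int y;
};

struct Mixed  // trivial, but its members differ in access control
{
  int visible;
protected:
  int hidden;
};

struct Tracked  // standard-layout, but its user-provided destructor is not trivial
{
  int id;
  ~Tracked() {}
};

static_assert(std::is_trivial<Point>::value && std::is_standard_layout<Point>::value,
              "Point is a POD");
static_assert(std::is_trivial<Mixed>::value && !std::is_standard_layout<Mixed>::value,
              "Mixed is only trivial");
static_assert(!std::is_trivial<Tracked>::value && std::is_standard_layout<Tracked>::value,
              "Tracked is only standard-layout");
```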

Template features

Much of the effort that has gone into the development of C++0x concerns its facilities for generic programming. The existing template mechanisms are very popular because they can provide code reuse, but have a number of drawbacks. One of the goals of C++0x was to strengthen templates and yet make them easier to use.

Aliasing of templates with keyword using (template typedefs)

Currently it is possible to create a template alias with typedef only if every template parameter is specified. It is not possible to create an alias that leaves some parameters open.

In some cases the problem can be worked around using default parameters; in other cases the only available shortcut is to create a new template that wraps the original one. But this does not really solve the problem. In the following example, the two types declared through the two aliasing techniques (generic_ifc_1 and generic_ifc_2) are not recognized as compatible by the compiler.

template< class first, class second, class third > class generic{} ;
typedef generic< int, float, char > generic_ifc_1 ;
template< class second > class generic_iXc: generic< int, second, char > {} ;
typedef generic_iXc<float> generic_ifc_2 ;

The problem matters because aliasing allows more flexible use of template libraries, including the STL (Standard Template Library). For example:

MyVector<int, MyAlloc<int> > vector ;

It would be more convenient to have an alias in which int appears only once. In the next standard of C++, it will probably be possible to declare template aliases through a new syntax:

template< class T > using Vector = MyVector< T, MyAlloc<T> > ;
Vector<int> int_vector ;

This new syntax can also be used inside a class declaration to create a member alias, as it happens already for typedef.
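
Unlike the derived-class workaround, an alias declared this way names exactly the same type as the template it refers to. A sketch using std::map, with the hypothetical names Dictionary and count_positive, and assuming std::is_same from <type_traits>:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <type_traits>

// Alias with one parameter left open: the key type is fixed, the value type is not.
template<class Value>
using Dictionary = std::map<std::string, Value>;

// The alias and the original template name one and the same type.
static_assert(std::is_same<Dictionary<int>, std::map<std::string, int> >::value,
              "alias and original are one type");

// Counts the entries whose value is positive.
int count_positive(const Dictionary<int>& entries)
{
  int positives = 0;
  for (Dictionary<int>::const_iterator it = entries.begin(); it != entries.end(); ++it)
    if (it->second > 0)
      ++positives;
  return positives;
}
```

Because the types are identical, a plain std::map<std::string, int> can be passed wherever Dictionary<int> is expected.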

Variadic templates

Standard C++ templates (classes and functions) can only take a set sequence of arguments. C++0x will allow template definitions to take an arbitrary number of arguments of any type.

template<typename... Values> class tuple;

This template class tuple will take any number of typenames as its template parameters:

class tuple<std::vector<int>, std::map<std::string, std::vector<int> > > someInstanceName;

The number of arguments can be 0, so class tuple<> someInstanceName will work as well.

If one does not want to have a variadic template that takes 0 arguments, then this definition will work as well:

template<typename First, typename... Rest> class tuple;

Variadic templates may also apply to functions, thus providing a type-safe mechanism similar to the standard C variadic function mechanism:

template<typename... Params> void printf(const std::string &strFormat, Params... parameters);

Note the use of the ... operator on the right of the type Params in the function signature, rather than the left as in the template specification.

The actual use of variadic templates is often recursive. The variadic parameters themselves are not readily available to the implementation of a function or class. As such, the typical mechanism for defining something like a C++0x variadic printf replacement would be as follows:

void printf(const char *s)
{
  while (*s)
  {
    if (*s == '%' && *(++s) != '%')
      throw std::runtime_error("invalid format string: missing arguments");
    std::cout << *s++;
  }
}
template<typename T, typename... Args>
void printf(const char* s, T value, Args... args)
{
  while (*s)
  {
    if (*s == '%' && *(++s) != '%')
    {
      std::cout << value;
      printf(*s ? ++s : s, args...); // call even when *s == 0 to detect extra arguments
      return;
    }
    std::cout << *s++;
  }
  throw std::runtime_error("extra arguments provided to printf");
}

This is a recursive call. Notice that the variadic template version of printf calls itself, or in the event that args is empty, calls the simple case.

There is no simple mechanism to iterate over the values of the variadic template. The expected use is in some form of recursion as above. However, there are two ways to use variadic templates that do not involve recursion.

With regard to function templates, the variadic parameters can be forwarded. When combined with r-value references (see below), this allows for perfect forwarding:

template<typename TypeToConstruct> struct SharedPtrAllocator
{
  template<typename ...Args> tr1::shared_ptr<TypeToConstruct> ConstructWithSharedPtr(Args&&... params)
  {
    return tr1::shared_ptr<TypeToConstruct>(new TypeToConstruct(static_cast<Args&&>(params)...));
  }
};

This unpacks the argument list into the constructor of TypeToConstruct. The static_cast<Args&&>(params) syntax is the syntax that perfectly forwards arguments as their proper types, even with regard to const-ness, to the constructor. The ... unpack command will propagate the forwarding syntax to each parameter. This particular factory function automatically wraps the allocated memory in a tr1::shared_ptr for a degree of safety with regard to memory leaks.
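
The same forwarding pattern can be written as a free function. This sketch assumes std::forward from <utility> and std::shared_ptr from <memory> in place of the static_cast and tr1::shared_ptr spellings used above; Account and construct_shared are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct Account
{
  std::string owner;
  int balance;

  Account(const std::string& owner_name, int opening_balance)
    : owner(owner_name), balance(opening_balance)
  {
    assert(opening_balance >= 0);
  }
};

// Forwards any argument list to T's constructor and wraps the result in a shared_ptr.
template<typename T, typename... Args>
std::shared_ptr<T> construct_shared(Args&&... params)
{
  // std::forward<Args>(params) has the same effect as static_cast<Args&&>(params).
  return std::shared_ptr<T>(new T(std::forward<Args>(params)...));
}
```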

Additionally, the number of arguments in a template parameter pack can be determined as follows:

template<typename ...Args> struct SomeStruct
{
  static const int size = sizeof...(Args);
};

The syntax SomeStruct<Type1, Type2>::size will be 2, while SomeStruct<>::size will be 0.
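
The count is available for function parameter packs as well. A short sketch, assuming the sizeof...(Args) spelling of the operator and using the hypothetical names PackSize and count_arguments:

```cpp
#include <cassert>
#include <cstddef>

// Number of types in a template parameter pack.
template<typename... Args>
struct PackSize
{
  static const std::size_t value = sizeof...(Args);
};

// Number of arguments actually passed to a function.
template<typename... Args>
std::size_t count_arguments(const Args&...)
{
  return sizeof...(Args);
}
```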

Concept

In C++, template classes and functions necessarily impose restrictions on the types that they take. For instance, the STL containers require that the contained types be default-constructible. Unlike the dynamic polymorphism that class inheritance hierarchies exhibit, where a function that accepts an object of type Foo& can be passed any subtype of Foo, any class can be used as a template argument so long as it supports every operation that the template uses. In the case of the function, the requirement an argument must meet is clear (being a subtype of Foo), but in the case of a template the interface an object must meet is implicit in the implementation of that template. Concepts provide a mechanism for codifying the interface that a template parameter must meet.

The primary motivation of the introduction of concepts is to improve the quality of compiler error messages. If a programmer attempts to use a type that does not provide the interface a template requires, the compiler will generate an error. However, such errors are often difficult to understand, especially for novices. There are two main reasons for this. First, error messages are often displayed with template parameters spelled out in full; this leads to extremely large error messages. On some compilers, simple errors can generate several kilobytes of error messages. Second, they often do not immediately refer to the actual location of the error. For example, if the programmer tries to construct a vector of objects that do not have a copy constructor, the first error almost always refers to the code within the vector class itself that attempts to copy construct its contents; the programmer must be skilled enough to understand that the real error was that the type doesn't support everything the vector class requires.

In an attempt to resolve this issue, C++0x adds the language feature of concepts. Similar to how OOP uses a base-class to define restrictions on what a type can do, a concept is a named construct that specifies what a type must provide. Unlike OOP, however, the concept definition itself is not always associated explicitly with the type being passed into the template, but with the template definition itself:

template<LessThanComparable T> const T& min(const T &x, const T &y)
{
  return x < y ? x : y;
}

Rather than using an arbitrary class or typename for the template type parameter, it uses LessThanComparable, which is a concept that was previously defined. If a type passed into the min template function does not satisfy the requirements of the LessThanComparable concept, then a compile error will result, telling the user that the type used to instantiate the template does not fit the LessThanComparable concept.

A more generalized form of the concept is as follows:

template<typename T> requires LessThanComparable<T>
  const T& min(const T &x, const T &y)
  {
    return x < y ? x : y;
  }

The keyword requires begins a list of concept declarations. It can also be used for concepts that use multiple types. Additionally, it can be used as requires !LessThanComparable<T>, if the user wishes to prevent the use of this particular template if the type matches the concept. This mechanism can be used in a way similar to template specialization. A general template would handle types with fewer features, explicitly disallowing the use of other, more feature-rich, concepts. And those concepts would have their own specializations that use those particular features to achieve greater performance or some other functionality.

Concepts are defined as follows:

auto concept LessThanComparable<typename T>
{
  bool operator<(T, T);
}

The keyword auto, in this instance, means that any type that supports the operations specified in the concept will be considered to support the concept. Without the auto keyword, the type must use a concept map in order to declare itself as supporting the concept.

This concept says that any type that has an operator < that takes two objects of that type and returns a bool will be considered LessThanComparable. The operator need not be a free-function; it could be a member function of the type T.

Concepts can involve multiple objects as well. For example, concepts can express that a type is convertible from one type to another:

auto concept Convertible<typename T, typename U>
{
  operator U(const T&);
}

In order to use this in a template, it must use a generalized form of concept usage:

template<typename U, typename T> requires Convertible<T, U>
  U convert(const T& t)
  {
    return t;
  }

Concepts may be composed. For example, given a concept named Regular:

concept InputIterator<typename Iter, typename Value>
{
  requires Regular<Iter>;
  Value operator*(const Iter&);
  Iter& operator++(Iter&);
  Iter operator++(Iter&, int);
}

The first template parameter to the InputIterator concept must conform to the Regular concept.

Concepts can also be derived from one another, like inheritance. Like in class inheritance, types that meet the requirements of the derived concept also meet the requirements of the base concept. It is defined as per class derivation:

concept ForwardIterator<typename Iter, typename Value> : InputIterator<Iter, Value>
{
  //Add other requirements here.
}

Typenames can also be associated with a concept. These impose the requirement that, in templates that use those concepts, these typenames are available:

concept InputIterator<typename Iter>
{
  typename value_type;
  typename reference;
  typename pointer;
  typename difference_type;
  requires Regular<Iter>;
  requires Convertible<reference, value_type>;
  reference operator*(const Iter&); // dereference
  Iter& operator++(Iter&); // pre-increment
  Iter operator++(Iter&, int); // post-increment
  // ...
}

Concept maps allow types to be explicitly bound to a concept. They also allow types to, where possible, adopt the syntax of a concept without changing the definition of the type. As an example:

concept_map InputIterator<char*>
{
  typedef char value_type ;
  typedef char& reference ;
  typedef char* pointer ;
  typedef std::ptrdiff_t difference_type ;
};

This map fills in the required typenames for the InputIterator concept when applied to char* types.

As an added degree of flexibility, concept maps themselves can be templated. The above example can be extended to all pointer types:

template<typename T> concept_map InputIterator<T*>
{
  typedef T value_type ;
  typedef T& reference ;
  typedef T* pointer ;
  typedef std::ptrdiff_t difference_type ;
};

Further, concept maps can act as mini-types, with function definitions and other constructs commonly associated with classes:

concept Stack<typename X>
{
  typename value_type;
  void push(X&, const value_type&);
  void pop(X&);
  value_type top(const X&);
  bool empty(const X&);
};
template<typename T> concept_map Stack<std::vector<T> >
{
  typedef T value_type;
  void push(std::vector<T>& v, const T& x) { v.push_back(x); }
  void pop(std::vector<T>& v) { v.pop_back(); }
  T top(const std::vector<T>& v) { return v.back(); }
  bool empty(const std::vector<T>& v) { return v.empty(); }
};

This concept map allows templates that take types that implement the concept Stack to take a std::vector, remapping the function calls directly to the std::vector calls. Ultimately, this allows a pre-existing object to be converted, without touching the definition of the object, into an interface that a template function can utilize.

Finally, it should be noted that some requirements can be checked using static assertions. These can verify some requirements that templates need, but are really aimed at a different problem.

Angle bracket

With the introduction of generic programming through templates, it was necessary to introduce a new type of bracket. In addition to round brackets, square brackets and curly brackets, C++ introduced angle brackets. This creates lexical ambiguities which, in the current standard, are often resolved contrary to the programmer's intentions, leading to a parse error:

typedef std::vector<std::vector<int> > Table ;  // Ok.
typedef std::vector<std::vector<bool>> Flags ;  // Error! ">>" interpreted as right shift

void func( List<B>= default_val ) ;  // Error! ">=" interpreted as comparison
void func( List<List<B>>= default_val ) ;  // Error! ">>=" interpreted as right shift assignment

template< bool I > class X  {};
X<(1>2)> x1 ;  // Ok.
X< 1>2 > x1 ;  // Error! First ">" interpreted as closing angle bracket

In C++0x the lexical analysis phase will interpret ">" as a closing angle bracket even when immediately followed by ">" or "=", if the innermost nested open bracket is an angle bracket. This fixes all of the above errors except the last, for which the programmer must still insert disambiguating parentheses.

X<(1>2)> x1 ;  // Ok.

In this way, between a left round bracket and its matching right round bracket, the compiler does not treat the characters < and > as angle brackets.
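
As a minimal sketch of the new rule, assuming a compiler that implements the C++0x behaviour, the following declarations close two template argument lists with ">>" and still parenthesize a comparison used as a template argument. The names Selector, Table and Never are hypothetical and serve only as illustration.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical template with a boolean non-type parameter.
template <bool Flag>
struct Selector {
    static const bool value = Flag;
};

// ">>" closes both template argument lists; no separating space is needed.
typedef std::vector<std::vector<int>> Table;

// A comparison used as a template argument must still be parenthesized.
typedef Selector<(1 > 2)> Never;

static_assert(std::is_same<Table::value_type, std::vector<int>>::value,
              "Table is a vector of vector<int>");
static_assert(!Never::value, "1 > 2 is false");
```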

Extern template

In standard C++, the compiler must instantiate a template whenever a fully specified template is encountered in a translation unit. This can dramatically increase compile-time, particularly if the template is instantiated in many translation units using the same parameters. There is no way to tell C++ not to provoke an instantiation of a template.

C++0x will introduce the idea of external templates. C++ already has syntax for forcing the compiler to instantiate at a particular location:

template class std::vector<MyClass>;

What C++ lacks is the ability to prevent the compiler from instantiating a template in a translation unit. C++0x will simply extend this syntax to:

extern template class std::vector<MyClass>;

This tells the compiler not to instantiate the template in this translation unit.
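
The following sketch shows how the two declarations might be combined, assuming a compiler that accepts the extern template syntax. Accumulator and sum_of_first are hypothetical names. In a real project the extern declaration would sit in a header and the explicit instantiation in exactly one source file; both appear in a single file here only to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class template standing in for any heavily used template.
template <typename T>
class Accumulator {
public:
    Accumulator() : sum_(), count_(0) {}
    void add(const T& value) { sum_ += value; ++count_; }
    T sum() const { return sum_; }
    std::size_t count() const { return count_; }
private:
    T sum_;
    std::size_t count_;
};

// Header part: Accumulator<int> is instantiated elsewhere, so translation
// units that see this line need not instantiate it themselves.
extern template class Accumulator<int>;

// Source-file part: the one explicit instantiation for the whole program.
template class Accumulator<int>;

// Ordinary use of the template is unchanged.
int sum_of_first(int n) {
    assert(n >= 0);
    Accumulator<int> acc;
    for (int i = 1; i <= n; ++i) {
        acc.add(i);
    }
    return acc.sum();
}
```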

Other major C++0x language features

Generalized constant expressions

C++ has always had the concept of constant expressions. These are expressions, like 3+4, which will always yield the same results and have no side effects. Constant expressions are optimization opportunities for compilers, and compilers frequently execute them at compile time and store the results in the program. Also, there are a number of places where the C++ specification requires the use of constant expressions. Defining an array requires a constant expression, and enumerator values must be constant expressions.

However, constant expressions have always ended whenever a function call or object constructor was encountered. So something as simple as the following is not allowed:

int GetFive() {return 5;}

int some_value[GetFive() + 5]; //create an array of 10 integers. illegal C++

This is not legal C++, because GetFive() + 5 is not a constant expression. The compiler has no way of knowing whether GetFive actually yields a constant. In theory, this function could modify a global variable, call other functions whose results are not constant, and so on.

C++0x will introduce the keyword constexpr, which allows the user to guarantee that a function or object constructor is a compile-time constant. The above example can be rewritten as follows:

constexpr int GetFive() {return 5;}

int some_value[GetFive() + 5]; //create an array of 10 integers. legal C++0x

This allows the compiler to understand, and verify, that GetFive is a compile-time constant.

The use of constexpr on a function imposes very strict limitations on what that function can do. First, the function must have a non-void type. Second, the function contents must be of the form "return expr". Third, expr must be a constant expression, after argument substitution. This constant expression may only call other functions defined as constexpr, or it may use other constant expression data variables. Fourth, all forms of recursion in constant expressions are forbidden. Lastly, a function with this label cannot be called until it is defined in this translation unit.

Variables can also be defined as constant expression values:

constexpr double forceOfGravity = 9.8;
constexpr double moonGravity = forceOfGravity / 6;

Constant expression data variables are implicitly const. They can only store the results of constant expressions or constant expression constructors.

In order to construct constant expression data values from user-defined types, constructors can also be declared with constexpr. A constant expression constructor must be defined before its use in the translation unit, as with constant expression functions. It must have an empty function body. It must initialize its members with constant expressions. And the destructors for such types should be trivial.

Copying constexpr constructed types should also be defined as a constexpr, in order to allow them to be returned by value from a constexpr function. Any member function of a class, such as copy constructors, operator overloads, etc, can be declared as constexpr, so long as they fit the definition for function constant expressions. This allows the compiler to copy classes at compile time, perform operations on them, etc.

A constant expression function or constructor can be called with non-constexpr parameters. Just as a constexpr integer literal can be assigned to a non-constexpr variable, so too can a constexpr function be called with non-constexpr parameters, and the results stored in non-constexpr variables. The keyword only allows for the possibility of compile-time constancy when all members of an expression are constexpr.
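
The rules above can be put together in a short sketch, assuming a compiler that implements constexpr as described. Point, square, origin_offset and square_at_runtime are hypothetical names introduced for illustration; GetFive and some_value come from the earlier example.

```cpp
// A constant expression function: a single return statement.
constexpr int GetFive() { return 5; }

constexpr int square(int x) { return x * x; }

// A type with a constant expression constructor (empty body, members
// initialized from constant expressions) and a constexpr member function.
struct Point {
    constexpr Point(int px, int py) : x(px), y(py) {}
    constexpr int dot(const Point& other) const {
        return x * other.x + y * other.y;
    }
    int x;
    int y;
};

int some_value[GetFive() + 5];  // array of 10 integers

constexpr Point origin_offset(3, 4);
constexpr int length_squared = origin_offset.dot(origin_offset);  // 3*3 + 4*4

// Called with a run-time argument, the same function yields a run-time value.
int square_at_runtime(int n) { return square(n); }

static_assert(square(GetFive()) == 25, "evaluated at compile time");
```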

Static assertions

The C++ standard provides two methods to test assertions: the macro assert and the preprocessor directive #error. However, neither is appropriate for use in templates: the macro tests the assertion at execution time, while the preprocessor directive tests the assertion during preprocessing, which happens before instantiation of templates (hence the assertion cannot check properties that depend on the template parameters).

The new utility introduces a new way to test assertions at compile-time, using the new keyword static_assert. The declaration assumes the following form:

static_assert( constant-expression, error-message ) ;

Here are some examples of how static_assert can be used:

static_assert( 3.14 < GREEKPI && GREEKPI < 3.15, " GREEKPI is inaccurate!" ) ;
template< class T >
struct Check
{
  static_assert( sizeof(int) <= sizeof(T), "T is not big enough!" ) ;
} ;

When the constant expression is false, the compiler produces an error message. The first example represents an alternative to the preprocessor directive #error; in the second example, by contrast, the assertion is checked at every instantiation of the template class Check.

Static assertions are also useful outside of templates. For instance, a particular implementation of an algorithm might depend on the size of a long being larger than an int, something the standard does not guarantee. (This situation is more plausible if one replaces long with the new type long long. Such an assumption is valid on most systems and compilers, but not all.)
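
A minimal sketch of both uses, assuming a compiler that provides static_assert and long long. The variable name wide_enough is hypothetical, and the 64-bit requirement is only an example of a platform assumption an algorithm might rely on.

```cpp
#include <climits>

// Outside templates: record a platform assumption next to the code needing it.
static_assert(sizeof(long long) * CHAR_BIT >= 64,
              "long long must hold at least 64 bits");

// Inside a template: the assertion is checked at every instantiation.
template <class T>
struct Check {
    static_assert(sizeof(int) <= sizeof(T), "T is not big enough!");
    T value;
};

Check<long long> wide_enough;  // accepted: long long is at least as large as int
// Check<char> too_small;      // rejected wherever char is smaller than int
```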

Lambda expression

In standard C++, particularly in conjunction with C++ standard library algorithm functions such as sort and find, the user will often wish to define predicate functions near the invocation of the algorithm function call. The language has only one mechanism for this, however: the ability to define a class inside of a function. This is often cumbersome and verbose, as well as interrupting the flow of the code.

The obvious solution would be to allow for the definition of lambda expressions and lambda functions. A lambda function is a function that has no specific name, yet can be passed around as an object.

C++0x will provide a mechanism for the user to generate lambda function objects.

Lambda functions are defined as though the implementation will silently create a function-local class at the site of the function generation.

The simplest form of a C++0x lambda function is also the most complex in terms of explanation:

<>(x, y) (x + y)

There are two categories of C++0x lambda functions. This example is an expression lambda function. It is defined as such because the contents of the function, contained between a ( ) pair, is a C++ expression, not a statement.

The two variables, x and y have types that are defined by the user of the lambda function, similar to template function type deduction based on parameters. For example:

std::vector<int> someList;
... //fill the list.
std::remove_if(someList.begin(), someList.end(), <>(x) (x == 5));

This code will cause x to be an int type, because std::remove_if expects a function object that takes an integer in its operator() function.

In this case, the lambda function in question takes an integer and returns a boolean value.

If the user wishes, the lambda function can specify the types of its parameters:

<>(int x, int y) (x + y)

This kind of lambda function is a non-generic lambda function. No type inference for the parameters will be done.

It may also specify its return type:

<>(int x, int y) -> int (x + y)

A statement-based lambda function uses statements, including an explicit return, instead of an expression. This allows for more complex logic, but the return type must be explicitly specified:

<>(int x, int y) -> int
{
  int z = x + y;
  int w = x - y;
  if(z == w)
    return x;
  else
    return y;
}

References to variables defined in the scope of the lambda function can be used as well. The set of variables of this sort is commonly called a closure. They are defined as follows:

std::vector<int> someList;
int total = 0;
std::for_each(someList.begin(), someList.end(), <>(int x: int &myTotal = total) {myTotal += x});
printf("%i", total);

This would display the total of all elements in the list.
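
Since lambda functions are defined as though the implementation created a function-local class, the closure above can be approximated by a hand-written function object. The sketch below is such an approximation, not the code a compiler would actually generate; the names AddToTotal and sum_list are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hand-written stand-in for the class implied by the closure above.
class AddToTotal {
public:
    explicit AddToTotal(int& total) : myTotal(total) {}
    void operator()(int x) const { myTotal += x; }  // body of the lambda
private:
    int& myTotal;  // closure variable: a reference to the caller's 'total'
};

// Mirrors the for_each call in the text and returns the accumulated total.
int sum_list(const std::vector<int>& someList) {
    int total = 0;
    std::for_each(someList.begin(), someList.end(), AddToTotal(total));
    return total;
}
```

Because the function object holds a reference, it is only valid while the referenced variable is alive, which is the same restriction that applies to the lambda form.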

Closure variables need not be references to external variables. For example:

std::vector<int> someList;
std::remove_if(someList.begin(), someList.end(), <>(int x: int myTotal = 0){(myTotal += x) > 20});

This removes elements after the running total reaches twenty, including the first element to reach 20.

As a shortcut, closure variables referring to local variables can be defined as:

int total = 0;
<>(x: &total) {total += x}

This is equivalent to:

int total = 0;
<>(x: auto &total = total) {total += x}

This makes it easier to specify local variables in a closure. You can also define them without a reference.

As with all references to stack variables, it is up to the user to make certain that the references are not stored and used once the stack is out of frame. For example, if the unnamed lambda function is named by storing it in a function object (which can store a callable object with a given signature), it must not use stack references if that object is stored past the point where this stack is in frame. For example:

function<int (int)> theFunc;
{
  int value = 0;
  theFunc = <>(x: &value) {value += x;};
  theFunc(5);
}
theFunc(8);

This is technically legal, but the second call will start writing onto a part of the stack that is out of frame at the call site. So using lambda functions in callback routines can be dangerous, if they store references to stack variables.

For lambda functions that are guaranteed to run in the scope of their definition, it is possible to use all available stack variables without having to explicitly reference them:

std::vector<int> someList;
int total = 0;
std::for_each(someList.begin(), someList.end(), <&>(x) (total += x));

The specific implementation can vary, but the expectation is that the lambda function will store the actual stack pointer of the function it is created in.

Type Determination

In standard C++ (and C), the type of a variable must be explicitly specified in order to use it. However, with the advent of template types and template metaprogramming techniques, the type of something, particularly the well-defined return value of a function, may not be easily expressed. As such, storing intermediates in variables is difficult, possibly requiring knowledge of the internals of a particular metaprogramming library.

C++0x allows this to be mitigated in two ways. First, the definition of a variable with an explicit initialization can use the auto keyword. This creates a variable of the specific type of the initializer:

auto someStrangeCallableType = boost::bind(&SomeFunction, _2, _1, someObject);
auto otherVariable = L"This is a string";

The type of someStrangeCallableType is simply whatever the particular template function overload of boost::bind returns for those particular arguments. This is easily known to the compiler, but is not easy for the user to determine upon inspection.

The type of otherVariable is also well-defined, but it is easier for the user to determine. It is a const wchar_t *, the pointer type to which the wide string literal (an array of const wchar_t) decays.

Additionally, the keyword decltype can be used to determine the type of an expression at compile time. For example:

int someInt;
decltype(someInt) otherIntegerVariable = 5;

decltype is more useful in conjunction with auto, since the type of a variable declared with auto is known only to the compiler.

auto is also useful for reducing the verbosity of the code. For instance, instead of writing

for (vector<int>::const_iterator itr = myvec.begin(); itr != myvec.end(); ++itr)

the programmer can use the shorter

for (auto itr = myvec.begin(); itr != myvec.end(); ++itr)

This difference grows as the programmer begins to nest containers, though in such cases typedefs are a good way to decrease the amount of code.
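
The two facilities can be seen together in the following sketch, which assumes a compiler implementing auto and decltype as described. count_words and wideText are hypothetical names; someInt and otherIntegerVariable are taken from the example above.

```cpp
#include <map>
#include <string>
#include <type_traits>
#include <vector>

// Count how often each word occurs; 'auto' avoids spelling out
// std::vector<std::string>::const_iterator.
std::map<std::string, int> count_words(const std::vector<std::string>& words) {
    std::map<std::string, int> counts;
    for (auto itr = words.begin(); itr != words.end(); ++itr) {
        ++counts[*itr];
    }
    return counts;
}

// decltype names the type of an expression without evaluating it.
int someInt = 0;
decltype(someInt) otherIntegerVariable = 5;
auto wideText = L"This is a string";  // deduced as const wchar_t*

static_assert(std::is_same<decltype(otherIntegerVariable), int>::value,
              "decltype(someInt) is int");
static_assert(std::is_same<decltype(wideText), const wchar_t*>::value,
              "the literal decays to a pointer");
```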

Rvalue Reference/Move semantics

In standard C++, temporaries (termed "R-values", as they lie on the right side of an operator=) can be passed to functions, but they can only be accepted as const & types. As such, it is impossible for a function to distinguish between an actual R-value and a regular object that is passed as const &. Furthermore, since the type is const &, it is not possible to actually change the object.

C++0x will add a new reference type called an R-value reference. It is defined as typename &&. These can be accepted as non-const values, which allows an object to modify them. This modification allows for certain objects to create move semantics.

For example, a std::vector is, internally, a wrapper around a C-style array with a size. If a vector temporary is created or returned from a function, it can only be stored by creating a new vector class and having it copy all of the R-value's data into it. Then the temporary is destroyed, deleting its data.

With R-value references, a "move constructor" of std::vector that takes an R-value reference to a vector can simply copy the array pointer out of the R-value, leaving it in an empty state. There is no array copying, and the destruction of the empty temporary does not destroy the memory. The function returning a vector temporary need only return a std::vector<>&&. If vector has no move constructor, then the copy constructor will be invoked with a const std::vector<> & as normal. If it does have a move constructor, then the move constructor can be invoked, and significant memory allocation can be avoided.

Additionally, R-value references allow developers to provide perfect function forwarding. When combined with variadic templates, this ability allows for function templates that can perfectly forward arguments to another function that takes those particular arguments. This is most useful for forwarding constructor parameters, to create factory functions that will automatically call the correct constructor for those particular arguments.
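
A move constructor and a forwarding factory might look roughly like the sketch below. IntBuffer is a hypothetical, much simplified stand-in for std::vector, and make is a hypothetical factory. The library helpers std::move and std::forward from <utility> are not named in the text above, so their use here is an assumption about the library support that accompanies the feature.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical array wrapper: a pointer to the data plus a size.
class IntBuffer {
public:
    explicit IntBuffer(std::size_t n) : data_(new int[n]()), size_(n) {}
    ~IntBuffer() { delete[] data_; }

    // Copy constructor: allocates a new array and copies every element.
    IntBuffer(const IntBuffer& other)
        : data_(new int[other.size_]), size_(other.size_) {
        for (std::size_t i = 0; i < size_; ++i) data_[i] = other.data_[i];
    }

    // Move constructor: takes over the array and leaves the source empty,
    // so destroying the source does not free the memory.
    IntBuffer(IntBuffer&& other) : data_(other.data_), size_(other.size_) {
        other.data_ = nullptr;
        other.size_ = 0;
    }

    IntBuffer& operator=(const IntBuffer&) = delete;  // omitted in this sketch

    std::size_t size() const { return size_; }
    const int* data() const { return data_; }

private:
    int* data_;
    std::size_t size_;
};

// Factory that forwards its arguments unchanged to T's constructor.
template <typename T, typename... Args>
T make(Args&&... args) {
    return T(std::forward<Args>(args)...);
}
```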

Strongly typed Enumerations

In standard C++, enumerations are not type-safe. They are effectively integers, even when the enumeration types are distinct. This allows the comparison between two enum values of different enumeration types. The only safety that C++03 provides is that a variable of one enum type may not directly be set into another enum type. Additionally, the underlying integral type, the size of the integer, cannot be explicitly specified; it is implementation defined. Lastly, enumeration values are scoped to the enclosing scope. As such, it is not possible for two separate enumerations to have matching member names.

C++0x will allow a special classification of enumeration that has none of these issues. This is expressed using the enum class declaration:

enum class Enumeration
{
  Val1,
  Val2,
  Val3 = 100,
  Val4 /* = 101 */,
};

This enumeration is type-safe. Enum class values cannot be implicitly converted to integers; as such, they cannot be compared to integers either (Enumeration::Val4 == 101 gives a compiler error).

The underlying type of enum classes is explicitly specified. The default, as in the above case, is int, but it can be changed as follows:

enum class Enum2 : unsigned int {Val1, Val2};

The scoping of the enumeration is also defined as the enumeration name's scope. Using the enumerator names requires explicit scoping. Val1 is undefined, but Enum2::Val1 is defined.

Additionally, C++0x will allow standard enumerations to provide explicit scoping as well as the definition of the underlying type:

enum Enum3 : unsigned long {Val1 = 1, Val2};

The enumerator names are defined in the enumeration's scope (Enum3::Val1), but for backwards compatibility, enumerator names are also placed in the enclosing scope.
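
The properties listed in this section can be checked at compile time, as in the sketch below. It assumes a compiler implementing enum class together with the library trait std::underlying_type; to_int is a hypothetical helper.

```cpp
#include <type_traits>

enum class Enumeration { Val1, Val2, Val3 = 100, Val4 /* = 101 */ };
enum class Enum2 : unsigned int { Val1, Val2 };  // same names, no clash
enum Enum3 : unsigned long { Val1 = 1, Val2 };   // names also reach the enclosing scope

// An explicit cast is the only way to obtain the underlying integer.
constexpr int to_int(Enumeration e) { return static_cast<int>(e); }

static_assert(to_int(Enumeration::Val4) == 101, "Val4 follows Val3");
static_assert(!std::is_convertible<Enumeration, int>::value,
              "no implicit conversion to int");
static_assert(std::is_same<std::underlying_type<Enum2>::type, unsigned int>::value,
              "underlying type as declared");
static_assert(Enum3::Val2 == Val2, "both spellings name the same enumerator");

// bool bad = (Enumeration::Val4 == 101);  // would not compile
```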

Range-Based For Loop

The Boost C++ library defines a number of "Range" concepts. Ranges represent a controlled list, much like a container, between two points in that list. Ordered containers are a superset of the range concept, and two iterators in an ordered container can define a range as well. These concepts, and algorithms that operate on them, will be incorporated into C++0x's standard library. However, the utility of range concepts is such that C++0x will provide a language feature built around them.

The statement for will allow for easy iteration over a range concept:

int my_array[5] = {1, 2, 3, 4, 5};
for(int &x : my_array)
{
  x *= 2;
}

The first section of the new for loop defines the variable that will be used to iterate over the range. The variable, as with variables declared in the regular for-loop, only has scope for the duration of the loop. The second section, after the ":", represents the range concept being iterated over. In this case, there is a concept map that allows C-style arrays to be converted into range concepts. This could have been a std::vector, or any object that conforms to a Range concept.
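
The same loop form applied to a std::vector is sketched below, assuming a compiler that implements the new for statement. The function name double_and_sum is hypothetical. It contrasts a reference loop variable, which modifies the elements, with a by-value loop variable, which only reads them.

```cpp
#include <cassert>
#include <vector>

// Double every element in place, then return the sum of the new values.
int double_and_sum(std::vector<int>& values) {
    for (int& x : values) {  // reference: changes reach the container
        x *= 2;
    }
    int sum = 0;
    for (int x : values) {   // by value: each element is copied and read
        sum += x;
    }
    return sum;
}
```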

New literals, predefined and user-defined

Standard C++, like many languages, has a number of literal values. For example, "12.5" is a literal value that the compiler will convert into a floating-point number of type double. However, it also has a number of modifiers on literal values. The literal "12.5f" tells the compiler that, while it is a floating-point number, it should be converted into a float type. C++0x will introduce a number of other literals, mostly aimed at string types (for Unicode-encoded literals).

For the purpose of enhancing support for Unicode in C++ compilers, the definition of the type char has been modified to be at least the size necessary to store an eight-bit coding of UTF-8. It was previously defined as being large enough to contain any member of the compiler's basic execution character set.

There are three Unicode encodings that C++0x will support: UTF-8, UTF-16, and UTF-32. In addition to the previously noted changes to the definition of char, C++0x will add two new character types: char16_t and char32_t. These are designed to store UTF-16 and UTF-32 respectively.

The following shows how to create string literals for each of these encodings:

u8"I'm a UTF-8 string."
u"This is a UTF-16 string."
U"This is a UTF-32 string."

The type of the first string is the usual const char *. The type of the second string is const char16_t*. The type of the third string is const char32_t*.
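
A small sketch of the two new character types, assuming a compiler that implements the u and U prefixes. It covers only the UTF-16 and UTF-32 forms, and the variable names are hypothetical.

```cpp
#include <climits>
#include <type_traits>

const char16_t* utf16_text = u"This is a UTF-16 string.";
const char32_t* utf32_text = U"This is a UTF-32 string.";

// The prefixes apply to character literals as well as to string literals.
static_assert(std::is_same<decltype(u'a'), char16_t>::value, "u'' yields char16_t");
static_assert(std::is_same<decltype(U'a'), char32_t>::value, "U'' yields char32_t");

// Each type is wide enough for one code unit of its encoding.
static_assert(sizeof(char16_t) * CHAR_BIT >= 16, "room for a UTF-16 code unit");
static_assert(sizeof(char32_t) * CHAR_BIT >= 32, "room for a UTF-32 code unit");
```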

It is also sometimes useful to avoid escaping strings manually, particularly for using literals of XML files or scripting languages. C++0x will provide a raw string literal:

R""The String Data \ Stuff " ""

Everything between the double '"' marks is part of the string. The '"' and '\' characters do not need to be escaped.

C++0x will also include the ability for the user to define new kinds of literal modifiers that will construct objects based on the string of characters that the literal modifies.

Literal transformation is redefined into two distinct phases: raw and cooked. A raw literal is a sequence of characters of some specific type, while the cooked literal is of a separate type. The C++ literal 1234, as a raw literal, is this sequence of characters '1', '2', '3', '4'. As a cooked literal, it is the integer 1234. The C++ literal 0xA in raw form is '0', 'x', 'A', while in cooked form is the integer 10.

Literals can be extended in both raw and cooked forms, with the exception of string literals, which can only be processed in cooked form. This exception is due to the fact that strings have prefixes that affect the specific meaning and type of the characters in question.

All user-defined literals are suffixes; defining prefix literals is not possible.

User-defined literals processing the raw form of the literal are defined as follows:

OutputType operator"Suffix"(const char *literal_string);

OutputType someVariable = 1234Suffix;

The second statement executes the code defined by the user-defined literal function. This function is passed "1234" as a C-style string, so it has a null terminator.

An alternative mechanism for processing raw literals is through a variadic template:

template<char...> OutputType operator"Suffix"();

OutputType someVariable = 1234Suffix;

This instantiates the literal processing function as operator"Suffix"<'1', '2', '3', '4'>. In this form, there is no terminating null character to the string. The main purpose of doing this is to use C++0x's constexpr keyword and the compiler to allow the literal to be transformed entirely at compile time, assuming OutputType is a constexpr-constructable and copyable type, and the literal processing function is a constexpr function.

For cooked literals, the type of the cooked literal is used, and there is no alternate template form:

OutputType operator"Suffix"(int the_value);

OutputType someVariable = 1234Suffix;

For string literals, the following are used, in accordance with the previously mentioned new string prefixes:

OutputType operator"Suffix"(const char * string_values, size_t num_chars);
OutputType operator"Suffix"(const wchar_t * string_values, size_t num_chars);
OutputType operator"Suffix"(const char16_t * string_values, size_t num_chars);
OutputType operator"Suffix"(const char32_t * string_values, size_t num_chars);

OutputType someVariable = "1234"Suffix;      //Calls the const char * version
OutputType someVariable = u8"1234"Suffix;    //Calls the const char * version
OutputType someVariable = L"1234"Suffix;     //Calls the const wchar_t * version
OutputType someVariable = u"1234"Suffix;     //Calls the const char16_t * version
OutputType someVariable = U"1234"Suffix;     //Calls the const char32_t * version

Character literals are defined similarly.

Transparent Garbage Collection

Standard C++ expects the user to manage memory manually. C++0x will provide the option of automatic memory management through implementation-defined garbage collection (GC) mechanisms. The current specification of this is functionally all or nothing. Either the entire program functions under GC or it does not; the expectation is that this will be controlled by a compiler switch.

Translation units can define whether they use strict or relaxed GC. In strict GC, the objects defined in the unit are assumed to be well-behaved with regard to type safety. In particular, pointers are not cast to ints, and vice-versa. This makes the GC's job a lot easier, as it needs to only keep track of pointer types. Under relaxed GC, it must test pointers, integers, and potentially other types.

Also, a translation unit can declare itself not to be used with GC. Thus, a library can forbid its use in a GC-based program. The reverse is available as well; a translation unit can declare itself to be used only if GC is enabled.

When allocating memory, it is possible to allocate non-GC-able memory, though the GC will still look through this memory for pointers. Allocations can be made through a specialized new operator, a non-GC version of std::allocator, or a version of malloc.

long long int

On 32-bit systems, a long long integer type that is at least 64 bits wide is useful. The C99 standard introduces this type to standard C, and most C++ compilers have long supported it as an extension. (Indeed, some compilers supported it long before its introduction to C99.) C++0x will likely add this type to standard C++.

This is less useful on some 64-bit systems, as one common model for data sizes is as follows:

  • 16 bit: short int
  • 32 bit: int
  • 64 bit: long int

(This is called LP64.)

Nevertheless, on 32-bit systems, as well as 64-bit Windows systems (which use the LLP64 model, which has 32-bit longs), it is an entrenched habit to use long long int as the 64-bit integer.

The C++ committee has always shown a reluctance to standardize new fundamental types that haven't been adopted by the C committee (which is independent of the C++ committee, though a liaison exists and the two groups have a significant overlap). But now long long int (abbreviated as long long) has become a de facto standard as well as a de jure standard with C99, so this impasse seems to have been resolved. The C++ committee will approve long long int as a fundamental type (unsigned long long int included).

In the future, long long int might be used for 128-bit integers if demand is present, or on new processors with 128-bit registers.

Null pointer

In the current standard, the constant 0 has the double role of constant integer and null pointer. (This behaviour has existed since the dawn of C in 1972.)

For years, programmers have mostly avoided this possible ambiguity by using the constant NULL instead of 0. However, two of C++'s design choices have converged to produce another ambiguity. In C, NULL is a preprocessor macro defined to be ((void*)0) or 0. In C++, implicit conversion from void* to other pointer types is not allowed, so something as simple as char* c = NULL would fail to compile under the former definition. To fix this, C++ ensures that NULL expands to 0, which as a special case is allowed to be converted to any pointer type. This interacts poorly with the overloading mechanism. For instance, suppose a program has declarations void foo( char* ); void foo( int ); and then calls foo(NULL); this will call the foo(int) version, which is almost certainly not what the programmer intends.

It is likely that the new standard will introduce a new keyword, reserved only to indicate the null pointer; at the moment nullptr is proposed for this role.

nullptr cannot be assigned to integer types, nor compared with them, but it can be compared with and assigned to any pointer.

Obviously the existing role of 0 will remain for compatibility reasons.

If the new syntax is a success, the C++ committee could deprecate the use of 0 and NULL as null pointers, and eventually abolish this double role.
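
The overloading problem and its resolution can be sketched as follows, assuming a compiler that implements nullptr and constexpr. The two overloads of foo are those from the text, changed to return a tag so that the selected overload can be observed; the enumeration Chosen is a hypothetical helper.

```cpp
// Hypothetical tag type reporting which overload was selected.
enum Chosen { PointerOverload, IntOverload };

constexpr Chosen foo(char*) { return PointerOverload; }
constexpr Chosen foo(int)   { return IntOverload; }

// The literal 0 is an int first and a null pointer only by conversion.
static_assert(foo(0) == IntOverload, "0 selects foo(int)");

// nullptr converts to pointer types but never to int.
static_assert(foo(nullptr) == PointerOverload, "nullptr selects foo(char*)");

char* no_text = nullptr;    // accepted
// int no_number = nullptr; // would not compile
```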

C++ standard library extension

The most substantial additions will come through the C++0x standard library, even though in reality almost all of the new libraries need no changes to the core language, as they could work under the current standard.

Most of the libraries to be introduced are defined in the document “C++ Standards Committee's Library Technical Report” (called TR1), whose final version dates back to 2005. This library has already been adopted by some compilers and can be accessed through the namespace std::tr1.

A second technical report (called TR2) is currently in preparation, and is slated for completion after the standardization of C++0x. For this reason, this section covers only some of the notable libraries introduced by TR1.

Tuple types

Tuples are collections of heterogeneous objects whose number is fixed in advance. Objects of any type may serve as a tuple’s elements.

This new utility is implemented through a new header and benefits from some core extensions of the C++ language, such as:

  • variadic templates,
  • reference to reference,
  • default arguments for template functions (now available only for template classes).

Here is the definition of tuple in the header <tuple>:

template< class T1 = unspecified,
          class T2 = unspecified,
          ...,
          class TM = unspecified > class tuple ;

An example of definition and use of the tuple type:

typedef tuple< int, double, long &, const char * > test_tuple ;
long lengthy = 12 ;
test_tuple proof( 18, 6.5, lengthy, "Ciao!" ) ;
lengthy = get<0>(proof) ;  // Assign to 'lengthy' the value 18.
get<3>(proof) = " Beautiful!" ;  // Modify the tuple’s fourth element.

It’s possible to create the tuple proof without defining its contents, but only if the tuple elements' types possess default constructors. Moreover, it’s possible to assign a tuple to another tuple: if the two tuples’ types are the same, each element type must possess a copy constructor; otherwise, each element type of the right-side tuple must be convertible to the corresponding element type of the left-side tuple, or the corresponding element type of the left-side tuple must have a suitable constructor.

typedef tuple< int , double, string       > tuple_1 ;
typedef tuple< char, short , const char * > tuple_2 ;
tuple_1 t1 ;
tuple_2 t2( 'X', 2, "Hola!" ) ;
t1 = t2 ;  // Ok, first two elements can be converted,
           // the third one can be constructed from a 'const char *'.

Relational operators are available (among tuples with the same number of elements), and two expressions are available to check a tuple’s characteristics at compile time:

  • tuple_size<T>::value yields the number of elements of the tuple T,
  • tuple_element<I, T>::type yields the type of element number I of the tuple T.
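
The operations described in this section are combined in the sketch below, written against the header <tuple> in namespace std. Implementations that only ship TR1 may expose the same names under std::tr1 instead. Record and widen are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <type_traits>

typedef std::tuple<int, double, std::string> Record;  // hypothetical name

// Compile-time queries on the tuple type.
static_assert(std::tuple_size<Record>::value == 3, "three elements");
static_assert(std::is_same<std::tuple_element<2, Record>::type, std::string>::value,
              "third element is a string");

// Converting assignment: every source element converts to its target type.
Record widen(const std::tuple<char, short, const char*>& source) {
    Record result;    // elements are default-constructed
    result = source;  // char -> int, short -> double, const char* -> string
    return result;
}
```

Relational operators compare element by element, so a Record holding ('X', 2, "Hola!") compares less than one whose first element is larger.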

Hash tables

Including hash tables (unordered associative containers) in the C++ standard library is one of the most frequent requests. They were not adopted in the current standard (the one written in 1995 and approved in 1998) due to time constraints only. Although hash tables are less efficient than a balanced tree in the worst case (in the presence of many collisions), they perform better in many real applications.

Collisions will be managed only through linear chaining because the committee does not consider it appropriate to standardize open addressing solutions, which introduce quite a lot of intrinsic problems (above all when erasure of elements is allowed). To avoid name clashes with non-standard libraries that developed their own hash table implementations, the prefix “unordered” will be used instead of “hash”.

The new utility will have four types of hash tables, differentiated by whether or not they accept elements with the same key (unique keys or equivalent keys), and whether they map each key to an associated value.

Type of hash table    Arbitrary mapped type    Equivalent keys
unordered_set         No                       No
unordered_multiset    No                       Yes
unordered_map         Yes                      No
unordered_multimap    Yes                      Yes

New classes fulfil all the requirements of a container class, and have all the methods necessary to access elements: insert, erase, begin, end.

This new utility doesn’t need any C++ language core extensions, only a small extension of the header <functional> and the introduction of headers <unordered_set> and <unordered_map>. No other changes to any existing standard classes are needed, and it doesn’t depend on any other extensions of the standard library.
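
Two of the four containers are shown in the sketch below, using the headers named above. It assumes an implementation that provides them in namespace std; make_inventory and make_samples are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>

// unordered_map: unique keys, each mapped to an associated value.
std::unordered_map<std::string, int> make_inventory() {
    std::unordered_map<std::string, int> stock;
    stock.insert(std::make_pair(std::string("apple"), 3));
    stock.insert(std::make_pair(std::string("pear"), 5));
    stock.insert(std::make_pair(std::string("apple"), 9));  // ignored: key exists
    return stock;
}

// unordered_multiset: equivalent keys are allowed, no mapped value.
std::unordered_multiset<int> make_samples() {
    std::unordered_multiset<int> samples;
    samples.insert(7);
    samples.insert(7);  // kept: duplicates are permitted
    samples.insert(2);
    return samples;
}
```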

Regular expressions

Many more or less standardized libraries have been created to handle regular expressions. Since these algorithms are in very common use, the standard library will include them, exploiting the full potential of an object-oriented language.

The new library, defined in the new header <regex>, is built around a couple of new classes:

  • regular expressions are represented by instances of the template class basic_regex;
  • occurrences are represented by instances of the template class match_results.

The function regex_search is used for searching, while ‘search and replace’ is performed by the function regex_replace, which returns a new string. Both algorithms take a regular expression and a string; regex_search writes the occurrences found into a match_results object.

Here is an example of the use of match_results:

const char *reg_esp = "[ ,.\\t\\n;:]" ;  // List of separator characters.
// NOTE: the regular expression library treats the backslash as an escape
// character, just as the C++ compiler does, so the sequence '\n' must be
// written as "\\n" in an ordinary string literal,
// or as R"(\n)" using the new raw string literals.

regex rgx(reg_esp) ;  // 'regex' is an instance of the template class
                      // 'basic_regex' with argument of type 'char'.
cmatch match ;  // 'cmatch' is an instance of the template class
                // 'match_results' with argument of type 'const char *'.
const char *target = "Polytechnic University of Turin " ;

// Searches 'target' for the first separator character listed in 'reg_esp'.
if( regex_search( target, match, rgx ) )
{
  // A separator was found: 'match[0]' is the whole match, and any further
  // elements are the sub-expressions (there are none in this pattern).

  for( size_t a = 0 ; a < match.size() ; a++ )
  {
    string str( match[a].first, match[a].second ) ;
    cout << str << "\n" ;
  }
}

The library <regex> requires neither alteration of any existing header nor extension of the core language.

General-purpose smart pointers

Management of dynamic memory allocation has always been a critical point in the history of programming languages. Many modern programming languages (such as Java) offer tools for automatic memory management.

Ordinary C++ data pointers have many interesting characteristics:

  • it is possible to copy them,
  • it is possible to assign them,
  • it is possible to use their value,
  • it is possible to use void * as a generic pointer,
  • it is possible to convert them to a base class through a static cast,
  • it is possible to convert them to a derived class through a dynamic cast.

The principal defects of C++ pointers are:

  • obligatory manual management of dynamically allocated objects,
  • it is possible to refer to an invalid or unallocated address in memory.

The new smart pointers maintain the benefits of ordinary pointers while largely eliminating their weaknesses. The new template class shared_ptr will be similar to the existing standard library auto_ptr, with the addition of counting all "sibling" pointers that refer to the same target object, so that ownership of the object is shared among them. Moreover, unlike auto_ptr, shared_ptr can be used with standard library containers.

To get the number of pointers that refer to the same object it is possible to use the function use_count, a member function of shared_ptr. The member function reset resets the smart pointer. A pointer that has been reset is empty and its function use_count returns zero.

Here is an example of the use of shared_ptr:

int main( )
{
  shared_ptr<double> p_first(new double) ;

  if( true )
  {
    shared_ptr<double> p_copy = p_first ;

    *p_copy = 21.2 ;

  }  // Destruction of 'p_copy' but not of the allocated double.

  return 0;  // Destruction of 'p_first' and accordingly of the allocated double.
}

The related proposed template weak_ptr behaves like shared_ptr, except that the referenced object is not "owned" by the pointer and can be destroyed while weak pointers still refer to it. A weak pointer can therefore outlive its target, so it must be converted to a shared_ptr and checked before the target is used. In this way it is possible to maintain a reference to an object without influencing its lifecycle, for instance to prevent reference loops.

This utility requires some changes to the header <memory>, but it doesn’t require an extension of the C++ language.

Extensible random number facility

Computers behave deterministically by definition; nevertheless some applications need behavior that is non-deterministic (even if only in appearance), obtained by generating random numbers.

The only existing standard facility is the function rand, but it is not well defined and its implementation is left entirely to compiler vendors. New facilities for random number generation will be defined in the header <random>; no other modification to headers or to the C++ core language is needed.

Random number generators have an internal state and a function that computes the result and advances the generator to the next state. These two characteristics constitute the generator’s engine. Another very important characteristic is the distribution of the results, that is, the range and the density of the random variable.

Through the template class variate_generator it is possible to create a random number generator with a desired engine and distribution. The programmer can choose among the engines and distributions provided by the standard or create new ones.

  • Engines for pseudo-random numbers

The new library will introduce several engines for the generation of pseudo-random numbers. These are all template classes, allowing the programmer to customize them as needed. The internal state of a pseudo-random engine is determined by a seed (generally a set of variables). The apparent randomness is due only to the limited perception of the user.

template class        int/float  quality  speed   size of state*
linear_congruential   int        low      medium  1
subtract_with_carry   both       medium   fast    25
mersenne_twister      int        good     fast    624

* Multiply the value by the size in bytes of the type used.

The quality of an engine's output can be improved using the template class discard_block, and engines can be combined using the template class xor_combine. For convenience the header <random> also defines some standard engines; an example is the class mt19937, an instantiation of the template class mersenne_twister:

typedef mersenne_twister< implementation-defined, 32, 624, 397, 31, 0x9908b0df,
                          11, 7, 0x9d2c5680, 15, 0xefc60000, 18 >
        mt19937 ;

Through the class random_device it is possible to generate non-deterministic numbers of type unsigned int. Its implementation will require a device whose input is independent of the system (for example an unsynchronized external counter, or a particular transducer), and it will also need a traditional pseudo-random number generator “to temper the result”.

  • Distribution of random numbers

The new library defines many types of distributions, from uniform distributions to those defined by probability theory: uniform_int, bernoulli_distribution, geometric_distribution, poisson_distribution, binomial_distribution, uniform_real, exponential_distribution, normal_distribution, and gamma_distribution. The programmer is free to instantiate the standard distributions or to supply compatible distributions of their own.

Here is a simple example using the new library:

uniform_int<int> distribution( 0, 99 ) ;
mt19937 engine ;
variate_generator<mt19937, uniform_int<int>> generator( engine, distribution );
int random = generator() ;  // Assign a value between 0 and 99.

Mathematical special functions

The <cmath> header already defines some common mathematical functions:

  • trigonometric: sin, cos, tan, asin, acos, atan, atan2;
  • hyperbolic: sinh, cosh, tanh, asinh, acosh, atanh;
  • exponential: exp, exp2, frexp, ldexp, expm1;
  • logarithmic: log10, log2, logb, ilogb, log1p;
  • power: pow, sqrt, cbrt, hypot;
  • special: erf, erfc, tgamma, lgamma.

The committee has decided to add new functions to the ‘special’ category that currently require using a non-standard library. These new functions will mainly be of interest to programmers in engineering and scientific disciplines.

The following table shows all 23 functions approved for the next C++ standard.

Function name                                    Function prototype
Associated Laguerre polynomials                  double assoc_laguerre( unsigned n, unsigned m, double x ) ;
Associated Legendre polynomials                  double assoc_legendre( unsigned l, unsigned m, double x ) ;
Beta function                                    double beta( double x, double y ) ;
Complete elliptic integral of the first kind     double comp_ellint_1( double k ) ;
Complete elliptic integral of the second kind    double comp_ellint_2( double k ) ;
Complete elliptic integral of the third kind     double comp_ellint_3( double k , double nu ) ;
Confluent hypergeometric functions               double conf_hyperg( double a, double c, double x ) ;
Regular modified cylindrical Bessel functions    double cyl_bessel_i( double nu, double x ) ;
Cylindrical Bessel functions of the first kind   double cyl_bessel_j( double nu, double x ) ;
Irregular modified cylindrical Bessel functions  double cyl_bessel_k( double nu, double x ) ;
Cylindrical Neumann functions                    double cyl_neumann( double nu, double x ) ;
  (cylindrical Bessel functions of the second kind)
Incomplete elliptic integral of the first kind   double ellint_1( double k, double phi ) ;
Incomplete elliptic integral of the second kind  double ellint_2( double k, double phi ) ;
Incomplete elliptic integral of the third kind   double ellint_3( double k, double nu, double phi ) ;
Exponential integral                             double expint( double x ) ;
Hermite polynomials                              double hermite( unsigned n, double x ) ;
Hypergeometric series                            double hyperg( double a, double b, double c, double x ) ;
Laguerre polynomials                             double laguerre( unsigned n, double x ) ;
Legendre polynomials                             double legendre( unsigned l, double x ) ;
Riemann zeta function                            double riemann_zeta( double x ) ;
Spherical Bessel functions of the first kind     double sph_bessel( unsigned n, double x ) ;
Spherical associated Legendre functions          double sph_legendre( unsigned l, unsigned m, double theta ) ;
Spherical Neumann functions                      double sph_neumann( unsigned n, double x ) ;
  (spherical Bessel functions of the second kind)

Each function has two additional variants. Adding the suffix ‘f’ or ‘l’ to a function name gives a function that operates on float or long double values respectively. For example:

float sph_neumannf( unsigned n, float x ) ;
long double sph_neumannl( unsigned n, long double x ) ;

Wrapper reference

A wrapper reference is obtained from an instance of the template class reference_wrapper. Wrapper references are similar to normal references (‘&’) of the C++ language. To obtain a wrapper reference from any object the template function ref is used (for a constant reference, cref is used).

Wrapper references are useful above all for template functions, when we need to obtain references to parameters rather than copies:

// This function will obtain a reference to the parameter 'r' and increase it.
void f( int &r )  { r++ ; }

// Template function.
template< class F, class P > void g( F f, P t )  { f(t) ; }

int main()
{
  int i = 0 ;
  g( f, i ) ;  // 'g< void (*)( int & ), int >' is instantiated
               // then 'i' will not be modified.
  cout << i << endl ;  // Output -> 0

  g( f, ref(i) ) ;  // 'g< void (*)( int & ), reference_wrapper<int> >' is instantiated
                    // then 'i' will be modified.
  cout << i << endl ;  // Output -> 1
}

This new utility will be added to the existing <functional> header and doesn’t need further extensions of the C++ language.

Polymorphic wrappers for function objects

Polymorphic wrappers for function objects (also called “polymorphic function object wrappers”) are similar to function pointers in semantics and syntax, but are less tightly bound and can refer to anything that can be called (function pointers, function objects and so on) whose arguments are compatible with those of the wrapper.

The following example illustrates their characteristics:

function<int ( int, int )> pF ;  // Wrapper creation using
                                 // template class 'function'.

plus<int> add ;  // 'plus' is declared as 'template<class T> struct plus ;'
                 // then 'add' is a function object callable as 'int add( int x, int y )'.

pF = add ;  // Assignment is correct because
            // parameters and type of return correspond.

int a = pF( 1, 2 ) ;  // NOTE: if the wrapper 'pF' does not refer to any function
                      // the exception 'bad_function_call' is thrown.

function<bool ( short, short )> pg ;
if( !pg )  // Always true here because 'pg'
           // has not been assigned any function yet.
{
  bool adjacent( long x, long y ) ;
  pg = &adjacent ;  // Parameters and value of return are compatible,
                    // the assignment is correct.
  struct test
  {
    bool operator()( short x, short y ) ;
  } car ;
  pg = ref(car) ;  // 'ref' is a template function that returns a reference
                   // wrapper to the function object 'car'.
}
pF = pg ;  // It is correct because parameters and value of return of
           // wrapper 'pg' are compatible with those of wrapper 'pF'.

The template class function will be defined inside the header <functional>, and doesn't require any changes to the C++ language.

Type traits for metaprogramming

Metaprogramming consists of creating a program that creates or modifies another program (or itself). This can happen during compilation or during execution. The C++ Standards Committee has decided to introduce a library that allows metaprogramming during compilation through templates.

Here is an example of what is possible through metaprogramming using the current standard: a recursion of template instantiations that computes integer powers.

template< int B, int N >
struct Pow
{
  // recursive call and recombination.
  enum{ value = B*Pow< B, N-1 >::value } ;
} ;
template< int B > struct Pow< B, 0 >  // N == 0 condition of termination.
{
  enum{ value = 1 } ;
} ;
int quartic_of_three = Pow< 3, 4 >::value ;

Many algorithms can operate on different types of data; C++'s templates support generic programming and make code more compact and useful. Nevertheless it is common for algorithms to need information on the data types being used. This information can be extracted during instantiation of a template class using type traits.

Type traits can identify the category of an object and all the characteristics of a class (or of a struct). They are defined in the new header <type_traits>.

The next example shows the template function ‘elaborate’ which, depending on the given data types, will instantiate one of the two proposed algorithms (algorithm::do_it).

// First way of operating.
template< bool B > struct algorithm
{
  template< class T1, class T2 > static int do_it( T1 &, T2 & )  { /*...*/ }
} ;
// Second way of operating.
template<> struct algorithm<true>
{
  template< class T1, class T2 > static int do_it( T1, T2 )  { /*...*/ }
} ;

// Instantiating 'elaborate' will automatically instantiate the correct way to operate.
template< class T1, class T2 > int elaborate( T1 A, T2 B )
{
  // Use the second way only if 'T1' is an integer and if 'T2' is
  // in floating point, otherwise use the first way.
  return algorithm< is_integral<T1>::value && is_floating_point<T2>::value >::do_it( A, B ) ;
}

Through type traits, defined in the header <type_traits>, it is also possible to create type transformation operations (static_cast and const_cast are insufficient inside a template).

This type of programming produces elegant and concise code; however, the weak point of these techniques is debugging, which is uncomfortable during compilation and very difficult during program execution.

Uniform method for computing return type of function objects

Determining (during compilation) the return type of a template function object, especially if it depends on parameters of the same function, isn’t always intuitive.

Take for example the following code:

struct clear
{
  int    operator()( int    ) const ;  // The parameter type is
  double operator()( double ) const ;  // equal to the return type.
} ;

template< class Obj > class calculus
{
  public:
    template< class Arg > Arg operator()( Arg& a ) const
    {
      return member(a) ;
    }
  private:
    Obj member ;
} ;

When the class template calculus is instantiated with the class clear (i.e. calculus<clear>), the function object of calculus will always have the same return type as the function object of clear.

If we instantiate the class template calculus using the class confused (i.e. instantiating calculus<confused>):

struct confused
{
  double operator()( int    ) const ;  // The parameter type is NOT
  int    operator()( double ) const ;  // equal to the return type.
} ;

The return type of the function object of calculus will not be the same as that of the class confused: a conversion from int to double or vice versa takes place, depending on which instance of calculus<confused>::operator() is used.

The new library proposed for the next standard introduces the template class result_of, which makes it possible to determine and use the return type of a function object in any declaration. The corrected version calculus_ver2 uses the new utility to derive the return type of the function object:

template< class Obj >
class calculus_ver2
{
  public:
    template< class Arg >
    typename result_of<Obj(Arg)>::type operator()( Arg& a ) const
    { 
      return member(a) ;
    }
  private:
    Obj member ;
} ;

In this way, instances of the function object of calculus_ver2<confused> involve no conversions.

The problem of determining the return type of a call to a function object is a subset of the more general problem of determining the result type of an expression. In future this problem could be solved by extending the functionality of typeof to all possible cases.

References

C++ Standards Committee papers

  • ISO/IEC DTR 19768 (August 7, 2007) Doc No: N2389 State of C++ Evolution (pre-Kona 2007 Meetings)
  • ISO/IEC DTR 19768 (July 29, 2007) Doc No: N2336 State of C++ Evolution (Toronto 2007 Meetings)
  • ISO/IEC DTR 19768 (June 25, 2007) Doc No: N2291 State of C++ Evolution (Toronto 2007 Meetings)
  • ISO/IEC DTR 19768 (May 3, 2007) Doc No: N2228 State of C++ Evolution (Oxford 2007 Meetings)
  • ISO/IEC DTR 19768 (January 12, 2007) Doc No: N2142 State of C++ Evolution (between Portland and Oxford 2007 Meetings)
  • ISO/IEC DTR 19768 (November 3, 2006) Doc No: N2134 Working Draft, Standard for programming Language C++
  • ISO/IEC DTR 19768 (June 24, 2005) Doc No: N1836 Draft Technical Report on C++ Library Extensions
  • Lawrence Crowl (May 2, 2005) Doc No: N1815 ISO C++ Strategic Plan for Multithreading
  • Detlef Vollmann (June 24, 2005) Doc No: N1834 A Pleading for Reasonable Parallel Processing Support in C++
  • Lawrence Crowl (August 25, 2005) Doc No: N1874 Thread-Local Storage
  • Jan Kristoffersen (October 21, 2002) Doc No: N1401 Atomic operations with multi-threaded environments
  • Hans Boehm, Nick Maclaren (April 21, 2002) Doc No: N2016 Should volatile Acquire Atomicity and Thread Visibility Semantics?
  • Lois Goldthwaite (February 2, 2004) Doc No: N1592 Explicit Conversion Operators
  • Francis Glassborow, Lois Goldthwaite (November 5, 2004) Doc No: N1717 explicit class and default definitions
  • Bjarne Stroustrup, Gabriel Dos Reis (December 11, 2005) Doc No: N1919 Initializer lists
  • Herb Sutter, Francis Glassborow (April 6, 2006) Doc No: N1986 Delegating Constructors (revision 3)
  • Michel Michaud, Michael Wong (October 6, 2004) Doc No: N1898 Forwarding and inherited constructors
  • Bronek Kozicki (September 9, 2004) Doc No: N1676 Non-member overloaded copy assignment operator
  • R. Klarer, J. Maddock, B. Dawes, H. Hinnant (October 20, 2004) Doc No: N1720 Proposal to Add Static Assertions to the Core Language (Revision 3)
  • V Samko; J Willcock, J Järvi, D Gregor, A Lumsdaine (February 26, 2006) Doc No: N1968 Lambda expressions and closures for C++
  • J. Järvi, B. Stroustrup, D. Gregor, J. Siek, G. Dos Reis (September 12, 2004) Doc No: N1705 Decltype (and auto)
  • B. Stroustrup, G. Dos Reis, Mat Marcus, Walter E. Brown, Herb Sutter (April 7, 2003) Doc No: N1449 Proposal to add template aliases to C++
  • Douglas Gregor, Jaakko Järvi, Gary Powell (September 10, 2004) Doc No: N1704 Variadic Templates: Exploring the Design Space
  • Gabriel Dos Reis, Bjarne Stroustrup (October 20, 2005) Doc No: N1886 Specifying C++ concepts
  • Daveed Vandevoorde (January 14, 2005) Doc No: N1757 Right Angle Brackets (Revision 2)
  • Walter E. Brown (October 18, 2005) Doc No: N1891 Progress toward Opaque Typedefs for C++0X
  • J. Stephen Adamczyk (April 29, 2005) Doc No: N1811 Adding the long long type to C++ (Revision 3)
  • Chris Uzdavinis, Alisdair Meredith (August 29, 2005) Doc No: N1827 An Explicit Override Syntax for C++
  • Herb Sutter, David E. Miller (October 21, 2004) Doc No: N1719 Strongly Typed Enums (revision 1)
  • Matthew Austern (April 9, 2003) Doc No: N1456 A Proposal to Add Hash Tables to the Standard Library (revision 4)
  • Doug Gregor (November 8, 2002) Doc No: N1403 Proposal for adding tuple types into the standard library
  • John Maddock (March 3, 2003) Doc No: N1429 A Proposal to add Regular Expression to the Standard Library
  • P. Dimov, B. Dawes, G. Colvin (March 27, 2003) Doc No: N1450 A Proposal to Add General Purpose Smart Pointers to the Library Technical Report (Revision 1)
  • Doug Gregor (October 22, 2002) Doc No: N1402 A Proposal to add a Polymorphic Function Object Wrapper to the Standard Library
  • D. Gregor, P. Dimov (April 9, 2003) Doc No: N1453 A proposal to add a reference wrapper to the standard library (revision 1)
  • John Maddock (March 3, 2003) Doc No: N1424 A Proposal to add Type Traits to the Standard Library
  • Daveed Vandevoorde (April 18, 2003) Doc No: N1471 Reflective Metaprogramming in C++
  • Jens Maurer (April 10, 2003) Doc No: N1452 A Proposal to Add an Extensible Random Number Facility to the Standard Library (Revision 2)
  • Walter E. Brown (October 28, 2003) Doc No: N1542 A Proposal to Add Mathematical Special Functions to the C++ Standard Library (version 3)
  • Douglas Gregor, P. Dimov (April 9, 2003) Doc No: N1454 A uniform method for computing function object return types (revision 1)

Articles

  • The C++ Source Bjarne Stroustrup (January 2, 2006) A Brief Look at C++0x
  • C/C++ Users Journal Bjarne Stroustrup (May, 2005) The Design of C++0x: Reinforcing C++’s proven strengths, while moving into the future
  • Web Log di Raffaele Rialdi (September 16, 2005) Il futuro di C++ raccontato da Herb Sutter
  • Informit.com (August 5, 2006) The Explicit Conversion Operators Proposal
  • Informit.com (July 25, 2006) Introducing the Lambda Library
  • Dr. Dobb's Portal Pete Becker (April 11, 2006) Regular Expressions TR1's regex implementation
  • Informit.com (July 25, 2006) The Type Traits Library
  • Dr. Dobb's Portal Pete Becker (May 11, 2005) C++ Function Objects in TR1

External links