Vectorizing Iterator
Data parallel constructs (such as forall loops) are implicitly
vectorizable. If the --vectorize compiler flag is passed (it is implied by
--fast), the Chapel compiler emits vectorization hints to the backend
compiler, though the effect varies with the target compiler.
To let users request vectorization explicitly, this prototype
vectorizing iterator is provided. Loops that invoke this iterator are
marked with vectorization hints, provided the --vectorize flag is
passed.
This iterator is currently available for all Chapel programs and does not
require a use statement to make it available. In future releases it will
be moved to a standard module and will likely require a use statement to
make it available.
iter vectorizeOnly(iterables ...)
Vectorize only "wrapper" iterator:
This iterator wraps and vectorizes other iterators. It takes one or more iterables (an iterator or class/record with a these() iterator) and yields the same elements as the wrapped iterables.
This iterator provides a way to vectorize data parallel loops without invoking a parallel iterator, which avoids task creation for loops with small trip counts or where task creation isn't desirable.
Data parallel operations in Chapel such as forall loops are order-independent. However, a forall is implemented in terms of either leader/follower or standalone iterators which typically create tasks. This iterator exists to allow vectorization of order-independent loops without requiring task creation. By using this wrapper iterator you are asserting that the loop is order-independent (and thus a candidate for vectorization) just as you are when using a forall loop.
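The order-independence assertion can be illustrated with a small sketch. It assumes a Chapel release that still ships this prototype iterator; the array names A, B, C and the config constant n are hypothetical and chosen only for illustration.

```chapel
// Hypothetical problem size; override with --n=<value> at run time.
config const n = 8;

var A, B, C: [1..n] real;

// Fill the inputs with an ordinary serial loop.
for i in 1..n {
  B[i] = i: real;
  C[i] = 2.0 * i;
}

// Each iteration reads and writes only index i, so the iterations can
// run in any order. Wrapping the range asserts exactly that, and no
// tasks are created because this is a serial for loop.
for i in vectorizeOnly(1..n) do
  A[i] = B[i] + C[i];

// By contrast, a loop such as
//   for i in 2..n do A[i] = A[i-1] + 1.0;
// carries a dependence between iterations and should NOT be wrapped.

writeln(A);
```

Whether the loop is actually vectorized still depends on the --vectorize flag and on the backend compiler; the wrapper only supplies the hint.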
When invoked from a serial for loop, this iterator will simply mark your iterator(s) as order-independent. When invoked from a parallel forall loop this iterator will implicitly be order-independent because of the semantics of a forall, and additionally it will invoke the serial iterator instead of the parallel iterators. For instance:
forall i in vectorizeOnly(1..10) do;
for i in vectorizeOnly(1..10) do;
will both effectively generate:
CHPL_PRAGMA_IVDEP
for (i=1; i<=10; i+=1) {}
The vectorizeOnly iterator automatically handles zippering, so the zip keyword is not needed. For instance, to vectorize:
for (i, j) in zip(1..10, 1..10) do;
simply write:
for (i, j) in vectorizeOnly(1..10, 1..10) do;
Note that the use of zip is not explicitly prevented, but all iterators being zipped must be wrapped by a vectorizeOnly iterator. Future releases may explicitly prevent the use of zip with this iterator.
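As a rough sketch of the two spellings, assuming a Chapel release that still provides this prototype iterator, the loops below compute the same result. The array names viaWrapper and viaZip are hypothetical.

```chapel
var viaWrapper, viaZip: [1..10] int;

// Preferred form: the wrapper zippers its arguments itself.
for (i, j) in vectorizeOnly(1..10, 11..20) do
  viaWrapper[i] = i + j;

// Explicit zip: every zipped iterable is wrapped individually,
// as the note above requires.
for (i, j) in zip(vectorizeOnly(1..10), vectorizeOnly(11..20)) do
  viaZip[i] = i + j;

writeln(viaWrapper);
writeln(viaZip);
```

The first form is the safer choice, since the explicit zip spelling may be disallowed in a later release.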