Introduction:-
The aim of technology has always been solving problems facing humanity. The aim isn’t only to
solve problems, but to solve them efficiently in the least possible time. With time problems
become more complex to solve (i.e.: better hardware needed to solve these problems quickly).
In the field of Computer Science, the hardware component responsible for how fast will a
problem be solved is the processor.
In the past hardware corporations like Intel used to introduce faster processor each year. But at
some point they figured out that they wouldn’t be able to introduce new uni-core processors
with more transistors; and since the speed of a processor is directly proportional with the
number of transistors, it wasn’t feasible to have faster uni-core processors.
Hardware corporations found that the way to introduce faster processors is to introduce multicore
processors. Introducing multi-core processors resulted in a problem that former
programming models don’t support multi-core processors programming. So, new programming
models and languages were introduced for programmers to be able to utilize the presence of
multi-cores in one processor. The article demonstrates one of these new models which is
Cilk++.

Introduction:-The aim of technology has always been solving problems facing humanity. The aim isn’t only tosolve problems, but to solve them efficiently in the least possible time. With time problemsbecome more complex to solve (i.e.: better hardware needed to solve these problems quickly).In the field of Computer Science, the hardware component responsible for how fast will aproblem be solved is the processor.In the past hardware corporations like Intel used to introduce faster processor each year. But atsome point they figured out that they wouldn’t be able to introduce new uni-core processorswith more transistors; and since the speed of a processor is directly proportional with thenumber of transistors, it wasn’t feasible to have faster uni-core processors.Hardware corporations found that the way to introduce faster processors is to introduce multicoreprocessors. Introducing multi-core processors resulted in a problem that formerprogramming models don’t support multi-core processors programming. So, new programmingmodels and languages were introduced for programmers to be able to utilize the presence ofmulti-coresin one processor. The article demonstrates one of these new models which isCilk++.

Cilk++ VS C++:-


Cilk++ development started in 1994 in of the MIT labs. Cilk++ is based on C++. So, writing Cilk++

is exactly the same as writing C++ with the ability of programming parallel applications with the

use of some new keywords specifically introduced to Cilk++ to enable parallel programming.

The new keywords are: cilk, spawn, sync, inlet and abort. The “cilk” keyword is added to the

header of any function to notify the compiler that this function may include parallel

programming (i.e.: one of the other Cilk++ specific keywords might be included in this function).

Below is a detailed description for the rule of each of the other keywords.

Spawn:

 

The “spawn” keyword can be added to the start of any line of code to notify the processor that

it can execute this line of code on a separate core of the processor if possible. This line of code

might be a call for function or even a set of functions giving the ability to run a piece of code

rather

Sync:

The “sync” keyword is closely related to the “spawn” keyword. After one or more lines of code

has been “spawned” the “sync” keyword can be put afterwards to notify the processor that

should stop executing the code till all spawned processes finish processing. The advantage of

“sync” is to synchronize parallel processes in order to ensure secure coding (i.e.: no problem

will take place as a result of a process demanding resources that are being used by another

process running in parallel). Below is a figure showing the functionality of “sync”.

Inlet:

The “inlet” keyword is used in advanced programming. Whenever a parent process spawns

other processes, these child processes are supposed to return results to the parent process.

Inlet is used in order to make sure that no write conflicts will take place as a result of multiple child processes writing in a variable in the same parent process. An “inlet” is like a function that

can be written inside another function to control how return values of child processes are to be

written to the parent process and to ensure that only one child process can write to the return

value of the parent process at a time. The code below shows how “inlet” can be used in a

function used to calculate Fibonacci numbers.

cilk int fib (int n)

{

int x = 0;

inlet void summer (int result)

{

x += result;

return;

}

if (n<2) return n;

else {

summer(spawn fib (n-1));

summer(spawn fib (n-2));

sync;

return (x);

}

}

Abort:

The “abort” keyword is also used in advanced programming. Sometimes a process is spawned

in multiple processes, and at some point the result is reached and no further processing of

parallel processes is needed, so the abort keyword is used to stop the unneeded processing of

these other processes. For example, if a search algorithm works by means of having multiple

processes working in parallel on different parts of an array of integers or so, if one of the

processes finds the element the algorithm is searching for, it notifies the processor to stop the

execution of the rest of the processes in order not to waste time and resources.

Parallelism:-

 

 

 

 

 

 

Parallelism is an attribute whose value can be calculated to show how much beneficial was

Cilk++ for an algorithm over running the same algorithm in the ordinary sequential way. To

calculate the value of parallelism for an algorithm both work and depth has to be calculated for

this algorithm. Work is equivalent to the total number of operations done by the algorithm.

Depth is the length of the longest sequential dependence chain. Parallelism is equivalent to the

value of “work” over the value of “depth”. The figure below shows an example for parallelism.

Cilk work-stealing scheduler:-

 

The Cilk work-stealing scheduler is one of the most beneficial features of Cilk. The aim from it is

to ensure maximum utilization of processor’s capabilities for programs to execute in shorter

time. For example, in case of a processor with X cores and a program is running X processes on

its X cores waiting for them to finish in order to output the result; if one of these processes

finishes before the others, it starts taking tasks from the bottom of other process’s stack. The

figure below shows two processes running in parallel on two different cores, where one of

them finishes before the other then use the work-stealing feature to help the unfinished

process in its work.

Practical examples:-

Fibonacci:

Calculating Fibonacci numbers is a good example for an operation that would run in much less

time if processed in parallel on multiple cores of a processor. As shown below in the code and

figure, the recursion of the function is processed in parallel which resulted in executing the

function in much less time.

Without parallelization:

 

int fib (int n)

{

if (n <= 2) return 1;

else

{

int x, y;

x = fib (n-1);

y = fib (n-2);

return x + y;

}

}

With parallelization:

int fib (int n)

{

if (n <= 2) return 1;

else

{

int x, y;

x = cilk spawnfib (n-1);

y = cilk spawnfib (n-2);

cilk sync;

return x + y;

}

}

Quick Sort:

Quick sort is regarded as one of the efficient algorithms for sorting data structures of data

according to a specific criterion for sorting. Quick sort is one of those algorithms that would run

in much less time if implemented using Cilk++ to run on multi-core processors. The idea behind

quick sort is to take the 1st element of data and put it its correct order then repeat the same

thing for the items less than it and the items greater than it till all the elements are sorted.

Below is the algorithm along with a figure to demonstrate how a multi-core processor is utilized

to run it when implemented using Cilk++.

 

c static void QuickSort(index low, index high)

{

if (low < high) then

{

index p = Partition(low, high);

cilk spawn QuickSort (low, p -1);

cilk spawn QuickSort (p +1, high);

cilk sync;

}

}

public static index Partition(index low, index high)

{

Index i,

j,

p;

keytype pivot;

pivot = S[low];

j = low;

for (i = low + 1; i <= high; i++) do

if (S[i] < pivot) then

{

j++;

Exchange S[i] and S[j];

}

p = j;

Exchange S[low] and S[p];

return p;

}

Cilk++ VS OpenMP:-

OpenMP is another programming model for parallel programming. Below is a diagram showing

difference in execution time between Cilk++ and OpenMP when executing the quick sort code

for both the adaptive and parallel models.

Cilk++:-

Cilk++ is available for windows and linux operating systems and can be downloaded from the

following link: http://software.intel.com/en-us/articles/download-intel-cilksdk/

Cilk++ for MAC was a project under development that was never completed and not under

development anymore.

References:-

[1] Prof. Richard (Rich) Vuduc, Introduction to Cilk++. Georgia Tech, College of Computing,

2010.

[2] Mingdong Feng1, Charles E. Leiserson2, Efficient Detection of Determinacy Races in Cilk

Programs. 1 National University of Singapore, 2 MIT Laboratory for Computer Science.

[3] Matteo Frigo, Multithreaded Programming in Cilk. CILK ARTS.

[4] Charles E. Leiserson, Aske Plaat, Programming Parallel Applications in Cilk. MIT Laboratory

for Computer Science.

[5] Cilk 5.4.6 Reference Manual. MIT Laboratory for Computer Science.

[6] Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H.

Randall, Yuli Zhou, Cilk: An Efficient Multithreaded Runtime System. MIT Laboratory for

Computer Science.

 

 

Advertisements