OpenMP Examples - Part 1
Mirto Musci, PhD Candidate
Department of Computer Science, University of Pavia
Processors Architecture Class, Fall 2011
Outline

1. Recap: Syntax; Parallelization Constructs; Data Environment; Synchronization
2. Examples: Basic; Bug Fixing
3. Assignments: Assignment 1: Pi; Assignment 2: Quicksort
OpenMP Syntax
Most OpenMP constructs are pragmas:

#pragma omp construct [clause [clause] ...]

(FORTRAN uses !$OMP sentinels, not covered here.)
An OpenMP construct applies to a structured block, usually enclosed by { }.

In addition, the API provides:
Several omp_<something> runtime library calls
Several OMP_<something> environment variables
Controlling OpenMP Behavior
Runtime library calls and, for each one, a matching environment variable:

omp_set_num_threads(int) / omp_get_num_threads()
Control/query the number of threads used for parallelization (the maximum, in case of dynamic adjustment)
omp_set_num_threads() must be called from sequential code
Can also be set via the OMP_NUM_THREADS environment variable

omp_get_thread_num()
Returns the identifier of the calling thread within its team
Controlling OpenMP Behavior II
omp_get_num_procs()
How many processors are currently available?

omp_set_nested(int) / omp_get_nested()
Enable/query nested parallelism

omp_in_parallel()
Am I currently running in parallel mode?

omp_get_wtime()
A portable way to measure wall-clock time
Parallel Regions
Main construct: #pragma omp parallel

Defines a parallel region over a structured block of code
Threads are created as the "parallel" pragma is crossed
Threads block at the end of the region (implicit barrier)
Work Sharing: For
Used to assign each thread an independent set of iterations
Threads must wait at the end of the loop
The directives can be combined:

#pragma omp parallel for

Only simple kinds of for loops are allowed:
A single signed integer loop variable
Initialization: var = init
Comparison: var op last, with op one of <, >, <=, >=
Increment: var++, var--, var += incr, var -= incr, etc.
Work Sharing: Sections
answer1 = long_computation_1();
answer2 = long_computation_2();
if (answer1 != answer2) { ... }

How to parallelize? These are just two independent computations!

#pragma omp sections
{
    #pragma omp section
    answer1 = long_computation_1();
    #pragma omp section
    answer2 = long_computation_2();
}
if (answer1 != answer2) { ... }
Schedule Clause: Controlling Work Distribution
schedule(static [, chunksize])
Default: chunks of approximately equal size, one per thread
If there are more chunks than threads, they are assigned round-robin to the threads
Why might we want chunks of different sizes?

schedule(dynamic [, chunksize])
Threads receive chunk assignments dynamically
Default chunk size = 1 (why?)

schedule(guided [, chunksize])
Start with large chunks
Threads receive chunks dynamically
Chunk size shrinks exponentially, down to chunksize
Data Visibility
Shared-memory programming model

Most variables (including locals) are shared by default (unlike Pthreads!)

{
    int sum = 0;
    #pragma omp parallel for
    for (int i = 0; i < N; i++) sum += i;
}

(note that the concurrent updates to the shared sum are a data race)

Global variables are shared

Some variables can be private:
Automatic variables declared inside the statement block
Automatic variables in the called functions
Variables can be explicitly declared private; in that case, a local copy is created for each thread
Overriding Storage Attributes
private:

A copy of the variable is created for each thread
No connection between the original variable and the private copies
Can achieve the same effect using variables declared inside { }

int i;

#pragma omp parallel for \
        private(i)
for (i = 0; i < n; i++) { ... }
Overriding Storage Attributes II
firstprivate:

Same, but the initial value is copied from the main copy

lastprivate:

Same, but the value from the sequentially last iteration is copied back to the main copy

int idx = 1;
int x = 10;

#pragma omp parallel for \
        firstprivate(x) \
        lastprivate(idx)
for (i = 0; i < n; i++) {
    if (data[i] == x) idx = i;
}
Single
#pragma omp single

Only one of the threads executes the following block of code
The rest wait for it to complete
Good for non-thread-safe regions of code (such as I/O)
Must be used inside a parallel region
Applicable to parallel for sections
Master
#pragma omp master

The following block is executed by the master thread only
No synchronization involved
Applicable only inside parallel regions

#pragma omp parallel
{
    do_preprocessing();
    #pragma omp single
    read_input();
    #pragma omp master
    notify_input_consumed();
    do_processing();
}
Critical Sections
#pragma omp critical [name]

Standard critical-section functionality
Critical sections are global to the program
Can be used to protect a single resource accessed in different functions
Critical sections are identified by their name:
All unnamed critical sections are mutually exclusive throughout the program
All critical sections with the same name are mutually exclusive with each other

int x = 0;
#pragma omp parallel shared(x)
{
    #pragma omp critical
    x++;
}
Atomic Execution
Critical sections on the cheap:

Protects a single variable update
Can be much more efficient (a dedicated assembly instruction on some architectures)

#pragma omp atomic
update_statement

The update statement is one of: var = var op expr, var op= expr, var++, var--
The variable must be a scalar
The operation op is one of: +, -, *, /, ^, &, |, <<, >>
The evaluation of expr is NOT atomic!
Ordered
#pragma omp ordered
statement

Executes the statement in the sequential order of the loop iterations
(the enclosing worksharing loop must carry the ordered clause)

Example:

#pragma omp parallel for ordered
for (j = 0; j < N; j++) {
    int result = heavy_computation(j);
    #pragma omp ordered
    printf("computation(%d) = %d\n", j, result);
}
Barrier Synchronization

#pragma omp barrier

Performs a barrier synchronization among all the threads in a team at the given point.

Example:

#pragma omp parallel
{
    int result = heavy_computation_part1();
    #pragma omp atomic
    sum += result;
    #pragma omp barrier
    heavy_computation_part2(sum);
}
Reduction
for (j = 0; j < N; j++) {
    sum = sum + a[j] * b[j];
}

How to parallelize this code?

sum is not private (it is shared), but accessing it atomically is too expensive
Instead, keep a private copy of sum in each thread, then add the copies up

Use the reduction clause!
#pragma omp parallel for reduction(+: sum)

Any associative operator can be used: +, -, ||, |, *, etc.
The private copy is initialized automatically (to 0, 1, ~0, ...)
Practical Stuff

Log in on your machine and open a terminal
Copy the whole content of the shared exercise directory to your home directory:

cp -r /home/etc/scambio/parallelismo/* ~

Open the source files with your favorite editor (e.g., gedit)
In case you need to consult the OpenMP reference manual, change your browser connection settings to use odino.unipv.it as a proxy (port 8080)
Exercise 1: Hello World
Take a moment to examine the source code and note how OpenMP directives and library routines are used.
Compile the code with:

gcc -fopenmp omp_hello.c -o hello

To run the code, simply launch the hello executable.
How many threads were created? Why?
Exercise 1: Hello World
Vary the number of threads and re-run Hello World.
Set the number of threads to use with the OMP_NUM_THREADS environment variable:

OMP_NUM_THREADS=4

Do you know other ways to set the number of threads?
Your output should look similar to the following; the actual order of the output lines may vary:

Hello World from thread = 0
Number of threads = 4
Hello World from thread = 3
Hello World from thread = 1
Hello World from thread = 2
Exercise 2: Environment Information
Starting from scratch, write a simple program that obtains information about your OpenMP environment.
Alternatively, you can modify the "hello" program to do this.
Using the appropriate OpenMP functions, have the master thread query and print the following:
The number of processors available
The number of threads being used
The maximum number of threads available
Whether you are in a parallel region
Whether dynamic thread adjustment is enabled
Whether nested parallelism is supported

If you need help, you can consult the omp_getEnvInfo example file.
Exercise 3: Parallel For
This example demonstrates the OpenMP for work-sharing construct.
It specifies dynamic scheduling of threads and assigns a specific number of iterations to be done by each thread.
After reviewing the source code, compile and run the executable (assuming OMP_NUM_THREADS is still set to 4):

gcc -fopenmp omp_workshare1.c -o workshare1
workshare1 | sort

Review the output. Note that it is piped through the sort utility; this makes it easier to see how the loop iterations were actually scheduled.
Exercise 3: Parallel For
Run the program a couple more times and review the output. What do you see?
Typically, dynamic scheduling is not deterministic.
Every time you run the program, different threads can run different chunks of work.
It is even possible that a thread does no work at all, because another thread is quicker and takes more work.
It might even be possible for one thread to do all of the work.
Exercise 3: Parallel For
Edit the workshare1 source file and switch to static scheduling.
Recompile and run the modified program. Notice the difference in output compared to dynamic scheduling.
Specifically, notice that thread 0 gets the first chunk, thread 1 the second chunk, and so on.
Rerun the program. Does the output change?
With static scheduling, the allocation of work is deterministic and should not change between runs.
Every thread gets work to do.
Reflect on the possible performance differences between dynamic and static scheduling.
Exercise 4: Sections
This example demonstrates the OpenMP sections work-sharing construct.
Note how the parallel region is divided into separate sections, each of which is executed by one thread.
As before, compile and execute the program after reviewing it:

gcc -fopenmp omp_workshare2.c -o workshare2
workshare2

Run the program several times and observe any differences in the output.
Exercise 4: Sections
Because there are only two sections, you should notice that some threads do no work.
You may or may not notice that the threads doing the work can vary.
For example, the first time thread 0 and thread 1 may do the work, and the next time it may be thread 0 and thread 3.
It is even possible for one thread to do all of the work.
Which thread does the work is non-deterministic in this case.
Exercise 5: Orphan Directive
This example computes a dot product in parallel.
It differs from the previous examples because the parallel loop construct is orphaned:
it is contained in a subroutine outside the lexical extent of the main program's parallel region.
After reviewing the source code, compile and run the program:

gcc -fopenmp omp_orphan.c -o orphan
orphan | sort

Note the result... and the fact that this example will come back to haunt us as omp_bug6 later.
Exercise 6: Matrix Multiply
This example performs a matrix multiply, distributing the iterations of the operation among the available threads.
After reviewing the source code, compile and run the program:

gcc -fopenmp omp_mm.c -o matmult

Review the output. It shows which thread did each iteration, and the final result matrix.
Run the program again, but this time sort the output to clearly see which threads execute which iterations:

matmult | sort | grep Thread

Do the loop iterations match the schedule(static, chunk) clause for the matrix multiply loop in the code?
When things go wrong...
There are many things that can go wrong when developing OpenMP programs.
The omp_bugX.X series of programs demonstrates just a few.
See if you can figure out what the problem is in each case, and then fix it.
The buggy behavior differs for each example. Some hints are provided in the next slide.
More detailed explanations are provided in the slides that follow.
Don't cheat!
Hints

Code       Hint
omp_bug1   Fails compilation. Solution provided.
omp_bug2   Thread identifiers are wrong. Wrong answers.
omp_bug3   Run-time error, hang.
omp_bug4   Causes a segmentation fault. Script provided.
omp_bug5   Program hangs. Solution provided.
omp_bug6   Fails compilation.
Explanations: omp_bug1
This exercise attempts to show the use of the combined parallel for directive.
It fails because the loop does not come immediately after the directive.
The correction consists of removing all statements between the parallel for directive and the actual loop.
Logic is added to preserve the ability to query the thread id and print it from inside the loop.
Notice the use of the firstprivate clause to initialize the flag.
Explanations: omp_bug2
The bugs in this case are caused by neglecting to scope the tid and total variables as private.
By default, most OpenMP variables are scoped as shared.
These variables need to be unique to each thread.
Explanations: omp_bug3
The run-time error is caused by the omp barrier directive in the print_results subroutine.
By definition, an omp barrier cannot be nested inside a sections region, not even dynamically via a called subroutine.
In this case it is orphaned: lexically outside, but called from within, the sections block.
Explanations: omp_bug4
The OpenMP thread stack size is an implementation-dependent resource.
In this case, the array is too large to fit into the thread stack space, causing the segmentation fault.
A solution is provided. Note that it is a script and will need to be "sourced", for example: source omp_bug4fix.
Be sure to examine the solution file to see what's going on.
In the last line, you may need to change the name of the executable to match yours.
Explanations: omp_bug5
The problem is that the first thread acquires locka and then tries to get lockb before releasing locka.
Meanwhile, the second thread has acquired lockb and then tries to get locka before releasing lockb.
The provided solution overcomes the deadlock by using the locks correctly.
Explanations: omp_bug6
With orphaned directives, the correct scoping of variables is critical.
The error occurs because the sum variable is scoped incorrectly.
See the omp_orphan routine for one example of correct scoping.
Note that there are other ways to fix it.
Numerical Integration
Mathematically, we know that:

\int_0^1 \frac{4.0}{1 + x^2}\,dx = \pi

We can approximate the integral as a sum of rectangles:

\sum_{i=0}^{N} F(x_i)\,\Delta x \approx \pi

where each rectangle has width \Delta x and height F(x_i) at the middle of interval i.
Serial Code
#include <stdio.h>

static long num_steps = 100000;
double step, pi;

int main(void)
{
    int i;
    double x, sum = 0.0;

    step = 1.0 / (double) num_steps;

    for (i = 0; i < num_steps; i++) {
        x = (i + 0.5) * step;
        sum = sum + 4.0 / (1.0 + x * x);
    }
    pi = step * sum;
    printf("Pi = %f\n", pi);
}

Parallelize the numerical integration code using OpenMP:
Which variables can be shared?
Which variables need to be private?
Which variables should be set up for reductions?
Parallel Code
#include <stdio.h>

static long num_steps = 100000;
double step, pi;

int main(void)
{
    int i;
    double x, sum = 0.0;

    step = 1.0 / (double) num_steps;

    #pragma omp parallel for \
            private(x) reduction(+:sum)
    for (i = 0; i < num_steps; i++) {
        x = (i + 0.5) * step;
        sum = sum + 4.0 / (1.0 + x * x);
    }
    pi = step * sum;
    printf("Pi = %f\n", pi);
}

The parallelization code is a one-liner!
sum is a reduction variable, hence shared
i is private, since it is the loop variable
Assignment
Modify the pi calculation example so that you can:

vary the number of steps at run time
Will the calculated pi value change?

get the total time for the calculation using omp_get_wtime

Implement the computational core in a separate function, and call it varying the number of threads spawned
Observe the differences in elapsed time
What happens if you use more threads than available processors?

Advanced: reimplement without using the reduction clause
Is it slower? Faster?
Quicksort Algorithm
Given an array of n elements (e.g., integers):
If the array only contains one element: return.
Else:

Pick one element to use as the pivot
Partition the elements into two sub-arrays:
elements less than or equal to the pivot
elements greater than the pivot
Recursively quicksort the two sub-arrays
Return the results
Considerations
There are a number of ways to pick the pivot element:
Commonly the first or last element, but this gives bad performance if the array is already ordered
A random index or the middle-point index solves the problem

After partitioning, the sub-arrays can be stored in the original data array.
Partitioning loops through the array, swapping elements.
Serial Code
void quicksort(int a[], int lower, int upper)
{
    int i;
    if (upper > lower) {
        i = partition(a, lower, upper);
        quicksort(a, lower, i - 1);
        quicksort(a, i + 1, upper);
    }
}
Assignment

Refine the serial implementation provided
Try to parallelize the code using OpenMP... it is not easy!
Try with section constructs, or experiment with task
Remember that the code is recursive!
Call omp_set_nested(1)
Somehow limit thread spawning
Carefully measure performance with omp_get_wtime
Appendix: For Further Reading

Blaise Barney. OpenMP Exercise, 2011.
https://computing.llnl.gov/tutorials/openMP/exercise.html