algorithms and their applications cs2004 (2012-2013) dr stephen swift 17.1 exam revision

36
Algorithms and their Applications CS2004 (2012-2013) Dr Stephen Swift 17.1 Exam Revision

Upload: jade-waters

Post on 03-Jan-2016

222 views

Category:

Documents


0 download

TRANSCRIPT

Algorithms and their Applications

CS2004 (2012-2013)

Dr Stephen Swift

17.1 Exam Revision

CS2004 Exam Revision Slide 2

IntroductionThis lecture is a revision lecture for the examination for this moduleThe exam is on Thursday 7th May at 2pm

Numerous locations…Please note I was sent a lot of topics for this revision lecture

We can only cover some of them I am afraid...Some of them were sent rather late! ;-)

In this lecture we are going to cover:The assessmentThe format of the examExam topicsBrief revision of some topicsSome example questions

CS2004 Exam Revision Slide 3

Assessment

The overall structure for the assessment is as follows:

60% Coursework/Worksheets These should now be done!The last chance for assessment is this weekAt the usual times…

40% Exam (pending)

CS2004 Exam Revision Slide 4

Exam

A three hour examThe location should now be available

This will consist of:A multiple choice part A worth 40%An essay-type question part B worth 60%

All questions should be attempted

CS2004 Exam Revision Slide 5

What to Revise – Part 1

The exam will cover the theoretical aspects of the moduleThere will be no programming needed

No questions on Java or EclipseHowever you may need to understand and/or write some pseudo code

CS2004 Exam Revision Slide 6

What to Revise – Part 2Topics could include (not restricted to):

Algorithmic conceptsWhat is an algorithm, a program, etc…

Time Complexity and Asymptotic NotationT(n) and O(n)

Data structuresStacks, lists, arrays, queues, etc…

Sorting AlgorithmsBubblesort, Quicksort, etc…

CS2004 Exam Revision Slide 7

What to Revise – Part 3Topics could include (not restricted to):

Graph Traversal AlgorithmsDepth First, Breadth First, A*, MST, etc…

SearchSearch, Search Space, Fitness, Parameter optimisation, etc...

Heuristic Search MethodsHC, SHC, RRHC, SA, ILS, etc...

Evolutionary Computation and Other Methods

Genetic Algorithms, PSO, ACO, etc...

ApplicationsBin Packing, Data Clustering, TSP, etc...

CS2004 Exam Revision Slide 8

Exam – Part A – Part 1

Multiple choice questions (40%)

Choose an answer from a list of options20 questions of 2 marks eachSpend about 3 minutes per questionUse the multiple choice answer sheet

CS2004 Exam Revision Slide 9

Exam – Part A – Part 2

Good strategies for multiple choice questions:

Read through all of the questions firstAnswer the ones that you know firstDo NOT spend too much time on a single questionOften ruling out answers can reduce the options down to a fewDo not leave any blank!

CS2004 Exam Revision Slide 10

Exam – Part B – Part 1

Essay type questions4 questions of 15 marksSpend approximately 25 minutes per questionWrite your answers in the answer book

CS2004 Exam Revision Slide 11

Exam – Part B – Part 2

Good strategies for essay type questions:

Read through all of the questions firstAnswer the ones that you know firstDo NOT spend too much time on a single questionSketching a draft answer can help in laying out complex answersCross out anything you do not want marked

CS2004 Exam Revision Slide 12

Requested TopicsThe following subjects were emailed to me for revision topicsNote that I can only briefly go over them today given the time…

Ant Colony OptimisationParticle Swarm OptimisationData ClusteringBig-O Notation

CS2004 Exam Revision Slide 13

Swarm Intelligence Algorithms

Study of collective behaviour of decentralised, self-organised systems

No central controlOnly simple rules for each individualMost Popular Algorithms

Ant Colony Optimisation (ACO)Particle Swarm Optimisation (PSO)

CS2004 Exam Revision Slide 14

ACO – Key ConceptsUsed for solving graph based problemsAnts (blind) navigate from nest to food sourcesThe shortest path is discovered via pheromone trailsIndirect coordination/communication between agents or actions

Stigmergy

Route selection is determined according to the amount of pheromone on each edgePheromone levels on edges is changed

Ants deposit pheromone on the edges they traversePheromone evaporates over time

CS2004 Exam Revision Slide 15

PSO – Key ConceptsUsed for solving parameter estimation/optimisation based problemsIt uses a number of agents (particles) that constitute a swarm moving around in the search space looking for the best solutionEach particle is treated as a point in an n-dimensional space which adjusts its “flying” according to its own flying experience as well as the flying experience of other particlesA record is kept of each particles position, personal best (location) and the global best (location)

Each particle is accelerated towards its personal best and the global best

CS2004 Exam Revision Slide 16

Data Clustering – Part 1Data Clustering is a common technique for data

analysis, which is used in many fields, including

machine learning, data mining, pattern recognition,

image analysis and bioinformaticsData Clustering is the process of arranging objects (as

points) into a number of sets (k) according to “distance” Each set (ideally) shares some common trait - often

similarity or proximity for some defined distance measureEach set will be referred to as a cluster/groupFor the purposes of this module, each set is mutually

exclusive, i.e. an item cannot be in more than one cluster

CS2004 Exam Revision Slide 17

Data Clustering – Part 2The data that we are clustering usually consists of a number of examples (rows) (n) where we have measured a number of features (variables) (m =3 in the example below)We want to cluster the rows together based on how similar their features areWe shall assume that the data we are clustering is a table or matrix X, where xi is the ith row of X and xij is the jth variable (feature) or row i

For example x92 is 2.9 in the table below:Sample Sepal Length in cm Sepal Width in cm Petal Length in cm

1 5.1 3.5 1.42 4.9 3.0 1.43 4.7 3.2 1.34 4.6 3.1 1.55 5.0 3.6 1.46 5.4 3.9 1.77 4.6 3.4 1.48 5.0 3.4 1.59 4.4 2.9 1.410 4.9 3.1 1.5Etc... 5.4 3.7 1.5

CS2004 Exam Revision Slide 18

Data Clustering – Part 3

Data

ClusteringMethod

DataSimilarity

Number ofClusters

ClusterWorth

CS2004 Exam Revision Slide 19

Representing a ClusterA cluster will be represented as a vector C where ci=j means that object/item/row i is in cluster jFor example C = {1,2,3,2,3,3} (k=3)Largest cluster

The number that appears the most in the vector/cluster

Number of clustersThe most frequent number in the vector/count the clusters

1

23

6Cluster 1

Cluster 2

Cluster 3

4 5

CS2004 Exam Revision Slide 20

Big-O NotationUsed to compare algorithms

Why?

Gives a long term, large input size, limit on the growth rate of the effort an algorithm takesWe measure an algorithms effort in terms of counting primitive operationsWe derive a function T(n) (exact count) from the pseudo code, parameterised in terms of the algorithms input sizeSimple rules are applied to T(n) to get O(n)

What are they?

We will go over a worked example later…

CS2004 Exam Revision Slide 21

Example Multiple Choice – Part 1

What is pseudo code?a) A Java programb) A programming language independent description of an algorithmc) The intermediate code produced by the Java compilerd) An algorithmic description of a computer programe) None of the above

CS2004 Exam Revision Slide 22

Example Multiple Choice – Part 2Algorithm A runs in time complexity O(n2), algorithm B in O(n3) and algorithm C in O(n2). Which of the following statements is true?

a) Algorithm B is always faster than Cb) Algorithm A and C are the samec) Algorithm A is slower than Algorithm Bd) Algorithm A and C are asymptotically similar in performancee) None of the above

CS2004 Exam Revision Slide 23

Example Multiple Choice – Part 3Which of the following data structures would be appropriate for storing a list of names and addresses?

a) A stackb) A queuec) An arrayd) A linked liste) It depends on the application

CS2004 Exam Revision Slide 24

Example Multiple Choice – Part 4Which of the following is not part of a graph:

a) An arrowb) A nodec) An edged) A vertexe) An arc

CS2004 Exam Revision Slide 25

Example Multiple Choice – Part 5What is the fastest time that the following list of number could be sorted in descending order:

{10,9,8,7,6,5,4,3,2,1}a) No time – it is already sortedb) O(n)c) O(nln(n))d) O(n2)e) It depends on the algorithm

CS2004 Exam Revision Slide 26

Example Multiple Choice – Part 6A* search can be used for:

a) Finding the shortest route between two stations on the undergroundb) Optimising the parameters of a Neural Networkc) Organising similar objects into mutually exclusive setsd) Creating compilation CDse) Decision Planning

CS2004 Exam Revision Slide 27

Example Multiple Choice – Part 7Which of the following Heuristic Search Methods is the odd one out?

a) Random Restart Hill Climbingb) Stochastic Hill Climbingc) Genetic Algorithmd) Hill Climbinge) Simulated Annealing

CS2004 Exam Revision Slide 28

Example Multiple Choice – Part 8Which of the following statements is true regarding bin packing and data clustering?

a) Bin packing is slower than data clusteringb) Bin packing is used on small sized objects whilst data clustering is used on large objectsc) Both algorithms arrange objects into groupsd) Bin packing is used on 1-dimensional data whilst data clustering can be used on n-dimensional datae) They perform the same function

CS2004 Exam Revision Slide 29

Example Essay Type – Part 1Consider the following algorithm:

Algorithm 1. MaxArray(A)Input: An n row by m column Array A1) Let max = element 1,1 (A(1,1)) of Array A2) For i = 1 to n3) For j = 1 to m4) If A(i,j) > max Then5) Let max = A(i,j)6) End If7) End For8) End ForOutput: max- the largest element in array A

CS2004 Exam Revision Slide 30

Example Essay Type – Part 2For a total of 15 marks:Describe the algorithm in words (not pseudo code)Compute T(n) for algorithm 1Compute O(n) for algorithm 1How would you modify the algorithm to create a new algorithm MinArray?

CS2004 Exam Revision Slide 31

Example Essay Type – Part 3The algorithm is designed to locate the largest value in an array, passed as a parameter. It assumes that the maximum is equal to the first element (1,1) and then systematically examines each element in turn, updating the maximum if the current item under scrutiny is larger than the maximum. This maximum value is then returned by the algorithm.

CS2004 Exam Revision Slide 32

Example Essay Type – Part 4For T(n) count all variables reads, writes and operators, note we have two input sizes n and m

Algorithm 1. MaxArray(A)Input: An n row by m column Array A1) Let max = element 1,1 (A(1,1)) of Array A 22) For i = 1 to n n 3) For j = 1 to m n (m)4) If A(i,j) > max Then n m (5) [Assume

worse]5) Let max = A(i,j) n m (4)6) End If none7) End For none8) End For noneOutput: max- the largest element in array A none

CS2004 Exam Revision Slide 33

Example Essay Type – Part 5For T(n) continued...

2nnm5nm4nmT(n) = 10nm + n + 2

O(n) = nmTo create algorithm MinArray

We change the > on line 4 to a <We would also rename the algorithm name and results variable

CS2004 Exam Revision Slide 34

Example Essay Type – Part 6

B.3.1) Describe the main similarities and differences between a Genetic Algorithm (GA) and an Evolutionary Program (EP). [4 marks]

Taken from Part B of the CS2004 past exam paper 2011-2012

CS2004 Exam Revision Slide 35

Example Essay Type – Part 7Both maintain a populationOnly a GA has crossoverBoth have mutation but an EP has a much more complex mutation operator

All individual mutate in an EPOnly some in a GA

Both have selectionHowever a GA usually uses the Roulette Wheel which allows an individual to be selected zero or more timesAn EP uses Tournament Selection which allows an individual to be selected zero times or once

CS2004 Exam Revision Slide 36

Next Topic

There is none!Hopefully see you next yearGood luck!Any questions…