graph models 3

Upload: adalb3rt

Post on 18-Jul-2015


Jo Ellis-Monaghan
St. Michael's College, Colchester, VT 05439
e-mail: [email protected]
website: http://academics.smcvt.edu/jellis-monaghan

A graph, or network, is a set of vertices (dots) with edges (lines) connecting them.

Two vertices are adjacent if there is an edge between them. The vertices A and B above are adjacent because the edge AB is between them. An edge is incident to each of the vertices that are its endpoints.
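As a minimal sketch of these two definitions, a graph can be stored as a list of edges, each edge being a pair of endpoints. The vertex names and the edges BC and CD below are hypothetical (the figure is not reproduced here); only the edge AB is taken from the text.

```python
# Hypothetical small graph: only the edge AB comes from the text above.
vertices = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("C", "D")]

def adjacent(u, v):
    """Two vertices are adjacent if some edge has them as its two endpoints."""
    return any({u, v} == {a, b} for a, b in edges)

def incident_edges(v):
    """An edge is incident to each of its endpoints."""
    return [e for e in edges if v in e]

print(adjacent("A", "B"))    # True
print(adjacent("A", "C"))    # False
print(incident_edges("B"))   # [('A', 'B'), ('B', 'C')]
```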

The degree of a vertex is the number of edges sticking out from it.

Graphs and Networks
[Figure: example graphs on the vertices A, B, C, D, showing a multiple edge and a loop.]

The Kevin Bacon Game, or 6 Degrees of Separation
http://www.spub.ksu.edu/issues/v100/FA/n069/fea-making-bacon-fuqua.html

Bacon Number    # of People
0               1
1               1766
2               141840
3               385670
4               93598
5               7304
6               920
7               115
8               61

Total number of linkable actors: 631275
Weighted total of linkable actors: 1860181
Average Bacon number: 2.947

Connery Number  # of People
0               1
1               2216
2               204269
3               330591
4               32857
5               2948
6               409
7               46
8               8

Average Connery number: 2.706

Kevin Bacon is not even among the top 1000 most connected actors in Hollywood (he ranks 1222nd).
Data from The Oracle of Bacon at UVA.

Maximal Matchings in Bipartite Graphs
Start with any matching.
Find an alternating path:
  Start at an unmatched vertex on the left.
  End at an unmatched vertex on the right.
Switch matching to nonmatching and vice versa.
A maximal matching!
[Figure: A Bipartite Graph.]

The small world phenomenon
Stanley Milgram sent a series of traceable letters from people in the Midwest to one of two destinations in Boston. The letters could be sent only to someone whom the current holder knew by first name. Milgram kept track of the letters and found a median chain length of about six, thus supporting the notion of "six degrees of separation."
http://mathforum.org/mam/04/poster.html

Social Networks
[Figures: Stock Ownership (2001 NY Stock Exchange); Children's Social Network; Social Network of Sexual Contacts.]
http://mathforum.org/mam/04/poster.html

Infrastructure and Robustness
[Figures: MapQuest and JetBlue networks, labeled Scale Free and Distributed, each with a plot of number of vertices against vertex degree.]

Rolling Blackouts in August 2003
http://encyclopedia.thefreedictionary.com/_/viewer.aspx?path=2/2f/&name=2003-blackout-after.jpg

Some networks are more robust than others. But how do we measure this?
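Returning to the alternating-path procedure for bipartite matchings described above, here is a minimal sketch of it. It assumes the starting matching is the empty one (the slide allows any matching), and the example graph with left vertices a, b, c and right vertices x, y, z is hypothetical. Repeating the switch until no alternating path remains gives a matching of the largest possible size.

```python
def bipartite_matching(adj):
    """Grow a matching by repeatedly flipping alternating paths.

    adj maps each left vertex to the list of right vertices adjacent to it.
    Returns a dict {left vertex: right vertex}.
    """
    match_of_right = {}  # right vertex -> left vertex it is matched to

    def augment(u, seen):
        # Search for an alternating path from left vertex u
        # to an unmatched right vertex.
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            # Either v is unmatched (the path ends here), or the left vertex
            # currently matched to v can be re-matched somewhere else.
            if v not in match_of_right or augment(match_of_right[v], seen):
                match_of_right[v] = u  # switch nonmatching edge to matching
                return True
        return False

    for u in adj:  # every left vertex starts out unmatched
        augment(u, set())
    return {u: v for v, u in match_of_right.items()}

# Hypothetical bipartite graph, not the one from the slide.
example = {"a": ["x", "y"], "b": ["x"], "c": ["y", "z"]}
matching = bipartite_matching(example)
print(len(matching))  # 3
```

In this example b can only use x, so the search started at b pushes a from x over to y; that re-matching is the "switch matching to nonmatching and vice versa" step.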
http://www.caida.org/tools/visualization/mapnet/Backbones/

[Figures: a network modeled by a graph (electrical, communication, transportation); a functional network (can get from any vertex to any other along functioning edges); a dysfunctional network (vertices s and t can't communicate).]

Question: If each edge operates independently with probability p, what is the probability that the whole network is functional?

Deletion and Contraction is a Natural Reduction for Network Reliability
If an edge is working (this happens with probability p), it's as though the two vertices were touching, i.e. just contract the edge.
If an edge is not working (this happens with probability 1-p), it might as well not be there, i.e. just delete it.
Thus, if R(G;p) is the reliability of the network G where all edges function with a probability of p, and e is neither a bridge nor a loop, then
$R(G;p) = (1-p)\,R(G-e;p) + p\,R(G/e;p)$

Reliability Example
Note that if every edge of a connected network is a bridge (i.e. the network is a tree), then $R(G;p) = p^{E}$, where E is the number of edges. Also note that $R(\text{loop};p) = 1$.
E.g.: [figure: graphs stand for their reliabilities]
$R(G;p) = (1-p)\,p^{2} + p\,\big[(1-p)\,p + p\big] = (1-p)p^{2} + p^{2}(1-p) + p^{2}$
So $R(G;p) = 3p^{2} - 2p^{3}$ gives the probability that the network is functioning. E.g. $R(G;0.5) = 3(0.25) - 2(0.125) = 0.5$.
Bothersome question: Does the order in which the edges are deleted and contracted matter?

Conflict Scheduling
Draw edges between classes with conflicting times.
Color so that adjacent vertices have different colors.
Minimum number of colors = minimum required classrooms.
[Figure: conflict graph on the classes A, B, C, D, E, shown twice.]

Coloring Algorithm
The Chromatic Polynomial counts the ways to vertex color a graph:
C(G;n) = # proper vertex colorings of G in n colors.
[Figure: G + G/e = G-e]
Recursively: Let e be an edge of G. Then
$C(G;n) = C(G-e;n) - C(G/e;n)$,
and a single vertex has $C(\bullet;n) = n$.
[Figure: worked example] $n(n-1)^{2} + n(n-1) + 0 = n^{2}(n-1)$

Frequency Assignment
Assign frequencies to mobile radios and other users of the electromagnetic spectrum.
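Returning to the two deletion-contraction recurrences above (for R(G;p) and for C(G;n)), the following is a minimal sketch of both, not an efficient implementation. The triangle used as test input is an assumption: the slide's figure is not available, and the triangle is chosen because it yields 3p^2 - 2p^3. The helper name `contract` is hypothetical. The recurrence is applied to every non-loop edge, bridges included, which is safe because deleting a bridge leaves a disconnected graph of reliability 0.

```python
from fractions import Fraction
from itertools import permutations

def contract(edges, u, v):
    """Relabel endpoint v as u in every edge (merging v into u)."""
    return tuple((u if a == v else a, u if b == v else b) for a, b in edges)

def reliability(vertices, edges, p):
    """R(G;p): probability that G is connected when each edge works with probability p."""
    if not edges:
        return 1 if len(vertices) == 1 else 0
    (u, v), rest = edges[0], edges[1:]
    if u == v:  # a loop never helps or hurts connectivity
        return reliability(vertices, rest, p)
    deleted = reliability(vertices, rest, p)                            # edge fails
    contracted = reliability(vertices - {v}, contract(rest, u, v), p)   # edge works
    return (1 - p) * deleted + p * contracted

def chromatic(vertices, edges, n):
    """C(G;n): number of proper vertex colorings of G with n colors."""
    if not edges:
        return n ** len(vertices)
    (u, v), rest = edges[0], edges[1:]
    if u == v:  # a graph with a loop has no proper coloring
        return 0
    return (chromatic(vertices, rest, n)
            - chromatic(vertices - {v}, contract(rest, u, v), n))

# Assumed example graph: a triangle on A, B, C.
triangle_vertices = frozenset("ABC")
triangle_edges = (("A", "B"), ("B", "C"), ("A", "C"))
p = Fraction(1, 2)
print(reliability(triangle_vertices, triangle_edges, p))  # 1/2
print(chromatic(triangle_vertices, triangle_edges, 3))    # 6

# Try every order in which the edges can be deleted and contracted.
values = {reliability(triangle_vertices, order, p)
          for order in permutations(triangle_edges)}
print(len(values))  # 1
```

The last check speaks to the bothersome question on this one example: all six edge orders give the same value.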
Two customers that are sufficiently close must be assigned different frequencies, while those that are distant can share frequencies. Minimize the number of frequencies.
Vertices: users of mobile radios
Edges: between users whose frequencies might interfere
Colors: assignments of different frequencies
Need at least as many frequencies as the minimum number of colors required!

Conflict Scheduling
Register Allocation
Assign variables to hardware registers during program execution. Variables conflict with each other if one is used both before and after the other within a short period of time (for instance, within a subroutine). Minimize the use of non-register memory.
Vertices: the different variables
Edges: between variables which conflict with each other
Colors: assignment of registers
Need at least as many registers as the minimum number of colors required!

The Ising Model
Consider a sheet of metal: it has the property that at low temperatures it is magnetized, but as the temperature increases, the magnetism melts away. We would like to model this behavior. We make some simplifying assumptions to do so:
The individual atoms have a spin, i.e., they act like little bar magnets, and can either point up (a spin of +1), or down (a spin of -1).
Neighboring atoms with different spins have an interaction energy, which we will assume is constant.
The atoms are arranged in a regular lattice.

At low temperature, coalescing states are more probable and there is non-zero magnetization.
As the temperature rises, the states become more random, and the magnetization melts away.
Applet by Peter Young at http://bartok.ucsc.edu/peter/java/ising/keep/ising.html

Magnetization = $\frac{1}{N}\sum_{i} s_i$, Energy = $\frac{1}{N}\sum_{i,j} s_i s_j$, where N is the number of lattice points.
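A small sketch of the magnetization formula, reading it as the average spin (1/N) times the sum of the s_i. The two 4 by 4 lattices are hypothetical: an all-up state stands in for a coalesced low-temperature state, and a checkerboard stands in for a state with equally many up and down spins.

```python
def magnetization(spins):
    """Average spin: (1/N) * sum of s_i over all N lattice points."""
    flat = [s for row in spins for s in row]
    return sum(flat) / len(flat)

# Hypothetical 4 x 4 states of the lattice (+1 = up, -1 = down).
all_up = [[+1] * 4 for _ in range(4)]
checkerboard = [[+1 if (r + c) % 2 == 0 else -1 for c in range(4)]
                for r in range(4)]

print(magnetization(all_up))        # 1.0
print(magnetization(checkerboard))  # 0.0
```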
Critical temperature is $\frac{2}{\ln(1+\sqrt{2})}$.

Lattice and Hamiltonian
A choice of spins at each point gives what is called a state of the lattice.
The Hamiltonian (total energy) of a state w is
$H(w) = \sum f(s_i, s_j)$,
where the sum is over all adjacent points, and f is 0 if the spins are the same and 1 if they are different. H(w) is just the total number of edges in the state with different spins on their endpoints.

A Little Thermodynamics
The probability of a state occurring is:
$\frac{e^{-\beta H(w)}}{\sum_{\text{all states } w} e^{-\beta H(w)}}$
Here $\beta = \frac{1}{kT}$, where T is the temperature and k is the Boltzmann constant, $1.38 \times 10^{-23}$ joules/Kelvin.
The numerator is easy. The denominator, called the partition function, is the interesting (hard) piece.
It has a deletion-contraction reduction! Let $P(G;\beta) = \sum_{\text{all states } w} e^{-\beta H(w)}$. Then, since an edge contributes a factor $e^{-\beta}$ when its endpoints differ and 1 when they agree,
$P(G;\beta) = e^{-\beta}\,P(G-e;\beta) + (1 - e^{-\beta})\,P(G/e;\beta)$.

Rectilinear pattern recognition
joint work with J. Cohn (IBM), R. Snapp and D. Nardi (UVM)
IBM's objective is to check a chip's design and find all occurrences of a simple pattern to:
Find possible error spots
Check for already patented segments
Locate particular devices for updating

[Figures: The Haystack; The Needle.]

Pre-Processing
Algorithm is cutting edge, and not currently used for this application in industry.

BEGIN
/*GULP2A CALLED ON THU FEB 21 15:08:23 2002 */
EQUIV11000MICRON+X,+Y
MSGPER -1000000-10000001000000100000000
HEADER GYMGL1'OUTPUT 2002/02/21/14/47/12/cohn'
LEVEL PC
LEVEL RX
CNAME ULTCB8AD
CELL ULTCB8AD PRIME
PGON N RX146792378030014681807803001468180780600 +
1469020780600146902078030014691817803001469181 +
781710146902078171014690207814001468180781400 +
14681807817101467923781710
PGON N PC146850078210014683007821001468300781700 +
1468260781700146826078030014685007803001468500 +
780500146838078050014683807815001468500781500
RECT N PC14678007803451503298
ENDMSG
(Raw data format)

Two different layers/rectangles are combined into one layer that contains three shapes: one rectangle (purple) and two polygons (red and blue).

Both target pattern and entire chip are encoded like this, with the vertices also holding geometric information about the shape they represent. Then we do a depth-first search for the target subgraph. The additional information in the vertices reduces the search to linear time, while the entire chip encoding is theoretically $N^{2}$ in the number of faces, but practically $N \log N$.
[Figure: Linear time subgraph search for target.]

Netlist Layout
(joint work with J. Cohn, A. Dean, P. Gutwin, J. Lewis, G. Pangborn)
How do we convert this into this? [Figures: a netlist and a chip layout.]
A set S of vertices (the pins): hundreds of thousands.
A partition P1 of the pins (the gates): 2 to 1000 pins per gate, average of about 3.5.
A partition P2 of the pins (the wires): again 2 to 1000 pins per wire, average of about 3.5.
A maximum permitted delay between pairs of pins.

Netlist Example
[Figure labels: Gate; Pin; Wire.]

The Wiring Space
[Figure labels: The Wires; Placement layer (gates/pins go here); Vias (vertical connectors); Horizontal wiring layer; Vertical wiring layer; Up to 12 or so layers.]

The general idea
Place the pins so that pins are in their gates on the placement layer with non-overlapping gates.
Place the wires in the wiring space so that the delay constraints on pairs of pins are met, where delay is proportional to minimum distance within the wiring, and via delay is negligible.

Lots of Problems.

Identify Congestion
Identify dense substructures from the netlist.
Develop a congestion metric.
[Figure: netlist on the vertices A, B, C, D, E, F, G, H with congested areas marked; panels "What often happens" and "What would be good".]

Automate Wiring Small Configurations
Some are easy to place and route:
  Simple left to right logic
  No / few loops (circuits)
  Uniform, low fan-out
  Statistical models work
Some are very difficult:
  E.g. Crossbar Switches
  Many loops (circuits)
  Non-uniform fan-out
  Statistical models don't work

SPRING EMBEDDING
[Figures: Random layout; Spring embedded layout.]

Nano-Origami: Scientists At Scripps Research Create Single, Clonable Strand Of DNA That Folds Into An Octahedron
A group of scientists at The Scripps Research Institute has designed, constructed, and imaged a single strand of DNA that spontaneously folds into a highly rigid, nanoscale octahedron that is several million times smaller than the length of a standard ruler and about the size of several other common biological structures, such as a small virus or a cellular ribosome.
Biomolecular constructions
http://www.sciencedaily.com/releases/2004/02/040212082529.htm

DNA Strands Forming a Cube
http://seemanlab4.chem.NYU.edu

Assuring cohesion
A problem from biomolecular computing: physically constructing graphs by zipping together single strands of DNA.
[Figure: a configuration marked "not allowed".]
N. Jonoska, N. Saito, '02

A Characterization
A theorem of C. Thomassen specifies precisely when a graph may be constructed from a single strand of DNA, and theorems of Hongbing and Zhu characterize graphs that require at least m strands of DNA in their construction.
Theorem: A graph G may be constructed from a single strand of DNA if and only if G is connected, has no vertex of degree 1, and has a spanning tree T such that every connected component of G - E(T) has an even number of edges or a vertex v with degree greater than 3.

Oriented Walk Double Covering and Bidirectional Double Tracing
Fan Hongbing, Xuding Zhu, 1998
The authors of this paper came across the problem of bidirectional double tracing by considering the so called garbage collecting problem, where a garbage collecting truck needs to traverse each side of every street exactly once, making as few U-turns (retractions) as possible.

L. M. Adleman, Molecular Computation of Solutions to Combinatorial Problems. Science, 266 (5187) Nov. 11 (1994) 1021-1024.

DNA sequencing (joint work with I. Sarmiento)
[Figure: overlapping snippets AGGCTC, AGGCT, GGCTC, TCTAC, CTCTA, TTCTA, CTACT.]
It is very hard in general to read off the sequence of a long strand of DNA. Instead, researchers probe for snippets of a fixed length, and read those. The problem then becomes reconstructing the original long strand of DNA from the set of snippets.

Enumerating the reconstructions
This leads to a directed graph with the same number of in-arrows as out-arrows at each vertex. The number of reconstructions is then equal to the number of paths through the graph that traverse all the edges in the direction of their arrows.
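The counting described above can be sketched by brute force on a small case. This is an illustration under stated assumptions, not the method of the talk: the strand and the snippet length k = 3 are hypothetical, the strand is treated as linear (so the walk is a trail with distinct start and end rather than a closed circuit), and each snippet is taken as an arrow from its first k-1 letters to its last k-1 letters.

```python
def reconstructions(snippets):
    """All strings whose length-k substrings, with multiplicity, are exactly `snippets`."""
    results = set()

    def walk(node, used, text):
        # node holds the last k-1 letters of the text built so far.
        if len(used) == len(snippets):
            results.add(text)
            return
        for i, s in enumerate(snippets):
            # Follow an unused arrow that leaves the current vertex.
            if i not in used and s[:-1] == node:
                walk(s[1:], used | {i}, text + s[-1])

    for start in {s[:-1] for s in snippets}:
        walk(start, frozenset(), start)
    return results

strand = "CTGGCATTGACAG"  # hypothetical strand, not from the slides
k = 3
snippets = [strand[i:i + k] for i in range(len(strand) - k + 1)]
for r in sorted(reconstructions(snippets)):
    print(r)
# CTGACATTGGCAG
# CTGGCATTGACAG
```

Here the pairs TG and CA each occur twice and interleave, so the stretches between them can be swapped: the snippets permit exactly two reconstructions, one of which is the original strand.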
Graph Polynomials Encode the Enumeration
A very fancy polynomial, the interlace polynomial, of Arratia, Bollobás, and Sorkin, 2000, encodes the number of ways to reassemble the original strand of DNA. It is related, with a lot of work, to the contraction-deletion approach of the Chromatic and Reliability polynomials.

[Figures: the snippet graph; a chord diagram with chords labeled a, b, c, d; the associated circle graph on the vertices a, b, c, d.]
The interlace polynomial is computed, not on the snippet graph, but on an associated circle graph.

Pendant Duplicate Graphs
[Figures: adding a pendant vertex v' to v; duplicating the vertex v as v'; effect of adding a pendant vertex or duplicating a vertex, shown on a diagram with chords a, b, c, v.]

Theorem
A set of subsequences of DNA permits exactly two reconstructions iff the circle graph associated to any Eulerian circuit of the snippet graph is a pendant-duplicate graph.

Side note to the cognoscenti: Pendant-duplicate graphs correspond to series-parallel graphs via a medial graph construction, so having exactly two reconstructions is actually a new interpretation of the beta invariant.
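The step from a chord diagram to its circle graph, mentioned above, can be sketched as follows. The sketch assumes the diagram is given as a word in which each chord's letter appears exactly twice, in the order the chord endpoints are met around the circle. The word "abcadbcd" is hypothetical and is not the diagram from the slide. Two chords are joined in the circle graph when they cross, which happens exactly when their letters alternate in the word.

```python
from itertools import combinations

def circle_graph(word):
    """Edges of the circle graph of a chord diagram given as a double-occurrence word."""
    first, second = {}, {}
    for pos, letter in enumerate(word):
        if letter in first:
            second[letter] = pos
        else:
            first[letter] = pos
    edges = []
    for x, y in combinations(sorted(first), 2):
        # Chords x and y cross exactly when one endpoint of y,
        # but not both, lies between the two endpoints of x.
        y_first_inside = first[x] < first[y] < second[x]
        y_second_inside = first[x] < second[y] < second[x]
        if y_first_inside != y_second_inside:
            edges.append((x, y))
    return edges

print(circle_graph("abcadbcd"))
# [('a', 'b'), ('a', 'c'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
```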