data structures – p.1/38 - gabriel istrate -...
Organizational
• Last time: linked list.
• Today: doubly linked list, stack, queue, circular linked list. Lists in STL. Start advanced topic: skip list.
• Also: First project.
• Not given today, no Monday/Tuesday deadlines (so that you can attend class).
• posted Wednesday evening on the webpage.
• Due two weeks from Wednesday (Thursday morning at 10AM)
Last time: Deleting from head of the list
int List::deleteFromHead() {   // throws string if the list is empty
  if (isEmpty())
    throw string("Empty");
  node *tmp = head;
  int info = head->getInfo();
  if (head == tail)            // only one node in the list
    head = tail = 0;
  else
    head = head->getNextNode();
  delete tmp;
  return info;
}
Catching exceptions
void f()
{
  . . . . . .
  try {
    n = list.deleteFromHead();
    // do something with n;
  } catch (const string& s) {   // deleteFromHead throws a string, not a char *
    cerr << "Error: " << s << endl;
  }
  . . . . . .
}
Deleting the successor of a node pointed to
by pointer r
• Have to test whether the pointer is 0.
• Also: whether the successor is 0.
• Simply modify the value of the field next;
• Want to reclaim the memory allocated for the successor node;
• C++: delete;
• In the absence of deallocation: memory leaks.
• The program's memory use grows continuously and can eventually exhaust memory and freeze your computer.
Deleting the successor of a node pointed to
by pointer r
void List::deleteNextNode(node *r) {   // throws string on invalid input
  if (!r)
    throw string("Null pointer in deleteNextNode");
  if (r->getNextNode() == 0)
    throw string("attempt to delete nonexisting node");
  node *s = r->getNextNode();
  r->setNextNode(s->getNextNode());
  if (s == tail)      // deleting the last node: r becomes the new tail
    tail = r;
  delete s;
}
Better solution for throwing exceptions
• Can throw complex objects as exceptions.
• Allows much better recovery (e.g. examining what went wrong).
• Solution: define your own exception types (classes).
• Multiple catch blocks, one per object type.
• String (or char *): handled in the last catch block, as the default exception type.
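The points above can be sketched as follows. The class names ListError, EmptyListError and NullPointerError and the functions riskyOperation and classify are hypothetical, chosen only for illustration; deriving from std::runtime_error is one common choice, not something the slides prescribe.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical exception hierarchy for the list classes (names are illustrative).
class ListError : public std::runtime_error {
public:
    explicit ListError(const std::string& msg) : std::runtime_error(msg) {}
};

class EmptyListError : public ListError {
public:
    EmptyListError() : ListError("list is empty") {}
};

class NullPointerError : public ListError {
public:
    explicit NullPointerError(const std::string& where)
        : ListError("null pointer in " + where) {}
};

// Hypothetical operation that fails in one of three ways, depending on code.
void riskyOperation(int code) {
    if (code == 1) throw EmptyListError();
    if (code == 2) throw NullPointerError("deleteNextNode");
    if (code == 3) throw std::string("unclassified failure");
}

// Calls riskyOperation and reports which handler ran.
// Handlers are tried in order, so the most specific types come first.
std::string classify(int code) {
    try {
        riskyOperation(code);
        return "ok";
    } catch (const EmptyListError&) {
        return "empty";
    } catch (const ListError& e) {      // any other list error
        return std::string("list error: ") + e.what();
    } catch (const std::string& s) {    // plain string, handled last
        return "string: " + s;
    }
}
```

Because each failure has its own type, the caller can recover differently from an empty list than from a null pointer, which a single string message does not allow.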
Doubly linked lists
• Functions PredecessorNode() and deleteFromTail() expose a problem: no efficient way to go "backwards" in a singly linked list.
• Solution: doubly linked list. Nodes hold two pointers: one to the predecessor node, one to the successor node.
• Update schemes change slightly.
Doubly linked list: implementation
class Dllnode{
public:
  Dllnode(){
    next = prev = 0;
  }
  Dllnode(int i, Dllnode *n=0, Dllnode *p=0){
    info = i;
    next = n;
    prev = p;
  }
  . . . . . .
  int getInfo(){return info;}
  Dllnode *getNextNode(){return next;}
  Dllnode *getPrevNode(){return prev;}
private:
  int info;
  Dllnode *next, *prev;   // both are pointers; "Dllnode* next,prev" would make prev an object
};
Doubly linked list: implementation (II)
class Dllist{
public:
  Dllist();
  ~Dllist();
  . . .
  void addToDllTail(int);
  int deleteFromDllTail();
  bool isEmpty();
private:
  Dllnode* head;
  Dllnode* tail;
};
Adding to the Tail of a Doubly Linked List
void Dllist::addToDllTail(int info){
  if (tail != 0){
    tail = new Dllnode(info, 0, tail);
    (tail->getPrevNode())->setNextNode(tail); // step *
  }
  else head = tail = new Dllnode(info);
}
Deleting from the tail of a Doubly Linked List
int Dllist::deleteFromDllTail(){
  assert(!isEmpty());
  int i = tail->getInfo();   // info is private in Dllnode
  if (head == tail){ // only one element in the list
    delete head;
    head = tail = 0;
  }
  else { // more than one element in the list
    tail = tail->getPrevNode();
    delete tail->getNextNode();
    tail->setNextNode(0);
  }
  return i;
}
Correctness
• Why is step * in function addToDllTail correct?
• The code is executed when member tail != 0.
• The newly created node has its prev pointer set to this (old) value of tail.
• Thus tail->prev points to an existing node, so its next pointer can be accessed.
• Deletion: the special case is a single-node list. In this case the list becomes empty after deletion.
• Otherwise: store the last value in the list. Move member tail to the previous node (directly accessed). Delete the last node (now accessed through tail->next). Also set the next pointer of the new last node to zero.
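The reasoning above can be checked on a compact, self-contained sketch. DNode and DList are simplified, hypothetical stand-ins for Dllnode and Dllist (public fields instead of getters and setters), and demoDList is an illustrative helper that does not come from the slides.

```cpp
#include <cassert>

// Simplified stand-ins for Dllnode/Dllist (names and public fields are assumptions).
struct DNode {
    int info;
    DNode* next;
    DNode* prev;
    DNode(int i, DNode* n = nullptr, DNode* p = nullptr) : info(i), next(n), prev(p) {}
};

class DList {
public:
    DList() : head(nullptr), tail(nullptr) {}
    ~DList() { while (!isEmpty()) deleteFromTail(); }
    DList(const DList&) = delete;
    DList& operator=(const DList&) = delete;

    bool isEmpty() const { return head == nullptr; }

    void addToTail(int info) {
        if (tail != nullptr) {
            tail = new DNode(info, nullptr, tail); // new node's prev is the old tail
            tail->prev->next = tail;               // step *: the old tail is non-null
        } else {
            head = tail = new DNode(info);
        }
    }

    int deleteFromTail() {
        assert(!isEmpty());
        int i = tail->info;
        if (head == tail) {          // single node: the list becomes empty
            delete head;
            head = tail = nullptr;
        } else {
            tail = tail->prev;       // O(1) step backwards
            delete tail->next;       // the old last node
            tail->next = nullptr;
        }
        return i;
    }

private:
    DNode* head;
    DNode* tail;
};

// Adds 1, 2, 3 and removes them from the tail, packing the removed
// values into one number in removal order.
int demoDList() {
    DList l;
    for (int i = 1; i <= 3; ++i) l.addToTail(i);
    int result = 0;
    while (!l.isEmpty()) result = result * 10 + l.deleteFromTail();
    return result;
}
```

Values come back in the order 3, 2, 1, and each removal takes constant time because the predecessor is reached through prev rather than by a scan from head.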
STACKS, QUEUES, DEQUES
• STACK: insert/delete elements at the tail.
• QUEUE: insert at the end, delete at the front.
• DEQUE (double-ended queue): insert/delete both at front and back.
STACKS, QUEUES, DEQUES
• Delete from the end of the list: deleteNextNode(predecessorNode(tail)); or you can implement it directly as member function deleteFromTail.
• STACK: interface exposes functions addToTail and deleteFromTail (usually called push and pop).
• QUEUE: interface exposes addToTail and deleteFromHead.
• DEQUE: both addToHead and addToTail, also deleteFromHead, deleteFromTail.
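A minimal sketch of how a stack and a queue expose different subsets of the same list operations. The classes IntStack and IntQueue are hypothetical, and std::list stands in for the hand-written list class of the slides (push_back plays the role of addToTail, pop_back of deleteFromTail, pop_front of deleteFromHead).

```cpp
#include <cassert>
#include <list>

// Stack: insert and delete at the tail only.
class IntStack {
public:
    void push(int x) { items.push_back(x); }      // addToTail
    int pop() {                                   // deleteFromTail
        assert(!items.empty());
        int x = items.back();
        items.pop_back();
        return x;
    }
    bool isEmpty() const { return items.empty(); }
private:
    std::list<int> items;
};

// Queue: insert at the tail, delete at the head.
class IntQueue {
public:
    void enqueue(int x) { items.push_back(x); }   // addToTail
    int dequeue() {                               // deleteFromHead
        assert(!items.empty());
        int x = items.front();
        items.pop_front();
        return x;
    }
    bool isEmpty() const { return items.empty(); }
private:
    std::list<int> items;
};
```

After pushing 1, 2, 3 the stack returns 3 first (last in, first out), while the queue returns 1 first (first in, first out). The standard library already offers std::stack, std::queue and std::deque for production code.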
Destructor
List::~List(){
  for (node *p; !isEmpty(); ){
    p = head->getNextNode();
    delete head;
    head = p;
  }
}
• Nodes are dynamically allocated memory, which has to be freed.
• REMEMBER: the second expression in the for statement is the loop condition, so it functions as a WHILE test.
• We keep a temporary pointer to the node next to the head, delete head, and advance the head pointer.
Circular lists
• A list in which nodes form a ring: list is finite and every node has a successor.
• E.g. several processes using a shared resource for the same amount of time (time sharing), and we make the processes take turns.
• Processes are put on a circular list accessed by pointer current.
• After a process takes its time, the pointer advances to the next process.
• node class: same as the one for singly linked list from last course.
• Implementation: can use a singly linked list and make the last node point to the first rather than to zero.
• Interface can be the same, functions (e.g. inserting, deleting nodes) will be different.
• Only need one pointer, tail.
Implementing insertion at the end of the
circular list
void CLList::addToTail(int el){
  if (isEmpty())
  {
    // handle addition to an empty list separately
    tail = new node(el);
    tail->setNextNode(tail);
  }
  else
  {
    // the new node points to the first node and follows the old tail
    tail->setNextNode(new node(el, tail->getNextNode()));
    tail = tail->getNextNode();   // the new node becomes the tail
  }
}
Issues with previous implementation
• Deleting the tail node requires looping around the list to reach the tail's predecessor, so that the predecessor's successor pointer can be updated.
• Deleting the tail node therefore has complexity O(n).
• Processing data in reverse order is not efficient: O(n^2).
• Alternative: circular doubly linked list.
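The following self-contained sketch puts the pieces together: insertion at the tail, round-robin traversal, and the O(n) walk needed to delete the tail. CNode, CircularList and roundRobin are hypothetical names with public fields for brevity; they are not the classes from the slides.

```cpp
#include <cassert>
#include <vector>

struct CNode {
    int info;
    CNode* next;
    explicit CNode(int i, CNode* n = nullptr) : info(i), next(n) {}
};

class CircularList {
public:
    CircularList() : tail(nullptr) {}
    ~CircularList() { while (!isEmpty()) deleteFromTail(); }
    CircularList(const CircularList&) = delete;
    CircularList& operator=(const CircularList&) = delete;

    bool isEmpty() const { return tail == nullptr; }

    void addToTail(int el) {
        if (isEmpty()) {
            tail = new CNode(el);
            tail->next = tail;                      // a single node points to itself
        } else {
            tail->next = new CNode(el, tail->next); // new node points to the head
            tail = tail->next;                      // the new node becomes the tail
        }
    }

    // O(n): must walk the ring to find the predecessor of tail.
    int deleteFromTail() {
        assert(!isEmpty());
        int el = tail->info;
        if (tail->next == tail) {                   // only one node
            delete tail;
            tail = nullptr;
        } else {
            CNode* pred = tail->next;               // start at the head
            while (pred->next != tail) pred = pred->next;
            pred->next = tail->next;                // bypass the old tail
            delete tail;
            tail = pred;
        }
        return el;
    }

    // Visits `steps` nodes in round-robin order, starting from the head.
    std::vector<int> roundRobin(int steps) const {
        std::vector<int> visited;
        if (isEmpty()) return visited;
        const CNode* current = tail->next;
        for (int i = 0; i < steps; ++i) {
            visited.push_back(current->info);
            current = current->next;
        }
        return visited;
    }

private:
    CNode* tail;   // the head is always tail->next
};
```

With the elements 1, 2, 3 inserted, five round-robin steps visit 1, 2, 3, 1, 2: the traversal never hits a null pointer, which is what makes the ring convenient for time sharing.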
Lists in the STL
#include <list>
// list class library
using namespace std;
// Create a "list" object, specifying its content as "int".
// The "list" class does not have the same "random access" capability
// as the "vector" class, but it is possible to add elements at
// the end of the list and take them off the front.
list<int> list1;
// Add some values at the end of the list, which is initially empty.
// The member function "push_back" adds an item at the end of the list.
int value1 = 10;
int value2 = -3;
list1.push_back(value1);
list1.push_back(value2);
list1.push_back(5);
list1.push_back(1);
Lists in the STL (II)
// Output the list values, by repeatedly getting the item from
// the "front" of the list, outputting it, and removing it
// from the front of the list.
// cout << endl << "List values:" << endl;
// Loop as long as there are still elements in the list.
while (list1.size() > 0)
{
// Get the value of the "front" list item.
int value = list1.front();
// Output the value.
cout << value << endl;
// Remove the item from the front of the list ("pop_front"
// member function).
list1.pop_front();
}
Lists in the STL
Commonly used member functions:
• size_type size() const; returns number of elements in the list.
• bool empty() const; returns TRUE if list is empty, FALSE otherwise.
• void push_back(const T& x); void push_front(const T& x); insert element x at the back (front) of the list.
• T& front(); T& back(); return references to the front/back elements.
• begin(), end(): iterators to the beginning of the list and to the position just past its last element.
• iterator insert(iterator position, const T& x); insert element x before the element (if any) pointed to by the iterator.
• void clear(); remove all elements from the list.
• void remove(const T& value); remove all elements equal to value. Type T must support operator==.
• void sort(); void reverse();
Lists in the STL
Other member functions:
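• c.assign(n, elem): assign n copies of element elem to the list.
• c.resize(num): modifies the container so that it has exactly num elements, inserting elements at the end or erasing elements from the end if necessary. New elements get the value produced by the default constructor (for integers, 0).
• c.unique(): removes consecutive duplicates, keeping the first element of each run of equal values.
• iterator previous(iterator pos): returns an iterator to the position before the one pointed to by pos. This is a member of the SGI slist, not of list; list iterators are bidirectional, so --pos does the same job.
• list: doubly linked list. For singly linked lists use slist (an SGI STL extension; since C++11 the standard library provides forward_list).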
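A short sketch exercising several of the member functions listed above on a std::list<int>. The function name listDemo is hypothetical, and std::prev requires C++11 or later; the comments track the list contents after each call.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Applies several std::list member functions and returns the final contents.
std::vector<int> listDemo() {
    std::list<int> c;
    c.assign(3, 7);                    // 7 7 7
    c.push_front(2);                   // 2 7 7 7
    c.push_back(5);                    // 2 7 7 7 5
    c.unique();                        // 2 7 5      (consecutive duplicates removed)
    c.resize(5);                       // 2 7 5 0 0  (padded with int() == 0)
    c.remove(0);                       // 2 7 5
    c.sort();                          // 2 5 7
    c.insert(std::prev(c.end()), 6);   // 2 5 6 7    (inserted before the last element)
    c.reverse();                       // 7 6 5 2
    return std::vector<int>(c.begin(), c.end());
}
```

Note that unique() only merges neighbouring equal values, so a list is usually sorted first when all duplicates should go.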
Iterators in STL
• "Smart" generalizations of pointers. Allow access to a data structure (list, vector, etc.).
• Input/output iterators: the first only allow reading the value through "dereferencing", not changing it. The second only guarantee write access: it is possible to assign a value through an output iterator, but not necessarily possible to read that value back.
• Both are single-pass iterators.
• Forward iterators: allow multiple passes, but in one direction only.
• Bidirectional iterators: can increment and decrement.
• References: Google, STL documentation (webpage), Scott Meyers "Effective STL" (available in Romanian from Teora). Also his other books on C++.
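As an illustration of bidirectional iterators, the sketch below walks a std::list in both directions and writes through an iterator. The function names walkBothWays and doubleAll are hypothetical.

```cpp
#include <cassert>
#include <list>
#include <string>

// Walks a list forward, then backward, using the same iterator type.
std::string walkBothWays(const std::list<char>& letters) {
    std::string out;
    // forward pass: ++ moves toward end()
    for (std::list<char>::const_iterator it = letters.begin(); it != letters.end(); ++it)
        out += *it;
    out += '|';
    // backward pass: -- is legal because list iterators are bidirectional
    std::list<char>::const_iterator it = letters.end();
    while (it != letters.begin()) {
        --it;
        out += *it;
    }
    return out;
}

// Writing through a non-const iterator changes the element in place.
void doubleAll(std::list<int>& values) {
    for (std::list<int>::iterator it = values.begin(); it != values.end(); ++it)
        *it *= 2;
}
```

For the list a, b, c the first function produces "abc|cba". The same backward loop would not compile for a forward-only iterator, since that category does not provide operator--.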
Case Study: concordance problem
• GIVEN: text (i.e. a sequence of words).
• TO DO: parse the text and discover the words. For each word examine whether this is its first occurrence.
• If this is the case, memorize it.
• Otherwise increment a counter holding the number of occurrences of the given word.
• Similar issues are encountered in analyzing natural language (natural language processing). What is the text about? Statistical language processing.
• Also in compilers.
• First phase in a compiler: lexical analysis.
• Identifies words. Represents the program by lexical tokens.
• Example: x = 5; becomes [IDENT "x"] [EQL] [NUMBER 5] [SEMICOLON].
• To represent x: <identifier>. Also: pointer to a list (table) of identifiers.
Solution
• Create a linked list of words.
• For each new word:
• If not found in the list add it.
• Otherwise increment a counter associated to the given word.
Implementation: interfaces
I am showing only the more significant elements of the interfaces:
struct Token{
  std::string word;
  int count;
};
struct TokenNode{
  Token info;
  TokenNode *next;
};
class TokenNodeList{
public:
  TokenNodeList();
  void insertOrIncreaseCount(std::string);
private:
  TokenNode *head;
  TokenNode *tail;
};
Comments on implementation
• C++: a struct is a class whose members are public by default. It allows e.g. constructors. Though I don't recommend public data members, it's a better alternative to C-style structs.
• Useful to maintain the list in sorted order: to insert a new word we have to determine anyway that it doesn't appear in the list.
• Order on words: lexicographic order. The way words are listed in a dictionary: compare first letter first, then second letter, etc. Prefixes are smaller.
• insertOrIncreaseCount(): search until you find the element or a larger one. If found, increase the count. Reuse code insertInfoBefore(), insertInfo() (Course 3).
// Returns true if a comes before b in lexicographic order.
bool lessThan(const string& a, const string& b){
  int l1 = a.length(); int l2 = b.length();
  for (int i = 0; i < min(l1, l2); i++)
    if (a[i] < b[i])
      return true;
    else
      if (a[i] > b[i])
        return false;
  return l1 < l2;   // equal on the common part: a proper prefix is smaller
}
Implementing functions
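void TokenNodeList::insertOrIncreaseCount(string s){
  TokenNode *first = head;    // node being examined
  TokenNode *second = 0;      // its predecessor
  // advance while the current word is smaller than s; stop at the end of the list
  while (first != 0 && lessThan(first->info.word, s))
  {
    second = first;
    first = first->next;
  }
  if (first != 0 && first->info.word == s)
  { // string found in list
    first->info.count++;
  }else
    insertInfo(second, s);    // assumed to insert after second (at the head if second is 0)
}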
Concordance: main function
#include <fstream>
#include <string>
using namespace std;
int main(int argc, char *argv[]) {
  if (argc < 2)      // the input file name is expected as the first argument
    return 1;
  string nextItem;
  TokenNodeList tl;
  fstream file_op(argv[1], ios::in);
  while (file_op >> nextItem)
    tl.insertOrIncreaseCount(nextItem);
  // do something with token list
  . . . . . .
  return 0;
}
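For experimenting without the hand-written list class, the same idea can be sketched on top of std::list. This is a substitute technique, not the implementation from the slides: the free functions insertOrIncreaseCount and concordance are hypothetical, and the text is read from a string rather than from a file.

```cpp
#include <cassert>
#include <list>
#include <sstream>
#include <string>

struct Token {
    std::string word;
    int count;
};

// Keeps `tokens` sorted by word; inserts a new word or bumps its counter.
void insertOrIncreaseCount(std::list<Token>& tokens, const std::string& s) {
    std::list<Token>::iterator it = tokens.begin();
    while (it != tokens.end() && it->word < s)   // std::string's < is lexicographic
        ++it;
    if (it != tokens.end() && it->word == s) {
        ++it->count;                             // word already present
    } else {
        Token t;
        t.word = s;
        t.count = 1;
        tokens.insert(it, t);                    // before the first larger word
    }
}

// Builds the concordance of a text held in memory (a stand-in for the input file).
std::list<Token> concordance(const std::string& text) {
    std::list<Token> tokens;
    std::istringstream in(text);
    std::string word;
    while (in >> word)                           // words are separated by whitespace
        insertOrIncreaseCount(tokens, word);
    return tokens;
}
```

For the text "b a b c a b" the resulting list holds a (2), b (3), c (1), in that order. Each insertion still scans the list sequentially, which is the cost that motivates skip lists.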
Advanced topic: Skip lists
• Drawback with linked list: require sequential access to locate a searched-for element.
• Ordering: can speed up searching, but sequential search still required.
• Solution: lists that allow skipping some elements to speed up search.
• Skip lists: variant of ordered linked lists that make such search possible.
• More advanced data structure (W. Pugh, "Skip Lists: A Probabilistic Alternative to Balanced Trees", Communications of the ACM 33 (1990), pp. 668-676).
• If anyone is curious about or interested in data structures/algorithms, I can give you the paper to read: a taste of what a research article looks like.
Skip lists: implementation
• For $k = 1, \dots, \lfloor \log_2 n \rfloor$ and $1 \le i \le \lfloor n/2^{k-1} \rfloor - 1$:
• Item $2^{k-1} \cdot i$ points to item $2^{k-1} \cdot (i+1)$.
• every second node points to the position two nodes ahead,
• every fourth node points to the position four nodes ahead,
• every eighth node points to the position eight nodes ahead,
• . . . . . ., and so on.
• Different number of pointers in different nodes in the list !
• half the nodes have only one pointer,
• a quarter of the nodes have two pointers,
• an eighth of the nodes have three pointers,
• . . . . . ., and so on.
• Approximately how many times more pointers than in a singly linked list?
Algorithm
• If you guessed about two times, you guessed right:
• $n/2 \cdot 1 + n/4 \cdot 2 + n/8 \cdot 3 + \dots = n \sum_{k \ge 1} k/2^k$.
• Equivalently, level $k$ contributes about $n/2^{k-1}$ pointers: $n + n/2 + n/4 + \dots$
• How many terms in the sum? $\lfloor \log_2 n \rfloor + 1$, and the sum stays below $2n$. Total: approximately $2n$ pointers.
• The number of pointers in a node: the level of the node.
• Levels: from 1 to $\lfloor \log_2 n \rfloor + 1$.
• To search: first follow pointers on the highest level until a larger element is found or the list is exhausted.
• If a larger element is found, restart the search from its predecessor, this time on a lower level.
• Continue doing this until the element is found, or you reach the first level and find a larger element or the end of the list.
Pseudocode
find(element el){
  p = the nonnull list on the highest level i;
  while (el not found and i ≥ 0)
    if (p->key > el)
      p = a sublist that begins in the predecessor of p on the level --i;
    else if (p->key < el)
      if p is the last element on the level i
        p = a nonnull sublist that begins in p on the highest level < i;
        i = the number of this level;
      else
        p = p->next;
}
Inserting and deleting nodes
• Problem: when inserting/deleting a node, pointers of following nodes have to be restructured.
• Solution: rather than equal spacing, random spacing on a level.
• Number of nodes on each level approximately preserved.
• Level numbering: start with zero.
• New node inserted: probability 1/2 on first level, 1/4 on second level, 1/8 on third level, etc.
• Function chooseLevel: chooses the level of the new node.
• Generate a random number in [0,1). If in [0,1/2) level 1, in [1/2,3/4) level 2, etc.
• Construct for the "typical" case.
• Use randomness to simplify constructions.
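One possible way to realise chooseLevel is by repeated coin flips, sketched below. This is an assumption rather than the method behind the slides: the constant kMaxLevel mirrors MAXLEVEL, levels are numbered from 1 as in the probabilities above, and countLevelOne is a hypothetical helper for checking the distribution.

```cpp
#include <cassert>
#include <random>

const int kMaxLevel = 4;   // assumed cap, mirrors MAXLEVEL from the slides

// Returns a level in 1..kMaxLevel: level 1 with probability 1/2, level 2 with 1/4, ...
// The top level absorbs the remaining probability, so the cap is never exceeded.
int chooseLevel(std::mt19937& gen) {
    std::bernoulli_distribution coin(0.5);
    int level = 1;
    while (level < kMaxLevel && coin(gen))   // each "heads" promotes the node one level
        ++level;
    return level;
}

// Draws `trials` levels and counts how many landed on level 1.
int countLevelOne(int trials, unsigned seed) {
    std::mt19937 gen(seed);
    int ones = 0;
    for (int i = 0; i < trials; ++i)
        if (chooseLevel(gen) == 1) ++ones;
    return ones;
}
```

Over many draws roughly half of the results are level 1, a quarter level 2, and so on, which keeps the expected number of nodes per level close to the evenly spaced layout without any restructuring.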
Skip list node: interface
#define MAXLEVEL 4
class SkipListNode{
public:
  SkipListNode(){}
  int key;
  SkipListNode **next;   // array of pointers, one per level of this node
};
class SkipList{
public:
  SkipList();
  bool isEmpty() const;
  void choosePowers();
  int chooseLevel();
  int *SkipListSearch(int);
  void SkipListInsert(int);
private:
  typedef SkipListNode* nodePtr;
  nodePtr root[MAXLEVEL];
  int powers[MAXLEVEL];
};
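To make the interface concrete, here is a minimal, self-contained sketch of a randomized skip list with search and insertion. It is not the class from the slides: SimpleSkipList, contains, insert and kSkipMaxLevel are hypothetical names, a sentinel header node replaces the root array, levels are indexed from 0, std::vector replaces the raw pointer array, and the level is chosen by coin flips.

```cpp
#include <cassert>
#include <random>
#include <vector>

const int kSkipMaxLevel = 4;   // assumed cap on the number of levels

class SimpleSkipList {
public:
    explicit SimpleSkipList(unsigned seed = 1)
        : header(new Node(0, kSkipMaxLevel)), gen(seed) {}

    ~SimpleSkipList() {
        Node* p = header;
        while (p != nullptr) {              // level 0 links every node
            Node* nxt = p->next[0];
            delete p;
            p = nxt;
        }
    }
    SimpleSkipList(const SimpleSkipList&) = delete;
    SimpleSkipList& operator=(const SimpleSkipList&) = delete;

    bool contains(int key) const {
        const Node* p = header;
        for (int lvl = kSkipMaxLevel - 1; lvl >= 0; --lvl)   // highest level first
            while (p->next[lvl] != nullptr && p->next[lvl]->key < key)
                p = p->next[lvl];           // move right while the next key is smaller
        p = p->next[0];                     // candidate on the bottom level
        return p != nullptr && p->key == key;
    }

    void insert(int key) {
        // update[lvl] = last node on level lvl whose key is smaller than `key`
        std::vector<Node*> update(kSkipMaxLevel, header);
        Node* p = header;
        for (int lvl = kSkipMaxLevel - 1; lvl >= 0; --lvl) {
            while (p->next[lvl] != nullptr && p->next[lvl]->key < key)
                p = p->next[lvl];
            update[lvl] = p;
        }
        if (p->next[0] != nullptr && p->next[0]->key == key)
            return;                         // duplicates are ignored
        int levels = chooseLevel();
        Node* node = new Node(key, levels);
        for (int lvl = 0; lvl < levels; ++lvl) {   // splice into each of its levels
            node->next[lvl] = update[lvl]->next[lvl];
            update[lvl]->next[lvl] = node;
        }
    }

private:
    struct Node {
        int key;
        std::vector<Node*> next;            // next[lvl] = successor on level lvl
        Node(int k, int levels) : key(k), next(levels, nullptr) {}
    };

    int chooseLevel() {                     // number of levels: 1 w.p. 1/2, 2 w.p. 1/4, ...
        std::bernoulli_distribution coin(0.5);
        int levels = 1;
        while (levels < kSkipMaxLevel && coin(gen)) ++levels;
        return levels;
    }

    Node* header;        // sentinel holding the first pointer of every level
    std::mt19937 gen;
};
```

Inserting only touches the predecessors recorded in update, so no other node has to be restructured; this is the benefit of random spacing over the evenly spaced layout.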