HANDOUT ON A.I

GENERAL OVERVIEW OF COMPUTER

The first counting device was the abacus, originally from Asia. It worked on a place-value notion, meaning that the place of a bead or rock on the apparatus determined how much it was worth.

• 1600s : John Napier discovers logarithms, and Robert Bissaker invents the slide rule, which would remain in popular use until 19??.

• 1642 : Blaise Pascal, a French mathematician and philosopher, invents the first mechanical digital calculator using gears, called the Pascaline. Although this machine could perform addition and subtraction on whole numbers, it was too expensive and only Pascal himself could repair it.

• 1804 : Joseph Marie Jacquard used punch cards to automate a weaving loom.

• 1812 : Charles P. Babbage, the "father of the computer", discovered that many long

calculations involved many similar, repeated operations. Therefore, he designed a

machine, the difference engine, which would be steam-powered, fully automatic and

commanded by a fixed instruction program. In 1833, Babbage quit working on this

machine to concentrate on the analytical engine.

• 1840s: Augusta Ada, "the first programmer", suggested that a binary system should be

used for storage rather than a decimal system.

• 1850s : George Boole developed Boolean logic which would later be used in the

design of computer circuitry.

• 1890: Dr. Herman Hollerith introduced the first electromechanical, punched-card

data-processing machine which was used to compile information for the 1890 U.S.

census. Hollerith's tabulator became so successful that he started his own business to


market it. His company would eventually become International Business Machines

(IBM).

• 1906 : The vacuum tube is invented by American physicist Lee De Forest.

• 1939 : Dr. John V. Atanasoff and his assistant Clifford Berry build the first

electronic digital computer. Their machine, the Atanasoff-Berry-Computer (ABC)

provided the foundation for the advances in electronic digital computers.

• 1941 : Konrad Zuse of Germany (who died in January 1996)

introduced the first programmable computer designed to solve complex engineering

equations. This machine, called the Z3, was also the first to work on the binary system

instead of the decimal system.

• 1943 : British mathematician Alan Turing developed a hypothetical device, the

Turing machine, which was designed to perform logical operations and could read

and write. It would presage programmable computers. He also used vacuum technology

to build British Colossus, a machine used to counteract the German code scrambling

device, Enigma.

• 1944 : Howard Aiken, in collaboration with engineers from IBM, constructed a large

automatic digital sequence-controlled computer called the Harvard Mark I. This

computer could handle all four arithmetic operations, and had special built-in programs

for logarithms and trigonometric functions.

• 1945 : Dr. John von Neumann presented a paper outlining the stored-program

concept.

• 1947 : The giant ENIAC (Electronic Numerical Integrator and Computer) machine was

developed by John W. Mauchly and J. Presper Eckert, Jr. at the University of


Pennsylvania. It used 18,000 vacuum tubes and punch-card input, weighed thirty tons and

occupied a thirty-by-fifty-foot space. It wasn't programmable but was productive from

1946 to 1955 and was used to compute artillery firing tables. That same year, the

transistor was invented by William Shockley, John Bardeen and Walter Brattain of Bell

Labs. It would eventually replace the vacuum tube in computers and radios.

• 1949 : Maurice V. Wilkes built the EDSAC (Electronic Delay Storage Automatic

Computer), the first stored-program computer. EDVAC (Electronic Discrete Variable

Automatic Computer), the second stored-program computer was built by Mauchly,

Eckert, and von Neumann. An Wang developed magnetic-core memory, which Jay

Forrester would reorganize to be more efficient.

• 1950 : Turing built the ACE, considered by some to be the first programmable digital

computer.

The First Generation (1951-1959)

• 1951: Mauchly and Eckert built the UNIVAC I, the first computer designed and sold

commercially, specifically for business data-processing applications.

• 1950s : Dr. Grace Murray Hopper developed the UNIVAC I compiler.

• 1957 : The programming language FORTRAN (FORmula TRANslator) was designed

by John Backus, an IBM engineer.

• 1959 : Jack St. Clair Kilby of Texas Instruments and Robert Noyce of Fairchild Semiconductor manufactured the first integrated circuits, or chips, which are collections of tiny transistors.

The Second Generation (1959-1965)

• 1960s : Gene Amdahl designed the IBM System/360 series of mainframe (G)

computers, the first general-purpose digital computers to use integrated circuits.


• 1961: Dr. Hopper was instrumental in developing the COBOL (Common Business

Oriented Language) programming language.

• 1963 : Ken Olsen, founder of DEC, produced the PDP-1, the first minicomputer (G).

• 1965 : The BASIC (Beginners All-purpose Symbolic Instruction Code) programming language was developed by Dr. Thomas Kurtz and Dr. John Kemeny.

The Third Generation (1965-1971)

• 1969 : The Internet is started.

• 1970 : Dr. Ted Hoff developed the famous Intel 4004 microprocessor (G) chip.

• 1971 : Intel released the first microprocessor, a specialized integrated circuit which was

able to process four bits of data at a time. It also included its own arithmetic logic unit.

PASCAL, a structured programming language, was developed by Niklaus Wirth.

The Fourth Generation (1971-Present)

• 1975 : Ed Roberts, the "father of the microcomputer", designed the first

microcomputer, the Altair 8800, which was produced by Micro Instrumentation and

Telemetry Systems (MITS). The same year, two young hackers, William Gates and Paul

Allen, approached MITS and promised to deliver a BASIC compiler. They did, and from the sale Microsoft was born.

• 1976 : Seymour Cray developed the Cray-1 supercomputer (G). Apple Computer, Inc. was

founded by Steven Jobs and Stephen Wozniak.

• 1977 : Jobs and Wozniak designed and built the Apple II microcomputer.

• 1980 : IBM offered Bill Gates the opportunity to develop the operating system for its new IBM personal computer. Microsoft has achieved tremendous growth and success today due to the development of MS-DOS. The Apple III was also released.


• 1981 : The IBM PC was introduced with a 16-bit microprocessor.

• 1982 : Time magazine chose the computer, instead of a person, for its "Machine of the

Year."

• 1984 : Apple introduced the Macintosh computer, which incorporated a unique

graphical interface, making it easy to use. The same year, IBM released the 286-AT.

• 1986 : Compaq released the DeskPro 386 computer, the first to use the 80386

microprocessor.

• 1987 : IBM announced the OS/2 operating-system technology.

• 1988 : A nondestructive worm was introduced into the Internet network bringing

thousands of computers to a halt.

• 1989 : The Intel 486 became the world's first 1,000,000 transistor microprocessor.

• 1993 : The Energy Star program, endorsed by the Environmental Protection Agency (EPA), encouraged manufacturers to build computer equipment that met power consumption guidelines. When the guidelines are met, the equipment displays the Energy Star logo. The same year, several companies introduced computer systems using the Pentium microprocessor from Intel, which contains 3.1 million transistors and is able to perform 112 million instructions per second (MIPS).


MODULE TWO

NATURAL LANGUAGE

2.0 PREAMBLE

In the philosophy of language, a natural language (or ordinary language) is any

language which arises in an unpremeditated fashion as the result of the innate facility for

language possessed by the human intellect. A natural language is typically used for

communication, and may be spoken, signed, or written. Natural language is distinguished

from constructed languages and formal languages such as computer-programming

languages or the "languages" used in the study of formal logic, especially mathematical

logic.

Defining natural language

Though the exact definition varies between scholars, natural language can broadly be

defined in contrast on the one hand to artificial or constructed languages, such as

computer programming languages like Python and international auxiliary languages

like Esperanto, and on the other hand to other communication systems in nature, such as

the waggle dance of bees. Although there are a variety of natural languages, any

cognitively normal human infant is able to learn any natural language. By comparing the

different natural languages, scholars hope to learn something about the nature of human

intelligence and the innate biases and constraints that shape natural language, which are

sometimes called universal grammar.

The term "natural language" refers only a language that has developed naturally and

hence to actual speech, rather than prescribed speech.

Hence, unstandardized speech (such as African American Vernacular English) is natural, whereas standardized speech (such as Standard American English), which is in part prescribed, is somewhat artificial.

2.1 ORIGINS OF NATURAL LANGUAGE

There is disagreement among anthropologists on when language was first used by

humans (or their ancestors). Estimates range from about two million (2,000,000) years

ago, during the time of Homo habilis, to as recently as forty thousand (40,000) years ago,


during the time of Cro-Magnon man. However, recent evidence suggests that modern human

language was invented or evolved in Africa prior to the dispersal of humans from Africa

around 50,000 years ago. Since all people, including the most isolated indigenous groups such as the Andamanese or the Tasmanian aboriginals, possess language, it was presumably present in the ancestral populations in Africa before the human population

split into various groups to inhabit the rest of the world.

Controlled languages

Controlled natural languages are subsets of natural languages whose grammars and

dictionaries have been restricted in order to reduce or eliminate both ambiguity and

complexity (for instance, by cutting down on rarely used superlative or adverbial forms

or irregular verbs). The purpose behind the development and implementation of a

controlled natural language typically is to aid non-native speakers of a natural language

in understanding it, or to ease computer processing of a natural language. An example of

a widely used controlled natural language is Simplified English, which was originally

developed for aerospace industry maintenance manuals.

Modalities

Natural language manifests itself in modalities other than speech.

Sign languages

A sign language is a language which conveys meaning through visual rather than acoustic

patterns—simultaneously combining hand shapes, orientation and movement of the

hands, arms or body, and facial expressions to express a speaker's thoughts. Sign

languages are natural languages which have developed in Deaf communities, which can

include interpreters and friends and families of deaf people as well as people who are

deaf or hard of hearing themselves.

In contrast, a manually coded language (or signed oral language) is a constructed sign

system combining elements of a sign language and an oral language. For example,

Signed Exact English (SEE) did not develop naturally in any population, but was "created

by a committee of individuals".


Written languages

In a sense, written language should be distinguished from natural language. Until recently

in the developed world, it was common for many people to be fluent in spoken language and yet

remain illiterate; this is still the case in poor countries today. Furthermore, natural

language acquisition during childhood is largely spontaneous, while literacy must usually

be intentionally acquired.

2.2 NATURAL LANGUAGE PROCESSING

Natural language processing (NLP) is a field of computer science, artificial

intelligence, and linguistics concerned with the interactions between computers and

human (natural) languages. As such, NLP is related to the area of human–computer

interaction. Many challenges in NLP involve natural language understanding -- that is,

enabling computers to derive meaning from human or natural language input.

An automated online assistant providing customer service on a web page is an example of an application where natural language processing is a major component.

2.2.1 History

The history of NLP generally starts in the 1950s, although work can be found from earlier

periods. In 1950, Alan Turing published his famous article "Computing Machinery and

Intelligence" which proposed what is now called the Turing test as a criterion of

intelligence. This criterion depends on the ability of a computer program to impersonate a

human in a real-time written conversation with a human judge, sufficiently well that the

judge is unable to distinguish reliably — on the basis of the conversational content

alone — between the program and a real human.

The Georgetown experiment in 1954 involved fully automatic translation of more than

sixty Russian sentences into English. The authors claimed that within three or five years,

machine translation would be a solved problem. However, real progress was much

slower, and after the ALPAC report in 1966, which found that ten years of research had failed to fulfill expectations, funding for machine translation was dramatically

reduced. Little further research in machine translation was conducted until the late 1980s,

when the first statistical machine translation systems were developed.


Some notably successful NLP systems developed in the 1960s were SHRDLU, a natural

language system working in restricted "blocks worlds" with restricted vocabularies, and

ELIZA, a simulation of a Rogerian psychotherapist, written by Joseph Weizenbaum

between 1964 to 1966. Using almost no information about human thought or emotion,

ELIZA sometimes provided a startlingly human-like interaction. When the "patient"

exceeded the very small knowledge base, ELIZA might provide a generic response, for

example, responding to "My head hurts" with "Why do you say your head hurts?".

During the 1970s, many programmers began to write 'conceptual ontologies', which

structured real-world information into computer-understandable data. Examples are

MARGIE (Schank, 1975), SAM (Cullingford, 1978), PAM (Wilensky, 1978), TaleSpin

(Meehan, 1976), QUALM (Lehnert, 1977), Politics (Carbonell, 1979), and Plot Units

(Lehnert 1981). During this time, many chatterbots were written including PARRY,

Racter, and Jabberwacky.

Up to the 1980s, most NLP systems were based on complex sets of hand-written rules.

Starting in the late 1980s, however, there was a revolution in NLP with the introduction

of machine learning algorithms for language processing. This was due both to the steady

increase in computational power resulting from Moore's Law and the gradual lessening of

the dominance of Chomskyan theories of linguistics (e.g. transformational grammar),

whose theoretical underpinnings discouraged the sort of corpus linguistics that underlies

the machine-learning approach to language processing. Some of the earliest-used

machine learning algorithms, such as decision trees, produced systems of hard if-then

rules similar to existing hand-written rules. Increasingly, however, research has focused

on statistical models, which make soft, probabilistic decisions based on attaching real-valued weights to the features making up the input data. The cache language models upon

which many speech recognition systems now rely are examples of such statistical models.

Such models are generally more robust when given unfamiliar input, especially input that

contains errors (as is very common for real-world data), and produce more reliable results

when integrated into a larger system comprising multiple subtasks.

Many of the notable early successes occurred in the field of machine translation, due

especially to work at IBM Research, where successively more complicated statistical


models were developed. These systems were able to take advantage of existing

multilingual textual corpora that had been produced by the Parliament of Canada and the

European Union as a result of laws calling for the translation of all governmental

proceedings into all official languages of the corresponding systems of government.

However, most other systems depended on corpora specifically developed for the tasks

implemented by these systems, which was (and often continues to be) a major limitation

in the success of these systems. As a result, a great deal of research has gone into

methods of more effectively learning from limited amounts of data.

Recent research has increasingly focused on unsupervised and semi-supervised learning

algorithms. Such algorithms are able to learn from data that has not been hand-annotated

with the desired answers, or using a combination of annotated and non-annotated data.

Generally, this task is much more difficult than supervised learning, and typically

produces less accurate results for a given amount of input data. However, there is an

enormous amount of non-annotated data available (including, among other things, the

entire content of the World Wide Web), which can often make up for the inferior results.

2.2.2 NLP using machine learning

Modern NLP algorithms are based on machine learning, especially statistical machine

learning. The paradigm of machine learning is different from that of most prior attempts

at language processing. Prior implementations of language-processing tasks typically

involved the direct hand coding of large sets of rules. The machine-learning paradigm

calls instead for using general learning algorithms — often, although not always,

grounded in statistical inference — to automatically learn such rules through the analysis

of large corpora of typical real-world examples. A corpus (plural, "corpora") is a set of

documents (or sometimes, individual sentences) that have been hand-annotated with the

correct values to be learned.

Many different classes of machine learning algorithms have been applied to NLP tasks.

These algorithms take as input a large set of "features" that are generated from the input

data. Some of the earliest-used algorithms, such as decision trees, produced systems of

hard if-then rules similar to the systems of hand-written rules that were then common.

Increasingly, however, research has focused on statistical models, which make soft,


probabilistic decisions based on attaching real-valued weights to each input feature. Such

models have the advantage that they can express the relative certainty of many different

possible answers rather than only one, producing more reliable results when such a model

is included as a component of a larger system.
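For illustration, here is a minimal sketch of this idea in Common Lisp (the AI language covered in Module Three); the feature names and weight values are invented. Each candidate class is scored by summing the real-valued weights of the features present in the input, and the scores are normalized so that the model expresses its relative certainty about every possible answer rather than committing to a single one.

;; Illustrative weights: (class feature) -> real-valued weight.
(defvar *weights*
  '(((noun ends-in-s)   . 0.7)
    ((noun follows-the) . 1.4)
    ((verb ends-in-s)   . 0.4)
    ((verb follows-to)  . 1.2)))

(defun score (class features)
  "Sum the weights of the FEATURES that are active for CLASS."
  (loop for f in features
        sum (or (cdr (assoc (list class f) *weights* :test #'equal)) 0.0)))

(defun classify (features &optional (classes '(noun verb)))
  "Return each class paired with its normalized probability."
  (let* ((scores (mapcar (lambda (c) (exp (score c features))) classes))
         (total  (reduce #'+ scores)))
    (mapcar (lambda (c s) (cons c (/ s total))) classes scores)))

(classify '(ends-in-s follows-the))
;Output (approximately): ((noun . 0.85) (verb . 0.15))

Because the result is a probability for every class, such a component can be combined with other probabilistic components in a larger system, as described above.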

Systems based on machine-learning algorithms have many advantages over hand-produced rules:

The learning procedures used during machine learning automatically focus on the

most common cases, whereas when writing rules by hand it is often not obvious at

all where the effort should be directed.

Automatic learning procedures can make use of statistical inference algorithms to

produce models that are robust to unfamiliar input (e.g. containing words or

structures that have not been seen before) and to erroneous input (e.g. with

misspelled words or words accidentally omitted). Generally, handling such input

gracefully with hand-written rules — or more generally, creating systems of hand-written rules that make soft decisions — is extremely difficult, error-prone and

time-consuming.

Systems based on automatically learning the rules can be made more accurate

simply by supplying more input data. However, systems based on hand-written

rules can only be made more accurate by increasing the complexity of the rules,

which is a much more difficult task. In particular, there is a limit to the complexity

of systems based on hand-crafted rules, beyond which the systems become more

and more unmanageable. However, creating more data to input to machine-learning systems simply requires a corresponding increase in the number of man-hours worked, generally without significant increases in the complexity of the

annotation process.

The subfield of NLP devoted to learning approaches is known as Natural Language

Learning (NLL) and its conference CoNLL and peak body SIGNLL are sponsored by

ACL, recognizing also their links with Computational Linguistics and Language

Acquisition. When the aim of computational language learning research is to understand


more about human language acquisition, or psycholinguistics, NLL overlaps into the

related field of Computational Psycholinguistics.

2.2.3 Major tasks in NLP

The following is a list of some of the most commonly researched tasks in NLP. Note that

some of these tasks have direct real-world applications, while others more commonly

serve as subtasks that are used to aid in solving larger tasks. What distinguishes these

tasks from other potential and actual NLP tasks is not only the volume of research

devoted to them but the fact that for each one there is typically a well-defined problem

setting, a standard metric for evaluating the task, standard corpora on which the task can

be evaluated, and competitions devoted to the specific task.

Automatic summarization: Produce a readable summary of a chunk of text. Often

used to provide summaries of text of a known type, such as articles in the financial

section of a newspaper.

Coreference resolution: Given a sentence or larger chunk of text, determine

which words ("mentions") refer to the same objects ("entities"). Anaphora

resolution is a specific example of this task, and is specifically concerned with

matching up pronouns with the nouns or names that they refer to. For example, in

a sentence such as "He entered John's house through the front door", "the front

door" is a referring expression and the bridging relationship to be identified is the

fact that the door being referred to is the front door of John's house (rather than of

some other structure that might also be referred to).

Discourse analysis: This rubric includes a number of related tasks. One task is

identifying the discourse structure of connected text, i.e. the nature of the

discourse relationships between sentences (e.g. elaboration, explanation, contrast).

Another possible task is recognizing and classifying the speech acts in a chunk of

text (e.g. yes-no question, content question, statement, assertion, etc.).

Machine translation: Automatically translate text from one human language to

another. This is one of the most difficult problems, and is a member of a class of

problems colloquially termed "AI-complete", i.e. requiring all of the different


types of knowledge that humans possess (grammar, semantics, facts about the real

world, etc.) in order to solve properly.

Morphological segmentation: Separate words into individual morphemes and

identify the class of the morphemes. The difficulty of this task depends greatly on

the complexity of the morphology (i.e. the structure of words) of the language

being considered. English has fairly simple morphology, especially inflectional

morphology, and thus it is often possible to ignore this task entirely and simply

model all possible forms of a word (e.g. "open, opens, opened, opening") as

separate words. In languages such as Turkish, however, such an approach is not

possible, as each dictionary entry has thousands of possible word forms.

Named entity recognition (NER): Given a stream of text, determine which items

in the text map to proper names, such as people or places, and what the type of

each such name is (e.g. person, location, organization). Note that, although

capitalization can aid in recognizing named entities in languages such as English,

this information cannot aid in determining the type of named entity, and in any

case is often inaccurate or insufficient. For example, the first word of a sentence is

also capitalized, and named entities often span several words, only some of which

are capitalized. Furthermore, many other languages in non-Western scripts (e.g.

Chinese or Arabic) do not have any capitalization at all, and even languages with

capitalization may not consistently use it to distinguish names. For example,

German capitalizes all nouns, regardless of whether they refer to names, and

French and Spanish do not capitalize names that serve as adjectives.

Natural language generation: Convert information from computer databases into

readable human language.

Natural language understanding: Convert chunks of text into more formal

representations such as first-order logic structures that are easier for computer

programs to manipulate. Natural language understanding involves the identification of the intended semantics from the multiple possible semantics which can be derived from a natural language expression, which usually takes the form of organized notations of natural language concepts. The introduction and creation of a language metamodel and an ontology are efficient, though empirical, solutions. An explicit formalization of natural language semantics, without confusion with implicit assumptions such as the closed-world assumption (CWA) vs. the open-world assumption, or subjective Yes/No vs. objective True/False, is expected to form the basis of a semantics formalization.

Optical character recognition (OCR): Given an image representing printed text,

determine the corresponding text.

Part-of-speech tagging: Given a sentence, determine the part of speech for each

word. Many words, especially common ones, can serve as multiple parts of

speech. For example, "book" can be a noun ("the book on the table") or verb ("to

book a flight"); "set" can be a noun, verb or adjective; and "out" can be any of at

least five different parts of speech. Note that some languages have more such

ambiguity than others. Languages with little inflectional morphology, such as

English, are particularly prone to such ambiguity. Chinese is prone to such

ambiguity because it is a tonal language during verbalization. Such inflection is

not readily conveyed via the entities employed within the orthography to convey

intended meaning.

Parsing: Determine the parse tree (grammatical analysis) of a given sentence. The

grammar for natural languages is ambiguous and typical sentences have multiple

possible analyses. In fact, perhaps surprisingly, for a typical sentence there may be

thousands of potential parses (most of which will seem completely nonsensical to

a human).

Question answering: Given a human-language question, determine its answer.

Typical questions have a specific right answer (such as "What is the capital of

Canada?"), but sometimes open-ended questions are also considered (such as

"What is the meaning of life?").

Relationship extraction: Given a chunk of text, identify the relationships among

named entities (e.g. who is the wife of whom).

Sentence breaking (also known as sentence boundary disambiguation): Given a chunk of text, find the sentence boundaries. Sentence boundaries are often marked by periods or other punctuation marks, but these same characters can serve other purposes (e.g. marking abbreviations); a minimal sketch of this task appears after this list.

Sentiment analysis: Extract subjective information usually from a set of

documents, often using online reviews to determine "polarity" about specific

objects. It is especially useful for identifying trends of public opinion in the social

media, for the purpose of marketing.

Speech recognition: Given a sound clip of a person or people speaking, determine

the textual representation of the speech. This is the opposite of text to speech and

is one of the extremely difficult problems colloquially termed "AI-complete" (see

above). In natural speech there are hardly any pauses between successive words,

and thus speech segmentation is a necessary subtask of speech recognition (see

below). Note also that in most spoken languages, the sounds representing

successive letters blend into each other in a process termed coarticulation, so the

conversion of the analog signal to discrete characters can be a very difficult

process.

Speech segmentation: Given a sound clip of a person or people speaking, separate

it into words. A subtask of speech recognition and typically grouped with it.

Topic segmentation and recognition: Given a chunk of text, separate it into

segments each of which is devoted to a topic, and identify the topic of the

segment.

Word segmentation: Separate a chunk of continuous text into separate words. For

a language like English, this is fairly trivial, since words are usually separated by

spaces. However, some written languages like Chinese, Japanese and Thai do not

mark word boundaries in such a fashion, and in those languages text segmentation

is a significant task requiring knowledge of the vocabulary and morphology of

words in the language.

Word sense disambiguation: Many words have more than one meaning; we have

to select the meaning which makes the most sense in context. For this problem, we

are typically given a list of words and associated word senses, e.g. from a

dictionary or from an online resource such as WordNet.
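As referenced in the sentence-breaking item above, here is a minimal sketch of that task in Lisp; the abbreviation list and sample text are invented, and a practical system would use a trained statistical model rather than fixed rules. The sketch treats '.', '?' and '!' as boundaries unless the period ends a listed abbreviation.

(defvar *abbreviations* '("Dr" "Mr" "Mrs" "Prof" "etc"))

(defun abbreviation-p (text end)
  "True if the word ending just before position END is a known abbreviation."
  (let ((start (1+ (or (position #\Space text :end end :from-end t) -1))))
    (member (subseq text start end) *abbreviations* :test #'string-equal)))

(defun split-sentences (text)
  "Split TEXT on . ? ! unless the period terminates a listed abbreviation."
  (let ((sentences '()) (start 0))
    (loop for i from 0 below (length text)
          for ch = (char text i)
          when (and (member ch '(#\. #\? #\!))
                    (not (and (char= ch #\.) (abbreviation-p text i))))
            do (push (string-trim " " (subseq text start (1+ i))) sentences)
               (setf start (1+ i)))
    (nreverse sentences)))

(split-sentences "Dr. Smith arrived. Did he stay? Yes.")
;Output: ("Dr. Smith arrived." "Did he stay?" "Yes.")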


In some cases, sets of related tasks are grouped into subfields of NLP that are often

considered separately from NLP as a whole. Examples include:

Information retrieval (IR): This is concerned with storing, searching and

retrieving information. It is a separate field within computer science (closer to

databases), but IR relies on some NLP methods (for example, stemming). Some

current research and applications seek to bridge the gap between IR and NLP.

Information extraction (IE): This is concerned in general with the extraction of

semantic information from text. This covers tasks such as named entity

recognition, coreference resolution, relationship extraction, etc.

Speech processing: This covers speech recognition, text-to-speech and related

tasks.

Other tasks include:

Stemming

Text simplification

Text-to-speech

Text-proofing

Natural language search

Query expansion

Automated essay scoring

Truecasing

2.3 STATISTICAL NLP

Statistical natural-language processing uses stochastic, probabilistic and statistical

methods to resolve some of the difficulties discussed above, especially those which arise

because longer sentences are highly ambiguous when processed with realistic grammars,

yielding thousands or millions of possible analyses. Methods for disambiguation often

involve the use of corpora and Markov models. Statistical NLP comprises all quantitative

approaches to automated language processing, including probabilistic modeling,

information theory, and linear algebra. The technology for statistical NLP comes mainly


from machine learning and data mining, both of which are fields of artificial intelligence

that involve learning from data.
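As a toy illustration of the Markov-model idea (the three-sentence "corpus" below is invented), a first-order model only needs counts of adjacent word pairs, from which it estimates the probability of the next word given the previous one.

(defvar *bigram-counts*  (make-hash-table :test #'equal))
(defvar *unigram-counts* (make-hash-table :test #'equal))

(defun train (sentences)
  "Count words and adjacent word pairs over a list of tokenized sentences."
  (dolist (sentence sentences)
    (loop for (prev next) on sentence while next
          do (incf (gethash prev *unigram-counts* 0))
             (incf (gethash (list prev next) *bigram-counts* 0)))))

(defun bigram-probability (prev next)
  "Estimate P(next | prev) from the counts."
  (let ((pair  (gethash (list prev next) *bigram-counts* 0))
        (total (gethash prev *unigram-counts* 0)))
    (if (zerop total) 0 (/ pair total))))

(train '((the dog barks) (the dog sleeps) (the cat sleeps)))
(bigram-probability 'dog 'barks)
;Output: 1/2

Disambiguation then amounts to preferring the analysis whose word (or tag) sequence receives the higher probability under such a model.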

2.4 EVALUATION OF NATURAL LANGUAGE PROCESSING

The goal of NLP evaluation is to measure one or more qualities of an algorithm or

a system, in order to determine whether (or to what extent) the system answers the goals

of its designers, or meets the needs of its users. Research in NLP evaluation has received

considerable attention, because the definition of proper evaluation criteria is one way to

specify precisely an NLP problem, going thus beyond the vagueness of tasks defined only

as language understanding or language generation. A precise set of evaluation criteria,

which includes mainly evaluation data and evaluation metrics, enables several teams to

compare their solutions to a given NLP problem.

2.5 DIFFERENT TYPES OF EVALUATION

Depending on the evaluation procedures, a number of distinctions are traditionally

made in NLP evaluation.

Intrinsic vs. extrinsic evaluation

Intrinsic evaluation considers an isolated NLP system and characterizes its performance

mainly with respect to a gold standard result, pre-defined by the evaluators. Extrinsic

evaluation, also called evaluation in use, considers the NLP system in a more complex

setting, either as an embedded system or serving a precise function for a human user.

The extrinsic performance of the system is then characterized in terms of its utility with

respect to the overall task of the complex system or the human user. For example,

consider a syntactic parser that is based on the output of some new part of speech (POS)

tagger. An intrinsic evaluation would run the POS tagger on some labelled data, and

compare the system output of the POS tagger to the gold standard (correct) output. An

extrinsic evaluation would run the parser with some other POS tagger, and then with the

new POS tagger, and compare the parsing accuracy.
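The intrinsic case reduces to a few lines (the tag sequences below are invented): the tagger's output and the gold standard are compared position by position and the proportion of matches is reported.

(defun tagging-accuracy (predicted gold)
  "Fraction of positions where PREDICTED agrees with the GOLD standard."
  (/ (count t (mapcar #'eql predicted gold))
     (length gold)))

(tagging-accuracy '(det noun verb noun) '(det noun verb adj))
;Output: 3/4

An extrinsic evaluation would instead run the parser once with each tagger and compare the resulting parsing accuracy, as described above.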

Black-box vs. glass-box evaluation


Black-box evaluation requires one to run an NLP system on a given data set and to

measure a number of parameters related to the quality of the process (speed, reliability,

resource consumption) and, most importantly, to the quality of the result (e.g. the

accuracy of data annotation or the fidelity of a translation). Glass-box evaluation looks at

the design of the system, the algorithms that are implemented, the linguistic resources it

uses (e.g. vocabulary size), etc. Given the complexity of NLP problems, it is often

difficult to predict performance only on the basis of glass-box evaluation, but this type of

evaluation is more informative with respect to error analysis or future developments of a

system.

Automatic vs. Manual Evaluation

In many cases, automatic procedures can be defined to evaluate an NLP system by

comparing its output with the gold standard (or desired) one. Although the cost of

producing the gold standard can be quite high, automatic evaluation can be repeated as

often as needed without much additional cost (on the same input data). However, for

many NLP problems, the definition of a gold standard is a complex task, and can prove

impossible when inter-annotator agreement is insufficient. Manual evaluation is

performed by human judges, who are instructed to estimate the quality of a system, or

most often of a sample of its output, based on a number of criteria. Although, thanks to

their linguistic competence, human judges can be considered as the reference for a

number of language processing tasks, there is also considerable variation across their

ratings. This is why automatic evaluation is sometimes referred to as objective evaluation,

while the human kind appears to be more "subjective."

Standardization in NLP

An ISO sub-committee is working in order to ease interoperability between lexical

resources and NLP programs. The sub-committee is part of ISO/TC37 and is called

ISO/TC37/SC4. Some ISO standards are already published but most of them are under

construction, mainly on lexicon representation (see LMF), annotation and data category

registry.


MODULE THREE

(PROGRAMMING LANGUAGE – LISP)

3.0 PREAMBLE

Lisp (historically, LISP) is a family of computer programming languages with a

long history and a distinctive, fully parenthesized Polish prefix notation. Originally

specified in 1958, Lisp is the second-oldest high-level programming language in

widespread use today; only Fortran is older (by one year).

Like Fortran, Lisp has changed a great deal since its early days, and a number of dialects

have existed over its history. Today, the most widely known general-purpose Lisp

dialects are Common Lisp and Scheme.

Lisp was originally created as a practical mathematical notation for computer programs,

influenced by the notation of Alonzo Church's lambda calculus. It quickly became the

favored programming language for artificial intelligence (AI) research.

As one of the earliest programming languages, Lisp pioneered many ideas in computer

science, including tree data structures, automatic storage management, dynamic typing,

and the self-hosting compiler.

The name LISP derives from "LISt Processing". Linked lists are one of the Lisp language's

major data structures, and Lisp source code is itself made up of lists. As a result, Lisp

programs can manipulate source code as a data structure, giving rise to the macro systems

that allow programmers to create new syntax or even new domain-specific languages

embedded in Lisp.

The interchangeability of code and data also gives Lisp its instantly recognizable syntax.

All program code is written as s-expressions, or parenthesized lists. A function call or

syntactic form is written as a list with the function or operator's name first, and the

arguments following; for instance, a function f that takes three arguments might be called

using (f arg1 arg2 arg3).


3.1 HISTORY

Lisp was invented by John McCarthy in 1958 while he was at the Massachusetts

Institute of Technology (MIT). McCarthy published its design in a paper in

Communications of the ACM in 1960, entitled "Recursive Functions of Symbolic

Expressions and Their Computation by Machine, Part I" ("Part II" was never published).

He showed that with a few simple operators and a notation for functions, one can build a

Turing-complete language for algorithms.

Information Processing Language was the first AI language, from 1955 or 1956, and

already included many of the concepts, such as list-processing and recursion, which came

to be used in Lisp.

McCarthy's original notation used bracketed "M-expressions" that would be translated

into S-expressions. As an example, the M-expression car[cons[A,B]] is equivalent to the

S-expression (car (cons A B)). Once Lisp was implemented, programmers rapidly chose

to use S-expressions, and M-expressions were abandoned. M-expressions surfaced again

with short-lived attempts such as MLISP by Horace Enea and CGOL by Vaughan Pratt.

Lisp was first implemented by Steve Russell on an IBM 704 computer. Russell had read

McCarthy's paper, and realized (to McCarthy's surprise) that the Lisp eval function could

be implemented in machine code. The result was a working Lisp interpreter which could

be used to run Lisp programs, or more properly, 'evaluate Lisp expressions.'

Two assembly language macros for the IBM 704 became the primitive operations for

decomposing lists: car (Contents of the Address part of Register number) and cdr

(Contents of the Decrement part of Register number).

From the context, it is clear that the term "Register" is used here to mean "Memory

Register", nowadays called "Memory Location". Lisp dialects still use car and cdr (pron.:

/ˈkɑr/ and /ˈkʊdər/) for the operations that return the first item in a list and the rest of the

list respectively.
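For example, using the notation introduced later in this module:

(car '(a b c))
;Output: a
(cdr '(a b c))
;Output: (b c)
(car (cdr '(a b c)))
;Output: b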

The first complete Lisp compiler, written in Lisp, was implemented in 1962 by Tim Hart

and Mike Levin at MIT. This compiler introduced the Lisp model of incremental

compilation, in which compiled and interpreted functions can intermix freely. The


language used in Hart and Levin's memo is much closer to modern Lisp style than

McCarthy's earlier code.

Lisp was a difficult system to implement with the compiler techniques and stock

hardware of the 1970s. Garbage collection routines, developed by then-MIT graduate

student Daniel Edwards, made it practical to run Lisp on general-purpose computing

systems, but efficiency was still a problem. This led to the creation of Lisp machines:

dedicated hardware for running Lisp environments and programs. Advances in both

computer hardware and compiler technology soon made Lisp machines obsolete.

During the 1980s and 1990s, a great effort was made to unify the work on new Lisp

dialects (mostly successors to Maclisp like ZetaLisp and NIL (New Implementation of

Lisp)) into a single language. The new language, Common Lisp, was somewhat

compatible with the dialects it replaced (the book Common Lisp the Language notes the

compatibility of various constructs).

3.1.1 Connection to artificial intelligence

Since its inception, Lisp was closely connected with the artificial intelligence research

community, especially on PDP-10 systems. Lisp was used as the implementation of the

programming language Micro Planner, which was used in the famous AI system

SHRDLU. In the 1970s, as AI research spawned commercial offshoots, the performance

of existing Lisp systems became a growing issue.

3.1.2 Genealogy and variants

Over its fifty-year history, Lisp has spawned many variations on the core theme of an

S-expression language. Moreover, each given dialect may have several implementations

—for instance, there are more than a dozen implementations of Common Lisp.

Differences between dialects may be quite visible—for instance, Common Lisp uses the

keyword defun to name a function, but Scheme uses define. Within a dialect that is

standardized, however, conforming implementations support the same core language, but

with different extensions and libraries.

3.1.3 Historically significant dialects


LISP 1 – First implementation.

LISP 1.5 – First widely distributed version, developed by McCarthy and others

at MIT. So named because it contained several improvements on the original

"LISP 1" interpreter, but was not a major restructuring as the planned LISP 2

would be.

Stanford LISP 1.6 – This was a successor to LISP 1.5 developed at the

Stanford AI Lab, and widely distributed to PDP-10 systems running the TOPS-10 operating system. It was rendered obsolete by Maclisp and InterLisp.

MACLISP – developed for MIT's Project MAC (no relation to Apple's

Macintosh, nor to McCarthy), direct descendant of LISP 1.5. It ran on the

PDP-10 and Multics systems. (MACLISP would later come to be called

Maclisp, and is often referred to as MacLisp.)

InterLisp – developed at BBN Technologies for PDP-10 systems running the

Tenex operating system, later adopted as a "West coast" Lisp for the Xerox

Lisp machines as InterLisp-D. A small version called "InterLISP 65" was

published for Atari's 6502-based computer line. For quite some time Maclisp

and InterLisp were strong competitors.

Franz Lisp – originally a Berkeley project; later developed by Franz Inc. The

name is a humorous deformation of the name "Franz Liszt", and does not refer

to Allegro Common Lisp, the dialect of Common Lisp sold by Franz Inc., in

more recent years.

XLISP, which AutoLISP was based on.

Standard Lisp and Portable Standard Lisp were widely used and ported,

especially with the Computer Algebra System REDUCE.

ZetaLisp, also known as Lisp Machine Lisp – used on the Lisp machines,

direct descendant of Maclisp. ZetaLisp had big influence on Common Lisp.

LeLisp is a French Lisp dialect. One of the first Interface Builders was written

in LeLisp.

Common Lisp (1984), as described by Common Lisp the Language – a

consolidation of several divergent attempts (ZetaLisp, Spice Lisp, NIL, and S-1


Lisp) to create successor dialects to Maclisp, with substantive influences from

the Scheme dialect as well. This version of Common Lisp was available for

wide-ranging platforms and was accepted by many as a de facto standard until

the publication of ANSI Common Lisp (ANSI X3.226-1994).

Dylan was in its first version a mix of Scheme with the Common Lisp Object

System.

EuLisp – attempt to develop a new efficient and cleaned-up Lisp.

ISLISP – attempt to develop a new efficient and cleaned-up Lisp. Standardized

as ISO/IEC 13816:1997 and later revised as ISO/IEC 13816:2007 Information

technology – Programming languages, their environments and system software

interfaces – Programming language ISLISP.

IEEE Scheme – IEEE standard, 1178–1990 (R1995)

ANSI Common Lisp – an American National Standards Institute (ANSI)

standard for Common Lisp, created by subcommittee X3J13, chartered[18] to

begin with Common Lisp: The Language as a base document and to work

through a public consensus process to find solutions to shared issues of

portability of programs and compatibility of Common Lisp implementations.

Although formally an ANSI standard, the implementation, sale, use, and

influence of ANSI Common Lisp has been and continues to be seen

worldwide.

ACL2 or "A Computational Logic for Applicative Common Lisp", an

applicative (side-effect free) variant of Common LISP. ACL2 is both a

programming language in which you can model computer systems and a tool to

help prove properties of those models.

3.1.4 Major dialects

The two major dialects of Lisp used for general-purpose programming today are

Common Lisp and Scheme. These languages represent significantly different design

choices.

Common Lisp is a successor to MacLisp. The primary influences were Lisp Machine

Lisp, MacLisp, NIL, S-1 Lisp, Spice Lisp, and Scheme.[25] It has many of the features of


Lisp Machine Lisp (a large Lisp dialect used to program Lisp Machines), but was

designed to be efficiently implementable on any personal computer or workstation.

Common Lisp has a large language standard including many built-in data types,

functions, macros and other language elements, as well as an object system (Common

Lisp Object System or shorter CLOS). Common Lisp also borrowed certain features from

Scheme such as lexical scoping and lexical closures.

Scheme (designed earlier) is a more minimalist design, with a much smaller set of

standard features but with certain implementation features (such as tail-call optimization

and full continuations) not necessarily found in Common Lisp.

Scheme is a statically scoped and properly tail-recursive dialect of the Lisp programming

language invented by Guy Lewis Steele Jr. and Gerald Jay Sussman. It was designed to

have exceptionally clear and simple semantics and few different ways to form

expressions. A wide variety of programming paradigms, including imperative, functional,

and message passing styles, find convenient expression in Scheme. Scheme continues to

evolve with a series of standards (the Revised^n Report on the Algorithmic Language Scheme)

and a series of Scheme Requests for Implementation.

Clojure is a recent dialect of Lisp that principally targets the Java Virtual Machine, as

well as the CLR, the Python VM, the Ruby VM YARV, and compiling to JavaScript. It is

designed to be a pragmatic general-purpose language. Clojure draws considerable

influences from Haskell and places a very strong emphasis on immutability.

Clojure is a compiled language, as it compiles directly to JVM bytecode, yet remains

completely dynamic. Every feature supported by Clojure is supported at runtime. Clojure

provides access to Java frameworks and libraries, with optional type hints and type

inference, so that calls to Java can avoid reflection and enable fast primitive operations.

In addition, Lisp dialects are used as scripting languages in a number of applications,

with the most well-known being Emacs Lisp in the Emacs editor, AutoLisp and later

Visual Lisp in AutoCAD, Nyquist in Audacity. The small size of a minimal but useful

Scheme interpreter makes it particularly popular for embedded scripting. Examples

include SIOD and TinyScheme, both of which have been successfully embedded in the

GIMP image processor under the generic name "Script-fu".[27] LIBREP, a Lisp interpreter


by John Harper originally based on the Emacs Lisp language, has been embedded in the

Sawfish window manager.[28] The Guile interpreter is used in GnuCash. Within GCC, the

MELT plugin provides a Lisp-y dialect, translated into C, to extend the compiler by

coding additional passes (in MELT).

3.2 LANGUAGE INNOVATIONS

Lisp was the first homoiconic programming language: the primary representation

of program code is the same type of list structure that is also used for the main data

structures. As a result, Lisp functions can be manipulated, altered or even created within

a Lisp program without extensive parsing or manipulation of binary machine code. This

is generally considered one of the primary advantages of the language with regard to its

expressive power, and makes the language amenable to metacircular evaluation.
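As a small illustration, a program can build a piece of code as an ordinary list and then hand that list to the evaluator:

(defvar *expr* (list '+ 1 2 3))   ; code built as a plain list
*expr*
;Output: (+ 1 2 3)
(eval *expr*)                     ; the same list evaluated as code
;Output: 6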

The ubiquitous if-then-else structure, now taken for granted as an essential element of any

programming language, was invented by McCarthy for use in Lisp, where it saw its first

appearance in a more general form (the cond structure). It was inherited by ALGOL,

which popularized it.

Lisp deeply influenced Alan Kay, the leader of the research on Smalltalk, and then in turn

Lisp was influenced by Smalltalk, by adopting object-oriented programming features

(classes, instances, etc.) in the late 1970s. The Flavours object system (later CLOS)

introduced multiple inheritance.

Lisp introduced the concept of automatic garbage collection, in which the system walks

the heap looking for unused memory. Most of the modern sophisticated garbage

collection algorithms such as generational garbage collection were developed for Lisp.

Largely because of its resource requirements with respect to early computing hardware

(including early microprocessors), Lisp did not become as popular outside of the AI

community as Fortran and the ALGOL-descended C language. Newer languages such as

Java and Python have incorporated some limited versions of some of the features of Lisp,

but are necessarily unable to bring the coherence and synergy of the full concepts found

in Lisp. Because of its suitability to complex and dynamic applications, Lisp is currently

enjoying some resurgence of popular interest.


3.3 SYNTAX AND SEMANTICS

Symbolic expressions (S-expressions)

Lisp is an expression-oriented language. Unlike most other languages, no

distinction is made between "expressions" and "statements"; all code and data are written

as expressions. When an expression is evaluated, it produces a value (in Common Lisp,

possibly multiple values), which then can be embedded into other expressions. Each

value can be any data type.

McCarthy's 1958 paper introduced two types of syntax: S-expressions (Symbolic

expressions, also called "sexps"), which mirror the internal representation of code and

data; and M-expressions (Meta Expressions), which express functions of S-expressions.

M-expressions never found favor, and almost all Lisps today use S-expressions to

manipulate both code and data.

The use of parentheses is Lisp's most immediately obvious difference from other

programming language families. As a result, students have long given Lisp nicknames

such as Lost In Stupid Parentheses, or Lots of Irritating Superfluous Parentheses.

However, the S-expression syntax is also responsible for much of Lisp's power: the

syntax is extremely regular, which facilitates manipulation by computer. However, the

syntax of Lisp is not limited to traditional parentheses notation. It can be extended to

include alternative notations. XMLisp, for instance, is a Common Lisp extension that

employs the metaobject-protocol to integrate S-expressions with the Extensible Markup

Language (XML).

The reliance on expressions gives the language great flexibility. Because Lisp functions

are themselves written as lists, they can be processed exactly like data. This allows easy

writing of programs which manipulate other programs (metaprogramming).

Many Lisp dialects exploit this feature using macro systems, which enables extension of

the language almost without limit.
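As a minimal sketch of such a macro (the name unless-zero is invented for this example), the macro receives its arguments unevaluated and rewrites them into an if expression before evaluation, in effect adding a new syntactic form to the language:

(defmacro unless-zero (n then else)
  "Expand into an IF that tests whether N is zero."
  (list 'if (list 'zerop n) else then))

(macroexpand '(unless-zero 5 'non-zero 'zero))
;Output: (if (zerop 5) 'zero 'non-zero)
(unless-zero 5 'non-zero 'zero)
;Output: non-zero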

3.4 LISTS


A Lisp list is written with its elements separated by whitespace, and surrounded by

parentheses. For example, (1 2 foo) is a list whose elements are three atoms: the values 1,

2, and foo. These values are implicitly typed: they are respectively two integers and a

Lisp-specific data type called a "symbolic atom", and they do not have to be declared as such.

The empty list () is also represented as the special atom nil. This is the only entity in Lisp

which is both an atom and a list.

Expressions are written as lists, using prefix notation. The first element in the list is the

name of a form, i.e., a function, operator, macro, or "special operator" (see below.) The

remainder of the list are the arguments. For example, the function list returns its

arguments as a list, so the expression

(list '1 '2 'foo)

evaluates to the list (1 2 foo). The "quote" before the arguments in the preceding example

is a "special operator" which prevents the quoted arguments from being evaluated (not

strictly necessary for the numbers, since 1 evaluates to 1, etc.). Any unquoted expressions

are recursively evaluated before the enclosing expression is evaluated. For example,

(list 1 2 (list 3 4))

evaluates to the list (1 2 (3 4)). Note that the third argument is a list; lists can be nested.

3.5 OPERATORS

Arithmetic operators are treated similarly. The expression

(+ 1 2 3 4)

evaluates to 10. The equivalent under infix notation would be "1 + 2 + 3 + 4". Arithmetic

operators in Lisp are variadic (or n-ary), able to take any number of arguments.

"Special operators" (sometimes called "special forms") provide Lisp's control structure.

For example, the special operator if takes three arguments. If the first argument is non-

nil, it evaluates to the second argument; otherwise, it evaluates to the third argument.

Thus, the expression

(if nil

(list 1 2 "foo")

(list 3 4 "bar"))


evaluates to (3 4 "bar"). Of course, this would be more useful if a non-trivial expression

had been substituted in place of nil.

3.5.1 Lambda expressions

Another special operator, lambda, is used to bind variables to values which are then

evaluated within an expression. This operator is also used to create functions: the

arguments to lambda are a list of arguments, and the expression or expressions to which

the function evaluates (the returned value is the value of the last expression that is

evaluated). The expression

(lambda (arg) (+ arg 1))

evaluates to a function that, when applied, takes one argument, binds it to arg and returns

the number one greater than that argument. Lambda expressions are treated no differently

from named functions; they are invoked the same way. Therefore, the expression

((lambda (arg) (+ arg 1)) 5)

evaluates to 6.

3.6 ATOMS

In the original LISP there were two fundamental data types: atoms and lists. A list

was a finite ordered sequence of elements, where each element is in itself either an atom

or a list, and an atom was a number or a symbol. A symbol was essentially a unique

named item, written as an alphanumeric string in source code, and used either as a

variable name or as a data item in symbolic processing. For example, the list (FOO

(BAR 1) 2) contains three elements: the symbol FOO, the list (BAR 1), and the number

2.

The essential difference between atoms and lists was that atoms were immutable and

unique. Two atoms that appeared in different places in source code but were written in

exactly the same way represented the same object, whereas each list was a separate object

that could be altered independently of other lists and could be distinguished from other

lists by comparison operators.


As more data types were introduced in later Lisp dialects, and programming styles

evolved, the concept of an atom lost importance. Many dialects still retained the predicate

atom for legacy compatibility, defining it true for any object which is not a cons.

3.7 CONSES AND LISTS

[Figure: box-and-pointer diagram for the list (42 69 613)]

A Lisp list is a singly linked list. Each cell of this list is called a cons (in Scheme, a pair),

and is composed of two pointers, called the car and cdr. These are equivalent to the data

and next fields discussed in the article linked list, respectively.

Of the many data structures that can be built out of cons cells, one of the most basic is

called a proper list. A proper list is either the special nil (empty list) symbol, or a cons in

which the car points to a datum (which may be another cons structure, such as a list), and

the cdr points to another proper list.

If a given cons is taken to be the head of a linked list, then its car points to the first

element of the list, and its cdr points to the rest of the list. For this reason, the car and cdr

functions are also called first and rest when referring to conses which are part of a linked

list (rather than, say, a tree).
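
For illustration, a minimal sketch of these accessors applied to a short list:

(car '(1 2 3))
;Output: 1
(cdr '(1 2 3))
;Output: (2 3)
(first '(1 2 3))
;Output: 1
(rest '(1 2 3))
;Output: (2 3)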

Thus, a Lisp list is not an atomic object, as an instance of a container class in C++ or Java

would be. A list is nothing more than an aggregate of linked conses. A variable which

refers to a given list is simply a pointer to the first cons in the list. Traversal of a list can

be done by "cdring down" the list; that is, taking successive cdrs to visit each cons of the

list; or by using any of a number of higher-order functions to map a function over a list.
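
As a minimal sketch of "cdring down" a list, the following hypothetical function (Common Lisp already provides length for this) counts the elements of a list by taking successive cdrs:

(defun count-elements (lst)
  ; recurse down the list, taking the cdr at each step
  (if (null lst)
      0
      (+ 1 (count-elements (cdr lst)))))
(count-elements '(a b c))
;Output: 3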

Because conses and lists are so universal in Lisp systems, it is a common misconception

that they are Lisp's only data structures. In fact, all but the most simplistic Lisps have

other data structures – such as vectors (arrays), hash tables, structures, and so forth.


3.7.1 S-expressions represent lists

Parenthesized S-expressions represent linked list structures. There are several ways to

represent the same list as an S-expression. A cons can be written in dotted-pair notation

as (a . b), where a is the car and b the cdr. A longer proper list might be written (a . (b .

(c . (d . nil)))) in dotted-pair notation. This is conventionally abbreviated as (a b c d) in

list notation. An improper list may be written in a combination of the two – as (a b c .

d) for the list of three conses whose last cdr is d (i.e., the list (a . (b . (c . d))) in fully

specified form).
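
For example, consing two atoms directly (rather than consing onto a proper list) yields a dotted pair, while consing onto a proper list yields a longer proper list:

(cons 'a 'b)
;Output: (a . b)
(cons 'a '(b c d))
;Output: (a b c d)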

3.7.2 List-processing procedures

Lisp provides many built-in procedures for accessing and controlling lists. Lists can be

created directly with the list procedure, which takes any number of arguments, and

returns the list of these arguments.

(list 1 2 'a 3)

;Output: (1 2 a 3)

(list 1 '(2 3) 4)

;Output: (1 (2 3) 4)

Because of the way that lists are constructed from cons pairs, the cons procedure can

be used to add an element to the front of a list. Note that the cons procedure is

asymmetric in how it handles list arguments, because of how lists are constructed.

(cons 1 '(2 3))

;Output: (1 2 3)

(cons '(1 2) '(3 4))

;Output: ((1 2) 3 4)

The append procedure appends two (or more) lists to one another. Because Lisp lists

are linked lists, appending two lists has asymptotic time complexity O(n) in the length of the first list.

(append '(1 2) '(3 4))

;Output: (1 2 3 4)

(append '(1 2 3) '() '(a) '(5 6))

;Output: (1 2 3 a 5 6)


3.7.3 Shared structure

Lisp lists, being simple linked lists, can share structure with one another. That is to say,

two lists can have the same tail, or final sequence of conses. For instance, after the

execution of the following Common Lisp code:

(setf foo (list 'a 'b 'c))

(setf bar (cons 'x (cdr foo)))

the lists foo and bar are (a b c) and (x b c) respectively. However, the tail (b c) is the

same structure in both lists. It is not a copy; the cons cells pointing to b and c are in the

same memory locations for both lists.

Sharing structure rather than copying can give a dramatic performance improvement.

However, this technique can interact in undesired ways with functions that alter lists

passed to them as arguments. Altering one list, such as by replacing the c with a goose,

will affect the other:

(setf (third foo) 'goose)

This changes foo to (a b goose), but thereby also changes bar to (x b goose) – a possibly

unexpected result. This can be a source of bugs, and functions which alter their

arguments are documented as destructive for this very reason.

Aficionados of functional programming avoid destructive functions. In the Scheme

dialect, which favors the functional style, the names of destructive functions are marked

with a cautionary exclamation point, or "bang"—such as set-car! (read set car bang),

which replaces the car of a cons. In the Common Lisp dialect, destructive functions are

commonplace; the equivalent of set-car! is named rplaca for "replace car." This function

is rarely seen however as Common Lisp includes a special facility, setf, to make it easier

to define and use destructive functions. A frequent style in Common Lisp is to write code

functionally (without destructive calls) when prototyping, then to add destructive calls as

an optimization where it is safe to do so.
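
As an illustrative sketch (the variable x is hypothetical), the destructive rplaca and its more idiomatic setf equivalent both modify the first cons of a list in place:

(setf x (list 1 2 3))
(rplaca x 99)      ; destructively replaces the car of the first cons
; x is now (99 2 3)
(setf (first x) 0) ; the same effect, written with setf
; x is now (0 2 3)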

3.8 SELF-EVALUATING FORMS AND QUOTING

Lisp evaluates expressions which are entered by the user. Symbols and lists

evaluate to some other (usually, simpler) expression – for instance, a symbol evaluates to


the value of the variable it names; (+ 2 3) evaluates to 5. However, most other forms

evaluate to themselves: if you enter 5 into Lisp, it returns 5.

Any expression can also be marked to prevent it from being evaluated (as is necessary for

symbols and lists). This is the role of the quote special operator, or its abbreviation ' (a

single quotation mark). For instance, usually if you enter the symbol foo you will get

back the value of the corresponding variable (or an error, if there is no such variable). If

you wish to refer to the literal symbol, you enter (quote foo) or, usually, 'foo.
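
For instance, assuming a top-level variable foo has been given the value 42, a session might look like this:

(setf foo 42)
foo
;Output: 42
'foo
;Output: foo
(quote foo)
;Output: foo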

Both Common Lisp and Scheme also support the backquote operator (known as

quasiquote in Scheme), entered with the ` character. This is almost the same as the plain

quote, except it allows expressions to be evaluated and their values interpolated into a

quoted list with the comma and comma-at operators. If the variable snue has the value

(bar baz) then `(foo ,snue) evaluates to (foo (bar baz)), while `(foo ,@snue) evaluates to

(foo bar baz). The backquote is most frequently used in defining macro expansions.
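
The same example, written out as a minimal runnable sketch:

(setf snue '(bar baz))
`(foo ,snue)
;Output: (foo (bar baz))
`(foo ,@snue)
;Output: (foo bar baz)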

Self-evaluating forms and quoted forms are Lisp's equivalent of literals. It may be

possible to modify the values of (mutable) literals in program code. For instance, if a

function returns a quoted form, and the code that calls the function modifies the form,

this may alter the behavior of the function on subsequent iterations.

(defun should-be-constant ()

'(one two three))

(let ((stuff (should-be-constant)))

(setf (third stuff) 'bizarre)) ; bad!

(should-be-constant) ; returns (one two bizarre)

Modifying a quoted form like this is generally considered bad style, and is defined by

ANSI Common Lisp as erroneous (resulting in "undefined" behavior in compiled files,

because the file-compiler can coalesce similar constants, put them in write-protected

memory, etc.).

Lisp's formalization of quotation has been noted by Douglas Hofstadter (in Gödel,

Escher, Bach) and others as an example of the philosophical idea of self-reference.


3.9 SCOPE AND CLOSURE

The modern Lisp family splits over the use of dynamic or static (aka lexical)

scope. Clojure, Common Lisp and Scheme make use of static scoping by default, while

Newlisp, Picolisp and the embedded languages in Emacs and AutoCAD use dynamic

scoping.

3.9.1 List structure of program code; exploitation by macros and compilers

A fundamental distinction between Lisp and other languages is that in Lisp, the textual

representation of a program is simply a human-readable description of the same internal

data structures (linked lists, symbols, numbers, characters, etc.) as would be used by the

underlying Lisp system.

Lisp uses this to implement a very powerful macro system. Like other macro languages

such as C, a macro returns code that can then be compiled. However, unlike C macros,

the macros are Lisp functions and so can exploit the full power of Lisp.

Further, because Lisp code has the same structure as lists, macros can be built with any of

the list-processing functions in the language. In short, anything that Lisp can do to a data

structure, Lisp macros can do to code. In contrast, in most other languages, the parser's

output is purely internal to the language implementation and cannot be manipulated by

the programmer.
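
As a minimal sketch of this idea, the hypothetical macro below builds its expansion with the ordinary list function; (my-unless test expr) expands into an equivalent if form (Common Lisp already provides unless):

(defmacro my-unless (test expr)
  ; the expansion is constructed as a plain list
  (list 'if test nil expr))
(my-unless (> 1 2) (print "1 is not greater than 2"))
; expands to (if (> 1 2) nil (print "1 is not greater than 2"))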

This feature makes it easy to develop efficient languages within languages. For example,

the Common Lisp Object System can be implemented cleanly as a language extension

using macros. This means that if an application requires a different inheritance

mechanism, it can use a different object system. This is in stark contrast to most other

languages; for example, Java does not support multiple inheritance and there is no

reasonable way to add it.

In simplistic Lisp implementations, this list structure is directly interpreted to run the

program; a function is literally a piece of list structure which is traversed by the

interpreter in executing it. However, most substantial Lisp systems also include a

compiler. The compiler translates list structure into machine code or bytecode for

execution. This code can run as fast as code compiled in conventional languages such as

C.


Macros expand before the compilation step, and thus offer some interesting options. If a

program needs a precomputed table, then a macro might create the table at compile time,

so the compiler need only output the table and need not call code to create the table at run

time. Some Lisp implementations even have a mechanism, eval-when, that allows code to

be present during compile time (when a macro would need it), but not present in the

emitted module.
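
A hypothetical sketch of this technique: the table of squares below is computed when the macro is expanded (at compile time if the file is compiled), so only the finished list appears in the resulting code:

(defmacro squares-table ()
  ; the loop runs at macro-expansion time, not at run time
  (list 'quote (loop for i from 0 to 9 collect (* i i))))
(squares-table)
;Output: (0 1 4 9 16 25 36 49 64 81)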

Evaluation and the read–eval–print loop

Lisp languages are frequently used with an interactive command line, which may be

combined with an integrated development environment. The user types in expressions at

the command line, or directs the IDE to transmit them to the Lisp system. Lisp reads the

entered expressions, evaluates them, and prints the result. For this reason, the Lisp

command line is called a "read–eval–print loop", or REPL.

The basic operation of the REPL is as follows. This is a simplistic description which

omits many elements of a real Lisp, such as quoting and macros.

The read function accepts textual S-expressions as input, and parses them into an internal

data structure. For instance, if you type the text (+ 1 2) at the prompt, read translates this

into a linked list with three elements: the symbol +, the number 1, and the number 2. It so

happens that this list is also a valid piece of Lisp code; that is, it can be evaluated. This is

because the car of the list names a function—the addition operation.

Note that foo will be read as a single symbol, 123 will be read as the number 123, and "123"

will be read as the string "123".

The eval function evaluates the data, returning zero or more other Lisp data as a result.

Evaluation does not have to mean interpretation; some Lisp systems compile every

expression to native machine code. It is simple, however, to describe evaluation as

interpretation: To evaluate a list whose car names a function, eval first evaluates each of

the arguments given in its cdr, then applies the function to the arguments. In this case, the

function is addition, and applying it to the argument list (1 2) yields the answer 3. This is

the result of the evaluation.

The symbol foo evaluates to the value of the symbol foo. Data like the string "123"

evaluates to the same string. The list (quote (1 2 3)) evaluates to the list (1 2 3).


It is the job of the print function to represent output to the user. For a simple result such

as 3 this is trivial. An expression which evaluated to a piece of list structure would

require that print traverse the list and print it out as an S-expression.

To implement a Lisp REPL, it is necessary only to implement these three functions and

an infinite-loop function. (Naturally, the implementation of eval will be complicated,

since it must also implement all special operators like if or lambda.) This done, a basic

REPL itself is but a single line of code: (loop (print (eval (read)))).

The Lisp REPL typically also provides input editing, an input history, error handling and

an interface to the debugger.

Lisp is usually evaluated eagerly. In Common Lisp, arguments are evaluated in

applicative order ('leftmost innermost'), while in Scheme order of arguments is undefined,

leaving room for optimization by a compiler.

3.10 CONTROL STRUCTURES

Lisp originally had very few control structures, but many more were added during

the language's evolution. (Lisp's original conditional operator, cond, is the precursor to

later if-then-else structures.)

Programmers in the Scheme dialect often express loops using tail recursion. Scheme's

commonality in academic computer science has led some students to believe that tail

recursion is the only, or the most common, way to write iterations in Lisp, but this is

incorrect. All frequently seen Lisp dialects have imperative-style iteration constructs,

from Scheme's do loop to Common Lisp's complex loop expressions. Moreover, the

difference is not merely one of style: the Scheme language definition makes specific

requirements for the handling of tail calls, so tail recursion is expressly supported and

therefore generally encouraged in Scheme. By contrast, ANSI Common Lisp does not

require the optimization commonly referred to as tail call elimination. Consequently,

using tail-recursive style as a casual replacement for the more traditional iteration

constructs (such as do, dolist or loop) is discouraged in Common Lisp not just as a matter

of stylistic preference, but potentially for reasons of efficiency (an apparent tail call in

Common Lisp may not compile as a simple jump) and program correctness (tail recursion

may increase stack use in Common Lisp, risking stack overflow).
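
For instance, a sum written with Common Lisp's imperative dolist construct rather than with tail recursion (an illustrative sketch; the function name is hypothetical):

(defun sum-list (numbers)
  (let ((total 0))
    (dolist (n numbers total) ; dolist returns total when the list is exhausted
      (incf total n))))
(sum-list '(1 2 3 4))
;Output: 10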

Some Lisp control structures are special operators, equivalent to other languages'

syntactic keywords. Expressions using these operators have the same surface appearance

as function calls, but differ in that the arguments are not necessarily evaluated—or, in the

case of an iteration expression, may be evaluated more than once.

In contrast to most other major programming languages, Lisp allows the programmer to

implement control structures using the language itself. Several control structures are

implemented as Lisp macros, and can even be macro-expanded by the programmer who

wants to know how they work.

Both Common Lisp and Scheme have operators for non-local control flow.

The differences in these operators are some of the deepest differences between the two

dialects. Scheme supports re-entrant continuations using the call/cc procedure, which

allows a program to save (and later restore) a particular place in execution. Common

Lisp does not support re-entrant continuations, but does support several ways of

handling escape continuations.

Frequently, the same algorithm can be expressed in Lisp in either an imperative or a

functional style. As noted above, Scheme tends to favor the functional style, using tail

recursion and continuations to express control flow. However, imperative style is still

quite possible. The style preferred by many Common Lisp programmers may seem more

familiar to programmers used to structured languages such as C, while that preferred by

Schemers more closely resembles pure-functional languages such as Haskell.

Because of Lisp's early heritage in list processing, it has a wide array of higher-order

functions relating to iteration over sequences. In many cases where an explicit loop

would be needed in other languages (like a for loop in C) in Lisp the same task can be

accomplished with a higher-order function. (The same is true of many functional

programming languages.)

A good example is a function which in Scheme is called map and in Common Lisp is

called mapcar. Given a function and one or more lists, mapcar applies the function

successively to the lists' elements in order, collecting the results in a new list:


(mapcar #'+ '(1 2 3 4 5) '(10 20 30 40 50))

This applies the + function to each corresponding pair of list elements, yielding the result

(11 22 33 44 55).

Examples

Here are examples of Common Lisp code.

The basic "Hello world" program:

(print "Hello world")

Lisp syntax lends itself naturally to recursion. Mathematical problems such as the

enumeration of recursively defined sets are simple to express in this notation.

Evaluate a number's factorial:

(defun factorial (n)

(if (<= n 1)

1

(* n (factorial (- n 1)))))

An alternative implementation, often faster than the previous version if the Lisp system

has tail recursion optimization:

(defun factorial (n &optional (acc 1))

(if (<= n 1)

acc

(factorial (- n 1) (* acc n))))

Contrast with an iterative version which uses Common Lisp's loop macro:

(defun factorial (n)

(loop for i from 1 to n

for fac = 1 then (* fac i)

finally (return fac)))

The following function reverses a list. (Lisp's built-in reverse function does the same

thing.)

(defun -reverse (list)

(let ((return-value '()))

(dolist (e list) (push e return-value))


return-value))

Object systems

Various object systems and models have been built on top of, alongside, or into Lisp,

including:

The Common Lisp Object System, CLOS, is an integral part of ANSI Common

Lisp. CLOS descended from New Flavors and Common LOOPS. ANSI Common

Lisp was the first standardized object-oriented programming language (1994,

ANSI X3J13).

ObjectLisp or Object Lisp, used by Lisp Machines Incorporated and early

versions of Macintosh Common Lisp

LOOPS (Lisp Object-Oriented Programming System) and the later

CommonLOOPS

Flavors, built at MIT, and its descendant New Flavors (developed by Symbolics).

KR (short for Knowledge Representation), a constraints-based object system

developed to aid the writing of Garnet, a GUI library for Common Lisp.

KEE used an object system called UNITS and integrated it with an inference

engine and a truth maintenance system (ATMS).


MODULE FOUR

(PROGRAMMING LANGUAGE – PROLOG)

4.0 PREAMBLE

Prolog is a general purpose logic programming language associated with artificial

intelligence and computational linguistics.

Prolog has its roots in first-order logic, a formal logic, and unlike many other

programming languages, Prolog is declarative: the program logic is expressed in terms of

relations, represented as facts and rules. A computation is initiated by running a query

over these relations.

The language was first conceived by a group around Alain Colmerauer in Marseille,

France, in the early 1970s and the first Prolog system was developed in 1972 by

Colmerauer with Philippe Roussel.

Prolog was one of the first logic programming languages, and remains the most popular

among such languages today, with many free and commercial implementations available.

While initially aimed at natural language processing, the language has since then

stretched far into other areas like theorem proving, expert systems, games, automated

answering systems, ontologies and sophisticated control systems. Modern Prolog

environments support creating graphical user interfaces, as well as administrative and

networked applications.

4.1 SYNTAX AND SEMANTICS

In Prolog, program logic is expressed in terms of relations, and a computation is

initiated by running a query over these relations. Relations and queries are constructed

using Prolog's single data type, the term. Relations are defined by clauses. Given a query,

the Prolog engine attempts to find a resolution refutation of the negated query. If the

negated query can be refuted, i.e., an instantiation for all free variables is found that

makes the union of clauses and the singleton set consisting of the negated query false, it

follows that the original query, with the found instantiation applied, is a logical

consequence of the program. This makes Prolog (and other logic programming


languages) particularly useful for database, symbolic mathematics, and language parsing

applications. Because Prolog allows impure predicates, checking the truth value of

certain special predicates may have some deliberate side effect, such as printing a value

to the screen. Because of this, the programmer is permitted to use some amount of

conventional imperative programming when the logical paradigm is inconvenient. It has a

purely logical subset, called "pure Prolog", as well as a number of extralogical features.

4.2 DATA TYPES

Prolog's single data type is the term. Terms are either atoms, numbers, variables

or compound terms.

An atom is a general-purpose name with no inherent meaning. Examples of atoms

include x, blue, 'Burrito', and 'some atom'.

Numbers can be floats or integers.

Variables are denoted by a string consisting of letters, numbers and underscore

characters, and beginning with an upper-case letter or underscore. Variables

closely resemble variables in logic in that they are placeholders for arbitrary terms.

A compound term is composed of an atom called a "functor" and a number of

"arguments", which are again terms. Compound terms are ordinarily written as a

functor followed by a comma-separated list of argument terms, which is contained

in parentheses. The number of arguments is called the term's arity. An atom can be

regarded as a compound term with arity zero. Examples of compound terms are

truck_year('Mazda', 1986) and 'Person_Friends'(zelda,[tom,jim]).

Special cases of compound terms:

A List is an ordered collection of terms. It is denoted by square brackets with the

terms separated by commas or in the case of the empty list, []. For example [1,2,3]

or [red,green,blue].

Strings: A sequence of characters surrounded by quotes is equivalent to a list of

(numeric) character codes, generally in the local character encoding, or Unicode if

the system supports Unicode. For example, "to be, or not to be".


4.3 RULES AND FACTS

Prolog programs describe relations, defined by means of clauses. Pure Prolog is

restricted to Horn clauses. There are two types of clauses: facts and rules. A rule is of the

form

Head :- Body.

and is read as "Head is true if Body is true". A rule's body consists of calls to predicates,

which are called the rule's goals. The built-in predicate ,/2 (meaning a 2-arity operator

with name ,) denotes conjunction of goals, and ;/2 denotes disjunction. Conjunctions and

disjunctions can only appear in the body, not in the head of a rule.
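
As an illustrative sketch (the predicate names are hypothetical), here is a rule whose body is a conjunction of two goals, and one whose body is a disjunction:

grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
parent(X, Y) :- father(X, Y) ; mother(X, Y).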

Clauses with empty bodies are called facts.

An example of a fact is:

cat(tom).

which is equivalent to the rule:

cat(tom) :- true.

The built-in predicate true/0 is always true.

Given the above fact, one can ask:

is tom a cat?

?- cat(tom).

Yes

what things are cats?

?- cat(X).

X = tom

Clauses with bodies are called rules. An example of a rule is:

animal(X):- cat(X).

If we add that rule and ask what things are animals?

?- animal(X).

X = tom

Due to the relational nature of many built-in predicates, they can typically be used in

several directions. For example, length/2 can be used to determine the length of a list

(length(List, L), given a list List) as well as to generate a list skeleton of a given length


(length(X, 5)), and also to generate both list skeletons and their lengths together

(length(X, L)). Similarly, append/3 can be used both to append two lists (append(ListA,

ListB, X) given lists ListA and ListB) as well as to split a given list into parts (append(X,

Y, List), given a list List). For this reason, a comparatively small set of library predicates

suffices for many Prolog programs.
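
For example, in a typical Prolog session (the exact formatting of answers varies between systems):

?- length([a,b,c], L).
L = 3
?- append(X, Y, [1,2]).
X = [], Y = [1,2] ;
X = [1], Y = [2] ;
X = [1,2], Y = []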

As a general purpose language, Prolog also provides various built-in predicates to

perform routine activities like input/output, using graphics and otherwise communicating

with the operating system. These predicates are not given a relational meaning and are

only useful for the side-effects they exhibit on the system. For example, the predicate

write/1 displays a term on the screen.

4.4 EVALUATION

Execution of a Prolog program is initiated by the user's posting of a single goal,

called the query. Logically, the Prolog engine tries to find a resolution refutation of the

negated query. The resolution method used by Prolog is called SLD resolution.

If the negated query can be refuted, it follows that the query, with the appropriate

variable bindings in place, is a logical consequence of the program. In that case, all

generated variable bindings are reported to the user, and the query is said to have

succeeded. Operationally, Prolog's execution strategy can be thought of as a

generalization of function calls in other languages, one difference being that multiple

clause heads can match a given call. In that case, the system creates a choice-point,

unifies the goal with the clause head of the first alternative, and continues with the goals

of that first alternative. If any goal fails in the course of executing the program, all

variable bindings that were made since the most recent choice-point was created are

undone, and execution continues with the next alternative of that choice-point. This

execution strategy is called chronological backtracking. For example:

mother_child(trude, sally).

father_child(tom, sally).

father_child(tom, erica).

father_child(mike, tom).


sibling(X, Y) :- parent_child(Z, X), parent_child(Z, Y).

parent_child(X, Y) :- father_child(X, Y).

parent_child(X, Y) :- mother_child(X, Y).

This results in the following query being evaluated as true:

?- sibling(sally, erica).

Yes

This is obtained as follows: Initially, the only matching clause-head for the query

sibling(sally, erica) is the first one, so proving the query is equivalent to proving the body

of that clause with the appropriate variable bindings in place, i.e., the conjunction

(parent_child(Z,sally), parent_child(Z,erica)). The next goal to be proved is the leftmost

one of this conjunction, i.e., parent_child(Z, sally). Two clause heads match this goal.

The system creates a choice-point and tries the first alternative, whose body is

father_child(Z, sally). This goal can be proved using the fact father_child(tom, sally), so

the binding Z = tom is generated, and the next goal to be proved is the second part of the

above conjunction: parent_child(tom, erica). Again, this can be proved by the

corresponding fact. Since all goals could be proved, the query succeeds. Since the query

contained no variables, no bindings are reported to the user. A query with variables, like:

?- father_child(Father, Child).

enumerates all valid answers on backtracking.

Notice that with the code as stated above, the query ?- sibling(sally, sally). also succeeds.

One would insert additional goals to describe the relevant restrictions, if desired.

4.4.1 Loops and recursion

Iterative algorithms can be implemented by means of recursive predicates.
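
For example, the following sketch (the predicate name is illustrative) sums the numbers of a list by recursing over its elements:

list_sum([], 0).
list_sum([X|Xs], Sum) :- list_sum(Xs, Rest), Sum is Rest + X.
?- list_sum([1,2,3,4], S).
S = 10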

4.4.2 Negation

The built-in Prolog predicate \+/1 provides negation as failure, which allows for non-

monotonic reasoning. The goal \+ legal(X) in the rule

illegal(X) :- \+ legal(X).

is evaluated as follows: Prolog attempts to prove legal(X). If a proof for that goal can

be found, the original goal (i.e., \+ legal(X)) fails. If no proof can be found, the original

goal succeeds. Therefore, the \+/1 prefix operator is called the "not provable" operator,


since the query ?- \+ Goal. succeeds if Goal is not provable. This kind of negation is

sound if its argument is "ground" (i.e. contains no variables). Soundness is lost if the

argument contains variables and the proof procedure is complete. In particular, the query

?- illegal(X). cannot now be used to enumerate all things that are illegal.
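
For example, with the cat(tom) fact given earlier:

?- \+ cat(tom).
No
?- \+ cat(felix).
Yes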

4.5 EXAMPLES

Here follow some example programs written in Prolog.

4.5.1 Hello world

An example of a query:

?- write('Hello world!'), nl.

Hello world!

true.

?-

4.5.2 Compiler optimization

Any computation can be expressed declaratively as a sequence of state transitions. As an

example, an optimizing compiler with three optimization passes could be implemented as

a relation between an initial program and its optimized form:

program_optimized(Prog0, Prog) :-

optimization_pass_1(Prog0, Prog1),

optimization_pass_2(Prog1, Prog2),

optimization_pass_3(Prog2, Prog).

or equivalently using DCG notation:

program_optimized --> optimization_pass_1, optimization_pass_2, optimization_pass_3.

4.5.3 Quicksort

The Quicksort sorting algorithm, relating a list to its sorted version:

partition([], _, [], []).

partition([X|Xs], Pivot, Smalls, Bigs) :-

( X @< Pivot ->

Smalls = [X|Rest],

partition(Xs, Pivot, Rest, Bigs)


; Bigs = [X|Rest],

partition(Xs, Pivot, Smalls, Rest)

).

quicksort([]) --> [].

quicksort([X|Xs]) -->

{ partition(Xs, X, Smaller, Bigger) },

quicksort(Smaller), [X], quicksort(Bigger).

4.6 DESIGN PATTERNS

A design pattern is a general reusable solution to a commonly occurring problem

in software design. In Prolog, design patterns go under various names: skeletons and

techniques, cliches, program schemata, and logic description schemata. An alternative to

design patterns is higher order programming.

4.7 HIGHER-ORDER PROGRAMMING

By definition, first-order logic does not allow quantification over predicates.

A higher-order predicate is a predicate that takes one or more other predicates as

arguments. Prolog already has some built-in higher-order predicates such as call/1, call/2,

call/3, findall/3, setof/3, and bagof/3. Furthermore, since arbitrary Prolog goals can be

constructed and evaluated at run-time, it is easy to write higher-order predicates like

maplist/2, which applies an arbitrary predicate to each member of a given list, and

sublist/3, which filters elements that satisfy a given predicate, also allowing for currying.

To convert solutions from temporal representation (answer substitutions on backtracking)

to spatial representation (terms), Prolog has various all-solutions predicates that collect

all answer substitutions of a given query in a list. This can be used for list

comprehension. For example, perfect numbers equal the sum of their proper divisors:

perfect(N) :-

between(1, inf, N), U is N // 2,

findall(D, (between(1,U,D), N mod D =:= 0), Ds),

sumlist(Ds, N).


This can be used to enumerate perfect numbers, and also to check whether a number is

perfect.

4.7.1 Modules

For programming in the large, Prolog provides a module system. The module system is

standardised by ISO. However, not all Prolog compilers support modules and there are

compatibility problems between the module systems of the major Prolog compilers.

Consequently, modules written on one Prolog compiler will not necessarily work on

others.

4.7.2 Parsing

There is a special notation called definite clause grammars (DCGs). A rule defined via --

>/2 instead of :-/2 is expanded by the preprocessor (expand_term/2, a facility analogous

to macros in other languages) according to a few straightforward rewriting rules,

resulting in ordinary Prolog clauses. Most notably, the rewriting equips the predicate with

two additional arguments, which can be used to implicitly thread state around, analogous

to monads in other languages. DCGs are often used to write parsers or list generators, as

they also provide a convenient interface to difference lists.
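
A minimal DCG sketch (the nonterminal names are illustrative); the built-in phrase/2 runs a grammar rule against a list of tokens:

greeting --> [hello], name.
name --> [world].
name --> [prolog].
?- phrase(greeting, [hello, prolog]).
Yes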

4.7.3 Meta-interpreters and reflection

Prolog is a homoiconic language and provides many facilities for reflection. Its implicit

execution strategy makes it possible to write a concise meta-circular evaluator (also

called meta-interpreter) for pure Prolog code. Since Prolog programs are themselves

sequences of Prolog terms (:-/2 is an infix operator) that are easily read and inspected

using built-in mechanisms (like read/1), it is easy to write customized interpreters that

augment Prolog with domain-specific features.
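
The classic three-clause "vanilla" meta-interpreter for pure Prolog is a common sketch of this; solve/1 mirrors the execution strategy of the underlying engine, using the built-in clause/2 to look up the body of each user-defined goal:

solve(true).
solve((A, B)) :- solve(A), solve(B).
solve(Goal)   :- clause(Goal, Body), solve(Body).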

4.8 IMPLEMENTATION

4.8.1 ISO Prolog

The ISO Prolog standard consists of two parts. ISO/IEC 13211-1, published in 1995,

aims to standardize the existing practices of the many implementations of the core

elements of Prolog. It has clarified aspects of the language that were previously

ambiguous and leads to portable programs. There are two corrigenda: Cor.1:2007 and


Cor.2:2012. ISO/IEC 13211-2, published in 2000, adds support for modules to the

standard. The standard is maintained by the ISO/IEC JTC1/SC22/WG17 working group.

ANSI X3J17 is the US Technical Advisory Group for the standard.

4.8.2 Compilation

For efficiency, Prolog code is typically compiled to abstract machine code, often

influenced by the register-based Warren Abstract Machine (WAM) instruction set.

Some implementations employ abstract interpretation to derive type and mode

information of predicates at compile time, or compile to real machine code for high

performance. Devising efficient implementation methods for Prolog code is a field of

active research in the logic programming community, and various other execution

methods are employed in some implementations. These include clause binarization and

stack-based virtual machines.

4.8.3 Tail recursion

Prolog systems typically implement a well-known optimization method called tail call

optimization (TCO) for deterministic predicates exhibiting tail recursion or, more

generally, tail calls: A clause's stack frame is discarded before performing a call in a tail

position. Therefore, deterministic tail-recursive predicates are executed with constant

stack space, like loops in other languages.
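
For example, the following predicate (illustrative name) counts the elements of a list with an accumulator; the recursive call is the last goal of its clause, so a system performing TCO executes it in constant stack space:

list_length(List, N) :- list_length(List, 0, N).
list_length([], N, N).
list_length([_|Xs], N0, N) :- N1 is N0 + 1, list_length(Xs, N1, N).
?- list_length([a,b,c], N).
N = 3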

4.8.4 Term indexing

Finding clauses that are unifiable with a term in a query is linear in the number of

clauses. Term indexing uses a data structure that enables sublinear-time lookups.

Indexing only affects program performance, it does not affect semantics.

4.8.5 Tabling

Some Prolog systems (BProlog, XSB and Yap) implement a memoization method called

tabling, which frees the user from manually storing intermediate results.

4.9 IMPLEMENTATION IN HARDWARE


During the Fifth Generation Computer Systems project, there were attempts to

implement Prolog in hardware with the aim of achieving faster execution with dedicated

architectures. Furthermore, Prolog has a number of properties that may allow speed-up

through parallel execution. A more recent approach has been to compile restricted Prolog

programs to a field programmable gate array. However, rapid progress in general-purpose

hardware has consistently overtaken more specialized architectures.

4.10 CRITICISM

Although Prolog is widely used in research and education, Prolog and other logic

programming languages have not had a significant impact on the computer industry in

general. Most applications are small by industrial standards, with few exceeding 100,000

lines of code. Programming in the large is considered to be complicated because not all

Prolog compilers support modules, and there are compatibility problems between the

module systems of the major Prolog compilers. Portability of Prolog code across

implementations has also been a problem, but developments since 2007 have meant: "the

portability within the family of Edinburgh/Quintus derived Prolog implementations is

good enough to allow for maintaining portable real-world applications."

Software developed in Prolog has been criticized for having a high performance penalty

compared to conventional programming languages. However, advances in

implementation methods have reduced the penalties to as little as 25%-50% for some

applications.

Prolog is not purely declarative: because of the cut operator, a procedural reading of a

Prolog program is needed to understand it. The order of clauses in a Prolog program is

significant. Other logic programming languages, such as Datalog, are truly declarative:

clauses can be given in any order.

4.11 EXTENSIONS

Various implementations have been developed from Prolog to extend logic

programming capabilities in numerous directions. These include types, modes, constraint

logic programming (CLP), object-oriented logic programming (OOLP), concurrency,


linear logic (LLP), functional and higher-order logic programming capabilities, plus

interoperability with knowledge bases:

Types

Prolog is an untyped language. Attempts to introduce types date back to the 1980s, and as

of 2008 there are still attempts to extend Prolog with types. Type information is useful

not only for type safety but also for reasoning about Prolog programs.

Modes

The syntax of Prolog does not specify which arguments of a predicate are inputs and

which are outputs. However, this information is significant and it is recommended that it

be included in the comments. Modes provide valuable information when reasoning about

Prolog programs and can also be used to accelerate execution.

Constraints

Constraint logic programming extends Prolog to include concepts from constraint

satisfaction. A constraint logic program allows constraints in the body of clauses, such

as: A(X,Y) :- X+Y>0. It is suited to large-scale combinatorial optimization problems, and

is thus useful for applications in industrial settings, such as automated time-tabling and

production scheduling. Most Prolog systems ship with at least one constraint solver for

finite domains, and often also with solvers for other domains like rational numbers.

4.12 HIGHER-ORDER PROGRAMMING

HiLog and λProlog extend Prolog with higher-order programming features. ISO

Prolog now supports the built-in predicates call/2, call/3, ... which facilitate higher-order

programming and lambda abstractions.

maplist(_Cont, [], []).

maplist(Cont, [X1|X1s], [X2|X2s]) :-

call(Cont, X1, X2),

maplist(Cont, X1s, X2s).

4.13 OBJECT-ORIENTATION


Logtalk is an object-oriented logic programming language that can use most

Prolog implementations as a back-end compiler. As a multi-paradigm language, it

includes support for both prototypes and classes.

Oblog is a small, portable, object-oriented extension to Prolog by Margaret McDougall of

EdCAAD, University of Edinburgh.

Objlog was a frame-based language combining objects and Prolog II from CNRS,

Marseille, France.

Graphics

Prolog systems that provide a graphics library are SWI-prolog, Visual-prolog, LPA

Prolog for Windows and B-Prolog.

Concurrency

Prolog-MPI is an open-source SWI-Prolog extension for distributed computing over the

Message Passing Interface. Also there are various concurrent Prolog programming

languages.

Web programming

Some Prolog implementations, notably SWI-Prolog and Ciao, support server-side web

programming with support for web protocols, HTML and XML. There are also

extensions to support semantic web formats such as RDF and OWL. Prolog has also been

suggested as a client-side language.

Adobe Flash

Cedar is a free and basic Prolog interpreter. From version 4 onwards, Cedar has FCA

(Flash Cedar App) support, which provides a new platform for programming in Prolog

through ActionScript.

Other

F-logic extends Prolog with frames/objects for knowledge representation.

OW Prolog has been created in order to answer Prolog's lack of graphics and

interface.

4.14 INTERFACES TO OTHER LANGUAGES


Frameworks exist which can bridge between Prolog and other languages:

The LPA Intelligence Server allows the embedding of LPA Prolog within C,

C#, C++, Java, VB, Delphi, .Net, Lua, Python and other languages. It

exploits the dedicated string data type which LPA Prolog provides.

The Logic Server API allows both the extension and embedding of Prolog in

C, C++, Java, VB, Delphi, .NET and any language/environment which can call

a .dll or .so. It is implemented for Amzi! Prolog + Logic Server,

but the API specification can be made available for any implementation.

JPL is a bi-directional Java Prolog bridge which ships with SWI-Prolog by

default, allowing Java and Prolog to call each other (recursively). It is known

to have good concurrency support and is under active development.

InterProlog, a programming library bridge between Java and Prolog,

implementing bi-directional predicate/method calling between both languages.

Java objects can be mapped into Prolog terms and vice-versa. Allows the

development of GUIs and other functionality in Java while leaving logic

processing in the Prolog layer. Supports XSB, with support for SWI-Prolog

and YAP planned for 2013.

Prova provides native syntax integration with Java, agent messaging and

reaction rules. Prova positions itself as a rule-based scripting (RBS) system for

middleware. The language breaks new ground in combining imperative and

declarative programming.

PROL is an embeddable Prolog engine for Java. It includes a small IDE and a

few libraries.

GNU Prolog for Java is an implementation of ISO Prolog as a Java library

(gnu.prolog)

Ciao provides interfaces to C, C++, Java, and relational databases.

C#-Prolog is a Prolog interpreter written in (managed) C#. It can easily be

integrated into C# programs. Characteristics: reliable and fairly fast interpreter,

command line interface, Windows-interface, builtin DCG, XML-predicates,


SQL-predicates, extendible. The complete source code is available, including a

parser generator that can be used for adding special purpose extensions.

Jekejeke Prolog API provides tightly coupled concurrent call-in and call-out

facilities between Prolog and Java or Android, with the marked possibility to

create individual knowledge base objects. It can be used to embed the ISO

Prolog interpreter in standalones, applets, servlets, APKs, etc..

A Warren Abstract Machine for PHP: a Prolog compiler and interpreter in PHP

5.3. It is a library that can be used standalone or within the Symfony 2.1 framework.