blogic. (iswc 2009 invited talk)

ISWC 2009

BLOGIC

Pat Hayes, Florida IHMC

Web Logic = Blogic

(Please do not ask me about Alogic.)

Two talks... Blogic + RDF Redux (I couldn't decide which one to give, so you are getting both.)

Blogic theme: when logic gets on the Web, it really needs to be re-thought from the ground up. It is a new subject.

RDF Redux theme: with hindsight, we can do RDF better, making it into a fully expressive Common Logic dialect.

Revised RDF could be a real logic for the SWeb, the first true blogic.

What y'all might be thinking at this point.

Please, enough with the logic already. We have way too many web logics, a positive zoo of endangered OWL species, so don't give us another one.

Even if you are right, it's too late. You and Guha tried, Pat, but nobody was interested in LBase and only about four people have read the Common Logic spec. Give up on it, the RDF/OWL train has left the station.

Why isn't Blogic just Logic?

The web portability principle.

Names and identification.

The Horatio principle.

SameAs not the same as.

Death by Layering.

Say what you are going to say next.

Say it.

Say what it was you just said.

Say it again.

Speaking

Web portability

This diagram should always commute:

[Diagram: "Some Logic Stuff" entails "Some other Logic Stuff". HTTP carries each of them to "The same Logic Stuff but somewhere else" and "The same other Logic Stuff but somewhere else", and the entailment holds between the transferred copies too.]

Web portability

[Diagram: the graph

:Married rdf:type :ConjugalRelation
:Jill :Married :Jack

is carried unchanged by HTTP to another place, and in both places it entails

_:x rdf:type :ConjugalRelation
:Jill _:x :Jack]

Web portability

RDF is portable. ISO Common Logic is portable.

OWL-DL and classical FOL syntax are not portable.

OWL 2 is not portable but is better than OWL. Maybe OWL 3 will be portable. Sigh.
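The Jill-and-Jack example can be checked mechanically. The following is a minimal sketch, assuming graphs are plain Python sets of string triples, that blank nodes are the strings starting with "_:", and that entailment means simple entailment (some assignment of the conclusion's blank nodes maps it into the premise). The helper names simply_entails and transfer are hypothetical, and transfer is only a stand-in for HTTP. Note that the slide uses a blank node in property position, which this sketch allows.

```python
from itertools import product

def is_blank(term):
    # Assumption: blank nodes are written with a leading "_:".
    return term.startswith("_:")

def simply_entails(g, h):
    """True if some assignment of h's blank nodes to terms of g maps h into g."""
    blanks = sorted({t for triple in h for t in triple if is_blank(t)})
    terms = sorted({t for triple in g for t in triple})
    for values in product(terms, repeat=len(blanks)):
        m = dict(zip(blanks, values))
        if all(tuple(m.get(t, t) for t in triple) in g for triple in h):
            return True
    return False

def transfer(graph):
    # Stand-in for an HTTP transfer: the receiver gets the same triples.
    return set(graph)

premise = {
    (":Married", "rdf:type", ":ConjugalRelation"),
    (":Jill", ":Married", ":Jack"),
}
conclusion = {
    ("_:x", "rdf:type", ":ConjugalRelation"),
    (":Jill", "_:x", ":Jack"),
}

# Entailment checked "here" and again after both graphs have been moved.
here = simply_entails(premise, conclusion)
there = simply_entails(transfer(premise), transfer(conclusion))
```

The diagram commutes in this toy setting because the check depends only on the triples themselves, never on where they are stored.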

Names

Names are central in blogic. They are global in scope. They have structure. They link blogical content to other meaningful things, including other blogical content. They embody human/social meanings as well as being conduits and route maps for information transfer. In many ways, the Web is constituted by the links which are the blogic names, and the logical content which we write using those names is only one component, perhaps a minor one, of the whole social and technical structure which determines their meanings. And yet seen from the perspective of the logic, these IRIs are merely "logical names", elements of an arbitrary set of meaningless character strings. In AI/KR, we teach our students that the names are irrelevant, because one can replace them all with gensyms without changing the logical meaning.

Clearly, there is something unsatisfactory about this picture, a serious disconnect between the classical logical view of names as simply uninterpreted strings waiting in a kind of blank innocence to have their possible interpretations controlled by the pure semantic power of the axioms that use them, and the reality of the almost unrestricted referential power that these names actually have in the dynamics of the Web. Think of the concern and attention that is devoted to their choice, who owns them, who is responsible for maintaining and controlling them, and the ways they are decomposed and used in the planet-wide machinery called the Internet, none of which has very much at all to do with logical assertions. Another way to put it: IRIs are *identifiers*, not mere logical names. Unfortunately, nobody seems to be able to say what in God's name that can possibly mean.

httpRange-14 is just one symptom of this disconnect.

Names and RDF

RDF semantic interpretations are stated as mappings on a 'vocabulary' = a set of names. Textbook logic stuff, but wrong for a blogic like RDF.

Redux: we should have said that every interpretation is a mapping from all URI references. There are no 'local' names in RDF.

Horatio principle

There are more things in heaven and earth, Horatio, than are dreamt of in your ontology.

So you cannot say "forall x...", only "forall x in class C,... "

SameAs not the same as

A sameAs B doesn't mean that anything you say about A is also true of B (when referred to by that name). The name you use matters.

Sodium

You say 3D, and I say 4D

Linked data needs to be able to express co-reference without implying acceptance of an entire conceptualization. Is there a "degree of ontological commitment" (?)

"Nearly same as" = "slightly pregnant"

Lynn Stein: social roles for names and descriptors. Speech-act semantics??

Death by Layering

The 'layer cake' diagram is good computer architecture but really, really bad semantic architecture. Blogical forms do not naturally layer, because names have a different logical status at different levels.

OWL/RDF is layered on RDF in this way, which is why SPARQL cannot know what 'entailed' means.

The same piece of logical text has several different entailment regimes applying to it, with no way to communicate which one is intended, destroying portability.

This is a mess, which will get worse. It will not fix itself. We need to provide blogic as a single layer with one notion of entailment. It can have subcases, but not layers.

OK, part two: RDF Redux

With hindsight comes wisdom.

There are many things wrong with RDF which should be done better.

Literals allowed in subject position; naming graphs; a nuanced version of importing; not having plain literals and xsd:string as a datatype; reification; containers; etc. Details, details.

But there is a much more fundamental glitch in the RDF conceptual model, one that we simply missed. Fixing that one properly makes RDF simpler, more rational, more useful and vastly more expressive.

The matter of the blank node

Blank nodes in current RDF are broken.

Why are blank nodes so hard to get right?

RDF abstract syntax is a node-arc diagram

Blank nodes are just nodes that have no label.

That seems pretty obvious.

But it's not so obvious how to say this mathematically.

The RDF spec uses set language: it says that an RDF graph is a set of triples, and that blank nodes are elements of a set of items disjoint from URIs and literals.

But there is something fundamentally wrong with this 'set' style of describing syntax.


Mathematical sets aren't the right kind of thing to make syntax out of.

Sets exist in a Platonic universe of abstractions. There is no type/token distinction. You can't copy a set. You can't write or transmit a set. You can't put a set on a Web server.

There are unresolved puzzles. Is any set of triples an RDF graph?

The same blank node might be in several graphs (why not, when a graph is just a set?). Hence we get union versus merge, etc.


What is missing in RDF concepts is something to capture the intuition that an RDF graph is like a node-arc diagram. (Not a 'mathematical' graph!)

RDF graphs are drawn on surfaces. Blank nodes are marks on the surface. Intuitively, think of a surface as a piece of paper, or a screen, or a document.

Surfaces provide the missing type/token distinction. Putting the same graph onto a new surface is like making a copy. But copying a graph onto a new surface always gets you new blank nodes, because a mark can only be on one surface. Aha!

A blank node is a mark on a surface.

Formally: take the RDF concepts as published, add a set of surfaces, disjoint from all the others, and a functional property of 'being on' relating blank nodes (call them marks for emphasis) to surfaces. Call the set of marks on a surface the graffiti of the surface. Define a graph to be a pair of an RDF graph G and a surface S such that the blank nodes of G are a subset of the graffiti of S. The triples of a graph are the triples of the RDF graph. We will say that the triples of the graph, and the URIs and literals which occur in the RDF graph, are on the surface.

A blank node is a mark, on a surface.

(From now on, 'graph' means RDF-graph + surface.)

A graph can have extra marks, but they don't mean anything so are harmless (technically, they say that something exists.)

A surface can have more than one graph on it, but a graph cannot be split over multiple surfaces. (Contrast RDF graph.)

Even with no blank nodes, each graph is on a single surface.

A copy of a graph <G, S> is a graph <G', S'> such that there is a 1:1 map m from the marks of S to those of S' and G' = m(G).
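The definitions of mark, surface, graph and copy can be sketched as data structures. This is only an illustration under stated assumptions: the class names Surface, Mark and Graph and the function copy_graph are hypothetical, URIs and literals are plain strings, and a mark is identified by object identity rather than by any label.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)
class Surface:
    kind: str = "positive"
    marks: set = field(default_factory=set)  # the graffiti of the surface

@dataclass(eq=False)
class Mark:
    """A blank node: a mark that is on exactly one surface."""
    surface: Surface

    def __post_init__(self):
        self.surface.marks.add(self)

@dataclass(eq=False)
class Graph:
    """A pair <G, S>: RDF triples plus the surface they are drawn on."""
    triples: frozenset
    surface: Surface

    def __post_init__(self):
        # The blank nodes of G must be among the graffiti of S.
        for mark in self.blank_nodes():
            if mark not in self.surface.marks:
                raise ValueError("blank node is on a different surface")

    def blank_nodes(self):
        return {t for triple in self.triples for t in triple if isinstance(t, Mark)}

def copy_graph(graph, target=None):
    """Copy <G, S> to <G', S'> via a 1:1 map m from marks of S to fresh marks of S'."""
    if target is None:
        target = Surface(graph.surface.kind)
    m = {old: Mark(target) for old in graph.surface.marks}
    triples = frozenset(tuple(m.get(t, t) for t in triple) for triple in graph.triples)
    return Graph(triples, target), m

s = Surface()
x = Mark(s)
g = Graph(frozenset({(x, "rdf:type", "ex:City")}), s)
g2, m = copy_graph(g)
```

Because a Mark records its one surface, copying cannot reuse the old blank nodes: the copy necessarily has new ones, which is the point of the slide.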

A blank node is a mark, on a surface.

Surfaces make sense of RDF syntax, while keeping it abstract. They also provide a neat abstraction for some Webbish notions.

Surfaces provide the missing type/token distinction, and make sense of the ideas of copying and transmitting (= copying onto a distant surface) RDF graphs.

Surfaces get rid of the merge/union distinction. A conjunction of two graphs is a graph got by copying them both onto a single surface. (No need to "standardize apart")

Surfaces provide a way to define syntactic scope in RDF. Graphs have a natural 'boundary'.

The URI of a named graph identifies a graph. (Not an RDF graph!)

Surfaces provide a way to track 'dynamic' RDF graphs. The surface retains its identity through RDF graph changes. Makes sense of SPARQL "update".

Surfaces handily resolve tricky bnode-scoping issues e.g. in SPARQL. The query, the reference graph and the answers are all on distinct surfaces: end of story.
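The claim that conjunction needs no "standardizing apart" can be illustrated with a small sketch. It assumes, hypothetically, that a surface is identified by a plain string label, that a graph is a set of triples whose blank nodes are Mark objects, and that copy_onto and conjoin are names invented here.

```python
class Mark:
    """A blank node: an unlabeled mark that lives on exactly one surface."""
    def __init__(self, surface):
        self.surface = surface

def copy_onto(triples, surface):
    # Copying always creates fresh marks on the target surface.
    fresh = {}
    out = set()
    for triple in triples:
        new = []
        for term in triple:
            if isinstance(term, Mark):
                if term not in fresh:
                    fresh[term] = Mark(surface)
                term = fresh[term]
            new.append(term)
        out.add(tuple(new))
    return out

def conjoin(graphs, surface):
    # Conjunction: copy every graph onto one shared surface.
    result = set()
    for triples in graphs:
        result |= copy_onto(triples, surface)
    return result

a = Mark("doc-A")
b = Mark("doc-B")
g1 = {(a, "ex:name", '"Jack"')}
g2 = {(b, "ex:name", '"Jill"')}

conj = conjoin([g1, g2], "doc-C")
conj_marks = {t for triple in conj for t in triple if isinstance(t, Mark)}
```

No renaming step appears anywhere: blank nodes from different sources cannot collide, because each copy makes its own marks.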

Surfaces are a good idea.

Think colored paper.

Positive surfaces claim that an RDF graph on them is true. This is the current RDF default assumption.

Negative surfaces claim that an RDF graph on them is false.

Neutral surfaces simply make no claims at all about their graphs. (Good place to put e.g. RDF collection triples in OWL/RDF.)

We can imagine others (deprecating surfaces?) but this will do for now.

Kinds of surface.

(If we only allow positive surfaces, this is just current RDF but with a cleaner conceptual model.)

By allowing different kinds of surface, we can encode different assertional modes. For example, the surface can assert the graph or deny the graph or just display the graph without making claims about its truth either way. None of this changes the RDF semantics of RDF graphs!

Once we have denial and scoping, we have negation. RDF already has conjunction and the existential quantifier (blank nodes). This gives a graphical syntax for full first-order logic, if we have the freedom to combine them properly.

Surfaces are a very good idea.

Using a graph syntax for logic is one of the oldest ideas (C. S. Peirce, 1885) and very well understood. http://www.flickr.com/photos/lilitupili/260552781/

((p => a) & (q => b)) => ((p & q) => (a & b))
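The formula above can be confirmed to be a tautology by brute force. The sketch below also rewrites each implication using only conjunction and negation, x => y as not(x and not y), which is the shape a graphical notation built from conjunction and denial would use. The function names are hypothetical.

```python
from itertools import product

def implies(x, y):
    # Material implication as usually written.
    return (not x) or y

def cut_implies(x, y):
    # The same connective using only conjunction and negation.
    return not (x and not y)

def formula(p, q, a, b, imp=implies):
    # ((p => a) & (q => b)) => ((p & q) => (a & b))
    return imp(imp(p, a) and imp(q, b), imp(p and q, a and b))

rows = list(product([False, True], repeat=4))
is_tautology = all(formula(*row) for row in rows)
same_with_cuts = all(
    formula(*row) == formula(*row, imp=cut_implies) for row in rows
)
```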

Because RDF graphs retain their current RDF semantics, marks on a negative surface are more like universally quantified variables.

DeMorgan's law: (not (exists x ...)) = (forall x (not ...))

Kinds of surface.

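[Figure: the triple _:x rdf:type ex:oddities drawn on two kinds of surface. On a positive surface it says "oddities exist". On a negative surface it says "not(oddities exist)", that is, "everything is not an oddity".]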

In order to get the full power of logic, we need a way to include surfaces inside other surfaces.

Extend the abstract RDF-surface model to allow surfaces, as well as nodes and triples, to be on a surface.

A finite set of surfaces tree-ordered by 'on' is a codex. Extending RDF to allow graphs on codices instead of (simple) surfaces makes it into Peirce conceptual graph notation, giving it the power of full FOL (in fact, of ISO Common Logic).

Surfaces on surfaces: RDF codices.


Every city is a human community.

Some non-city is a human community.


Putting RDF graphs on a codex requires that we are precise about exactly which surface each node of each triple in the graph is on. This is easy to do graphically:

Not( exists something which is a City and Not(a HumanCommunity))

Every City is a HumanCommunity

ex:City rdfs:subClassOf ex:HumanCommunity .


This graph now has its RDFS meaning in RDF already. The RDF semantics defines the RDFS meaning. It is not a "semantic extension", and there is no layering involved, only abbreviation, AKA syntactic sugar.
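The reading of nested negative surfaces as "Every City is a HumanCommunity" can be tested on a small finite interpretation. This is a sketch under strong simplifying assumptions: a surface is a dictionary with hypothetical keys kind, marks, triples and inner; every name denotes itself; and an interpretation is just a finite domain plus a set of true triples. The individuals ex:Pensacola and ex:Camp are invented for the example.

```python
from itertools import product

def holds(surface, facts, domain, env=None):
    """Evaluate a surface: marks are existentials, a negative surface denies its content."""
    env = env or {}
    marks = surface.get("marks", [])

    def body(e):
        triples_ok = all(
            tuple(e.get(t, t) for t in triple) in facts
            for triple in surface.get("triples", [])
        )
        return triples_ok and all(
            holds(inner, facts, domain, e) for inner in surface.get("inner", [])
        )

    found = any(
        body({**env, **dict(zip(marks, values))})
        for values in product(sorted(domain), repeat=len(marks))
    )
    return found if surface["kind"] == "positive" else not found

# Not( exists something which is a City and Not(a HumanCommunity) )
every_city_is_community = {
    "kind": "negative",
    "marks": ["_:x"],
    "triples": [("_:x", "rdf:type", "ex:City")],
    "inner": [
        {"kind": "negative", "triples": [("_:x", "rdf:type", "ex:HumanCommunity")]}
    ],
}

domain = {"ex:Pensacola", "ex:Camp"}
facts_ok = {
    ("ex:Pensacola", "rdf:type", "ex:City"),
    ("ex:Pensacola", "rdf:type", "ex:HumanCommunity"),
    ("ex:Camp", "rdf:type", "ex:HumanCommunity"),
}
facts_bad = {("ex:Pensacola", "rdf:type", "ex:City")}

ok = holds(every_city_is_community, facts_ok, domain)
bad = holds(every_city_is_community, facts_bad, domain)
```

The codex is true exactly when no element of the domain is a City without also being a HumanCommunity, which matches the rdfs:subClassOf reading.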

With just this much extra apparatus, RDF is all the logic we need.

Graphical convention (used already): an RDF triple is attached to a surface by its property arc label. The subject and object nodes might be on other surfaces.


Text convention: add 'surface parentheses' and explicit bnode binding syntax to Ntriples or Turtle.


%not[ _:x
  _:x rdf:type ex:City .
  %not[ _:x rdf:type ex:HumanCommunity .
  %]
%]

Abbreviations may not be very easy to read, but they work.

aaa rdfs:range bbb .

==>>

%not[ _:x _:y
  _:x aaa _:y .
  %not[ _:y rdf:type bbb .
  %]]

aaa is bbb owl:allValuesFrom ccc .

==>>

%not[ _:x _:y
  _:x rdf:type aaa .
  _:x bbb _:y .
  %not[ _:y rdf:type ccc .
  %]]

%not[ _:x
  %not[ _:y
    _:x bbb _:y .
    %not[_:y rdf:type ccc .
    %]]
  %not[_:x rdf:type aaa .
  %]]
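Since these abbreviations are purely textual, expanding one is a simple rewrite. The sketch below does this for the rdfs:range case only. The function name expand_range is hypothetical, and the line breaks and indentation of the output are an assumption about layout, since the slides show the surface syntax only informally.

```python
def expand_range(prop, cls):
    """Expand 'prop rdfs:range cls .' into nested negative surfaces (text form)."""
    return "\n".join([
        "%not[ _:x _:y",                 # bind two marks on the outer negative surface
        f"  _:x {prop} _:y .",           # something has a prop-value ...
        f"  %not[ _:y rdf:type {cls} .", # ... that is not in the range class
        "  %]]",
    ])

expanded = expand_range("aaa", "bbb")
print(expanded)
```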

Semantic OWL/RDF

Currently, OWL has its semantics and so does RDF and this is a problem.

With surfaces, we can simply encode OWL meanings directly in RDF, using the RDF semantics rather than trying to avoid it. Then, OWL (and much of RIF) are simply organized collections of RDF abbreviations and restrictions. There is no layering and no extra semantics: the only requirement for the 'higher-level' specs is to define the 'upper' notations as syntactic sugar.

The RDF+surfaces conceptual model provides a single, universal interchange format for (nearly) all SWeb languages, with a single, uniform semantic model. A true blogic, in fact.

A bigger base for the layer cake.

Some of RIF is outside normal logic. SPARQL is a law unto itself. The rest is (revised) RDF with syntactic sugar and restrictions.

Resources

Peircean graphical logic has been widely used, see http://conceptualgraphs.org/, and even standardized (ISO 24707 App. B).

John Sowa has been very active in this area, and I have used his ideas at key places. See http://www.jfsowa.com/cg/index.htm
