March, 2007 RCO LLC, http://www.rco.ru RCO Text Analysis Technologies for information extraction and business intelligence We can tell you everything about the text!



Page 2: Typical scheme of analytical department operations, scenario 1

Diagram labels: Unstructured text; Search engine; RCO Fact Extractor; Analytical department; Business report; Forecasts; Dossier

Page 3: Typical scheme of analytical department operations, scenario 2

Diagram labels: Unstructured text; RCO Fact Extractor; Indexers; Facts: primary knowledge; Business report; Forecasts; Dossier

Page 4: RCO information extraction key features

special entities (date, address, phone, monetary amount, credit card and account numbers, vehicle and passport numbers, different measures, …)

proper named entities (persons, organizations, geography, goods, …)

entities named by noun phrases

relationships between entities

events, facts and their participants

topics of text on which the author’s attention was focused
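
As a rough illustration of the "special entities" item above: entities such as dates, phone numbers, and monetary amounts can often be spotted from surface patterns alone. The sketch below is not RCO's implementation. The names `SPECIAL_ENTITY_PATTERNS` and `find_special_entities`, and the specific formats the patterns cover, are assumptions chosen for this example.

```python
import re

# Illustrative patterns only (assumed for this sketch); a production
# extractor would need far broader coverage of formats and languages.
SPECIAL_ENTITY_PATTERNS = {
    "date": re.compile(
        r"\b(?:January|February|March|April|May|June|July|August|"
        r"September|October|November|December)"
        r"\s+\d{1,2}(?:st|nd|rd|th)?,\s+\d{4}\b"
    ),
    "monetary_amount": re.compile(r"[$€£]\s?\d[\d,]*(?:\.\d+)?"),
    "phone": re.compile(r"\+\d{1,3}[\s-]\d{2,4}[\s-]\d{3}[\s-]\d{2,4}\b"),
}


def find_special_entities(text):
    """Return (entity_type, matched_text, start_offset) tuples sorted by offset."""
    found = []
    for entity_type, pattern in SPECIAL_ENTITY_PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((entity_type, match.group(0), match.start()))
    return sorted(found, key=lambda item: item[2])


if __name__ == "__main__":
    sample = "On September 7th, 2006 John Smith paid $1,500.00."
    for entity in find_special_entities(sample):
        print(entity)
```

Pattern matching of this kind only covers entities with a regular surface form. Named entities, relationships, and events from the same list generally require linguistic analysis rather than regular expressions.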

Page 5: Semantic network: the RCO way to represent the content of a text

Result of parsing the sentence: On September 7th, 2006 John Smith accepted conditions of the contract with New Design Ltd. for reconstruction of his family castle.
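
The slide's diagram is not preserved in this transcript. One plausible way to picture a semantic network for the example sentence is a labeled directed graph, where nodes are words or phrases and edges are named relations. In the sketch below, the class `SemanticNetwork`, the relation labels (`agent`, `object`, `time`, and so on), and the choice of nodes are all hypothetical and are not taken from RCO's actual representation.

```python
from collections import defaultdict


class SemanticNetwork:
    """A tiny labeled directed graph: nodes are concepts, edges are named relations."""

    def __init__(self):
        self._edges = defaultdict(list)

    def add_relation(self, source, relation, target):
        """Record an edge source --relation--> target."""
        self._edges[source].append((relation, target))

    def relations_of(self, source):
        """Return the outgoing (relation, target) pairs of a node, in insertion order."""
        return list(self._edges.get(source, []))


# Hand-built network for the example sentence (labels are assumptions).
network = SemanticNetwork()
network.add_relation("accept", "agent", "John Smith")
network.add_relation("accept", "object", "conditions")
network.add_relation("accept", "time", "September 7th, 2006")
network.add_relation("conditions", "of", "contract")
network.add_relation("contract", "with", "New Design Ltd.")
network.add_relation("contract", "for", "reconstruction")
network.add_relation("reconstruction", "of", "family castle")
network.add_relation("family castle", "owner", "John Smith")

if __name__ == "__main__":
    for relation, target in network.relations_of("accept"):
        print(f"accept --{relation}--> {target}")
```

Here the network is filled in by hand. In a real analyzer it would be produced by the parser.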

Page 6: Semantic templates: the RCO way to extract facts

This template can extract ‘contract’ facts from different texts, e.g.: On September 7th, 2006 John Smith has accepted conditions of a long-term agreement with New Design Ltd. for reconstruction of his family castle.

Result of ‘contract’ fact extraction:
Signer1 = ‘John Smith’
Signer2 = ‘New Design’
Contract = ‘long-term agreement’
Subject = ‘reconstruction of family castle’
Event = ‘accept’
Date = ‘On September 7th, 2006’
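
The idea of filling template slots from a parsed sentence can be sketched as follows. This is a toy under stated assumptions, not RCO's template language: the triple format, the relation labels, the `event_lemmas` set, and the functions `follow` and `extract_facts` are all hypothetical. Only the slot names and the slot values come from the extraction result above.

```python
# Hypothetical parse of the example sentence as (head, relation, dependent) triples.
TRIPLES = [
    ("accept", "agent", "John Smith"),
    ("accept", "time", "On September 7th, 2006"),
    ("accept", "object", "conditions"),
    ("conditions", "of", "long-term agreement"),
    ("long-term agreement", "with", "New Design"),
    ("long-term agreement", "for", "reconstruction of family castle"),
]

# A template names the event verbs it accepts and, for each slot,
# the chain of relations leading from the event node to the slot filler.
CONTRACT_TEMPLATE = {
    "event_lemmas": {"accept", "sign", "conclude"},
    "slots": {
        "Signer1": ("agent",),
        "Date": ("time",),
        "Contract": ("object", "of"),
        "Signer2": ("object", "of", "with"),
        "Subject": ("object", "of", "for"),
    },
}


def follow(triples, start, path):
    """Walk a chain of relations from start; return the final node or None."""
    node = start
    for relation in path:
        node = next(
            (dep for head, rel, dep in triples if head == node and rel == relation),
            None,
        )
        if node is None:
            return None
    return node


def extract_facts(triples, template):
    """Return one dict per event node for which every template slot can be filled."""
    facts = []
    events = {head for head, _, _ in triples if head in template["event_lemmas"]}
    for event in sorted(events):
        fact = {"Event": event}
        for slot, path in template["slots"].items():
            fact[slot] = follow(triples, event, path)
        if all(value is not None for value in fact.values()):
            facts.append(fact)
    return facts


if __name__ == "__main__":
    for fact in extract_facts(TRIPLES, CONTRACT_TEMPLATE):
        print(fact)
```

Because the template matches on relations rather than on exact wording, the same template could in principle fire on differently phrased sentences, provided the parser maps them to the same structure.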

Page 7: RCO Fact Extractor English: text analyzer for business intelligence

Events, facts, and related participants extracted from text

Extracted information about facts “to have an agreement”

Sentences from source that describe facts “to have an agreement”

Page 8: RCO Fact Extractor Russian: text analyzer for business intelligence

Objects to be monitored: companies and persons

Extracted event participant: companies which were bought by MDM financial group

Sentences from source that describe extracted facts

Page 9:

The RCO team has already developed text analyzers for English, Russian, and Ukrainian.

The RCO team has extensive experience in tuning linguistic algorithms for different languages.

The RCO team is always open to new business partners, new languages, and new challenges.