alKhawarizmy Language Software 2009
15 Years of Work before Deciding to Establish "alKhawarizmy"
"alKhawarizmy Language Software" (established in January 2006)
Although the company is young, the concept behind it goes back 15 years.
The founder of the company, Dr. Hossam ElDin Mahgoub, together with a team of researchers, developers and linguists, was engaged in NLP research applied to the Arabic language.
Dr. Hossam established the company to invest his NLP experience and research for the benefit of the Arabic language and the Arab user.
The greatest challenge was to try to make the computer "understand" the Arabic language and to process it as simply as possible, in spite of its unique and special features.
KSearch, from alKhawarizmy, is an Arabic search engine for websites, companies and organizations. It can search through thousands of Arabic web pages or documents,
thereby benefiting your business through the following features:
1. Speed: KSearch indexes web pages and documents at a rate of about 20,000 words/sec.
2. Automatic Indexing: KSearch's indexing engine can automatically index web pages and documents at an interval that you select.
3. Accuracy: KSearch's primary aim is to facilitate the retrieval of information for your website's visitors or the employees in your company by providing them with fast, comprehensive and accurate information retrieval.
4. Productivity: Search accuracy, fast retrieval of results and automatic indexing are all features that will make your Arabic content more effective. The information you retrieve will be more reliable, since it will be easier to reach than before.
Discover How KSearch can benefit your Business!
Arabic NLP Research
Arabic Applications based on NLP components
- Stress on software quality (targeting ‘zero defect’ S/W).
- Cooperate with the community, e.g. research students at universities (forming partnerships).
- Promote widespread use of affordable applications that take the special features of the Arabic language into account.
- Effectively serve the Arab region by catering for its users’ needs.
- Impact the way an Arabic user searches.
The number of Arab Internet users is growing:
- 22 million users in 2006
- 43 million expected in 2008
The volume of Arabic e-content is increasing (on the web and in companies’ intranets):
- Around 100 million Arabic web pages
- About 5 million Arabic web sites
- Arabic is a highly inflected language
- Arabic morphology has a set of unique features
- Proper Arabic e-content processing is deficient
- Consequently, Arab users are unable to take full advantage of Arabic e-content, compared with other languages
As an example, consider searching through Arabic content …
Using :
- Search for “الحائزون على جوائز نوبل” (“the winners of Nobel prizes”) produces about 238 results
Using :
- Search for “الحائزون على جائزة نوبل” (“the winners of the Nobel prize”) produces about 684 results
Using :
- Search for “حاز على جائزة نوبل” (“won the Nobel prize”) produces about 16,700 results
When used for Arabic search, traditional search engines produce:
- Incomplete results, i.e. not all inflected forms are found => a lot of useful information is missing
- Redundant results, i.e. some results are inaccurate => they ‘bear no relation’ in form or in meaning to the search word(s)
An Arabic Search Model that:
(A) Provides Morphological Search => Comprehensive
(B) Differentiates between Meanings of Arabic words => Improves Accuracy
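The idea behind point (A) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not KSearch's or KMorph's actual method: the `LEMMAS` table is a tiny hand-written stand-in for a real morphological analyzer, and the names `lemma`, `build_index` and `search` are hypothetical. It shows why an exact-match index misses inflected forms while an index keyed on a normalized form finds them.

```python
# Hypothetical surface-form -> lemma table, written by hand for this example.
# A real analyzer would derive these mappings from a lexicon and morphology rules.
LEMMAS = {
    "حاز": "حاز", "حازت": "حاز", "الحائز": "حاز", "الحائزون": "حاز",
    "جائزة": "جائزة", "جوائز": "جائزة", "الجوائز": "جائزة",
}


def lemma(word):
    """Map a word to its lemma; unknown words are returned unchanged."""
    return LEMMAS.get(word, word)


def build_index(docs, normalize):
    """Build an inverted index: normalized word -> set of document ids."""
    index = {}
    for doc_id, text in enumerate(docs):
        for word in text.split():
            index.setdefault(normalize(word), set()).add(doc_id)
    return index


def search(index, query, normalize):
    """Return ids of documents containing every (normalized) query word."""
    postings = [index.get(normalize(w), set()) for w in query.split()]
    return set.intersection(*postings) if postings else set()


DOCS = [
    "الحائزون على جوائز نوبل",
    "حاز على جائزة نوبل",
    "حازت على جائزة نوبل",
]
QUERY = "حاز على جائزة نوبل"

# Exact matching: words are indexed and looked up as written.
exact_hits = search(build_index(DOCS, lambda w: w), QUERY, lambda w: w)

# Morphological matching: documents and query are both reduced to lemmas.
morph_hits = search(build_index(DOCS, lemma), QUERY, lemma)

print(sorted(exact_hits))  # only the document with the identical wording
print(sorted(morph_hits))  # all three documents
```

With exact matching only the second document is found; with lemma-based matching all three are, which is the "comprehensive" behavior the slide describes.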
In other words…
Let us see the same example, using KSearch…
Search
- Arabic Morphological Search (to produce comprehensive search results).
- Document, as well as Database Search.
- Differentiation between Word Meanings (to increase accuracy of search results, i.e. reduce redundancy).
- Search using Logical Operators (و – أو – ليس, i.e. AND, OR, NOT).
- Adjacency (Proximity) Search, with the query words in order or in any order.
- Search using Wildcards (for proper nouns).
- Search words are highlighted in the results pages.
- Latin character support (English words).
- Spell checking of query words.
- Stem and Thesaurus Search.
= NEW (After Incubation Funding)
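The logical-operator and proximity features listed above can be illustrated with a small positional inverted index. The following is a minimal sketch, not KSearch's implementation: the function names, the binary reading of ليس as "left term but not right term", and the `max_gap` parameter are all assumptions made for the example.

```python
AND, OR, NOT = "و", "أو", "ليس"  # operator spellings taken from the feature list


def build_positional_index(docs):
    """Map each word to {doc_id: [positions of the word in that document]}."""
    index = {}
    for doc_id, text in enumerate(docs):
        for pos, word in enumerate(text.split()):
            index.setdefault(word, {}).setdefault(doc_id, []).append(pos)
    return index


def docs_with(index, word):
    """Set of ids of documents that contain the word."""
    return set(index.get(word, {}))


def evaluate(index, left, op, right):
    """Evaluate a two-term query such as: left AND right."""
    a, b = docs_with(index, left), docs_with(index, right)
    if op == AND:
        return a & b
    if op == OR:
        return a | b
    if op == NOT:  # assumed meaning: contains left but not right
        return a - b
    raise ValueError("unknown operator: " + op)


def near(index, first, second, max_gap=1, ordered=True):
    """Documents where the two words occur within max_gap positions.

    ordered=True requires `first` to come before `second`.
    """
    first_post = index.get(first, {})
    second_post = index.get(second, {})
    hits = set()
    for doc_id in first_post.keys() & second_post.keys():
        for p in first_post[doc_id]:
            for q in second_post[doc_id]:
                gap = q - p
                if ordered:
                    close = 0 < gap <= max_gap
                else:
                    close = 0 < abs(gap) <= max_gap
                if close:
                    hits.add(doc_id)
    return hits


DOCS = [
    "جائزة نوبل للسلام",
    "نوبل جائزة عالمية",
    "جائزة الدولة التقديرية",
]
index = build_positional_index(DOCS)

both = evaluate(index, "جائزة", AND, "نوبل")
either = evaluate(index, "جائزة", OR, "نوبل")
without = evaluate(index, "جائزة", NOT, "نوبل")
in_order = near(index, "جائزة", "نوبل", ordered=True)
any_order = near(index, "جائزة", "نوبل", ordered=False)
```

Storing word positions rather than only document ids is what makes the adjacency check possible; a plain document-level index would support the logical operators but not proximity.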
Indexing
- Comprehensive dictionary of contemporary Arabic (approximately 78,000 entries).
- Document, as well as Database Indexing.
- Fast Indexing Engine (≈ 20,000-56,000 words/sec on a PC with an Intel Core 2 Duo CPU running at 2.33 GHz, SATA HDD, 3 GB RAM).
- Uses 64-bit technology => unlimited index size.
- Comprehensive Index Management: capability of deleting, updating and merging indexes.
- The following document formats are supported, including Unicode-encoded documents: Text, RTF, MS Office, PDF.
[Architecture diagram. Component labels: Arabic Data Source (Database, Document, etc.); Indexing Engine; Meta Data Repository; Search Engine; Search Results; Arabic Morphological Analyzer; Comprehensive + Contemporary Arabic Lexicon; Arabic Lexical Semantic Analyzer]
Component Oriented Architecture:
Software Integrated in:
- Websites: Web Edition.
- Enterprises (Intranets): Enterprise Edition.
- Single PCs: Desktop Edition.
Software as a Service (SaaS) – Future Direction:
On Dedicated Web Server.
Employs KMorph, a fast Arabic morphological analyzer.
Uses a comprehensive Arabic lexicon of contemporary words.
KSpell Engine: Provides APIs for spelling verification and correction, e.g. may be integrated with content management systems to produce correctly spelled Arabic web content.
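As a rough illustration of what spelling verification and correction against a lexicon involve, here is a minimal sketch based on edit distance. It does not reflect the real KSpell API, which the slides do not describe: `LEXICON`, `is_correct` and `suggest` are hypothetical names, the lexicon holds only a handful of words, and plain Levenshtein distance is an assumed technique chosen for simplicity.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Tiny stand-in for a comprehensive lexicon (assumption for this example).
LEXICON = {"جائزة", "جوائز", "نوبل", "مكتبة", "كتاب"}


def is_correct(word):
    """Verification: a word is accepted if it appears in the lexicon."""
    return word in LEXICON


def suggest(word, max_distance=1):
    """Correction: lexicon entries within max_distance edits, closest first."""
    scored = sorted((edit_distance(word, entry), entry) for entry in LEXICON)
    return [entry for distance, entry in scored if distance <= max_distance]


misspelled = "جايزة"  # written with ي where the lexicon form has ئ
suggestions = suggest(misspelled)
print(is_correct(misspelled), suggestions)
```

A content management system could call a check like `is_correct` on each word before publishing and offer the `suggest` results to the editor. A production engine would also need morphological awareness, since a flat word list cannot cover every inflected Arabic form.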
Target Audience:
1- e-Government.
2- Web Publishers (News sites, Web developers, etc.).
3- Web Content Management (CMS, E-library systems, Helpdesk, etc.).
4- Arabic and Arabic-enabled internet search sites.
Competitive Advantage:
Price
Off-the-Shelf Installation
Online Demonstration
Target Markets
Arabic Web Sites (3,000,000)
Corporations' Intranets (566,000)