FACE ANNOTATION
CHAPTER 1
INTRODUCTION
In computer science, image processing is any form of signal
processing for which the input is an image, such as a photograph or video
frame; the output of image processing may be either an image or a set of
characteristics or parameters related to the image. Most image-processing
techniques involve treating the image as a two-dimensional signal and
applying standard signal-processing techniques to it.
Today, face recognition is a rapidly growing area in image
processing. It has many uses in fields such as biometric authentication
and security, among others.
1.1 FACE DETECTION
Face detection is a computer technology that determines the
locations and sizes of human faces in arbitrary (digital) images. It detects
facial features and ignores anything else, such as buildings, trees and
bodies.
Face detection can be regarded as a specific case of object-class
detection. In object-class detection, the task is to find the locations and
sizes of all objects in an image that belong to a given class. Examples
include upper torsos, pedestrians, and cars.
Face detection can be regarded as a more general case of face
localization. In face localization, the task is to find the locations and sizes
of a known number of faces (usually one). In face detection, one does not
have this additional information.
Early face-detection algorithms focused on the detection of frontal
human faces, whereas newer algorithms attempt to solve the more general
and difficult problem of multi-view face detection. That is, the detection
of faces that are either rotated along the axis from the face to the observer
(in-plane rotation), or rotated along the vertical or left-right axis (out-of-
plane rotation), or both. The newer algorithms take into account
variations in the image or video by factors such as face appearance,
lighting, and pose.
Single-image detection methods are classified into four categories,
although some methods clearly overlap category boundaries; they are
discussed in this section.
Knowledge-based methods.
Feature invariant approaches.
Template matching methods.
Appearance-based methods.
KNOWLEDGE-BASED METHODS: These rule-based methods
encode human knowledge of what constitutes a typical face. Usually, the
rules capture the relationships between facial features. These methods are
designed mainly for face localization.
FEATURE INVARIANT APPROACHES: These algorithms aim to
find structural features that exist even when the pose, viewpoint, or
lighting conditions vary, and then use these to locate faces. These
methods are designed mainly for face localization.
TEMPLATE MATCHING METHODS: Several standard patterns of a
face are stored to describe the face as a whole or the facial features
separately. The correlations between an input image and the stored
patterns are computed for detection. These methods have been used for
both face localization and detection.
APPEARANCE-BASED METHODS: In contrast to template
matching, the models (or templates) are learned from a set of training
images which should capture the representative variability of facial
appearance. These learned models are then used for detection. These
methods are designed mainly for face detection.
1.2 FACE RECOGNITION
Face recognition systems are becoming progressively more popular as a
means of extracting biometric information. Face recognition has a critical
role in biometric systems and is attractive for numerous applications
including visual surveillance and security. Because of the general public
acceptance of face images on various documents, face recognition has a
great potential to become the next generation biometric technology of
choice. Face images are also the only biometric information available in
some legacy databases and international terrorist watch-lists and can be
acquired even without subjects' cooperation.
1.3 ONLINE SOCIAL NETWORKS (OSNs):
An online social network is an online service, platform, or site that
focuses on building and reflecting social networks or social relations
among people who, for example, share interests and/or activities. A
social network service consists of a representation of each user (often a
profile), his or her social links, and a variety of additional services. Most
social network services are web-based and provide means for users to
interact over the Internet, such as e-mail and instant messaging.
Social networking sites allow users to share ideas, activities,
events, and interests within their individual networks.
The main types of online social networks are those that contain
category places (such as former school years or classmates), means to
connect with friends (usually via self-description pages), and
recommendation systems based on trust. Popular services now combine
many of these features, with Facebook, Google+ and Twitter widely used
worldwide; Facebook, Twitter, LinkedIn and Google+ are very popular in
India.
Web-based social networking services make it possible to connect
people who share interests and activities across political, economic, and
geographic borders. Through e-mail and instant messaging, online
communities are created where a gift economy and reciprocal altruism are
encouraged through cooperation. Information is particularly suited to a gift
economy, as information is a nonrival good and can be gifted at
practically no cost.
Facebook and other social networking tools are increasingly the
object of scholarly research. Scholars in many fields have begun to
investigate the impact of social-networking sites, investigating how such
sites may play into issues of identity, privacy, social capital, youth
culture, and education.
1.4 FACE ANNOTATION:
Face annotation technology is important for a photo management
system. The act of labeling identities (i.e., names of individuals or
subjects) on personal photos is called face annotation or name tagging.
This feature is of considerable practical interest for Online Social
Networks.
Automatic face annotation (or tagging) facilitates improved
retrieval and organization of personal photos in online social networks.
The users of Online Social Networks spend enormous amounts of
time in browsing the billions of photos posted by their friends and
relatives. Normally they would go through and tag photos of friends and
themselves if they were so inclined, but now Online Social Networks are
trying to make things easier with face recognition.
Using face-detection algorithms similar to those now found in digital
cameras, Online Social Networks can spot a face in an image once it is
uploaded, and a box appears with a space for users to enter their own or
their friends' names, tagging them for recognition across the system.
The next step for Online Social Networks is automatic face annotation.
Typically, online social networks like Facebook and Myspace are
used for sharing or managing photo and video collections. Users tag
photos of individuals and friends using face annotation. However, most
online social networks offer only manual face annotation, a time-
consuming and labor-intensive task. The number of photos posted online
increases astronomically each day, and that is a great many photos and
faces to tag.
CHAPTER 2
LITERATURE REVIEW
2.1 SOCIAL NETWORK CONTEXT:
Most personal photos that are shared online are embedded in some
form of social network, and these social networks are a potent source of
contextual information that can be leveraged for automatic image
understanding. The utility of social network context is investigated for the
task of automatic face recognition in personal photographs. Here, face
recognition scores are combined with social context in a conditional
random field (CRF) model, and this model is applied to label faces in photos
from the popular online social network Facebook, which is now the top
photo-sharing site on the Web with billions of photos in total.
An increasing number of personal photographs are uploaded to
online social networks, and these photos do not exist in isolation. Social
networks are an important source of image annotations, and they also
provide contextual information about the social interactions among
individuals that can facilitate automatic image understanding. Here, the
focus is on the specific problem of automatic face recognition in personal
photographs. Photos and context drawn from the online social network
Facebook are used, which is currently the most popular photo-sharing site
on the Web. Many Facebook photos contain people, and these photos
comprise an extremely challenging dataset for automatic face recognition.
To incorporate context from the social network, a conditional
random field (CRF) model is defined for each photograph that consists of
a weighted combination of potential functions. One such potential comes
from a baseline face recognition system, and the rest represent various
aspects of social network context. The weights on these potentials are
learned by maximizing the conditional log-likelihood over many training
photos, and the model is then applied to label faces in novel photographs.
Drawing on a database of over one million downloaded photos, it is
shown that incorporating social network context leads to a significant
improvement in face recognition rates.
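The weighted combination of potentials described above can be sketched as follows. All scores, names, and weights here are hypothetical; in the actual model the weights are learned by maximizing the conditional log-likelihood over training photos, whereas this sketch simply evaluates a fixed weighted sum over a tiny label space and picks the best joint labeling by exhaustive search.

```python
import itertools

# Hypothetical unary potentials: baseline FR score for each (face, label) pair.
fr_score = {
    (0, "alice"): 0.9, (0, "bob"): 0.4,
    (1, "alice"): 0.5, (1, "bob"): 0.6,
}
# Hypothetical pairwise potential: how often these people co-occur in photos.
co_occurrence = {frozenset(["alice", "bob"]): 0.8,
                 frozenset(["alice"]): 0.1,
                 frozenset(["bob"]): 0.1}

W_FR, W_SOCIAL = 1.0, 0.5  # weights; learned from training data in the real model

def energy(labels):
    """Weighted combination of potentials for one joint labeling of a photo."""
    unary = sum(fr_score[(i, name)] for i, name in enumerate(labels))
    pairwise = co_occurrence.get(frozenset(labels), 0.0)
    return W_FR * unary + W_SOCIAL * pairwise

# Exhaustive MAP inference over a two-face photo (feasible only for tiny label sets;
# real CRF inference uses more scalable algorithms).
people = ["alice", "bob"]
best = max(itertools.product(people, repeat=2), key=energy)
```

Note how the social term tips the balance: the baseline scores alone would be ambiguous for face 1, but the strong co-occurrence of the two friends favors labeling the photo ("alice", "bob").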
ADVANTAGES:
This method combines image data with social network context in a
conditional random field model to improve recognition performance.
DISADVANTAGES:
The method depends on large numbers of labeled faces being
available; for photos that contain few faces, especially single-face
photos, only more limited social context is available.
2.2 IMAGE CONTENT AND SOCIAL CONTEXT:
MediAssist is a system which facilitates browsing, searching and
semi-automatic annotation of personal photos, using analysis of both
image content and the context in which the photo is captured. This
semiautomatic annotation includes annotation of the identity of people in
photos. Technologies for efficiently managing and organizing digital
photos assume more and more importance as users wish to efficiently
browse and search through larger and larger photo collections. It is
accepted that content-based image retrieval techniques alone have failed
to bridge the so-called semantic gap between the visual content of an
image and the semantic interpretation of that image by a person. Personal
photos differ from other images in that they have an associated context,
often having been captured by the user of the photo management system.
Users will have personal recollection about the time, place and other
context information relating to the environment of photo capture, and
digital personal photos make a certain amount of contextual metadata
available in their EXIF header, which stores the time of photo capture and
camera settings such as lens aperture and exposure time. GPS location
information is also supported by EXIF and, although not currently
captured by most commercial cameras, there are ways of “location-
stamping” photos using data from a separate GPS device, and camera
phones are inherently location-aware.
Systems for managing personal photo collections could thus make
use of this contextual metadata in their analysis and organization of
personal photos. One of the more important user needs for the
management of personal photo collections is the annotation of the
identities of people. They proposed the MediAssist system for personal
photo management, a context-aware photo management system that
includes person annotation technologies as one of its major features.
The system uses context- and content-based analysis to provide
powerful tools for the management of personal photo collections, and
facilitates semi-automatic person-annotation in personal photo
collections, powered by automatic analysis. Traditional face recognition
approaches do not generally cope well in an unconstrained photo capture
environment, where variations in lighting, pose and orientation represent
major challenges. It is possible to overcome some of these problems by
exploiting the contextual information that comes with personal photo
capture.
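The kind of contextual metadata described above (capture time from the EXIF header, GPS location) can be used to group photos into capture events, which in turn narrows the set of likely identities. A minimal sketch, with invented thresholds, timestamps, and coordinates:

```python
from datetime import datetime, timedelta
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two GPS points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * asin(sqrt(a))

def group_into_events(photos, max_gap=timedelta(hours=1), max_km=5.0):
    """Cluster time-sorted (timestamp, lat, lon) records into capture events:
    a photo joins the current event if it is close in both time and space."""
    events = []
    for p in sorted(photos):
        if events:
            t, lat, lon = events[-1][-1]
            if p[0] - t <= max_gap and haversine_km(lat, lon, p[1], p[2]) <= max_km:
                events[-1].append(p)
                continue
        events.append([p])
    return events

photos = [
    (datetime(2012, 5, 1, 10, 0), 53.35, -6.26),   # morning, Dublin
    (datetime(2012, 5, 1, 10, 20), 53.35, -6.25),  # 20 min later, same area
    (datetime(2012, 5, 2, 15, 0), 48.85, 2.35),    # next day, Paris
]
events = group_into_events(photos)
```

The first two photos fall into one event (small time gap, under a kilometre apart); the third starts a new event.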
ADVANTAGES:
By combining context and content analysis, performance over
content or context alone is improved.
DISADVANTAGES:
Here, there is no automatic face annotation without prompting the user
for confirmation.
2.3 DISTRIBUTED FACE RECOGNITION:
Distributed face recognition is one of the problems in a calibrated
camera sensor network. It is assumed that each camera is given a small
and possibly different training set of face images taken under varying
viewpoint, expression, and illumination conditions. Each camera can
estimate the pose and identity of a new face using classical techniques
such as Eigenfaces or Tensorfaces combined with a simple classifier.
However, the pose estimates obtained by a single camera could be very
poor, due to limited computational resources, impoverished training sets,
etc., which could lead to poor recognition results.
The key contribution is to propose a distributed face recognition
algorithm in which neighboring cameras share their individual estimates
of the pose in order to achieve a “consensus” on the face pose. For this
purpose, a convergent distributed consensus algorithm on SE(3) is used,
one that estimates the global Karcher mean of the face pose in a
distributed fashion.
In face recognition, most existing algorithms such as Eigenfaces,
Fisherfaces, ICA and Tensorfaces, operate in a centralized fashion. In this
paradigm, training is performed by a central unit, which learns a face
model from a training set of face images. Images of a new face acquired
from multiple cameras are also sent to the central unit, which performs
recognition by comparing these images to the face model. However, this
approach has several drawbacks when implemented in a smart camera
network.
First, it is not fault tolerant, because a failure of the central
processing unit implies a failure of the entire application. Second, it
requires the transmission of huge amounts of raw data. Moreover, it is not
scalable, because as the number of nodes in the network increases, so
does the amount of processing by the central unit, possibly up to a point
where it exceeds the available resources. Existing approaches perform
compression to reduce the amount of transmitted data, but the
compressed images are still processed by a central unit. In a fully
distributed paradigm, each node could perform an initial processing of the
images and transmit over the network only the distilled information
relevant to the task at hand. Then, the nodes could collaborate in order to
merge their local observations and come up with a global result, which is
consistent across the entire network.
This framework has several advantages over the centralized
paradigm. For instance, when a single node fails the others can still
collaborate among each other. Moreover, since each node performs its
share of the processing, the aggregate resources available in the network
grow with the number of nodes, making the solution scalable. The pose of
the object is chosen as the local information to be extracted by the nodes
and transmitted to neighboring nodes. Consensus algorithms provide a
natural distributed estimation framework for aggregating local
information over a network. While this approach is specifically designed
for recognizing faces, distributed framework could be easily applied to
other object recognition problems, as long as each camera node can
provide an estimate of the object pose.
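The consensus idea can be illustrated with a much-simplified sketch: scalar pose estimates averaged over a ring network by repeated neighbor-only updates. The real algorithm operates on SE(3) and computes a Karcher mean; this toy version uses ordinary Euclidean averaging, an invented step size, and invented pose angles.

```python
def consensus_step(estimates, epsilon=0.2):
    """One synchronous consensus update on a ring topology: each node moves
    toward its two neighbours' current estimates, using only local exchange."""
    n = len(estimates)
    return [x + epsilon * ((estimates[(i - 1) % n] - x)
                           + (estimates[(i + 1) % n] - x))
            for i, x in enumerate(estimates)]

# Hypothetical noisy local pose-angle estimates (degrees) from five cameras.
poses = [28.0, 35.0, 30.0, 33.0, 29.0]
target = sum(poses) / len(poses)  # this update rule preserves the network average

for _ in range(200):
    poses = consensus_step(poses)
# After enough iterations every node holds (approximately) the global average,
# even though no node ever saw all five estimates at once.
```

The update preserves the sum of all estimates at every step, so the nodes converge to the global mean; that is the sense in which purely local communication yields a globally consistent pose.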
ADVANTAGES:
The pose of an object can be represented with only six parameters,
which is obviously much less than the number of pixels in an image.
Moreover, the pose of a face can be estimated from a single view using
existing face recognition algorithms, e.g., view-based Eigenfaces. Finally,
the pose of the face is a global quantity, which facilitates the definition of
a global objective that all the nodes need to achieve, e.g., finding the
average pose.
DISADVANTAGES:
Here, two to five cameras are connected in a network with a ring
topology, so a fault in any one camera reduces the accuracy of the
estimated face pose.
2.4 PERSONALIZING IMAGE SEARCH RESULTS:
The social media site Flickr allows users to upload their photos,
annotate them with tags, submit them to groups, and also to form social
networks by adding other users as contacts. Flickr offers multiple ways of
browsing or searching it. One option is tag search, which returns all
images tagged with a specific keyword. If the keyword is ambiguous,
e.g., “beetle” could mean an insect or a car, tag search results will include
many images that are not relevant to the sense the user had in mind when
executing the query. Users express their photography interests through
the metadata they add in the form of contacts and image annotations. The
photo sharing site Flickr is one of the earliest and most popular examples
of the new generation of Web sites, labeled social media, whose content
is primarily user-driven. Other examples of social media include: blogs
(personal online journals that allow users to share thoughts and receive
feedback on them), Wikipedia (a collectively written and edited online
encyclopedia), and Del.icio.us and Digg (Web sites that allow users to
share, discuss, and rank Web pages and news stories, respectively). Social
media sites share four characteristics:
(1) Users create or contribute content in a variety of media types;
(2) Users annotate content with tags;
(3) Users evaluate content, either actively by voting or passively by
using content; and
(4) Users create social networks by designating other users with
similar interests as contacts or friends.
In the process of using these sites, users are adding rich metadata in
the form of social networks, annotations and ratings. Availability of large
quantities of this metadata will lead to the development of new
algorithms to solve a variety of information-processing problems, from
recommendation to improved information discovery.
ADVANTAGES:
User-added metadata on Flickr can be used to improve image
search results. The approaches described here can also be applied to other
social media sites, such as Del.icio.us.
DISADVANTAGES:
Although Flickr has become extremely popular, user-supplied tags
remain relatively scarce compared to the number of images to be
annotated. The same tag can additionally be used in different senses,
making the problem even more challenging.
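One way to personalize an ambiguous tag search, in the spirit described above, is to re-rank results by how much each image's tags overlap the tags the querying user already uses. A hypothetical sketch using Jaccard overlap (the data, field names, and user profile are all invented):

```python
def personalized_rank(images, user_tags):
    """Rank tag-search results by Jaccard overlap between each image's tags
    and the querying user's own tag vocabulary."""
    def overlap(img_tags):
        s = set(img_tags)
        return len(s & user_tags) / len(s | user_tags)
    return sorted(images, key=lambda im: overlap(im["tags"]), reverse=True)

# Hypothetical results for the ambiguous query "beetle".
results = [
    {"id": 1, "tags": ["beetle", "car", "volkswagen"]},
    {"id": 2, "tags": ["beetle", "insect", "macro", "nature"]},
]
# A user whose profile tags suggest an interest in entomology.
user_tags = {"insect", "macro", "butterfly", "nature"}
ranked = personalized_rank(results, user_tags)
```

For this user, the insect photo rises to the top; for a user whose tags were {"car", "volkswagen"}, the ranking would invert, resolving the ambiguity per user.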
2.5 PERSONAL AND SOCIAL NETWORK CONTEXT
Social network context is useful as real-world activities of
members of the social network are often correlated within a specific
context. The correlation can serve as a powerful resource to effectively
increase the ground truth available for annotation. There are three main
contributions:
(a) Development of an event context framework and definition of
quantitative measures for contextual correlations based on concept
similarity in each facet of event context;
(b) Recommendation algorithms based on spreading activations
that exploit personal context as well as social network context;
(c) Experiments on real-world, everyday images that verified both
the existence of inter-user semantic disagreement and the
improvement in annotation when incorporating both the user and
social network context.
They develop a novel collaborative annotation system that exploits
the correlation between user context and social network context. This work
enables members of a social network to effectively annotate images. The
social network context is important when different users’ annotations and
their corresponding semantics are highly correlated. The annotation
problem in social networks has several unique characteristics different
from the traditional annotation problem. The participants in the network
are often family, friends, or co-workers, and know each other well. They
participate in common activities – e.g., traveling, attending seminars,
going to films, or parties. There is a significant overlap in their real
world activities. Social networks involve multiple users – this implies that
each user may have a distinctly different annotation scheme, and different
mechanisms for assigning labels. There may be significant semantic
disagreement amongst the users, over the same set of images.
The traditional image annotation problem is a very challenging one
– only a small fraction of the images are annotated by the user, severely
restricting the ground truth available. In this approach, event context is
defined as the set of facets/attributes (image, who, when, where, what)
that support the understanding of everyday events. Measures of similarity
are then developed for each event facet, and event-event and user-user
correlations are computed. The user context is then obtained by
aggregating event contexts and is represented using a graph.
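The spreading-activation idea from contribution (b) can be sketched as follows. The graph, edge weights, and decay factor below are invented for illustration; a real system would build the graph from event-facet similarities and social links, then propagate activation from an already-annotated item toward candidate labels.

```python
def spread_activation(graph, seeds, decay=0.5, iterations=3):
    """Propagate activation from seed nodes through a weighted graph.
    graph: {node: [(neighbour, weight), ...]} with weights assumed in [0, 1].
    Each hop multiplies activation by the edge weight and a decay factor."""
    activation = dict(seeds)
    for _ in range(iterations):
        nxt = dict(activation)
        for node, a in activation.items():
            for nbr, w in graph.get(node, []):
                nxt[nbr] = max(nxt.get(nbr, 0.0), a * w * decay)
        activation = nxt
    return activation

# Hypothetical context graph linking an annotated event to related concepts
# directly and through a social contact.
graph = {
    "hiking_trip": [("mountain", 1.0), ("alice", 0.8)],
    "alice": [("camping", 0.9)],
}
activation = spread_activation(graph, {"hiking_trip": 1.0})
```

Concepts one hop away ("mountain") receive more activation than those reached only through a contact ("camping"), so the highest-activated unlabeled concepts become the recommended annotations.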
ADVANTAGES:
This method provides quantitative and qualitative results which
indicate that both personal and social context facilitate effective image
annotation. A novel collaborative annotation system is developed that
exploits the correlation between user context and social network
context, enabling members of a social network to effectively annotate
images.
DISADVANTAGES:
The notion of “context” has been used in many different ways
across applications, but the set of contextual attributes is always
application-dependent.
CHAPTER 3
SYSTEM ANALYSIS
3.1 EXISTING SYSTEM:
Many problems exist because of the many factors that can affect
photos. When processing images, one must take into account variations
in lighting, image quality, the person's pose, and facial expression,
among others. To identify individuals correctly, there must be some way
to account for all these variations and still arrive at a valid answer.
Many methods are available for face recognition, each with its own
advantages and disadvantages. Social network context is one method
used to improve face annotation, but it depends on large numbers of
labeled faces being available; for photos that contain only a single face,
only more limited context is available. MediAssist is a system which
facilitates semi-automatic annotation of personal photos, using analysis
of both image content and the context in which the photo is captured;
however, it provides no automatic face annotation without prompting the
user for confirmation. In a distributed face recognition algorithm,
neighboring cameras share their individual estimates of the pose in order
to achieve a “consensus” on the face pose. Here the cameras are
connected in a network with a ring topology, so a fault in any one camera
reduces the accuracy of the estimated face pose. The social media site
Flickr allows users to upload their photos, annotate them with tags,
submit them to groups, and also to form social networks by adding other
users as contacts. Although Flickr has become extremely popular, user-
supplied tags remain relatively scarce compared to the number of images
to be annotated, and the same tag can additionally be used in different
senses, making the problem even more challenging.
3.2 PROPOSED SYSTEM:
A novel collaborative face recognition (FR) framework is
proposed to improve the accuracy of face annotation by effectively
making use of multiple FR engines available in an OSN.
The proposed collaborative FR framework is constructed using
M+1 different FR engines: one FR engine belongs to the current user,
while M FR engines belong to M different contacts of the current user.
This collaborative FR framework consists of two major parts:
selection of suitable FR engines and merging of multiple FR results.
To select K suitable FR engines out of M+1 FR engines, a social
graph model (SGM) is constructed that represents the social relationships
between the different contacts considered. SGM is created by utilizing the
personal photo collections shared in the collaborative FR framework.
Based on the constructed SGM, a relevance score is computed for each
FR engine. K FR engines are then selected using the relevance scores
computed for the FR engines. Next, the query face images detected in the
photos of the current user are simultaneously forwarded to the selected K
FR engines.
In order to merge the FR results returned by the different FR
engines, a Bayesian decision rule is used. A key property of the solution
is that it simultaneously accounts for both the relevance scores
computed for the selected FR engines and the FR result scores.
This collaborative FR framework has a low computational cost
and comes with a design that is suited for deployment in a decentralized
OSN.
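The pipeline above (relevance scores from a social graph, selection of K suitable engines, merging of results) can be sketched end to end. Every name and number below is hypothetical, and the merging step shown is a simple relevance-weighted sum standing in for the full Bayesian decision rule:

```python
from collections import defaultdict

def relevance_scores(cooccurrence, me):
    """Relevance of each contact's FR engine, derived from photo
    co-occurrence counts (a simplified stand-in for the social graph model)."""
    total = sum(cooccurrence[me].values())
    return {c: n / total for c, n in cooccurrence[me].items()}

def select_engines(relevance, k):
    """Select the K FR engines with the highest relevance scores."""
    return sorted(relevance, key=relevance.get, reverse=True)[:k]

def merge_results(engine_results, relevance):
    """Fuse per-engine label scores, weighted by engine relevance
    (a weighted sum rather than the full Bayesian decision rule)."""
    fused = defaultdict(float)
    for engine, scores in engine_results.items():
        for label, s in scores.items():
            fused[label] += relevance[engine] * s
    return max(fused, key=fused.get)

cooccurrence = {"me": {"ann": 6, "ben": 3, "cat": 1}}  # hypothetical counts
rel = relevance_scores(cooccurrence, "me")
chosen = select_engines(rel, k=2)
# Each selected engine's candidate scores for one query face (hypothetical).
engine_results = {
    "ann": {"dave": 0.7, "eve": 0.3},
    "ben": {"dave": 0.2, "eve": 0.8},
}
label = merge_results(engine_results, rel)
```

The engine of the closest contact ("ann") carries the most weight, so its confident vote decides the label even though the other engine disagrees.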
CHAPTER 4
SYSTEM SPECIFICATION
4.1 HARDWARE REQUIREMENTS
PROCESSOR : PENTIUM IV 2.6GHZ, Intel Core 2 Duo
RAM : 512 MB DDR RAM
MONITOR : 15” COLOR
HARD DISK : 320 GB
MEMORY : 2GB
4.2 SOFTWARE REQUIREMENTS
FRONT END : ASP.NET
BACK END : MSAccess2007
OPERATING SYSTEM : Windows 7 Home Basic (64-bit)
CHAPTER 5
TECHNOLOGY OVERVIEW
5.1 C# AND .NET
C# is a multi-paradigm programming language encompassing
strong typing, imperative, declarative, functional, generic, object-
oriented (class-based), and component-oriented programming disciplines.
It was developed by Microsoft within its .NET initiative and later
approved as a standard by Ecma (ECMA-334) and ISO (ISO/IEC
23270:2006). C# is one of the programming languages designed for
the Common Language Infrastructure.
5.1.1 INTRODUCTION
C# is intended to be a simple, modern, general-purpose, object-oriented
programming language. Its development team is led by Anders Hejlsberg.
The most recent version is C# 4.0, which was released on April 12, 2010.
1. C# language is intended to be a simple, modern, general-purpose,
object-oriented programming language.
2. The language, and implementations thereof, should provide support
for software engineering principles such as strong type checking,
array bounds checking, detection of attempts to use uninitialized
variables, and automatic garbage collection. Software robustness,
durability, and programmer productivity are important.
3. The language is intended for use in developing software
components suitable for deployment in distributed environments.
4. Source code portability is very important, as is programmer
portability, especially for those programmers already familiar with
C and C++.
5. Support for internationalization is very important.
6. C# is intended to be suitable for writing applications for both
hosted and embedded systems, ranging from the very large that use
sophisticated operating systems, down to the very small having
dedicated functions.
7. Although C# applications are intended to be economical with
regard to memory and processing power requirements, the
language was not intended to compete directly on performance and
size with C or assembly language.
5.1.2 FEATURES
By design, C# is the programming language that most directly reflects
the underlying Common Language Infrastructure (CLI). Most of its
intrinsic types correspond to value-types implemented by the CLI
framework. However, the language specification does not state the code
generation requirements of the compiler: that is, it does not state that a C#
compiler must target a Common Language Runtime, or generate
Common Intermediate Language (CIL), or generate any other specific
format. Theoretically, a C# compiler could generate machine code like
traditional compilers of C++ or FORTRAN.
Some notable features of C# that distinguish it from C and C++
(and Java, where noted) are:
1. It has no global variables or functions. All methods and
members must be declared within classes. Static members of
public classes can substitute for global variables and
functions.
2. Local variables cannot shadow variables of the enclosing
block, unlike C and C++. Variable shadowing is often
considered confusing by C++ texts.
3. C# supports a strict Boolean data type, bool. Statements that
take conditions, such as while and if, require an expression
of a type that implements the true operator, such as the
boolean type. While C++ also has a boolean type, it can be
freely converted to and from integers, and expressions such
as if(a) require only that a is convertible to bool,
allowing a to be an int, or a pointer. C# disallows this
"integer meaning true or false" approach, on the grounds that
forcing programmers to use expressions that return
exactly bool can prevent certain types of common
programming mistakes in C or C++ such as if (a = b) (use of
assignment = instead of equality ==).
4. In C#, memory address pointers can only be used within
blocks specifically marked as unsafe, and programs with
unsafe code need appropriate permissions to run. Most
object access is done through safe object references, which
always either point to a "live" object or have the well-
defined null value; it is impossible to obtain a reference to a
"dead" object (one that has been garbage collected), or to a
random block of memory. An unsafe pointer can point to an
instance of a value-type, array, string, or a block of memory
allocated on a stack. Code that is not marked as unsafe can
still store and manipulate pointers through
the System.IntPtr type, but it cannot dereference them.
5. Managed memory cannot be explicitly freed; instead, it is
automatically garbage collected. Garbage collection
addresses the problem of memory leaks by freeing the
programmer of responsibility for releasing memory that is no
longer needed.
6. In addition to the try...catch construct to handle exceptions,
C# has a try...finally construct to guarantee execution of the
code in the finally block.
7. Multiple inheritance is not supported, although a class can
implement any number of interfaces. This was a design
decision by the language's lead architect to avoid
complication and simplify architectural requirements
throughout CLI.
8. C#, like C++, but unlike Java, supports operator overloading.
9. C# is more type safe than C++. The only implicit
conversions by default are those that are considered safe,
such as widening of integers. This is enforced at compile-
time, during JIT, and, in some cases, at runtime. No implicit
conversions occur between booleans and integers, nor
between enumeration members and integers (except for
literal 0, which can be implicitly converted to any
enumerated type). Any user-defined conversion must be
explicitly marked as explicit or implicit, unlike C++ copy
constructors and conversion operators, which are both
implicit by default. Starting with version 4.0, C# supports a
"dynamic" data type that enforces type checking at runtime
only.
10. Enumeration members are placed in their own scope.
11. C# provides properties as syntactic sugar for a common
pattern in which a pair of methods, an accessor (getter) and a
mutator (setter), encapsulate operations on a
single attribute of a class.
12. Full type reflection and discovery is available.
13. Checked exceptions are not present in C# (in contrast to
Java). This has been a conscious decision based on the issues
of scalability and versionability.
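C# properties (feature 11) have a close analogue in Python's @property decorator, which can illustrate the accessor/mutator pattern they sugar over. The Account class below is an invented example, not code from this project:

```python
class Account:
    """Illustrates the getter/setter pattern that properties sugar over:
    callers use plain attribute syntax, while the class keeps control
    of validation inside the mutator."""
    def __init__(self):
        self._balance = 0

    @property
    def balance(self):            # accessor (getter)
        return self._balance

    @balance.setter
    def balance(self, value):     # mutator (setter) with validation
        if value < 0:
            raise ValueError("balance cannot be negative")
        self._balance = value

acct = Account()
acct.balance = 100   # attribute syntax, method semantics
```

As in C#, the caller never sees the method pair; invalid assignments are rejected by the setter rather than by scattered checks at every call site.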
5.1.3 ADVANTAGES OF C#
ADVANTAGES OVER C AND C++
1. It is compiled to an intermediate language (CIL) independently
of the language it was developed or the target architecture and
operating system
2. Automatic garbage collection
3. Pointers no longer needed (but optional)
4. Reflection capabilities
5. No need to worry about header files (".h")
6. Definition of classes and functions can be done in any order
7. Forward declarations of functions and classes are not needed
8. No circular dependencies
9. Classes can be defined within classes
10.There are no global functions or variables, everything belongs to
a class
11.All the variables are initialized to their default values before
being used (this is automatic by default but can be done
manually using static constructors)
12.You can't use non-boolean variables (integers, floats...) as
conditions. This is much cleaner and less error-prone
13.Apps can be executed within a restricted sandbox
ADVANTAGES OVER C++ AND JAVA:
1. Formalized concept of get-set methods, so the code becomes
more legible
2. More clean events management (using delegates)
ADVANTAGES OVER JAVA:
1. It is usually much more efficient than Java and runs faster
2. CIL (Common (.NET) Intermediate Language) is a standardized
language, while Java bytecode is not
3. It has more primitive types (value types), including unsigned
numeric types
4. Indexers let you access objects as if they were arrays
5. Conditional compilation
6. Simplified multithreading
7. Operator overloading. It can make development a bit trickier, but
it is optional and sometimes very useful
8. (limited) use of pointers if you really need them, as when calling
unmanaged (native) libraries that do not run on top of the
virtual machine (CLR).
5.2 MICROSOFT ACCESS
5.2.1 INTRODUCTION
Microsoft Access is a "Relational Database Management System."
The description that follows applies to Microsoft Access 2000, 2002,
2007, and 2010, and even Microsoft Access 97. In fact, what follows
applies to just about every Windows database out there, regardless of
who makes it.
Access can store data in specific formats for sorting, querying, and
reporting. Sorting is pretty straightforward; data is simply presented to
you in particular orders. An example might be presenting your customer
data (customer number, name, address, city, state, zip, and total
purchases) in last name order.
Querying means that, as a user of this database, you can ask Access for a
collection of information relating to location (such as state or country),
price (such as how much a customer spent), and date (such as when items
were purchased). Querying can include sorting as well. For example, if
you wanted to see the top-spending customers in the state of Florida,
querying would be a way to do that. A query on data typically returns a
subset of the collection of data, but can also return all of it in a
different order. Reporting is simply query results in printable or
viewable form.
5.2.2 STORING DATA
In order for Access to perform these functions, data has to be stored
in the smallest possible units. These units are called fields. A field might
contain a first name, a last name, a middle name, a street address, and so
on. Notice that I do not propose that the entire name be placed in one
field. If that were done, the only sorting one could perform would be by
the first name, which is hardly useful. But if a separate field is used for
the last name, another for the first, and so on, much more useful sorting
can be accomplished.
Fields are also defined as a type of data (number, text, date, date-
time, dollar, etc.). By storing data in its own specific field type, Access
(or any RDBMS for that matter) can sort that data in very tightly
controlled ways. For example, one can sort numeric and alphabetic
content accurately as long as Access knows what type of sort to apply to
that data. Beyond the field type, an entire collection of fields relating to
a particular entry is called a record. The entire collection of records is
called a table.
Tables resemble spreadsheets in that they are a grid of data. A row
represents one complete record and a column a particular data field. Thus
a data table containing a collection of customer demographics might
contain the Customer Number, Name, Address, City, State, Zip,
Telephone Number, Cell Phone Number, and email Address.
Possibly the easiest way to visualize this is to imagine a data table
as a spreadsheet. Each column would be a field, each row a record, and a
collection of rows would represent the entire data table. Naturally each
row, in the case of a customer file for example, would be one customer.
By storing data in this manner it is much easier to sort and report on that
data in nearly any order you wish. All of the above could describe a
Database Management System (DBMS), but Microsoft Access is also a
relational database management system.
5.2.3 RELATIONAL DATABASES
A relational database management system (RDBMS) allows for the
creation of multiple tables that can be linked together via one or more
shared field values between tables. For example the Customer Table
mentioned above could be linked to a table containing more sensitive
purchase or credit card information. As long as both tables contain a
Customer Number field with the same data size and type these tables can
be linked or related. Of course each Customer Number would be unique
to the customer.
In this example the secondary or child table could contain the
following information: Customer Number (identical in format to
Customer Number in the parent table), Product ID, Unit Price, Quantity,
etc. Yet another child table could contain the Customer Number and
credit card data, including the number, valid-until dates, and PIN.
As long as there is a common Customer Number (in this example)
the tables can be linked or kept separate depending on the level of
security required of this information. This way two of the tables could be
displayed on a form, through a query, or on a report and look as though
all the information is stored in one place.
Access, like any RDBMS, will allow these tables to be interrelated
via forms, reports, or queries. Access, as with many other RDBMS, can
use Structured Query Language (SQL) to query the table(s). Though
Microsoft Access is not as powerful (and is properly a pseudo-RDBMS) as
other products such as Microsoft SQL Server, Oracle, MySQL, Sybase,
or IBM DB2, it operates in much the same way, and many of the SQL
statements that work properly in Access could also be used in the
above-mentioned RDBMSs without modification.
Finally, databases are used almost ubiquitously.
5.2.4 TERMS
Relational Database Management System: Data that are stored and
manipulated are held in a relational manner; e.g., the tables within can be
related to each other via fields.
Field: A single data item within a data record. The field is usually
represented as a column.
Key Field: A field that contains like data that is used between tables to
link or relate them. Key Field usually refers to the linking field in the
parent table. In Access, this is usually the primary key of the parent
table.
Foreign Key: Like the key field above, a field that contains data that can
be used to link or relate two tables together. The Foreign Key usually
refers to the field in the child table; it does not have to be the first field
in the child table.
Record: The row of information representing one set of data within a
table.
Parent Table: The primary table used to coordinate and connect to child
tables. It might also be called the master table.
Child Table: Another table which can be related to the parent table. Think
of this as a "slave" table, a table that is designed to be used with the
master table. Note that child tables can also be master tables to lesser
child tables.
Table: A collection of like data arranged in a spreadsheet-like fashion
(rows and columns).
5.2.5 CREATING AND DESIGNING MICROSOFT ACCESS
DESIGNING
Before creating a database, you should plan and design it. For
example, you should define the type of database you want to create. You
should create, in writing, a list of the objects it will contain: employees,
customers, products, transactions, etc.
For each object, you should create a list of the pieces of
information the object will need for its functionality: name(s), contact
information, profession, etc. You should also justify why the object needs
that piece of information. You should also define how the value of that
piece of information will be given to the object. As you will see in later
lessons, some values are typed, some values must be selected from a
preset list, and some values must come from another object. In later
lessons, we will see how you can start creating the objects and their
content.
CREATING A DATABASE
If you have just started Microsoft Access, to create a database, under
File, click New. You can then use one of the links in the main (middle)
section of the interface:
To create a blank database, in the middle section, under Available
Templates, click Blank Database
To create a database using one of the samples, under Available
Templates, click a category from one of the buttons, such as
Sample Templates. Then click the desired buttons:
If you start Microsoft Access just after installing it, the File section
displays a normal menu. Once you start creating or opening databases,
a list of MRUs (most recently used databases) displays under Close
Database. To open a database, if you see its name under File, you can
click it.
Since a Microsoft Access database is primarily a file, if you see its
icon in a file utility such as Windows Explorer, you can double-click it.
This would launch Microsoft Access and open the database. If you
received a database as an email attachment, you can also open the
attachment and consequently open the database file.
CLOSING A DATABASE
You can close a database without closing Microsoft Access. To do this,
in the File section, click Close Database.
DELETING A DATABASE
If you have a database you don't need any more, you can delete it. To
delete a database in a file utility such as Windows Explorer:
Click the icon of the database to select it and press Delete
Right-click the icon and click Delete
A warning message would be presented to you to confirm what you want
to do. After you have deleted a database, it doesn't disappear from the
MRU lists of Microsoft Access. This means that, after a database has
been deleted, you may still see it in the File section. If you try opening
such a database, you would receive an error.
If a database has been deleted and you want to remove it from the MRU
lists, you can open the Registry (Start -> Run: regedit, Enter) (be careful
with the Registry; when in doubt, don't touch it). Open the following key:
Locate the deleted database and delete its key. The next time you start
Microsoft Access, the name of the deleted database would not display in
the File section.
THE SIZE OF A DATABASE
A database is primarily a computer file, just like those created with
other applications. As such, it occupies space on the computer's storage. In
some circumstances, you should know how much space a database is
using. This can be important when you need to back it up or when it is
time to distribute it. Also, when adding and deleting objects from your
database, its file can grow or shrink without your direct intervention.
Like any other computer file, to know the size of a database, you
can right-click it in Windows Explorer or My Computer and click
Properties. If you are already using the database, to check its size, you
can click File, position the mouse on Manage, and click Database
Properties. In the Properties dialog box, click General and check the Size
label.
CHAPTER 6
SYSTEM DESIGN
6.1 SYSTEM ARCHITECTURE
Fig 6.1: System Architecture (pipeline: Input Image (Personal Photo) ->
Face Detection -> Query Faces -> FR Engines FR1, FR2, ..., FRk ->
Merging of FR Results -> Output Image (Name-Tagged Personal Photo))
6.2 DATA FLOW DIAGRAMS
USE CASE DIAGRAM
Fig 6.2.1: Use Case Diagram
CLASS DIAGRAM
Fig 6.2.2: Class Diagram
SEQUENCE DIAGRAM
Fig 6.2.3: Sequence Diagram
COLLABORATION DIAGRAM
Fig 6.2.4: Collaboration Diagram
COMPONENT DIAGRAM
Fig 6.2.5: Component Diagram
ACTIVITY DIAGRAM
Fig 6.2.6: Activity Diagram
6.3 MODULE DESCRIPTION
MODULES:
This project is divided into 3 modules:
1. Face Detection
2. Selection of Face Recognition Engines
3. Merging of Face Recognition Results
6.3.1 FACE DETECTION:
Face detection can be regarded as a specific case of object-class
detection. In object-class detection, the task is to find the locations and
sizes of all objects in an image that belong to a given class.
Face detection can be regarded as a more general case of face
localization. In face localization, the task is to find the locations and sizes
of a known number of faces (usually one).
Early face-detection algorithms focused on the detection of frontal
human faces, whereas newer algorithms attempt to solve the more general
and difficult problem of multi-view face detection. That is, the detection
of faces that are either rotated along the axis from the face to the observer
(in-plane rotation), or rotated along the vertical or left-right axis (out-of-
plane rotation), or both. The newer algorithms take into account
variations in the image or video by factors such as face appearance,
lighting, and pose.
Fig 6.3.1: Face Detection
FEATURE TYPES AND EVALUATION:
The features employed by the detection framework universally
involve the sums of image pixels within rectangular areas. As such, they
bear some resemblance to Haar basis functions, which have been used
previously in the realm of image-based object detection. However, since
the features used by Viola and Jones all rely on more than one rectangular
area, they are generally more complex. The figure illustrates the four
different types of features used in the framework. The value of any given
feature is always simply the sum of the pixels within clear rectangles
subtracted from the sum of the pixels within shaded rectangles. As is to
be expected, rectangular features of this sort are rather primitive when
compared to alternatives such as steerable filters. Although they are
sensitive to vertical and horizontal features, their feedback is considerably
coarser. However, with the use of an image representation called
the integral image, rectangular features can be evaluated in constant time,
which gives them a considerable speed advantage over their more
sophisticated relatives. Because each rectangular area in a feature is
always adjacent to at least one other rectangle, it follows that any two-
rectangle feature can be computed in six array references, any three-
rectangle feature in eight, and any four-rectangle feature in just nine.
Fig 6.3.2: Four Different types of Features
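The constant-time evaluation described above can be sketched in a few lines. The following is an illustrative Python sketch of the integral-image trick, not the project's implementation (the project itself uses C# with Emgu.CV); the function names are ours.

```python
def integral_image(img):
    # ii[y][x] = sum of all pixels at or above row y and at or left of column x.
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        row = 0
        for x in range(w):
            row += img[y][x]
            ii[y][x] = row + (ii[y - 1][x] if y > 0 else 0)
    return ii

def rect_sum(ii, top, left, h, w):
    # Sum over any rectangle with at most four array references:
    # D - B - C + A, where A, B, C, D are integral values at its corners.
    A = ii[top - 1][left - 1] if top > 0 and left > 0 else 0
    B = ii[top - 1][left + w - 1] if top > 0 else 0
    C = ii[top + h - 1][left - 1] if left > 0 else 0
    D = ii[top + h - 1][left + w - 1]
    return D - B - C + A

# A two-rectangle (edge) feature: left half minus right half of a window.
img = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
ii = integral_image(img)
feature = rect_sum(ii, 0, 0, 4, 2) - rect_sum(ii, 0, 2, 4, 2)
```

Because adjacent rectangles share corner values, a full two-rectangle feature needs only six distinct references rather than eight, which is where the counts of six, eight, and nine quoted above come from.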
LEARNING ALGORITHM:
The speed with which features may be evaluated does not adequately
compensate for their number, however. For example, in a standard 24x24
pixel sub-window, there are a total of 45,396 possible features, and it
would be prohibitively expensive to evaluate them all. Thus, the object
detection framework employs a variant of the learning
algorithm AdaBoost to both select the best features and to train classifiers
that use them.
CASCADE ARCHITECTURE:
Fig 6.3.3: Cascade Architecture
The evaluation of the strong classifiers generated by the learning process
can be done quickly, but it isn’t fast enough to run in real-time. For this
reason, the strong classifiers are arranged in a cascade in order of
complexity, where each successive classifier is trained only on those
selected samples which pass through the preceding classifiers. If at any
stage in the cascade a classifier rejects the sub-window under inspection,
no further processing is performed and the search continues with the next
sub-window (see Fig 6.3.3). The cascade therefore has the form of a
degenerate tree. In the case of faces, the first classifier in the cascade –
called the attentional operator – uses only two features to achieve a false
negative rate of approximately 0% and a false positive rate of 40%. The
effect of this single classifier is to reduce by roughly half the number of
times the entire cascade is evaluated.
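The early-rejection behaviour of the cascade can be sketched as follows. This is an illustrative Python sketch; the stage functions are hypothetical stand-ins, since a real stage would evaluate boosted Haar features over the sub-window.

```python
def cascade_classify(window, stages):
    # Stages are ordered from cheapest to most complex. A window is
    # reported as a face only if it passes every stage; most windows are
    # rejected by the cheap early stages and never reach the later ones.
    for stage in stages:
        if not stage(window):
            return False  # rejected: stop processing this sub-window
    return True

# Hypothetical stand-in stages over a toy "window" of pixel values.
attentional = lambda w: sum(w) > 4   # cheap two-feature-style test
detailed = lambda w: max(w) > 3      # later, more expensive test
stages = [attentional, detailed]
```

The attentional stage mirrors the first classifier described above: it is deliberately permissive (near-zero false negatives) so that true faces are never discarded early, while still pruning a large fraction of sub-windows.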
6.3.2 SELECTION OF FACE RECOGNITION ENGINES
Selection of suitable FR engines is based on the presence of social
context in personal photos collections. Social context refers to the strong
tendency that users often take photos together with friends, family
members, or co-workers.
The use of social context for selecting suitable FR engines is
motivated by two reasons. First, social context is strongly consistent in a
typical collection of personal photos. Therefore, query face images
extracted from the photos of the current user are likely to belong to close
contacts of the current user. Second, it is likely that each FR engine has
been trained with a high number of training face images and
corresponding name tags that belong to close contacts of the owner of the
FR engine. Consequently, by taking advantage of social context, the
chance increases that FR engines are selected that are able to correctly
recognize query face images.
CONSTRUCTION OF A SOCIAL GRAPH MODEL:
A Social Graph Model (SGM) is constructed for selecting suitable
FR engines. Social relationships between the current user and the OSN
members in his/her contact list are quantified by making use of the
identity occurrence and co-occurrence probabilities of individuals in
personal photo collections.
Let l_user be the identity label (i.e., the name) of the current user.
Then, let S_luser = {l_m}, m = 1, ..., M, be a set consisting of M
different identity labels. These identity labels correspond to OSN
members that are contacts of the current user. Note that l_m denotes the
identity label of the m-th contact and, without any loss of generality,
l_m ≠ l_n if m ≠ n.
A social graph is represented by a weighted graph as follows:

G = (N, E, W)

where N = {n_m | m = 1, ..., M} ∪ {n_user} is a set of nodes that
includes the current user and his/her contacts, E = {e_m | m = 1, ..., M}
is a set of edges connecting the node of the current user to the node of
the m-th contact of the current user, and the element w_m in W represents
the strength of the social relationship associated with e_m.
To compute w_m, we estimate the identity occurrence and co-
occurrence probabilities from personal photo collections. The occurrence
probability for each contact is estimated as follows:

Prob_occur(l_m) = ( Σ_{P ∈ P_user} δ1(l_m, P) ) / |P_user|

where P_user denotes the entire collection of photos owned by the
current user, |.| denotes the cardinality of a set, and δ1(l_m, P) is an
indicator function that returns one when the identity of the m-th contact
is manually tagged in photo P and zero otherwise.
In addition, the co-occurrence probability between the current user
and the m-th contact is estimated as follows:

Prob_co-occur(l_user, l_m) = ( Σ_{P ∈ P_OSN} δ2(l_user, l_m, P) ) / |P_OSN|,
for l_m ∈ S_luser

where P_OSN denotes all photo collections in the OSN the current
user has access to (this includes photo collections owned by the current
user, as well as photo collections owned by his/her contacts), and
δ2(l_user, l_m, P) is a pairwise indicator function that returns one if the
current user and the m-th contact of the current user have both been
tagged in photo P and zero otherwise.
Using the above two equations, w_m is computed as follows:

w_m = exp( Prob_occur(l_m) + Prob_co-occur(l_user, l_m) )

The use of an exponential function leads to a high weighting value
when Prob_occur(l_m) and Prob_co-occur(l_user, l_m) both have high
values.
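The two probabilities and the weight w_m can be estimated directly from the tag lists attached to photos, as in the following illustrative Python sketch (the names and tag data are hypothetical; the project itself computes these values in C# from Access tables):

```python
from math import exp

def social_weight(contact, user, user_photos, osn_photos):
    # Prob_occur: fraction of the user's own photos in which the contact is tagged.
    occur = sum(1 for tags in user_photos if contact in tags) / len(user_photos)
    # Prob_co-occur: fraction of all accessible OSN photos in which the
    # user and the contact are tagged together.
    co = sum(1 for tags in osn_photos
             if user in tags and contact in tags) / len(osn_photos)
    # w_m = exp(Prob_occur(l_m) + Prob_co-occur(l_user, l_m))
    return exp(occur + co)

user_photos = [{"alice", "bob"}, {"alice"}, {"carol"}]  # tag sets per photo
osn_photos = user_photos + [{"me", "alice"}]            # all photos visible in the OSN
w_alice = social_weight("alice", "me", user_photos, osn_photos)
```

Here "alice" is tagged in 2 of the user's 3 photos and co-tagged with the user in 1 of the 4 OSN photos, so w_alice = exp(2/3 + 1/4).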
SELECTION OF FR ENGINES:
In order to select suitable FR engines, we need to rank the FR
engines Ω_m according to their ability to recognize a particular query face
image. To this end, we make use of the strength of the social relationship
between the current user and the m-th contact of the current user,
represented by w_m. Specifically, w_m is used to represent the relevance
score of the m-th FR engine.
When the FR engines Ω_m, m = 1, ..., M, have been ranked according
to wm, two solutions can be used to select suitable FR engines. The first
solution consists of selecting the top K FR engines according to their
relevance score wm. The second solution consists of selecting all FR
engines with a relevance score that is higher than a certain threshold
value. In practice, the first solution is not reliable as the value of K may
significantly vary from photo collection to photo collection. Indeed, for
each photo collection, we have to determine an appropriate value for K by
relying on a heuristic process. Therefore, we adopt the second solution to
select suitable FR engines. Specifically, in our collaborative FR
framework, an FR engine is selected if its associated relevance score is
higher than the average relevance score ( Σ_{m=1}^{M} w_m ) / M.
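The adopted selection rule (keep every engine whose relevance score exceeds the average) can be sketched as follows; the engine names and scores are hypothetical:

```python
def select_engines(relevance):
    # An FR engine is selected if its relevance score w_m is higher
    # than the average relevance score sum(w_m) / M.
    avg = sum(relevance.values()) / len(relevance)
    return [name for name, w in relevance.items() if w > avg]

relevance = {"FR1": 2.5, "FR2": 0.4, "FR3": 1.2, "FR4": 0.3}
selected = select_engines(relevance)
```

With these scores the average is 1.1, so only FR1 and FR3 are selected; unlike a fixed top-K rule, the number of selected engines adapts to each photo collection.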
6.3.3 MERGING OF FACE RECOGNITION RESULTS
The purpose of merging multiple FR results retrieved from
different FR engines is to improve the accuracy of face annotation. Such
an improvement can be accomplished by virtue of a complementary
effect caused by fusing multiple classification decisions regarding the
identity of a query face image.
In an OSN, FR engines that belong to different members of the
OSN may use different FR techniques. For example,
some feature extractors may have been created using global face
representations, whereas other feature extractors may have been
created using local face representations. Such an observation holds
especially true for OSNs that have been implemented
in a decentralized way, a topic that is currently of high interest. Therefore,
we only consider fusion of multiple classifier results at measurement level
and at decision level.
FUSION USING A BAYESIAN DECISION RULE
To combine multiple FR results at measurement level, we propose
to make use of fusion based on a Bayesian decision rule (BDRF). This
kind of fusion is suitable for converting different types of distances or
confidences into a common a posteriori probability. Hence, multiple FR
results originating from a set of heterogeneous FR engines can be easily
combined through a Bayesian decision rule. Moreover, the use of a
Bayesian decision rule allows for optimal fusion at measurement level.
To perform collaborative FR, the query face image Q and the
training images T^(n), n = 1, ..., G, are independently and simultaneously
submitted to K different FR engines. Then, let q_k and t_k^(n) be the
feature vectors extracted from Q and T^(n) by the k-th engine. Here, the
distance score d_k^(n) can be computed by using the NN classifier
assigned to Ω_k.
Distance scores calculated by different NN classifiers may not be
comparable due to the use of personalized FR engines. The incomparable
distance scores therefore need to be mapped onto a common
representation that takes the form of a posteriori probabilities. To obtain
an a posteriori probability, we first convert the distance scores into
corresponding confidence values using a sigmoid activation function.
This can be expressed as

c_k^(n) = 1 / ( 1 + exp( d_k^(n) ) )
It should be emphasized that d_k^(n) (1 ≤ n ≤ G) for a particular k must
be normalized to have zero mean and unit standard deviation prior to the
computation of the confidence value c_k^(n). A sum normalization method
is subsequently employed to compute an a posteriori probability:

F_k^(n) = Prob( l(Q) = l(T^(n)) | Ω_k, Q ) = c_k^(n) / Σ_{n=1}^{G} c_k^(n)

where 0 ≤ F_k^(n) ≤ 1. Here, F_k^(n) represents the probability that the
identity label of T^(n) is assigned to Q, assuming that Q is forwarded to
Ω_k for FR purposes. However, without weighting, FR engines with low
relevance scores would contribute FR result scores to the final FR result
score that have the same importance as the FR result scores contributed
by highly reliable FR engines. Therefore, we assign a weight to each
F_k^(n) that takes into account the relevance score of the associated FR
engine. The rationale behind this weighting scheme is that FR engines
with high relevance scores are expected to be highly trained for a given
query face image. The weighted FR result score can be defined as
follows:
F̂_k^(n) = F_k^(n) + α · F_k^(n) · R_k

where R_k = ( w_k − w_min ) / ( w_max − w_min ), w_max and w_min are
the maximum and minimum of all relevance scores, and the parameter α
reflects the importance of R_k relative to F_k^(n). Note that the importance
of R_k becomes higher as α increases. Thus, when α = 0, only the FR
result scores are used during the fusion process. By properly adjusting the
value of α, we can increase the importance of scores produced by FR
engines with high relevance scores and decrease the importance of scores
produced by FR engines with low relevance scores. To merge the
weighted scores, the sum rule is used:
CF^(n) = Σ_{k=1}^{K} F̂_k^(n)

The sum rule allows for optimal fusion at measurement level, compared
to other rules such as the product and median rules. Finally, to perform
face annotation on Q, the identity label of Q is determined by choosing
the identity label of the candidate that achieves the highest value for
CF^(n):

l(Q) = l(T^(n*)), where n* = arg max_{n=1,...,G} CF^(n)
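The fusion steps (z-normalization, sigmoid confidences, sum normalization, relevance weighting, and the sum rule) can be put together in a short sketch. This is an illustrative Python sketch with hypothetical distance scores and relevance values, not the project's C# implementation:

```python
from math import exp

def fuse(distances, relevances, alpha=0.5):
    # distances[k][n]: distance d_k(n) from query Q to candidate T(n) at engine k.
    K, G = len(distances), len(distances[0])
    wmin, wmax = min(relevances), max(relevances)
    cf = [0.0] * G
    for k in range(K):
        d = distances[k]
        # Normalize d_k(n) to zero mean and unit standard deviation.
        mean = sum(d) / G
        sd = (sum((x - mean) ** 2 for x in d) / G) ** 0.5 or 1.0
        # Sigmoid confidences, then sum normalization -> a posteriori probs F_k(n).
        c = [1.0 / (1.0 + exp((x - mean) / sd)) for x in d]
        total = sum(c)
        f = [x / total for x in c]
        # Relevance weighting F_k(n) + alpha * F_k(n) * R_k, accumulated
        # over engines via the sum rule.
        r = (relevances[k] - wmin) / (wmax - wmin) if wmax > wmin else 0.0
        for n in range(G):
            cf[n] += f[n] * (1.0 + alpha * r)
    return max(range(G), key=lambda n: cf[n])  # n* = argmax CF(n)

best = fuse([[0.2, 1.5, 2.0], [0.4, 1.1, 2.2]], relevances=[0.9, 0.3])
```

Smaller distances yield larger sigmoid confidences, so the candidate that is closest to the query at both engines accumulates the highest CF value and is chosen as n*.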
CHAPTER 7
SYSTEM IMPLEMENTATION
7.1 CODING
Coding is the process of designing, writing, testing, debugging, and
maintaining the source code of computer programs. This source code is
written in one or more programming languages. The purpose of
programming is to create a set of instructions that computers use to
perform specific operations or to exhibit desired behaviors. The process
of writing source code often requires expertise in many different subjects,
including knowledge of the application domain,
specialized algorithms and formal logic.
7.2 CODING STANDARD
A comprehensive coding standard encompasses all aspects of code
construction. While developers should prudently implement a standard, it
should be adhered to whenever practical. Completed source code should
reflect a harmonized style, as if a single developer wrote the code in one
session. At the inception of a software project, establish a coding standard
to ensure that all developers on the project are working in concert. When
the software project incorporates existing source code, or when
performing maintenance on an existing software system, the coding
standard should state how to deal with the existing code base.
The readability of source code has a direct impact on how well a
developer comprehends a software system. Code maintainability refers to
how easily that software system can be changed to add new features,
modify existing features, fix bugs, or improve performance. Although
readability and maintainability are the result of many factors, one
particular facet of software development upon which all developers have
an influence is coding technique. The easiest method to ensure a team of
developers will yield quality code is to establish a coding standard, which
is then enforced at routine code reviews.
Using solid coding techniques and good programming practices to
create high-quality code plays an important role in software quality and
performance. In addition, if you consistently apply a well-defined coding
standard, apply proper coding techniques, and subsequently hold routine
code reviews, a software project is more likely to yield a software system
that is easy to comprehend and maintain.
7.3 SAMPLE CODING
FACE DETECTION
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;
using System.Windows.Forms;
using Emgu.CV;
using Emgu.CV.Structure;
using Emgu.CV.UI;
using Emgu.CV.GPU;
using System.Data;
using System.Data.OleDb;
using System.Runtime.CompilerServices;
namespace FaceDetection
public partial class Form1 : Form
public static string ID;
public static string FName;
public static string LName;
OleDbConnection con;
public Form1()
InitializeComponent();
private void Form1_Load(object sender, EventArgs e)
Rectangle rect = Screen.PrimaryScreen.WorkingArea;
//Divide the screen in half, and find the center of the form to center it
this.Top = 0;
this.Left = 0;
this.Width = rect.Width;
this.Height = rect.Height;
this.SendToBack();
this.TopMost = false;
this.Activate();
con = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=face_annotation.mdb;");
public Bitmap CropBitmap(Bitmap bitmap, Rectangle rect)
//Rectangle rect = new Rectangle();
Bitmap cropped = bitmap.Clone(rect, bitmap.PixelFormat);
return cropped;
private void openFileDialog1_FileOk(object sender,
System.ComponentModel.CancelEventArgs e)
textBox1.Text = openFileDialog1.FileName;
private void button2_Click(object sender, EventArgs e)
openFileDialog1.ShowDialog();
openFileDialog1.OpenFile();
pictureBox2.ImageLocation = textBox1.Text;
pictureBox2.Visible = true;
private void button3_Click(object sender, EventArgs e)
OleDbCommand cmd = new OleDbCommand();
OleDbCommand cmd1 = new OleDbCommand();
cmd.CommandType = CommandType.Text;
cmd.CommandText = "Insert into photos(EmailID,Photo)
Values('" + ID + "','" + pictureBox2.ImageLocation + "')";
con.Open();
cmd.Connection = con;
//cmd1.Connection = con;
cmd.ExecuteNonQuery();
con.Close();
Image<Bgr, Byte> image = new Image<Bgr,
byte>(textBox1.Text); //Read the files as an 8-bit Bgr image
Stopwatch watch;
String faceFileName = "haarcascade_frontalface_default.xml";
String eyeFileName = "haarcascade_eye.xml";
String contact, contacta="null", contact2;
String pname;
Double pcooccur, pcooccura;
int cooccur = 0, cooccura = 0;
int posn, posna;
Double[] weight = new Double[100];
Double[] sweight = new Double[100];
Double[] avalue = new Double[100];
Double[] wfrscore = new Double[100];
weight[0] = 0;
Double tweight = 0, tavalue = 0, cvalue = 0, frscore = 0;
String aval;
Double wmin = 0, wmax = 0, rk, aweight = 0;
if (GpuInvoke.HasCuda)
using (GpuCascadeClassifier face = new
GpuCascadeClassifier(faceFileName))
using (GpuCascadeClassifier eye = new
GpuCascadeClassifier(eyeFileName))
watch = Stopwatch.StartNew();
using(GpuImage<Bgr, Byte> gpuImage=new
GpuImage<Bgr, byte>(image))
using (GpuImage<Gray, Byte> gpuGray =
gpuImage.Convert<Gray, Byte>())
Rectangle[] faceRegion = face.DetectMultiScale(gpuGray,
1.1, 10, Size.Empty);
foreach (Rectangle f in faceRegion)
//draw the face detected in the 0th (gray) channel with
blue color
image.Draw(f, new Bgr(Color.Blue), 2);
using (GpuImage<Gray, Byte> faceImg =
gpuGray.GetSubRect(f))
//For some reason a clone is required.
//Might be a bug of GpuCascadeClassifier in opencv
using (GpuImage<Gray, Byte> clone =
faceImg.Clone())
Rectangle[] eyeRegion =
eye.DetectMultiScale(clone, 1.1, 10, Size.Empty);
foreach (Rectangle s in eyeRegion)
Rectangle eyeRect = s;
eyeRect.Offset(f.X, f.Y);
image.Draw(eyeRect, new Bgr(Color.Red), 2);
watch.Stop();
else
int faces = 1;
int x = 300, y = 300;
//Read the HaarCascade objects
using (HaarCascade face = new HaarCascade(faceFileName))
using (HaarCascade eye = new HaarCascade(eyeFileName))
watch = Stopwatch.StartNew();
using (Image<Gray, Byte> gray = image.Convert<Gray,
Byte>()) //Convert it to Grayscale
//normalizes brightness and increases contrast of the image
gray._EqualizeHist();
MCvAvgComp[] facesDetected = face.Detect(
gray,
1.1,
10,
Emgu.CV.CvEnum.HAAR_DETECTION_TYPE.DO_CANNY_PRUNING,
new Size(20, 20));
foreach (MCvAvgComp f in facesDetected)
//draw the face detected in the 0th (gray) channel with
blue color
image.Draw(f.rect, new Bgr(Color.Transparent), 2);
//Set the region of interest on the faces
gray.ROI = f.rect;
MCvAvgComp[] eyesDetected = eye.Detect(
gray,
1.1,
10,
Emgu.CV.CvEnum.HAAR_DETECTION_TYPE.DO_CANNY_PRUNING,
new Size(20, 20));
gray.ROI = Rectangle.Empty;
foreach (MCvAvgComp s in eyesDetected)
Rectangle eyeRect = s.rect;
eyeRect.Offset(f.rect.X, f.rect.Y);
//image.Draw(eyeRect, new Bgr(Color.Transparent),
2);
Bitmap bmp = new Bitmap(image.Bitmap);
Bitmap bmpgs = ConvertToGrayscale(bmp);
PictureBox pc = new PictureBox();
pc.Image = CropBitmap(bmp, f.rect);
pc.Size = new Size(100, 100);
pc.SizeMode = PictureBoxSizeMode.StretchImage;
this.Controls.Add(pc);
Image<Bgr, Byte> img = new Image<Bgr,
byte>(bmpgs);
TextBox tb = new TextBox();
TextBox tb1 = new TextBox();
tb.Text = (img.GetAverage().Red +
img.GetAverage().Green +
img.GetAverage().Blue).ToString();
this.Controls.Add(tb);
tb.Location = new Point(x, y + 150);
tb.Hide();
pc.Location = new Point(x, y);
x = x + 200;
faces++;
con.Open();
string query1 = "SELECT * FROM photos";
OleDbDataAdapter adapt1 = new
OleDbDataAdapter(query1, con);
DataSet ds1 = new DataSet();
adapt1.Fill(ds1, "photos");
adapt1.Dispose();
DataView view1 = new DataView();
view1 = ds1.Tables[0].DefaultView;
posn = view1.Count;
string query = "SELECT * FROM friends where
EmailID='" + ID + "' and Condition='Accepted'";
OleDbDataAdapter adapt = new
OleDbDataAdapter(query, con);
DataSet ds = new DataSet();
adapt.Fill(ds, "friends");
adapt.Dispose();
DataView view = new DataView();
view = ds.Tables[0].DefaultView;
int f1 = view.Count;
int fa = f1;
foreach (DataRow dr in ds.Tables[0].Rows)
contact = view[f1 - 1].Row[1].ToString();
string query2 = "SELECT * FROM photos where
EmailID='" + ID + "'";
OleDbDataAdapter adapt2 = new
OleDbDataAdapter(query2, con);
DataSet ds2 = new DataSet();
adapt2.Fill(ds2, "photos");
adapt2.Dispose();
DataView view2 = new DataView();
view2 = ds2.Tables[0].DefaultView;
int puser = view2.Count;
string query3 = "SELECT * FROM queryface where
EmailID='" + ID + "' and Person='" + contact + "'";
OleDbDataAdapter adapt3 = new
OleDbDataAdapter(query3, con);
53
DataSet ds3 = new DataSet();
adapt3.Fill(ds3, "queryface");
adapt3.Dispose();
DataView view3 = new DataView();
view3 = ds3.Tables[0].DefaultView;
int occur = view3.Count;
Double poccur = occur / puser;
string query31 = "SELECT * FROM queryface
where Person='" + contact + "'";
OleDbDataAdapter adapt31 = new
OleDbDataAdapter(query31, con);
DataSet ds31 = new DataSet();
adapt31.Fill(ds31, "queryface");
adapt31.Dispose();
DataView view31 = new DataView();
view31 = ds31.Tables[0].DefaultView;
int occur1 = view31.Count;
foreach (DataRow dr11 in ds31.Tables[0].Rows)
pname = view31[occur1 - 1].Row[1].ToString();
string query4 = "SELECT * FROM queryface
where Person='" + FName + " " + LName + "'
and Photo='" + pname + "'";
OleDbDataAdapter adapt4 = new
OleDbDataAdapter(query4, con);
DataSet ds4 = new DataSet();
54
adapt4.Fill(ds4, "queryface");
adapt4.Dispose();
DataView view4 = new DataView();
view4 = ds4.Tables[0].DefaultView;
cooccur = view4.Count;
occur1--;
//cooccur--;
pcooccur = cooccur / posn;
weight[f1] = Math.Exp(occur) + Math.Exp(cooccur);
tweight = tweight + weight[f1];
f1--;
aweight = tweight / fa;
Label sfr = new Label();
sfr.Text = "Selected FR Engine";
sfr.Location = new Point(x, y + 150);
string query1a = "SELECT * FROM photos";
OleDbDataAdapter adapt1a = new
OleDbDataAdapter(query1a, con);
DataSet ds1a = new DataSet();
adapt1a.Fill(ds1a, "photos");
adapt1a.Dispose();
DataView view1a = new DataView();
view1a = ds1a.Tables[0].DefaultView;
posna = view1a.Count;
string querya = "SELECT * FROM friends where
EmailID='" + ID + "' and Condition='Accepted'";
55
OleDbDataAdapter adapta = new
OleDbDataAdapter(querya, con);
DataSet dsa = new DataSet();
adapta.Fill(dsa, "friends");
adapta.Dispose();
DataView viewa = new DataView();
viewa = dsa.Tables[0].DefaultView;
int f1a = viewa.Count;
int faa = f1a;
foreach (DataRow dr in dsa.Tables[0].Rows)
contacta = view[f1a - 1].Row[1].ToString();
string query2a = "SELECT * FROM photos where
EmailID='" + ID + "'";
OleDbDataAdapter adapt2a = new
OleDbDataAdapter(query2a, con);
DataSet ds2a = new DataSet();
adapt2a.Fill(ds2a, "photos");
adapt2a.Dispose();
DataView view2a = new DataView();
view2a = ds2a.Tables[0].DefaultView;
int pusera = view2a.Count;
string query3a = "SELECT * FROM queryface where
EmailID='" + ID + "' and Person='" + contacta + "'";
OleDbDataAdapter adapt3a = new
OleDbDataAdapter(query3a, con);
DataSet ds3a = new DataSet();
adapt3a.Fill(ds3a, "queryface");
adapt3a.Dispose();
56
DataView view3a = new DataView();
view3a = ds3a.Tables[0].DefaultView;
int occura = view3a.Count;
Double poccura = occura / pusera;
string query31a = "SELECT * FROM queryface
where Person='" + contacta + "'";
OleDbDataAdapter adapt31a = new
OleDbDataAdapter(query31a, con);
DataSet ds31a = new DataSet();
adapt31a.Fill(ds31a, "queryface");
adapt31a.Dispose();
DataView view31a = new DataView();
view31a = ds31a.Tables[0].DefaultView;
int occur1a = view31a.Count;
foreach (DataRow dr11 in ds31a.Tables[0].Rows)
pname=view31a[occur1a- 1].Row[1].ToString();
string query4a = "SELECT * FROM queryface
where Person='" + FName + " " + LName + "'
and Photo='" + pname + "'";
OleDbDataAdapter adapt4a=new
OleDbDataAdapter(query4a, con);
DataSet ds4a = new DataSet();
adapt4a.Fill(ds4a, "queryface");
adapt4a.Dispose();
DataView view4a = new DataView();
57
view4a = ds4a.Tables[0].DefaultView;
cooccura = view4a.Count;
occur1a--;
//cooccura--;
pcooccura = cooccura / posna;
weight[f1]=Math.Exp(occura) +
Math.Exp(cooccura);
f1a--;
if (aweight <= weight[f1a+1])
for (int i = 0; i < f1a; i++)
if (weight[i] > tweight)
for (int j = 0; j < f1a; j++)
sweight[j] = weight[i];
wmax = Math.Max(sweight[j], sweight[j
- 1]);
wmin = Math.Min(sweight[j], sweight[j -
1]);
58
watch.Stop();
fr.ID = ID;
fr.FName = FName;
fr.LName = LName;
fr.photo = image.Bitmap;
private void button4_Click(object sender, EventArgs e)
fr fr = new fr();
this.Hide();
fr.Show();
public Bitmap ConvertToGrayscale(Bitmap source)
Bitmap bm = new Bitmap(source.Width, source.Height);
for (int y = 0; y < bm.Height; y++)
for (int x = 0; x < bm.Width; x++)
59
Color c = source.GetPixel(x, y);
int luma = (int)(c.R * 0.3 + c.G * 0.59 + c.B * 0.11);
bm.SetPixel(x, y, Color.FromArgb(luma, luma, luma));
return bm;
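The engine-weighting logic interleaved with the UI and database code above can be hard to follow. The following self-contained sketch (hypothetical counts and method names, not the project's actual classes) isolates the core idea: each friend's FR engine receives a weight w = exp(occurrence count) + exp(co-occurrence count), and the engines whose weight reaches the average weight are selected.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class EngineSelectionSketch
{
    // Weight of one friend's FR engine, mirroring the handler above:
    // w = exp(occurrence count) + exp(co-occurrence count).
    public static double Weight(int occur, int cooccur)
    {
        return Math.Exp(occur) + Math.Exp(cooccur);
    }

    // Select the indices of the engines whose weight is at least the average.
    public static List<int> SelectEngines(int[] occurs, int[] cooccurs)
    {
        double[] w = new double[occurs.Length];
        for (int i = 0; i < w.Length; i++)
            w[i] = Weight(occurs[i], cooccurs[i]);
        double avg = w.Average();

        List<int> selected = new List<int>();
        for (int i = 0; i < w.Length; i++)
            if (w[i] >= avg)
                selected.Add(i);
        return selected;
    }

    public static void Main()
    {
        // Hypothetical counts: friend 1 appears far more often with the user.
        int[] occurs = { 1, 5, 2 };
        int[] cooccurs = { 0, 4, 1 };
        foreach (int i in SelectEngines(occurs, cooccurs))
            Console.WriteLine("Selected FR engine of friend " + i);
        // prints: Selected FR engine of friend 1
    }
}
```

Because of the exponentials, a friend who occurs and co-occurs even a few more times than the others dominates the average, so only the most closely related friends' engines are selected.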
CHAPTER 8
TESTING
8.1 INTRODUCTION
Testing is the process of executing a program with the
intent of finding errors. A good test has a high probability of finding an
as-yet-undiscovered error, and a successful test is one that actually uncovers
such an error. The objective is to design tests that systematically
uncover different classes of errors with a minimum amount of
time and effort. Testing cannot show the absence of defects; it can only
show that defects are present.
8.1.1 TEST PLAN
The test-case designer not only has to consider white-box
and black-box test cases, but also the timing of the data and the
parallelism of the tasks that handle the data.
In many situations, test data provided when a real-time system is in one
state will result in proper processing, while the same data provided when
the system is in a different state may lead to error.
The intimate relationship that exists between real-time software
and its hardware environment can cause testing problems. Software tests
must consider the impact of hardware faults on software processing. A
step-by-step strategy for testing real-time systems is therefore proposed.
The first step in the testing of real-time software is to test each task
independently, i.e., the white-box and black-box tests are designed and
executed for each task. Each task is executed independently during these
tests. Task testing uncovers errors in logic and function, but will not
uncover timing or behavioral errors.
VALIDATION
Validation is the process of evaluating software at the end of the software
development process to ensure compliance with the software requirements; it
is the actual testing of the application. It answers the question:
Am I building the right product?
Validation determines whether the system complies with the requirements,
performs the functions for which it is intended, and meets the
organization’s goals and user needs. It is traditional and is
performed at the end of the project.
It is performed after a work product is produced, against established
criteria, ensuring that the product integrates correctly into its
environment.
It is the determination of the correctness of the final software product of a
development project with respect to the user needs and
requirements.
VERIFICATION
Verification is the process of determining whether or not the products of
a given phase of the software development cycle meet the implementation
steps and can be traced back to the objectives established during the
previous phase.
The verification process helps in detecting defects early and
preventing their leakage downstream. Thus, the higher cost of later
detection and rework is eliminated.
The goal of software testing is to assess the requirements of a
project; the tester then determines whether these requirements are met. There
are many times when low memory usage and speed are more important
than making the program pretty or capable of handling errors. While
programming skills are useful when testing software, they are not
strictly required; however, they can be useful in determining the cause of
errors found in a project.
8.2 TYPES OF TESTING
There are different types of testing in the development process.
They are:
Unit Testing
Integration Testing
Validation Testing
System Testing
Functional Testing
Performance Testing
Beta Testing
Acceptance Testing
8.2.1 UNIT TESTING
Developers write unit tests to check their own code. Unit
testing differs from integration testing, which confirms that components
work well together, and acceptance testing, which confirms that an
application does what the customer expects it to do. Unit tests are so named
because they test a single unit of code. Unit testing focuses verification
effort on the smallest unit of software design. Each of the modules in this
project was verified individually for errors.
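As an illustration, a unit test for the ConvertToGrayscale method listed earlier could fix a single pixel and assert the expected luma value. This is a sketch using a plain assertion rather than any particular test framework; it duplicates the method locally so it can run on its own (System.Drawing requires Windows or libgdiplus):

```csharp
using System;
using System.Drawing;

public class GrayscaleUnitTest
{
    // Local copy of the routine under test (same luma formula
    // as in the listing: 0.3 R + 0.59 G + 0.11 B).
    public static Bitmap ConvertToGrayscale(Bitmap source)
    {
        Bitmap bm = new Bitmap(source.Width, source.Height);
        for (int y = 0; y < bm.Height; y++)
            for (int x = 0; x < bm.Width; x++)
            {
                Color c = source.GetPixel(x, y);
                int luma = (int)(c.R * 0.3 + c.G * 0.59 + c.B * 0.11);
                bm.SetPixel(x, y, Color.FromArgb(luma, luma, luma));
            }
        return bm;
    }

    public static void Main()
    {
        // Arrange: a one-pixel image with a known colour.
        Bitmap source = new Bitmap(1, 1);
        source.SetPixel(0, 0, Color.FromArgb(100, 200, 50));
        int expected = (int)(100 * 0.3 + 200 * 0.59 + 50 * 0.11); // 153

        // Act.
        Color c = ConvertToGrayscale(source).GetPixel(0, 0);

        // Assert: all three channels carry the luma value.
        if (c.R != expected || c.G != expected || c.B != expected)
            throw new Exception("grayscale conversion failed");
        Console.WriteLine("unit test passed");
    }
}
```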
8.2.2 INTEGRATION TESTING
Integration testing is a systematic technique for constructing the
program structure while at the same time conducting tests to uncover errors
associated with the interfaces. This testing was done with sample data.
The need for integration testing is to evaluate overall system performance.
8.2.3 VALIDATION TESTING
Validation testing is where the requirements established as part
of the software requirements analysis are validated against the software
that has been constructed. It provides final assurance that the software
meets all functional, behavioral, and performance requirements. Any deviation
from the specification is uncovered and corrected. Each input field was
tested against the specified validation rules.
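For instance, an input field such as the EmailID used throughout this project can be checked against a simple format rule. The pattern below is illustrative only, not the project's actual validation rule:

```csharp
using System;
using System.Text.RegularExpressions;

public class InputValidationSketch
{
    // Illustrative validation rule for an e-mail field:
    // non-empty local part, '@', and a domain containing a dot.
    public static bool IsValidEmail(string input)
    {
        return Regex.IsMatch(input, @"^[^@\s]+@[^@\s]+\.[^@\s]+$");
    }

    public static void Main()
    {
        Console.WriteLine(IsValidEmail("user@example.com")); // True
        Console.WriteLine(IsValidEmail("not-an-email"));     // False
    }
}
```

A validation test case for this field pairs each input (valid and invalid) with its expected pass or fail outcome.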
8.2.4 SYSTEM TESTING
System testing is testing conducted on a complete, integrated
system to evaluate the system's compliance with its specified
requirements. System testing falls within the scope of black-box
testing and, as such, should require no knowledge of the inner design of
the code or logic.
8.2.5 PERFORMANCE TESTING
Performance testing covers a wide range of engineering or
functional evaluations where a material, product, system, or person is not
specified by a detailed material or component specification; rather, the
emphasis is on the final measurable performance characteristics. Testing
can be a qualitative or quantitative procedure.
8.2.6 BETA TESTING
In software development, a beta test is the second phase of
software testing in which a sampling of the intended audience tries the
product out. (Beta is the second letter of the Greek alphabet.) Originally,
the term alpha test meant the first phase of testing in a software
development process. The first phase includes unit testing, component
testing, and system testing. Beta testing can be considered "pre-release
testing." Beta test versions of software are now distributed to a wide
audience on the Web partly to give the program a "real-world" test and
partly to provide a preview of the next release.
8.2.7 ACCEPTANCE TESTING
Acceptance testing generally involves running a suite of tests on the
completed system. Each individual test, known as a test case, exercises a
particular operating condition of the user's environment or feature of the
system, and will result in a pass or fail (Boolean) outcome. There is
generally no degree of success or failure. The test environment is usually
designed to be identical, or as close as possible, to the anticipated user's
environment, including extremes of such. These test cases must each be
accompanied by test case input data or a formal description of the
operational activities (or both) to be performed, intended to thoroughly
exercise the specific case, and a formal description of the expected
results.
CHAPTER 9
SCREEN SHOTS
USER ACCOUNT:
FACE DETECTION
SELECTION OF FR ENGINES
CHAPTER 10
CONCLUSION
Our project demonstrates that the collaborative use of multiple
FR engines improves the accuracy of face annotation for personal
photo collections shared on OSNs. The improvement in face annotation
accuracy can mainly be attributed to the following two factors.
1) In an OSN, the number of subjects that need to be annotated
tends to be relatively small compared to the number of subjects
encountered in traditional FR applications. Thus, an FR engine
that belongs to an OSN member is typically highly specialized
for the task of recognizing a small group of individuals.
2) In an OSN, query face images have a higher chance of
belonging to people closely related to the photographer.
Hence, simultaneously using the FR engines assigned to these
individuals can lead to improved face annotation accuracy, as these FR
engines are expected to locally produce high-quality FR results for the
given query face images.
CHAPTER 11
FUTURE ENHANCEMENT
Our approach for constructing an SGM relies on the
availability of manually tagged photos to reliably estimate the
identity occurrence and co-occurrence probabilities. In practice, however,
two limitations can be identified:
1) individuals are only part of the constructed SGM when they
were manually tagged in the personal photo collections;
2) our approach may not be optimal when the manually tagged photos
do not clearly represent the social relations between different
members of the OSN.
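For reference, these probabilities are estimated in our implementation as simple relative frequencies over the tagged photos (poccur = occur / puser and pcooccur = cooccur / posn in the listing of Chapter 7). The sketch below restates this with hypothetical counts:

```csharp
using System;

public class SgmProbabilitySketch
{
    // Occurrence probability: fraction of the collection owner's photos
    // in which the contact appears (occur / puser in the listing).
    public static double OccurrenceProbability(int taggedWithContact, int ownerPhotos)
    {
        return (double)taggedWithContact / ownerPhotos;
    }

    // Co-occurrence probability: fraction of all photos in which the
    // contact and the query subject are tagged together (cooccur / posn).
    public static double CoOccurrenceProbability(int taggedTogether, int totalPhotos)
    {
        return (double)taggedTogether / totalPhotos;
    }

    public static void Main()
    {
        // Hypothetical counts: contact tagged in 12 of the owner's 40
        // photos, and together with the query subject in 5 of 200 photos.
        Console.WriteLine(OccurrenceProbability(12, 40));   // 0.3
        Console.WriteLine(CoOccurrenceProbability(5, 200)); // 0.025
    }
}
```

Both estimates are zero for individuals who were never manually tagged, which is precisely the first limitation identified above.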
REFERENCES
[1] P. Wu and D. Tretter, “Close & close: Social cluster and closeness from photo collections,” in Proc. ACM Int. Conf. Multimedia, 2009.
[2] A. Gursel and S. Sen, “Improving search in social networks by agent based mining,” in Proc. Int. Joint Conf. Artificial Intelligence, 2009.
[3] J. Zhu, S. C. H. Hoi, and M. R. Lyu, “Face annotation using transductive kernel Fisher discriminant,” IEEE Trans. Multimedia, vol. 10, no. 1, pp. 86-96, 2008.
[4] Z. Stone, T. Zickler, and T. Darrell, “Autotagging Facebook: Social network context improves photo annotation,” in Proc. IEEE Int. Conf. Computer Vision and Pattern Recognition (CVPR), 2008.
[5] K. Choi and H. Byun, “A collaborative face recognition framework on a social network platform,” in Proc. IEEE Int. Conf. Automatic Face and Gesture Recognition (AFGR), 2008.
[6] N. O’Hare and A. F. Smeaton, “Context-aware person identification in personal photo collections,” IEEE Trans. Multimedia, vol. 11, no. 2, pp. 220-228, 2009.
[7] J. Y. Choi, W. De Neve, Y. M. Ro, and K. N. Plataniotis, “Automatic face annotation in photo collections using context-based unsupervised clustering and face information fusion,” IEEE Trans. Circuits Syst. Video Technol., vol. 20, no. 10, pp. 1292-1309, Oct. 2010.
[8] M. H. Yang, D. J. Kriegman, and N. Ahuja, “Detecting faces in images: A survey,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 1, pp. 34-58, 2002.
[9] J. Y. Choi, Y. M. Ro, and K. N. Plataniotis, “Color face recognition for degraded face images,” IEEE Trans. Syst., Man, Cybern. B, vol. 39, no. 5, pp. 1217-1230, 2009.
[10] R. Tron and R. Vidal, “Distributed face recognition via consensus on SE(3),” in Proc. 8th Workshop Omnidirectional Vision, Camera Networks and Non-classical Cameras (OMNIVIS), 2008.
[11] K. Lerman, A. Plangprasopchok, and C. Wong, “Personalizing image search results on Flickr,” in Proc. Int. Joint Conf. Artificial Intelligence, 2007.
[12] C. A. Yeung, L. Licaardi, K. Lu, O. Seneviratne, and T. B. Lee, “Decentralization: The future of online social networking,” in Proc. W3C Workshop on the Future of Social Networking, 2009.
[13] M. Naaman, R. B. Yeh, H. G. Molina, and A. Paepcke, “Leveraging context to resolve identity in photo albums,” in Proc. ACM/IEEE-CS Joint Conf. on Digital Libraries, 2005.