[ieee 2011 international conference on communication systems and network technologies (csnt) -...

A Framework for Discovering Internal Financial Fraud using Analytics

Prabin Kumar Panigrahi

Information Systems Department, Indian Institute of Management Indore, Indore, Madhya Pradesh, India

[email protected]

Abstract- In today’s knowledge-based society, financial fraud has become a common phenomenon. Moreover, the growth in knowledge discovery in databases and in fraud audit has made the detection of internal financial fraud a major area of research. On the other hand, auditors find it difficult to apply the majority of techniques in the fraud auditing process and to integrate their domain knowledge into this process. In this paper a framework called “Knowledge-driven Internal Fraud Detection (KDIFD)” is proposed for detecting internal financial frauds. The framework suggests a process-based approach that considers both the forensic auditor’s tacit knowledge base and computer-based data analysis and mining techniques. The proposed framework can help auditors discover internal financial fraud more efficiently.

Keywords- Internal financial fraud, Knowledge-driven internal fraud detection, Forensic Data Analysis, Data Mining

I. INTRODUCTION

Unknown internal fraud encompasses an array of irregularities and illegal acts characterized by intentional deception by fraudsters. The majority of known anomalies are due to weaknesses in internal control mechanisms, which fraudsters exploit to commit fraud. Frauds are discovered by employing accounting, auditing, and investigative skills along with mathematical, statistical, and data mining models [1]. Over time, a number of computer-assisted techniques have been developed and implemented in various auditing software packages. Both simple and complex techniques are available as software modules. Data mining is the process of discovering hidden facts or ‘red flags’, trends, patterns, and relationships from multiple databases, and it has been used in the fraud auditing process [2, 3]. In spite of all these developments, auditors still find it difficult to discover frauds. There is no suggested framework that integrates auditors’ tacit knowledge base and the techniques in the auditing process. Auditors have limited knowledge of the data structures of suspicious transactions. The application of each technique depends on the underlying data structure, the availability and quality of attributes, the outcome attribute, and the number of transactions [4]. Complex techniques require more data and more assumptions to be satisfied [5]. Without knowing this, it is difficult for auditors to use auditing software or languages, and the number of misclassifications increases.

This paper proposes a framework called “Knowledge-driven Internal Fraud Detection (KDIFD)” for discovering internal financial frauds. A process-based approach is suggested that considers both the forensic auditor’s tacit knowledge base and computer-based data analysis and mining techniques. The data structure of each kind of possibly fraudulent transaction is suggested. It is found that the proposed framework helps auditors discover internal financial fraud more efficiently than working without a framework. Section II presents a review of past research. The components of the proposed framework, together with the data structures of suspicious transactions, are outlined in Section III.

II. LITERATURE REVIEW

Financial frauds are of two types: internal and external [7]. Internal financial frauds include asset misappropriation and kickbacks, in which fraudsters embezzle money and other important resources from the company. In financial statement fraud (external), the financial position of the company is misrepresented to the stakeholders. The resulting anomalies are reflected in various components of the financial systems, and companies implement auditing processes to identify them. In general, two types of auditing take place in organizations [6]: fraud audit and financial statement audit. In a fraud (or internal) audit, the internal controls of the organization are examined and any weaknesses are identified, since weaknesses in controls may be exploited and lead to fraud. In a financial statement audit, the company’s figures on the various financial statements are checked for any fraudulent behaviour.

Anomalies in accounting systems can be unintentional or intentional. Unintentional anomalies may be due to mistakes or errors in accounting system procedures and occur systematically in the database. Such anomalies are easily detected by traditional analysis techniques [9]. Intentional anomalies, on the other hand, arise when a fraudster deliberately commits fraud. Rare events of this type cannot be easily detected by traditional techniques. Information technology has already been applied to detect suspected fraudulent behaviour in financial systems; one notable example is the discovery of the WorldCom fraud [8]. The auditor’s knowledge base, along with advanced data mining techniques, is required to discover such intentional fraudulent transactions.

A review of the literature published so far shows that most articles focus on the detection of external financial fraud. Advanced analytical techniques have been proposed for detecting frauds of the external kind such as

2011 International Conference on Communication Systems and Network Technologies

978-0-7695-4437-3/11 $26.00 © 2011 IEEE

DOI 10.1109/CSNT.2011.74


financial statement frauds. Very little work has been done in the area of internal financial fraud. Advanced statistical methods for discovering frauds, such as artificial neural networks, genetic algorithms, rough sets, fuzzy sets, rule discovery, cluster analysis, and logistic regression, are reviewed in [10, 11]. Unfortunately, the authors do not mention how the proposed techniques are used in business financial systems or how helpful they are for auditors. In [14], the authors developed a prototype system that uses outlier detection to find irregularities in a database and then applies a classification technique for modeling. The prototype, called Sherlock, identifies irregularities in the general ledgers of a company. Once an irregularity is discovered, auditors investigate it further. In [15], data mining tools and attack trees are integrated with experts’ judgments, which are elicited using the Delphi technique. The authors consider meta-rules and tree-based detection as part of the data mining. Because anomaly detection techniques are not efficient and effective in identifying new behaviours or frauds perpetrated by fraudsters, experts’ judgment and experience are also considered, which is more effective. The IFR² framework [16] takes a data mining approach in which a range of analytical techniques is proposed. It focuses on data mining techniques without considering the experience and knowledge base of auditors, so internal auditors without an information technology and statistical background would find it difficult to understand and apply. Both subjective and objective methods are considered in [17] to identify fraudulent transactions. In that integrated framework, the Analytic Hierarchy Process (AHP) captures the subjective knowledge of an expert or group of experts, and rough sets are used to detect fraud in an objective way. AHP is a multi-criteria decision-making framework that captures both subjective and objective knowledge of group member(s). The framework focuses only on the detection and prevention of financial identity theft frauds.

Specialized auditing software such as ACL, IDEA, and Picalo incorporates a set of data mining techniques for identifying irregularities in data systems [12, 13]. Similarly, modules for simple as well as complex analysis and mining are available in a number of commercial statistical packages such as SPSS, SAS, PolyAnalyst, and Clementine.

The methods and techniques for discovering internal financial frauds proposed and discussed in the literature are very technical in nature from an auditor’s point of view. Auditors find it difficult to understand and apply them in a given context and are unable to integrate their vast experience and knowledge base with these methods and techniques. They even find it difficult to apply simple techniques such as outlier analysis, link discovery, Benford’s law, trend analysis, and matching. The techniques are not suitable for discovering all types of frauds, known and unknown. Auditors are not well versed in the data and its structure and hence find it difficult to prepare the data and transfer it for further analysis. In the absence of a framework, the availability of software with various advanced techniques is of little help to auditors.

III. PROPOSED FRAMEWORK

In this paper a framework called “Knowledge-driven Internal Fraud Detection (KDIFD)” is proposed for discovering internal financial frauds. The framework suggests a process-based approach that considers the forensic auditor’s tacit knowledge, experience, gut feeling, and intuition as well as computer-based data analysis and mining techniques. In most published papers, the internal fraud discovery process is outlined without giving importance to the auditor’s domain knowledge and experience. Sophisticated analytical techniques are suggested with examples but without any well-defined data structure, so it is not easy for auditors to apply these techniques even when auditing software is available. With the help of the proposed framework, auditors can do justice to an internal fraud discovery process that uses data analysis and mining techniques. The steps of the proposed framework are depicted as a flowchart in Figure 1 and outlined below.

Fig. 1: Knowledge-driven internal fraud detection (KDIFD) Framework


Step-1: Establishing the Context: The forensic auditors take ownership of the fraud discovery process. The sub-steps are as follows:

- Understand and analyze the processes and controls of all processes, or of any specific process, covered by the engagement

- Determine where the risks exist as well as the nature and extent of each risk

- Identify and prioritize the possible (known) fraud schemes

- Establish the indicators for each of the identified fraud schemes

- Identify the data items or attribute(s) for each indicator (red flag)

- Identify the databases or files that contain these attributes and their respective sources
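As an illustration only, the outputs of Step-1 can be recorded in a small structure that links each fraud scheme to its indicators, the attributes behind each indicator, and the source holding each attribute. The sketch below is not part of the original framework; the class names, scheme names, red flags, and file names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Indicator:
    description: str            # the red flag, in the auditor's words
    attributes: Dict[str, str]  # attribute name -> file or database that holds it

@dataclass
class FraudScheme:
    name: str
    priority: int               # 1 = highest, assigned from the risk assessment
    indicators: List[Indicator]

def sourcing_plan(schemes: List[FraudScheme]) -> Dict[str, List[str]]:
    """List, per source, the attributes to request, visiting high-priority schemes first."""
    plan: Dict[str, List[str]] = {}
    for scheme in sorted(schemes, key=lambda s: s.priority):
        for indicator in scheme.indicators:
            for attribute, source in indicator.attributes.items():
                wanted = plan.setdefault(source, [])
                if attribute not in wanted:
                    wanted.append(attribute)
    return plan

# Hypothetical engagement: scheme names, red flags and file names are illustrative only.
schemes = [
    FraudScheme("duplicate payment", 2, [
        Indicator("same invoice number paid more than once",
                  {"invoice_no": "cash_payments", "vendor_id": "cash_payments"}),
    ]),
    FraudScheme("fictitious vendor", 1, [
        Indicator("vendor shares a mailing address with an employee",
                  {"vendor_address": "vendor_master",
                   "employee_address": "employee_master"}),
        Indicator("payments to a vendor missing from the vendor master",
                  {"vendor_id": "cash_payments"}),
    ]),
]

plan = sourcing_plan(schemes)
for source, attributes in plan.items():
    print(source, attributes)
```

The resulting plan is the input to the sourcing step: it tells the auditor which files to request and which attributes each file must contain.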

Step-2: Sourcing the Required Files and Databases: The relevant documents as well as the databases having the desired attributes are collected from different sources, both electronic and non-electronic. Frauds are rare events, so in order to discover any possible fraud the whole database (population) must be considered instead of a sample of transactions. Metadata must be collected because it describes what type of data should be available in a database and helps in identifying missing data, if any. If required, the data from various sources are integrated using appropriate keys. For internal fraud detection, both accounting and non-accounting databases are collected. Accounting databases include accounts receivable, general ledger, fixed assets, accounts payable, cash payments, and invoices. Non-accounting databases include employee, vendor, customer, supplier, product, pricing, telephone logs, telephone numbers, bank statements, supplementary accounting systems, email, and voice data. Non-electronic materials must be converted to electronic formats. Auditors arrange these databases before visiting the field for investigation.
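The two mechanical parts of this step, checking collected records against the metadata and integrating sources on a key, can be sketched as follows. This is a minimal illustration using plain Python dictionaries; the field names, the sample records, and the helper names `missing_fields` and `join_on_key` are assumptions made for the example, not part of the framework.

```python
def missing_fields(records, required_fields):
    """Return (row_index, field) pairs where a field named in the metadata is absent or blank."""
    problems = []
    for i, row in enumerate(records):
        for name in required_fields:
            value = row.get(name)
            if value is None or str(value).strip() == "":
                problems.append((i, name))
    return problems

def join_on_key(left, right, key):
    """Left-join two lists of dicts on `key`; left rows without a match are also reported."""
    index = {row[key]: row for row in right}
    joined, unmatched = [], []
    for row in left:
        match = index.get(row[key])
        if match is None:
            unmatched.append(row)
            joined.append(dict(row))
        else:
            joined.append({**match, **row})  # left values win on a name clash
    return joined, unmatched

# Hypothetical extracts from a cash payments file and a vendor master file.
payments = [
    {"payment_id": 1, "vendor_id": "V1", "amount": 1200.0},
    {"payment_id": 2, "vendor_id": "V9", "amount": 560.0},
    {"payment_id": 3, "vendor_id": "V2", "amount": ""},
]
vendors = [
    {"vendor_id": "V1", "vendor_name": "Acme Supplies"},
    {"vendor_id": "V2", "vendor_name": "Beta Traders"},
]

gaps = missing_fields(payments, ["payment_id", "vendor_id", "amount"])
joined, unmatched = join_on_key(payments, vendors, "vendor_id")
print("missing values:", gaps)
print("payments without a vendor master record:", unmatched)
```

A payment whose vendor key finds no match in the vendor master is worth noting in its own right, since it may be a data quality problem or a red flag.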

Step-3: Data Preparation: Incomplete or inaccurate data, inappropriate granularity, and data in wrong formats affect the quality of data. Once the data is collected from various sources, it should be tested for accuracy, appropriateness, and completeness using appropriate methods. Linking the data from various sources helps check for inconsistencies between data items and records. The data must be prepared for further analysis and processing, with provision for legal compliance, privacy, and security. Research is still ongoing on automatically converting data structures to a desired standard structure and on the interpretation of data. The desirable data structures are suggested below:

Data Structure

Data Structure-1: Transactions with atypical combinations of typical entries

Data in the individual attributes are typical and acceptable, whereas some combinations of data (a subset of the Cartesian product of the values of some variables) are atypical and unacceptable. The number of transactions of this type in a corporate database is significant, and it is very difficult to identify frauds among them. Association and sequencing techniques of data mining are most appropriate.

Data Structure-2: Transactions where a value is atypical with respect to a reference group

In this type of data structure, the value of a particular entry or combination of entries is atypical with respect to comparison value(s). The reference value can be suggested by auditors or determined by cluster analysis. Outlier analysis can be performed to detect cases deviating from this reference value.

Data Structure-3: Value that is atypical of and by itself

The entry of a variable of a transaction may be atypically high or low by itself. The value can be an outlier and still be acceptable. Descriptive statistics such as range and standard deviation can easily be applied to detect this type of error or fraud. Outlier analysis and relative size factor techniques are used to identify deviations.

Data Structure-4: Unrelated records having the same values for some fields

Two or more unrelated records can unexpectedly have the same values (excluding obvious values such as gender and nationality) for some of the attributes. For example, a buyer and a seller may have the same mailing address or slightly different addresses. Cluster analysis can be used to identify such cases.

Data Structure-5: Multiple unrelated records sharing values that indicate the records are in fact related

Several seemingly unrelated records unexpectedly share some values that establish a link or network between them. This characteristic indicates that the records are related in some way and have a hidden relationship. Link analysis can be used to identify any type of ring fraud.

Data Structure-6: Value of outcome attribute (past fraud) is known for specific transactions

Over time, forensic auditors discover some frauds. In this case both the outcome attribute (occurrence or non-occurrence of fraud) and all the attributes that contribute to it are known. A statistical or data mining model can be built from the past data and then used for future transactions. Training, validation, and production data are used in modeling. The relative importance of the attributes can also be derived, and the transactions can be scored. Predictive modeling techniques such as classification and regression are used in such cases.
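To make the first five data structures more concrete, the sketch below shows one simple check for three of them: counting attribute combinations (Data Structure-1), the relative size factor within a group (Data Structure-3), and grouping records that share a value through a union-find structure as a basic form of link analysis (Data Structures 4 and 5). These are deliberately simple stand-ins, not the association, cluster, or link analysis algorithms of any particular package; the field names and sample records are hypothetical.

```python
from collections import Counter, defaultdict

def rare_combinations(rows, fields, max_count=1):
    """Rows whose combination of values in `fields` occurs at most `max_count` times."""
    counts = Counter(tuple(r[f] for f in fields) for r in rows)
    return [r for r in rows if counts[tuple(r[f] for f in fields)] <= max_count]

def relative_size_factor(rows, group_field, value_field):
    """Ratio of the largest to the second-largest value within each group."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[group_field]].append(r[value_field])
    rsf = {}
    for group, values in groups.items():
        if len(values) >= 2:
            top, second = sorted(values, reverse=True)[:2]
            if second > 0:
                rsf[group] = top / second
    return rsf

def linked_groups(rows, id_field, link_fields):
    """Sets of record ids connected, directly or indirectly, by a shared value."""
    parent = {r[id_field]: r[id_field] for r in rows}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    first_seen = {}
    for r in rows:
        for f in link_fields:
            if r[f] in (None, ""):
                continue
            key = (f, r[f])
            if key in first_seen:
                parent[find(r[id_field])] = find(first_seen[key])
            else:
                first_seen[key] = r[id_field]
    groups = defaultdict(set)
    for r in rows:
        groups[find(r[id_field])].add(r[id_field])
    return [g for g in groups.values() if len(g) > 1]

# Hypothetical payment and party records.
payments = [
    {"clerk": "C1", "vendor": "V1", "amount": 100},
    {"clerk": "C1", "vendor": "V1", "amount": 120},
    {"clerk": "C2", "vendor": "V2", "amount": 90},
    {"clerk": "C2", "vendor": "V2", "amount": 95},
    {"clerk": "C1", "vendor": "V2", "amount": 5000},
]
parties = [
    {"id": "V1", "address": "12 LAKE ROAD", "phone": "555-0101"},
    {"id": "E7", "address": "12 LAKE ROAD", "phone": "555-0199"},
    {"id": "V3", "address": "4 HILL STREET", "phone": "555-0199"},
    {"id": "V4", "address": "9 PARK LANE", "phone": "555-0142"},
]

unusual = rare_combinations(payments, ["clerk", "vendor"])
rsf = relative_size_factor(payments, "vendor", "amount")
rings = linked_groups(parties, "id", ["address", "phone"])
```

In this toy data each clerk and each vendor is individually typical, yet the pairing of clerk C1 with vendor V2 occurs only once, which is the situation Data Structure-1 describes. The cut-off values (`max_count`, or what counts as a large relative size factor) are left to the auditor's judgment.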

Step-4: Data Cleaning and Transformation: Data in any form must be cleaned before any transformation and further processing. Unnecessary characters, blank spaces, and superfluous data must be removed. If the data is not cleaned properly, many analysis techniques, such as text mining and matching, will not provide good results. Once cleaning is completed, the data is transformed from one form to another so that it is amenable to data mining. The transformation also includes moving between different levels of information granularity. The forensic auditor’s knowledge helps in deciding on the transformations that lead to the discovery of suspicious transactions.
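A minimal sketch of this step is given below, assuming free-text vendor names, amounts stored as text, and ISO-formatted dates (`YYYY-MM-DD`). The function names and the sample records are hypothetical; real engagements will need cleaning rules chosen by the auditor for the data at hand.

```python
import re
import unicodedata
from collections import defaultdict

def clean_text(value):
    """Normalize a text field so matching is not defeated by case, punctuation or spacing."""
    text = unicodedata.normalize("NFKC", str(value))
    text = re.sub(r"[^\w\s]", " ", text)      # replace punctuation with spaces
    text = re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace
    return text.upper()

def parse_amount(value):
    """Convert text such as ' 1,200.50 ' to a float; return None if it is not numeric."""
    try:
        return float(str(value).replace(",", "").strip())
    except ValueError:
        return None

def monthly_totals(rows):
    """Roll detailed payments up to (vendor, YYYY-MM) totals, a coarser granularity."""
    totals = defaultdict(float)
    for r in rows:
        totals[(r["vendor"], r["date"][:7])] += r["amount"]
    return dict(totals)

# Hypothetical raw records as they might arrive from a source system.
raw = [
    {"vendor": "  Acme   Supplies, Ltd. ", "date": "2011-03-14", "amount": " 1,200.50 "},
    {"vendor": "ACME SUPPLIES LTD", "date": "2011-03-29", "amount": "300"},
    {"vendor": "Beta Traders", "date": "2011-04-02", "amount": "N/A"},
]

cleaned = [
    {"vendor": clean_text(r["vendor"]), "date": r["date"], "amount": parse_amount(r["amount"])}
    for r in raw
]
rejected = [r for r in cleaned if r["amount"] is None]   # set aside for follow-up
usable = [r for r in cleaned if r["amount"] is not None]
totals = monthly_totals(usable)
```

After cleaning, the two spellings of the first vendor collapse to the same name, so their payments aggregate together. Records that cannot be parsed are set aside rather than silently dropped, since unparseable values may themselves deserve the auditor's attention.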

Step-5: Selection of Techniques: There exists a range of techniques for detecting fraudulent transactions. Each technique is based on some assumptions and has limitations on its applicability. The application of a technique in a particular context depends on many factors, including the data structure, number of attributes, size of data, whether the outcome is known, information granularity (e.g. detailed or summarized), and the data type of the attributes (e.g. categorical, nominal, ordinal, interval, ratio). Forensic auditors must know the advantages and disadvantages of each technique and accordingly select appropriate software offering appropriate techniques. For example, Benford’s law is scale invariant and helps when there are no supporting documents to prove the authenticity of the transactions. On the other hand, it cannot be applied to categorical or identifier-type data (e.g. employee IDs, cheque numbers, and telephone numbers), and not all data follow the Benford distribution. The technique is applied to the entire population and to one attribute at a time.
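For reference, Benford’s law states that the leading digit d (1 to 9) occurs with probability log10(1 + 1/d). The sketch below compares observed first-digit frequencies of one numeric attribute with these expected values. The helper names are illustrative, and the powers of two are used only as a stand-in data set known to follow the distribution closely; no acceptance threshold is implied.

```python
import math
from collections import Counter

def benford_expected(d):
    """Expected share of leading digit d under Benford's law."""
    return math.log10(1 + 1 / d)

def first_digit(x):
    """Leading non-zero digit of a non-zero number."""
    return int(f"{abs(x):.15e}"[0])  # scientific notation puts that digit first

def benford_table(values):
    """Map each digit 1..9 to (observed share, expected share); zeros are ignored."""
    digits = [first_digit(v) for v in values if v]
    counts = Counter(digits)
    n = len(digits)
    return {d: (counts[d] / n, benford_expected(d)) for d in range(1, 10)}

def mean_absolute_deviation(table):
    """Average gap between observed and expected shares over the nine digits."""
    return sum(abs(obs - exp) for obs, exp in table.values()) / 9

# Stand-in population: the first 100 powers of two.
amounts = [2 ** n for n in range(100)]
table = benford_table(amounts)

# Rescaling every amount (e.g. a change of currency unit) leaves the fit essentially unchanged.
scaled_table = benford_table([3 * a for a in amounts])

for d, (observed, expected) in table.items():
    print(d, round(observed, 3), round(expected, 3))
```

Consistent with the restrictions noted above, the check is run on the whole population of one attribute at a time, and it is not meaningful for assigned numbers such as employee IDs or cheque numbers.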

Step-6: Forensic Data Analysis and Mining: Over time a wide range of analytical techniques has evolved in the area of data analysis and mining, drawing on statistics, computer science, data mining, and machine learning. The techniques can be explorative (unsupervised) or predictive (supervised), depending on various factors. Online Analytical Processing (OLAP), outlier analysis, cluster analysis, Kohonen neural networks, and association and sequencing techniques are explorative in nature, and their results give insights to the auditors. The predictive techniques include regression, classification, and neural networks, with which the auditor can build a model from past data and then apply it to new data for prediction. Some of these techniques are highly sophisticated, and their results can be very difficult for auditors to interpret. Auditors must know the applicability and interpretability of a technique before applying it. Data analysis and mining are complementary to experience-based investigative analysis.
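The predictive route, building a model from past transactions with a known outcome and applying it to new ones, can be illustrated with a small logistic regression fitted by batch gradient descent. This is a sketch under stated assumptions, not a recommended model: the two features (an amount rescaled to the range 0 to 1 and a weekend-posting flag), the labels, the learning rate, and the number of epochs are all invented for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w

def score(w, x):
    """Estimated probability that transaction x is fraudulent."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

# Hypothetical past transactions: [scaled amount, posted on weekend]; 1 = confirmed fraud.
X_train = [[0.10, 0], [0.20, 0], [0.15, 0], [0.30, 0], [0.25, 1],
           [0.90, 1], [0.85, 1], [0.95, 0], [0.80, 1], [0.20, 0]]
y_train = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]

# Held-out validation transactions with known outcomes.
X_valid = [[0.12, 0], [0.18, 0], [0.88, 1], [0.91, 1]]
y_valid = [0, 0, 1, 1]

weights = train_logistic(X_train, y_train)
predictions = [1 if score(weights, x) >= 0.5 else 0 for x in X_valid]
accuracy = sum(p == t for p, t in zip(predictions, y_valid)) / len(y_valid)

# Scoring new (production) transactions ranks them for the auditor's attention.
new_scores = {"T-101": score(weights, [0.90, 1]), "T-102": score(weights, [0.10, 0])}
```

A model of this kind has the advantage that its weights can be read directly, which bears on the interpretability concern raised above. The scores only rank transactions for follow-up; they do not establish fraud.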

Although many advanced techniques are available in the literature and implemented in software, simple techniques are useful for forensic auditors in many situations. These include sorting the transactions on some attributes, finding the frequency of some attribute(s), searching for particular values or keywords, comparing values of attributes, identifying blanks, and sequencing the transactions. Using these techniques auditors can identify some outliers. For example, in the case of vendor payments, vendor names can be sorted so that look-alike and misspelled names can be identified. OLAP is one of the popular techniques that auditors can use to gain insights.
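Two of these simple techniques are sketched below using only the Python standard library: finding look-alike vendor names with `difflib.SequenceMatcher`, and finding gaps in a numbered sequence such as cheque numbers. The similarity threshold of 0.85 and the sample data are assumptions chosen for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations

def look_alikes(names, threshold=0.85):
    """Pairs of distinct names whose similarity ratio is at or above `threshold`."""
    unique = sorted({n.strip().upper() for n in names})
    pairs = []
    for a, b in combinations(unique, 2):
        if SequenceMatcher(None, a, b).ratio() >= threshold:
            pairs.append((a, b))
    return pairs

def sequence_gaps(numbers):
    """Numbers missing from an otherwise consecutive sequence."""
    present = set(numbers)
    return [n for n in range(min(present), max(present) + 1) if n not in present]

# Hypothetical vendor names and cheque numbers.
vendor_names = ["Acme Supplies", "Acme Suplies", "Beta Traders",
                "Gamma Logistics", "acme supplies"]
cheque_numbers = [1001, 1002, 1004, 1005, 1008]

suspect_pairs = look_alikes(vendor_names)
missing_cheques = sequence_gaps(cheque_numbers)
print(suspect_pairs)
print(missing_cheques)
```

Comparing every pair of names grows quadratically with the number of vendors, so for a large vendor master the names would typically be sorted or grouped first and compared only within neighbouring groups.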

Step-7: Experience-based Confirmation: Once suspicious transactions are identified through data analysis and mining, they must be further investigated and explained using the auditors’ traditional investigative techniques, and they must be confirmed before any conclusion is drawn. The auditor may gather evidence, compare different sources of evidence, conduct interviews, and inspect the documents relating to the suspicious transactions. The investigation helps auditors find the causes of anomalies or deviations. The greater the auditor’s knowledge of the processes and internal controls of the organization, the better the result. The learning from this phase can be captured in some way (e.g. in a database or through case-based reasoning) and used again in the data analysis and mining phase to narrow down the search space and to learn about frauds and their extent.
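One possible way to capture this learning is a small case base that records, for each investigated transaction, the red flag that raised it and whether fraud was confirmed. The sketch below uses an in-memory SQLite database; the table layout, function names, and the choice to rank new alerts by the past hit rate of their red flag are assumptions for illustration, not part of the proposed framework.

```python
import sqlite3

def open_case_base():
    """Create an in-memory case base; a file path could be used for persistence."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cases (txn_id TEXT, red_flag TEXT, confirmed INTEGER)")
    return conn

def record_outcome(conn, txn_id, red_flag, confirmed):
    """Store the result of an investigation (confirmed fraud or not)."""
    conn.execute("INSERT INTO cases VALUES (?, ?, ?)", (txn_id, red_flag, int(confirmed)))

def red_flag_hit_rates(conn):
    """Share of investigated cases per red flag that were confirmed, best first."""
    return conn.execute(
        "SELECT red_flag, AVG(confirmed), COUNT(*) FROM cases "
        "GROUP BY red_flag ORDER BY AVG(confirmed) DESC, red_flag"
    ).fetchall()

def prioritize(conn, flagged):
    """Order new (txn_id, red_flag) alerts by the past hit rate of their red flag."""
    rates = {flag: rate for flag, rate, _ in red_flag_hit_rates(conn)}
    # Red flags with no history get 0.0 here; an auditor may prefer another default.
    return sorted(flagged, key=lambda item: rates.get(item[1], 0.0), reverse=True)

# Hypothetical outcomes of earlier investigations.
conn = open_case_base()
for txn_id, flag, confirmed in [
    ("T1", "shared_address", True),
    ("T2", "shared_address", True),
    ("T3", "shared_address", False),
    ("T4", "round_amount", False),
    ("T5", "round_amount", False),
]:
    record_outcome(conn, txn_id, flag, confirmed)

rates = red_flag_hit_rates(conn)
queue = prioritize(conn, [("T9", "round_amount"), ("T10", "shared_address")])
```

Ranking alerts this way narrows the search space toward indicators that have proved informative, while the decision on each transaction remains with the auditor.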

IV. CONCLUSION

The proposed framework provides a systematic process for auditors to discover internal financial frauds. Auditors can use their own experience and investigative skills and integrate them with the tools and techniques available in different software. The suggested data structures of fraudulent transactions assist auditors in preparing the data for the application of various techniques using software. The framework helps auditors in establishing the context, sourcing appropriate data, preparing the data in the desired data structure, cleaning and transforming it into an amenable format, selecting the technique, analyzing and mining the data, and finally confirming whether or not there is a possibility of fraud. The process-based approach suggested in this paper considers both the forensic auditor’s tacit knowledge base and computer-based data analysis and mining techniques. Future work may include implementation of the proposed framework for real-time applications.

REFERENCES

[1] Bologna, Jack and Robert J. Lindquist. Fraud Auditing and Forensic Accounting. New York: John Wiley & Sons, 1987.

[2] J. Han, M. Kamber, Data Mining: Concepts and Techniques, Second edition, Morgan Kaufmann Publishers, 2006, pp. 285–464.

[3] E. Kirkos, C. Spathis, Y. Manolopoulos, Data mining techniques for the detection of fraudulent financial statements, Expert Systems with Applications 32 (4) (2007) 995–1003.

[4] C. Phua, V. Lee, K. Smith, R. Gayler, A comprehensive survey of data mining-based fraud detection research, Artificial Intelligence Review (2005) 1–14.

[5] Silverstone, Howard and Michael Sheetz. Forensic Accounting and Fraud Investigation for Non-Experts. Hoboken, John Wiley & Sons, 2004.

[6] Thomas B, Leslee H, Debra S. (2010). A Fraud audit: Do you need one? Journal of Applied Business Research, 26, 5. 29-33.

[7] Albrecht, W. S., Albrecht, C., & Albrecht, C. C. (2008). Current Trends in Fraud and its Detection. Information Security Journal: A Global Perspective, 17(1).

[8] Lamoreaux, M. (2007). Internal Auditor Used Computer Tool to Detect WorldCom Fraud. Journal of Accountancy, 35.

[9] Albrecht, W. S., & Albrecht, C. C. (2002). Root Out Financial Deception. Journal of Accountancy, 30-33.

[10] Agyemang, M., Barker, K., & Alhajj, R. (2006). A comprehensive survey of numeric and symbolic outlier mining techniques. Intelligent Data Analysis, 10, 521-538.

[11] Kou, Y., Lu, C., & Sirwongwattana, S. (2004). Survey of Fraud Detection Techniques. In 2004 International Conference on Networking, Sensing, and Control, 749-754.

[12] Lehman, M. W. (2008). Join the Hunt. Journal of Accountancy, 46-49.

[13] Procedure, S. (2008). Retrieved April 02, 2011 from http://www.securityprocedure.com/downloadpicaloopensourcealternativeaclaudit.

[14] Stephen B, Krishna K, Markus G. Anderle, Rohit K, David M. (2006). Large Scale Detection of Irregularities in Accounting Data, Proceedings of the Sixth International Conference on Data Mining.

[15] Buoni A. (2010). Fraud Detection: From Basic Techniques to a Multi-Agent Approach, Management and Service Science (MASS), International Conference.

[16] Mieke J., Nadine L., Koen V. (2009). A Framework for Internal Fraud Risk Reduction at IT Integrating Business Processes: The IFR² Framework, The International Journal of Digital Accounting Research, 9, 1-29.

[17] Qian L., Tong L., Wei X. (2009). A subjective and objective integrated method for fraud detection in financial systems, Proceedings of the Eighth International Conference on Machine Learning and Cybernetics, Baoding.
