wednesday 110 using digital forensics to investigate...
TRANSCRIPT
9/19/2017
1
Using Digital Forensics to Identify & Investigate Fraud
Presenter:
Damon Hacker, MBA, CCE, CISA, President, Vestige Digital Investigations
Ves ● tige (véŝ tĭj) n. 1. A visible trace, evidence or sign of something that has once existed but now no longer exists or appears.
Overview
• Open Your Eyes
• Real‐World Scenarios
• What Computer Forensics Can and Cannot Do
• Forensic Techniques Overview/Primer
• What You Need to Know
• Q&A
Xerox DocuColor 12 page, magnified 10x and photographed by the QX5 microscope under illumination from a Photon blue LED flashlight
Scenarios
• Real‐World Examples
• Vestige Involved
• Some Information is in Public Domain, Most is Not.
• Information Changed to Protect the identities of those involved.
• May be a compilation of more than one case to make a specific point or further protect identities.
Scenario 1
• The Scene:
• Wrongful Termination lawsuit
• Sending of “over‐the‐top” e‐mails
• Two places at once?
I.T.’s Findings
• Tracked IP Address
Deposition ends: 11:15am in Detroit
How can this be?
I.T.’s Findings
• Ventura, California
Scenario 2
• Tail‐end of litigation
• Plaintiff wins matter and is awarded attorneys’ fees
• Disparate amounts spent between plaintiff & defendant
• Defendant’s counsel believes Plaintiff’s counsel has “stuffed” time entries
Scenario 2
• Review of Time & Billing Software
• No apparent manipulation
• Chronology looked appropriate
• Defendant’s analysis concluded no manipulation.
Scenario 2
• Vestige Analysis
• Review of database, behind‐the‐scenes
• Time & Billing software uses off‐the‐shelf back‐end database, albeit not a common one
• Vestige tools to review data at database level
• Vestige created “parsing” utility to extract and review deleted records
Scenario 2
• Analysis revealed approximately 40% overbilling
• “Stuffing” time entries sequentially at end of case
• Replacement entries that were 2x‐10x the amount of the deleted entries they replaced
• Adjustment and sanctions
Scenario 3
• $30 Million shortage in commodities
• $3 Billion company
• 3000+ employees
• Hundreds of thousands of financial transactions
• No initial persons of interest
Hypothesis:
Internal controls would require collusion to pull off fraud. Individuals ought to be communicating with one another.
Scenario 3
1. Take backup tape from email system last year
2. Take backup tape from email system last month
3. Index every word and its frequency per user
4. Import word index and frequency count per user into Excel
5. Using Excel, calculate median frequency per user for each word
6. Identify words having frequency per user far greater than median
• Led to finding “do you want cheese with that?” at more than 2000% above the median frequency for 3 individuals
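The six steps above can be sketched outside Excel. This is a minimal illustration, not the tooling used in the case: the function names, the 20x ratio (standing in for "2000% greater than median") and the single-word tokenizer are all assumptions, and a phrase such as the one found here would need multi-word indexing.

```python
import re
from collections import Counter
from statistics import median

def word_counts_per_user(messages):
    """messages: iterable of (user, text) pairs -> {user: Counter of words}."""
    counts = {}
    for user, text in messages:
        words = re.findall(r"[a-z']+", text.lower())
        counts.setdefault(user, Counter()).update(words)
    return counts

def flag_outliers(counts, ratio=20.0):
    """Return (user, word, count, median) rows where a user's count for a
    word exceeds `ratio` times the median count across all users."""
    users = list(counts)
    vocabulary = set().union(*counts.values())
    flagged = []
    for word in sorted(vocabulary):
        per_user = [counts[u][word] for u in users]
        med = median(per_user)
        # Assumed floor of 1 so that rare words (median 0) are still comparable.
        baseline = max(med, 1)
        for u in users:
            if counts[u][word] > ratio * baseline:
                flagged.append((u, word, counts[u][word], med))
    return flagged
```

Running the same index against last year's tape and last month's tape would give two sets of counts to compare.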
Scenario 3
• Requests for financial statements accompanied by “Do you want cheese with that?”
• Innocuous‐sounding word/phrase
• Not in selection set for typical keyword search
Scenario 4
• Stolen Laptops
• Law Enforcement’s involvement
• Stupidity at its finest
Scenario 5
• Wage/Hour Class Action
• Non‐Exempt classified as Exempt
• Timeframe stretches back 3‐4 years
By the Numbers
Fraud Statistics
Typical Fraud
• Typical organization loses 5% of annual revenues to various frauds
• $6.3 TRILLION issue worldwide
• Median loss $150,000
• In 94.5% of cases in study, perpetrator took efforts to conceal the fraud!
Source: 2016 Report to the Nations on Occupational Fraud and Abuse. The Association of Certified Fraud Examiners
Detecting Fraud
• Time to detection – median is 18 months
• How Fraud was Detected:
• Tip – 39.1%
• Internal Audit – 16.5%
• Management Review – 13.4%
• By Accident – 5.6%
• Account Reconciliation – 5.5%
• External Audit – 3.8%
Controls
• Strong linkage between anti‐fraud controls:
• Significant decrease in cost
• Decrease in duration of time‐to‐detection
Perpetrator
• 94.5% are first‐time offenders
• Clean employment history
• No criminal background
• In 79% of cases, perpetrator exhibited “red flag” behavior
• Living beyond means
• Financial Difficulties
• Unusually close association with vendors/clients
• Excessive control issues/wheeler‐dealer attitude
• Recent divorce / family problems
Why People Commit Fraud
• The Psychology of Fraud
Rationalization
Intersection of Technology & Fraud
• Opportunity
• Majority of financial transactions are Technology‐linked
• Less tangible – belief it’s harder to get caught
• Availability of software
• Personal accounting software
• Document alteration
• Access to information
• Research on techniques & cover‐up
What you can expect
• Content
• Keyword search for content/communication
• ALL correspondence
• Hidden information
• Deleted information
• Orphaned information
• Encrypted information
• Correspondence
• Memos
• Emails
• Instant messages
• Faxes
• Deleted
• Old and forgotten
• Business Records
• Financial data
• Assets
• Calculations
• PRIOR DRAFTS
• DELETED DRAFTS
• Projections
• Everything you could imagine
• Every Website visited
• All pictures from those websites
• Every Website from popups and popunders
• All maps, from Mapquest for example
Every INTERNET SEARCH & the Search Results
What you can expect
• Conceptual Analysis
• How the computer was used
• IM activity – dates/times, frequency, who
• E‐mails – activity
• Web‐based E‐mails
• Deletion activity
• Wiping activity
• Software installed
• File Transfers
• CD/DVD burning
• Attached hardware
• Other networks attached
• Remote Access activity
• Do we have the “Right” system?
What you can expect
• Condition of evidence
• Used by others
• Formatted
• Re‐partitioned
• Damaged
• Wiped/Cleansed/Sanitized
What Digital Forensics Can’t Do
• Find evidence that isn’t there
• Never was on this evidence
• May have been on this evidence but was overwritten
• Wrong Interpretations
• Artifact analysis
• Example: Defragmenting
• “Who was at the keyboard?” Some analysis will allow the answer to be inferred.
Locard’s Exchange Principle
“In forensic science, Locard’s principle holds that the perpetrator of a crime will bring something into the crime scene and leave with something from it –both of which can be used as forensic evidence.”
Sources vs Documents
• Identify Appropriate “Key” Devices
• Key‐players
• Expanded Key‐players
• Administrative Assistants
• Other likely correspondents, etc.
• Observing Devices
• Monitors, surveillance
• Pass‐Thru Devices
• Routers, firewalls, servers, monitoring systems
• Passive Devices
• e.g., conveyor
Evidence Volatility
• Rate at which evidence disappears
Registers, Cache
Memory, Routing Tables, Process Tables
Temporary Files
Disk & Other “permanent” storage
Logging & Monitoring Data
Archives
Potential Sources
ISP
Router
Firewall
IDS/IPS
Managed Switches
Servers
Workstations
Other monitoring devices (alarm system)
Log files
GPS
Cell Tower Data
Syslog
Honeypot
Virtual Machines
Cloud Service
General Network Sniffers
Backup tapes/disks
Replication sites
Disaster Recovery sites
Digital Scale & other Measuring Devices
RFID Data
“Black Boxes”
Video Surveillance
Payment or other Registration Info
Acquisition
• First & Foremost: Evidence Preservation
• Admissibility in Court
• Protection of All Parties Involved…even the investigator
• Avoid Contamination/Spoliation of Evidence
Acquisition
• Completeness
• “The Whole Truth”
• Used & Unused (Unallocated) Space
• Active & Inactive Systems
• Seemingly “Inaccessible” Systems & Media
Acquisition
• Methodology
• Forensically‐sound Bit‐for‐Bit Clone
• Copy, clone, mirror
• Write‐protect
• Place on Sterile Media
• MD5 or other authentication hash
• Chain of Custody
• Seal Evidence
Authentication
• Authenticate:
• Prove “no change”
• Prove Clones ARE the Same
• Method
• MD5 Hash (digital fingerprint)
• Industry‐standard, industry‐recognized
• 128‐bit
• 1 in 1 x 10^38 chance for deceiving
• 1 in 100,000,000,000,000,000,000,000,000,000,000,000,000
• DNA Evidence is 1 in 1,000,000,000
Authenticate
• Our Methodology
• MD5 Hash – Digital Fingerprint
MD5: 702865f9ebd7478fbab050ed6b4612f0
Authenticate
• Prove copies (working) are the same
Original: 702865f9ebd7478fbab050ed6b4612f0
Copy: 702865f9ebd7478fbab050ed6b4612f0
Authenticate
• Prove nothing has changed
702865f9ebd7478fbab050ed6b4612f0 702865f9ebd7478fbab050ed6b4612f0
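The matching hash values on these slides can be reproduced with a few lines of code. This is a minimal sketch using Python's standard hashlib; the function names and file paths are hypothetical, and the algorithm parameter reflects the "MD5 or other authentication hash" wording (for example "sha256").

```python
import hashlib

def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Hash a file or disk image in chunks so large evidence files fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def images_match(original_path, clone_path, algorithm="md5"):
    """True when the clone is bit-for-bit identical to the original."""
    return file_digest(original_path, algorithm) == file_digest(clone_path, algorithm)
```

Recomputing the digest later and comparing it with the value recorded at acquisition is the "prove nothing has changed" step.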
Analysis
• Leave No Stone Unturned
• Used (active) space
• Unused (inactive/unallocated) space
• Slack space
• Deleted – partially, separation of metadata and content
• Artifacts
• Printed documents
• E‐mail / IM / chat sessions
• Internet History
Analysis
• Hiding activity
• Encryption
• Mismatched document types
• Steganography
• Date Analysis
• MAC Dates
• Metadata
Analysis
• Installed Software
• Use of software
• Installed
• Permissions
• First Use
• Last Use
• Registration
• Settings
• Number of times used / Frequency of use
• Removed Software
Analysis
• Hardware
• Installed hardware
• Removed hardware
Analysis
• Nefarious Activities
• Wiping
• Encryption
• Booby Traps
• Analysis of “Trojan Defense”
Beyond Sampling… Advanced Analytics to Find the Needle in the Haystack
Virtualization
• Run the application as intended
• Sometimes the only solution
• Allows for reverse‐engineering
• Testing of artifacts
Behind‐the‐Scenes
• But…
• The most valuable information comes from what we can gain access to apart from the constraints of the application and its interface.
Structure of Accounting Systems
• Front‐end Application
• Back‐end Data
• Standard Database
• Proprietary Database
CAATTs
• Computer Assisted Audit Tools & Techniques
• Use of computers to automate process
• Specialized software tools
• ACL, IDEA, Picalo
• General data manipulation software tools
• Excel, Access, SQL db, SAS
Statistical Approach
Anomalies
• Duplicates
• Threshold Analysis
• Outliers
Statistical Approach
• General Fingerprinting
• Min / Max
• Mean / Median
• Standard Deviation / Distribution
• Three Tools
• Data Profile
• Data Histogram
• Periodic Graph
Statistical Approach
• 3 Tools – What do they tell us?
• Data Completeness
• Negative (contra) amounts
• High proportion of high‐value / low‐value transactions
• Presence of Zero transactions
• Detect possible deviances
Example
• Simple
• AP Check Register
• Data Profiling
• Missing checks
• Duplicate checks
• Threshold analysis
• Outlier analysis
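The AP check register example above might look like the following in code. This is a rough sketch under stated assumptions: the register is a list of (check number, amount) pairs, "threshold analysis" is read as amounts within an assumed 10% band below an approval threshold, and outliers are flagged by a simple z-score whose cutoff is a parameter. None of these choices come from the presentation.

```python
from collections import Counter
from statistics import mean, median, pstdev

def profile_check_register(register, threshold, band=0.10, z_cutoff=3.0):
    """register: list of (check_number, amount) pairs."""
    numbers = [number for number, _ in register]
    amounts = [amount for _, amount in register]
    avg = mean(amounts)
    spread = pstdev(amounts)
    return {
        # Data profile: the basic fingerprint of the amounts.
        "profile": {"count": len(amounts), "min": min(amounts),
                    "max": max(amounts), "mean": avg, "median": median(amounts)},
        # Missing checks: gaps in the check-number sequence.
        "missing": sorted(set(range(min(numbers), max(numbers) + 1)) - set(numbers)),
        # Duplicate checks: check numbers that appear more than once.
        "duplicates": sorted(n for n, c in Counter(numbers).items() if c > 1),
        # Threshold analysis: amounts sitting just under the approval limit.
        "near_threshold": [(n, a) for n, a in register
                           if threshold * (1 - band) <= a < threshold],
        # Outlier analysis: amounts far from the mean in standard deviations.
        "outliers": [(n, a) for n, a in register
                     if spread > 0 and abs(a - avg) / spread > z_cutoff],
    }
```

On small registers a z-score cutoff of 3 can never trigger, so the cutoff (or a median-based measure) would need tuning to the data volume.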
Traditional Approach vs. Analytics
• Analytics
• Relating Disparate Data
• Examples:
• Vendor EIN to HR database of employees’ SSNs
• Vendor addresses to employees’ addresses
• Related Disparate Sources
• Examples:
• EINs against known invalid SSNs (e.g., IRS’ list of invalid SSNs, deceased individuals)
Traditional Approach vs. Analytics
• Analytics
• Pattern Recognition
• Transactions just under signing threshold
• Transactions processed on Sat/Sun
• Transactions processed after H:MM
• Number of transactions processed
• Large transactions to new vendor
• Analytics
• Benford’s Law
• TF/IDF
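Benford's Law says the leading digit d of many naturally occurring amounts appears with probability log10(1 + 1/d), so 1 leads about 30% of the time and 9 under 5%. A minimal sketch of comparing a set of transaction amounts against that expectation follows; the function names are illustrative only.

```python
import math
from collections import Counter

def benford_expected(digit):
    """Expected share of amounts whose leading digit is `digit` (1-9)."""
    return math.log10(1 + 1 / digit)

def first_digit(amount):
    """Leading non-zero digit, read from scientific notation."""
    return int(f"{abs(amount):.15e}"[0])

def benford_table(amounts):
    """Map each digit 1-9 to (observed share, expected share). Zeros are skipped."""
    digits = [first_digit(a) for a in amounts if a != 0]
    counts = Counter(digits)
    n = len(digits)
    return {d: (counts[d] / n if n else 0.0, benford_expected(d))
            for d in range(1, 10)}
```

Digits whose observed share is well above the expected share point to where a closer look at the underlying transactions may be worthwhile.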
Working with Raw Data
Analysis
• Data not available through Interface
• Processing data
• Autoincrement / Index fields
• Additional Audit Trail information
• Check Register vs Check Images
Analysis
• Deleted Database Records
• Consistent amongst most databases
• Until “packed” or “compressed”
• Recoverable with the right tools & know‐how
• Some applications (especially financial) may maintain multiple audit/change logs
Analysis
• Simple Linking
• Disbursement to PO Amount Comparison
• Link 2 data sources
Analysis
• More Complex
• Financial System – Vendor Addresses
• HRIS – Employee Addresses
• Examples
• Vendors with same TIN as EE SSN
• Vendors with addresses same as EE
• Multiple vendors from same address
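The three example tests above amount to joining the vendor master against the HR data on normalized keys. A minimal sketch, assuming each record is a dict with hypothetical field names (name, tin or ssn, address) and that light normalization (digits only for IDs, lowercase without punctuation for addresses) is enough to match:

```python
import re

def normalize_id(value):
    """Keep digits only so 12-3456789 and 123-45-6789 compare equal."""
    return re.sub(r"\D", "", value)

def normalize_address(value):
    """Lowercase, drop punctuation, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", value.lower())
    return " ".join(cleaned.split())

def link_vendors_to_employees(vendors, employees):
    """Return (vendor, employee, reason) for shared TIN/SSN or shared address."""
    by_ssn = {normalize_id(e["ssn"]): e["name"] for e in employees}
    by_address = {normalize_address(e["address"]): e["name"] for e in employees}
    hits = []
    for v in vendors:
        tin = normalize_id(v["tin"])
        address = normalize_address(v["address"])
        if tin in by_ssn:
            hits.append((v["name"], by_ssn[tin], "tin"))
        if address in by_address:
            hits.append((v["name"], by_address[address], "address"))
    return hits

def vendors_sharing_address(vendors):
    """Return {address: [vendor names]} for addresses used by more than one vendor."""
    groups = {}
    for v in vendors:
        groups.setdefault(normalize_address(v["address"]), []).append(v["name"])
    return {a: names for a, names in groups.items() if len(names) > 1}
```

Real address data usually needs stronger normalization (abbreviations, suite numbers) than this sketch applies.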
Social Network Analysis
• Relationships
• Reciprocal Relationships
• Weighted Relationship
• How Much
• How Frequent
• Closeness
• Density
• Connectivity
Social Network Analysis
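A few of the measures listed above can be computed directly from message metadata. The sketch below assumes the input is a list of (sender, recipient) pairs, for example extracted from e-mail headers, and treats "how frequent" as the number of messages on each directed link. It is an illustration in plain Python, not the output of Pajek or any other tool named in this presentation.

```python
from collections import Counter

def build_weights(messages):
    """messages: iterable of (sender, recipient) -> message count per directed link."""
    return Counter((s, r) for s, r in messages if s != r)

def reciprocal_pairs(weights):
    """Pairs of people who have each written to the other."""
    return sorted({tuple(sorted((s, r))) for (s, r) in weights if (r, s) in weights})

def density(weights):
    """Share of possible directed links that actually occur."""
    nodes = {person for link in weights for person in link}
    possible = len(nodes) * (len(nodes) - 1)
    return len(weights) / possible if possible else 0.0
```

Heavily weighted reciprocal pairs among people with no obvious business reason to correspond are the kind of relationship this analysis is meant to surface.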
Finding the Needle in the Haystack
• Oftentimes there’s a “feeling” that something is amiss
• No hard evidence
• Actors are unknown
• Schemes are unknown
• Scope of problem is unknown
Finding the Needle in the Haystack
• Keywords?
• Just a guess
• Bad keywords
• Fraud, embezzlement, steal
• “Follow the cash”
• Will work – but is tedious
• Obscurity through Complexity
What about Contextual Analysis?
• Introducing the TF/IDF
• Term Frequency
• Inverse Document Frequency
• Document can be substituted with custodian
TF/IDF – Information Retrieval Theory
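With custodians substituted for documents, TF/IDF scores a word highly when one custodian uses it often and few other custodians use it at all. The sketch below uses one common variant (term count divided by the custodian's total words, times the natural log of custodians divided by custodians using the word); the function name and tokenizer are assumptions, and other weightings exist.

```python
import math
import re
from collections import Counter

def tf_idf_by_custodian(corpus):
    """corpus: {custodian: all of that custodian's text} -> {custodian: {word: score}}."""
    counts = {c: Counter(re.findall(r"[a-z']+", text.lower()))
              for c, text in corpus.items()}
    n_custodians = len(counts)
    # Document frequency: how many custodians use each word at least once.
    df = Counter()
    for word_counts in counts.values():
        df.update(word_counts.keys())
    scores = {}
    for custodian, word_counts in counts.items():
        total = sum(word_counts.values())
        scores[custodian] = {
            word: (k / total) * math.log(n_custodians / df[word])
            for word, k in word_counts.items()
        }
    return scores
```

Words every custodian uses score zero, which is why everyday business vocabulary drops out and an unusual shared phrase stands out.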
Analysis
• Analysis Tools
• Benford’s Analysis
• Pajek: Social Network Analysis
• Statistical Analysis: Excel, Access, MySQL
Summary
• Data can exist as Active and Deleted
• Content vs Artifact
• Fraud is a huge issue and increasingly problematic
• Scope is difficult to ascertain initially
• Traditional approaches lack ability to quickly and effectively pinpoint issues
• Statistical analysis can greatly assist in this area
Closing Thoughts
Q: What to do when faced with Electronic Data?
A: Involve a Digital Forensic Expert early
• Ability to plan discovery strategy
• Ensure admissibility
• Work closely with client
And the # 1 reason…
If you think hiring an Expert is expensive…
wait until you’ve hired an amateur!
Damon S. Hacker, MBA, CCE, CISA
Cleveland | Columbus | Pittsburgh | National Coverage
800‐314‐4357
www.VestigeLtd.com
Vestige Digital Investigations
Electronic Evidence Experts
We Turn Digital Evidence into Intelligence™