
Learning Sampling in Financial Statement Audits using Vector Quantised Variational Autoencoder Neural Networks

Marco Schreyer, University of St.Gallen, St.Gallen, Switzerland, [email protected]
Timur Sattarov, Deutsche Bundesbank, Frankfurt am Main, Germany, [email protected]
Anita Gierbl, University of St.Gallen, St.Gallen, Switzerland, [email protected]
Bernd Reimer, PricewaterhouseCoopers GmbH WPG, Stuttgart, Germany, [email protected]
Damian Borth, University of St.Gallen, St.Gallen, Switzerland, [email protected]

ABSTRACT
The audit of financial statements is designed to collect reasonable assurance that an issued statement is free from material misstatement ('true and fair presentation'). International audit standards require the assessment of a statement's underlying accounting-relevant transactions, referred to as 'journal entries', to detect potential misstatements. To efficiently audit the increasing quantities of such journal entries, auditors regularly conduct an 'audit sampling', i.e. a sample-based assessment of a subset of these journal entries. However, the task of audit sampling is often conducted early in the overall audit process, where the auditor might not be aware of all generative factors and their dynamics that resulted in the journal entries in scope of the audit. To overcome this challenge, we propose the use of Vector Quantised-Variational Autoencoder (VQ-VAE) neural networks to learn a representation of journal entries able to provide a comprehensive 'audit sampling' to the auditor. We demonstrate, based on two real-world city payment datasets, that such artificial neural networks are capable of learning a quantised representation of accounting data. We show that the learned quantisation (i) uncovers the latent factors of variation and (ii) can be utilised as a highly representative audit sample in financial statement audits.

CCS CONCEPTS
• Computing methodologies → Machine learning; Unsupervised learning; Dimensionality reduction and manifold learning; • Information systems → Enterprise resource planning.

KEYWORDS
deep learning, audit, autoencoder neural networks, vector quantisation, audit sampling, computer assisted audit, audit data analytics, accounting information systems

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
ICAIF '20, October 15–16, 2020, New York, NY, USA
© 2020 Association for Computing Machinery.
ACM ISBN 978-1-4503-7584-9/20/10. . . $15.00

ACM Reference Format:
Marco Schreyer, Timur Sattarov, Anita Gierbl, Bernd Reimer, and Damian Borth. 2020. Learning Sampling in Financial Statement Audits using Vector Quantised Variational Autoencoder Neural Networks. In Proceedings of the First ACM International Conference on AI in Finance (ICAIF '20), October 15–16, 2020, New York, NY, USA. ACM, New York, NY, USA, 8 pages.

1 INTRODUCTION
The trustworthiness of financial statements plays a fundamental role [2] in today's economic decision making by investors. The audit of such statements, conducted by external auditors, is designed to collect reasonable assurance that an issued financial statement is free from material misstatement ('true and fair presentation') [1, 5]. International audit standards require an assessment of the statement's underlying accounting-relevant journal entries to detect a potential misstatement [3]. Journal entries debit and credit the separate accounting ledgers of a financial statement evident in its balance sheet and profit and loss statement. Nowadays, organizations collect vast quantities of such journal entries in Accounting Information Systems (AIS) or, more generally, Enterprise Resource Planning (ERP) systems [19]. Figure 1 depicts a hierarchical view of the journal entry recording process in designated database tables of an ERP system.

To efficiently audit the increasing quantities of journal entries, auditors regularly conduct a sample-based assessment referred to as audit sampling. Formally, audit sampling is defined as the 'selection and evaluation of less than 100% of the entire population of journal entries' [6]. While sampling increases the efficiency of a financial audit, it also increases its sampling risk. The term sampling risk denotes the likelihood that the auditor's conclusion based on auditing a subset of entries may differ from the conclusion of auditing the entire population [21]. Auditors are required to select a representative sample from the population of journal entries [4] to mitigate sampling risks. The selection of such a representative sample is determined by the design of the applied sampling technique. International audit standards require auditors to build their audit approach on either (i) non-statistical or (ii) statistical sampling techniques. Non-statistical or 'judgemental' sampling denotes the selection of audit samples based on the auditor's experience, inherent risk assessment, and 'professional judgment'. Statistical sampling, in contrast, aims to provide greater sampling objectivity.


ICAIF ’20, October 15–16, 2020, New York, NY, USA Schreyer and Sattarov, et al.

Figure 1: Hierarchical view of an Accounting Information System (AIS) that records distinct layers of abstraction, namely (1) the business process, (2) the accounting, and (3) the technical journal entry information in designated tables. [The original figure depicts an incoming invoice and outgoing payment of $1,000 flowing through Expenses, Liabilities, and Bank ledgers, together with excerpts of the 'Journal Entry Headers' and 'Journal Entry Segments' database tables.]

It refers to the random selection of audit samples and the determination of probabilities to evaluate the sampling results [4]. To conduct statistical sampling and determine a representative sample size, auditors nowadays use pre-calculated probability tables or sampling functions available in computer-assisted audit software [44], such as ACL¹, IDEA², or proprietary applications.
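The pre-calculated probability tables mentioned above can be approximated in a few lines of code. The sketch below uses the standard textbook sample-size formula for estimating a proportion, n = z²·p·(1−p)/e²; it is an illustration only, not the exact routine implemented in ACL or IDEA, and the parameter values are hypothetical.

```python
import math

def attribute_sample_size(z: float, expected_rate: float, tolerable_error: float) -> int:
    """Sample size for estimating a proportion: n = z^2 * p * (1-p) / e^2, rounded up."""
    n = (z ** 2) * expected_rate * (1.0 - expected_rate) / (tolerable_error ** 2)
    return math.ceil(n)

# 95% confidence (z ≈ 1.96), 5% expected deviation rate, 2% tolerable error
print(attribute_sample_size(1.96, 0.05, 0.02))  # 457
```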

During an annual audit, the task of audit sampling is regularly conducted early in the audit process. Thereby, auditors need to decide on sensitive sampling parameters before conducting the sampling, e.g. materiality levels, confidence intervals, and acceptable risk thresholds [26], even though the auditor might be unaware of all latent factors that generated the journal entries in scope of the audit, e.g. the underlying business processes and workflows. This challenge is particularly evident in the context of new audit engagements, in which auditors are mandated to audit an organisation's financial statement for the first time. However, it is also of relevance in mature audits, especially in scenarios where the generative factors of an organisation vary dynamically over time [38], e.g. due to a merger or carve-out that results in organisational process and workflow changes.

Driven by the rapid technological advances of artificial intelligence in recent years, techniques based on deep learning [31] have emerged in the field of finance [13, 47], and in particular financial statement audits [10, 41–43]. These developments raise the question: Can such techniques also be utilised to learn representative audit samples? And if so, can the samples be drawn in a way that they are interpretable by a human auditor? In this work, we propose the application of Vector Quantised-Variational Autoencoder (VQ-VAE) neural networks [46] to address those questions. Inspired by the idea of discrete neural dimensionality reduction, we demonstrate how VQ-VAE neural networks can be trained to learn (i) representative and (ii) human-interpretable audit samples. In summary, we present the following contributions:

¹ https://www.wegalvanize.com
² https://idea.caseware.com

• We demonstrate that VQ-VAEs can be utilised to learn a low-dimensional representation of journal entries recorded in AIS systems that disentangles the entries' latent generative factors of variation;

• We show that the VQ-VAE training can also be regularised to learn a set of discrete embeddings that quantise the representation's generative factors and therefore constitute a representative audit sample;

• We illustrate that the learned quantisation reflects the entries' generative accounting processes and provides a starting point for human-interpretable audit sampling and downstream substantive audit procedures.

The remainder of this work is structured as follows: In section 2, we provide an overview of related work. Section 3 follows with a description of the VQ-VAE architecture and presents the proposed methodology to learn a representative sample from vast quantities of journal entries. The experimental setup and results are outlined in section 4 and section 5. In section 6, the paper concludes with a summary of the current work and future research directions.

2 RELATED WORK
Due to its high relevance for the financial audit practice, the task of audit sampling triggered a sizable body of research by academia [7, 15] and practitioners [20, 25]. In the realm of this work, we focus our literature review on the two main classes of statistical sampling techniques, namely (i) attribute sampling and (ii) variables sampling, used nowadays [23] in auditing journal entries:

2.1 Attribute Sampling Techniques
The technique of attribute sampling is applied by auditors to estimate the percentage of journal entries that possess a specific attribute or characteristic. The different techniques of attribute sampling encompass the following classes [21]:

• Random sampling, in which each journal entry of the population has an equal chance of being selected.

• Systematic or sequential sampling, in which the sampling starts by selecting an entry at random and then every n-th journal entry of an ordered sampling frame is selected.

• Proportional, block or stratified sampling, in which the population is sub-divided into homogeneous groups of journal entries to be sampled from [33].

• Haphazard sampling, in which no explicit or structured selection strategy is employed by the auditor. The sampling is conducted without a specific reason for including or excluding journal entries [22].

In general, attribute sampling is utilised by auditors to test the effectiveness of internal controls, e.g. the percentage of vendor invoices that are approved in compliance with the organisation's internal controls, e.g. by following a 'four-eye' approval principle.
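To make the attribute sampling classes concrete, the following sketch implements random, systematic, and stratified selection over a toy population of journal entries; haphazard sampling, by definition, has no algorithmic counterpart. The entry fields, group key, and seed are illustrative assumptions, not part of the paper.

```python
import random

def random_sample(entries, n, seed=42):
    # Each journal entry has an equal chance of being selected.
    return random.Random(seed).sample(entries, n)

def systematic_sample(entries, n, seed=42):
    # Select every k-th entry of an ordered frame, starting at a random offset.
    k = len(entries) // n
    start = random.Random(seed).randrange(k)
    return entries[start::k][:n]

def stratified_sample(entries, key, n_per_stratum, seed=42):
    # Sub-divide the population into homogeneous groups and sample each group.
    rng, strata = random.Random(seed), {}
    for e in entries:
        strata.setdefault(key(e), []).append(e)
    return [e for group in strata.values()
            for e in rng.sample(group, min(n_per_stratum, len(group)))]

entries = [{"id": i, "ledger": "4000" if i % 2 else "1000"} for i in range(100)]
print(len(random_sample(entries, 10)))      # 10
print(len(systematic_sample(entries, 10)))  # 10
print(len(stratified_sample(entries, key=lambda e: e["ledger"], n_per_stratum=5)))  # 10
```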

2.2 Variable Sampling Techniques
The technique of variable sampling is applied by auditors to estimate the amount or value of specific journal entry characteristics. The different techniques of variable sampling encompass the following classes [21]:



Figure 2: The VQ-VAE architecture [46], applied to learn audit sampling. During the model's forward pass, an input journal entry x^i is passed through the encoder network q_θ, producing a latent representation z_e. The VQ-VAE quantises z_e using a codebook of embedding vectors e_j. Afterwards, the quantised embedding z_q is passed to the decoder network p_φ to reconstruct the journal entry x^i as faithfully as possible.

• Difference estimation calculates the average difference between audited amounts and recorded amounts to estimate the total audited amount of a population [39].

• Ratio estimation calculates the ratio of audited amounts to recorded amounts to estimate the total dollar amount of the journal entry population [17].

• Mean-per-unit estimation projects the sample average to the total population by multiplying the sample average by the number of items in a journal entry population [35].

• Monetary- or dollar-unit sampling considers each monetary unit as a sampling unit of the journal entry population. As a result, entries that record high posting amounts exhibit a proportionally higher likelihood of being selected [32, 45].

In general, variable sampling is utilised by auditors in the context of substantive testing procedures, e.g. testing the overstatement of accounts receivable.
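Under the assumption of simple numeric inputs, the first three estimation techniques above can be sketched as follows; the function names and the toy figures are illustrative, not taken from the paper.

```python
def mean_per_unit(audited, population_size):
    # Project the sample average onto the number of items in the population.
    return population_size * sum(audited) / len(audited)

def difference_estimate(audited, recorded, population_size, recorded_total):
    # Project the average audited-minus-recorded difference onto the population total.
    avg_diff = sum(a - r for a, r in zip(audited, recorded)) / len(audited)
    return recorded_total + population_size * avg_diff

def ratio_estimate(audited, recorded, recorded_total):
    # Scale the recorded population total by the audited-to-recorded ratio.
    return recorded_total * sum(audited) / sum(recorded)

# Toy sample: 2 of 10 entries audited, recorded population total of $1,000.
audited, recorded = [100.0, 90.0], [100.0, 100.0]
print(mean_per_unit(audited, 10))                        # 950.0
print(difference_estimate(audited, recorded, 10, 1000))  # 950.0
print(ratio_estimate(audited, recorded, 1000))           # 950.0
```

On this toy data all three estimators agree; in practice they differ whenever the audited-to-recorded relationship varies across the sample.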

To the best of our knowledge, this work presents the first deep-learning-inspired approach to learn representative and human-interpretable audit samples from real-world accounting data.

3 METHODOLOGY
In this section, we describe the architectural setup of the VQ-VAE model [46] used to learn representative audit samples. Furthermore, we provide details on the objective function applied to optimise the model parameters.

3.1 Latent Generative Factors
Let X formally be a set of N journal entries x^1, x^2, ..., x^N, where each journal entry x^i consists of M accounting-specific attributes x^i_1, x^i_2, ..., x^i_j, ..., x^i_M. The individual attributes x_j describe the journal entry's details, e.g. the entry's fiscal year, posting type, posting date, amount, and general ledger. Following the theoretical assumptions of unsupervised representation learning [8], we assume that distinct factors of variation generate each entry x^i. The different variational factors are not directly observable and correspond to manifolds in a latent space Z. We hypothesise that each generative latent factor z_i ∈ Z corresponds to a behavioural posting pattern. As a result, it can be uncovered by an unsupervised deep learning algorithm [12, 37].

3.2 VQ-VAE Model
It is often intractable to directly calculate the exact posterior distribution q(z) over the latent generative factors in Z. In [30] Kingma and Welling proposed the Variational Autoencoder (VAE) to learn an approximation of the intractable posterior distribution q(z|x) given the input data X. The VAE's encoder network q_θ learns to parameterise a continuous approximate posterior q(z|x) over the latent factors Z. In parallel, the VAE's decoder network p_φ learns to reconstruct the input data X as faithfully as possible with distribution p(x|z) over the reconstructions. To quantise the approximate posterior q(z|x) and thereby learn a representative audit sample, we apply the VQ-VAE model introduced by Van den Oord et al. in [46], as shown in Fig. 2. In contrast to the VAE, the VQ-VAE model defines a discrete latent embedding space E ∈ R^(K×D), where K denotes the size of the space and D is the dimensionality of each discrete latent embedding vector e_j ∈ E. Due to its discrete nature, the embedding space encompasses in total K distinct embedding vectors e_j ∈ R^D, j = 1, 2, ..., K. During the model's forward pass, an input journal entry x^i is passed through the encoder network q_θ, producing a latent representation z_e(x). To obtain its corresponding quantised representation z_q(x), a nearest-neighbour lookup is performed, as defined by:

z_q(x) = e_k, where k = argmin_j ||z_e(x) − e_j||_2,   (1)

where z_e(x) denotes the output of the encoder network and e_j the distinct embedding vectors. This process can be viewed as passing each z_e(x) through a discretisation bottleneck by mapping it onto its nearest embedding in E. Thereby the VQ-VAE autoencoder



quantises the representations z_e(x) using a codebook of 1-of-K embedding vectors e_j. Afterwards, the quantised embedding z_q(x) is passed to the decoder. The complete set of parameters of the model is the union of the parameters of the encoder, the decoder, and the embedding space. The learned quantised posterior distribution q(z|x) probabilities are defined as 'one-hot', given by:

q(z = k|x) = 1 for k = argmin_j ||z_e(x) − e_j||_2, and 0 otherwise,   (2)

where z_e(x) denotes the output of the encoder network and e_j denotes a particular embedding vector.

3.3 VQ-VAE LearningIn the forward pass, to learn a set of quantised embeddings ofreal-world journal entry data, we compute the model loss L asdefined in Eq. 3. The loss function is comprised of four terms thatare optimised in parallel to train the different model componentsof the VQ-VAE, given by:

L(𝑥) = log𝑝 (𝑥 |𝑧𝑞 (𝑥)) + 𝛼 | |𝑠𝑔[𝑧𝑒 (𝑥)] − 𝑒 | |22+ 𝛽 | |𝑧𝑒 (𝑥) − 𝑠𝑔[𝑒] | |22 + 𝛾 log 𝑝 (𝑥 |𝑧𝑒 (𝑥)),

(3)

where 𝑧𝑒 (𝑥) denotes encoder output, 𝑧𝑞 (𝑥) the quantised encoderoutput and 𝑒 the set of embedding vectors. The first term denotes thediscrete reconstruction loss derived from the reconstruction of thequantised embeddings 𝑧𝑞 (𝑥). It encourages the embeddings 𝑧𝑞 (𝑥)to be an informative representation of the input [40]. Throughoutthe training process, the embeddings 𝑒 receive no gradients fromthe reconstruction loss. This originates from the straight-throughgradient estimation ∇𝑧L of mapping 𝑧𝑒 (𝑥) to its nearest neighbor𝑧𝑞 (𝑥). The embeddings are optimized using a vector quantisationtechnique that applies a stop-gradient operator denoted by 𝑠𝑔[·].The operator is defined as the identity in the forward pass and haszero partial derivatives in the backward pass [46], denoted by:

𝑠𝑔[𝑥] ={𝑥 forward pass ,0 backward pass ,

(4)

where 𝑥 denotes the input vector to the stop-gradient operator. Both the second and the third loss term use stop-gradients. The second term denotes the embedding loss, which moves the embeddings 𝑒 towards the encoder outputs 𝑧𝑒(𝑥). Due to the non-differentiability of the embedding assignment, the embedding space is dimensionless. It can grow arbitrarily if the embeddings 𝑒 do not train as fast as the encoder parameters 𝜃. Therefore, the third term of Eq. 3 denotes the commitment loss, which guarantees that the encoder output 𝑧𝑒(𝑥) commits to one of the embeddings. To encourage the encoder output 𝑧𝑒(𝑥) to remain an informative representation and not ’collapse’ towards one of the embeddings, we enhanced L by a fourth loss term, as recently introduced by Fortuin et al. in [16]. The added term denotes the reconstruction loss derived from the encoder outputs 𝑧𝑒(𝑥). The gradient ∇𝑧L of the backward pass is approximated similarly to the straight-through estimator presented in [9]. Thereby, the gradients of the decoder input 𝑧𝑞(𝑥) are copied back to the encoder output 𝑧𝑒(𝑥), as shown in Fig. 2. Since the output representation of the encoder and the input to the decoder share the same 𝐷-dimensional space, the gradients contain useful information on how to update the encoder parameters 𝜃 to increase the model likelihood. The VQ-VAE can be interpreted as a special case of the VAE in which the model’s likelihood log 𝑝(𝑥) can be maximised by the optimisation of the Evidence Lower Bound (ELBO) [30]. Maximising the ELBO increases the representativeness of the learned quantised representations 𝑧𝑞(𝑥) and therefore mitigates the sampling risk with progressing model training.
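The interplay of the quantisation step and the loss terms of Eq. 3 can be sketched in plain Python. This is a minimal illustration, not the authors’ implementation: the function names and the commitment weight 𝛽 = 0.25 are our assumptions, and the stop-gradient of Eq. 4 only controls gradient flow during backpropagation, so it leaves the scalar loss value shown here unchanged.

```python
import math

def quantise(z_e, codebook):
    """Map an encoder output z_e to its nearest codebook embedding.
    Returns the index k = argmin_j ||z_e - e_j||_2 and the embedding e_k."""
    def dist(a, b):
        return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
    k = min(range(len(codebook)), key=lambda j: dist(z_e, codebook[j]))
    return k, codebook[k]

def vq_loss_terms(x, x_hat, z_e, e_k, beta=0.25):
    """Scalar value of the VQ-VAE loss terms: reconstruction of the
    quantised code, embedding loss ||sg[z_e(x)] - e||^2 (trains the
    codebook), and commitment loss beta * ||z_e(x) - sg[e]||^2 (trains
    the encoder). sg[.] is omitted since it does not change the value."""
    mse = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b))
    recon = mse(x, x_hat)          # discrete reconstruction term
    embed = mse(z_e, e_k)          # embedding (codebook) term
    commit = beta * mse(z_e, e_k)  # commitment term
    return recon + embed + commit

codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
k, e_k = quantise([0.9, 1.2], codebook)
print(k)  # -> 1, the nearest embedding index
```

In a gradient-based framework, the straight-through trick would copy the decoder-input gradients onto 𝑧𝑒(𝑥), exactly as described above.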

4 EXPERIMENTAL SETUP

In this section, we describe the experimental setup and model training. Due to the high confidentiality of journal entry data, we evaluate the proposed methodology based on two publicly available real-world datasets to allow for reproducibility of our results.

4.1 Datasets and Data Preparation

To evaluate the audit sampling capability of the VQ-VAE architecture, we use two publicly available datasets of real-world financial payment data that exhibit high similarity to real-world accounting data. The datasets are referred to as dataset A and dataset B in the following. Dataset A corresponds to the City of Philadelphia payment data of the fiscal year 2017.³ It represents the city’s nearly $4.2 billion in payments obtained from almost 60 city offices, departments, boards, and committees. Dataset B corresponds to vendor payments of the City of Chicago ranging from 1996 to 2020.⁴ The data is collected from the city’s ’Vendor, Contract and Payment Search’ and encompasses the procurement of goods and services. The majority of attributes recorded in both datasets (similar to real-world ERP data) correspond to categorical (discrete) variables, e.g. posting date, department, vendor name, and document type. We pre-process the original payment line-item attributes to (i) remove semantically redundant attributes and (ii) obtain a binary (’one-hot’ encoded) representation of each payment. The following descriptive statistics summarise both datasets upon successful data pre-processing:

• Dataset A: The ’City of Philadelphia’ payments encompass a total of 𝑛 = 238,894 payments comprised of 10 categorical and one numerical attribute. The encoding resulted in a total of 8,565 one-hot encoded dimensions for each of the city’s vendor payment records, 𝑥𝑖 ∈ ℝ^8,565.

• Dataset B: The ’City of Chicago’ payments encompass a total of 𝑛 = 72,814 payments comprised of 7 categorical and one numerical attribute. The encoding resulted in a total of 2,354 one-hot encoded dimensions for each of the city’s vendor payment records, 𝑥𝑖 ∈ ℝ^2,354.
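The binary (’one-hot’) encoding step described above can be sketched as follows. This is a toy illustration under our own assumptions: the attribute names and values are invented, not taken from the actual datasets, and the real pre-processing pipeline is not published here.

```python
def one_hot_encode(records, attributes):
    """Binary ('one-hot') encoding of categorical payment attributes.
    Each distinct (attribute, value) pair becomes one binary dimension;
    the vocabulary is sorted for a deterministic column order."""
    vocab = [(a, v) for a in attributes
             for v in sorted({r[a] for r in records})]
    encoded = [[1 if r[a] == v else 0 for (a, v) in vocab] for r in records]
    return encoded, vocab

# hypothetical payment records for illustration
payments = [
    {"department": "SERVICES", "vendor": "TEMPLE University"},
    {"department": "COMMERCE", "vendor": "XEROX Corporation"},
    {"department": "SERVICES", "vendor": "XEROX Corporation"},
]
encoded, vocab = one_hot_encode(payments, ["department", "vendor"])
print(len(vocab))  # -> 4 one-hot dimensions (2 departments + 2 vendors)
```

With the full attribute set of dataset A, the same scheme yields the 8,565-dimensional binary records reported above.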

4.2 VQ-VAE Training

Our architectural setup follows the VQ-VAE architecture [46] as shown in Fig. 2, comprised of an encoder and a decoder network as well as an additional embedding layer that are trained in parallel. The encoder network 𝑞𝜃 uses Leaky Rectified Linear Unit (LReLU) activation functions [48], except in the ’bottleneck’ layer, where no

³ The dataset is publicly available via: https://www.phila.gov/2019-03-29-philadelphias-initial-release-of-city-payments-data/.
⁴ The dataset is publicly available via: https://data.cityofchicago.org/Administration-Finance/Payments/s4vu-giwb/.


Learning Sampling in Financial Statement Audits using Vector Quantised Variational Autoencoder Neural Networks ICAIF ’20, October 15–16, 2020, New York, NY, USA


Figure 3: Exemplary VQ-VAE vector quantisation of city payments and corresponding audit samples represented by the model’s learned embeddings 𝑒𝑘, for 𝑘 = argmin𝑗 ||𝑧𝑒(𝑥) − 𝑒𝑗||₂. For each entry 𝑥𝑖, the VQ-VAE infers a low-dimensional representation 𝑧𝑒 in the latent space 𝑍. The distinct representations 𝑧𝑒 are quantised to 𝑧𝑞 by the embeddings 𝑒𝑘. As a result, the quantisations 𝑧𝑞 constitute a set of representative audit samples of the original entry population 𝑋.

non-linearity is applied. The decoder network 𝑝𝜙 uses LReLUs in all layers except the output layer, where sigmoid activations are used. Table 1 depicts the architectural details of the networks, which are implemented using PyTorch [36].

Table 1: Number of neurons per layer ℓ of the encoder 𝑞𝜃 and decoder 𝑝𝜙 networks that comprise the VQ-VAE architecture [46] used in our experiments.

Net         Dataset   ℓ = 1   2      3      4    ...  10     11
𝑞𝜃(𝑧|𝑥)     A         5,096   2,048  1,024  512  ...  4      2
𝑝𝜙(𝑥|𝑧)     A         2       4      8      16   ...  2,048  5,096
𝑞𝜃(𝑧|𝑥)     B         2,048   1,024  512    256  ...  4      2
𝑝𝜙(𝑥|𝑧)     B         2       4      8      16   ...  1,024  2,048

In accordance with [48], we set the scaling factor of the LReLUs to 𝛼 = 0.4. We initialize the parameters of the encoder and decoder networks as described in [18]. The embeddings 𝑒 are initialized by sampling from a uniform prior distribution 𝑒𝑗 ∼ U(−1, 1). To allow for interpretation and visual inspection by human auditors, we sample each discrete latent embedding vector 𝑒𝑗 ∈ ℝ². We evaluate distinct codebook sizes of 𝐾 ∈ {2⁴, 2⁵, 2⁶, 2⁷} embeddings to learn different degrees of payment quantisation. The models are trained with batch-wise SGD for a maximum of 4,000 training epochs, a mini-batch size of 𝑚 = 128 journal entries, and early stopping once the loss converges. We use Adam optimization [29] with 𝛽₁ = 0.9 and 𝛽₂ = 0.999 in the optimization of the network parameters. To determine the 𝑧𝑞(𝑥) and 𝑧𝑒(𝑥) reconstruction losses (first and fourth term of Eq. 3), we use a Mean-Squared-Error (MSE) loss, as defined by:

L𝑀𝑆𝐸(𝑥) = ||𝑥 − 𝑝𝜙(𝑞𝜃(𝑥))||₂² ,    (5)

where 𝑥 denotes an encoded journal entry, 𝑞𝜃 the encoder, and 𝑝𝜙 the decoder network. Kaiser et al. in [27] observed that training the embeddings, as done by the second term of Eq. 3, can be stabilized. This is achieved by maintaining an Exponential Moving Average (EMA) over (1) the embedding vectors 𝑒𝑗 and (2) the count 𝜋𝑗 of nearest embeddings 𝑧𝑞(𝑥) mapped onto the individual embedding vectors. Thereby, the count per embedding is defined as 𝜋𝑗 = ∑_{𝑖=1}^{𝑁} 1[𝑧𝑞(𝑥𝑖) = 𝑒𝑗], where 1 denotes the indicator function. The EMA count 𝑐𝑗 of each embedding is then updated per mini-batch 𝑚, as defined by [40]:

𝑐𝑗^(𝑚+1) = 𝜂 𝑐𝑗^𝑚 + (1 − 𝜂) 𝜋𝑗 ,    (6)

with each embedding 𝑒𝑗 updated respectively, as follows:

𝑒𝑗^(𝑚+1) = 𝜂 𝑒𝑗^𝑚 + (1 − 𝜂) ∑_{𝑖=1}^{𝑁} 1[𝑧𝑞(𝑥𝑖) = 𝑒𝑗] 𝑧𝑒(𝑥𝑖) / 𝑐𝑗^𝑚 ,    (7)

where 𝜂 denotes the EMA decay parameter. We used this enhancement in all our experiments and set 𝜂 = 0.95. Throughout the training, we are also interested in the average number of bits used by a particular model to quantise the latent factors of variation. This is measured by the codebook perplexity of each model, as defined by:

Perp(𝜋) = 2^( −∑_{𝑗=1}^{𝐾} 𝑝(𝜋𝑗) log₂ 𝑝(𝜋𝑗) ) ,    (8)

where 𝑝(𝜋𝑗) denotes the likelihood of a quantised representation 𝑧𝑞(𝑥) being assigned to a particular embedding 𝑒𝑗. Furthermore, we determine the average purity of all payments 𝜔𝑗 quantised by a particular embedding 𝑒𝑗, as defined by:

Purity(𝜔, 𝜋) = (1/𝐾) ∑_{𝑗=1}^{𝐾} |𝜔𝑗| / 𝜋𝑗 ,    (9)

where 𝜔𝑗 = {𝑥𝑖 ∈ 𝑋 | 𝑧𝑞(𝑥𝑖) = 𝑒𝑗}. Figure 3 depicts an exemplary VQ-VAE vector quantisation of the city payments and corresponding audit samples represented by the learned embeddings 𝑒.
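The EMA codebook update of Eqs. 6–7 and the perplexity and purity diagnostics of Eqs. 8–9 can be sketched in plain Python. This is a sketch under stated assumptions, not the authors’ implementation: the variable names are ours, the per-sample indicator form of the sum in Eq. 7 is our reading of the equation, and for the purity we read |𝜔𝑗| as the size of the majority attribute class within cluster 𝑗, averaged over the non-empty clusters (a common convention for cluster purity).

```python
import math
from collections import Counter

def assignment_counts(assignments, K):
    """pi_j: number of inputs whose quantisation z_q(x_i) maps to e_j."""
    pi = [0] * K
    for k in assignments:
        pi[k] += 1
    return pi

def ema_update(c, e, z_es, assignments, eta=0.95):
    """One mini-batch EMA codebook update following Eqs. 6-7.
    c: EMA counts, e: K embeddings (D-dim lists), z_es: encoder
    outputs, assignments: nearest-embedding indices per input."""
    K, D = len(e), len(e[0])
    pi = assignment_counts(assignments, K)
    new_c = [eta * c[j] + (1 - eta) * pi[j] for j in range(K)]  # Eq. 6
    new_e = []
    for j in range(K):                                          # Eq. 7
        s = [sum(z[d] for z, k in zip(z_es, assignments) if k == j)
             for d in range(D)]
        new_e.append([eta * e[j][d] + (1 - eta) * s[d] / max(c[j], 1e-9)
                      for d in range(D)])
    return new_c, new_e

def perplexity(pi):
    """Codebook perplexity of Eq. 8 over the empirical assignment
    distribution p(pi_j); empty codes contribute zero entropy."""
    n = sum(pi)
    return 2 ** (-sum((q / n) * math.log2(q / n) for q in pi if q > 0))

def purity(labels, assignments, K):
    """Average purity of Eq. 9, with |omega_j| read as the majority
    attribute-class count of cluster j (averaged over non-empty clusters)."""
    scores = []
    for j in range(K):
        members = [l for l, k in zip(labels, assignments) if k == j]
        if members:
            scores.append(Counter(members).most_common(1)[0][1] / len(members))
    return sum(scores) / len(scores)
```

A perplexity close to 𝐾 indicates that all codes are used roughly uniformly, while a purity close to 1 indicates that each embedding collects payments of a single semantic type.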


5 EXPERIMENTAL RESULTS

In this section, we assess the latent quantisation learned from real-world city payments quantitatively and qualitatively. Also, we examine the semantic disentanglement of the payment attributes in the latent dimensions in terms of interpretability by a human auditor.

5.1 Quantitative Evaluation

We are interested in the degree of representativeness of the learned quantised embeddings when training VQ-VAE models with varying codebook sizes. The quantitative results obtained for both datasets are shown in Tab. 2. It can be observed that both the average codebook usage (Perp) and the quantisation purity (Purity) increase with larger codebook size. Hence, a more fine-grained quantisation of the latent generative factors is learned. Also, the reconstruction loss derived from the quantised embeddings (L^zq_MSE) and the reconstruction loss derived from the encoder outputs (L^ze_MSE) converge with increased codebook size 𝐾. This observation corresponds to our initial hypothesis that real-world accounting data is generated by a limited set of latent factors that can be uncovered by an unsupervised deep learning model. The quantitative results indicate that the VQ-VAE provides the ability to learn embeddings that quantise such latent generative factors and therefore constitute a representative audit sample.

Table 2: Reconstruction losses, codebook perplexity, and cluster purity obtained for different codebook sizes 𝐾 on both city payment datasets (variances originate from parameter initialization using five random seeds).

Data   𝐾     L^zq_MSE        L^ze_MSE        Perp             Purity
A      2³    0.577 ± 0.13    0.453 ± 0.24    5.394 ± 3.81     0.864 ± 0.09
A      2⁴    0.454 ± 0.06    0.312 ± 0.34    12.668 ± 0.62    0.845 ± 0.01
A      2⁵    0.417 ± 0.10    0.281 ± 0.33    22.024 ± 1.29    0.853 ± 0.02
A      2⁶    0.382 ± 0.05    0.232 ± 0.13    37.677 ± 1.72    0.872 ± 0.01
A      2⁷    0.345 ± 0.17    0.208 ± 0.19    59.755 ± 2.83    0.890 ± 0.01
B      2³    1.675 ± 0.03    1.535 ± 0.04    6.082 ± 0.17     0.440 ± 0.03
B      2⁴    1.622 ± 0.06    1.424 ± 0.10    9.788 ± 1.12     0.416 ± 0.01
B      2⁵    1.587 ± 0.07    1.351 ± 0.15    17.941 ± 2.87    0.377 ± 0.06
B      2⁶    1.467 ± 0.02    1.121 ± 0.04    31.217 ± 5.04    0.373 ± 0.01
B      2⁷    1.407 ± 0.05    1.071 ± 0.07    42.171 ± 9.39    0.321 ± 0.02

5.2 Qualitative Evaluation

We are also interested in the semantics of the latent generative factors represented by the learned embeddings. Figures 4 and 5 (left) show a quantisation learned by two VQ-VAE models trained with codebook sizes 𝐾 = 2⁷ (dataset A) and 𝐾 = 2⁶ (dataset B). For each learned embedding, we conduct a qualitative review of the corresponding payment quantisations. The review of the corresponding quantised payments for models with 𝐾 = 2⁷ results in the following observations:

• Dataset A: Among others, the learned embeddings quantise the payments according to (1) fleet management 𝑒50 (’auto parts’), (2) legal services 𝑒52 (’appointed attorneys’), (3) office materials and supplies 𝑒51 (’Staples Business Advantage’), (4) professional services 𝑒121 (’consultancy services’), as well as (5) material and supply 𝑒127 (’fuel and gasoline’) payments;

• Dataset B: Among others, the learned embeddings quantise the payments according to (1) transportation services 𝑒2 (’public transport maintenance’), (2) family assistance services 𝑒54 (’homeless financial support’), (3) aviation maintenance 𝑒17 (’cleaning and fuel’), (4) IT services 𝑒11 (’software’), (5) water management 𝑒48 (’pipe supply’), as well as (6) library services 𝑒60 (’library facilities’) payments.

The qualitative results indicate that the embeddings learned by the VQ-VAE semantically quantise the payments’ generative factors. The representativeness of the embeddings is underpinned by the high quantisation purity obtainable with increased codebook size.

Table 3: Disentanglement metrics and scores obtained for both city payment datasets using different codebook sizes 𝐾 (variances originate from parameter initialization using five random seeds).

Data   𝐾     𝛽-VAE [24]      Factor-VAE [28]   MIG [11]        DCI [14]
A      2³    0.160 ± 0.02    0.110 ± 0.06      0.025 ± 0.02    0.039 ± 0.01
A      2⁴    0.166 ± 0.01    0.108 ± 0.06      0.029 ± 0.01    0.038 ± 0.01
A      2⁵    0.166 ± 0.03    0.119 ± 0.01      0.031 ± 0.02    0.046 ± 0.01
A      2⁶    0.182 ± 0.02    0.134 ± 0.01      0.068 ± 0.08    0.061 ± 0.04
A      2⁷    0.193 ± 0.01    0.149 ± 0.03      0.081 ± 0.06    0.139 ± 0.07
B      2³    0.244 ± 0.01    0.142 ± 0.02      0.051 ± 0.03    0.690 ± 0.03
B      2⁴    0.240 ± 0.01    0.145 ± 0.03      0.057 ± 0.03    0.703 ± 0.02
B      2⁵    0.277 ± 0.04    0.144 ± 0.02      0.053 ± 0.04    0.709 ± 0.01
B      2⁶    0.290 ± 0.02    0.144 ± 0.01      0.067 ± 0.01    0.715 ± 0.01
B      2⁷    0.324 ± 0.01    0.146 ± 0.01      0.080 ± 0.03    0.717 ± 0.01

5.3 Disentanglement Evaluation

Finally, we are interested in to which extent the individual journal entry attribute characteristics are disentangled in the distinct latent dimensions 𝑧𝑖. A high disentanglement of the journal entry attributes increases the interpretability of the learned quantised embeddings by a human auditor. While there is no formally accepted notion of disentanglement yet, the intuition is that a disentangled representation separates the different informative factors of variation of a dataset [8]. We evaluate four disentanglement metrics commonly used in unsupervised representation learning [34]:

• The 𝛽-VAE metric proposed in [24] measures disentanglement as the accuracy of a linear classifier that predicts the index of a fixed journal entry attribute 𝑥𝑗. We sample batches of 16 representations, train the classifier on 1,000 batches, and evaluate it on 500 batches.

• The Factor-VAE metric proposed in [28] measures disentanglement as the accuracy of a majority-vote classifier that predicts the index of a fixed journal entry attribute 𝑥𝑗. We sample batches of 16 representations, train the classifier on 1,000 batches, and evaluate it on 500 batches.

• The Mutual Information Gap (MIG) proposed in [11] measures, for each journal entry attribute 𝑥𝑗, the gap in mutual information between the two dimensions in 𝑧𝑒(𝑥) that share the highest mutual information with 𝑥𝑗. We bin each dimension in 𝑍 into 20 bins and calculate the average mutual information over 1,000 samples.

• The Disentanglement Metric (DCI) proposed in [14] computes the entropy of the distribution obtained by normalizing the importance of each dimension in 𝑧𝑒(𝑥) for predicting the


Figure 4: Exemplary quantised latent representations 𝑧𝑒 (colored by quantisation) and embeddings 𝑒 (numbered red circles) in ℝ² learned by the VQ-VAE of the 238,894 ’City of Philadelphia’ vendor payments (dataset A) using a codebook size of 𝐾 = 2⁷ (128) embeddings (left). Learned disentanglement of the distinct payment attributes 𝑥𝑗 (e.g., ’subject-object code’, ’payment type’, ’fiscal month’) by the learned representations 𝑧𝑒 (colored by attribute value) in the latent dimensions 𝑧𝑖 (middle-right).

Figure 5: Exemplary quantised latent representations 𝑧𝑒 (colored by quantisation) and embeddings 𝑒 (numbered red circles) in ℝ² learned by the VQ-VAE of the 72,814 ’City of Chicago’ vendor payments (dataset B) using a codebook size of 𝐾 = 2⁶ (64) embeddings (left). Learned disentanglement of the distinct payment attributes 𝑥𝑗 (e.g., ’city department’, ’vendor name’, ’amount’) by the learned representations 𝑧𝑒 (colored by attribute value) in the latent dimensions 𝑧𝑖 (middle-right).

value of an attribute 𝑥𝑗. We use a decision tree as classifier, train it on 1,000 batches, and evaluate it on 500 batches.
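The binned mutual-information estimate underlying the MIG metric can be sketched in plain Python. This is an illustrative sketch, not the evaluation code used in the experiments: the function name and the toy latent/attribute values are our assumptions, chosen so that one latent dimension perfectly separates two attribute values.

```python
import math
from collections import Counter

def binned_mutual_information(zs, attrs, bins=20):
    """Mutual information (in bits) between one latent dimension and one
    categorical attribute, estimated by discretising the latent values
    into `bins` equal-width bins -- the binning step used for MIG."""
    lo, hi = min(zs), max(zs)
    width = (hi - lo) / bins or 1.0
    bx = [min(int((z - lo) / width), bins - 1) for z in zs]
    n = len(zs)
    pxy = Counter(zip(bx, attrs))          # joint bin/attribute frequencies
    px, py = Counter(bx), Counter(attrs)   # marginal frequencies
    # MI = sum_xy p(x,y) * log2( p(x,y) / (p(x) p(y)) )
    return sum((c / n) * math.log2(c * n / (px[b] * py[y]))
               for (b, y), c in pxy.items())

# a latent dimension that perfectly separates two attribute values
zs = [0.1, 0.2, 0.15, 0.9, 0.95, 0.85]
attrs = ["RENTS", "RENTS", "RENTS", "LIBRARY", "LIBRARY", "LIBRARY"]
print(binned_mutual_information(zs, attrs, bins=2))  # -> 1.0
```

MIG would compute this quantity for every latent dimension, then report the normalised gap between the two most informative dimensions per attribute.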

The disentanglement scores obtained for the distinct metrics are presented in Table 3. It can be observed that increasing the codebook size 𝐾 yields an increased disentanglement of the journal entry attributes in the latent dimensions of 𝑍. Figures 4 and 5 (middle-right) illustrate the learned disentanglement of the distinct payment attributes, e.g. city department, payment type, and fiscal month. It can be observed that the learned quantised posterior disentangles the payment attributes to allow for a highly explainable audit sampling.

6 SUMMARY

In this work, we proposed a deep learning inspired approach to conduct audit sampling in the context of financial statement audits. We showed that Vector Quantised-Variational Autoencoder (VQ-VAE) neural networks can be trained to learn a quantised


representation of financial payment data recorded in ERP systems. We demonstrated, based on two real-world datasets of city payments, that such a learned quantisation corresponds to the latent generative factors in both datasets. Our experimental results provide initial evidence that VQ-VAEs can be utilised in the context of representative audit sampling and therefore offer the ability to reduce sampling risks. Furthermore, the learned representation allows for human-interpretable discrete sampling. We hope that this technique will enhance the toolbox of auditors in the near future to sample journal entries from large-scale financial accounting data. Given the tremendous amount of journal entries recorded by organisations, deep-learning-based sampling techniques can provide a starting point for a variety of further downstream audit tasks.

ACKNOWLEDGEMENTS

We thank the members of the statistics department at Deutsche Bundesbank for their valuable review and remarks. Opinions expressed in this work are solely those of the authors and do not necessarily reflect the view of the Deutsche Bundesbank or PricewaterhouseCoopers (PwC) International Ltd. and its network firms.

REFERENCES

[1] Consideration of Fraud in a Financial Statement Audit, AU Section 316. American Institute of Certified Public Accountants (AICPA), 2002.
[2] International Accounting Standard (IAS) 1 - Presentation of Financial Statements. International Federation of Accountants (IFAC), 2007.
[3] Practice Aid for Testing Journal Entries and Other Adjustments Pursuant to AU Section 316. Center for Audit Quality (CAQ), 2008.
[4] International Standard on Auditing (ISA) 530, Audit Sampling and Other Means of Testing. International Federation of Accountants (IFAC), 2009.
[5] International Standard on Auditing (ISA) 240, The Auditor's Responsibilities Relating to Fraud in an Audit of Financial Statements. International Federation of Accountants (IFAC), 2009.
[6] Audit Sampling, AU-C Section 530. American Institute of Certified Public Accountants (AICPA), 2012.
[7] A. D. Akresh, J. K. Loebbecke, and W. R. Scott. Audit approaches and techniques. Research Opportunities in Auditing: The Second Decade, 13, 1988.
[8] Y. Bengio, A. Courville, and P. Vincent. Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8):1798-1828, 2013.
[9] Y. Bengio, N. Léonard, and A. Courville. Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432, 2013.
[10] I. Bhattacharya and E. Roos Lindgreen. A semi-supervised machine learning approach to detect anomalies in big accounting data. In Proceedings of the European Conference on Information Systems (ECIS), 2020.
[11] T. Q. Chen, X. Li, R. B. Grosse, and D. K. Duvenaud. Isolating sources of disentanglement in variational autoencoders. In Advances in Neural Information Processing Systems, pages 2610-2620, 2018.
[12] X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, and P. Abbeel. InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2172-2180, 2016.
[13] M. L. De Prado. Advances in Financial Machine Learning. John Wiley & Sons, 2018.
[14] C. Eastwood and C. K. Williams. A framework for the quantitative evaluation of disentangled representations. In International Conference on Learning Representations (ICLR), 2018.
[15] R. J. Elder, A. D. Akresh, S. M. Glover, J. L. Higgs, and J. Liljegren. Audit sampling research: A synthesis and implications for future research. Auditing: A Journal of Practice & Theory, 32(1):99-129, 2013.
[16] V. Fortuin, M. Hüser, F. Locatello, H. Strathmann, and G. Rätsch. SOM-VAE: Interpretable discrete representation learning on time series. arXiv preprint arXiv:1806.02199, 2018.
[17] S. J. Garstka and P. A. Ohlson. Ratio estimation in accounting populations with probabilities of sample selection proportional to size of book values. Journal of Accounting Research, pages 23-59, 1979.
[18] X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pages 249-256, 2010.
[19] S. V. Grabski, S. A. Leech, and P. J. Schmidt. A review of ERP research: A future agenda for accounting information systems. Journal of Information Systems, 25(1):37-78, 2011.
[20] D. M. Guy, D. R. Carmichael, and R. Whittington. Practitioner's Guide to Audit Sampling. Wiley, 1998.
[21] D. M. Guy, D. R. Carmichael, and R. Whittington. Audit Sampling: An Introduction (5th edition). John Wiley & Sons, 2002.
[22] T. W. Hall, A. W. Higson, B. J. Pierce, K. H. Price, and C. J. Skousen. Haphazard sampling: Selection biases and the estimation consequences of these biases. Current Issues in Auditing, 7(2):P16-P22, 2013.
[23] T. W. Hall, J. E. Hunton, and B. J. Pierce. Sampling practices of auditors in public accounting, industry, and government. Accounting Horizons, 16(2):125-136, 2002.
[24] I. Higgins, L. Matthey, A. Pal, C. Burgess, X. Glorot, M. Botvinick, S. Mohamed, and A. Lerchner. beta-VAE: Learning basic visual concepts with a constrained variational framework. In International Conference on Learning Representations (ICLR), 2017.
[25] N. B. Hitzig. Audit sampling: A survey of current practice. The CPA Journal, 65(7):54, 1995.
[26] G. Jokovich. Statistical sampling in auditing. International Journal of Accounting and Financial Management, 16:892-898, 2013.
[27] Ł. Kaiser, A. Roy, A. Vaswani, N. Parmar, S. Bengio, J. Uszkoreit, and N. Shazeer. Fast decoding in sequence models using discrete latent variables. arXiv preprint arXiv:1803.03382, 2018.
[28] H. Kim and A. Mnih. Disentangling by factorising. In International Conference on Machine Learning (ICML), 2018.
[29] D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
[30] D. P. Kingma and M. Welling. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114, 2013.
[31] Y. LeCun, Y. Bengio, and G. Hinton. Deep learning. Nature, 521(7553):436-444, 2015.
[32] D. A. Leslie, A. D. Teitlebaum, and R. J. Anderson. Dollar-Unit Sampling: A Practical Guide for Auditors. Copp Clark Pitman, 1979.
[33] Y. Liu, M. Batcher, and F. Scheuren. Efficient sampling design in audit data. Journal of Data Science, 3:213-222, 2005.
[34] F. Locatello, S. Bauer, M. Lucic, G. Raetsch, S. Gelly, B. Schölkopf, and O. Bachem. Challenging common assumptions in the unsupervised learning of disentangled representations. In International Conference on Machine Learning, pages 4114-4124, 2019.
[35] J. Neter and J. K. Loebbecke. Behavior of Major Statistical Estimators in Sampling Accounting Populations: An Empirical Study. American Institute of Certified Public Accountants (AICPA), 1975.
[36] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, et al. PyTorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems, pages 8024-8035, 2019.
[37] A. Radford, L. Metz, and S. Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434, 2015.
[38] M. Reichert and B. Weber. Ad hoc changes of process instances. In Enabling Flexibility in Process-Aware Information Systems, pages 153-217. Springer, 2012.
[39] D. M. Roberts. Statistical Auditing. American Institute of Certified Public Accountants (AICPA), 1978.
[40] A. Roy, A. Vaswani, N. Parmar, and A. Neelakantan. Towards a better understanding of vector quantized autoencoders. 2018.
[41] M. Schreyer, T. Sattarov, B. Reimer, and D. Borth. Adversarial learning of deepfakes in accounting. NeurIPS 2019 Workshop on Robust AI in Financial Services: Data, Fairness, Explainability, Trustworthiness, and Privacy, Vancouver, BC, Canada, 2019.
[42] M. Schreyer, T. Sattarov, C. Schulze, B. Reimer, and D. Borth. Detection of accounting anomalies in the latent space using adversarial autoencoder neural networks. 2nd KDD Workshop on Anomaly Detection in Finance, Anchorage, Alaska, USA, 2019.
[43] M. Schultz and M. Tropmann-Frick. Autoencoder neural networks versus external auditors: Detecting unusual journal entries in financial statement audits. In Proceedings of the 53rd Hawaii International Conference on System Sciences, 2020.
[44] D. A. Schwartz. Computerized audit sampling. The CPA Journal, 68(11):46, 1998.
[45] K. W. Stringer. Practical aspects of statistical sampling in auditing. In Proceedings of the Business and Economic Statistics Section, pages 405-411. American Statistical Association, 1963.
[46] A. van den Oord, O. Vinyals, et al. Neural discrete representation learning. In Advances in Neural Information Processing Systems, pages 6306-6315, 2017.
[47] M. Wiese, R. Knobloch, R. Korn, and P. Kretschmer. Quant GANs: Deep generation of financial time series. Quantitative Finance, pages 1-22, 2020.
[48] B. Xu, N. Wang, T. Chen, and M. Li. Empirical evaluation of rectified activations in convolutional network. arXiv preprint arXiv:1505.00853, 2015.