Experiment
Imagine querying and extracting information from a set of 5,734 public company financial reports, all
10-Ks and all represented in the XBRL format, with the results collected in the spreadsheet
“Experiment.xlsx1”. (What I actually did was use this spreadsheet2 and delete columns from it to get to
what you see in the “Experiment.xlsx” spreadsheet.)
BALANCE SHEET:
You know that the accounting equation3 states that “Assets = Liabilities and Equity”.
You extract the values for the concepts of “Assets” and “Liabilities and Equity” and you put the
information into a spreadsheet for all 5,734 public companies. You get almost what you expect: total
“Assets” and total “Liabilities and Equity” are almost the same, but the totals differ by
-136,504,393,431. That is not bad; the difference is only 0.24%. But there is a difference, as can be
seen here:
1 Experiment, http://xbrlsite.azurewebsites.net/2018/Experiment/Experiment.zip
2 Use this to rerun my query, http://xbrlsite.azurewebsites.net/2018/Experiment/ExtractionPrototype-SPEC6-2018-03-31_All.zip
3 Wikipedia, Accounting Equation, https://en.wikipedia.org/wiki/Accounting_equation
Using the filter functionality of Excel, all the values for the test “Assets = Liabilities and Equity” that are
not “0” are then listed and you see the following:
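The balance check and filter described above can be sketched in a few lines of Python. This is only a minimal illustration, not the spreadsheet's actual logic: the `BalanceSheetRow` type, the `balance_differences` function, and the three sample rows are all made up for the example.

```python
from typing import NamedTuple


class BalanceSheetRow(NamedTuple):
    """One extracted row: a company and its two balance sheet totals (hypothetical layout)."""
    company: str
    assets: int
    liabilities_and_equity: int


def balance_differences(rows):
    """Return (company, difference) for every row where Assets <> Liabilities and Equity."""
    return [
        (row.company, row.assets - row.liabilities_and_equity)
        for row in rows
        if row.assets != row.liabilities_and_equity
    ]


# Made-up sample data; these are NOT values from the 5,734 reports.
rows = [
    BalanceSheetRow("Company A (made up)", 1_000, 1_000),
    BalanceSheetRow("Company B (made up)", 2_500, 2_500),
    BalanceSheetRow("Company C (made up)", 400, 900),
]

failures = balance_differences(rows)
total_difference = sum(difference for _, difference in failures)
```

Filtering for a non-zero difference is the same operation as the Excel filter: every row that survives is a report worth investigating by hand.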
So, why are there differences? You can investigate these 19 items where “Assets” <> “Liabilities and
Equity”. For example, consider the last company, WORLDS ONLINE INC. You can find its filing on the SEC
web site or on the XBRL Cloud Edgar Dashboard4:
If you select the filing and then you go to the XBRL Cloud Viewer by clicking the blue “Viewer” button,
then navigating to the balance sheet you see the following5:
4 WORLDS ONLINE INC. on the Edgar Dashboard, https://edgardashboard.xbrlcloud.com/edgar-dashboard/?cik=0001522767
Basically, the balance sheet is not balancing in their XBRL-based financial report. If you go to the HTML
version of the report to see what is going on you notice that the information in their XBRL-based report
is represented incorrectly:
5 WORLDS ONLINE INC. in the XBRL Cloud Viewer,
https://edgardashboard.xbrlcloud.com/flex/viewer/XBRLViewer.html#instance=http%3A%2F%2Fwww.sec.gov%2FArchives%2Fedgar%2Fdata%2F1522767%2F000126493117000034%2Fworld-20161231.xml&table=xbrl%3AimpliedTable&network=http%3A%2F%2Fworldsonline.com%2Frole%2FBalanceSheets
What appears to be going on is that the fact value for “Total Long Term Assets” was erroneously used to
represent the line item “TOTAL ASSETS”. So this is a FILER ERROR.
Let’s look at one more, the company KOPIN CORP. Again, you can go to the SEC web site or the XBRL Cloud
Edgar Dashboard:
https://edgardashboard.xbrlcloud.com/edgar-dashboard/?cik=0000771266
You check into what is going on and you discover that KOPIN CORP is reporting TWO FACTS for “Assets”.
You can see this by looking at the XBRL instance which shows the two facts with the same context; the
only difference between the facts is the “decimals” value which articulates the rounding of the fact:
The balance sheet shows a more detailed fact:
The segment information disclosure shows the same fact but scaled differently, so a different
“decimals” value is used.
And so, you can conclude that this error is caused by the Excel application that extracts the
information not handling duplicate facts correctly. This is a SOFTWARE ERROR.
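One way an extraction tool might handle such duplicates is to keep a single fact per concept and context, preferring the fact with the larger “decimals” value (the more precise one). The sketch below is an assumption about a reasonable policy, not a description of how the Excel application works or should be fixed; the `Fact` type, the function names, and the sample values are hypothetical and are not KOPIN CORP's reported numbers.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    """A simplified XBRL fact (hypothetical structure for this sketch)."""
    concept: str
    context: str
    value: int
    decimals: int  # XBRL "decimals": 0 = accurate to units, -3 = accurate to thousands


def deduplicate(facts):
    """Keep one fact per (concept, context), preferring the most precise one."""
    best = {}
    for fact in facts:
        key = (fact.concept, fact.context)
        if key not in best or fact.decimals > best[key].decimals:
            best[key] = fact
    return list(best.values())


def sum_concept(facts, concept):
    """Total a concept after duplicates have been collapsed."""
    return sum(f.value for f in deduplicate(facts) if f.concept == concept)


# Made-up values: the same fact reported twice with different rounding.
facts = [
    Fact("Assets", "FY2016", 123_456_000, -3),  # as on a balance sheet
    Fact("Assets", "FY2016", 123_500_000, -5),  # as in a segment disclosure
]

naive_total = sum(f.value for f in facts if f.concept == "Assets")
deduplicated_total = sum_concept(facts, "Assets")
```

Summing the raw facts counts the same amount twice, which is the kind of mistake a tool makes when it does not expect duplicates; collapsing them first leaves only the more precise value.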
The point is: (a) everything that is correct is proven to be correct and (b) everything that is incorrect can
be explained to be either a filer error, taxonomy error, or test error and can be fixed.
CASH FLOW STATEMENT:
Of the three primary financial statements (balance sheet, income statement, cash flow statement), the
cash flow statement is the most consistent at the high level. If we extract net cash flows from operating,
investing, and financing activities as well as exchange gains and the total “Net cash flow”, we get the
following results.
Accountants understand that “Net cash flow from operating activities” + “Net cash flow from investing
activities” + “Net cash flow from financing activities” + “Exchange gains (losses)” = “Net Cash Flow”:
The information extracted is very close to what is expected, off by only 0.17%, with 253 XBRL-based
reports that are not consistent with the rule we would expect them to follow.
This is still significant.
But note something important. There are actually TWO ways companies represent their cash flow
statement.
In the first approach, the rule we are using to evaluate the information extracted is correct (i.e.
Exchange gains (losses) is included within the roll up of net cash flow):
“Net cash flow from operating activities” + “Net cash flow from investing activities” + “Net cash flow
from financing activities” + “Exchange gains (losses)” = “Net Cash Flow”
However, the second approach uses a different rule (i.e. Exchange gains (losses) is NOT included in the
net cash flow roll up; rather, it is included in the roll forward of cash and cash equivalents):
“Net cash flow from operating activities” + “Net cash flow from investing activities” + “Net cash flow
from financing activities” = “Net Cash Flow”
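Testing a report against both rules can be sketched as follows. This is a minimal illustration under stated assumptions: the `CashFlow` type, the `matching_rule` function, its return strings, and the sample numbers are all hypothetical.

```python
from typing import NamedTuple, Optional


class CashFlow(NamedTuple):
    """The five high-level cash flow facts (hypothetical layout for this sketch)."""
    operating: int
    investing: int
    financing: int
    exchange_gains: int
    net_cash_flow: int


def matching_rule(cf: CashFlow) -> Optional[str]:
    """Return which of the two roll up rules the facts satisfy, or None if neither."""
    core = cf.operating + cf.investing + cf.financing
    if core + cf.exchange_gains == cf.net_cash_flow:
        return "exchange gains included"
    if core == cf.net_cash_flow:
        return "exchange gains excluded"
    return None


# Made-up examples, one per outcome.
first_approach = matching_rule(CashFlow(100, -40, -30, 5, 35))
second_approach = matching_rule(CashFlow(100, -40, -30, 5, 30))
inconsistent = matching_rule(CashFlow(100, -40, -30, 5, 50))
```

When a report has no exchange gains, both rules hold and the sketch simply reports the first; only a result of `None` marks a report that needs to be investigated.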
This screen shot shows these results:
Again, of the 5,734 financial reports, there are 5,586 which are consistent with the results we would
expect if we use the two rules for how the cash flow statement might be created.
And again, if we look at each of the 148 that are inconsistent with our expectation, we can understand
why the inconsistency exists.
https://edgardashboard.xbrlcloud.com/edgar-dashboard/?cik=0001076784
If you look at the cash flow statement and get your calculator out and add it up, you can see that the
cash flow statement does not foot for the current period:
This is verified by the XBRL calculation relations:
If you look at the HTML version of the report, you will note that the HTML version of the cash flow
statement ties to the XBRL version and likewise does not foot:
And so, the conclusion in this case is that the company creating the report has an ERROR in the reported
information.
Looking at one more, consider the report of Woodland Holdings Corp which you can get to from the
Edgar Dashboard6:
If you look at Woodland Holdings Corp’s report using the XBRL Cloud viewer7 you see that they have an
XBRL representation error.
The concept used to represent the line item “Net cash provided by operating activities” is incorrect.
6 Woodland Holdings Corp, Edgar Dashboard, https://edgardashboard.xbrlcloud.com/edgar-dashboard/?cik=0001635965
7 Woodland Holdings Corp 10-K, https://edgardashboard.xbrlcloud.com/flex/viewer/XBRLViewer.html#instance=http%3A%2F%2Fwww.sec.gov%2FArchives%2Fedgar%2Fdata%2F1635965%2F000116169717000200%2Fwoodl-20161231.xml
And you could continue on and repeat this process for each of the 148 cash flow statements which are
inconsistent with the expected results which were extracted, considering the two rules that could be
used to represent the relationship between five high-level concepts on the cash flow statement.
And so likewise, everything is either correct or is incorrect and can be corrected.
INCOME STATEMENT:
And so this brings us to the main event in this experiment, the income statement. I have extracted two
pieces of information from the income statement: “Revenues” and “Net Income (Loss)”. No other
information was extracted, in order to make a point related to extracting information from financial
reports in general.
Note that when we extracted information from the balance sheet, we extracted the facts for “Assets”
and we extracted the facts for “Liabilities and Equity” and we used the accounting equation, “Assets =
Liabilities and Equity” to verify the information that was extracted. For the cash flow statement we did
the same thing, using the two rules for the five concepts to help us understand if the information
extracted was correct.
Now consider the income statement. Here you see a column with the concept “Revenues” and you see
a column with the concept “Net Income (Loss)”. How do you KNOW that the information which was
extracted is correct? How do you KNOW that you are working with the right information?
Contrast having no comparison information or rules to having 18 pieces of information tied together
with 10 rules that verify that the “Sudoku Puzzle8” that is the income statement fits together correctly.
If all these pieces fit together as expected, what is the probability that the information you are working
with is the correct information from the income statement of the financial statements?
The point: If you don’t have rules (mapping rules, impute rules, consistency rules) be conscious that you
are basically flying blind.
Analysis:
The following Excel spreadsheet “ExtractionPrototype-SPEC6-2018-03-31_AllGood.zip9” contains
information which was extracted from 1,418 XBRL-based financial reports of public companies that all
report the same way, using the same high-level reporting style. All 1,418 reports are consistent
with (a) a set of mappings of the concepts used to represent each financial concept, (b) a set of impute
rules that are used to logically deduce financial concept information if an expected fact was not
reported, and (c) a set of consistency rules that make sure that the relations between the 18 financial
facts are consistent with expectation.
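To make the three kinds of rules concrete, here is a small Python sketch that uses the balance sheet rather than the 18 income statement facts. It is an illustration only and does not reproduce the rules in the spreadsheets: the element names in `MAPPINGS`, the single impute rule, and the two consistency rules are assumptions chosen for the example.

```python
# (a) Mapping rules: candidate reported element names for each financial concept.
# The element names are illustrative assumptions, not the spreadsheet's mappings.
MAPPINGS = {
    "Assets": ["Assets"],
    "LiabilitiesAndEquity": ["LiabilitiesAndStockholdersEquity"],
    "Liabilities": ["Liabilities"],
    "Equity": [
        "StockholdersEquityIncludingPortionAttributableToNoncontrollingInterest",
        "StockholdersEquity",
    ],
}


def apply_mappings(reported):
    """Pick the first reported candidate for each concept."""
    found = {}
    for concept, candidates in MAPPINGS.items():
        for name in candidates:
            if name in reported:
                found[concept] = reported[name]
                break
    return found


def apply_impute_rules(facts):
    """(b) Impute rule: deduce Liabilities when it was not reported."""
    facts = dict(facts)
    if "Liabilities" not in facts and all(k in facts for k in ("LiabilitiesAndEquity", "Equity")):
        facts["Liabilities"] = facts["LiabilitiesAndEquity"] - facts["Equity"]
    return facts


def check_consistency(facts):
    """(c) Consistency rules: return the names of rules that fail or cannot be evaluated."""
    rules = {
        "Assets = LiabilitiesAndEquity":
            lambda f: f["Assets"] == f["LiabilitiesAndEquity"],
        "LiabilitiesAndEquity = Liabilities + Equity":
            lambda f: f["LiabilitiesAndEquity"] == f["Liabilities"] + f["Equity"],
    }
    failed = []
    for name, rule in rules.items():
        try:
            ok = rule(facts)
        except KeyError:  # a required fact is missing
            ok = False
        if not ok:
            failed.append(name)
    return failed


# Made-up report in which Liabilities was not reported.
reported = {"Assets": 900, "LiabilitiesAndStockholdersEquity": 900, "StockholdersEquity": 300}
facts = apply_impute_rules(apply_mappings(reported))
failed_rules = check_consistency(facts)
```

The order matters in this sketch: mappings first, then impute rules, then consistency rules, so that the checks run over a complete set of facts.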
The following Excel spreadsheet “ExtractionPrototype-SPEC6-2018-03-31_All.zip10” contains
information which was extracted from 1,815 XBRL-based financial reports of public companies that all
8 Wikipedia, Sudoku, https://en.wikipedia.org/wiki/Sudoku
9 ExtractionPrototype-SPEC6-2018-03-31_AllGood.zip, http://xbrlsite.azurewebsites.net/2018/Experiment/ExtractionPrototype-SPEC6-2018-03-31_AllGood.zip
10 ExtractionPrototype-SPEC6-2018-03-31_All.zip, http://xbrlsite.azurewebsites.net/2018/Experiment/ExtractionPrototype-SPEC6-2018-03-31_All.zip
report the same way, using the same high-level reporting style. What is different about this spreadsheet
is that both CORRECT and INCORRECT representations are included.
The mapping rules, impute rules, and consistency rules are EXACTLY the same in both of these
spreadsheets. What is different is the set of reports for which information is being extracted.
Both of these spreadsheets extract information from XBRL-based reports that use the reporting style
that has the code11:
COMID-BSC-CF1-ISM-IEMIB-OILY-SPEC6
The code itself is meaningless, simply a combination of a balance sheet style, a cash flow statement
style, and an income statement style. The balance sheet style is “BSC” for classified balance sheet. See
page 5 of the reporting styles:
The cash flow statement style is “CF1” for the first cash flow statement rule where exchange gains are
included within the net cash flows roll up. See page 28 of the reporting styles:
11 US GAAP Reporting Styles, http://www.xbrlsite.com/2018/10K/US-GAAP-Reporting-Styles.pdf
The income statement reporting style code is “SPEC6”. See page 13 of the reporting style codes
document:
Now, to provide a rather extreme example to make a point, different public companies can have
different reporting styles. For example, INTBX is the reporting style code of financial
institutions that use the interest-based revenues style of reporting. Here is an Excel spreadsheet,
“ExtractionPrototype-INTBX-2018-06-30.zip12”. Page 24 of the reporting style document has this
income statement:
The point is this. In order to extract information effectively from an XBRL-based report, you MUST:
1. Somehow understand which reporting style is used.
2. Use the correct set of rules that defines the structure of the primary financial statements,
including:
a. Balance sheet
b. Income statement
c. Statement of comprehensive income
d. Cash flow statement
12 ExtractionPrototype-INTBX-2018-06-30.zip, http://xbrlsite.azurewebsites.net/2018/Experiment/ExtractionPrototype-INTBX-2018-06-30.zip
3. Use the mapping, impute, and consistency rules to both extract the information you are
attempting to extract and verify that the information which was extracted is consistent with that
expectation.
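The first step, working out which statement styles a reporting style code refers to, can be sketched as a simple lookup. The sketch assumes that a code is a list of components separated by "-", and it only knows the four components this document explains (BSC, CF1, SPEC6, INTBX); every other component is reported as unrecognized rather than guessed. The `KNOWN_COMPONENTS` table and the `describe_style` function are hypothetical names.

```python
# Components explained in this document; everything else is left unrecognized.
KNOWN_COMPONENTS = {
    "BSC": ("balance sheet", "classified balance sheet"),
    "CF1": ("cash flow statement", "exchange gains included within the net cash flows roll up"),
    "SPEC6": ("income statement", "SPEC6 income statement style"),
    "INTBX": ("income statement", "interest-based revenues style"),
}


def describe_style(code):
    """Split a reporting style code and look up each component.

    Returns (known, unknown): known maps a statement to (component, description),
    unknown lists the components this sketch has no entry for.
    """
    known, unknown = {}, []
    for part in code.split("-"):
        if part in KNOWN_COMPONENTS:
            statement, description = KNOWN_COMPONENTS[part]
            known[statement] = (part, description)
        else:
            unknown.append(part)
    return known, unknown


known, unknown = describe_style("COMID-BSC-CF1-ISM-IEMIB-OILY-SPEC6")
```

Once the statement styles are known, each one can select its own set of mapping, impute, and consistency rules, for example choosing between the two cash flow roll up rules based on the cash flow component.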
There are somewhere between 33 and 540 reporting styles13 used by public companies that report using
US GAAP. Of all the approximately 6,000 public companies that report using US GAAP, approximately
90% of those companies use 29 different reporting styles.
Based upon empirical evidence, a significant number of public companies have slight variations from
other companies due to ambiguities that exist within US GAAP or IFRS. It is highly likely that, over time,
the number of reporting styles will decrease.
US GAAP reporting to the SEC has had the most testing14. IFRS reporting to the SEC has less testing but
the patterns are similar15. Specifically, here are the reporting style rules for US GAAP16. Here are the
reporting style rules for IFRS17.
Private companies and SMEs are expected to have less variation than the complex reports of public
companies.
Conclusion: Financial reports are graphs
Financial reports are graphs. In particular, the primary financial statements (the balance sheet, income
statement, cash flow statement, statement of comprehensive income, and statement of changes in
equity), whose transactions flow through the general ledger via the general journal and all the special
ledgers/journals that support that process, are graphs.
In order to create such reports correctly or to extract information from such reports effectively, it is
essential to understand WHICH graph, out of the possible set of graphs, a financial report uses, because
that graph provides the metadata (mapping rules, impute rules, consistency rules) for the report. Today,
most companies have had their reporting style codes manually assigned, using some automation to assist
in that manual process. However, it is almost certain that reporting styles can be determined dynamically
by probing a financial report. This has not yet been tested, but it is highly likely to work.
Extracting individual facts from an XBRL-based report without extracting the complete set is extremely
risky. As is said, “Ignorance is bliss.” Just because you don’t check to see if an error has occurred does
not mean that extracted information is correct.
13 Charles Hoffman, Making the Case for Reporting Styles, http://xbrlsite.azurewebsites.net/2017/library/MakingTheCaseForReportingStyles.pdf
14 US GAAP Test data, http://xbrl.squarespace.com/journal/2018/7/28/us-gaap-test-data-2017-10-ks.html
15 IFRS Test data, http://xbrl.squarespace.com/journal/2018/7/14/updated-list-of-ifrs-filings.html
16 Reporting Style Rules for US GAAP (Older version), http://www.xbrlsite.com/2018/Prototype/ReportingStylesUSGAAP/Index.html
17 Reporting Style Rules for IFRS (Older version), http://www.xbrlsite.com/2018/Prototype/ReportingStylesIFRS/Index.html
While there has been significant testing of high-level financial concepts and it is clear that the
high-level metadata that is being used today works as anticipated, it is becoming clear that metadata
related to the next tier of information is likewise necessary. Type or class relations keep companies
that represent more detailed information, and those who use that information, from misclassifying a
fact. For more information, please see Chain of Capabilities Necessary to Automate Accounting Processes18.
18 Chain of Capabilities Necessary to Automate Accounting Processes, http://xbrlsite.azurewebsites.net/2018/Library/ChainOfCapabilities.pdf