| Benford analysis the easy way |
|
As interest in fraud detection grows, the use of Benford analysis is growing in the audit community. This technique relies on comparing the distribution of the leftmost digit in a set of numbers to a theoretical distribution. This is something nextanalytics does with ease - no spreadsheet forumlas to copy or break, adapting to new data sets with almost no effort.
As reference and background, we have used a dataset provided by the Journal of Accountancy in an article written by Lehman, Weidenmeir Watson and Jones. Assuming we start with their source data (saved to a CSV file) and have to accept it as-is, which is typical in an audit situation, our first task is to import the data we want in the format we need.
Transaction,Date,Employee,Amount,Manager A simple ImportData command brings in the first 'Transaction' column from the CSV file as numeric data, ignores the date because we won't use it, grabs the 'Employee' column as the row label, the 'Amount' as a text column, and ignores the last 'Manager' column. importdata,SuperBenford.csv,0,1,numeric_,ignore_,rowlabel_,text_,ignore_
Once we have the data, we can filter to just the rows where the Amount starts with '1', then get a collapsed list of Employee numbers and a count.
selectbycaptions,KeepBySel,Row,0,False,-1,False,,True,StartsWith,1,1
Repeating for the digits 2 through 9, we are left with nine different pages, one for each digit, which we join together with a single command. ConcatenatePagesWith,Digit,StartsWith
We then perform a simple pivot to get the digits across the columns. This gives us a table profiling each employee and the number of times they had amounts starting with each digit. SwapTextColumnWithColumn,From
Converting each row into a ratio of the row total, we have the proportion of each employee's amounts that started with each digit. Compare,ToAxis,Ratio,Sum,Row,,,-1,-1,0,0,False,False,30, These results can now be compared to the theoretical or ideal value for each digit, loaded as a separate ('Target') page. This comparison can be expressed as percentage of the target, where 100 is a perfect match. Compare,ToARow,PercentOf,Actual,Row,Target,,Target,-1,0,0,True,False,1, With nextanalytics, we can also easily create arbitrary alert levels, such as plus or minus 2.5 or 5 percent. Simple steps that make the headlines stand out on the page.
That is all there is. Got another dataset, simply change one line to import the data - the rest of the project runs as is.
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||