By Johannes C. Scholtes, Chairman and CSO, ZyLAB - The Five Golden Ws of Investigation

28 Oct 2014 12:52 PM | Chere Estrin (Administrator)
In every investigation, whether it is a criminal or legal investigation, there are five golden Ws that the investigator must answer in order to be successful. These are:

  • WHO is it about?
  • WHAT happened?
  • WHEN did it take place?
  • WHERE did it take place?
  • WHY did it happen?
Some might even consider two other pertinent questions:
  • HOW did it happen?
  • HOW MUCH or HOW OFTEN?
How do you cull through the vast volume of data to find these answers? Manually analyzing the data is very time-consuming and requires lots of resources, because you do not know exactly what you are looking for. What words should you search for in order to find the “smoking gun”? How do you find patterns among the words? Do you highlight the words as you manually review each document, then go back and see how they are connected? It’s not that simple. Criminals use aliases, and transfers may be made by unknown offshore companies or via unknown bank accounts. All of this complicates and slows down the investigation.

In addition, the volume and complexity of the electronic data that needs to be investigated continue to grow, exacerbating the problem. Of course, there are technologies that can help expedite the investigation. Computers can analyze large data sets for specific patterns at tremendous speed. Combined with other technological advances, including text mining, computational linguistics, statistics, machine learning and even artificial intelligence, this makes it much easier to focus the analysis on finding the five golden Ws.

Modern text mining and content analytics can search at a higher level than just keywords. For example, text mining can identify linguistic patterns like ‘someone pays someone else’ or ‘someone meets someone else at a certain location and at a certain time’ without knowing the exact names or amounts in advance. By combining such extracted patterns with simple statistics, one can identify previously unknown persons, companies and bank account numbers, and also spot code names and aliases.
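To make the ‘someone pays someone else’ idea concrete, here is a minimal sketch in Python. It is not how any particular product works: the regular expression, the function names `extract_payments` and `count_parties`, and the sample sentences are all invented for illustration, and a real text mining engine would rely on proper linguistic analysis rather than a single pattern. The sketch only shows the principle of matching a payment pattern without knowing the names beforehand, then counting how often each party appears.

```python
import re
from collections import Counter

# Assumption: a "name" is one or more capitalized words. Real systems use
# named-entity recognition; this crude rule is for illustration only.
NAME = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"

# Hypothetical pattern for "<payer> paid/pays/transferred/wired <amount> to <payee>".
PAYMENT = re.compile(
    rf"(?P<payer>{NAME}) (?:paid|pays|transferred|wired) "
    rf"(?P<amount>[$€]?\d[\d,.]*) to (?P<payee>{NAME})"
)


def extract_payments(documents):
    """Return (payer, amount, payee) tuples found in an iterable of strings."""
    hits = []
    for text in documents:
        for match in PAYMENT.finditer(text):
            hits.append((match["payer"], match["amount"], match["payee"]))
    return hits


def count_parties(payments):
    """Count how often each name occurs as payer or payee."""
    counts = Counter()
    for payer, _amount, payee in payments:
        counts[payer] += 1
        counts[payee] += 1
    return counts


# Invented sample sentences; none of the names were given to the pattern.
SAMPLE_DOCS = [
    "Last week Falcon paid $5,000 to Harbor Trading.",
    "Falcon wired 12,500 to Mr Gray on Tuesday.",
    "John Smith pays $300 to Harbor Trading every month.",
]

if __name__ == "__main__":
    payments = extract_payments(SAMPLE_DOCS)
    for name, total in count_parties(payments).most_common():
        print(name, total)
```

A name such as "Falcon" that recurs across payments but matches no known person or company is the kind of candidate alias or code name an investigator would then examine by hand.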

Criminals will try to cover up illegal activities by hiding information in non-searchable file formats or by embedding different types of electronic objects within complex compound files, where the most relevant information is often hidden in the deepest layers. Your solution needs to identify information even at those deepest layers and be able to search seemingly unsearchable formats such as bitmaps, images, non-searchable PDFs, audio files or even video. By combining text mining with advanced analytics, relevant information can be identified many times faster and more efficiently than humans could manage alone. Investigators can then validate that information to avoid so-called tunnel vision and to identify invalid evidence or unproductive lines of investigation.
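As a rough illustration of reaching the "deepest layers" of a compound file, the sketch below walks through ZIP archives nested inside other ZIP archives using only Python's standard library. It makes several simplifying assumptions: only ZIP containers are handled, the helper names `walk_container` and `make_zip` are invented for this example, and formats such as scanned images or audio would need OCR or speech recognition, which is out of scope here.

```python
import io
import zipfile


def walk_container(data, path="", max_depth=10):
    """Yield (path, bytes) for every leaf file, descending into nested ZIPs.

    `max_depth` is an illustrative safeguard against pathologically deep
    nesting; the value 10 is an arbitrary assumption.
    """
    if max_depth < 0:
        raise ValueError("nesting too deep: " + path)
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            member_path = f"{path}/{info.filename}" if path else info.filename
            payload = archive.read(info)
            if zipfile.is_zipfile(io.BytesIO(payload)):
                # The member is itself an archive: keep digging.
                yield from walk_container(payload, member_path, max_depth - 1)
            else:
                yield member_path, payload


def make_zip(files):
    """Build an in-memory ZIP from a {name: bytes} mapping (demo helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


if __name__ == "__main__":
    # Invented example: a ledger hidden two archives deep.
    inner = make_zip({"ledger.txt": b"wire 12,500 to account X"})
    middle = make_zip({"attachments/inner.zip": inner, "note.txt": b"see attachment"})
    outer = make_zip({"mail.zip": middle})
    for member_path, payload in walk_container(outer):
        print(member_path, len(payload))
```

Each leaf file comes back with its full nesting path, so extracted text can still be traced to the container it was found in when it is handed to search or text mining.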

Over the years, I have seen many real-life cases where this hybrid man-machine approach has identified twice the amount of relevant information with half the resources in half the time! This is a great example where Big Data analytics can lead to Big Savings!
