Using computed similarity of distinctive digital traces to evaluate non-obvious links and repetitions in cyber-investigations

Bollé, Timothy; Casey, Eoghan

doi:10.1016/j.diin.2018.01.002

Details

Download: 1-s2.0-S1742287618300343-main.pdf (1276.31 KB)
State: Public
Version: Final published version

Serval ID

serval:BIB_2F1A43ACD87E

Type

Article: article from journal or magazine.

Collection

Publications

Institution

UNIL/CHUV

Title

Using computed similarity of distinctive digital traces to evaluate non-obvious links and repetitions in cyber-investigations

Journal

Digital Investigation

Author(s)

Bollé Timothy, Casey Eoghan

ISSN

1742-2876

Publication state

Published

Issued date

03/2018

Peer-reviewed

Yes

Volume

Pages

S2-S9

Language

English

Abstract

This work addresses the challenge of discerning non-exact or non-obvious similarities between cyber-crimes, proposing a new approach to finding linkages and repetitions across cases in a cyber-investigation context using near similarity calculation of distinctive digital traces. A prototype system was developed to test the proposed approach, and the system was evaluated using digital traces collected during actual cyber-investigations. The prototype system also links cases on the basis of exact similarity between technical characteristics. This work found that the introduction of near similarity helps to confirm already existing links, and exposes additional linkages between cases. Automatic detection of near similarities across cybercrimes gives digital investigators a better understanding of the criminal context and the actual phenomenon, and can reveal a series of related offenses. Using case data from 207 cyber-investigations, this study evaluated the effectiveness of computing similarity between cases by applying string similarity algorithms to email addresses. The Levenshtein algorithm was selected as the best algorithm to segregate similar email addresses from non-similar ones. This work can be extended to other digital traces common in cybercrimes such as URLs and domain names. In addition to finding linkages between related cybercrimes at a technical level, similarities in patterns across cases provided insights at a behavioral level such as modus operandi (MO). This work also addresses the step that comes after the similarity computation, which is the linkage verification and the hypothesis formation. For forensic purposes, it is necessary to confirm that a near match with the similarity algorithm actually corresponds to a real relation between observed characteristics, and it is important to evaluate the likelihood that the disclosed similarity supports the hypothesis of the link between cases. This work recommends additional information, including certain technical, contextual and behavioral characteristics that could be collected routinely in cyber-investigations to support similarity computation and link evaluation.
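
The near-similarity idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' prototype: the normalization (edit distance divided by the longer address length), the lowercasing, the 0.8 threshold, the function names and the example addresses are all assumptions made here for illustration, since the abstract does not state them.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; the normalization is an assumption."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def near_matches(addresses, threshold=0.8):
    """Return pairs that are similar but not identical.

    The threshold is hypothetical. Exact matches are skipped because the
    abstract treats exact similarity as a separate linking mechanism.
    """
    pairs = []
    for i, first in enumerate(addresses):
        for second in addresses[i + 1:]:
            score = similarity(first, second)
            if threshold <= score < 1.0:
                pairs.append((first, second, score))
    return pairs


if __name__ == "__main__":
    # Made-up addresses, not case data from the study.
    sample = [
        "john.smith01@example.com",
        "john.smith02@example.com",
        "other@example.org",
    ]
    for first, second, score in near_matches(sample):
        print(f"{first} ~ {second}: {score:.3f}")
```

A pair flagged this way is only a candidate link. As the abstract stresses, a near match still has to be verified and weighed against the hypothesis that the cases are actually related.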

Keywords

Digital forensics, Digital traces, Digital evidence, Similarity measures, Email similarity, Trace similarity, Case comparisons, Case linkage, Cyber-investigation, Near similarity computation, Crime analysis, Forensic intelligence

URN

urn:nbn:ch:serval-BIB_2F1A43ACD87E3

OAI-PMH

oai:serval.unil.ch:BIB_2F1A43ACD87E

DOI

10.1016/j.diin.2018.01.002

Web of science

000428307900002

Publisher's website

http://www.dfrws.org/

Open Access

Yes