The Social Engineering Issue and KINAITICS’s SED Tool

12 March 2024

Author: Marco Angelini / ENG


If we visualize an organization’s security as a chain, employees are the weakest link in its cyber defense. In KINAITICS, we will focus on phishing, which is the most common social engineering threat that companies face every day.

Kevin Mitnick (aka the Condor), one of the world’s most famous hackers and CEO of a computer security and consulting firm, was convinced that a company can invest millions of dollars in its own software, hardware, and state-of-the-art security devices, but if there is even a single employee in the company who can be manipulated with a social engineering attack, all the money invested will have been wasted.

Detecting social engineering means adopting methods and techniques that identify suspicious activities or attempts at psychological manipulation by attackers.

In KINAITICS, a new tool is going to be developed: the Social Engineering Detection (SED) tool. Its methodology takes a hybrid approach that combines heuristics, blacklists and machine learning to optimize both performance and detection response. The tool relies on external sources (e.g. Phishtank, VirusTotal, ip2c, HuggingFace and others) to keep its detection capability up to date. Additionally, to preserve the organization’s privacy, emails will be anonymized.
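As a rough illustration of how such a hybrid approach could be wired together, here is a minimal Python sketch. It is not the SED implementation: the blacklist contents, the urgency keywords, the equal weighting and the `ml_model` callable are all assumptions made for this example. The idea it shows is that a cheap blacklist lookup runs first, and only then are a heuristic score and a model score combined.

```python
import re
from dataclasses import dataclass
from typing import Callable
from urllib.parse import urlparse

# Hypothetical blacklist; a real deployment would refresh it from external feeds.
BLACKLISTED_DOMAINS = {"malicious.example"}

# Hypothetical urgency cues used by the heuristic step.
URGENCY_CUES = ("urgent", "immediately", "verify your account", "password")

URL_RE = re.compile(r"https?://[^\s<>\"']+")


@dataclass
class Verdict:
    label: str   # "phishing" or "legitimate"
    reason: str  # which step produced the decision


def extract_domains(text: str) -> set:
    """Collect the host names of all URLs found in the text."""
    return {urlparse(url).hostname or "" for url in URL_RE.findall(text)}


def heuristic_score(text: str) -> float:
    """Fraction of urgency cues present in the text, between 0 and 1."""
    lowered = text.lower()
    return sum(cue in lowered for cue in URGENCY_CUES) / len(URGENCY_CUES)


def classify(text: str, ml_model: Callable[[str], float],
             threshold: float = 0.5) -> Verdict:
    # Step 1: blacklist lookup, the cheapest check, so it runs first.
    listed = extract_domains(text) & BLACKLISTED_DOMAINS
    if listed:
        return Verdict("phishing", "blacklisted domain: " + sorted(listed)[0])

    # Steps 2 and 3: average the heuristic score and the model probability.
    score = 0.5 * heuristic_score(text) + 0.5 * ml_model(text)
    label = "phishing" if score >= threshold else "legitimate"
    return Verdict(label, f"combined score {score:.2f}")
```

Here `ml_model` stands for any function that maps an email text to a phishing probability between 0 and 1, for instance a wrapper around a trained classifier. Short-circuiting on the blacklist avoids the cost of running the model on messages that are already known to be malicious.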

The tool is also powered by natural language processing (NLP), which improves the ML model’s understanding of textual content. This involves syntactic and semantic analysis that covers the entire NLP chain, including parsing, tagging, word sense disambiguation, categorization, clustering, summarization, text similarity, sentiment analysis, emotion recognition and identification of textual patterns.


The tool carries out a three-phase analysis and shows the results to the user:

  1. Header analysis
  2. Body analysis
  3. Attachment analysis
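To make the three phases more concrete, the following sketch splits a raw email into header, body and attachment parts using Python’s standard `email` package. The two checks shown (a Reply-To domain that differs from the From domain, and an attachment with a double extension) are assumptions chosen for illustration, not the checks SED actually performs, and the sample message is invented.

```python
from email import message_from_string, policy

# Invented sample message, used only for this illustration.
RAW = """\
From: "IT Support" <support@example.com>
Reply-To: attacker@other.example
To: user@example.org
Subject: Password reset
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="BOUND"

--BOUND
Content-Type: text/plain; charset="utf-8"

Please open the attached form.
--BOUND
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="form.pdf.exe"
Content-Transfer-Encoding: base64

aGVsbG8=
--BOUND--
"""

RISKY_EXTENSIONS = (".exe", ".js", ".scr")  # hypothetical list


def analyze(raw: str) -> dict:
    msg = message_from_string(raw, policy=policy.default)

    # Phase 1: header analysis. Compare the From and Reply-To domains.
    sender = msg["From"].addresses[0]
    reply_to = msg["Reply-To"].addresses[0] if msg["Reply-To"] else sender
    mismatch = sender.domain.lower() != reply_to.domain.lower()

    # Phase 2: body analysis. Extract the text that later checks would inspect.
    body = msg.get_body(preferencelist=("plain", "html"))
    body_text = body.get_content() if body is not None else ""

    # Phase 3: attachment analysis. Flag names such as "form.pdf.exe".
    suspicious = []
    for part in msg.iter_attachments():
        name = part.get_filename() or ""
        if name.lower().endswith(RISKY_EXTENSIONS) and name.count(".") >= 2:
            suspicious.append(name)

    return {
        "reply_to_mismatch": mismatch,
        "body_text": body_text,
        "suspicious_attachments": suspicious,
    }
```

Calling `analyze(RAW)` on the sample message reports a Reply-To mismatch and flags `form.pdf.exe`. The sketch assumes that a From header is present; a production tool would also need to handle missing or malformed headers.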

Furthermore, emails generally contain a lot of sensitive personal information (Personally Identifiable Information, or PII) that can identify an individual. However, such information is not needed to detect social engineering patterns in email text, images and attachments, and, in line with the General Data Protection Regulation (GDPR), social engineering detection tools do not need to process it. That’s why SED contains a component that anonymizes emails. The goal is to identify all PII entities in email messages and make them unidentifiable (de-identification). To this end, an open-source library by Microsoft called “Presidio: Data Protection and De-Identification SDK”[1] has been used.
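The de-identification idea can be illustrated with a few lines of Python. This sketch deliberately does not use Presidio or reproduce its API. It is a simplified regex stand-in, and the entity names and patterns are assumptions covering only three entity types. It shows the principle of replacing each detected PII span with a placeholder that names the entity type, so that the remaining text can still be analyzed.

```python
import re

# Simplified, hypothetical patterns; a real PII detector covers many more
# entity types (names, locations, IDs) and uses NLP models, not only regexes.
# Order matters: IP addresses are replaced before the looser phone pattern.
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "IP_ADDRESS": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "PHONE_NUMBER": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def deidentify(text: str) -> str:
    """Replace every matched PII span with a <ENTITY_TYPE> placeholder."""
    for entity, pattern in PATTERNS.items():
        text = pattern.sub(f"<{entity}>", text)
    return text


sample = "Contact mario.rossi@example.com or +39 06 1234 5678 from 192.168.1.10"
print(deidentify(sample))
# prints: Contact <EMAIL_ADDRESS> or <PHONE_NUMBER> from <IP_ADDRESS>
```

Because the placeholders keep the entity type, a downstream detector can still see that a message mentions an email address or a phone number without ever seeing the actual values.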