Neural Analysis and Correlation Engine vs. AI Generated Malware

Written by Abhishek Singh, Inception Cyber Co-Founder and CTO | Nov 12, 2024 9:46:37 PM

In our blog, ‘Staying Ahead: Understanding the Latest Email Evasion Tactics,’ we analyze the latest threats and delve into the details of today’s malicious attachments and URLs. In the current threat landscape, evasion techniques span the entire attack kill chain, hiding malicious payloads at every stage to bypass detection technologies. Generative AI models can adapt and learn, enabling them to craft payloads that use evasion techniques on an unprecedented scale. These capabilities can be leveraged to launch Advanced Persistent Threats (APTs) or campaign-based attacks designed to bypass current detection technology.

As shown in Figure 1.0, evidence is mounting that threat actors are using generative AI to craft malware. Furthermore, as per Black Hat MEA 2024, generative AI is also being leveraged to rapidly produce new malware variants embedded with evasion techniques designed to circumvent detection technologies.

Figure 1.0 Slides from an upcoming presentation at Black Hat: generative AI used to develop malware

Our core technology, the Neural Analysis and Correlation Engine (NACE), detects evasive malicious attachments and URLs without relying on traditional malicious payloads or behaviors, which are masked by advanced evasion techniques or appear further down the kill chain. By sidestepping any dependency on these malicious behaviors and features, NACE remains resilient against evasion tactics and against 0-day variants of malicious behaviors and features, whether introduced by threat actors or generated by artificial intelligence.

The LLMs and NLP models used in NACE identify the semantic and thematic meaning of emails, isolating those whose semantics have previously been used by threat actors to deliver malicious attachments or URLs. Once an email is isolated, the semantics and themes learned from its body and subject are used as a feature set to detect malicious attachments, URLs, and identity-based attacks.

To build such a system, we first designed a framework to extract semantic and thematic meaning from historical emails that threat actors used to deliver malicious attachments and URLs.

Figure 2.0 Architecture to extract the semantics and thematic meaning from emails

The text of historical emails used by threat actors was extracted, processed, and represented using a pre-trained embedding model, BGE-M3, to obtain dense vector representations of the content. To improve clustering performance, the dimensionality of these embeddings was reduced, which also made processing more efficient. Among several clustering algorithms tested, OPTICS (Ordering Points to Identify the Clustering Structure) was chosen to group the reduced embeddings based on their semantic similarity. Representative keywords for each cluster were derived using class-based Term Frequency-Inverse Document Frequency (c-TF-IDF), a modification of standard TF-IDF that treats each cluster as a single document. This method effectively captures the most relevant terms for each topic. Lastly, the semantic meanings of the clusters were extracted using Phi-3-Mini-4K-Instruct, an instruction-tuned small language model used here to generate and refine a semantic description of each cluster.
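The c-TF-IDF step described above can be sketched in a few lines. This is a minimal illustration rather than the NACE implementation: it assumes clustering has already assigned each email to a cluster, uses a hypothetical `tokenize` helper and toy data, and follows one common c-TF-IDF formulation, in which a term's frequency within a cluster is weighted by log(1 + A / f_t), where A is the average number of words per cluster and f_t is the term's frequency across all clusters.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Hypothetical helper: lowercase the text and keep alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def c_tf_idf_keywords(clusters, top_n=3):
    """Return the top_n highest-scoring terms for each cluster.

    clusters maps a cluster id to the list of email texts assigned to it.
    """
    # Treat each cluster as a single document by pooling its term counts.
    counts = {
        cid: Counter(tok for text in texts for tok in tokenize(text))
        for cid, texts in clusters.items()
    }
    if not counts:
        return {}
    # f_t: frequency of each term across all clusters.
    corpus_freq = Counter()
    for cluster_counts in counts.values():
        corpus_freq.update(cluster_counts)
    # A: average number of words per cluster.
    avg_words = sum(corpus_freq.values()) / len(counts)

    keywords = {}
    for cid, cluster_counts in counts.items():
        total = sum(cluster_counts.values())
        scores = {
            term: (freq / total) * math.log(1 + avg_words / corpus_freq[term])
            for term, freq in cluster_counts.items()
        }
        # Sort by descending score; break ties alphabetically for stable output.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        keywords[cid] = ranked[:top_n]
    return keywords


# Toy clusters (illustrative data, not taken from the analysis in this post).
toy_clusters = {
    "billing": ["Invoice payment overdue invoice", "Please review invoice payment"],
    "credentials": ["Password reset account", "Verify account password now"],
}
top_terms = c_tf_idf_keywords(toy_clusters, top_n=2)
```

Because each cluster is pooled into one document, terms that dominate a single cluster score highly, while terms spread across many clusters are down-weighted by the logarithmic factor.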

Our analysis also showed that threat actors extensively reuse the same semantics and topics to deliver malicious attachments and URLs.

Every incoming email is analyzed by the Neural Analysis and Correlation Engine (NACE) to determine whether its topic, semantic, and thematic elements align with those commonly used by threat actors to deliver malicious attachments and call-to-action URLs. Once this alignment is detected, the topic, semantic, and thematic information embedded in the email becomes part of a feature set in the NACE expert system, which identifies malicious attachments, URLs, and identity-based attacks without relying on indicators from the malicious payload itself.
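The post does not say how this alignment is measured. As one plausible sketch, assume each known threat topic is summarized by a centroid vector in the same embedding space as the incoming email, and that alignment means the cosine similarity to the nearest centroid exceeds a threshold. The names `match_topic` and `TopicMatch`, the topic labels, the three-dimensional toy vectors, and the 0.8 threshold are all hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class TopicMatch:
    topic: Optional[str]  # nearest known threat topic, or None if not aligned
    similarity: float     # cosine similarity to the nearest centroid
    aligned: bool         # True when similarity meets the threshold


def match_topic(
    email_vec: Sequence[float],
    centroids: Dict[str, Sequence[float]],
    threshold: float = 0.8,  # assumed value, not from the post
) -> TopicMatch:
    """Find the nearest topic centroid and decide whether the email aligns."""
    best_topic, best_sim = None, -1.0
    for topic, centroid in centroids.items():
        sim = cosine(email_vec, centroid)
        if sim > best_sim:
            best_topic, best_sim = topic, sim
    aligned = best_sim >= threshold
    return TopicMatch(best_topic if aligned else None, best_sim, aligned)


# Toy centroids standing in for clusters learned from historical emails.
toy_centroids = {
    "invoice_lure": [1.0, 0.0, 0.0],
    "password_reset_lure": [0.0, 1.0, 0.0],
}
suspicious = match_topic([0.9, 0.1, 0.0], toy_centroids)
unrelated = match_topic([0.0, 0.0, 1.0], toy_centroids)
```

In a design like this, the resulting `TopicMatch` would be one input among several to the downstream expert system, not a verdict on its own.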

Figure 3.0 Slides from our upcoming presentation at Black Hat: Detecting APT Attacks with NACE

The battle between AI-driven threats and security technology is just beginning. State-sponsored adversaries have substantial resources and technical expertise, and most knowledge of AI-based architectures is publicly available, so this confrontation will only grow more challenging.

Figure 4.0 State-affiliated threat actors leveraging AI

At InceptionCyber, we built the Neural Analysis and Correlation Engine (NACE) from the ground up, applying first principles to detect sophisticated attacks delivered through malicious attachments and URLs, such as ransomware and phishing, whether crafted by human threat actors or by generative AI.

Join us at Black Hat MEA 2024 and AVAR 2024 as we dive into the technical details of our Neural Analysis and Correlation Engine (NACE) and showcase key results demonstrating its impact against these advanced, evasive threats.