Inception Cyber's Responsible AI Commitment
Last Revision: March 15, 2025
Initial Version: December 5, 2024
Introduction
At Inception Cyber, our commitment to Responsible AI is rooted in decades of real-world cybersecurity experience and a profound emphasis on data security and privacy. One of our founders is from Europe—where data protection laws are notably rigorous—adding a global perspective that underscores our belief in privacy as a universal priority. We strive to harness the latest AI breakthroughs in a way that respects user data and fosters enduring confidence among our customers. Inspired by frameworks from industry leaders like IBM, Microsoft, and Google, we pursue an ethical, transparent approach that safeguards both individual privacy and organizational resilience.
Why Responsible AI Matters
- Trust: AI-driven solutions must maintain customer trust by handling data ethically and securely.
- Privacy: Users deserve assurance that their information will not be misused for unintended purposes.
- Fairness & Accountability: AI should avoid unintended biases, and its creators should be accountable for outcomes.
- Long-Term Viability: An AI solution that respects ethical and privacy boundaries is more sustainable in the face of evolving regulations and customer expectations.
Our Approach to AI in Email Security
At Inception Cyber, we combine generative and predictive AI with advanced threat detection techniques to proactively identify malware, phishing attempts, and business email compromise (BEC) campaigns. We have also put strict safeguards in place to protect the data we analyze:
Limited Data Retention
- We do not store emails after they are analyzed unless they are confirmed to be malicious. This ensures that legitimate email content is not retained, preserving end-user privacy.
- We never collect or store data solely for the purpose of training models on individual user behaviors.
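The retention rule above can be sketched as a simple gate. This is a minimal illustration, not our production pipeline; the `classify` and `store_malicious` hooks are hypothetical stand-ins for the actual analysis and archival components.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Email:
    message_id: str
    body: str

def process_email(
    email: Email,
    classify: Callable[[Email], str],          # returns "malicious" or "benign"
    store_malicious: Callable[[Email], None],  # archive for confirmed threats only
) -> str:
    """Analyze an email and retain it only when confirmed malicious.

    Benign content is dropped immediately after analysis, so legitimate
    email is never persisted.
    """
    verdict = classify(email)
    if verdict == "malicious":
        store_malicious(email)  # retained for threat research and detection refinement
    # Benign emails simply fall out of scope here and are not stored.
    return verdict
```

In this sketch, retention is a consequence of the verdict rather than a default: nothing is written unless the classifier confirms the email is malicious.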
No Training on Customer Emails
- Our AI models are developed and refined using curated, ethically sourced datasets and do not rely on customer emails for training.
- This helps us avoid bias and hallucinations (unintended or fabricated outputs) that can arise when systems learn from limited or unverified user data. It also ensures we do not inadvertently replicate or expose customer information within our AI models.
- No manual data labeling: Because we don’t train on customer emails, our team never needs to read or label individual emails, further protecting privacy and preventing any accidental data exposure.
Use of Public Large Language Models
- We employ publicly available large language models, which we vertically adapt to understand intent in email content.
- Before integrating new versions or model updates, we conduct thorough internal testing and security reviews to confirm that the model:
- Maintains or improves detection accuracy.
- Aligns with our Responsible AI principles.
- Does not inadvertently expose or learn from sensitive customer data.
- By leveraging a shared AI foundation and customizing it for cybersecurity use cases, we ensure a robust, specialized approach—without compromising on data privacy or ethical standards.
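The review criteria above can be approximated as a promotion gate. This is an illustrative sketch only: the metric names and thresholds are hypothetical, not Inception Cyber's actual release criteria, and the real review also covers privacy and data-exposure checks that are not reducible to two numbers.

```python
def should_promote(candidate: dict, baseline: dict,
                   max_false_positive_rate: float = 0.001) -> bool:
    """Promote a new model version only if it maintains or improves
    detection accuracy and stays within a false-positive budget.

    Hypothetical gate: metric names and the 0.1% FP budget are
    illustrative, not actual production thresholds.
    """
    return (
        candidate["detection_rate"] >= baseline["detection_rate"]
        and candidate["false_positive_rate"] <= max_false_positive_rate
    )
```

The design point this captures is that a model update must clear the bar on every criterion simultaneously; an accuracy gain cannot buy back a false-positive regression.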
No Profiling of Individual Employees
- We focus on attacker tactics, not user or employee behavior. Our detection methods prioritize threat-actor intent, leveraging the contextual relationship between intent and auxiliary features to reach a verdict.
- By avoiding personal or behavioral data in our training, we reduce the risk of bias and false conclusions. This strategy also ensures we can scale our protection without creating undue surveillance risks or false positives for large organizations.
Transparency & Consent
- We strive to keep our customers informed about what data is processed and how. We communicate clearly about the scope of our analysis and provide opt-in and opt-out mechanisms where feasible.
- In cases where we do collect or retain information (e.g., malicious emails), we clearly outline the reason and duration in our [Privacy Policy] and other customer-facing documentation.
How Have We Trained Our Models?
- Inception Cyber’s detection models draw on over a decade of evasive attack data, combined with continuously updated feeds from reputable threat intelligence sources. This historical dataset captures the evolution of malware, phishing, business email compromise (BEC), and malicious campaigns, along with the semantics threat actors use to deliver them, reflecting real-world attacker tactics and intent.
- Our training process has been overseen by AI researchers, data scientists, and malware experts who have tackled similar challenges at companies like Microsoft, FireEye, and Cisco. The design we use to extract semantics is described in our arXiv paper [4].
- Their domain expertise in both cybersecurity and artificial intelligence ensures our models are trained with a deep understanding of attacker methods and a commitment to robust, ethical data handling.
- Our patent-pending NACE technology is specifically designed not to require the landing URL or malicious features from an email's attachment. This is a key technological advantage: because NACE does not depend on the malicious payload or landing URL, it is immune to evasions, whether crafted by threat actors or by AI, that are designed to hide that payload or URL.
- For detailed information about NACE, please refer to our blogs [1][2][3], the AVAR 2024 conference presentation, and our arXiv [4] paper.
How Do We Keep the Model Updated?
- We periodically incorporate malicious emails from threat intelligence feeds to refine and enhance NACE. We store and analyze these emails only when they are confirmed to be malicious, as outlined in our Limited Data Retention policy. This keeps our models aligned with the latest real-world threats and the semantics attackers use to deliver malware, phishing, or BEC emails.
- We view this process as part of a broader industry practice: many cybersecurity vendors similarly gather and share malicious samples to benefit the larger security community. By doing so, we keep pace with rapidly evolving attack methods without compromising our commitment to responsible data handling or user privacy.
Inspired by Industry Best Practices
- IBM’s Principles for Trust and Transparency emphasize ethical, explainable AI. We take cues from this by sharing how our system works through blogs and conference presentations and by keeping our threat models auditable.
- Microsoft’s Responsible AI Standard focuses on accountability and reliability. Inception Cyber aims to adopt similar guardrails, regularly reviewing model outputs for efficacy and bias.
- Google’s AI Principles highlight safety, privacy, and wide benefits. We echo these concepts by prioritizing user protection and privacy in the face of ever-evolving threats.
How This Differentiates Inception Cyber
By not training on customer emails and not profiling individual employee behaviors, our detection models remain laser-focused on understanding the intent of an email and leveraging the contextual relationships between intent, SMTP headers, deep file parsing results, and auxiliary information from URL to deliver a verdict. Meanwhile, our selective use of public large language models ensures we benefit from cutting-edge research and advancements while preserving strict data governance. Our approach:
- Enhances Privacy: Protects customer data from being stored or inadvertently exposed in AI training sets.
- Reduces Bias & Hallucinations: Mitigates the risk of skewed or misleading patterns that may arise from user-specific data.
- Minimizes False Positives: Focuses on intent-based detection rather than employee-specific behaviors.
- Builds Trust: Maintains a clear firewall between user data and our underlying AI models while leveraging state-of-the-art open-source technologies.
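As a rough illustration of how intent can be weighed against contextual signals, consider the sketch below. The feature names, weights, and threshold are hypothetical and are not NACE's actual scoring; the point is only that the verdict emerges from the relationship between intent and auxiliary context rather than from any one signal.

```python
def combined_verdict(intent_score: float, context: dict,
                     threshold: float = 0.5) -> str:
    """Weigh an email's intent score against auxiliary context signals
    (SMTP-header anomaly, file-parsing risk, URL risk), each in [0, 1].

    Hypothetical weights for illustration only; not NACE's real model.
    """
    aux = (0.4 * context["header_anomaly"]
           + 0.3 * context["file_parse_risk"]
           + 0.3 * context["url_risk"])
    score = 0.6 * intent_score + 0.4 * aux
    return "malicious" if score >= threshold else "benign"
```

Note that none of these inputs describe an individual employee; all are properties of the message itself, which is what keeps this style of scoring free of user profiling.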
Conclusion
At Inception Cyber, Responsible AI is foundational to our mission and critical to preserving the trust of our customers. By adopting best practices from leading technology companies, enforcing strict data governance principles, and rigorously vetting public large language models, we ensure that our AI-driven email security platform safeguards your organization without compromising user privacy or ethical standards.
We believe this approach—balancing innovation with respect for data privacy—positions Inception Cyber at the forefront of responsible, intent-driven threat detection. We welcome questions, feedback, and collaboration as we continue to refine our methods and push the envelope of what’s possible in AI-driven cybersecurity.
For Further Reading:
[1] Harnessing Language Models to Stop Evasive Malicious Email Attachments explains NACE’s ability to detect malware and phishing URLs through a case study on the inception of exploitation. The blog compares various technologies (e.g., sandboxing, signature-based detection, EDR, XDR) with NACE and presents results showing how our technology detects threats that other solutions miss.
[2] Rethinking Threat Prevention with an Evasion-First Mindset discusses how LLMs are leveraged at a foundational level in our technology to prevent evasions.
[3] Transforming BEC Protection Against Generative AI Evasions with Deep Learning, Zero-Shot Classification, and Zero Trust Principles details the design used to identify BEC email messages, regardless of whether they are generated by a threat actor or an AI.
[4] Our arXiv paper, detailing the design used to extract semantics.