Real-Time Data Engineering for a Cybersecurity Company

A Fortune 500 company and world leader in cybersecurity turned to Aligned Research Group to implement a true real-time security event pipeline for one of its products aimed at telecoms.

With zero-day attacks becoming zero-hour and zero-minute, automated data stream processing pipelines have become critical for keeping the upper hand against bad actors. While the customer’s Kafka-based data collection infrastructure supported millions of networking events per second, most of the processing was done in batch jobs run at varying frequencies.

Aligned Research Group architects, working closely with the customer’s data science and data engineering teams, designed a hybrid real-time/batch processing pipeline architecture, along with a roadmap for implementation and production deployment. During the implementation phase, engineers working on-site and from our locations in the US and Porto, Portugal wrote a number of analytic and ETL tools in several languages (Python, C++, Go, Scala), deployed a Kubernetes cluster, built CI/CD for automated testing and DevOps, and migrated and rewrote a number of batch processing scripts to fit the real-time model.

Special attention was paid to converting data science algorithms to work in a real-time production environment. Maintaining flexibility for innovation is critical to developing novel strategies for detecting and remediating new attacks, and the design of the DevOps environment, optimized for machine learning algorithms, reflected this need. Creating new data workflows involved optimizing anomaly detection, building a specialized in-memory high-throughput graph engine, and creating other components that work together to help detect attacks in real time. Our supercomputing implementation experience came in handy when building several GPU-powered algorithms, including a large-scale implementation of a neural network for deep learning.

Impressed with the robustness of the engineered data processing pipeline and the security impact of the real-time analysis, the customer retained Aligned Research Group to continue working with its engineering team to improve the pipeline and add new algorithms.

This project required a combination of several major competencies of Aligned Research Group: scalable data engineering and DevOps, cybersecurity expertise, telecommunications industry knowledge, data science, and experience with massively parallel computing on NVIDIA GPGPUs. We played a major role in designing the data processing pipelines and transforming them from batch processing to real-time streaming, reducing latency, improving uptime, and improving the effectiveness of the customer’s product. While the sensitive nature of this project limits what we can disclose, the customer would be happy to provide a reference for this engagement upon request.