What makes the malicious document problem a good candidate for machine learning (ML)? Could you have used rules instead?
“Millions of documents in milliseconds” is a claim we are not even sure how to parse - what is involved in making that work?
Can you explain to the listeners the motivation for reanalyzing old samples, what "ground truth" means in ML and detection engineering, and how you are using this technique?
How fast do attackers evolve, and does this throw the ML logic off?
Do our cat-and-mouse efforts with attackers make the mice harder for other people to catch? Does massive-scale ML detection accelerate the attackers' evolution?