What is a one sentence summary of your feature request?
Enable OCR processing for images embedded within PDF files to allow content inspection and policy enforcement.
Please describe your idea in detail. What is your problem, why do you feel this idea is the best solution, etc.
Currently, Endpoint Protector can scan and analyze text-based documents for sensitive content, but it cannot extract or analyze text inside image-based PDFs, such as scanned documents or screenshots embedded in PDFs. This creates a blind spot: sensitive information can be exfiltrated simply by embedding it as an image. Adding OCR to analyze images within PDFs would close this gap and extend data loss prevention coverage to image-based PDF content. It would also support compliance with data protection regulations and improve visibility into potential data leakage attempts.
How do you currently solve the challenges you have by not having this feature?
Today, we rely on manual review or on separate OCR tools upstream in our document workflows to convert image-based PDFs into searchable text before Endpoint Protector scans them. This is inefficient, prone to human error, and leads to inconsistent enforcement. It also leaves real-time monitoring and blocking ineffective against this vector.