Challenge
Our clients are inundated with large volumes of unstructured and semi-structured data from disparate sources—reports, feeds, logs and records. A lack of standardized formatting, overlapping information and incomplete entries make it difficult to identify emerging risks and track trends in real time. Manual triage is labor-intensive, error-prone and too slow to support timely interventions.
Solution
Noblis led the research and developed a prototype ML-powered triage tool comprising:
- Clustering algorithms to group semantically similar records regardless of source or syntax
- Fuzzy matching to reconcile near-duplicate entities like “John Smith event” and “J. Smith appointment”
- Data aggregation to combine disparate data sources, types and records like pieces in a puzzle
- Network visualizations to map relationships among entities (e.g., names, clusters, time-stamped locations), enabling users to visually track high-value connections and interactions
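To make the fuzzy matching and grouping ideas above concrete, the following is a minimal illustrative sketch and not the Noblis prototype itself. It assumes plain-text record labels, uses Python's standard-library `difflib.SequenceMatcher` as a stand-in similarity measure, and treats any pair scoring above a threshold as an edge in a similarity network. The function names (`normalize`, `similarity`, `group_records`) and the threshold values are hypothetical choices for illustration only.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def normalize(label):
    """Lowercase, drop periods, and collapse whitespace (illustrative cleanup only)."""
    return " ".join(label.lower().replace(".", " ").split())


def similarity(a, b):
    """Return a 0..1 similarity score between two record labels."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def group_records(records, threshold=0.6):
    """Group records whose pairwise similarity meets the threshold.

    Each qualifying pair is treated as an edge in a similarity network;
    groups are the connected components, found with a simple union-find.
    The threshold is an assumed value and would need tuning on real data.
    """
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if similarity(records[i], records[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = defaultdict(list)
    for i, record in enumerate(records):
        groups[find(i)].append(record)
    return list(groups.values())


if __name__ == "__main__":
    sample = ["John Smith event", "J. Smith appointment", "Server outage report"]
    for group in group_records(sample):
        print(group)
```

Character-level similarity is only one option; a production system would more likely combine semantic embeddings with entity-aware matching, and the pairwise comparison shown here scales quadratically with the number of records.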
Impacts
Our clients quickly realized operational results, including:
- Efficiency Gains such as reduced manual triage time, enabling analysts to focus on deep analysis
- Faster Detection, with high-value targets identified in minutes rather than days, allowing for quicker interdiction
- Actionable Insights via interactive network graphs that helped uncover non-obvious threat and risk vectors
- Expanded Depth through the ability to draw on valuable information from different sources and types to build a comprehensive picture of a target