Editor’s Note: The following article provides a structured and comprehensive overview of the LegalTechTalk session titled Navigating Emerging Data Sources in E-Discovery & Investigations, held on June 26, 2025. Moderated by Tom Makin, Managing Associate in Disputes & Investigations at Simmons & Simmons, the session featured insights from Christina Zachariasen, eDiscovery Executive Director at A&O Shearman; Jeffrey Shapiro, Director of Forensic and Integrity Services at EY; and Michael Sarlo, Chief Innovation Officer and President of Global Investigations and Cyber Incident Response Services at HaystackID. Together, the panel examined the accelerating complexity of emerging data sources—ranging from collaboration platforms and cloud applications to ephemeral messaging and generative AI—and their growing role in legal investigations, regulatory response, and discovery workflows.
Designed for legal technology professionals, cybersecurity specialists, and information governance leaders, this discussion emphasized the practical strategies necessary to manage non-traditional electronically stored information (ESI) with defensibility, efficiency, and compliance. The panelists explored the benefits of early expert engagement, internal data containment, and transparent AI integration, while also addressing the importance of risk-aware governance models. As modern communication ecosystems continue to evolve, this session offered timely and actionable guidance for professionals seeking to adapt their legal and investigative capabilities to the realities of today’s data landscape.
Industry News – Technology Beat
Managing Emerging Data in eDiscovery: Lessons from LegalTechTalk 2025
ComplexDiscovery Staff
Leading experts from HaystackID, EY, and A&O Shearman share insights on handling collaboration tools, cloud data, and AI in modern discovery workflows.
Legal discovery is undergoing a fundamental transformation as new categories of electronically stored information (ESI) continue to emerge at scale. At LegalTechTalk, a distinguished panel moderated by Tom Makin, Managing Associate in Disputes & Investigations at Simmons & Simmons, discussed this dynamic shift in a session focused on navigating emerging data sources.
Panelists included Christina Zachariasen, eDiscovery Executive Director at A&O Shearman; Jeffrey Shapiro, Director of Forensic & Integrity Services at EY; and Michael Sarlo, Chief Innovation Officer at HaystackID. Together, they explored how legal teams can adjust their workflows, governance models, and technologies to meet the growing demands of a fragmented data environment.
What Qualifies as Emerging Data?
Jeffrey Shapiro characterized emerging data sources (EDS) as “anything not an email or traditional office document,” encompassing chats, ephemeral messaging, video, audio, generative AI artifacts, and more. The legal implications of these formats are mounting, not only for disputes and investigations but also for regulatory compliance and privacy oversight. Shapiro emphasized, “Emerging data sources matter because they reflect mission-critical communications within modern organizations.”
Zachariasen added that data from messaging platforms like Slack and Microsoft Teams has now overtaken email in many document collections. Relativity reported a 430% increase in short message data volumes between 2022 and 2023, illustrating the scale and acceleration of this shift.
Bridging the Gaps: Strategy, Technology, and Governance
Managing these complex data sources requires early engagement of technical experts. As Christina Zachariasen noted, “Exporting from these platforms can be slightly treacherous,” with variable export formats affecting everything from processing to review. Panelists stressed the value of involving eDiscovery and forensic specialists at the outset to mitigate downstream complications and preserve defensibility.
Sarlo reinforced the importance of working directly with stakeholders to understand how platforms are used, particularly given the lack of clear custodian boundaries in collaborative environments. “When you’re dealing with many of these alternative data sources, we start to look at who’s talking to whom and about what as a method to reduce volume early on,” he said.
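To make the "who's talking to whom" idea concrete, the following is a minimal Python sketch of how communication pairs might be counted and used to narrow a message set. It is an illustration only, not a method described by the panel: the message structure (the `sender`, `recipients`, and `text` fields), the sample data, and the function names are all assumptions, and real chat exports vary by platform.

```python
from collections import Counter

# Hypothetical, simplified message records; real platform exports differ.
SAMPLE_MESSAGES = [
    {"sender": "alice", "recipients": ["bob"], "text": "Draft is ready"},
    {"sender": "bob", "recipients": ["alice", "carol"], "text": "Reviewing now"},
    {"sender": "carol", "recipients": ["dave"], "text": "Lunch?"},
]


def count_pairs(messages):
    """Count messages per unordered sender/recipient pair."""
    pairs = Counter()
    for msg in messages:
        sender = msg["sender"]
        for recipient in msg["recipients"]:
            if recipient != sender:
                pairs[frozenset((sender, recipient))] += 1
    return pairs


def messages_between(messages, people):
    """Keep messages that involve at least two of the named people."""
    people = set(people)
    kept = []
    for msg in messages:
        participants = {msg["sender"], *msg["recipients"]}
        if len(participants & people) >= 2:
            kept.append(msg)
    return kept


if __name__ == "__main__":
    # List the busiest pairs first, then narrow to one pair of interest.
    for pair, count in count_pairs(SAMPLE_MESSAGES).most_common():
        print(sorted(pair), count)
    print(len(messages_between(SAMPLE_MESSAGES, ["alice", "bob"])))
```

In a sketch like this, the pair counts give a quick view of where conversation is concentrated, and the filter shows one simple way volume could be reduced before review. Any real workflow would also need to account for channels, threads, and participants who join or leave over time.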
Jeffrey Shapiro tied the conversation back to the broader imperative of sound information governance, framing it as an enterprise-wide strategic necessity. “This isn’t just for litigation or investigations. It’s an imperative for every business,” he said. Effective governance not only aligns with regulatory frameworks such as GDPR but also improves operational insight and security postures.
Data Integration and Platform Limitations
Even as tooling improves, panelists agreed that integrating disparate structured and unstructured data sources remains a challenge. While the industry has matured beyond the manual processes of a decade ago, no comprehensive solution currently exists for unifying all forms of ESI into a cohesive narrative.
Shapiro emphasized that although innovation has accelerated, organizations must still rely on “good advisors, good technologists, and good technology” to extract insights. Constant platform evolution—especially APIs and export methods—further complicates the landscape. Sarlo highlighted this challenge specifically in mobile and Slack data acquisition: “You need to go and check every single time if the acquisition method you use actually still works today.”
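One way to act on the advice to re-check an acquisition method each time is to run a basic sanity check on every export before processing it. The Python sketch below is a hedged illustration of that idea, not a tool or procedure named by the panel. The required field names and the record layout are hypothetical; an actual check would be built around the specific platform's current export format.

```python
# Hypothetical set of fields a downstream workflow expects in each record.
REQUIRED_FIELDS = frozenset({"message_id", "sender", "timestamp", "text"})


def missing_fields(records, required=REQUIRED_FIELDS):
    """Return {record index: sorted missing field names} for incomplete records."""
    problems = {}
    for index, record in enumerate(records):
        missing = set(required) - set(record)
        if missing:
            problems[index] = sorted(missing)
    return problems


if __name__ == "__main__":
    # Illustrative export: the second record lacks fields the workflow needs.
    export = [
        {"message_id": 1, "sender": "alice", "timestamp": "2025-06-26T09:00:00Z", "text": "Hi"},
        {"message_id": 2, "sender": "bob"},
    ]
    issues = missing_fields(export)
    if issues:
        print("Export may have changed; incomplete records:", issues)
    else:
        print("All records contain the expected fields.")
```

A check of this kind only confirms that expected fields are present. It would not detect silently dropped messages or changed semantics, so it complements rather than replaces validation by forensic specialists.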
Security and Compliance Considerations
As the use of emerging data sources grows, so do the associated risks. The session addressed security and compliance head-on, especially in the context of cross-border data transfers and cyber incidents. Zachariasen underscored the enduring importance of basic security hygiene, such as segregated software instances and encrypted data transfers.
From a breach response standpoint, Sarlo advised organizations to engage outside counsel promptly and not to rely too heavily on internal IT. He explained, “You don’t trust your in-house IT team and you definitely rely on third-party consultants to validate what you’re seeing early on.” He also warned of logging limitations, which can significantly hinder incident investigations if not addressed proactively.
Shapiro noted that the strategic significance of data risk is now well documented in corporate risk disclosures. “Cyber risks and risks around AI are either a principal or emerging risk for nearly every Fortune 100 company,” he stated.
Minimizing Exposure: The Case for Internal Review
Minimizing data movement can reduce risk and increase efficiency. Increasingly, companies are leveraging internal tools for early data assessment. Solutions embedded in platforms like Microsoft 365 are helping organizations perform initial reviews within their firewall before escalating to external providers.
While these tools are not yet comprehensive, Shapiro recommended exploring internal eDiscovery capabilities as a way to “minimize what goes out and maximize efficiency.” Keeping review processes internal where possible supports cost control, data security, and regulatory compliance.
AI in E-Discovery: A Paradigm Shift in Progress
In the final segment, the panel turned to artificial intelligence, particularly the integration of generative AI into investigative workflows. Traditional TAR (technology-assisted review) has been used for over a decade, but GenAI is beginning to shift how teams conduct early case assessment (ECA), review, and issue analysis.
Christina Zachariasen noted that while TAR remains valuable, GenAI brings new efficiencies and predictability to document review. “I can give you a price for what it costs to run X number of documents, and I know it’s going to be that number,” she said.
Michael Sarlo emphasized that GenAI is most effective when used with intent and context. “It doesn’t work as well if you don’t know what you want to find,” he said. However, for known issues, timelines, and visual content, it delivers significant value. Sarlo recounted the use of AI to summarize construction site photos and flag OSHA violations—an application that would have been impractical with human reviewers alone.
The Future is Transparent and Defensible
Transparency, defensibility, and human oversight remain critical. Shapiro reminded the audience that AI must be deployed with clear protocols, noting that most GenAI tools now support traceable decision-making: “You have to make this a defensible process. There’s a legal underpinning here.”
Prompt engineering has become a core capability, but it is evolving. Today’s review protocols will increasingly inform AI models, streamlining and scaling human expertise.
Staying Ahead of the Data Curve
The growing ubiquity of emerging data sources demands a reevaluation of traditional legal discovery practices. Whether navigating collaborative chat logs, multimedia files, or AI-generated records, professionals in legal technology, cybersecurity, and information governance must adapt to new paradigms for risk mitigation, compliance, and insight generation.
LegalTechTalk’s session offered both a warning and a roadmap: organizations that invest in governance, maintain defensible workflows, and leverage responsible AI will be best positioned to manage the scale and speed of modern data. As the session demonstrated, the question is not whether these changes are coming, but how fast teams can prepare.
Returning to Shapiro’s opening reflection: professionals gather in sessions like these because “emerging data sources matter.” For those managing legal risk and digital evidence, understanding and addressing these new information frontiers is no longer optional; it is operationally and strategically essential.
News Sources
- Makin, Tom, et al. Navigating Emerging Data Sources in E-Discovery & Investigations. Panel discussion transcript, LegalTechTalk, London, 2025.
- LegalTechTalk 2025 – Europe’s Event for Legal Transformation
Assisted by GAI and LLM Technologies
Additional Reading
- What UK Law Firms Really Want: Procurement Insights from LegalTechTalk 2025
- The O2’s Transformation Sets the Stage for LegalTechTalk 2025
- HaystackID® Elevates Legal Tech Strategy Across Europe with Case Insight and CoreFlex (HaystackID)
Source: ComplexDiscovery OÜ