
AI for legal departments: Managing eDiscovery and data retention risks with Microsoft Copilot

23 min • June 20, 2024

Anthony Diana and Therese Craparo are joined by John Collins from Lighthouse to provide an overview of some of the challenges and strategies around data retention and eDiscovery with Microsoft’s AI tool, Copilot. This episode explores Copilot’s functionality within M365 applications and the complexities of preserving, collecting and producing Copilot data for legal purposes. The panelists cover practical tips on managing Copilot data, including considerations for a defensible legal hold process and the potential relevance of Copilot interactions in litigation.


Transcript:

Intro: Hello, and welcome to Tech Law Talks, a podcast brought to you by Reed Smith's Emerging Technologies Group. In each episode of this podcast, we will discuss cutting-edge issues on technology, data, and the law. We will provide practical observations on a wide variety of technology and data topics to give you quick and actionable tips to address the issues you are dealing with every day.

Anthony: Hello, this is Anthony Diana, a partner in the Emerging Technologies Group at Reed Smith, and welcome to the latest Tech Law Talks podcast, part of our ongoing podcast series with Lighthouse on Microsoft M365 Copilot and what legal departments should know about this generative AI tool in M365. Today, we'll be focused on data retention and e-discovery issues and risks with Copilot. I am joined today by Therese Craparo at Reed Smith and John Collins of Lighthouse. Welcome, guys. So, John, before we start, let's get some background on Copilot. We've done a few podcasts already introducing everyone to Copilot. So if you could just give a background on what is Copilot generally in M365.

John: Sure. So the Copilot we're talking about today is Copilot for Microsoft 365. It's the experience that's built into tools like Word, Excel, PowerPoint, Teams, Teams meetings. And basically what it is, is Microsoft's running a proprietary version of ChatGPT and they provide that to each one of their subscribers that gets Copilot. And then as the business people are using these different tools, they can use Copilot to help generate new content, summarize meetings, create PowerPoints. And it's generating a lot of information as we're going to be talking about.

Anthony: And I think one of the interesting things that we've emphasized in the other podcasts is that each M365 application is slightly different. So, you know, Copilot for Word is different from Copilot for Exchange, and they act differently, and you really have to understand the differences, which we talked about generally. So, okay, so let's just talk generally about the issue, which is retention and storage. So, John, why don't you give us a primer on where is the data generally stored when you're doing a prompt and response and getting information from Copilot?

John: So the kind of good news here is that the prompts and responses, so when you're asking Copilot to do something or if you're chatting with Copilot in one of the areas that you can chat with it, it's putting the back and forth into a hidden folder in the user's mailbox. So the user doesn't see it in their Outlook. The prompts and responses are there, and that's where Microsoft is storing them. So there's also files that get referenced that are stored in OneDrive and SharePoint, which we may talk about further. But in terms of the back and forth, those are stored in the Exchange mailbox.

Anthony: That's helpful. So, Therese, I know we've been working with some clients on trying to figure this out and doing testing and validation, and we've come across some exceptions. You want to talk about that process, I'll say.

Therese: I think that's one of the most important things when we're talking about really any aspect of Copilot or frankly, new technology, right? It's constantly developing and changing. And so you need to be testing and validating and make sure you're understanding how it's working. So as you said, Anthony, you know, we found early on, when our clients first started using Copilot, that the prompts and the responses for Outlook were not being retained in that hidden folder, right? And then Microsoft has since continued to develop the product. And now, in most cases, at least we're seeing they are. Similarly, for those of you who are using transcriptionless Copilot, so it's Copilot that can give you meeting summaries and answer questions during the meeting, but doesn't retain the transcript, because people had some concerns about retaining transcripts, we're seeing that those Copilot artifacts, so the prompt and the response, are currently not being retained in the hidden folder. So a lot of this is you need to understand how Copilot is working, but understand that it's also a dynamic product that's constantly changing. So you need to test it to make sure you're understanding what's happening in your environment with your organization's use of Copilot.

Anthony: Yeah. And I think it's critical that it has to constantly be tested and validated. Like any technology, you have to constantly test it because the way it's happening now, maybe even if it's being retained now, could change, right? If they revise the product or whatever. And we've seen this with Microsoft before where they may not always, you know, they change where it's stored because for whatever reason, they decided to change the storage. So if you have an e-discovery process and like, you just have to be aware of it. Or if you're trying to retain things and you're trying to govern retention, you just have to make sure you understand where things are stored. Okay, so John, if you could explain presently sort of how retention works with Copilot data that's stored in the hidden folder of Exchange.

John: So what Microsoft has done so far is they've made it possible for the prompts and responses that we were talking about. So when you're in Word or Excel or PowerPoint, or if you're using the chat function inside of Teams or in general, those are subject to the same retention that you've set for your one-to-one and group chats. So if you have a 30-day auto-delete policy for your one-to-one and group chats, that's going to be applied to these Copilot interactions. So I've heard, and I think you guys may have heard this as well, that Microsoft is going to split those off, but it's not on the roadmap that we've seen, but we've heard that they are going to make them two separate things. But right now they're one and the same.

Therese: Yeah, and I think it's the good and the bad news, right, for people who are looking at Copilot and how do I best manage it. The good news is that you can control retention. You can set a retention on it that's within the organization's control and you can make the decision that's right for your organization. The bad news is that, right now, it has to be the same as whatever you're setting for Teams chat, which may or may not be how long you would like to retain the Copilot data. So there are some features that are good, that give you a little bit of control to make decisions that are right for the organization. But right now, they're only partially controllable in that sense. So you have to make some decisions about, you know, how long do you need Teams chat? How long do you need Copilot? And where's the right place in the middle to meet business needs, right? And also to take into consideration how long this data should exist in your organization.

Anthony: Yeah. And John, we've heard the same thing and heard from Microsoft that they're working on it, but I haven't heard if there's a roadmap and when that's happening. But we have several clients who are monitoring this and are constantly in contact with Microsoft saying, when are we going to get that split? Because at least for a lot of our clients, they don't want the same retention. And I think, Therese, we could talk a little bit about it in terms of what people are thinking about, what to consider. Once we get to a place where we can actually apply a separate retention for Copilot, what are the factors to consider when you start thinking about what is the appropriate retention for this Copilot data?

John: And Therese, do you want them to be ephemeral where you could have a setting where they just go, they aren't captured anywhere? I'd be curious if you guys think that's something that you would want clients to consider as an option.

Therese: Well, look, I mean, the first thing all of our clients are looking at is business needs, right? As with anything, right? Do the artifacts from Copilot need to exist for a longer period of time for a business use? And in general, the answer has been no. There's no real reason to go back to that. There's no real reason to keep that data sitting in that folder. There's no use of it. The users don't go, like John, as you said, you can't see it. Users aren't going back to that data. So from a business perspective, you know, number one thing that we always consider, the answer has been no, we don't need to retain these Copilot artifacts for any business reason. The next thing we always look to is record retention, right? Is there a legal regulatory obligation to retain these Copilot artifacts that are coming out? And in most cases, when our clients look at it, the answer is no, it's not a record. It's not relied on for running the company. It doesn't currently fall under any of the regulations in terms of what a record is. It's convenience information or transitory information that may exist in the organization. So typically, again, that next component is records. And typically, at least with Copilot, we're seeing that the initial output from Copilot, the question and the response, are not considered records. And so once you get to that point, the question is, why do I need to keep it at all? Which is, John, to what you're alluding to, you know, today for all data types, whether it's Copilot or otherwise, right, over-retention presents risks. It presents privacy risks and security risks, all kinds of risks to the company. So the preference is to retain data only for as long as you need it. And if you don't need it, the question arises, do you need to keep it at all? Do you need it even for a day? Could you make it ephemeral so that it can just disappear when it's gone because it has served its useful life and there's no other reason to keep it? 
Now, we always have to consider legal holds whenever we have these conversations, because if you have a legal hold and you need to retain data going forward, you need to have a means of doing that. You need to be aware of how to retain that data, how to preserve that data for a legal hold purpose if you deem it to be relevant. So that's always the caveat. But typically when we're seeing people look at this, when they actually sit down and think about it, there hasn't been to date really a business or records reason to retain that data in the ordinary course of business.

Anthony: And so it's a matter of how do you enforce that? And whether, John, I mean, when we talk about ephemeral, it is retained. So ephemeral would be probably like one day, right? So it would basically be kept for one day, which raises all kinds of issues because they're there for one day. And as we've seen with other, whether it's team chats or any type of instant messaging, once it's there, and we're gonna talk a little bit about preservation, it's there, right? So for one day it's there. So let's talk a little bit about sort of the e-discovery issues and particularly preservation, which I think is the issue that a lot of people are thinking about now as they're rolling this out is, can I preserve this? So John, how do you preserve Copilot data?

John: So that's pretty straightforward, at least in one respect, which is if you're preserving a user's Exchange Online mailbox, unless you put in some kind of condition explicitly to exclude the Copilot prompts and responses, et cetera, they're going to be preserved. So they will be part of what's being preserved along with email and chats and that type of thing. So the only question is, and if we're going to get into this, Anthony and Therese, but the reference files, the version shared and all that. But as far as the prompts and responses, those are part of the mailbox. They're going to be preserved.

Anthony: So you're talking about a potential gap here then. So let's just talk about that. When you're doing prompts and responses, oftentimes you're referencing a specific document, right? It could be a Word document. You're asking for a summary, for example, of a Word document. It's going to refer to that document in the prompt and response. So what is or isn't preserved when you preserve the mailbox in that situation?

John: Well, I know we were talking about this before, but there's really, the question is, do you have the version shared feature enabled? Because the Microsoft documentation says if you want referenced files to be preserved as part of your Copilot implementation, you have to enable version shared. But in our testing, we're seeing inconsistent results. So in one of our tenants, we have version shared, and that's exactly what it's doing is if you say, summarize this document or use this document to create a PowerPoint, it is treating it almost like a cloud attachment. But that's not for preservation purposes. That's at the collection stage. It goes back to the topic that I know you guys, we talk about this a lot is, well, do I have to preserve every SharePoint and OneDrive where something might live that somebody referenced, right? And that's kind of the question there with the reference files.

Anthony: Got it. So you're not preserving it necessarily because like a modern attachment, which we've talked about in the past, it's not preserving it. Although if they're looking at a document from their OneDrive and you have the OneDrive on hold, that document should be there. So when you go to collect it, you can. Assuming that there's this setting, you have to have the version shared. So it's actually linking that attachment to this Copilot data. So a lot to digest there, but it's complicated. And again, I think you point out, this is a work in progress you have to test, right? You cannot assume that based on what we're saying, it's actually going to work that way because we've seen the same thing. You have to test and it often changes because this is a work in progress and it's constantly being changed. But that's an interesting point to think through. And again, I think from a preservation standpoint, Therese, is it required to preserve it if you have Copilot data and they're referring to a document? Is it similar to like an e-comms where we say, generally in the e-comms front where we talk about a Teams message, we always said, well, you need it because it's the complete electronic communication. So therefore to get completeness, we generally say you should be producing it. With Copilot data, do we think it's going to be any different?

Therese: Look, I mean, I think it depends is the answer. And if you look out there, even when we're talking about e-discovery, when you look at the cases that are out there talking about links, right, links to attachments or links to something that's in an email or in an e-comm, it's mixed, right? Right. The courts have not necessarily said in every case you have to preserve every single document for every single link. Right. You need to preserve what is relevant, even with production. Courts have said I'm not going to make them go back and find every single link necessarily. But if there's something that is deemed relevant and they need that link, you may have an obligation to go back and find it and make sure that you can find that document. So I don't think it's as clear cut as you must, you know, turn on version shared to make sure that you can, you are, quote, preserving every single referenced file and every single Copilot, you know, document. We certainly don't preserve every single document that's referenced in a memo, right? Or in a document that it refers to. There's a reference to it and maybe you have to go find that. So I think that it's not really clean cut. It's really a matter of looking at your Copilot setup. And making some strategic decisions about how you are going to approach it and what makes the most sense and making sure you're communicating that, right? That the structure and the setup of Copilot is coordinated with legal and the people who need to explain this to courts or to regulators and the like. And that you're educating your outside counsel so that they can make sure that they are representing it correctly when they're representing you in any particular case that says, this is how our Copilot works. These are the steps that we take to preserve. This is why. And this is how we handle that data. And I think really that's the most important thing. We're sort of, this is a new technology. 
It's, we're still figuring out what the best practices are and how it should be. I think the most important thing is to be thoughtful. Understand how it functions. Understand what steps you're taking and why. So that those can be adequately communicated. I think most of the time when we see these problems popping up, it's because someone hasn't thought about it or hasn't explained it correctly. Right? And that's the most important thing is understanding the technology and then understanding your approach to that technology from a litigation perspective.

Anthony: Yeah. And I think one of the challenges, right? And I think this is both a risk and a challenge. And we've heard this from a lot of litigation teams as this Copilot is being launched is it's not always accurate, right? Like it's, and I think you maybe make the argument that it's not relevant because if I'm a user and I'm asking a question, and it comes back, and it's just answering a question, and it's wrong, but you don't know that. I mean, it's Copilot. It's just giving you an answer. Is it really relevant? What makes it relevant? I may be asking the question of Copilot relating to the case. Let's assume that it's relating to the case in some way, underlying matter. You ask a question, you get a response back. Do you really need to preserve that? I've heard from litigators saying, well, if they go to Google, and they do a Google search on that topic, we're not preserving that necessarily. So what do we think that the arguments are that it's relevant or not relevant to a particular matter?

Therese: I mean, look, relevance is always relative, right, to the matter. And I think that it's difficult with any technology to say it's never relevant, right? Because relevance is a subject matter and a substantive determination. Just to say a particular technology is not relevant is a really hard, I think, position to take in any litigation, frankly. It's also very difficult to say, well, it's not reliable, so it's not relevant. Because I can tell you, I've seen a lot of emails that are not reliable, and they are nonetheless very relevant, right? The fact that somebody asked a certain type of prompt in certain litigations could be quite relevant in terms of what they were doing and how they were doing it. But I think it's also true that they're not always going to be relevant. And there's a reliability aspect that often comes in, I think, probably less so at the preservation stage and more so at the production stage. Right, in terms of how reliable is it, is this information? Again, this is about understanding the technology. Does your outside counsel know that if you are one, are you going to take the position that we're not going to produce it because it's not reliable, right? And be upfront about that and take that position and see if that's something that you can sustain in that particular case. Can you explain that they would not be relevant here and it's not going to be reliable in any case, right? Are you going to take that position or not? Or at the very least, if you are going to produce it, that you understand that it is inherently unreliable, right? A computer gave an answer to a question that may or may not be right, and depends on a user reviewing that. You know, if the user used it and sent it out, you can review, that's when it becomes important or valuable. 
But understanding the value of the data, so you take appropriate positions in litigation, so that if for some reason Copilot artifacts are relevant, you can say, well, sure, that may be on a topic that is relevant to this case. But the substance of that is unreliable and meaningless, right, in this context. So I think, I mean, one of the funny things that I always think is, right, we say email is relevant, but not all email is relevant. We preserve all email because we don't have a way at the preservation stage to make a determination as to which email is relevant and which email is not, right? But I think that's true, right, with Copilot. I mean, at the end of the day, unless you are being upfront that I'm not preserving this, or you can say this type of data, there are cases where email is not relevant, right, at all for the case, unless you could take that position. You preserve that data because you don't know which of those Copilot interactions are on the topic that could matter or could be relevant. But you're thoughtful again down the road about your strategic positioning about whether or not it should be produced or whether or not it has any evidentiary value in that litigation given the nature of the data itself.

Anthony: And John, you talked a little bit about this. I know you're doing some testing. Everyone's doing some testing now on collecting and reviewing this data, this Copilot data. What can you tell us about? You got prompts and responses. Are they connected in any way? Is there a thread if you're doing a bunch of them? How does it all work based on what you're seeing so far?

John: Right. Well, like a lot of Microsoft, when it comes to e-discovery, some of it's going to depend on your licensing. So if you have premium eDiscovery, then for the most part, what we've been seeing in our testing is that when you collect the chats or when you collect the Copilot information, what it's going to do is if you select the threading option and the cloud attachment option, it's going to treat the Copilot back and forth largely like it's a Teams chat. So you'll see a conversation, it'll present as an HTML, it'll show you, it'll actually collect as cloud attachments, the files that are referenced, if you've got that set up. So to a large degree, in terms of determining if things are relevant and that type of thing, you can do keyword searches against them and all of that. So at this point, what we're seeing with our testing is that for the most part, it's treating the back and forth as these chat conversations similar to what you see with Teams.

Anthony: And I'm sure there'll be lots of testing and validation around that and disputes as we go forward. But that's good to know. Okay, well, I think that that covers everything. Obviously, a lot more to come on this. And I suspect we'll be learning a lot more about Copilot and retention and discovery over the next six months or so, as it becomes more and more prevalent, and then starts coming up in litigation. So thank you, John and Therese, and hopefully you all enjoyed this and certainly welcome back. We're going to have plenty more of these podcasts on this topic in the future. Thanks.

Outro: Tech Law Talks is a Reed Smith production. Our producers are Ali McCardell and Shannon Ryan. For more information about Reed Smith's Emerging Technologies practice, please email [email protected]. You can find our podcasts on Spotify, Apple Podcasts, Google Podcasts, reedsmith.com, and our social media accounts.

Disclaimer: This podcast is provided for educational purposes. It does not constitute legal advice and is not intended to establish an attorney-client relationship, nor is it intended to suggest or establish standards of care applicable to particular lawyers in any given situation. Prior results do not guarantee a similar outcome. Any views, opinions, or comments made by any external guest speaker are not to be attributed to Reed Smith LLP or its individual lawyers.

Transcript is auto-generated.
