I have been unhappily surprised by the apparent lack of usage data available to academic law libraries. After talking to some folks at the MAALL annual meeting, my own experience seems to be a common one. While our colleagues in other law library contexts may have rich usage data, legal publishers do not make this data available to law schools. It’s something that needs to change.
I also got the sense that perhaps we weren't all talking about the same thing. I have heard from a fair number of people that they have no data at all; as poor as the data I've received is, I guess I'm a half-step ahead. I would be interested to hear from anyone who has a usage report from a major legal publisher they'd be willing to share, whether you're in a law school or not. I have been asking around and, I'll be blunt, what the legal publishers have provided so far isn't acceptable. It is also less than what they give their other customers. I know, because I have received that more expansive data.
The Need for Data
It may seem unnecessary, but I will explain why I think you need data about electronic license use in law libraries. One view holds that, since you can't cancel or walk away from some contracts (a premise I think is open for discussion), usage data doesn't matter: it won't inform a purchase or collection retention decision.
I think that's short-sighted, for two reasons. First, if we do not know what is being used, we do not know what is not being used. If law students are participating in legal research instruction and then using the resources in a substantially different manner than we teach, that would be important to know.
Similarly, what does faculty use tell us? Nothing, if we have no usage data. It might tell us that student use is aligned with how experts like faculty use the resources. It might highlight differences that we should think about. For example, in the Thomson Reuters data I received (shown below), students tended to use the current-version-only database of administrative codes, while faculty tended to use the current-plus-historical version of the administrative code for the same jurisdiction. Why? Do we need to think about the impact of those choices? Are those choices deliberate? Do we know why?
Second, there may in fact be reasons to drop or rethink the content we license from even the largest legal publishers. It is bananas to me that I have to account for $10 of public transit to and from a conference site on an employee reimbursement, but I do not need to account for a $75,000 electronic license beyond having a vibe. We cannot make informed decisions about our content if we do not know whether it is being used.
Even if you do not feel the need to use it in your purchasing or licensing decisions, you should be thinking about its value to your research instructors and its place in your collection. Let's say you are licensing electronic formats of content that you used to keep in print. The reason you still collect it is that you think it's valuable and should remain in the collection, regardless of format. But if it's not being used, and it had remained in print, what would you have done with it? (You are measuring internal circulation to determine print usage, aren't you?)
I hope the answer would usually be that you’d weed it. Why would you not want to be prepared to have that discussion about your electronic titles? I realize that your legal publisher vendor may not be able to segment content in a way that works for your collection development choices, but I think we should be licensing content with our eyes open about what is providing value and what isn’t.
Let’s look at the data now.
Thomson Reuters Westlaw
The data I got back from Thomson Reuters was a bald ranking of databases by usage. It showed me which databases were used by students and faculty, with 1 being the most used and the highest number being the least (roughly 1,500 databases for students, roughly 650 for faculty). What does that ranking consist of? No idea. I don't know whether it's searches, other transactions, or time spent.
I have so many questions. The rankings have a very long tail. What accounts for a top-ranked database? How many searches? How much time is spent? Where do people go after they use that database? That last one has never been achievable, as far as I know, even though I can do traffic flow analysis with website analytics and have been able to for a decade. Color me unpersuaded that the legal publishers cannot also provide that information.
Why is it that I can get better analytics and traffic flow for my website than I can for legal research? We are teaching a process, a process that gets better the more the person repeats it. What an amazing impact it would have if we could see whether that process was being followed. Yet legal publishers fail to provide the kind of data that free applications like Google Analytics or Matomo make standard for web-based content.
I was intrigued that there was a substantial amount of overlap in the most common databases. Some of the choices make sense to me, although I wonder if there is an opportunity to bring the source usage closer together. For example, the number one resource faculty used within Westlaw was Westclip. For law students, it ranked 26. Some of that distance is due to scale: there are more law students, and they used nearly four times as many databases. If I were to normalize the rankings, as in the sketch below, I expect they'd be closer.
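A back-of-the-envelope way to do that normalization, using the approximate totals above (the exact counts are mine, not Thomson Reuters'):

```python
# Rank normalization sketch. The totals are the rough figures described
# above, not exact numbers from the Thomson Reuters report.
student_total = 1500   # approximate number of databases students used
faculty_total = 650    # approximate number of databases faculty used

def percentile_rank(rank: int, total: int) -> float:
    """Express a raw rank as the share of ranked databases at or above it."""
    return rank / total * 100

# Westclip: #1 for faculty, #26 for students
print(f"Faculty: top {percentile_rank(1, faculty_total):.2f}% of databases")   # ~0.15%
print(f"Students: top {percentile_rank(26, student_total):.2f}% of databases") # ~1.73%
```

On that scale, both groups put Westclip near the very top of their lists, which supports the intuition that the raw ranks overstate the gap.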
Legal research instructors may be curious about where the Key Number System falls in the rankings. I loved the analogy from Steve Probst (University of Arkansas) that these are the hashtags of our newest law students' legal research.
If Westclip is the number one resource for law faculty in Westlaw, perhaps I should understand why. And do I need to ensure that faculty know how to use the same tools in other legal research databases? If there is demand for alert services (and this data suggests there is), then there is a service push we could make to ensure that faculty have that knowledge. Or maybe we acquire a more robust news-gathering or alert tool, one that law students might see in practice.
I’m open to walking away from any vendor. It wouldn’t be the first time, having gone “sole source” (dropping either LexisNexis or Westlaw) before. If we are teaching information literacy and not marketing a legal research database, that should be something relatively easy to do. A literate new lawyer will be able to adapt to any tool.
LexisNexis
Just as Thomson Reuters and RELX are the yin and yang of legal publishing, their usage data sits at the opposite end of functionality. I was able to get an annual report of our law school's usage data. How would you use this data?
LexisNexis provides no information on what is being used. If faculty are using the Lexis alert functions, I have no idea. I'm surprised by the November 2023 numbers: why was Shepard's stable in November while searches plummeted to 5% of the previous month? Why are searches (which I expect include cases, but what else?) so high in October 2024 (35% higher than the previous October) when Shepard's usage is nearly zero? Why is the ratio of document views to searches so variable, going from almost 1 to 1 in May 2024 to over 10 to 1 in December 2023?
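Those ratios and cliffs are trivial to compute once you have the monthly numbers. A minimal sketch, with invented figures standing in for the kind of totals the annual report contains:

```python
import pandas as pd

# Invented monthly totals standing in for the LexisNexis annual report.
# Rows are not all consecutive months; they mirror the points discussed above.
usage = pd.DataFrame(
    {
        "searches":       [9500, 475, 600, 8200],
        "document_views": [11200, 5200, 6800, 8300],
    },
    index=["2023-10", "2023-11", "2023-12", "2024-05"],
)

# Views-to-searches ratio: the figure that swings from ~1:1 to over 10:1
usage["views_per_search"] = usage["document_views"] / usage["searches"]

# Change from the previous row, to flag cliffs like November 2023
usage["search_change"] = usage["searches"].pct_change()

print(usage.round(2))
```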
Here's the thing: I have a ton of hypotheses. Maybe Lexis+ has been reconfigured in a way that directs people to different resources; traffic flow impacts analytics, and that would let me ask our account people what has changed. It could mean our instructors are teaching differently. Or that faculty have switched up their course timing, or that there is a different mixture of courses, or that journal publication timelines or other assignments happen to line up with a given month.
There are just no answers. And, often, there are patterns in data that can help you understand what is going on. Is there suddenly a need to show law students (or maybe faculty? who knows; this data doesn't say) how citators work? Or why, from a professional competence perspective, they should be using them?
I asked the Lexis+ representatives at the MAALL conference about AI usage data. We learned that a researcher can keep their AI conversations for 90 days, and that conversations can consist of up to 10 interactions (question prompts and answers). In other words, LexisNexis is capturing transactions and sub-transactions (sub-searches, filters) within its AI product.
How do I see those? Using the LexisNexis data above, is an AI conversation a search? Is a sub-prompt within a conversation also a search? If someone starts a conversation, how often does it end after just one interaction? Will we see our search data jump by 10x as conversation elements are tracked? How many conversations invoke no sub-interactions or, alternatively, use all 10 to make the research more precise? In other words, how many researchers are maximizing the efficiency of the AI process, and how many are taking the "AI is a starting point" mantra a bit too literally?
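The counting question matters for trend lines. A toy illustration of how much the answer could move reported totals; every number here is invented:

```python
# Invented figures showing how the "what counts as a search" question
# changes reported totals once AI conversations are in the mix.
monthly_searches = 10_000   # classic searches reported in a month
conversations = 1_000       # AI conversations started that month
avg_interactions = 6        # assumed average prompts per conversation (max is 10)

per_interaction = monthly_searches + conversations * avg_interactions
per_conversation = monthly_searches + conversations

print(f"Counting every interaction as a search:   {per_interaction:,}")   # 16,000
print(f"Counting each conversation as one search: {per_conversation:,}")  # 11,000
```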
No idea. I am sure LexisNexis has this data. It should be customer-facing for my researchers’ usage.
Bloomberg Law
I'm aware that other law libraries have cancelled Bloomberg Law. Given its very small footprint in the legal profession, I was curious about how it was being used. I had quizzed our librarian team, so I had some anecdotal expectations. From my time at the ABA I knew about the professional responsibility manual, and from working in a labor and employment law firm I knew the BNA content's reputation. What was the most important content used?
Here’s the data I received.
[Left blank intentionally]
Nothing. It turns out Bloomberg Law has accounted for a substantial part (about 15%) of our library's electronic license budget while capturing no data on our usage. This omission has existed for some number of years, so it's not as though there's historical data I can look at. I am grateful to their staff for activating tracking for us so we may get future data, but it means I have no idea how, or if, anyone is using this expensive resource.
I'm not entirely without data avenues. I can ask the faculty: do you use Bloomberg Law, and when did you last use it? But this shifts the conversation from an organizational, strategic one to a tactical one. A move off Bloomberg Law, with certain faculty relying on it, will mean moving those faculty to new resources or having hard discussions about money. That is fine, except I don't really have any way to quantify their usage or the impact on them. It's a binary investigation: you use this, yes or no? How much is unknown.
Unlike making a decision to keep either Westlaw or LexisNexis, Bloomberg Law is a resource I know plenty of law schools do without. Very few practitioners use the resource: the highest number I have seen is just over 6%, and that's concentrated in large law firms, with the number dropping, unsurprisingly, as you expand to the wider solo and small firm market. If no one is using the proprietary content within the Bloomberg Law system, it's not clear what value it offers. Some libraries have cancelled print titles to shift usage to Bloomberg Law. But if you have no data to show the content is being used, no matter how storied it may be, you cannot demonstrate value for the cost of providing that access.
Lexis Digital Library
This is a legal publisher's choice. Lexis Digital Library is a great example of actionable usage data. That may be because the usage data is made available through OverDrive's Marketplace: Lexis+ may not be set up to provide data, but other Lexis entities can.
Inside the Marketplace, I can run a variety of reports. If your library’s authentication is configured in a certain way (SAML, not EZProxy), you can distinguish your students and faculty. You can see content usage and you can also see user activity. Every report can be downloaded to Excel so you can crunch your own numbers or create your own reports.
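Once a report is in Excel, the analysis is routine. A minimal sketch; the file name and column names below are hypothetical, so match them to whatever your actual Marketplace export uses:

```python
import pandas as pd

# Load a report exported from OverDrive's Marketplace.
# "ldl_usage.xlsx", "Title", "UserRole", and "Checkouts" are placeholder
# names for illustration; substitute your export's actual columns.
report = pd.read_excel("ldl_usage.xlsx")

# Total checkouts per title, split by the SAML-derived role attribute
by_role = (
    report.groupby(["Title", "UserRole"])["Checkouts"]
          .sum()
          .unstack(fill_value=0)
)
print(by_role.head(20))
```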
Bring Usage Data Out of the Shadows
I'm pretty irritated. People who have worked with me over the years will know that this is the sort of thing that drives me around the bend. If you've followed my blog, you know that data is important to me. While I can make a decision without complete information, it seems like professional malpractice to make electronic collection decisions without data. Data is important for decisions to add or remove content and to start up or sunset service initiatives. It's vital to telling the story of what we are doing with the money entrusted to us.
Ashley Russell (University of Cincinnati) gave a great MAALL presentation on working with vendors. One area she touched on was courtesy support from our account representatives: they help us navigate and get information in gray areas where the data may not already be compiled or accessible. I want to acknowledge that, because it's how I received all of the data I have so far (except Lexis Digital Library, where I was given a tutorial and a user account and was let loose, much to my delight). There is no way for me to request this data myself or get regular reports. It relies on me having a relationship with an account representative, who in turn has relationships with colleagues inside the business who can provide the data. This is a fragile information pipeline.
We should not be relying on the kindness of strangers when we are paying substantial amounts for legal research licenses. If the vendors aren’t providing this information directly, we need to require it. Otherwise, as I have now found with Bloomberg Law, when I need to make a decision based on data, I may find that we have no data.
When I worked for the lawyer regulator in Ontario, Canada, we ended up putting this in our license. I’ve written on this before, about how you have to ask in your license for what you need. This is the clause I used for LexisNexis Canada:
[Publisher] will be required to provide [Client] with [a hard copy and] an electronic copy of the utilization reports for the [licensed content] on a monthly basis in [specify application, like Microsoft Excel] format. The utilization report will provide details about usage by
(a) sub-account,
(b) password,
(c) client or matter information,
(d) database name,
(e) amount of time used,
(f) the type and
(g) number of transactions,
(h) the costs associated with each transaction,
(i) the number of concurrent Users accessing the database at any given time, and any other related information.
In addition, [Publisher] will provide [Client] with definitions of transaction types and other terms, and a price list to explain the cost structure for databases and related transactions.
Why these items? Because these facets are things LexisNexis was already capturing. This is not a wish list so much as a list of actual data points a legal publisher captures. I think it could be made generic, so that you capture what you want while the publisher uses its internal label for it (data source vs. database name, that sort of thing).
Keep in mind this was for what I think is one of the hardest types of law library to get data for: one where the library is publicly accessible or passwords are not attached to specific individuals. Legal publishers have no excuse in environments like law firms or law schools, where every user is an identifiable individual.
Our renewals with Thomson Reuters and RELX are not imminent, but that gives me time to work on this clause. For example, I would swap out (b) passwords for something that allows me to distinguish staff, students, and faculty. We know the legal publishers can make this distinction because they have already done it (faculty portals, etc.).
Here is my initial thinking on a more law school-centric version:
[Publisher] will be required to provide [Client] with an electronic copy of the utilization reports for the [licensed content] on a monthly basis in [specify application, like Microsoft Excel] format. The utilization report will provide details about each transaction including:
(a) a data element that records time and date of transaction,
(b) a data element that records the transaction type (search, AI conversation, document view, print, etc.) and which specifically designates which interactions used AI,
(c) a data element that records client, matter, or project information,
(d) a data element that records in detail the database or source name in a form that is not abbreviated,
(e) a data element that distinguishes students, staff, and faculty user accounts [not individual users necessarily],
(f) a data element that records the notional costs associated with each transaction,
(g) a data element that records the duration of the transaction or, in the case of an AI conversation, the number of interactions in the conversation.
In addition, [Publisher] will provide [Client] with definitions of transaction types and other terms, and a price list to explain the cost structure for databases and related transactions.
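To make the clause concrete, here is one entirely hypothetical transaction record satisfying elements (a) through (g); the field names and values are illustrative only:

```python
# One hypothetical transaction record satisfying clause elements (a)-(g).
sample_transaction = {
    "timestamp": "2025-03-04T14:22:05",         # (a) time and date
    "transaction_type": "ai_conversation",      # (b) type of transaction
    "used_ai": True,                            # (b) AI designation
    "project": "1L Legal Research, Section 2",  # (c) client/matter/project
    "source_name": "State Administrative Code (Current & Historical)",  # (d)
    "user_role": "student",                     # (e) student/staff/faculty
    "notional_cost": 42.00,                     # (f) notional cost (USD)
    "interactions": 7,                          # (g) duration or AI interactions
}
```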
More importantly, it seems to me that these gray, courtesy areas of our relationships with vendors should be better documented. I will be watching for the things that we are paying tens of thousands of dollars for and still only get on sufferance. I will not feel as though I’m being a good steward of our funding if I can’t explain to others how we are spending these huge amounts of money.
P.S. If anyone has either good examples of usage data reports from Westlaw, LexisNexis, or Bloomberg in the U.S., or good clauses to ensure that you are getting them, I would love to hear from you. Maybe we could create a bank of contract clauses that we could all use with legal publishers for standard data, in lieu of COUNTER or SUSHI.
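For anyone who hasn't worked with COUNTER, the appeal is that harvesting usage becomes a standard HTTP request rather than a favor. A minimal sketch of a COUNTER 5 SUSHI pull; the base URL and credentials are placeholders, and to my knowledge no major legal publisher offers such an endpoint today:

```python
import requests

# Placeholder SUSHI server and credentials; real values would come from a
# vendor that actually implements the COUNTER 5 SUSHI API.
BASE = "https://sushi.example-publisher.com/counter/r5"

resp = requests.get(
    f"{BASE}/reports/tr",  # Title Master Report
    params={
        "customer_id": "YOUR_CUSTOMER_ID",
        "requestor_id": "YOUR_REQUESTOR_ID",
        "begin_date": "2024-01",
        "end_date": "2024-12",
    },
    timeout=60,
)
resp.raise_for_status()

# COUNTER 5 JSON: each report item carries a title and monthly metric counts
for item in resp.json().get("Report_Items", []):
    for perf in item.get("Performance", []):
        for inst in perf.get("Instance", []):
            print(item.get("Title"),
                  perf["Period"]["Begin_Date"],
                  inst["Metric_Type"],
                  inst["Count"])
```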