Student Data Harvested by Education Publishers

They haz more than u think

Posted by Billy Meinke on March 21, 2018   •   8 min read
If the product is free, then you are the product.

The above concept is something that people still find interesting when hearing it for the first time. Sure, on some level the typical user of social media platforms like Facebook and Instagram (which is owned by FB) knows this. People know that “free” technology tools make money by serving them ads for products and services that they are likely to engage with. Even if we don’t end up buying said products, it’s pretty amazing how well online ads mirror the things we want, regardless of our desire to tell anyone what itches we want to have scratched. Creepy even. Larry Lessig referred to this as “Little Brother” – something of a companion to Orwell’s Big Brother. We hope that Big Brother isn’t actually watching, but we know Little Brother is sitting on our shoulder a lot of the time, getting a sweet view of our online life.

This isn’t always a terrible thing. Think about the books you wouldn’t have read, the music you might not have listened to, or the travel deal you wouldn’t have known about if such methods didn’t exist. In a regular day you probably log onto a computer or open up a smartphone and view, click, like, or in some way interact with online media. Sometimes you’re shopping for something and want to see which retailer has it in stock. Other times you’re “liking” something an online friend has done, expressing some sense of support or shared enjoyment. If you’re not using any sort of ad blocker or do-not-track software, gobs of data about your online life are being collected through your web browser or apps of choice. Let’s not kid ourselves…we know it’s happening. Well, pour yourself a cup of tea and lean in, because I’m going to tell you what worries me about how this is being done to our students.

mesmerized by numbers by Hsing Wei / CC BY-NC-ND 2.0

When we go about our day-to-day lives as I mentioned above, companies have a pretty good idea what we are doing. They know when you search for things, which things you buy, and they know when others are doing things you like. It’s less that we are walking around with a scrolling marquee on our foreheads, and more like the jacket pocket that contains our internal dialogue has a hole in it. For me, online privacy is about controlling the flow of personal information. We give lots of it away when we let Little Brother chill on our shoulder, but what if we don’t have a very good idea of who is actually listening and what they are writing down? Truth is, most people don’t.

In our personal lives, this is on us. It’s on me and it’s on you. But consider that not all uses of our personal information are intended to simply nudge us to switch coffee brands or buy that new car sooner. Recent revelations about how the data firm Cambridge Analytica profiled and manipulated swaths of voters during the 2016 US Presidential election are an example of how this kind of data can be weaponized. You read that correctly: weaponized. That topic is beyond the scope of this post, but you can read about more frightening examples of how personal data can be used maliciously in a March 9th article by David Golumbia and Chris Gilliard titled There Are No Guardrails on Our Privacy Dystopia.

But education is different. When we take classes towards earning a certification, or work towards a college degree, we should feel a sense of safety because we are doing things under the careful watch of an institution or organization. When we step foot into the classroom, we can take our jackets off (or remove our slippers as we do in Hawai’i) and get comfortable. We are there to learn. We are there to create and share, and to collaborate with other learners and with our instructor or facilitator. But especially in higher education (HE), large-scale classrooms involve the use of technology as management tools. The learning management system (LMS) is ubiquitous in HE; it’s where we look for learning resources that have been curated by our professors, where we submit our homework, and where we check our grades. But sure, LMSs have been around for a while. So what’s the big deal?

Well, the big deal is that students are being tracked (that is, data are being collected about them) more often, and to a greater degree, than we actually understand.

Whose responsibility is it to guard and monitor how student data is collected and used? Well, it depends. Data is routinely collected in various ways while we learn. In some cases, data about our learning is captured by researchers whose use of such data is governed by an ethics committee, commonly an Institutional Review Board. The IRB reviews all applications submitted by researchers that involve the collection and use of personal information (aka human subjects research), and sets strict guidelines for its protection. Because as you now know, personal data about us can be used to cause harm. But most of the time researchers follow guidelines, adhere to policies, and protect your data. So things are all good.

Another kind of data is basic information that a student information system for an institution will have about learners. This often is just demographic information about you – your name, birth date, courses you’re registered in, etc. These data records are essential to institutional intelligence. Is enrollment rising or falling? Where? How long are students taking to complete an academic program? Universities and colleges use data about their student population to make decisions about how to best serve learners, among other things. It should be no surprise that educational technology vendors (think: publishers such as Pearson, Cengage, Macmillan) are granted access to some of this information when integrating their software tools into LMSs. Publishers need to know who needs access to which online textbook or homework tool, and this is often viewed or linked inside the LMS – this is the integration. Thing is, publishers are required to sign a data use agreement with the institution. Here is an excerpt of data use agreement language for the University of Hawai’i.

Data Re-Use. Contractor agrees that any and all Institutional Data exchanged shall be used expressly and solely for the purposes enumerated in the Agreement. UH Institutional Data shall not be distributed, repurposed or shared across other applications, environments, or business units of the Contractor. The Contractor further agrees that no Institutional Data of any kind shall be revealed, transmitted, exchanged or otherwise passed to other vendors or interested parties except on a case-by-case basis as specifically agreed to in writing by a University officer with designated data, security, or signature authority.

As you can see, UH has clear guidelines for how our institution’s student data is reused. It shouldn’t be shared with others unless the use is specifically agreed upon by a signing authority from the university. Makes sense. Let’s for a moment pretend that external companies don’t already have a profile for people who use their products, even if they haven’t created an account themselves. In theory, data handed over to vendors is governed by the above data use agreement. To some degree, this data is less mobile, and is in fewer hands.

halt by Hsing Wei / CC BY-NC-ND 2.0

But here’s the kicker. Listen up.

While being given a demo of a publisher’s online textbook rental program by our campus bookstore, I was shown an integration with our LMS, which is driven by our Student Information System. Our institutional data feeds the LMS, populating it with the courses a student takes. The correct class(es) show up in the LMS, and students take a look at the resources and guidance a professor has set up for them there. But what actually happens is that students are presented with an End User License Agreement (EULA) when they access a textbook rental (often part of an ‘inclusive access’ program) or external homework system. The vendor needs to make sure that only students who have paid for the textbook or homework machine can access it. So each student logs in with their institutional credentials, and clicks through the EULA, often not reading a single word.

From a 2005 Electronic Frontier Foundation article:

We've all seen them – windows that pop up before you install a new piece of software, full of legalese. To complete the install, you have to scroll through 60 screens of dense text and then click an "I Agree" button. Sometimes you don't even have to scroll through to click the button. Other times, there is no button because merely opening your new gadget means that you've "agreed" to the chunk of legalese.

Each time a professor tells their students to purchase a digital textbook rental, they essentially require their students to agree to the terms buried within a EULA. These EULAs aren’t available online…believe me, I looked and even asked several publishers to share theirs (hint: they ignored my request).

These EULAs include verbiage that references user data, then quietly mentions that the vendor may use this information for advertising purposes, to improve the product, or for any number of uses. And the terms can (usually) change at any time.

So let’s step back for a moment and recap the three kinds of data we’re talking about here:

  1. Data about learners that are collected by researchers at our institution, which stay within our institution
  2. Data our education institution has, which it selectively shares with outside vendors/companies
  3. Data vendors/companies collect about our students when they use their products

Should we be careful what data we collect when doing research about learning (#1)? Sure. Should education institutions be careful as they share student data they routinely collect (#2)? Of course, and they are. But who watches how this third kind of data is used? I would wager that students don’t. Professors may not even be shown the EULAs that they require their students to click through. And because this third kind of data isn’t data vendors get from the institution, it doesn’t fall under the governance policies of the institution. So who is keeping track of this?

I’m going to cut to the chase and say that the campus store probably doesn’t either.

What we have here is a situation where gobs of data are collected about students, much of it in the course of their education. Data collected about students in the course of their education must be carefully managed, and I’m worried that it’s not. I’m worried that data about our students are being collected by companies and vendors as we require them to use digital textbook rentals and homework software. And I’m worried that no one is watching. This data on its own might not be compelling, but when you consider the possibility that this data very well could be shared with third parties beyond the vendor, things get tricky. As the recent revelations about the psychological profiling and micro-targeting done by data marketing firm Cambridge Analytica are proving, powerful (and potentially very harmful) things are possible when loads of data are cross-referenced.

So I’m taking the publishers to task, and am requesting they answer a few simple questions about their use of our students’ data.

  1. Where can I find the full text of the EULAs associated with the use of your digital textbook rentals (or related products) and data collected therein?
  2. What data are collected about students who use these products?
  3. Which parties ("data partners") are these student data shared/traded/sold to?
  4. What terms govern the reuse of this student data by these partners?

It may very well be that the EULAs are available, or that after publishing this blog post, the publishers are willing to share them. It also may be that no data are actually collected about students who use these products. It may also be that no data are shared with other parties beyond the direct product vendor.

But I want to hear it from them.

So, if you have access to a EULA from one of the vendors, I would be happy to receive it via Twitter DM or my email listed on the ‘contact’ page of this website. Unlike vendor EULAs, I’m easily found.


Header image by Marcin Ignac / CC BY-NC-ND