Data collected through the Coursera profile – what are the ethical issues at stake?

Entering data through the Coursera “profile” is not compulsory, but doing so promises potential career opportunities, new connections and course recommendations. This may be of less interest to those who already have a level of education, a job and existing connections; however, for those visiting Coursera in the hope of improved career opportunities, it may be significant (although it is difficult to tell without further analysis or Coursera data).

A quick check of the Coursera privacy policy reveals that ‘general course data’ and site ‘activity’ may be shared with ‘Content Providers and other business partners’, including personally identifiable information, and that ‘Content Providers and other business partners may share information about their products and services that may be of interest to you where they are legally entitled to do so’.

Beyond the site activity data (presumably course searches, enrolments and so on) collected by Coursera, the information you can provide to personalise your ‘learning experience’ and recommendations is fairly extensive, including work experience, education, career goals, location, age and gender:

Coursera profile

While this page lets you limit who can see the information to ‘only me’, ‘the Coursera community’ or ‘everyone on the web’, the privacy policy presumably still allows Coursera staff and associated ‘Content Providers’ or ‘business partners’ to access and analyse the data. The options presented (for which I selected ‘only me’ in all cases) therefore give a slightly false and misleading sense of privacy: if I have understood the policy correctly, ‘only me’ really means ‘only me, plus Coursera staff, Content Providers and business partners’ – though presumably many users will never read the policy at all.

There do not seem to be any options (easily visible on this page, at least) for hiding all your data from everyone (including Coursera staff, ‘Content Providers’ and ‘business partners’), nor do there appear to be any options for customising who can view your site activity. For an educational site – where some users are drawn by the promise of improved career opportunities, and where not everyone begins from the same ‘starting point’ – it seems appropriate, in my opinion, to be able to hide all data from everyone.
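
To make this gap concrete, here is a minimal, purely hypothetical sketch – the three visibility labels are taken from the profile page, but the functions, role prefixes and logic are my own invention, not Coursera’s actual implementation – of how a user-facing visibility setting can coexist with broader policy-level access:

```python
# Hypothetical sketch only: the visibility labels come from the profile
# page, but everything else here is invented for illustration and is
# not Coursera's actual implementation.

def can_view(viewer: str, owner: str, visibility: str) -> bool:
    """The access rule as the profile page presents it to the user."""
    if viewer == owner:
        return True
    if visibility == "everyone on the web":
        return True
    if visibility == "the Coursera community":
        return viewer.startswith("member:")
    return False  # 'only me'

def can_access_internally(viewer: str) -> bool:
    """The access the privacy policy appears to permit, regardless of
    the visibility setting chosen above (hypothetical roles)."""
    return viewer.startswith(("staff:", "content-provider:", "partner:"))

# Even with 'only me' selected, policy-level access may still apply:
print(can_view("member:alice", "bob", "only me"))      # False
print(can_access_internally("content-provider:acme"))  # True
```

In other words, on this reading the ‘only me’ toggle governs what other learners see, not what the platform and its partners can analyse.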

Viewing different Coursera courses to influence my recommendations

The tweaks to my Coursera profile and learning plan have had a fairly limited effect so far on my Coursera recommendations.

I notice my recently viewed courses have an impact, so I will look to alter this:

Recently viewed courses in Coursera

First, I switch my profile and learning plan to indicate that I am a nurse in the healthcare industry:

Coursera profile
Coursera learning plan

…courses taken by others identifying themselves as nurses now appear…

Coursera – ‘People who are Nurses took these courses’

Notably, the first is “nursing informatics” – could this be another example of information technology dominating results?

I view some courses related to ‘Everyday Parenting’, ‘Mindfulness’, ‘Well-Being’, ‘Buddhism and Modern Psychology’ and ‘Social Psychology’.

Below are some more computer science/information technology degree recommendations…

Coursera ‘Earn Your Degree’

Some courses on ‘Personal Development’ are also displayed. Many are not particularly related to the areas I specified; however, it is a rare opportunity to see recommended courses that are not computer science or information technology.

Coursera Personal Development

My explorations again seem to show a privileging of computer science subjects – not surprising, given the background of the founders.

However, this limited focus does seem slightly at odds with Coursera’s own slogan:

‘We envision a world where anyone, anywhere can transform their life by accessing the world’s best learning experience.’
(About Coursera)

As previously discussed, the pattern appears quite common: those from a Western, university-educated computer science background build something for themselves, raise funds through investment, and then market it as a “universal” solution that is “best” for all.

Tweaking my “profile” in Coursera to “software engineer”

Further to setting my initial (false) “profile” and playing with my Coursera “learning plan”, I have now tweaked my profile to indicate that I am a software engineer at “Executive Level” at Facebook, with a master’s:

Tweaking my Coursera profile

I have also set my learning plan so that I am a “software engineer” in the “technology” industry:

Coursera learning plan

The key difference here is the recommended course list, which now suggests courses that other software engineers have taken:

Coursera recommendations

There seems to be a wealth of courses in this area, which is perhaps unsurprising given my other experiences of the site so far.

How is a YouTube search for ‘algorithms in education’ altered when I am signed in?

Below are the results of a YouTube search for ‘algorithms in education’ – on the left I am signed in, on the right I am not.

YouTube search results for ‘algorithms in education’ – comparing the difference when signed in

The results are subtly different – when I am signed in, slightly more ‘advanced’ videos about algorithms (one from Harvard University) are displayed. Perhaps this is due to information Google holds on my age and education, or to the fact that I have watched and liked a number of longer university lectures and interviews during this course.

This speaks to the way algorithms are ‘ontogenetic, performative and contingent’ (Kitchin 2017: 21) – they are neither static nor fixed, they vary from user to user and from location to location, and they can often involve randomness.
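
As a minimal sketch of that idea (the signals, weights and data below are entirely my own invention, not YouTube’s actual ranking), personalisation, locale and injected randomness can each nudge the same query towards a different ordering:

```python
import random

# Toy illustration of Kitchin's point: the same query can be ranked
# differently per user, per location and per run. All signals, weights
# and data are invented; this is not YouTube's actual algorithm.

def rank_results(results, user_topic_history, location, seed=None):
    rng = random.Random(seed)  # no seed -> a different order each run

    def score(item):
        s = item["base_relevance"]
        # Personalisation: boost topics this user has watched before.
        s += 0.5 * user_topic_history.get(item["topic"], 0.0)
        # Locale: boost items popular in the viewer's region.
        s += 0.2 * item["regional_popularity"].get(location, 0.0)
        # Contingency: a little noise, so the ordering is never fixed.
        s += rng.uniform(0.0, 0.3)
        return s

    return sorted(results, key=score, reverse=True)

results = [
    {"title": "Intro to algorithms", "topic": "cs",
     "base_relevance": 1.0, "regional_popularity": {"UK": 0.4}},
    {"title": "Algorithms in education (university lecture)",
     "topic": "education",
     "base_relevance": 0.9, "regional_popularity": {"UK": 0.6}},
]

# A signed-in user with a history of education lectures may see a
# different ordering from an anonymous user.
print([r["title"] for r in rank_results(results, {"education": 1.0}, "UK")])
print([r["title"] for r in rank_results(results, {}, "UK")])
```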

Changing my “learning plan” in Coursera

I changed my Coursera “learning plan” to indicate that I am a Teacher/Tutor/Instructor in the Education industry, to compare the results with my previous exploration of Coursera.

The results are more varied (and not exclusively focused on software development or the “tech” industry); however, various programming, data/computer science and business options are still presented (despite my expressing no preference for this kind of industry):

Coursera recommendations after altering my “learning plan”

Coursera recommendations (based on false data) – what inclusions and exclusions are apparent?

I am experimenting with inputting false information about myself in Coursera, in order to see the difference in algorithmic recommendations. Here is how I described myself…

False data provided to Coursera

… and here are some recommendations provided after entering the above data…

Recommendations provided by Coursera

The top listed courses are exclusively technology-based and “offered by” Google, and appear to have no direct connection to my listed industry, “Health and Medicine”…

While my explorations here were very limited, this seems in some ways fairly consistent with my experience of certain (but not all) MOOC or educational course/video sites (and even more general “apps”): as soon as you step outside computer science, the range of courses narrows, despite the sites presenting themselves as general educational platforms. Looking to change my “learning plan” options (which alter your profile and recommendations) revealed the “default” or “suggested” text presented before you enter your own profile options:

Setting your Coursera “learning plan”

You can see the results of my profile/”learning plan” alterations here. However, at this stage of deciding my profile options, the “software engineer” who works in “tech” seems to be the “default” starting point. This is perhaps no surprise given that Coursera was set up by Stanford computer scientists; as often seems to be the way, the developers build something for themselves (ensuring a seamless user experience for their own circumstances) and only later branch out.

One example outside of education is the online bank Monzo, whose early customer base was ‘95% male, 100% iPhone-owning, and highly concentrated in London’ (Guardian 2019). This description mirrors the co-founder Tom Blomfield, as he himself admits:

‘Our early customer was male, they lived in London, they were 31 years old, they had an iPhone and worked in technology. They were me. I’ve just described myself. Which has huge advantages, right? It’s very easy to know what I want.’ (The Finanser 2017)

While Monzo does claim to have a focus on social inclusion (This is Money 2019), why is this seemingly always secondary to building the app, gaining users (similar to themselves) and getting investors on board? Should social inclusion – whereby apps are designed for all users in a democratic fashion where everyone has a say – not be inherent in the very first planning, design and development processes? There may be a place for considering platform cooperativism, inclusive codesign and participatory design approaches (see Beck 2002; Scholz and Schneider 2016; West-Puckett et al. 2018).

Coming back to education, if Coursera have taken a similar approach to Monzo’s in designing their platform and building up their catalogue of courses, it is perhaps concerning that those who do not mirror the designers and developers may be left excluded and on the margins.

Conversely, an inclusive codesign approach may have produced different results. As Trebor Scholz (P2P Foundation 2017) explains:

‘The importance of inclusive codesign has been one of the central insights for us. Codesign is the opposite of masculine Silicon Valley “waterfall model of software design,” which means that you build a platform and then reach out to potential users. We follow a more feminine approach to building platforms where the people who are meant to populate the platform are part of building it from the very first day. We also design for outliers: disabled people and other people on the margins who don’t fit into the cookie-cutter notions of software design of Silicon Valley.’

FutureLearn recommendations

Here are my recommendations from FutureLearn, likely informed at least in part by some of the MOOCs I signed up to while deciding upon my micro-ethnography:

FutureLearn recommendations

Signing up for these MOOCs appears to have affected the recommendations fairly significantly, given there are recommended courses in the areas of research, security and programming. However, there appear to be few (if any) courses directly touching on anthropology or music (which my enrolled courses cover); this may be due to a lack of currently available courses, although there may be other reasons.

How have other people been involved in shaping results?

It is not clear (at least from this page) how FutureLearn makes its recommendation decisions, but there may well be algorithmic ranking based on sponsorship, course popularity or “staff picks”. It is therefore possible that other students’ enrolments, or decisions by FutureLearn staff, alter my recommendations.
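
Purely as a speculative illustration – FutureLearn’s actual ranking is not public, and every signal and weight below is invented – a recommendation score blending these three factors might look something like this:

```python
# Speculative sketch only: FutureLearn's real ranking is not public.
# Every signal and weight here is invented for illustration.

def recommendation_score(course, weights):
    return (
        weights["popularity"] * course["enrolments"] / 10_000
        + weights["sponsorship"] * course["sponsored"]
        + weights["staff_pick"] * course["staff_pick"]
    )

courses = [
    {"title": "Intro to Programming", "enrolments": 50_000,
     "sponsored": 1, "staff_pick": 0},
    {"title": "Music and Society", "enrolments": 3_000,
     "sponsored": 0, "staff_pick": 1},
]

weights = {"popularity": 1.0, "sponsorship": 0.5, "staff_pick": 0.5}

# High-enrolment, sponsored courses easily outrank niche subjects,
# regardless of any individual learner's interests.
for course in sorted(courses,
                     key=lambda c: recommendation_score(c, weights),
                     reverse=True):
    print(course["title"], round(recommendation_score(course, weights), 2))
```

Under this toy weighting, sheer enrolment numbers dominate – exactly the kind of popularity-driven privileging I return to below.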

Do results feel personal or limiting? Is this optimisation, or a ‘you loop’?

I don’t think I would normally make use of the explicitly labelled recommendations; however, I often use the search function, which may involve similar algorithmic ordering and ranking. The choices here seem fairly limiting, almost persuading me that – in order to be an “expert” – I should study them. There seems to be an assumption that I would choose to study a course similar to one I have studied before, even though in reality I would probably want to look at something completely different.

What might be the implications?

My concern, looking (albeit very briefly) at both the general catalogue of courses and the recommendations, is that certain subjects appear privileged over others (there are a great many courses on computer programming, for instance). As mentioned above, this may be down to many other factors (such as course availability); however, it would be interesting to see how course enrolment numbers affect the ranking. I would personally find this a little disconcerting – I wouldn’t want a course to be privileged in my recommendations simply because it has high enrolment numbers. As elsewhere in education, a course with lower numbers, or one that generates less money, is not thereby any less important.

Further play with Google autocomplete

There appear to be some fairly binary options about technology being “good” or “bad”, and dominant ideas of ‘success’, presented here… could this be mainly influenced by what others have searched, or by a prevalence of articles supporting these positions? (A toy sketch of the first possibility follows the screenshots below.)

‘Is technology…’ Google autocomplete
‘How to succeed…’ Google autocomplete
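
To illustrate the first possibility – suggestions driven purely by the frequency of other users’ past queries – here is a minimal toy autocomplete (the query log is invented; Google’s actual system is far more complex):

```python
from collections import Counter

# Toy sketch: autocomplete ranked purely by how often other users have
# typed each completion. The query log is invented for illustration.

query_log = [
    "is technology good",
    "is technology good",
    "is technology bad",
    "is technology bad",
    "is technology bad",
    "is technology making us lazy",
]

def autocomplete(prefix, log, n=3):
    counts = Counter(q for q in log if q.startswith(prefix))
    return [query for query, _ in counts.most_common(n)]

# The most frequently searched (and often binary) phrasings rise to the top.
print(autocomplete("is technology", query_log))
# -> ['is technology bad', 'is technology good', 'is technology making us lazy']
```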

What might be the effect of ordering and ranking our feeds or timelines?

While browsing through my Facebook and Twitter feeds today, and reflecting on the relation between power and algorithms (Beer 2017), I noticed a friend had posted on Facebook a Guardian article asking ‘Why don’t we treat the climate crisis with the same urgency as coronavirus?’

Immediately below it, a post from another friend (from different circles) was displayed, linking to a Business Insider article about the importance of hand washing and the coronavirus.

It’s difficult to tell why the algorithm placed these posts next to each other – whether there was a connection between the two, or whether each was independently deemed likely to elicit some kind of ‘positive reaction’ (or reaction of another sort) from me. However, it did make me reflect on how, by being placed in close proximity, these posts initially felt (at least to me) “pitched” against one another, prompting reactions I might not otherwise have had if I had seen the posts independently.

Twitter’s algorithmic restructuring of timelines

Turning my attention to Twitter, this kind of algorithmic restructuring of timelines, at times using “deep learning”, initially caused controversy and an #RIPTwitter “campaign”.

@mjahr talked about these early efforts on the Twitter blog…

‘…when you open Twitter after being away for a while, the Tweets you’re most likely to care about will appear at the top of your timeline…’

(@mjahr on Twitter blog, 2016)


…and went on to praise their “success”…

‘We’ve already seen that people who use this new feature tend to Retweet and Tweet more, creating more live commentary and conversations, which is great for everyone.’

(@mjahr on Twitter blog, 2016)


Furthermore, this HootSuite blog post somewhat downplays criticism of the Twitter algorithm, stating that ‘the algorithm drove more engagement from users’.

Yet what does ‘engagement’ mean, how do they know what we ‘care about’, and how can they possibly prove that it is ‘great for everyone’? Driving traffic and increasing usage may be ‘great’ for Twitter and HootSuite – companies that attempt to derive profit in one way or another from such usage – but this kind of biased, uncritical language is perhaps all too common in some circles (such as those profiting through advertising or the sale of associated products).

Much has been written about the questions and issues that algorithms such as these raise, not least the relationship between political rhetoric and social media (for example, by Oliver 2020). I continue to read, explore and critically reflect, particularly pondering what this might mean in an educational context (such as our own use of Twitter during this course)…