The Fast Forward Labs Newsletter
The Ethics of Machine Intelligence

There are two approaches to the ethics of machine intelligence.

The first is speculative, and is concerned with the prospect of an "intelligence explosion" in which machines pose unforeseen threats to humanity. Spearheaded by Oxford philosopher Nick Bostrom, the ethics focus on imagining potential risks and carefully orienting research to mitigate them. OpenAI and the Future of Life Institute's open letter are inspired by this line of thought.

The second is practical, and is concerned with the inadvertent impact machine intelligence is already exerting on society today. Here, the ethics focus on explaining technologies that are not yet well understood to inform choices about their use. As an applied research company, Fast Forward Labs believes in understanding the practical consequences of emerging technologies. 

This newsletter presents four issues at stake in the practical ethics of machine intelligence.

1. Bias in Data and the Gap Between Accuracy and Fairness

As Montreal Professor Yoshua Bengio recently remarked, our society is riddled with "social, economic, and political issues that progress in AI will bring to the fore." Indeed, statistical methods reveal inequalities like income disparity across gender and race, but lack the contextual awareness required to tune themselves to adjust for bias. Automating decisions wholesale therefore risks perpetuating or amplifying existing bias. That's why the FTC now encourages companies that use algorithms for credit, loans, or employment to evaluate data and models before making decisions that can impact individual lives.

The key point here is that accuracy does not always equal fairness. Researchers at Carnegie Mellon have shown this in their Google Ad study, where women were presented with fewer ads for high-paying jobs than men. At a Future Tense event in December, Laura Moy mentioned car insurance rates were modified for night-shift workers given assumed correlations between risk and late-night driving. Here, models perform accurately, but that does not mean we want to accept their conclusions. 
Laura Moy and David Auerbach of the New America Foundation discuss algorithmic discrimination. 
In other instances, our models are not yet sophisticated enough to detect nuances in data. Our own Mike Williams has presented on the tendency of sentiment analysis algorithms to amplify male viewpoints because men use "strident, unambiguous expressions of emotion" more often than women. Models that use geolocation data to locate mobile devices have been found lacking, leading to bizarre events like strangers descending on private homes in search of lost cell phones.

How policy will adapt and evolve to address new technologies remains to be seen. In Big Data's Disparate Impact, Solon Barocas and Andrew Selbst showed that existing anti-discrimination laws will require reform to hold companies accountable for algorithmic discrimination in employment. The team at Upturn is working on explaining new technologies to policy leaders to help inform future legal and policy changes, as in their recent profile of predictive policing practices in Fresno, CA. 
2. Ownership and Agency

We'll release our report and prototype on recurrent neural networks for text summarization in February. Our prototype takes an extractive approach to summarization, selecting sentences that represent the key points in a document. A question arises: who wrote the summaries, and who owns their intellectual property? The original author, the coder who wrote the algorithm, the user of the algorithm, or the algorithm itself?
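To make the extractive approach concrete, here is a minimal sketch of a frequency-based extractive summarizer. This is an illustration only, not our prototype (which uses recurrent neural networks); the function name and the word-frequency scoring heuristic are our own simplifications.

```python
import re
from collections import Counter

def extractive_summary(text, n=2):
    """Score each sentence by the average document-wide frequency of
    its words, then return the top-n sentences in their original order."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    freq = Counter(re.findall(r'\w+', text.lower()))

    def score(sentence):
        tokens = re.findall(r'\w+', sentence.lower())
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    # Rank sentence indices by score, keep the best n, restore reading order.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return ' '.join(sentences[i] for i in sorted(ranked[:n]))
```

Even this toy version surfaces the authorship question: every word of the output was written by the original author, yet the selection was made by the algorithm.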

Some IP law suggests that the original author owns rights to the content. Last year, Google shut down its news extraction service in Spain after a new law was passed that would charge Google a fee for all content reuse. This area of law is evolving, and we expect further action as text analytics tools become widespread. To explore this question in an artistic setting, Darius Kazemi designed his submission to the 2015 NaNoGenMo contest to co-author a novel with an algorithm: the algorithm suggested sentences for him to select, much as employees now select for tone and style in BI reports written by language generation tools.
But what constitutes ownership or authorship in the first place? That's a big question, but it is bound up with individual agency, notions of which may need to shift if virtual assistants are to be widely adopted and realize their potential. Daniel Tunkelang recently wrote a provocative piece citing our unwillingness to transfer agency to machines - to allow them to represent us. Another obstacle to machine intelligence adoption is organizational reluctance to use systems that provide probabilistic results.
3. Internal and External Blind Spots

Personalizing experiences and recommendations for consumers is the goal of many data science efforts. Unfortunately, as Zeynep Tufecki argued in a recent post, the focus on "engagement" as a key metric for deciding which content to share on user feeds can happen "to the detriment of the substantive experiences and interactions" she and others want on sites like Facebook. 

Eli Pariser, CEO of Upworthy, described this blinding effect as a "filter bubble" in his 2011 book. We interviewed him to learn how his views are evolving as machine intelligence advances. With deep learning on the rise, companies are experimenting with novel techniques to improve recommendation algorithms.

But deep learning also presents internal and external blind spots because the algorithms are very hard to interpret. The desire for algorithmic transparency has led some to remain skeptical about the practical value of neural networks (we think they are immensely powerful, but that their limits should be respected).
4. Privacy

Last week's announcement that Yahoo! is releasing a massive data set generated excitement across the open source community. Anonymization practices are much better understood by now, so there's little risk that Yahoo! will repeat the User 927 mishap AOL fell into in 2006 (which inspired a play). Nonetheless, personalization is tightly entwined with privacy, and we advise clients analyzing user behavior at scale to consider streaming data methods, which reduce the risk that comes with retaining and storing data for traditional batch analysis.
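The privacy advantage of streaming over batch analysis can be sketched simply: aggregates are updated per event and each raw record is discarded, so there is never a stored data set to leak. The class below (our own illustration, using Welford's online algorithm) maintains count, mean, and variance of a stream without retaining individual observations.

```python
class StreamingStats:
    """Welford's online algorithm: tracks count, mean, and variance of a
    stream of values without storing the individual observations."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        # x goes out of scope here; only the aggregates persist.

    @property
    def variance(self):
        """Sample variance of everything seen so far."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0
```

A batch pipeline would retain every record to compute the same statistics later; the streaming version keeps three numbers, so a breach exposes aggregates rather than user-level data.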
Privacy is difficult to tackle because laws and requirements differ across state and national jurisdictions. Laws have different data transfer and breach notification requirements, and even define what qualifies as personal information differently. In a recent interview, Anita Allen and Lisa Sotto explain how Europe views privacy as a human right, while the United States views privacy as a consumer right, resulting in the complex split in requirements across industries. HIPAA continues to confuse healthcare providers and their affiliates (or Business Associates) because the definition of protected health information depends on data flows and does not cover all medical information. 

There's still a gap between the data science community and the data privacy/security community, and opportunities for collaboration. We recommend Dan Solove's Privacy & Security Blog, the DLA Piper Data Protection Handbook, and the OECD Privacy Principles as starting resources.
Finally, we're hosting an online talk on January 26 about recent developments in natural language generation. Join us!

Best wishes,
Kathryn and the Fast Forward Labs Team
Copyright © 2016 Fast Forward Labs, All rights reserved.
