Tutorial @ ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT) 2021

March 4-5, 2021

CRAFT session #

Workshop Description #

Our FAccT 2021 CRAFT session, “Narratives and Counternarratives on Data Practices in the Global South,” is an interactive workshop that uses storytelling as a method to question common assumptions around data practices in countries and communities that are often grouped as “the Global South.” Building on our current project around data sharing in Africa, and on our own autoethnographic observations in other countries, we will provide various theme-based stories that are informed by common narratives around data practices. We will then invite participants to challenge these narratives from various angles, including historical contexts and cultures, legal limitations, accessibility, impact assessments, accountability mechanisms, and more.

Our post-conference plan is to create “Narratives and Counternarratives on Data Practices in the Global South” story cards. Our hope is that these cards will then be used by data science educators, civil society groups, philanthropic groups, and inter-governmental organizations such as UN agencies. The cards are meant to expose these groups to narratives and counternarratives of data practices in the Global South and to inform them about challenges before they embark on any new initiatives or partnerships.

The CRAFT Workshop is taking place on March 5th, 6-7:30 PM (UTC). Registration information can be found on the ACM FAccT website.

Organized by: Rediet Abebe, Abeba Birhane, George Obaido, Roya Pakzad.

Data Externalities Workshop #

Workshop Description #

Externalities shape the data economy as we experience it. Besides uneven market shares, power asymmetries, and high levels of data sharing, this new market is characterized by little to no reimbursement to data subjects for their contributions. Data externalities, recently theorized in Microeconomics, offer an explanation for the absence of significant reimbursements for data. In this tutorial, we will introduce models in which data externalities arise. Through a series of case studies, we will expose crucial aspects of the contracting environment that aggravate data externalities and allow participants to develop potential interventions. We aim both to translate insights from Microeconomics for the FAccT community and to highlight opportunities for further research at the interface of market design, fairness, and accountability.

Structure #

Introduction
Breakout Room Discussions
Network Externalities vs. Data Externalities
Data Co-op
Data Governance and AI
Conclusion

Slides #

The Data Externalities Workshop is taking place on March 4th, 4-5:30 PM (UTC). Registration information can be found on the ACM FAccT website.

Organized by: Rediet Abebe, Charles Cui, Mihaela Curmei, Andreas Haupt, Yixin Wang.

References #

Daron Acemoglu, Ali Makhdoumi, Azarakhsh Malekian, and Asuman Ozdaglar. 2019. Too Much Data: Prices and Inefficiencies in Data Markets.
Dirk Bergemann and Alessandro Bonatti. 2019. The Economics of Social Data: An Introduction. SSRN Electronic Journal (2019). https://doi.org/10.2139/ssrn.3459793
Theo Bertram, Elie Bursztein, Stephanie Caro, Hubert Chao, Rutledge Feman, P. Fleischer, Albin Gustafsson, Jess Hemerly, Chris Hibbert, L. Invernizzi, Lanah Kammourieh Donnelly, Jason Ketover, Jay Laefer, P. Nicholas, Y. Niu, H. Obhi, D. Price, A. Strait, Kurt Thomas, and A. Verney. 2018. Three years of the Right to be Forgotten.
Alessandro Blasimme, Effy Vayena, and Ernst Hafen. 2018. Democratizing health research through data cooperatives. Philosophy & Technology 31, 3 (2018), 473–479.
Yinzhi Cao and Junfeng Yang. 2015. Towards making systems forget with machine unlearning. In 2015 IEEE Symposium on Security and Privacy. IEEE, 463–480.
Nicholas Carlini, Chang Liu, Úlfar Erlingsson, Jernej Kos, and Dawn Song. 2019. The secret sharer: Evaluating and testing unintended memorization in neural networks. In 28th USENIX Security Symposium (USENIX Security 19). 267–284.
Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Úlfar Erlingsson, Alina Oprea, and Colin Raffel. 2020. Extracting Training Data from Large Language Models. arXiv:2012.07805 [cs.CR]
Matt Fredrikson, Somesh Jha, and Thomas Ristenpart. 2015. Model inversion attacks that exploit confidence information and basic countermeasures. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security. 1322–1333.
Antonio Ginart, Melody Guan, Gregory Valiant, and James Y Zou. 2019. Making AI forget you: Data deletion in machine learning. In Advances in Neural Information Processing Systems. 3518–3531.
Aditya Golatkar, Alessandro Achille, and Stefano Soatto. 2020. Forgetting Outside the Box: Scrubbing Deep Networks of Information Accessible from Input-Output Observations. arXiv preprint arXiv:2003.02960 (2020).
Ernst Hafen, D Kossmann, and A Brand. 2014. Health data cooperatives-citizen empowerment. Methods Inf Med 53, 2 (2014), 82–86.
Thomas Hardjono and Alex Pentland. 2019. Data cooperatives: Towards a foundation for decentralized personal data management. arXiv preprint arXiv:1905.08819 (2019).
Thomas Hardjono and Alex Pentland. 2019. Empowering Artists, Songwriters & Musicians in a Data Cooperative through Blockchains and Smart Contracts. arXiv preprint arXiv:1911.10433 (2019).
Zachary Izzo, Mary Anne Smart, Kamalika Chaudhuri, and James Zou. 2020. Approximate Data Deletion from Machine Learning Models: Algorithms and Evaluations. arXiv preprint arXiv:2002.10077 (2020).
Yang Liu, Zhuo Ma, Ximeng Liu, and Jianfeng Ma. 2020. Learn to Forget: User-Level Memorization Elimination in Federated Learning. arXiv preprint arXiv:2003.10933 (2020).
Joan Rodon Mòdol. 2019. Citizens’ Cooperation in the Reuse of Their Personal Data: The Case of Data Cooperatives in Healthcare. In Collaboration in the Digital Age. Springer, 159–185.
A Pentland, T Hardjono, J Penn, C Colclough, B Ducharmee, and L Mandel. 2019. Data Cooperatives: Digital Empowerment of Citizens and Workers.
Ankit Singh Tanwar, Nikolaos Evangelatos, Julien Venne, Lesley Ann Ogilvie, Kapaettu Satyamoorthy, and Angela Brand. 2020. Global Open Health Data Cooperatives Cloud in an Era of COVID-19 and Planetary Health. OMICS: A Journal of Integrative Biology (2020).
Yinjun Wu, Edgar Dobriban, and Susan B. Davidson. 2020. DeltaGrad: Rapid retraining of machine learning models. arXiv:2006.14755 [cs.LG]
Minhui Xue, Gabriel Magno, Evandro Cunha, Virgilio Almeida, and Keith W Ross. 2016. The right to be forgotten in the media: A data-driven study. Proceedings on Privacy Enhancing Technologies 2016, 4 (2016), 389–402.
Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, and Oriol Vinyals. 2016. Understanding deep learning requires rethinking generalization. arXiv preprint arXiv:1611.03530 (2016).

Speakers and Organizers #

Rediet Abebe, Harvard University #

Rediet Abebe is a Junior Fellow at the Harvard Society of Fellows and an incoming Assistant Professor at the University of California, Berkeley. Her research is broadly in algorithms and AI, with a focus on equity and justice concerns. Abebe holds a Ph.D. in computer science from Cornell University as well as an M.S. and a B.A. in mathematics from Harvard University and an M.A. in mathematics from the University of Cambridge. She was recently named one of 35 Innovators Under 35 by the MIT Technology Review, in part for her work with MD4SG. Her research is deeply informed by her upbringing in Ethiopia. Abebe co-founded and has been co-organizing the MD4SG initiative since fall 2016.

Abeba Birhane, University College Dublin & Lero #

Abeba Birhane is a Ph.D. candidate in cognitive science at the School of Computer Science at University College Dublin, Ireland, and at Lero, the Irish software research centre. Her research explores questions of ethics, justice, and bias that arise with the design, development, and deployment of artificial intelligence.

George Obaido, University of the Witwatersrand #

George Obaido is a research associate at the University of the Witwatersrand and the University of Johannesburg in South Africa. His research interests lie in using natural language processing techniques to find solutions to problems of societal importance. He completed his PhD in Computer Science at the University of the Witwatersrand in Johannesburg, South Africa, under the guidance of Prof Abejide Ade-Ibijola and Dr Hima Vadapalli. He also obtained his MSc in Computer Science from the University of the Witwatersrand.

Roya Pakzad, Taraaz #

Roya Pakzad is the founding co-director of Taraaz, a research and advocacy non-profit working at the intersection of technology and human rights. Previously, she served as a Research Associate and Project Leader in Technology and Human Rights at Stanford University’s Global Digital Policy Incubator (GDPi). She also worked with Stanford’s Program in Iranian Studies on the role of information and communication technologies and human rights in Iran. Roya holds degrees from Shahid Beheshti University in Iran (B.Sc. in Electrical Engineering), the University of Southern California (M.Sc. in Electrical Engineering) and Columbia University (M.A. in Human Rights Studies).

Charles Cui, Northwestern University #

Charles Cui is a Ph.D. student in computer science at Northwestern University. His research interests lie at the intersection of theoretical computer science and economics. He is passionate about applying tools from algorithmic and mechanism design to solve problems in data economies and environmental protection. He hopes to better understand and guide human behavior and system design to bring positive social impact to the world we live in. Charles received his bachelor’s degree from Oberlin College in 2020, where he studied mathematics and computer science.

Mihaela Curmei, University of California, Berkeley #

Mihaela Curmei is a PhD student in the Electrical Engineering and Computer Science department at Berkeley. Broadly, her interests lie at the intersection of Machine Learning and Control Theory. Her research aims to improve the reliability, safety, and accountability of decision-making systems by developing techniques to guarantee desirable properties in feedback-driven interactions between ML systems and society. Prior to Berkeley, Mihaela worked as a data scientist at Microsoft and completed her bachelor’s studies at Princeton University with a degree in Operations Research and Financial Engineering.

Andreas Haupt, Massachusetts Institute of Technology #

Andreas Haupt is a graduate student researcher at the Massachusetts Institute of Technology’s Institute for Data, Systems, and Society. His research is broadly in Mechanism Design, Microeconomic Theory, and Systems Engineering. Andreas holds M.S. degrees in Economics and Mathematics from the University of Bonn, Germany, as well as B.S. degrees in Mathematics and Computer Science from the University of Bonn and Frankfurt, Germany, respectively. Andreas has recently worked as a trainee at the European Commission’s Competition Authority and helped a German school centre and its students with the productive use of digital technology.

Yixin Wang, University of California, Berkeley #

Yixin Wang is a post-doctoral researcher in the Electrical Engineering and Computer Science department at Berkeley, advised by Professor Michael Jordan. She works in the fields of Bayesian statistics, machine learning, and causal inference. Her research interests lie at the intersection of theory and applications. She completed her PhD in statistics at Columbia, working with David Blei, and her undergraduate degree in mathematics and computer science at the Hong Kong University of Science and Technology.