Designing AI for Social Good

For this project, you will work with the same team as Project 2 to improve the AI application you built, making it less likely to cause unintended harms and more resilient against malicious users.

On this page:

  • Learning Goals
  • Starting Point
  • Weekly Tasks
  • Cook-Off
  • Final submission

Learning Goals

  1. Systematically anticipate unintended harms an AI product could cause in the messy real world, and trace them to their origin in the product lifecycle;
  2. Understand fairness as harms falling unevenly on people, and make deliberate design choices to prevent and mitigate unfairness;
  3. Understand that new harms can emerge after deployment through human-AI interplay, and build in mechanisms to detect and mitigate them.

Starting Point

In the previous project, you built an AI application and tested it with classmates who understood the task and were motivated to give good feedback. Now imagine it in the hands of the users your Project 2 task was actually designed for:

  • AI & Programming: Your application is now part of Instagram, helping people from all walks of life who cannot code express personal experiences through p5.js.

  • AI & Dating: Your social connection app is now part of a platform like Hinge, helping real users find connections they desire.

  • AI & 3D Assembly: Your LEGO/IKEA assistant is now part of a LEGO/IKEA factory assembly line. The speed and accuracy of each worker’s assembly directly affects their job performance and pay.

In Project 1, you analyzed the value ecosystem around an existing product in this space: the stakeholders, their incentives, and the tensions between them. Your system operates in that same ecosystem.


Weekly Tasks

Task 1

Identify a wide range of potential harms your application might cause in real-world contexts, using the four-step workflow taught in class. Then analyze which of the identified harms are the highest priority for you, as the application's designers, to address.

💭 Read the task 1 grading rubric carefully before starting.

🛎️ Complete the green columns of this worksheet. Make it viewable to all Cornell accounts. Submit its URL through this form.


Task 2

For your highest-priority harm from Task 1, analyze:

  • Who is most at risk from this harm? Describe the specific characteristics that make them vulnerable.
  • Choose one fairness definition from the lecture taxonomy and explain what it would mean when applied to your system. Recall that these definitions were built for binary classifiers, so you will need to adapt the one you choose. There is no single technically correct answer. What we’re looking for is a clear connection between the definition you chose, the specific harm, and the specific user group: why does this definition capture what’s at stake for these users?

In addition, design a mitigation strategy for a high-priority, detectable harm:

  • Choose a harm from your prioritized list that a malicious user could attempt to trigger within a 15-minute session, and one that is specific to how your system is designed (rather than, for example, a risk that any LLM application carries).
  • Describe it in enough detail that a tester could attempt to trigger it without further guidance (revise the harm description from Task 1 if it doesn’t already meet this bar).
  • Design how your system will detect and respond to it, and implement the design before the cook-off (a minimal sketch follows this list). Pilot testing with friends is strongly recommended.
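
As an illustration, here is a minimal sketch in Python of what a detect-and-respond layer could look like. Everything named here is hypothetical: `generate_reply` stands in for your existing model call, and the keyword list stands in for whatever detection logic actually fits the harm you chose.

```python
# Minimal sketch of a detect-and-respond layer around an LLM call.
# Hypothetical names: generate_reply stands in for your existing model call;
# the keyword list stands in for the detection logic that fits your harm.
import datetime

HARM_PATTERNS = [
    "ignore previous instructions",   # e.g., a prompt-injection attempt
    "rate this person's appearance",  # e.g., a harm specific to a dating app
]

interaction_log = []  # raw interaction log, submitted after the cook-off


def generate_reply(user_input: str) -> str:
    """Placeholder for your application's actual model call."""
    return f"(model reply to: {user_input})"


def detect_harm(user_input: str) -> bool:
    """Return True if the input matches a known pattern for the targeted harm."""
    text = user_input.lower()
    return any(pattern in text for pattern in HARM_PATTERNS)


def respond(user_input: str) -> str:
    """Check for the targeted harm before calling the model; log every turn."""
    flagged = detect_harm(user_input)
    reply = "I can't help with that request." if flagged else generate_reply(user_input)
    interaction_log.append({
        "time": datetime.datetime.now().isoformat(),
        "input": user_input,
        "flagged": flagged,
        "reply": reply,
    })
    return reply
```

Keyword matching is only the simplest option; a classifier, an LLM-based judge, or a rule over conversation state may fit your harm better. The point is that detection, response, and logging are explicit design decisions your team makes before the cook-off.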

💭 Read the task 2 grading rubric carefully before starting.

🛎️ Complete the orange columns of the same worksheet. Submit its URL through this form.

🛎️ Each group member: Time for peer evaluation.


Cook-Off

During the week 15 class, all teams will set up a booth where students from other teams will test your system: two acting as normal users and two acting as malicious users. You will not know which two are acting maliciously. Malicious testers will receive your harm table before the session and will try to trigger the harms you identified, plus any others they can think of.

Malicious testers will complete a survey (the link will go live on the day of the cook-off) for each test, documenting at least 2 specific attempts: the harm targeted, the exact input or behavior used, and what the system did in response.

🛎️ By the end of class, submit your raw interaction log and confirm through this form that all tester surveys have been completed. Specific deliverables and rubrics are listed here.


Final submission

Revise Tasks 1 and 2 based on TA feedback, then complete the purple columns of the worksheet.

  • Cook-off reflection: Complete the cook-off reflection tab of the worksheet.
  • Accumulative harm monitoring: Choose a harm from your harms table whose risk category is “accumulative” or “difficult for humans/laws to catch.” In its Mitigation Strategy column, specify: (1) which interaction events your system will log and what pattern in those logs indicates the harm is emerging; and (2) what threshold triggers a response and what that response is (see the sketch after this list).
  • External accountability mechanism: Choose another harm from your harms table whose risk category is accumulative or not human-noticeable, and that has societal implications. In its External Accountability column, describe one mechanism your system will enable so that users or third parties outside your team can surface this harm. Name who would be involved and why those people specifically are positioned to catch what your own monitoring would miss.
  • Teamwork reflection: Under “Learning Reflection” in the submission form, each team member shares a moment when they learned something from another team member (e.g., to understand a fairness tradeoff, to decide which harm to prioritize, or to implement the detection mechanism).
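
For the accumulative harm monitoring item, here is a minimal sketch in Python of the logging-and-threshold idea. The event name, the seven-day window, and the threshold of ten events are all hypothetical placeholders for the pattern and threshold you commit to in your worksheet.

```python
# Minimal sketch of accumulative-harm monitoring: log interaction events,
# look for a pattern across them, and trigger a response past a threshold.
# Hypothetical choices: the event name, the 7-day window, and the threshold
# of 10 events stand in for whatever you specify in your worksheet.
import datetime
from collections import deque

WINDOW = datetime.timedelta(days=7)
THRESHOLD = 10  # e.g., 10 discouraging replies to the same user within a week

events = deque()  # (timestamp, user_id, event_name)


def log_event(user_id: str, event_name: str) -> None:
    """Record one interaction event as it happens."""
    events.append((datetime.datetime.now(), user_id, event_name))


def harm_emerging(user_id: str, event_name: str) -> bool:
    """Count matching events for this user within the window."""
    cutoff = datetime.datetime.now() - WINDOW
    count = sum(1 for t, u, e in events
                if t >= cutoff and u == user_id and e == event_name)
    return count >= THRESHOLD


def check_and_respond(user_id: str) -> None:
    if harm_emerging(user_id, "discouraging_reply"):
        # The response is also a design choice: notify the user, soften the
        # system's tone, or flag the account for human review.
        print(f"Monitoring alert: accumulative harm pattern for {user_id}")
```

Whatever events, window, and threshold you pick, state them explicitly in the Mitigation Strategy column so a reader could reproduce your monitoring decision.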

💭 Read the final submission grading rubric carefully before starting.

🛎️ Submit the URL of the worksheet through this form. Congratulations on completing the course! Great work!

🛎️ Each group member: Time for the last peer evaluation of the semester.