Hillary N. Owusu

PhD Student · CLIP Lab · University of Maryland
open to research internships and collaborations

Hi! I am a third-year PhD student in Computer Science at the University of Maryland, fortunate to be advised by Prof. Naomi Feldman in the CLIP Lab.

My research focuses on AI Safety and Alignment: understanding how language models reason and how they can be influenced. I use mechanistic interpretability and causal interventions to study model internals, drawing on cognitive science and computational social science to understand what those internals mean.

I also care deeply about AI education, helping students and engineers engage with these systems critically. I coordinate ML instruction at UMD and have mentored undergraduate researchers through full research cycles.

Feel free to reach out if you want to chat or collaborate!

Publications
ACL 2026 under review
Anchoring Depends on Confidence and Post-Training in Language Models
Hillary N. Owusu, Naomi H. Feldman
Across 6 models and 3 training paradigms, internal certainty predicts anchoring susceptibility better than factual accuracy does. Post-training makes models significantly more vulnerable.
NeurIPS 2026 in preparation
Same Bug, Different Wiring: How Training Paradigm Shapes Anchoring Bias in LLMs
Hillary N. Owusu, Naomi H. Feldman
Localizing anchoring circuits across base, instruct, and distilled models via activation patching and causal tracing. Testing activation steering as a mitigation strategy.
arXiv preprint 2025
Bias-Aware AI Chatbot for Engineering Advising at the University of Maryland
Hillary N. Owusu et al.
RAG-based advising chatbot with integrated bias detection. Led the project as corresponding author, supervising 4 undergraduates from scoping through publication.
Projects
Robustness · Benchmarking
GSM++ Reasoning Robustness Framework
Stress-tested LLaMA reasoning across 5 semantic variants of GSM8K. Accuracy dropped from 71% to 55.5% under Hindi translation.
Privacy · Security
Membership Inference Attack on GPT-2
Built a TF-IDF-based attacker against a GPT-2 shadow model, achieving 97% accuracy at distinguishing training members from non-members.
Fairness · NLP · Low-resource Languages
Gender Bias in English-to-Twi Neural Machine Translation
Audited gender bias in English-Twi NMT across 3,080 minimal pairs. Found systematic gendered semantic drift in translations.
Experience & Service
2026
Reviewer, ACL 2026
Jul–Aug 2025
Supervised 4 undergraduates through the Bias-Aware Chatbot project; corresponding author on the resulting preprint.
2025–present
ML Instructor & Coordinator, CMSE TLP Program, UMD
Teaching ML skills and AI applications to undergraduate engineers across disciplines.
2023–2024
Facilitator, CMSE Summer Bridge Program, UMD
2023–2024
Teaching Assistant, UMD
Algorithms (CMSC351), Data Science (CMSC320), C Programming (CMSC106)
Last updated March 2026 · hillaryowusu.com