Interviewer: Arushi Gupta

Trivia
- Favorite Book: The Foundation series – Isaac Asimov
- Favorite Place: Santa Fe, New Mexico
- Favorite Food: Enchiladas
- Favorite Country: United States
Could you give an overview of the research you and your lab work on and its applications?
My lab conducts research across a variety of subject areas, ranging from climate science, sustainability, election administration, and election outcomes to more general issues about where people get their information and how information works in either good or bad ways to influence people’s attitudes and behaviors. Like most faculty members at Caltech, we take a very quantitative approach to our research. We use big data, machine learning, and deep learning to explore datasets and uncover new insights into human behavior in social, political, and economic areas.
What influenced you to pursue the fields of social/political science, social/decision neuroscience, and research methodologies? What influenced you to incorporate computational methods and machine learning in your research?
There is a lot of serendipity in how academics develop their research strategies. I did my undergraduate work as a geology major at Carleton College but struggled with certain analytical subjects like physics. Eventually, I had to switch majors, and I explored both political science and economics, largely because I had enjoyed classes in those subjects and could still graduate on time. I ended up majoring in political science because I enjoyed it. During my time there, I learned a lot from both qualitative and quantitative political scientists on the faculty of Carleton, including future US Senator Paul Wellstone. After completing my undergraduate degree, I attended law school but found out that I absolutely hated it.
When my fiancée started a Ph.D. program in biochemistry in North Carolina, I looked at political science and law programs at Duke University. For reasons that are still sort of mysterious to me today, I was admitted to Duke’s political science program. This is where the serendipity comes in: I hit that program during a time when it was transitioning from being a traditional qualitative political science program to a more quantitative one. It was precisely at a moment when someone with my analytical interests could learn a lot from the incredible new faculty they had just recruited. This led to my interest in studying individual human behavior, specifically regarding where people get their information and how campaigns can influence people’s attitudes and actions. This became the focus of my dissertation and much of my early work at Caltech in the 1990s.
Over the years, my interests have continued to evolve, largely because of the environment that I’m in here at Caltech, with collaborations that span computer science, climate science, and sustainability science, and because of the interests that students have.
What advantages do you see in integrating various disciplines?
What I think is unique about being a faculty member at Caltech and working with both undergraduate and graduate students here is that we are a small, non-traditional university. I’m currently sitting in Baxter Hall surrounded by an incredibly diverse group of colleagues. In the offices next door to me, there’s an economist, a psychologist, a decision theorist, and a neuroscientist, among others. I have this very eclectic mix of people around me.
It’s very easy for me to walk over to Annenberg to work with computer scientists or to Gates-Thomas to collaborate with those working in climate science. Caltech makes it easy, and by doing so, it is possible for us to do work that spans various intellectual and academic disciplines. This not only makes our work more exciting but also pushes all these disciplines in new directions. By injecting social science concerns into computer science research and vice versa, we’re producing innovative and cutting-edge research that has a significant impact on both science and society.
What aspects of your research do you most enjoy? What about working with students do you most enjoy?
The answer to the first question is working with students. Working with students is the most enjoyable part of my day, and the good news is that it’s what I spend most of my day doing. I work with Ph.D. students, undergraduates, and postdocs on all these different research projects we have. I bounce from meetings that cover our research project with Activision, where we are using their data to better understand player toxicity on their platform, to climate science, where we are trying to understand why people do not trust climate science and how we can change that, to fundamental work in computer science, where we are building deep learning tools that can detect harassment and toxicity in social media and text conversations. I go from meeting to meeting all day with students, where I hear all of their exciting ideas and try to push those ideas into publication-quality research projects.
It’s fun and I never have a dull day. I’m interacting with the world’s smartest students at Caltech. I’m always learning something new, whether it’s some cool new AI tool that a student read about or the latest campaign tactics that are going on in the Republican primary right now between DeSantis and Trump. It’s really exciting because students bring all of these incredible new ideas and thoughts into these conversations. They are constantly stimulating my thinking about ways we can turn these ideas into impactful research. So, it’s just a lot of fun to work with students.
Can you discuss a specific project that your lab is currently working on that you find exciting?
I would say the most interesting and challenging research agenda that we have is the collaboration my group has with Anima Anandkumar. For the past five years, we have been working to put together large datasets and build a variety of machine-learning tools that can sift through the data in real time to identify trolling, harassment, and toxic conversations. This has been the culmination of a lot of different research projects that I have engaged in throughout most of my career.
The issue is that toxicity has become a rampant problem across various platforms, including social media, political discussions, and gaming platforms. We have been trying to build tools that can detect such behavior and conversations in real time. And now that we think we can do so using current tools that have been developed (by us and others), we are trying to pivot towards answering questions like: What can we do about it? At what point do we actually produce interventions? What do those interventions look like? And how do we test them?
Maya Srikanth, an undergraduate at Caltech who graduated two years ago, presented a paper on building chatbots that can intervene in toxic conversations on social media platforms by subtly or not so subtly changing their nature. With the recent developments in large language models, we are now in a better position to build and test such interventions, not only in social media but also in other relevant areas.
What are the challenges you have faced in the execution of this project?
On the research side, one challenge is that trolling, harassment, and toxicity are low-incidence behaviors, which makes them difficult to detect unless large quantities of data are collected. So, we have had to spend a considerable amount of time and effort to find very large datasets where these kinds of behaviors could be detected. A few years ago, when it was very difficult for academics to easily access large quantities of social media data, we partnered with Twitter to create a dataset on all social media conversations on their platform related to the #MeToo movement. This dataset has been very important to us and we have used it to build a lot of our tools. A while after getting that data, Twitter made changes to its policies which allowed academics to collect much more data. We took advantage of the opportunity and collected a series of very large datasets, including one about COVID-19 that we now use. Although we have been able to overcome the challenge of dataset collection, it has required a considerable amount of time and effort from our research group.
The second challenge we have faced is that many of the individuals who are systematically engaging in spreading misinformation, negativity, and toxicity through social media accounts are doing so in strategic ways. They know that they are being followed and tracked on platforms like Facebook and Instagram in particular, and so they try to do things to cover their tracks. These individuals change their behavior, usernames, hashtags, and keywords in subtle ways to avoid detection, making it challenging to use machine learning tools that are trained on past conversations to identify them in real-time or future interactions. As a result, we have worked on developing some really interesting machine learning and deep learning algorithms to track changes in conversations over time to detect how these negative actors are modifying their behavior.
The third challenge we encountered is associated with the size of these datasets. We want to apply natural language processing to reduce the dimensionality of the datasets we work with to find trace information on trolls and harassers. However, natural language processing on such large datasets involving tens or hundreds of millions of social media posts is a computationally intensive task that cannot just be done on a desktop computer. To address this, we have implemented two solutions. First, we have provided resources for our students and postdocs to work in Google Cloud or AWS. We have also partnered with NVIDIA through Anima’s group to use their technologies to speed up our use of natural language processing. Second, we have developed new natural language processing tools and methods of estimation with the help of undergraduate students at Caltech who have helped make fundamental discoveries. Using tensor algebra, we have been able to speed up the computation of many of the types of natural language tools that people commonly use, reducing their computational time from weeks or months to just minutes. Despite the challenges we faced, we were able to develop solutions and make significant contributions to the research literature.
What are your interests outside of teaching and research?
My wife and I live in Pasadena and have a very small dachshund, who’s about 10 pounds, and a very large yellow lab, who’s about 75 pounds. We spend a lot of time with them because our lab in particular needs a lot of exercise. So, I spend a lot of time walking and running with the dogs. My other main hobby these days is running. Before the pandemic, I had already been a runner for a long time, but my daughter challenged me to run a half marathon, which got me serious about endurance training again. I ran that half marathon and enjoyed it, so I decided to do more of those. So, when I’m not working, I mainly spend my time hanging out with the dogs and running.
What advice do you have for undergraduate students who are interested in pursuing research, especially in your areas of interest?
The advice I always give to incoming freshmen at Caltech, or to students who are considering coming here, is to get to know your professors. The wonderful thing about Caltech, especially for undergraduates, is that it’s very small and faculty work on campus. You can easily get to know your professors by meeting with them in their offices or labs, talking to them after class, inviting them to Red Door, etc. When you get to know your professors, you’ll find that many of them are really interested in you and what you are doing. This can create a lot of opportunities for you, whether it’s joining their lab, working on a research project, or pursuing some other activity on or off campus.
Getting to know your professors is a force multiplier for Caltech students because it can really give you life-changing opportunities that you would not be able to get if you were at a larger school. Many of the undergraduates who end up working in my group and lab remain personal and professional friends and contacts long after. I still communicate with Caltech undergraduates that I got to know back in the ’90s and 2000s, and some of my former Ph.D. students are still active collaborators of mine. So, my advice is to take advantage of this opportunity to get to know your professors and see where it takes you.