How not to email prospective grad school advisors

composition.al 2020-11-26

It’s grad school application season¹, which means that prospective students have been emailing potential faculty advisors. This isn’t something that you necessarily have to do in order to find an advisor, but it can really help. Unfortunately, it can also hurt if you do it wrong. I get my share of these emails, and I want to give some advice on what makes an effective email from a prospective student.

When customization goes wrong

One piece of standard advice given to prospective students is that they should customize their letter to each prospective advisor, rather than sending the same one-size-fits-all message to everyone.

This is good advice! I get a lot of emails from prospective students that are pretty bad because they’re not customized at all. These emails often say something like “I believe I would be an excellent fit for your research group” and then go on to discuss the applicant’s extensive background in computational biology, signal processing, circuit design, and a plethora of other topics, none of which have to do with what my research group and I do (not that I have anything against those research areas, of course). These emails don’t make a good impression on me, but they don’t make a bad impression on me, either. They pretty much just bounce off.

To make an impression on a prospective advisor, you need a customized message. Unfortunately, a poorly customized message can backfire and leave a bad impression. As an example of customization gone wrong, here’s an email I got from a prospective student a few weeks ago:

Dear Dr. Kuper

Greetings! I am $NAME, a prospective PhD student for Fall ‘21. I have completed my B.Sc. in Computer Science and Engineering from the CSE department of $UNIVERSITY in $YEAR.

I am highly interested in Software Engineering, Big Data Analysis & Distributed Systems. Some of your research gave me valuable insights about the aforementioned topics. Among your research, Toward Domain-Specific Solvers for Distributed Consistency caught my eye instantly. In this research, you’ve stated that domain-specific SMT-based tools that exploit the mathematical foundations of distributed consistency would enable both more efficient verification and improved ease of use for domain experts. Also you’ve tried to democratize the development of domain specific solvers by creating a framework for domain-specific solver development that brings new theory solver implementation within the reach of programmers who are not necessarily SMT solver internals experts.

Another one of your research Verifying Replicated Data Types with Typeclass Refinements in Liquid Haskell seemed pretty interesting to me. This research is an extension to Liquid Haskell that facilitates stating and semi-automatically proving properties of typeclasses. Your work allows refinement types, that is augmented by Liquid Haskel, to be attached to typeclass method declarations, and ensures that instance implementations respect these types. I liked both of them.

[more information about this student’s background, test scores, and so on]

Sincerely, $NAME

The author of this email wants to show that they are interested in the specifics of my research, which is great! Unfortunately, they chose to do that by plagiarizing from my papers.

The phrases

domain-specific SMT-based tools that exploit the mathematical foundations of distributed consistency would enable both more efficient verification and improved ease of use for domain experts

and

democratize the development of domain specific solvers by creating a framework for domain-specific solver development that brings new theory solver implementation within the reach of programmers who are not necessarily SMT solver internals experts

are both verbatim² from the abstract of my SNAPL ’19 paper with Peter Alvaro, “Toward Domain-Specific Solvers for Distributed Consistency”. The phrases

an extension to Liquid Haskell that facilitates stating and semi-automatically proving properties of typeclasses

and

to be attached to typeclass method declarations, and ensures that instance implementations respect these types

are both verbatim from the abstract of my OOPSLA ’20 paper with Yiyun Liu, James Parker, Patrick Redmond, Mike Hicks, and Niki Vazou, “Verifying Replicated Data Types with Typeclass Refinements in Liquid Haskell”.

This isn’t a good way to apply the “customize your message” advice. It’s definitely good to mention some specific, recent papers that you’ve looked at, and it’s great if you can say something about what stood out to you about those papers, possibly by paraphrasing some part of them. However, it’s not appropriate to copy and paste large chunks of text from the abstracts of a prospective advisor’s papers to describe your own research interests. This is plagiarism, and it will come across as insulting to many prospective advisors, because it makes it look like you assume the person reading the email is not going to notice or recognize that you copied and pasted.

In fact, I’d claim that any prospective advisor who does respond positively to such an email is not someone you’d actually want as an advisor! If a prospective advisor doesn’t notice the plagiarism, that seems to me like a sign that they’re not particularly engaged in the process of writing their own papers (perhaps offloading that work to colleagues or students). If I were a prospective student, I’d be wary of signing up to work with an advisor like that.

What to do instead

Looking at other emails I’ve gotten from prospective students, here’s an excerpt from one that I consider good:

I loved reading your paper “Toward domain-specific solvers for distributed consistency,” and I’m really excited about the possibilities that solvers people can modify and fine-tune to their specific problems themselves — without needing to be SMT experts — would bring in terms of new programming languages and tools. If you have any time in the next few months, I would love to talk with you.

This prospective student is referencing the same paper as the previous one, but instead of using my words, they’re paraphrasing and summarizing in their own words. Another good thing about this email is that it emphasizes the prospective student’s excitement about the topic. Excitement is more important than perfectly polished writing! I responded to this email and had several long and productive conversations with the student, and we ended up making an offer to them. They ultimately decided to accept an offer from a different school, and I’m confident that they will thrive there.

Another way to show excitement about a research topic is by asking a question. Here’s another excerpt from an email, this time from a different prospective student, asking about the same paper:

Regarding the internals of the SMT solvers…it’s my understanding that you hope to provide a way for programmers to build their own solvers specific to the problems they are working on without expertise in SMT solvers. Is there much similarity between the internals of varying solvers? Wikipedia lists maybe 30 or so that I guess would be a base to build upon and selecting one or the other might involve evaluating trade-offs. Do you think it would be possible to have a single general purpose solver to build more specific solvers on top of or will many different ones be necessary in achieving your goals?

This student’s writing is refreshingly straightforward, and they’re asking interesting questions that warrant a serious response. Here’s part of what I wrote back in response:

There are definitely some key ideas and algorithms that get used in most, if not all, modern SMT solvers! I discuss some of them in the lecture notes from my class last fall, and I also recently gave a talk about the conflict-driven clause learning (CDCL) algorithm that SAT solvers use.

But one problem is that heavily-optimized modern solvers are (like some compilers) pretty hard to understand, modify, or predict the behavior of. I know people who run several versions of Z3 at the same time, just because they don’t know which versions will work faster for the problem they’re trying to solve!

In principle, an SMT solver has a modular design, where there are distinct theory solvers and an underlying SAT solver, and it should be possible for people to plug in their own theory solvers. But in practice, my impression is that plugging in your own theory solver is still very hard to do, to the point where most people don’t attempt it. That’s something that I would like to try to address in the long run. But in the short term, I’d like to develop fluency in those aforementioned higher-level solver-aided tools, and then try to figure out what the limitations of those tools are and where they reflect a limitation in the underlying solver that a custom theory solver might help with.

Writing this response gave me an opportunity to reflect on what research goals I consider worth pursuing in the short and long term, and to try to articulate those goals. I really had to think to answer these questions, which is good. At the same time, though, the prospective student is asking specific questions (and not merely “What do you think the next step is for your research on $TOPIC?”), demonstrating a willingness to meet me halfway. Ideally, my students and I will work as a team, both investing effort in each other and making each other better in a virtuous cycle. I’m looking for students who are interested in that kind of mutually beneficial partnership.

¹ Obligatory plug: if you’re applying to computer science graduate programs and you’re interested in studying programming languages, systems, databases, and the intersections of those areas, consider applying to UC Santa Cruz to join the Languages, Systems, and Data Lab! Deadline of January 11; GRE not required!
² Well, not quite verbatim: I correctly hyphenated “domain-specific solvers” in my own paper.