36-401, Modern Regression, Fall 2015: Reflections and Lessons Learned

Three-Toed Sloth 2016-01-25

Summary:

Attention conservation notice: Navel-gazing by an academic.

This was my first time teaching our undergraduate course on linear models ("401"). I've taught the course which follows it (402) four times, and re-designed it once, but I've never had to actually take the students through the pre-req. They come in with courses on probability, on statistical inference, and on linear algebra, but usually no real experience with data analysis. Linear regression is usually their first time trying to connect statistical models to actual data — as well as learning about how linear regression works.

I am OK with how I did, but only about OK. The three big issues I need to work on are (1) connecting theory to practice, (2) getting feedback to students faster, and (3) better assignments.

(1) I feel like I did not strike a good balance, in lecture, between theory, computational examples, and how theory guides practice. The last thing I want to do is turn out people who just (think they) know which commands to run in R, without understanding what's actually going on. (As a student put it to a colleague in a previous semester, "The difference between 401 and econometrics is that in econometrics we have to know how to do all this stuff, and in 401 we also have to know why." This was not, I believe, intended as a compliment.) But based on the student evaluations, and still more the assignments, there are still students who are a bit fuzzy about what "holding all other predictor variables constant" actually means in a linear model. Then again, based on student feedback I persistently have a problem connecting mathematical theory to data-analytic practice; more serious re-thinking of how I teach may be in order.
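
For concreteness, here is a minimal sketch of what "holding all other predictor variables constant" amounts to. It is not from the course materials: the data are made up and noiseless, the names (`solve`, `ols`, `predict`, `x1`, `x2`) are hypothetical, and it is in plain Python rather than R only so that it runs with no packages installed. The coefficient on `x1` in the two-predictor fit is the change in the prediction when `x1` goes up by one unit and `x2` is pinned at a fixed value; the slope from regressing on `x1` alone is a different number, because there `x2` is left free to move along with `x1`.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination
    with partial pivoting (a is a square list of lists)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Bring the row with the largest entry in this column into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(columns, y):
    """Least-squares coefficients (intercept first), obtained by
    solving the normal equations (X'X) beta = X'y."""
    rows = [[1.0] + list(r) for r in zip(*columns)]  # add intercept column
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


# Made-up data: x2 tends to rise with x1, and y = 1 + 2*x1 + 3*x2 exactly.
x1 = [0, 1, 2, 3, 4, 5]
x2 = [1, 0, 3, 2, 5, 4]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

b0, b1, b2 = ols([x1, x2], y)  # regression on both predictors
a0, a1 = ols([x1], y)          # regression on x1 alone


def predict(u, v):
    """Fitted value from the two-predictor model."""
    return b0 + b1 * u + b2 * v


# Raise x1 by one unit while x2 stays fixed at 2: the prediction
# changes by exactly the coefficient b1.
change_holding_x2 = predict(3, 2) - predict(2, 2)

print("coefficient on x1, two-predictor model:", b1)
print("change in prediction with x2 held at 2:", change_holding_x2)
print("slope on x1 when x2 is left out:", a1)
```

The last number is larger than the first two because, in these data, a bigger `x1` tends to come with a bigger `x2`, and the one-predictor slope absorbs that association.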

(2) Students need faster and more consistent feedback on their assignments. We were somewhat constrained on speed this semester by a labor shortage, but I could have done more to ensure consistency across graders.

(3) Too many of the assignments were based on small, old data sets from the textbook. Mea culpa.

This was the first time we had two sections of 401, with two separate professors. I think we did OK at coordinating them, and I take full responsibility for all the failures and glitches. (I should add, because I know some of the students read this, that grades were curved and calculated completely independently across the two sections.)

I am very grateful for the work done on designing the curriculum for this course by my colleagues. Still, I feel like a lot of the course was spent on (to be slightly unfair) special cases which people could work out in closed form in the 1920s, and pretending that they had relevance to actual data analysis. (Cf.) The Kids do need at least a nodding acquaintance with that stuff, because people will expect it of them, but I would rather they be taught it as a nice bonus rather than a default. This would mean a lot more re-design than I put into the course.

Relatedly, I came to have a thorough, almost personal, dislike of the textbook, but that's another story.

Some things which did go well:

    • Using Piazza for question-answering. (Thanks to Brendan O'Connor for pushing it on me.) Students were allowed to be anonymous to each other, but not to me or to the TAs, and this seemed to make sure there were no issues with trolling or general viciousness. (My plan of assigning them names of fossil animals as persistent pseudonyms proved too cumbersome to try.) Since when one student had a question, others usually had the same question, and they were good about reading what was posted, this drastically cut down on the amount of time I spent answering e-mail. (Concretely: I wrote under 600 e-mails for this class, compared to over 1000 last semester.) Of course, since Piazza isn't charging me or my students or CMU, I am sure they have some Cunning Plan from which we will not benefit. But I will keep using the service until they break it, or their nefarious schemes come to hideous fruition.
    • Encouraging the use of R Markdown. (I will make it mandatory for 402, and mandatory if I ever teach 401 again.)
    • Insisting on exploratory data analysis. (Teaching them to be selective about which parts of their EDA they report needs more practice on my part.)
    • I think it got through to everyone that the usual significance tests are only appropriate if the model is well-specified, so that parametric inference comes after model checking. Indeed, the former is meaningless if the model fits

Link:

http://bactra.org/weblog/1125.html

From feeds:

Statistics and Visualization » Three-Toed Sloth

Date tagged:

01/25/2016, 10:37

Date published:

01/25/2016, 10:37