Inverse covariance weighting and factor analysis are two ways to take a set of variables that are supposed to measure common latent factors and reduce them to one or a few indices. What is the difference? I get the question fairly often, so I thought I’d put this post up.
The two approaches do different things. Inverse covariance weighting applies an assumption that there is one latent trait of interest, and constructs an optimal weighted average on the basis of that assumption. Factor analysis tries to partial out an array of orthogonal latent factors.
An intuitive way to think of it is like this:
Suppose you have data that consists of three variables: College Math Grade, Math GRE, and Verbal GRE. The two math variables will be highly correlated, and the verbal variable will be somewhat correlated with the math scores.
The inverse covariance weighted average of these three variables would result in an index that gives about 25% weight to each math score and about 50% weight to the verbal score. It “rewards” the verbal score for providing new information that the math scores don’t. The resulting index could be interpreted as a “general scholastic aptitude” index.
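To make the weighting concrete, here is a minimal sketch in plain Python. The correlation matrix is an assumption chosen only to match the story above (math scores correlated at 0.9, verbal correlated with each math score at 0.3); the names `SIGMA`, `solve`, and `icw_weights` are illustrative, and the variables are assumed to be standardized already. The weights are taken proportional to the row sums of the inverse covariance matrix and then normalized to sum to one.

```python
# Assumed correlation matrix (illustrative numbers, not taken from real data).
# Order: College Math Grade, Math GRE, Verbal GRE.
SIGMA = [
    [1.0, 0.9, 0.3],
    [0.9, 1.0, 0.3],
    [0.3, 0.3, 1.0],
]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def icw_weights(sigma):
    """Weights proportional to the row sums of the inverse of sigma."""
    raw = solve(sigma, [1.0] * len(sigma))  # equals inverse(sigma) times a vector of ones
    total = sum(raw)
    return [w / total for w in raw]


weights = icw_weights(SIGMA)
print([round(w, 3) for w in weights])  # prints [0.259, 0.259, 0.481]
```

With these assumed correlations the two math scores get roughly a quarter of the weight each and the verbal score gets roughly half. Raising the verbal correlations toward the math-math correlation pulls the three weights toward equality.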
A factor analysis of these three variables would yield two orthogonal factors: the first would give almost 50% weight to each math variable and almost zero weight to the verbal variable, and the second would give almost zero weight to each math variable and almost 100% weight to the verbal variable. So you would get a “pure math” factor and a “pure verbal” factor.
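The separation into a math factor and a verbal factor can be illustrated with a rough sketch in plain Python. This is not a full factor analysis: it swaps in principal-component extraction of two factors followed by a varimax-style rotation found by grid search, which is a common simple stand-in. The correlation matrix is an assumption (math scores correlated at 0.9, verbal correlated with each math score at 0.3), all function names are illustrative, and the output is a table of loadings rather than percentage weights.

```python
import math

# Assumed correlation matrix (illustrative numbers, not taken from real data).
# Order: College Math Grade, Math GRE, Verbal GRE.
R = [
    [1.0, 0.9, 0.3],
    [0.9, 1.0, 0.3],
    [0.3, 0.3, 1.0],
]


def top_eigenpairs(matrix, k, iters=500):
    """Leading k eigenpairs of a symmetric matrix: power iteration plus deflation."""
    n = len(matrix)
    work = [row[:] for row in matrix]
    pairs = []
    for _factor in range(k):
        v = [1.0 + i for i in range(n)]  # arbitrary nonzero starting vector
        for _step in range(iters):
            w = [sum(work[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[i] * work[i][j] * v[j] for i in range(n) for j in range(n))
        pairs.append((lam, v))
        for i in range(n):  # deflate: subtract the component just found
            for j in range(n):
                work[i][j] -= lam * v[i] * v[j]
    return pairs


def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    total = 0.0
    for f in range(len(loadings[0])):
        sq = [row[f] ** 2 for row in loadings]
        mean = sum(sq) / len(sq)
        total += sum((s - mean) ** 2 for s in sq) / len(sq)
    return total


def rotate(loadings, theta):
    """Orthogonal rotation of a two-column loading matrix by angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[a * c + b * s, -a * s + b * c] for a, b in loadings]


pairs = top_eigenpairs(R, 2)
unrotated = [[math.sqrt(lam) * vec[i] for lam, vec in pairs] for i in range(3)]

# Grid search over rotation angles in [0, 90) degrees, in steps of 0.01 degree.
angles = [math.radians(d / 100.0) for d in range(9000)]
best = max(angles, key=lambda t: varimax_criterion(rotate(unrotated, t)))
rotated = rotate(unrotated, best)

for name, row in zip(["College Math Grade", "Math GRE", "Verbal GRE"], rotated):
    print(name, [round(x, 2) for x in row])
```

Under these assumptions the two math variables should load heavily on one rotated factor and the verbal variable on the other. Because the rotation is orthogonal and the verbal score is correlated with the math scores, the cross-loadings come out small but not exactly zero. The order and signs of the factors are arbitrary.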
Which one is better? It depends on the goals of your analysis.
I discuss this a bit more in my lecture on “measurement” in the quant field methods class (see these notes: [PDF]). Here is some R code to play around with these concepts too: [link].
8 Replies to “Inverse covariance weighting versus factor analysis”
Is there anything packaged for “inverse covariance weighting?”
Thanks in advance.
Berk, last time I checked, Michael Anderson at UC-Berkeley had a Stata ado on his site to accompany his outstanding 2008 paper. Also, the replication materials for Casey et al.’s “Reshaping Institutions” paper have some code. I have my own R code and ado too and will post it as soon as I have the chance.
Great – thanks. I’ve used Anderson’s code for q-values before. I’ll check out his site and the Casey et al…
Thanks for the post. Very helpful.
Last I checked Anderson doesn’t have the code for the inverse covariance weighting on his website.
Hi Cyrus, I’ve checked three of the syllabi in your links and cannot find the “measurement” lecture you reference. Can you please help to point me there? (I’ve been browsing your blog–very useful, thanks!) Best, Annette
Ah yes that is a reference to an earlier version of the lectures for that class. Here is a link to the (now deprecated) lectures on measurement: link.
Hi Cyrus, Thanks for the useful post. Could you post or email your ado for inverse covariance weighting? I couldn’t find any from Anderson or Casey. Best, Birgitta
Let me put something together that is shareable — I will try to do it in the coming week or so.