
A Short Note on Propensity Scores

It is valuable, especially for researchers, to look beyond the field we mainly focus on. Many of us dive deep into a field that relies on a handful of tools developed 20–30 years ago. With that in mind, I attended one talk in a series of workshops at Academia Sinica that introduced the propensity score. The propensity score is a method for reducing the bias that can arise when the groups compared in real-world data were not formed the way they would be in a designed experiment. This note is mainly based on the talk by the speaker, Dr. Jing-Rong Jhuang, Institute of Statistical Science.

Intro

Consider an experiment that splits a population into two groups: a control group and an experimental (treatment) group. If we find a significant difference between the two groups, is that enough to say the treatment really helps? Actually, no. The data may be biased and give weird results, for example because of confounding bias, where some factor influences both which group a subject ends up in and the outcome. Matching, stratification, and regression are the three main methods for dealing with this problem.

Matching: if we had a virtual world in which all the samples could be made identical, we could set up two groups under the same conditions, without any confounding bias. For example, we could build the experimental and control groups from human clones. In practice, matching pairs each treated subject with a control subject that is as similar as possible, which can reduce much of the confounding bias.

Stratification: we can split the sample into several subgroups (strata) that share similar values of a confounder, and check whether the relationship between treatment and outcome still holds within each subgroup. For example, age is a continuous variable, but we can convert it into levels such as teenagers, elders, and so on.
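As a minimal sketch of stratification, the snippet below bins age into levels and compares treated and control subjects within each level. The cut points, the function names, and the toy records are all made up for illustration and do not come from the talk.

```python
from collections import defaultdict

def age_stratum(age):
    """Map a continuous age to a coarse level (cut points are illustrative)."""
    if age < 20:
        return "teenager"
    if age < 65:
        return "adult"
    return "elder"

def stratified_effect(records):
    """records: iterable of (age, treated, outcome) tuples.

    Returns the treated-minus-control mean difference per stratum and the
    average of those differences weighted by stratum size.
    """
    groups = defaultdict(lambda: {0: [], 1: []})
    for age, treated, outcome in records:
        groups[age_stratum(age)][treated].append(outcome)

    diffs, total, weighted = {}, 0, 0.0
    for name, g in groups.items():
        if not g[0] or not g[1]:
            continue  # no comparison is possible inside this stratum
        diff = sum(g[1]) / len(g[1]) - sum(g[0]) / len(g[0])
        size = len(g[0]) + len(g[1])
        diffs[name] = diff
        weighted += diff * size
        total += size
    overall = weighted / total if total else None
    return diffs, overall

# Hypothetical toy data: elders are treated more often and have lower outcomes.
records = [
    (15, 0, 10.0), (16, 0, 10.0), (17, 0, 10.0), (18, 1, 11.0),
    (70, 1, 5.0), (72, 1, 5.0), (75, 1, 5.0), (80, 0, 4.0),
]

treated = [y for _, t, y in records if t == 1]
control = [y for _, t, y in records if t == 0]
naive = sum(treated) / len(treated) - sum(control) / len(control)

diffs, overall = stratified_effect(records)
print("naive difference:", naive)
print("per-stratum differences:", diffs)
print("stratified estimate:", overall)
```

In this toy data the naive comparison suggests the treatment hurts, while within every age level the treated subjects do better, which is the kind of confounding that stratification is meant to expose.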

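Regression: we model the outcome as a linear function of the treatment together with one or more other factors, so the effect of the treatment is estimated while adjusting for those factors. This is often a better way to measure the relationship. But we can still improve on regression a little, and that is where the propensity score comes in.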

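A strong assumption (based on the talk): strongly ignorable treatment assignment. If all the relevant conditions are known, then, given those conditions, treatment assignment is as good as random, and the “parallel universes” (what would have happened to the same subject with and without the treatment) can be reconstructed.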
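In symbols, the assumption is usually written as below. The notation is an assumption made here for illustration and is not necessarily the one used in the talk: T is the treatment indicator, X the observed covariates, and Y(1), Y(0) the outcomes a subject would have with and without treatment.

```latex
% Notation (assumed): T = treatment indicator, X = observed covariates,
% Y(1), Y(0) = potential outcomes with and without treatment.
\begin{align*}
  e(X) &= \Pr(T = 1 \mid X)
       && \text{(propensity score)} \\
  \bigl(Y(1), Y(0)\bigr) &\perp\!\!\!\perp T \mid X, \qquad 0 < e(X) < 1
       && \text{(strong ignorability)} \\
  \bigl(Y(1), Y(0)\bigr) &\perp\!\!\!\perp T \mid e(X)
       && \text{(consequence: conditioning on the score is enough)}
\end{align*}
```

The last line is why a single number, the propensity score, can stand in for the whole set of covariates when matching or weighting.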

Several approaches

Propensity score matching

  • Select potential confounders
  • Estimate propensity scores using logistic regression
  • Match treated and control subjects by greedy nearest-neighbor matching on the score
  • Check the balance of covariates by univariate analysis
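The four steps above can be sketched in plain Python as follows. This is only an illustration under several assumptions: there is a single confounder (age in decades), the logistic regression is fitted with a hand-written gradient ascent loop instead of a statistics package, the caliper of 0.2 is an arbitrary choice, balance is checked with the standardized mean difference, and the data are made up.

```python
import math
import statistics

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, t, lr=0.1, steps=5000):
    """Fit P(T=1 | x) = sigmoid(w0 + w.x) by gradient ascent on the log-likelihood."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)  # w[0] is the intercept
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for xi, ti in zip(X, t):
            err = ti - sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w

def greedy_match(ps, t, caliper=0.2):
    """1:1 greedy nearest-neighbor matching on the score, without replacement."""
    controls = {i for i, ti in enumerate(t) if ti == 0}
    pairs = []
    for i in (i for i, ti in enumerate(t) if ti == 1):
        if not controls:
            break
        # Closest remaining control; ties are broken by the lower index.
        j = min(controls, key=lambda c: (abs(ps[c] - ps[i]), c))
        if abs(ps[j] - ps[i]) <= caliper:
            pairs.append((i, j))
            controls.remove(j)
    return pairs

def smd(a, b):
    """Standardized mean difference, a common covariate balance measure."""
    pooled_sd = math.sqrt((statistics.variance(a) + statistics.variance(b)) / 2)
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd

# Step 1: the selected confounder (hypothetical data, age in decades).
age = [2.0, 2.0, 3.0, 3.0, 3.0, 4.0, 4.0, 5.0,   # controls
       3.0, 4.0, 4.0, 5.0]                       # treated
t = [0] * 8 + [1] * 4

# Step 2: estimate propensity scores with logistic regression.
X = [[a] for a in age]
w = fit_logistic(X, t)
ps = [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) for xi in X]

# Step 3: greedy nearest-neighbor matching.
pairs = greedy_match(ps, t)

# Step 4: check covariate balance before and after matching.
smd_before = smd([a for a, ti in zip(age, t) if ti == 1],
                 [a for a, ti in zip(age, t) if ti == 0])
smd_after = smd([age[i] for i, _ in pairs], [age[j] for _, j in pairs])

print("matched pairs (treated, control):", pairs)
print("SMD before:", round(smd_before, 3), "SMD after:", round(smd_after, 3))
```

Greedy matching is order-dependent: each treated subject takes the best control still available, so a different ordering of the treated subjects can give different pairs. Controls that are never picked are dropped from the matched sample.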

Inverse probability weighting for the ATT (average treatment effect on the treated)

  • Select confounders that may affect the assignment
  • Estimate propensity scores using logistic regression
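  • Fit the outcome regression model, using weights derived from the propensity score (for the ATT: 1 for treated subjects and e/(1 − e) for controls, where e is the propensity score)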
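A minimal sketch of the weighting step, assuming the propensity scores have already been estimated. The function names and the toy data are hypothetical. With treatment as the only regressor, the coefficient from a weighted least squares fit equals the difference of weighted means computed here.

```python
def att_weights(ps, t):
    """ATT weights: 1 for treated subjects, e / (1 - e) for controls."""
    return [1.0 if ti == 1 else e / (1.0 - e) for e, ti in zip(ps, t)]

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def ipw_att(y, t, ps):
    """Weighted treated mean minus weighted control mean."""
    w = att_weights(ps, t)
    y1 = [yi for yi, ti in zip(y, t) if ti == 1]
    w1 = [wi for wi, ti in zip(w, t) if ti == 1]
    y0 = [yi for yi, ti in zip(y, t) if ti == 0]
    w0 = [wi for wi, ti in zip(w, t) if ti == 0]
    return weighted_mean(y1, w1) - weighted_mean(y0, w0)

# Hypothetical toy data: two kinds of subjects, rarely treated (e = 0.2)
# and usually treated (e = 0.8), with different baseline outcomes.
ps = [0.2] * 5 + [0.8] * 5
t = [1, 0, 0, 0, 0,   1, 1, 1, 1, 0]
y = [11.0, 10.0, 10.0, 10.0, 10.0,   6.0, 6.0, 6.0, 6.0, 4.0]

treated = [yi for yi, ti in zip(y, t) if ti == 1]
control = [yi for yi, ti in zip(y, t) if ti == 0]
naive = sum(treated) / len(treated) - sum(control) / len(control)
att = ipw_att(y, t, ps)

print("naive difference:", naive)
print("IPW estimate of the ATT:", att)
```

The weights reshape the control group so that its mix of subjects resembles the treated group. Controls with a high propensity score count more, because they look like the subjects who usually get treated.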

Limitations

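  • From the start, we assume that all confounders are measured and included in the model. In practice, we can only assume that all the important ones are included; confounders that are left out are not corrected for.
  • Using propensity scores can shrink the sample that is actually analyzed. Matching, for example, discards subjects without a suitable match, which leaves smaller groups.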