Emory University
Brent A. Johnson
Associate Professor
Research Projects and Interests
Model Selection, Machine Learning, Data Mining
A recent research interest of mine is model selection in complex regression problems. Much of the recent work in this area focuses on parametric models, e.g. least squares and likelihood, while many biometric and econometric models are semiparametric. There are still many new and emerging problems in this area.
Almost all of my research on this problem thus far has focused on extending existing regularization methods to semiparametric models, in particular the accelerated failure time model. I have investigated both rank-based estimation and Buckley-James-type estimation with a variety of penalties. In addition, I proposed rank-based survival ensembles to complement the mboost package in R. Below are some references whose PDFs can be downloaded from my publications page, as well as some software that can be downloaded from the software page.
References:
Chung M, Long Q, and Johnson BA. (2012)
A tutorial on rank-based coefficient estimation in small- and large-scale
problems. Statistics and Computing (In press).
Semiparametric Theory in Missing Data, Causal Inference
An ongoing research interest of mine (thanks to my advisor, Anastasios (Butch) Tsiatis) is the analysis of coarsened data and semiparametric approaches to handling it. Coarsened data are defined as a many-to-one function of the true, complete data and generalize traditional notions of missing data. A nice introduction to this topic is found in Tsiatis (2006). Special cases of coarsened data arise in survival analysis, missing data, and measurement error. My first thesis advisee, Li Li, and I did some work in this area, and those papers are forthcoming. In 2011, one of Li's papers was accepted to JASA.
References:
Li L, Eron J, Ribaudo H, Gulick RM, Johnson BA (2012) Evaluating the effect of
early versus late ARV regimen change after failure on the initial regimen: results from the
AIDS Clinical Trials Group Study A5095. Journal of the American Statistical Association (In press).
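As a rough illustration of the coarsened-data definition given above, the display below writes the observed data as a many-to-one function of the complete data and then specializes it to right censoring and to a simple missing-data pattern. The notation (Z, the coarsening variable, G, U, Delta, R) is assumed here for illustration and is not taken from the papers listed on this page.

```latex
% Illustrative notation (assumed): Z is the complete data, \mathcal{C} is the
% coarsening variable, and G_r(\cdot) is a many-to-one function for each level r.
\[
  \text{observed data} \;=\; \bigl\{\mathcal{C},\; G_{\mathcal{C}}(Z)\bigr\}.
\]
% Right censoring: the complete data are the failure time T; C is a censoring time.
\[
  Z = T, \qquad
  \text{observed: } (U, \Delta) = \bigl(\min(T, C),\; I(T \le C)\bigr).
\]
% Missing data: Z = (Z_1, Z_2), where Z_2 is seen only when the indicator R = 1.
\[
  Z = (Z_1, Z_2), \qquad
  \text{observed: } \bigl(R,\; Z_1,\; R\,Z_2\bigr), \quad R \in \{0, 1\}.
\]
```

In the censoring case, every failure time larger than C maps to the same observed pair (C, 0), which is what makes the function many-to-one.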
Treatment and Prevention of HIV and AIDS
In 2004, I began applying my interest in complex treatment strategies to HIV and AIDS research. At that time, I was a postdoctoral fellow and met Joe Eron, Professor of Medicine at UNC-CH, through a colleague in Biostatistics, Michael Hudgens. We conceived of a novel strategy to estimate the causal effect of a delayed switch from a failing antiretroviral regimen. Rather than conditioning the analysis only on those patients who failed, we estimate the combined effect of failing on the initial regimen and switching early or late to a second-line regimen. Interestingly, in our analysis of the ACTG 5095 data, our method detected a mild clinical benefit, on average, to switching within 8 weeks of confirmed virologic failure of an efavirenz-containing regimen, whereas the conventional method found no difference.
Since arriving at Emory in 2006, I have been a member of Emory's CFAR through the Biostatistics Core. In addition to working on therapeutic studies, I have also worked on several projects in prevention. My Emory collaborators include Patrick Sullivan (EPI), Rob Stephenson (Global Health), Frank Wong (BSHE), Eric Nehl (BSHE), and Vince Marconi (Medicine, Ponce Clinic, Grady and VA Hospital). We have submitted several grant applications together, and papers are forthcoming.
References:
Li L, Eron J, Ribaudo H, Gulick RM, Johnson BA (2012) Evaluating the effect of
early versus late ARV regimen change after failure on the initial regimen: results from the
AIDS Clinical Trials Group Study A5095. Journal of the American Statistical Association (In press).
Environmental Health
I was introduced to statistical problems in environmental and occupational health while I was a postdoctoral fellow at UNC-CH, where I worked with Larry Kupper and Stephen M. Rappaport. Rappaport is an expert in exposure biology and got me started on nonlinear regression models. The basic idea is to estimate the biomarker response curve as a function of occupational exposure. In contrast to classic pharmacokinetic data (in Davidian and Giltinan, 1995, for example), we do not see multiple outcomes per subject over time, dose, or exposure. Rather, in occupational studies of exposure, we observe outcome measurements for a single exposure dose. As one might expect, the same nonlinear models that would be applied to individual-level data fit well to a random sample from the population. We have applied some standard semiparametric tools to Chinese studies of benzene exposure and are looking to develop some novel statistical methods shortly.
Since arriving at Emory, I have collaborated with several investigators in our Environmental Health department. My main collaborators are Jeremy Sarnat, Stephanie Sarnat, Roby Greenwald, Ying Zhou, and Yang Liu. I continue to collaborate with Rappaport (UC-Berkeley).
References:
Sarnat S, Raysoni AU, Li W, Holguin F, Johnson BA, Flores-Luevano S, Garcia J, and Sarnat JA. (2012) Air pollution and acute respiratory response in a panel of asthmatic children along the U.S.-Mexico border. Environmental Health Perspectives (In press)
Other Projects
Measurement Error in Nutrition. A common application of measurement error occurs in the analysis of nutritional information taken from dietary instruments. A standard goal of nutrition studies is to determine the relationship of a clinical outcome to nutrient intake, say iron. Of course, we never observe how much iron a person actually takes in, just the types and amounts of food that the person eats. In large nutrition studies, food intake is typically obtained through questionnaires and/or diaries. In smaller studies, subjects may be required to give blood or urine samples, which offer much better nutrient intake information but are also more expensive to collect, which precludes their use in large trials.
References:
Johnson BA, Herring AH, Ibrahim JG, and Siega-Riz AM (2007) Structured measurement error in
nutritional epidemiology: applications in the pregnancy, infection, and nutrition (PIN) study.
Journal of the American Statistical Association, 102, 856-866.