Category Archives: Miscellaneous
The TCRN recently collaborated with researchers from the NCRN node at the University of Missouri on methods for generating public use data files with synthetic point-referenced geographies and attributes. The methodology offers a new way for statistical agencies to disseminate data with fine geographic resolution, such as latitude and longitude, while limiting the risks of unintended disclosures. A paper describing the methodology will appear in the journal Spatial Statistics. A link to the arXiv version is on the “Papers” page of the TCRN website.
In a paper published in the Journal of the American Statistical Association, we developed an approach that fully integrates editing and imputation for continuous microdata under linear constraints. The approach relies on a Bayesian hierarchical model that includes (i) a flexible joint probability model for the underlying true values of the data with support only on the set of values that satisfy all editing constraints, (ii) a model for latent indicators of the variables that are in error, and (iii) a model for the reported responses for variables in error. An R package implementing the model is available on CRAN and linked on the Downloadable Software page on this site.
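As a small illustration of what "values that satisfy all editing constraints" can mean, the sketch below checks records against a few linear edits. It is not the Bayesian model from the paper or the API of the R package; the field names (`wages`, `other_costs`, `total_costs`, `employees`), the ratio bound of 200, and the function `satisfies_edits` are all hypothetical and chosen only for illustration.

```python
# Hypothetical example: field names and edit rules are illustrative only,
# not taken from the paper or the R package.

def satisfies_edits(record, tol=1e-9):
    """Return True if a record satisfies every linear edit constraint."""
    wages = record["wages"]
    other = record["other_costs"]
    total = record["total_costs"]
    employees = record["employees"]
    checks = [
        wages >= 0 and other >= 0 and employees >= 0,  # range edits
        abs(wages + other - total) <= tol,             # balance edit: parts sum to total
        wages <= 200.0 * employees,                    # ratio edit written in linear form
    ]
    return all(checks)

records = [
    {"wages": 500.0, "other_costs": 300.0, "total_costs": 800.0, "employees": 10},
    {"wages": 500.0, "other_costs": 300.0, "total_costs": 900.0, "employees": 10},   # violates balance edit
    {"wages": 5000.0, "other_costs": 0.0, "total_costs": 5000.0, "employees": 10},   # violates ratio edit
]

# One flag per record: True means the record lies in the feasible region.
flags = [satisfies_edits(r) for r in records]
print(flags)
```

A check like this only identifies which records fail the edits. Deciding which fields are in error and imputing replacement values that lie in the feasible region is the part the hierarchical model handles.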
Thais Paiva successfully defended her dissertation to become the latest PhD in statistical science funded by the NSF NCRN award to Duke University. Her thesis covered methods for disseminating data with synthetic geographies and methods for deciding when to stop data collection to save costs.
TCRN graduate student Thais Paiva delivered a talk at the Census Bureau titled “Using imputation techniques to evaluate stopping rules in adaptive survey designs.” This follows the talks by TCRN postdoc Hang Kim, who has delivered two talks at the Census Bureau on his edit-imputation research.
TCRN graduate student Jared Murray, who completed his degree in December 2013, accepted a position as a visiting assistant professor in the Department of Statistics at Carnegie Mellon University. Jared will work with the NCRN node at CMU as part of his duties.
After two years with TCRN, postdoctoral associate Daniel Manrique-Vallier accepted a tenure-track position with the Department of Statistics at Indiana University.
The TCRN sponsored a three-day workshop February 28 – March 2, 2014, in Durham, NC, on the use of microdata from the Survey of Income and Program Participation (SIPP). The workshop trained over 50 participants, most of whom were graduate students, on the tools and data needed to conduct SIPP-based research projects. The SIPP, fielded by the US Bureau of the Census, collects longitudinal subannual data on respondents’ income, labor force activity, household composition, health, migration, and eligibility for and participation in programs (e.g., TANF, WIC, Medicare, Medicaid, and numerous others). As such, it provides unique opportunities to examine the social and economic well-being of U.S. residents, and changes in residents’ experiences over time. Funding for the workshop was provided by the National Science Foundation and the Bureau of the Census under the NCRN program.
TCRN researcher Jerry Reiter and former graduate student Yajuan Si have published a paper on the use of nonparametric Bayesian methods for multiple imputation of missing data in large-scale categorical databases. They applied the methods to impute background characteristics in the Trends in International Mathematics and Science Study. The paper will appear in the Journal of Educational and Behavioral Statistics.
Thais Paiva, PhD candidate in the Department of Statistical Science at Duke University, won an award for “best paper” by a graduate student at the AISC 2012 conference. She presented work on releasing data with synthesized locations in order to protect confidentiality.
TCRN investigators have authored a paper to appear in the Journal of Official Statistics showing that analytically valid, partially synthetic data need not be generated from posterior predictive distributions. This simplifies the generation of partially synthetic data.