2007 Joint Statistical Meetings Abstracts

"Using Factor Analysis and Cronbach's Alpha to Ascertain Relationships Between Questions of a Dietary Behavior Questionnaire"
Obesity and other dietary problems make it necessary to have a better understanding of dietary behavior and more effective nutrition education. A dietary behavior questionnaire was developed to measure outcomes of nutrition education as part of an effort to develop a standardized, flexible data collection tool. This questionnaire, which was separated into modules according to food groups, was field tested by Mathematica for internal consistency of responses to survey questions and for the performance characteristics of individual questions and sets of questions. The field test data analysis identified questions that performed well and should be retained, and others that performed poorly and should either be dropped or studied further. In this paper, we discuss the use of factor analysis and Cronbach's alpha to assess the internal consistency of, and relationships between, questions within modules.
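As a rough illustration of the internal-consistency measure named above, the standard textbook form of Cronbach's alpha for k items is alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). The sketch below is not the authors' analysis code; the function name `cronbach_alpha` and the toy responses are assumptions made for illustration only.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondent rows (hypothetical helper).

    Each row holds one respondent's scores on the k items of a module.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Transpose rows into per-item columns.
    items = list(zip(*responses))
    item_variance_sum = sum(variance(col) for col in items)
    # Variance of each respondent's total score across the items.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Toy data (invented): two items that move together perfectly.
consistent = [[1, 1], [2, 2], [3, 3]]
# Toy data (invented): two items with zero covariance.
unrelated = [[1, 2], [2, 1], [1, 1], [2, 2]]
print(cronbach_alpha(consistent))
print(cronbach_alpha(unrelated))
```

Values near 1 suggest the items in a module respond together; values near 0 suggest they do not measure a common construct. What threshold counts as acceptable is a judgment the abstract does not state.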

"Identifying the Population with Disabilities: A Comparison of Current Survey Estimates"
In the United States, there is no single, universally accepted definition of disability in either the government programs that serve people with disabilities or among federal surveys. This presentation provides an overview of the national data sources that allow for the identification of the population with disabilities: ACS, NHIS, SIPP, CPS-ASEC, Census 2000, and other data sources. The strengths and weaknesses of each data source will be discussed, with a particular focus on the survey items used to identify the population with disabilities. Estimates derived from these data sources will be compared, including estimates of the size of the population with disabilities, the disability prevalence rate, and employment and poverty rates for people with and without disabilities. In addition, current efforts to design new sets of disability survey items and the SIPP/DEWS will be discussed.

"Assessing Bias in Estimates in a Two-Stage Design from an Early Close Out of the First Stage Data Collection: An Empirical Investigation Using NSRCG Sample Data"
Reluctant respondents and low response rates have increased the data collection costs required to maintain the same level of response from one year to the next. Consequently, survey managers must assess the efficient allocation of a fixed budget to achieve the survey objectives when the survey is conducted. The list collection of college graduates is a major component of the NSRCG design and one that has considerable costs associated with it. In particular, data collection resources are concentrated on a small set of late-responding schools. These resources could be used elsewhere if the schools responded earlier. This paper focuses on the effect of school nonresponse if the list collection period were not extended and a higher school-level nonresponse rate were accepted. With this objective, we assess the bias of survey estimates due to school-level nonresponse at varying response rates.

"Goodness-of-Fit Tests for Logistic Regression with Complex Survey Data"
The use of the logistic regression method for unit nonresponse weight adjustments has become common practice in recent years. With this method, users must go through the usual steps in regression modeling, including assessing the goodness-of-fit (GOF) of the model. However, a GOF test that accounts for the complex survey design is not readily available; or if it is, it is not always intuitive. This paper discusses the GOF test for logistic regression with complex survey data. We investigated how much bias results when the GOF test for simple random sample data is applied to data from a complex sample design, and whether there is an intuitive pattern in terms of bias. We also developed a test that takes the complex survey design into account. A simulation study was used to compare the simple-random-sample test and the proposed test with readily available GOF tests.

"Sampling with Uncertain Frame Counts: Challenges in Sampling Head Start Children for the FACES Study"
The 2006 Head Start Family and Child Experiences Survey (FACES) involved four stages of sampling: Head Start programs, centers, classrooms, and children. Eligible children were those who were one or two years away from kindergarten and were new to Head Start in the fall of 2006. Because only a list of Head Start programs was available as a sampling frame, we relied on selected programs to provide lists of centers, and relied on selected centers to provide lists of classrooms and eligible children. Because the lists arrived at different times, sample selection at each level was conducted on a rolling basis. We will describe the challenges in implementing a sampling strategy that met the sample design goals, including an oversample of children who were two years away from kindergarten, and that was flexible enough to adapt to the actual child counts when they were lower than estimated.

"Measuring Disclosure Risk and an Examination of the Possibilities of Using Synthetic Data in the Individual Income Tax Return Public Use File"
The Statistics of Income Division (SOI) currently measures disclosure risk through a distance-based technique that compares the Public Use File (PUF) against the population of all tax returns, and uses top-coding, subsampling, and multivariate microaggregation as disclosure avoidance techniques. SOI is interested in exploring the use of other techniques that prevent disclosure while distorting the data less. Synthetic or simulated data may be such a technique. But while synthetic data may be the ultimate in disclosure protection, creating a synthetic dataset that preserves the key characteristics of the source data presents a significant challenge. An additional constraint in creating synthetic data for the SOI PUF is the need to maintain the accounting relationships among numerous income, deduction, and tax items that appear on a tax return.

"Survey Designs to Optimize Efficiency and Precision for Multiple Objectives: Methods and Applications"
Allocation of the sample among strata or sample clusters on the basis of variance components and survey costs is important to survey design. There are two basic approaches to solving for the optimum: maximize precision for a fixed cost, or minimize cost for a specified precision. Optimum allocation equations for an estimated mean or total of a specific population are presented in sampling methods books. However, we typically need an optimum allocation that simultaneously satisfies several types of estimates for several inference subpopulations. In this paper we review optimization methodology, its history, and its extension to such multiple survey objectives. The computer algorithm we used to solve this nonlinear equation problem is described. Two recent applications are used to demonstrate the diversity of optimization problems and the flexibility of the methodology.
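For orientation, the single-objective textbook case mentioned above can be sketched as follows. For stratified sampling with a fixed budget, the classic optimum allocation sets each stratum's sample size proportional to N_h * S_h / sqrt(c_h), where N_h is the stratum population size, S_h its standard deviation, and c_h the per-unit cost. This is only the textbook formula for one estimate, not the multi-objective algorithm the paper describes; the function name `optimum_allocation` and the example strata are assumptions made for illustration.

```python
from math import sqrt


def optimum_allocation(sizes, std_devs, unit_costs, budget):
    """Fixed-budget optimum allocation across strata (hypothetical helper).

    Returns unrounded stratum sample sizes n_h such that
    sum(c_h * n_h) equals the budget and n_h is proportional to
    N_h * S_h / sqrt(c_h). Fixed overhead costs are ignored.
    """
    strata = list(zip(sizes, std_devs, unit_costs))
    weights = [n_pop * s / sqrt(c) for n_pop, s, c in strata]
    # Normalizing constant that makes the total variable cost hit the budget.
    denom = sum(n_pop * s * sqrt(c) for n_pop, s, c in strata)
    return [budget * w / denom for w in weights]


# Invented example: two equal strata, the second four times as costly per unit.
allocation = optimum_allocation(
    sizes=[1000, 1000], std_devs=[10.0, 10.0], unit_costs=[1.0, 4.0], budget=100.0
)
total_cost = sum(c * n for c, n in zip([1.0, 4.0], allocation))
print(allocation, total_cost)
```

In practice the results would be rounded and capped at the stratum sizes. Handling several estimates and subpopulations at once, as the abstract describes, requires a constrained nonlinear optimization that this sketch does not attempt.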

 

 

