Datasets at a Glance

CHCD offers a wide range of datasets to support research, education and public service efforts. We work with partners to help turn data into insights that improve health outcomes in communities across the state and beyond.

Using Claims Data

Claims data come from insurance billing and reimbursement records. They capture information about health care services and supplies from different service providers and payors, including what services were delivered and how they were paid for. Because these data are used to determine reimbursement, the payment-related information is generally more complete and more reliable than other parts of a claim.

Claims data include codes for diagnoses and procedures; they do not include detailed clinical information such as laboratory results, vital signs, or clinical notes.

Beyond claims, the datasets often incorporate enrollment information with demographic details, which establish who is eligible for insurance coverage, as well as provider-specific data. Collectively, claims data serve as a valuable resource to study healthcare utilization, costs, quality of care, and health outcomes across large populations.
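As a rough illustration of how claims and enrollment records fit together, the sketch below links the two by a de-identified member identifier and summarizes utilization and cost per enrolled member. All records and field names (`member_id`, `months_enrolled`, `paid_amount`, and so on) are hypothetical and do not reflect the layout of any CHCD dataset.

```python
from collections import defaultdict

# Hypothetical, made-up records; field names are illustrative assumptions only
# and do not reflect the layout of any CHCD dataset.
enrollment = [
    {"member_id": "A1", "months_enrolled": 12},
    {"member_id": "B2", "months_enrolled": 6},
]
claims = [
    {"member_id": "A1", "diagnosis_code": "E11.9", "procedure_code": "99213", "paid_amount": 80.0},
    {"member_id": "A1", "diagnosis_code": "I10", "procedure_code": "99214", "paid_amount": 120.0},
    {"member_id": "B2", "diagnosis_code": "J06.9", "procedure_code": "99213", "paid_amount": 60.0},
]


def summarize(enrollment, claims):
    """Return claim count, total paid, and paid per member-month for each enrolled member."""
    # Accumulate claim counts and paid amounts per member.
    totals = defaultdict(lambda: {"claim_count": 0, "total_paid": 0.0})
    for claim in claims:
        entry = totals[claim["member_id"]]
        entry["claim_count"] += 1
        entry["total_paid"] += claim["paid_amount"]

    # Enrollment supplies the denominator: members with no claims still appear.
    summary = {}
    for member in enrollment:
        member_id = member["member_id"]
        entry = totals.get(member_id, {"claim_count": 0, "total_paid": 0.0})
        months = member["months_enrolled"]
        summary[member_id] = {
            "claim_count": entry["claim_count"],
            "total_paid": entry["total_paid"],
            "paid_per_member_month": entry["total_paid"] / months if months else None,
        }
    return summary


summary = summarize(enrollment, claims)
```

Iterating over the enrollment records rather than the claims keeps members who had no claims in the result, which matters when the goal is a rate across the whole covered population.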

Using administrative claims data in health care research offers several key advantages:

Please review “Guide to Using Claims Data” for additional information on claims data.

Overview of CHCD Datasets

Below are the datasets available through our Center. Each entry gives a detailed description of the dataset: what it includes, its coverage and population scope, access requirements, and eligibility criteria for UTHealth Houston and external researchers.

  • Commercial National Claim Data

    The databases contain individual-level, de-identified health care claims information from employers, health plans, hospitals, and Medicare Advantage plans. Data about individual patients are integrated across all providers of care, preserving the links between utilization and cost records at the patient level.

    • Years of Data: 2011-2024
    • Average Annual Covered Lives: 129,000,000+
    • State / National: National
    • Data Type: Claims
    • Geolocation: 3-digit ZIP Code for years before 2019; Metropolitan Statistical Area (MSA) after 2019

    Permission to Use and Access Commercial National Claim Data:

    • UTHealth Houston faculty may conduct analysis.
    • UTHealth Houston students who are qualified doctoral students may use the data for dissertations only.
    • External researchers may collaborate with UTHealth Houston researchers but do not receive direct access.
  • Medicare FFS 5% National Sample (Parts A, B & D)

    The CHCD is licensed by the Centers for Medicare & Medicaid Services (CMS) and ResDAC as a state agency, which allows the use of Medicare Fee-for-Service data. The CHCD is also certified by CMS as a Qualified Entity (QE). Our QE status allows the use of Medicare Parts A, B, and D Fee-for-Service (Traditional Medicare) claims data to promote transparency in healthcare delivery. CHCD is the only QE in the state of Texas and the only university-based QE in the country.

    • Years of Data: 2014–FY24 (September 30, 2024)
    • Average Annual Covered Lives: 3,781,729+
    • State / National: 5% National Sample
    • Data Type: Claims
    • Geolocation: 5-digit ZIP Code

    Permission to Use and Access Medicare 5% National Sample (Parts A, B & D):

    • Students, whether at UTHealth Houston or at outside organizations, do not qualify for data use.
    • Requestors may have access to the data if they meet requirements under State Agency or Qualified Entity (QE) terms. CHCD will provide additional information during the Data Request process.
  • Medicare FFS Texas (Parts A, B & D)

    The CHCD is licensed by the Centers for Medicare & Medicaid Services (CMS) and ResDAC as a state agency, which allows the use of Medicare Fee-for-Service data. The CHCD is also certified by CMS as a Qualified Entity (QE). Our QE status allows the use of Medicare Parts A, B, and D Fee-for-Service (Traditional Medicare) claims data to promote transparency in healthcare delivery. CHCD is the only QE in the state of Texas and the only university-based QE in the country. CHCD holds 100% of Texas Medicare Fee-for-Service claims.

    • Years of Data: 2014–2024
    • Average Annual Covered Lives: 5,268,640
    • State / National: Texas
    • Data Type: Claims
    • Geolocation: 5-digit ZIP Code

    Permission to Use and Access Medicare Texas (Parts A, B & D):

    • Students, whether at UTHealth Houston or at outside organizations, do not qualify for data use.
    • Requestors may have access to the data if they meet requirements under State Agency or Qualified Entity (QE) terms. CHCD will provide additional information during the Data Request process.
  • Non-Medical Drivers of Health (NMDoH)

    The NMDoH dataset is a collection of publicly available data on non-medical drivers of health, with indicators covering education, economic stability, health care access and quality, neighborhood and built environment, and social and community context. Granularity varies by dataset (census tract, ZIP Code, county, or state). The data have been enhanced for easier querying and integration. Review the Non-Medical Drivers of Health Data Library for a full list of variables and data sources.

    • Years of Data: Varies by variable
    • Average Annual Covered Lives: Varies by variable
    • State / National: Texas
    • Data Type: Non-claims; Varies
    • Geolocation: Varies (e.g., zip code, county, city)

    Permission to Use and Access NMDoH: No restrictions.

  • Texas All-Payor Claims Database (TX-APCD)

    In 2021, the Texas Legislature appointed the Center for Health Care Data (CHCD) as the Texas All-Payor Claims Database (TX-APCD) Administrator (House Bill 2090). This unique database contains healthcare claims data from 2019 onward. The TX-APCD includes 100% of Texas Medicaid, 100% of Texas Medicare Advantage, and all commercial plans regulated by the Texas Department of Insurance.

    • Years of Data: 2019-2024
    • Average Annual Covered Lives: 15,000,000+
    • State / National: Texas
    • Age Groups: All
    • Data Type: Medical, behavioral health, pharmacy, and dental claims; eligibility and provider files
    • Geolocation: 5-digit ZIP Code

    Permission to Use and Access TX-APCD Data:

    • The CHCD is required to use the data for state reporting and is also permitted to use the data for research and studies consistent with the requirements of the rule. External researchers must qualify as a “Qualified Research Entity” (QRE) to gain access to the data. Texas Insurance Code § 38.402(9) defines a QRE as:
      • An organization engaging in public interest research for the purpose of analyzing the delivery of health care in this state that is exempt from federal income tax under Section 501(a), Internal Revenue Code of 1986, by being listed as an exempt organization in Section 501(c)(3) of that code;
      • An institution of higher education engaged in public interest research related to the delivery of health care in this state; or
      • A health care provider in this state engaging in efforts to improve the quality and cost of health care.
    • A QRE:
      • May use information contained in the database only for purposes consistent with the purposes of the law.
      • May use the information only in accordance with standards, requirements, policies, and procedures established by the CHCD (for example, the CHCD does not permit access to data from a location outside the United States).
      • May use the data only for the purpose of a single, IRB-approved research proposal.
      • May not sell or share any data contained in the database.
      • May not access or use any data in a manner that may violate state or federal privacy laws.
    • Each data request will be reviewed by the CHCD as required by law. Once a request is approved, the qualified research entity's designated researchers will be granted access to the data extract on CHCD secure servers.
  • THCIC – Texas Discharge Data

    The Texas Health Care Information Collection (THCIC) Public Use Data File (PUDF) contains data on discharges from Texas hospitals. The Texas Outpatient PUDF contains data on outpatient surgical and radiological procedures from Texas hospitals and ambulatory surgery centers. This dataset cannot be used to conduct longitudinal studies as each row is event-based and not person-based. Both datasets also include provider information (i.e., name, address, specialty).

    • Years of Data: 2008–2024
    • Average Annual Covered Lives: 191,435,133
    • State / National: Texas
    • Data Type: Summary of “Event” data
    • Geolocation: 5-digit ZIP Code

    Permission to Use and Access THCIC:

    • UTHealth Houston-SPH faculty and students with faculty support may analyze the dataset.
    • UTHealth Houston adjunct faculty should contact [email protected] for access.
    • External researchers may collaborate with UTHealth Houston-SPH researchers.
    • Any researcher may request the dataset directly from Texas Health and Human Services Commission (HHSC).
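To illustrate why event-based discharge data such as the THCIC files described above do not support longitudinal analysis, here is a minimal sketch with made-up rows. The field names (`year`, `hospital`, `patient_zip`) are hypothetical and are not the actual THCIC file layout.

```python
from collections import Counter

# Hypothetical event-level rows: one row per discharge, with no person identifier.
# Field names are illustrative assumptions, not the actual THCIC layout.
discharges = [
    {"year": 2023, "hospital": "Hospital A", "patient_zip": "77030"},
    {"year": 2023, "hospital": "Hospital A", "patient_zip": "77030"},
    {"year": 2024, "hospital": "Hospital B", "patient_zip": "75201"},
]

# Event counts are well defined because each row is exactly one discharge.
discharges_per_year = Counter(row["year"] for row in discharges)

# Nothing in the rows says whether the two 2023 discharges belong to the same
# person or to two different people, so person-level measures such as
# readmissions or follow-up over time cannot be derived from them.
```

Counts, rates, and distributions of events remain valid analyses; anything that requires following one individual across rows does not.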

Frequently Asked Questions

Need a quick answer? Find detailed information about our data services, data access, costs & timelines, technical questions, and security & compliance.
