Introduction
The Centers for Medicare & Medicaid Services (CMS) publishes healthcare data through data.cms.gov, the CMS Data Navigator, and other web tools. CMS offers some data APIs (including Medicare Blue Button and certain dataset APIs), but many datasets, quality reports, and provider tools are available only through the web portal or have limited API coverage. Browser automation can serve as an alternative or complement to CMS data APIs for exporting Medicare and Medicaid datasets, quality metrics, provider information, and public use files when API access is restricted or the data is exposed only in the portal.
Why Use Browser Automation for CMS Data Export?
- Partial API Coverage: Not all CMS datasets on data.cms.gov or CMS reporting tools expose full REST or bulk-download APIs
- Portal-Only Tools: Quality reporting, Compare tools (Hospital, Nursing Home, etc.), and custom report builders are often web-only
- Bulk and Custom Exports: Automate large or filtered exports that exceed API limits or require interactive filters
- Historical and Refreshed Data: Systematically pull updated public use files and refreshed datasets as CMS publishes them
- Provider and Facility Data: Collect provider directories, NPI data, and facility-level metrics from CMS web interfaces
- Medicare and Medicaid Public Data: Automate extraction of published Medicare/Medicaid statistics and reports that lack a dedicated API
Setting Up CMS Portal Data Export Automation
Here's how to automate data collection from CMS data portals using browser automation:
import { chromium } from 'playwright';
// Create a remote browser session routed through a US residential proxy
const response = await fetch("https://api.anchorbrowser.io/api/sessions", {
  method: "POST",
  headers: {
    "anchor-api-key": "YOUR_API_KEY",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    headless: false,
    proxy: {
      type: 'residential',
      country: 'US'
    }
  }),
});
// Fail early if the session could not be created
if (!response.ok) {
  throw new Error(`Session request failed with status ${response.status}`);
}
const { id } = await response.json();
// Connect Playwright to the remote session over CDP
const connectionString = `wss://connect.anchorbrowser.io?apiKey=YOUR_API_KEY&sessionId=${id}`;
const browser = await chromium.connectOverCDP(connectionString);
const context = browser.contexts()[0];
const ai = context.serviceWorkers()[0];
const page = context.pages()[0];
// Navigate to CMS data portal
await page.goto("https://data.cms.gov");
// Use AI agent to find and open the target dataset or tool
await ai.evaluate(JSON.stringify({
  prompt: 'Navigate to the desired dataset or category (e.g., Medicare, Medicaid, quality, or provider data). Wait for the page to load.'
}));
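The connection string above is assembled by interpolating the key and session ID directly. As a minimal sketch, a small helper can build the same string while encoding both values. The helper name `buildConnectionString` and the `ANCHOR_API_KEY` environment variable are hypothetical, not part of the Anchor Browser API.

```javascript
// Hypothetical helper: builds the CDP connection string used in the setup code.
// URLSearchParams encodes both values, so unusual characters in a key or
// session ID cannot break the query string.
function buildConnectionString(apiKey, sessionId) {
  if (!apiKey) throw new Error('Missing API key');
  if (!sessionId) throw new Error('Missing session ID');
  const params = new URLSearchParams({ apiKey, sessionId });
  return `wss://connect.anchorbrowser.io?${params.toString()}`;
}

// Usage (ANCHOR_API_KEY is an assumed environment variable name):
// const connectionString = buildConnectionString(process.env.ANCHOR_API_KEY, id);
```

Reading the key from the environment keeps it out of source control, which matters once the script runs on a schedule.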
Exporting Datasets from data.cms.gov
Automate dataset discovery and export from the CMS data portal:
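const exportCmsDataset = async (page, ai, datasetNameOrUrl) => {
  // Navigate directly only when given a URL; a dataset name is resolved by the agent
  const isUrl = /^https?:\/\//i.test(datasetNameOrUrl || '');
  await page.goto(isUrl ? datasetNameOrUrl : 'https://data.cms.gov');
  await ai.evaluate(JSON.stringify({
    prompt: `Find and open the dataset or catalog entry for: ${datasetNameOrUrl || 'the requested topic'}. Then open the export or download options.`
  }));
  // Start listening before the agent triggers the download so the event is not missed
  const downloadPromise = page.waitForEvent('download');
  await ai.evaluate(JSON.stringify({
    prompt: 'Select CSV or full export, apply any required filters (e.g., state, year), and start the download. Wait for the file to finish downloading.'
  }));
  const download = await downloadPromise;
  // download.path() waits for the download to finish before resolving
  return await download.path();
};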
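Playwright stores downloads under a random file name, so exports are easier to track when they are saved under a predictable one. The sketch below builds such a name from a dataset label and a date. The function `buildExportFilename` and its naming scheme are illustrative assumptions, not a CMS or Anchor Browser convention.

```javascript
// Hypothetical helper: turns a dataset label and date into a stable file name,
// e.g. "<slug>_<YYYY-MM-DD>.csv". The date is formatted in UTC.
function buildExportFilename(datasetName, date = new Date(), extension = 'csv') {
  const slug = String(datasetName)
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse anything non-alphanumeric into a dash
    .replace(/^-+|-+$/g, '');    // trim leading and trailing dashes
  const stamp = date.toISOString().slice(0, 10);
  return `${slug || 'cms-export'}_${stamp}.${extension}`;
}

// Usage with Playwright's download.saveAs():
// await download.saveAs(`./exports/${buildExportFilename('Medicare Part D Prescribers')}`);
```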
Scraping CMS Compare and Quality Data
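Extract data from CMS Compare tools when no API is available. CMS has consolidated the legacy Hospital Compare and Nursing Home Compare sites into Care Compare on Medicare.gov, so point compareToolUrl at whichever tool currently hosts the data you need: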
const scrapeCmsCompareData = async (page, ai, compareToolUrl) => {
  await page.goto(compareToolUrl);
  // Let the agent set search and filter controls on the Compare page
  await ai.evaluate(JSON.stringify({
    prompt: 'Use the search or filter options to set the desired location, facility type, or quality filters.'
  }));
  // Ask the agent to return the visible results in a structured form
  const tableData = await ai.evaluate(JSON.stringify({
    prompt: 'Extract the visible table or list data (facility name, location, ratings, metrics) into a structured format. If there is pagination, note it so we can loop.'
  }));
  return tableData;
};
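The shape of the value returned by the agent is not specified here, so downstream code should not assume it is already an array. The following sketch normalizes a reply into an array of rows. The helper `parseAgentRows` and the `{ rows: [...] }` wrapper shape are assumptions made for illustration.

```javascript
// Hypothetical helper: coerces an agent reply into an array of rows.
// Accepts an array, a JSON string, or a wrapper object with a `rows` array.
function parseAgentRows(result) {
  let value = result;
  if (typeof value === 'string') {
    try {
      value = JSON.parse(value);
    } catch {
      throw new TypeError('Agent reply is not valid JSON');
    }
  }
  if (Array.isArray(value)) return value;
  if (value && Array.isArray(value.rows)) return value.rows;
  throw new TypeError('Agent reply does not contain a row array');
}

// Usage:
// const rows = parseAgentRows(await scrapeCmsCompareData(page, ai, compareToolUrl));
```

Throwing on an unexpected shape surfaces extraction problems early instead of silently writing an empty export.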
Downloading Public Use Files and Bulk Data
Automate downloads of CMS public use files (PUFs) and bulk data releases:
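const downloadCmsPuf = async (page, ai, pufPageUrl) => {
  await page.goto(pufPageUrl);
  // Start listening before the agent clicks so the download event is not missed
  const downloadPromise = page.waitForEvent('download');
  await ai.evaluate(JSON.stringify({
    prompt: 'Locate the link or button to download the public use file (or the latest version). Click it and wait for the download to complete.'
  }));
  const download = await downloadPromise;
  // download.path() waits for the download to finish before resolving
  return await download.path();
};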
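A download can succeed technically and still contain the wrong thing, such as an HTML error page saved with a .csv name. One possible sanity check, sketched below, compares the first line of the file with the columns you expect. The function `missingCsvColumns` and the column names in the usage line are hypothetical, and the header parsing is deliberately naive (it does not handle commas inside quoted column names).

```javascript
// Hypothetical check: returns the expected columns that are absent from the
// CSV header. An empty result means every expected column was found.
function missingCsvColumns(csvText, expectedColumns) {
  // Drop a UTF-8 byte order mark if present, then take the first line only
  const firstLine = csvText.replace(/^\uFEFF/, '').split(/\r?\n/, 1)[0];
  const header = firstLine
    .split(',')
    .map((name) => name.trim().replace(/^"|"$/g, '').toLowerCase());
  return expectedColumns.filter((column) => !header.includes(column.toLowerCase()));
}

// Usage (column names are placeholders):
// const missing = missingCsvColumns(fileContents, ['NPI', 'State']);
// if (missing.length > 0) throw new Error(`Missing columns: ${missing.join(', ')}`);
```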
Collecting Provider and NPI Information
When provider or NPI data is only available via the web, automate extraction:
const collectCmsProviderData = async (page, ai, searchCriteria) => {
  // page is unused here; it is kept so the signature matches the other helpers
  await ai.evaluate(JSON.stringify({
    prompt: `Navigate to the provider or NPI lookup section and apply search criteria: ${JSON.stringify(searchCriteria)}`
  }));
  // Ask the agent for the provider fields as structured JSON
  const providerData = await ai.evaluate(JSON.stringify({
    prompt: 'Extract provider information (NPI, name, taxonomy, address, Medicare/Medicaid participation) from the current view. Return as structured JSON.'
  }));
  return providerData;
};
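Extracted NPIs can be checked before they enter a pipeline, because an NPI is a 10-digit number whose last digit is a Luhn check digit computed over the number prefixed with 80840. The sketch below applies that check. The function name `isValidNpi` is hypothetical, and passing the check only shows that the number is well formed, not that it is assigned to a provider.

```javascript
// Hypothetical validator: applies the Luhn check to "80840" + NPI.
function isValidNpi(npi) {
  const text = String(npi);
  if (!/^\d{10}$/.test(text)) return false;
  const digits = ('80840' + text).split('').map(Number);
  let sum = 0;
  let shouldDouble = false;
  // Walk right to left, doubling every second digit (standard Luhn)
  for (let i = digits.length - 1; i >= 0; i--) {
    let digit = digits[i];
    if (shouldDouble) {
      digit *= 2;
      if (digit > 9) digit -= 9;
    }
    sum += digit;
    shouldDouble = !shouldDouble;
  }
  return sum % 10 === 0;
}

// Usage:
// const validRows = rows.filter((row) => isValidNpi(row.npi));
```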
Handling Pagination and Large Result Sets
Iterate through paginated results on CMS portals:
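const exportCmsPaginated = async (page, ai, baseUrl, maxPages = 100) => {
  const allRows = [];
  await page.goto(baseUrl);
  // Cap the loop so an ambiguous agent answer cannot cause an endless loop
  for (let pageCount = 0; pageCount < maxPages; pageCount++) {
    const chunk = await ai.evaluate(JSON.stringify({
      prompt: 'Extract all rows from the current table or list into a JSON array. Do not navigate away yet.'
    }));
    // The agent may return a JSON string instead of an array (assumption), so normalize first
    const rows = typeof chunk === 'string' ? JSON.parse(chunk) : chunk;
    if (!Array.isArray(rows)) break;
    allRows.push(...rows);
    const next = await ai.evaluate(JSON.stringify({
      prompt: 'If there is a Next or page-forward button, click it and return true; otherwise return false.'
    }));
    // Continue only on an explicit true (boolean or string)
    if (next !== true && String(next).trim().toLowerCase() !== 'true') break;
  }
  return allRows;
};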
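Rows collected across pages can contain duplicates, for example if a page is extracted twice or the portal's ordering shifts between page loads. A minimal sketch of de-duplicating by key fields follows. The helper `dedupeRows` and the field names in the usage line are hypothetical.

```javascript
// Hypothetical helper: keeps the first occurrence of each row, where identity
// is defined by the values of the given key fields.
function dedupeRows(rows, keyFields) {
  const seen = new Set();
  const unique = [];
  for (const row of rows) {
    // Missing fields are treated as null so they still produce a stable key
    const key = JSON.stringify(keyFields.map((field) => row[field] ?? null));
    if (seen.has(key)) continue;
    seen.add(key);
    unique.push(row);
  }
  return unique;
}

// Usage (field names are placeholders):
// const rows = dedupeRows(await exportCmsPaginated(page, ai, baseUrl), ['facilityId']);
```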
Best Practices for CMS Portal Automation
- Terms of Use and Robots: Respect data.cms.gov and CMS terms of use and any robots.txt or rate limits
- Rate Limiting: Add delays between requests to avoid overloading CMS servers or triggering blocks
- Data Attribution: Follow CMS attribution and usage requirements for published data
- Error Handling: Implement retries for transient failures and handle session timeouts
- Stable Selectors: Rely on stable URLs and query parameters rather than page layout, because CMS sometimes updates portal layouts
- Compliance: For any beneficiary-level or sensitive data, ensure compliance with CMS data use agreements and HIPAA
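The rate limiting and error handling points above can be combined in one small wrapper. This is a sketch of retrying a step with exponential backoff, not a prescribed approach. The names `sleep` and `withRetry` and the default values are assumptions to tune for your own workload.

```javascript
// Resolves after the given number of milliseconds
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical wrapper: runs an async task, retrying on failure with
// exponentially growing delays (baseDelayMs, 2x, 4x, ...).
async function withRetry(task, { retries = 3, baseDelayMs = 1000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      if (attempt === retries) break; // no delay after the final attempt
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}

// Usage:
// const path = await withRetry(() => downloadCmsPuf(page, ai, pufPageUrl));
// await sleep(2000); // pause between consecutive exports
```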
When to Use CMS APIs vs. Browser Automation
Use official CMS or Medicare APIs (e.g., Blue Button 2.0, approved dataset APIs) when they fully cover your use case. Use browser automation when:
- A dataset or tool has no public API or only partial API support
- You need custom filters or exports that the API does not support
- You are aggregating data from multiple CMS web tools into a single pipeline
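One way to apply this rule in a pipeline is to try the API first and fall back to the browser only when the API fails or returns nothing. The sketch below takes both exporters as functions, so it makes no assumptions about specific CMS endpoints. The name `exportWithFallback` is hypothetical, and treating an empty API result as "not covered" is an assumption that may not suit every dataset.

```javascript
// Hypothetical orchestrator: prefers the API exporter and falls back to the
// browser exporter when the API throws or returns no rows.
async function exportWithFallback(apiExport, browserExport) {
  try {
    const rows = await apiExport();
    if (Array.isArray(rows) && rows.length > 0) {
      return { source: 'api', rows };
    }
  } catch (error) {
    // An API failure is not fatal here; the browser path is tried next
  }
  return { source: 'browser', rows: await browserExport() };
}

// Usage (fetchFromApi is a placeholder for your own API client):
// const result = await exportWithFallback(
//   () => fetchFromApi(datasetId),
//   () => exportCmsPaginated(page, ai, baseUrl)
// );
```

Recording the source alongside the rows makes it easy to see later which datasets still depend on the browser path.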
Resources
- Anchor Browser Documentation - Complete API reference and guides
- Anchor Browser Playground - Try browser automation in your browser
- CMS Data Portal (data.cms.gov) - Official CMS data and datasets
Conclusion
Browser automation provides a flexible complement to CMS data APIs and the CMS data portal. For datasets and tools that lack full API access, or when you need bulk, filtered, or cross-tool exports, automating the portal with a browser can streamline data pipelines while respecting CMS terms of use and attribution requirements.
Start automating your CMS data collection and integrate Medicare and Medicaid public data into your workflows.