A Comprehensive Health Data Collection
Easy access to nearly every health dataset for the US and California
In early 2015, we will release a database that contains nearly every significant dataset related to health for California and the US.
The collection covers 550 measures in 146 topics from 90 government, nonprofit and university sources, including ten sources from the State of California. See this spreadsheet for a list of the datasets and their sources.
Full text search
Easily search through every table, column, column description, data dictionary and related document to identify required datasets.
Download data in the most popular, useful formats: CSV, SAS or SPSS files, or connect directly to a SQL or NoSQL database. There’s an access method to fit into any workflow.
Several pricing and access options are tailored for grant-funded research, long-term research, or indicator websites. The search system can be used for free to find public source datasets, and many licensed datasets are available for free.
A comprehensive documentation system links to original source websites, data dictionaries, and complete schemas for all tables and columns. Every document associated with every dataset is available in one spot on the web.
Every dataset has an associated crosswalk that links it to every compatible resource, including census geographies, local area boundaries and associated datasets. Where possible, additional auxiliary datasets provide pre-computed population density, rates and ratios.
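To make the crosswalk idea concrete, here is a minimal Python sketch of how a crosswalk might be used to roll tract-level counts up to counties and compute a rate. The table layout, the column names (`tract`, `cases`, `county`, `population`) and the sample values are all hypothetical; the actual schemas in the collection may look quite different.

```python
# Hypothetical tract-level counts; real tables and column names may differ.
cases_by_tract = [
    {"tract": "T1", "cases": 12},
    {"tract": "T2", "cases": 30},
    {"tract": "T3", "cases": 8},
]

# Hypothetical crosswalk: maps each tract to a county and carries its population.
crosswalk = {
    "T1": {"county": "A", "population": 4000},
    "T2": {"county": "A", "population": 6000},
    "T3": {"county": "B", "population": 2000},
}


def county_rates(rows, xwalk, per=100_000):
    """Aggregate tract counts to counties and return cases per `per` residents."""
    totals = {}
    for row in rows:
        link = xwalk.get(row["tract"])
        if link is None:
            continue  # skip rows that have no crosswalk entry
        county = totals.setdefault(link["county"], {"cases": 0, "population": 0})
        county["cases"] += row["cases"]
        county["population"] += link["population"]
    # Only report counties with a non-zero population to avoid dividing by zero.
    return {
        name: t["cases"] * per / t["population"]
        for name, t in totals.items()
        if t["population"] > 0
    }


rates = county_rates(cases_by_tract, crosswalk)
```

Pre-computed rates and ratios of the kind described above would save users from writing this step themselves; the sketch only illustrates what such a linkage involves.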
- Files: Download tabular data as files in CSV, SAS, SPSS and Stata formats. Download geographic data as Shapefiles, KML or GeoJSON.
- SQL: Direct access to a SQL database.
- NoSQL: Direct access to a MongoDB NoSQL database.
- Web: Fetch results of a query as JSON with a web request.
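As an illustration of the web access method, the following Python sketch shows one way a query result might be fetched as JSON. The endpoint URL, the `q` and `format` parameters, and the `rows` key in the response are assumptions made for the example; the actual interface has not been published here and may differ.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint; substitute the real URL once it is documented.
BASE_URL = "https://data.example.org/api/query"


def build_query_url(sql, base_url=BASE_URL):
    """Encode a query string into a request URL (parameter names are assumed)."""
    return f"{base_url}?{urlencode({'q': sql, 'format': 'json'})}"


def parse_rows(body):
    """Extract result rows from a JSON response body (the 'rows' key is assumed)."""
    payload = json.loads(body)
    return payload["rows"]


def fetch_rows(sql, base_url=BASE_URL, timeout=30):
    """Send the query over HTTP and return the decoded rows."""
    with urlopen(build_query_url(sql, base_url), timeout=timeout) as response:
        return parse_rows(response.read().decode("utf-8"))
```

A call such as `fetch_rows("SELECT county, rate FROM some_table")` would then return a list of dictionaries, assuming the service responds in the shape sketched above. The same rows could be obtained through the file or SQL methods; the web request is simply the lightest option for scripts and indicator websites.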
The database system includes a full text search engine that allows users to quickly locate datasets based on metadata, data dictionaries, documentation and associated web pages. Users can add datasets to collections, export the collections to databases, files or Web APIs, and share collections with others. The data in the system includes the original data, along with common computed values for population rates, ratios, aggregations to larger areas, estimations of counts for smaller areas and links to the census and other associated datasets.
Civic Knowledge expects to begin releasing beta versions of the system for early access use and testing in February 2015. If you are interested in using the database, contact Eric Busboom at firstname.lastname@example.org for more information or to be added to the beta list.