Auctus is an open-source dataset search engine designed to support data discovery and augmentation. Users (and systems) can pose a rich set of discovery queries: in addition to keyword-based search, they can issue spatial and temporal queries as well as data augmentation queries (i.e., searches for datasets that can be concatenated to or joined with a query dataset). To support these queries, Auctus uses a data profiler that we developed to automatically extract useful information from datasets, including column data types and summaries (or sketches) of column contents.
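To make the idea of profiling for augmentation more concrete, here is a minimal Python sketch of what a column profile and a join-candidate check might look like. It is not Auctus's actual profiler: the function names (`infer_type`, `profile_column`, `join_overlap`), the coarse type labels, and the use of an exact set of distinct values in place of a compact sketch are all assumptions made for illustration.

```python
from datetime import datetime


def infer_type(values):
    """Guess a coarse type label for a column given its non-empty string values."""
    def all_parse(parser):
        # True only if every value is accepted by the parser.
        try:
            for v in values:
                parser(v)
            return True
        except ValueError:
            return False

    if not values:
        return "empty"
    if all_parse(int):
        return "integer"
    if all_parse(float):
        return "float"
    if all_parse(datetime.fromisoformat):
        return "datetime"
    return "text"


def profile_column(values):
    """Build a toy profile: inferred type, distinct values, and missing count.

    A real system would likely store a fixed-size sketch instead of the
    full set of distinct values; the exact set keeps this example simple.
    """
    non_empty = [v for v in values if v != ""]
    return {
        "type": infer_type(non_empty),
        "distinct": set(non_empty),
        "missing": len(values) - len(non_empty),
    }


def join_overlap(query_profile, candidate_profile):
    """Fraction of the query column's distinct values that also appear
    in the candidate column (a simple containment score)."""
    query_values = query_profile["distinct"]
    if not query_values:
        return 0.0
    shared = query_values & candidate_profile["distinct"]
    return len(shared) / len(query_values)
```

For example, profiling a hypothetical ZIP code column `["10001", "10002", "10003", ""]` and a candidate column `["10002", "10003", "10004"]`, then calling `join_overlap` on the two profiles, gives a score that a search engine could use to rank the candidate as a possible join target. Computing the score from profiles alone means the datasets themselves do not need to be rescanned at query time.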
Number of indexed datasets in the public Auctus instance: 20,255
Socrata: 18,015 (46 different domains including cityofnewyork.us, medicare.gov, sfgov.org, novascotia.ca)
Zenodo “covid”: 1,040 (datasets matching the query term “covid”)
Indicators from University of Arizona: 1,094
Indicators from World Bank: 20
Direct upload: 86
More Information:
Try Auctus Now
Auctus Fact Sheet
Code Repository
Video demo