CMSI 486

Introduction to Database Systems

Fall 2020

Dataset Sources

It’s hard to appreciate the power and potential of a general database management system when you don’t have seed data to populate it, so we’ll want to do our work with a database that can be fully populated from the get-go.

Our case study dataset is the one from the Netflix Prize competition from over a decade (a decade!!!) ago. Find one to call your own from one of these sites, or feel free to identify one independently:

  • Kaggle datasets are generally positioned for data science or machine learning, but are sometimes also applicable to pure database work
  • The Awesome Public Datasets collection is similar in purpose and applicability
  • For the health-/medically-minded: the Drugs@FDA database file set is actually freely available