Data Sets

| Dataset Name | Short Description | Tags | File Size |
| --- | --- | --- | --- |
| Eclipse Megamovie | Citizen science covering the 2017 eclipse | EarthScience | |
| SMS Spam Collection Dataset | Collection of SMS messages tagged as spam or legitimate | Data analysis, Business | 208 KB |
| U.S. Educational Finances | Revenues and expenditures for U.S. grade schools, by year and state | | 80 MB |
| US Traffic Fatality Records | Fatal car crashes for 2015-2016 | Data analysis, Economics | |
| Energy Consumption | Top 30 countries in energy consumption by energy source, 2016 | | |
| Inflation Rate | Inflation in Saudi Arabia for 2016-2017. Data from the Saudi Arabian Monetary Agency. | Data analysis | |
| World Cities Database | A database of coordinates for countries and cities | | 43 MB |
| GitHub Repos | Code and comments from 2.8 million repos | Data analysis | |
| Historical Air Quality | Air quality data collected at outdoor monitors across the US | Climate+Weather | |
| Chocolate Bar Ratings | Expert ratings of over 1,700 individual chocolate bars | Data analysis | 30 KB |
| Hacker News | This dataset was kindly made publicly available by Hacker News under the MIT license. | Data analysis | |
| SkillCraft1 Master Table Dataset Data Set | This data was used in Thompson et al. (2013). A list of possible game actions is discussed in Thompson, Blair, Chen, & Henrey (2013). | | |
| Games Results Data Set | Dota 2 is a popular computer game with two teams of 5 players. At the start of the game each player chooses a unique hero with different strengths and weaknesses. | Sports | |
| Economic Sanctions Data Set | Domain theory on economic sanctions; undocumented | Economics | |
| Breast Cancer | This is one of three domains provided by the Oncology Institute that has repeatedly appeared in the machine learning literature. (See also lymphography and primary-tumor.) This data set includes 201 instances of one class and 85 instances of another class. The instances are described by 9 attributes, some of which are linear and some are nominal. | | |
| Balloons Data Set | Data previously used in a cognitive psychology experiment; 4 data sets represent different conditions of the experiment | Data analysis | |
| Auto MPG Data Set | Revised from the CMU StatLib library; the data concerns city-cycle fuel consumption | Data analysis | |
| 1000 Genomes Project and AWS | The 1000 Genomes Project, initiated in 2008, is an international public-private consortium that aims to build the most detailed map of human genetic variation available. | | |
| Artificial Characters | Dataset artificially generated using a first-order theory that describes the structure of ten capital letters of the English alphabet | | |
| Abalone | Predict the age of abalone from physical measurements | | |

© Copyright 2017. All Rights Reserved.

A Product of HunterTech Ventures