Name
Abalone
Accidental Drug Related Deaths in Connecticut, US
Amazon Product Reviews
Autism Screening Adult
Auto MPG
Banknote Authentication
Beijing PM2.5
Bike Sharing
Birmingham Parking Dataset
Blood Transfusion Service Center
Breast Cancer Wisconsin
Car Evaluation
Census Income
Concrete Compressive Strength
Coronavirus
Daily Demand Forecasting Orders
Default of Credit Card Clients
Dow Jones Index
EEG Eye State Dataset
EEG Steady State Evoked Potential Dataset
EU Population Poverty Status Dataset
Echocardiogram
Energy Efficiency
Fertility
Glass Identification
Heart Disease
Hepatitis
Hepatitis C Virus (HCV) Classification Dataset
Individual Household Electric Power Consumption
Interstate-94 (I-94) Traffic Volume Dataset
Istanbul Stock Exchange
Liver Disorders
Occupancy Detection
Online News Popularity
Portugal 2019 Election Dataset
Qualitative Bankruptcy
Real Estate Valuation
Risk Factors for Cervical Cancer
Travel Reviews
US Tuberculosis Dataset
User Knowledge Modeling
Wholesale Customers
Wireless Indoor Localization

Data Sets to Uplift your Skills

  • Data Science Dojo has added more than 43 data sets to this repository.
  • The repository carries a diverse range of themes, difficulty levels, sizes and attributes.
  • They offer hands-on practice to boost your skills in exploratory data analysis, data visualization, data wrangling and machine learning.
  • The data sets below are sorted by increasing level of difficulty for convenience (Beginner, Intermediate, Advanced).
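
As a rough illustration of the first-pass exploration these data sets are meant for, the sketch below summarizes the numeric columns of a CSV file using only the Python standard library. The sample rows and column names are made up for the example and are not taken from any file in this repository; the helper name `summarize_numeric_columns` is likewise hypothetical. In practice you would open one of the downloaded CSV files instead of the inline sample.

```python
import csv
import io
import statistics

# Made-up sample rows standing in for a real CSV file.
# Column names and values are illustrative assumptions only.
SAMPLE_CSV = """sex,length,diameter,rings
M,0.455,0.365,15
F,0.350,0.265,7
I,0.530,0.420,9
M,0.440,,10
"""


def summarize_numeric_columns(csv_file):
    """Return basic statistics for every fully numeric column.

    Empty cells are counted as missing; columns containing any
    non-numeric text are skipped.
    """
    reader = csv.DictReader(csv_file)
    fieldnames = reader.fieldnames or []
    columns = {name: [] for name in fieldnames}
    for row in reader:
        for name in fieldnames:
            columns[name].append((row.get(name) or "").strip())

    summary = {}
    for name, cells in columns.items():
        present = [cell for cell in cells if cell != ""]
        try:
            numbers = [float(cell) for cell in present]
        except ValueError:
            continue  # categorical or text column, not summarized here
        if not numbers:
            continue  # column is entirely empty
        summary[name] = {
            "count": len(numbers),
            "missing": len(cells) - len(numbers),
            "mean": statistics.mean(numbers),
            "min": min(numbers),
            "max": max(numbers),
        }
    return summary


# Example usage on the inline sample; replace io.StringIO(...) with
# open("path/to/file.csv", newline="") for a real file.
summary = summarize_numeric_columns(io.StringIO(SAMPLE_CSV))
for name, stats in summary.items():
    print(name, stats)
```

A summary like this is only a starting point: it shows which columns have missing values and what ranges to expect before any visualization or modeling.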

To fork this repository, follow the guide How to fork a project on GitLab.

Beginner:

Estimate the age of an abalone from physical measurements
Regression Models | Environment

Predict a student's knowledge level
Classification/Clustering | Education/Web

Can you predict the price of a house?
Regression Models | Real Estate

Can you estimate location from Wi-Fi signal strength?
Classification Models | Mobile/Location

Predict acceptability of a car
Classification Models | Automobile

Predict seminal quality of an individual
Regression/Classification Models | Healthcare/Life

Estimate the chance of bankruptcy from qualitative parameters provided by experts
Classification Models | Finance/Banking

Understand driving patterns of Birmingham with respect to time and date
Regression/Classification Models | Transport and Mobility

Explore the effect of time, date and weather on traffic volume on a US Interstate
Regression Models | Transport and Mobility

Explore patterns in drug abuse between cities, age groups and racial groups
Classification Models | Healthcare/Social Sciences


Intermediate:

Can you predict the fuel-efficiency of a car?
Regression Models | Automobiles

Was that chest pain an indicator of heart disease?
Classification Models | Health Sciences

Predict the total demand for orders
Regression Models | Business

Find out if a donor will give blood in March 2007
Classification Models | Business

Forecast pollution level of a city
Regression Models | Environment

Will the patient survive for at least one year after a heart attack?
Classification Models | Healthcare

Estimate compressive strength of concrete
Regression Models | Civil Engineering/Construction

Discover patterns relating liver disorder and alcohol consumption
Classification/Regression/Clustering Models | Healthcare

Predict which stock will provide greatest rate of return
Clustering/Regression/Classification Models | Business/Finance

Assess the heating and cooling load requirements of a building
Regression/Classification Models | Energy

Determine the type of glass using oxide content
Classification Models | Physical

Predict a hepatitis patient's chance of survival
Classification Models | Healthcare

Find patterns from spending data at wholesale
Classification/Clustering | Business/Retail

Group similar travel reviews
Clustering/Classification Models | Web

Relate returns of Istanbul Stock Exchange with other international indices
Regression/Classification Models | Business/Finance

Predict bike rental count (hourly/daily) based on the environmental & seasonal settings
Regression Models | Social

Detect Room Occupancy through Light, Temperature, Humidity and CO2 sensors
Classification Models | Energy/Buildings

Estimate whether a person’s income exceeds $50K/year
Classification Models | Social/Government

Predict the condition of a patient's liver from their bloodwork
Classification Models | Healthcare

Predict future poverty trends in EU Countries
Regression Models | Social/Government

Predict the spread of Tuberculosis across the US
Regression Models | Healthcare

Determine if smoking, invasive birth control methods and a history of STDs can lead to Cervical Cancer
Classification Models | Healthcare


Advanced:

Detect Autistic Spectrum Disorder (ASD) cases
Classification Models | Healthcare/Social Sciences

Estimate the probability of default
Classification Models | Business/Finance

Predict if a banknote is genuine
Classification Models | Banking/Finance

Produce a short-term forecast of the electricity consumption of a single home
Regression/Clustering Models | Electricity

Predict the number of shares on social networks
Regression/Classification Models | Business/Web

Analyze the text or sentiment of product reviews on Amazon, or recommend products
Text Analytics/Sentiment Analysis/Recommender Systems

Explore predictive modelling and numerical forecasting techniques
Regression Models | Social Sciences/Government

Explore changes in brain activity in humans in the presence and absence of a visual stimulus
Classification Models | Neuroscience/Healthcare

Explore patterns in brain activity based on multiple visual and non-visual stimuli
Classification Models | Neuroscience/Healthcare


Queries:

Can I use these datasets for my project?
Sure! You're totally free to do so.

Can I add a dataset here?
Send us a pull request and we'll discuss it.

There seems to be a problem here.
If you find an issue, please raise it using this link.