Data Sets to Uplift your Skills
- Data Science Dojo has added 30 data sets to this repository.
- The repository carries a diverse range of themes, difficulty levels, sizes and attributes.
- The data sets offer hands-on practice to boost your skills in exploratory data analysis, data visualization, data wrangling and machine learning.
- The data sets below are sorted by increasing level of difficulty for convenience (Beginner, Intermediate, Advanced).
To fork this repository, follow the guide How to fork a project on GitLab.
Beginner:
Find out the age of Abalone from physical measurements
Regression Models | Environment
Predict student's knowledge level
Classification/Clustering Models | Education/Web
Can you predict the price of a house?
Regression Models | Real Estate
Can you estimate location from Wi-Fi signal strength?
Classification Models | Mobile/Location
Predict acceptability of a car
Classification Models | Automobile
Predict seminal quality of an individual
Regression/Classification Models | Healthcare/Life
Estimate the chance of bankruptcy from qualitative parameters assessed by experts
Classification Models | Finance/Banking
Intermediate:
Can you predict the fuel-efficiency of a car?
Regression Models | Automobiles
Was that chest pain an indicator of heart disease?
Classification Models | Health Sciences
Predict the total demand for orders
Regression Models | Business
Find out if a donor will give blood in March 2007
Classification Models | Business
Forecast pollution level of a city
Regression Models | Environment
Will the patient survive for at least one year after a heart attack?
Classification Models | Healthcare
Estimate compressive strength of concrete
Regression Models | Civil Engineering/Construction
Discover patterns relating liver disorder and alcohol consumption
Classification/Regression/Clustering Models | Healthcare
Predict which stock will provide greatest rate of return
Clustering/Regression/Classification Models | Business/Finance
Assess heating and cooling load requirements of a building
Regression/Classification Models | Energy
Determine the type of glass using oxide content
Classification Models | Physical
Predict chance of survival
Classification Models | Healthcare
Find patterns from spending data at wholesale
Classification/Clustering Models | Business/Retail
Group similar travel reviews
Clustering/Classification Models | Web
Relate returns of Istanbul Stock Exchange with other international indices
Regression/Classification Models | Business/Finance
Predict bike rental count (hourly/daily) based on the environmental & seasonal settings
Regression Models | Social
Detect Room Occupancy through Light, Temperature, Humidity and CO2 sensors
Classification Models | Energy/Buildings
Estimate whether a person’s income exceeds $50K/year
Classification Models | Social/Government
Advanced:
Detect Autistic Spectrum Disorder (ASD) cases
Classification Models | Healthcare/Social Sciences
Estimate the probability of Default
Classification Models | Business/Finance
Predict if a note is genuine
Classification Models | Banking/Finance
Find a short-term forecast of electricity consumption for a single home
Regression/Clustering Models | Electricity
Predict the number of shares on social networks
Regression/Classification Models | Business/Web
Queries:
Can I use these datasets for my project?
Sure! You're totally free to do so.
Can I add a dataset here?
Send us a pull request and we'll discuss it.
There seems to be a problem here.
If you find an issue, kindly raise it using this link.
