# Data Sets to Uplift Your Skills

+ Data Science Dojo has added more than 43 data sets to this repository.
+ The repository carries a diverse range of themes, difficulty levels, sizes and attributes.
+ The data sets offer hands-on practice to boost your skills in exploratory data analysis, data visualization, data wrangling and machine learning.
+ The data sets below are sorted by increasing level of difficulty for convenience (Beginner, Intermediate, Advanced).

![](21.jpg)

##### To fork this repository, follow the GitLab guide [How to fork a project](https://docs.gitlab.com/ee/gitlab-basics/fork-project.html).

---

### Beginner:

[**Find out the age of Abalone from physical measurements**](Abalone)
Regression Models | Environment

[**Predict a student's knowledge level**](User Knowledge Modeling)
Classification/Clustering | Education/Web

[**Can you predict the price of a house?**](Real Estate Valuation)
Regression Models | Real Estate

[**Can you estimate location from Wi-Fi signal strength?**](Wireless Indoor Localization)
Classification Models | Mobile/Location

[**Predict the acceptability of a car**](Car Evaluation)
Classification Models | Automobile

[**Predict the seminal quality of an individual**](Fertility)
Regression/Classification Models | Healthcare/Life

[**Estimate the chance of bankruptcy from experts' qualitative parameters**](Qualitative Bankruptcy)
Classification Models | Finance/Banking

[**Understand driving patterns of Birmingham with respect to time and date**](Birmingham Parking Dataset)
Regression/Classification Models | Transport and Mobility

[**Explore the effect of time, date and weather on traffic volume on a US Interstate**](https://code.datasciencedojo.com/datasciencedojo/datasets/tree/patch-1/Interstate-94%20(I-94)%20Traffic%20Volume%20Dataset)
Regression Models | Transport and Mobility

[**Explore patterns in drug abuse between cities, age groups and racial groups**](Accidental Drug Related Deaths in Connecticut, US)
Classification Models | Healthcare/Social Sciences

---

### Intermediate:

[**Can you predict the fuel-efficiency of a car?**](Auto MPG)
Regression Models | Automobiles

[**Was that chest pain an indicator of heart disease?**](Heart Disease)
Classification Models | Health Sciences

[**Predict the total demand for orders**](Daily Demand Forecasting Orders)
Regression Models | Business

[**Find out if a donor will give blood in March 2007**](Blood Transfusion Service Center)
Classification Models | Business

[**Forecast the pollution level of a city**](Beijing PM2.5)
Regression Models | Environment

[**Will the patient survive for at least one year after a heart attack?**](Echocardiogram)
Classification Models | Healthcare

[**Estimate compressive strength of concrete**](Concrete Compressive Strength)
Regression Models | Civil Engineering/Construction

[**Discover patterns relating liver disorders and alcohol consumption**](Liver Disorders)
Classification/Regression/Clustering Models | Healthcare

[**Predict which stock will provide the greatest rate of return**](Dow Jones Index)
Clustering/Regression/Classification Models | Business/Finance

[**Assess heating and cooling load requirements of buildings**](Energy Efficiency)
Regression/Classification Models | Energy

[**Determine the type of glass using oxide content**](Glass Identification)
Classification Models | Physical

[**Predict the chance of survival**](Hepatitis)
Classification Models | Healthcare

[**Find patterns in wholesale spending data**](Wholesale Customers)
Classification/Clustering | Business/Retail

[**Group similar travel reviews**](Travel Reviews)
Clustering/Classification Models | Web

[**Relate returns of the Istanbul Stock Exchange to other international indices**](Istanbul Stock Exchange)
Regression/Classification Models | Business/Finance

[**Predict bike rental count (hourly/daily) based on environmental and seasonal settings**](Bike Sharing)
Regression Models | Social

[**Detect room occupancy through light, temperature, humidity and CO2 sensors**](Occupancy Detection)
Classification Models | Energy/Buildings

[**Estimate whether a person's income exceeds \$50K/year**](Census Income)
Classification Models | Social/Government

[**Predict the condition of a patient's liver from their bloodwork**](https://code.datasciencedojo.com/datasciencedojo/datasets/tree/patch-1/Hepatitis%20C%20Virus%20(HCV)%20Classification%20Dataset)
Classification Models | Healthcare

[**Predict future poverty trends in EU countries**](EU Population Poverty Status Dataset)
Regression Models | Social/Government

[**Predict the spread of tuberculosis across the US**](US Tuberculosis Dataset)
Regression Models | Healthcare

[**Determine if smoking, invasive birth control methods and a history of STDs can lead to cervical cancer**](Risk Factors for Cervical Cancer)
Classification Models | Healthcare

---

### Advanced:

[**Detect Autistic Spectrum Disorder (ASD) cases**](Autism Screening Adult)
Classification Models | Healthcare/Social Sciences

[**Estimate the probability of default**](Default of Credit Card Clients)
Classification Models | Business/Finance

[**Predict if a note is genuine**](Banknote Authentication)
Classification Models | Banking/Finance

[**Produce a short-term forecast of the electricity consumption of a single home**](Individual Household Electric Power Consumption)
Regression/Clustering Models | Electricity

[**Predict the number of shares on social networks**](Online News Popularity)
Regression/Classification Models | Business/Web

[**Analyze the text or sentiment of product reviews on Amazon, or recommend products**](Amazon Product Reviews)
Text Analytics/Sentiment Analysis/Recommender Systems

---

### Queries:

**Can I use these datasets for my project?**
Sure! You're totally free to do so.

**Can I add a dataset here?**
Send us a pull request and we'll discuss it.

**There seems to be a problem here.**
If you find an issue, please raise it by following this [guide](https://docs.gitlab.com/ee/user/project/issues/create_new_issue.html).