Rahim Rasool committed
Data Science Dojo <br/>
Copyright (c) 2019 - 2020

---

**Level:** Intermediate <br/>
**Recommended Use:** Regression Models<br/>
**Domain:** Social<br/> 

## Bike Sharing Data Set 

### Predict bike rental count (hourly/daily) based on the environmental & seasonal settings

---
![](411.jpg)
---

This *intermediate* level data set contains the hourly and daily counts of rental bikes in the Capital Bikeshare system for the years 2011 and 2012, along with the corresponding weather and seasonal information.
The bike-sharing rental process is highly correlated with environmental and seasonal settings. For instance, weather conditions, precipitation, day of the week, season, and hour of the day can all affect rental behavior.
The data set contains two files: bike sharing counts aggregated on an hourly basis (hour.csv: 17,379 rows, 17 columns) and bike sharing counts aggregated on a daily basis (day.csv: 731 rows, 16 columns).
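
As a quick sanity check after downloading, the row and column counts quoted above can be verified. The following is a minimal sketch using only the Python standard library. The helper name `csv_shape` is hypothetical, and the sketch assumes that hour.csv and day.csv sit in the current working directory and that each file has a single header row.

```python
import csv
from pathlib import Path


def csv_shape(path):
    """Return (data_rows, columns) for a CSV file with one header row."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)  # first line holds the column names
        # Count the remaining non-empty records.
        data_rows = sum(1 for row in reader if row)
    return data_rows, len(header)


if __name__ == "__main__":
    # Shapes stated in this README (assumed file locations: current directory).
    expected = {"hour.csv": (17379, 17), "day.csv": (731, 16)}
    for name, shape in expected.items():
        if Path(name).exists():
            print(name, "found:", csv_shape(name), "expected:", shape)
        else:
            print(name, "not found in the current directory")
```

If a reported shape differs from the expected one, the download may be incomplete or the file may have been modified.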

This data set is recommended for learning and practicing your skills in **exploratory data analysis**, **data visualization**, and **regression modelling techniques**.
This data set could also be used to discover important trends and relationships.
Feel free to explore the data set with multiple **supervised** and **unsupervised** learning techniques. The following data dictionary gives more details on this data set. Except for hr, which appears only in hour.csv, the columns are the same in both files:

---

### Data Dictionary 

| Column Position | Attribute Name | Definition | Data Type | Example | % Null Ratio |
|---|---|---|---|---|---|
| 1 | instant | Record index | Quantitative | 190, 7, 17180 | 0 |
| 2 | dteday | Date (Format: YYYY-MM-DD) | Quantitative | 2012-12-23, 2012-01-01, 2012-06-24 | 0 |
| 3 | season | Season (1: spring, 2: summer, 3: fall, 4: winter) | Quantitative | 1, 2, 4 | 0 |
| 4 | yr | Year (0: 2011, 1: 2012) | Quantitative | 0, 1 | 0 |
| 5 | mnth | Month (1 to 12) | Quantitative | 1, 6, 12 | 0 |
| 6 | hr | Hour (0 to 23); not present in day.csv | Quantitative | 4, 6, 14 | 0 |
| 7 | holiday | Whether the day is a holiday or not | Quantitative | 0, 1 | 0 |
| 8 | weekday | Day of the week | Quantitative | 0, 6, 3 | 0 |
| 9 | workingday | Working day: 1 if the day is neither a weekend nor a holiday, otherwise 0 | Quantitative | 0, 1 | 0 |
| 10 | weathersit | Weather situation (1: Clear, Few clouds, Partly cloudy; 2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist; 3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds; 4: Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog) | Quantitative | 1, 2, 3 | 0 |
| 11 | temp | Normalized temperature in Celsius. The values are derived via (t-t_min)/(t_max-t_min), t_min=-8, t_max=+39 (only in hourly scale) | Quantitative | 0.08, 0.22, 0.34 | 0 |
| 12 | atemp | Normalized feeling temperature in Celsius. The values are derived via (t-t_min)/(t_max-t_min), t_min=-16, t_max=+50 (only in hourly scale) | Quantitative | 0.0909, 0.2727, 0.303 | 0 |
| 13 | hum | Normalized humidity. The values are divided by 100 (max) | Quantitative | 0.53, 0.8, 0.31 | 0 |
| 14 | windspeed | Normalized wind speed. The values are divided by 67 (max) | Quantitative | 0.194, 0, 0.2985 | 0 |
| 15 | casual | Count of casual users | Quantitative | 0, 2, 57 | 0 |
| 16 | registered | Count of registered users | Quantitative | 1, 0, 118 | 0 |
| 17 | cnt | Count of total rental bikes, including both casual and registered | Quantitative | 1, 2, 175 | 0 |
---
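
The normalization formulas in the data dictionary can be inverted to recover approximate values on the original scales, which is often handy for plotting. The following is a minimal sketch that assumes the constants stated in the table (t_min=-8 and t_max=+39 for temp, t_min=-16 and t_max=+50 for atemp, divisors of 100 and 67 for hum and windspeed). The function names are hypothetical. This README does not state the units of humidity or wind speed, so none are assumed here, and the temperature bounds are described as applying to the hourly scale only.

```python
def denormalize_min_max(value, t_min, t_max):
    """Invert (t - t_min) / (t_max - t_min) to recover t."""
    return value * (t_max - t_min) + t_min


def denormalize_row(row):
    """Return a copy of `row` with the weather columns on their original scales."""
    out = dict(row)
    # Temperatures: min-max constants taken from the data dictionary.
    out["temp"] = denormalize_min_max(row["temp"], -8, 39)
    out["atemp"] = denormalize_min_max(row["atemp"], -16, 50)
    # Humidity and wind speed were divided by their stated maxima.
    out["hum"] = row["hum"] * 100
    out["windspeed"] = row["windspeed"] * 67
    return out


# Example values taken from the "Example" column of the data dictionary.
sample = {"temp": 0.22, "atemp": 0.2727, "hum": 0.8, "windspeed": 0.194}
print(denormalize_row(sample))
```

For instance, a normalized temp of 0.22 corresponds to 0.22 * (39 - (-8)) + (-8), which is about 2.34 degrees Celsius.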

### Acknowledgement


This data set was sourced from the UCI Machine Learning Repository at the University of California, Irvine: [Bike Sharing Data Set (UC Irvine)](https://archive.ics.uci.edu/ml/datasets/Bike+Sharing+Dataset).
The UCI page cites the following publication as the original source of the data set:

*Fanaee-T, Hadi, and Gama, Joao, 'Event labeling combining ensemble detectors and background knowledge', Progress in Artificial Intelligence (2013): pp. 1-15, Springer Berlin Heidelberg*