Commit 726a60b1 by Rahim Rasool

Add household_electric, glass, daily_demand & concrete dataset

parent f79a29c3
Data Science Dojo <br/>
Copyright (c) 2016 - 2019
**Level:** Intermediate <br/>
**Recommended Use:** Regression Models<br/>
**Domain:** Civil Engineering/Construction<br/>
## Concrete Compressive Strength Data Set
### Estimate compressive strength of concrete
This *intermediate* level data set has 1030 rows and 9 columns.
Concrete is the most important material in civil engineering. Its compressive strength is a highly nonlinear function of age and ingredients.
The actual concrete compressive strength (MPa) for a given mixture at a specific age (days) was determined in the laboratory. The data is in raw form (not scaled).
This data set is recommended for learning and practicing your skills in **exploratory data analysis**, **data visualization**, and **regression modelling techniques**.
It also allows you to practice with non-linear functions. Feel free to explore the data set with multiple **supervised** and **unsupervised** learning techniques. The following data dictionary gives more details on this data set:
### Data Dictionary
| Column Position | Attribute Name | Definition | Data Type | Example | % Null Ratios |
|------------------- |-------------------------------------------------------- |------------------------------------------------------------------------------- |-------------- |-------------------------------------- |--------------- |
| 1 | Cement (component 1)(kg in a m^3 mixture) | Cement (component 1) -- Kilogram in a meter-cube mixture -- Input Variable | Quantitative | 194.68, 379.5, 167.95 | 0 |
| 2 | Blast Furnace Slag (component 2)(kg in a m^3 mixture) | Blast Furnace Slag (component 2) -- kg in a m3 mixture -- Input Variable | Quantitative | 0, 151.2, 42.08 | 0 |
| 3 | Fly Ash (component 3)(kg in a m^3 mixture) | Fly Ash (component 3) -- kg in a m3 mixture -- Input Variable | Quantitative | 100.52, 0, 163.83 | 0 |
| 4 | Water (component 4)(kg in a m^3 mixture) | Water (component 4) -- kg in a m3 mixture -- Input Variable | Quantitative | 165.62, 153.9, 121.75 | 0 |
| 5 | Superplasticizer (component 5)(kg in a m^3 mixture) | Superplasticizer (component 5) -- kg in a m3 mixture -- Input Variable | Quantitative | 7.48, 15.9, 5.72 | 0 |
| 6 | Coarse Aggregate (component 6)(kg in a m^3 mixture) | Coarse Aggregate (component 6) -- kg in a m3 mixture -- Input Variable | Quantitative | 1006.4, 1134.3, 1058.7 | 0 |
| 7 | Fine Aggregate (component 7)(kg in a m^3 mixture) | Fine Aggregate (component 7) -- kg in a m3 mixture -- Input Variable | Quantitative | 905.9, 605, 780.11 | 0 |
| 8 | Age (day) | Age -- Day (1-365) -- Input Variable | Quantitative | 56, 91, 28 | 0 |
| 9 | Concrete compressive strength(MPa, megapascals) | Concrete compressive strength -- MegaPascals -- Output Variable | Quantitative | 33.96358776, 56.49566344, 32.8535314 | 0 |
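
Since the data set is recommended for regression, a useful first step is a baseline error to beat. The following is a minimal sketch, assuming only the three compressive strength values listed in the Example column above; the names `observed_mpa`, `rmse`, and `mean_strength` are illustrative and not part of the data set.

```python
import math

# Three compressive strength values (MPa) copied from the Example column
# of the data dictionary; a real analysis would use all 1030 rows.
observed_mpa = [33.96358776, 56.49566344, 32.8535314]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Mean baseline: predict the average strength for every mixture.
mean_strength = sum(observed_mpa) / len(observed_mpa)
baseline_rmse = rmse(observed_mpa, [mean_strength] * len(observed_mpa))
print(f"mean strength: {mean_strength:.2f} MPa, baseline RMSE: {baseline_rmse:.2f} MPa")
```

Any regression model worth keeping should achieve a lower RMSE on held-out rows than this constant prediction.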
### Acknowledgement
This data set has been sourced from the Machine Learning Repository of University of California, Irvine [Concrete Compressive Strength Data Set (UC Irvine)](
The UCI page mentions the following publication as the original source of the data set:
*I-Cheng Yeh, "Modeling of strength of high performance concrete using artificial neural networks," Cement and Concrete Research, Vol. 28, No. 12, pp. 1797-1808 (1998)*
Week of the month (first week, second, third, fourth or fifth week;Day of the week (Monday to Friday);Non-urgent order;Urgent order;Order type A;Order type B;Order type C;Fiscal sector orders;Orders from the traffic controller sector;Banking orders (1);Banking orders (2);Banking orders (3);Target (Total orders)
Data Science Dojo <br/>
Copyright (c) 2016 - 2019
**Level:** Intermediate <br/>
**Recommended Use:** Regression Models<br/>
**Domain:** Business<br/>
## Daily Demand Forecasting Orders Data Set
### Predict the total number of orders
This *intermediate* level data set has 60 rows and 13 columns.
The dataset was collected over 60 days and is a real database of a Brazilian logistics company.
The dataset has twelve predictive attributes and a target that is the total of orders for daily treatment.
This data set is recommended for learning and practicing your skills in **exploratory data analysis**, **data visualization**, and **regression modelling techniques**.
It also allows you to practice with a large number of features. Feel free to explore the data set with multiple **supervised** and **unsupervised** learning techniques. The following data dictionary gives more details on this data set:
### Data Dictionary
| Column Position | Attribute Name | Definition | Data Type | Example | % Null Ratios |
|------------------- |--------------------------------------------- |------------------------------------------------------------------------------- |-------------- |--------------------------- |--------------- |
| 1 | Week of the month | Week of the month (1: first, 2: second, 3: third, 4: fourth, 5:fifth) | Quantitative | 1, 2, 3 | 0 |
| 2 | Day of the week | Day of the week (2: Monday, 3: Tuesday, 4: Wednesday, 5:Thursday, 6:Friday) | Quantitative | 2, 3, 4 | 0 |
| 3 | Non-urgent order | Non-urgent order | Quantitative | 171.297, 220.343, 127.805 | 0 |
| 4 | Urgent order | Urgent order | Quantitative | 127.667, 141.406, 114.813 | 0 |
| 5 | Order type A | Order type A | Quantitative | 41.542, 46.241, 39.025 | 0 |
| 6 | Order type B | Order type B | Quantitative | 113.294, 120.865, 110.74 | 0 |
| 7 | Order type C | Order type C | Quantitative | 162.284, 196.296, 94.47 | 0 |
| 8 | Fiscal sector orders | Fiscal sector orders | Quantitative | 18.156, 1.653, 1.617 | 0 |
| 9 | Orders from the traffic controller sector | Orders from the traffic controller sector | Quantitative | 49971, 34878, 33366 | |
| 10 | Banking orders (1) | Banking orders (1) | Quantitative | 33703, 32905, 21103 | 0 |
| 11 | Banking orders (2) | Banking orders (2) | Quantitative | 69054, 117137, 84558 | 0 |
| 12 | Banking orders (3) | Banking orders (3) | Quantitative | 18423, 29188, 16683 | 0 |
| 13 | Target (Total orders) | Target (Total orders) | Quantitative | 317.12, 363.402, 244.235 | 0 |
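
The header line of the data file shown in this commit is separated by semicolons, so a reader has to be told about the delimiter. The sketch below assumes that delimiter, uses shortened column names taken from the data dictionary, and builds one illustrative row from the first value in each Example cell; that row is not claimed to be an actual record. `load_orders` and `DAY_NAMES` are hypothetical helper names.

```python
import csv
import io

# Column names follow the data dictionary; the single data row is assembled
# from the Example column and is illustrative only.
SAMPLE = (
    "Week of the month;Day of the week;Non-urgent order;Urgent order;"
    "Order type A;Order type B;Order type C;Fiscal sector orders;"
    "Orders from the traffic controller sector;Banking orders (1);"
    "Banking orders (2);Banking orders (3);Target (Total orders)\n"
    "1;2;171.297;127.667;41.542;113.294;162.284;18.156;"
    "49971;33703;69054;18423;317.12\n"
)

# Day-of-week codes as documented in the data dictionary.
DAY_NAMES = {2: "Monday", 3: "Tuesday", 4: "Wednesday", 5: "Thursday", 6: "Friday"}


def load_orders(text):
    """Parse semicolon-delimited text into a list of {column: float} rows."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return [{name: float(value) for name, value in row.items()} for row in reader]


rows = load_orders(SAMPLE)
first = rows[0]
day = DAY_NAMES[int(first["Day of the week"])]
print(day, first["Target (Total orders)"])
```

To read the real file, the same `csv.DictReader(..., delimiter=";")` call can be applied to an open file handle instead of the in-memory sample.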
### Acknowledgement
This data set has been sourced from the Machine Learning Repository of University of California, Irvine [Daily Demand Forecasting Orders Data Set (UC Irvine)](
The UCI page mentions the following publication as the original source of the data set:
*Ferreira, R. P., Martiniano, A., Ferreira, A., Ferreira, A., & Sassi, R. J. (2016). Study on daily demand forecasting orders using artificial neural network. IEEE Latin America Transactions, 14(3), 1519-1525*
Data Science Dojo <br/>
Copyright (c) 2016 - 2019
**Level:** Intermediate<br/>
**Recommended Use:** Classification Models<br/>
**Domain:** Physical<br/>
## Glass Identification Data Set
### Predict the type of glass
This *intermediate* level data set has 214 rows and 11 columns.
The data set provides details about 6 types of glass, defined in terms of their oxide content (e.g. Na, Fe, K).
This data set is recommended for learning and practicing your skills in **exploratory data analysis**, **data visualization**, and **classification modelling techniques**.
Feel free to explore the data set with multiple **supervised** and **unsupervised** learning techniques. The following data dictionary gives more details on this data set:
### Data Dictionary
| Column Position | Attribute Name | Definition | Data Type | Example | % Null Ratios |
|------------------- |---------------- |----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |-------------- |--------------------------- |--------------- |
| 1 | Id number | Id number from 1 to 214 | Quantitative | 16, 75, 211 | 0 |
| 2 | RI | RI: Refractive Index | Quantitative | 1.51755, 1.51613, 1.51844 | 0 |
| 3 | Na | Na: Sodium (unit measurement: weight percent in corresponding oxide) | Quantitative | 13.19, 12.79, 14.21 | 0 |
| 4 | Mg | Mg: Magnesium (unit measurement: weight percent in corresponding oxide) | Quantitative | 3.82, 2.87, 3.59 | 0 |
| 5 | Al | Al: Aluminum (unit measurement: weight percent in corresponding oxide) | Quantitative | 1.56, 1.43, | 0 |
| 6 | Si | Si: Silicon (unit measurement: weight percent in corresponding oxide) | Quantitative | 73.20, 71.77, 72.95 | 0 |
| 7 | K | K: Potassium (unit measurement: weight percent in corresponding oxide) | Quantitative | 0.67, 0.57, 0.11 | 0 |
| 8 | Ca | Ca: Calcium (unit measurement: weight percent in corresponding oxide) | Quantitative | 8.09, 7.83, 9.57 | 0 |
| 9 | Ba | Ba: Barium (unit measurement: weight percent in corresponding oxide) | Quantitative | 0.00, 0.11, 0.27 | 0 |
| 10 | Fe | Fe: Iron (unit measurement: weight percent in corresponding oxide) | Quantitative | 0.11, 0.14, 0.00 | 0 |
| 11 | Type of Glass | Glass Type (1: building_windows_float_processed, 2: building_windows_non_float_processed, 3: vehicle_windows_float_processed, 4: vehicle_windows_non_float_processed, 5: containers, 6: tableware, 7: headlamps) | Quantitative | 2, 5, 7 | 0 |
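
The class column stores integer codes, so it helps to map them to readable labels before plotting or reporting results. The sketch below copies the code-to-label mapping from the data dictionary above; `GLASS_TYPES` and `glass_label` are illustrative names, and the three codes counted are simply the values from the Example column.

```python
from collections import Counter

# Code-to-label mapping as given in the data dictionary.
GLASS_TYPES = {
    1: "building_windows_float_processed",
    2: "building_windows_non_float_processed",
    3: "vehicle_windows_float_processed",
    4: "vehicle_windows_non_float_processed",
    5: "containers",
    6: "tableware",
    7: "headlamps",
}


def glass_label(code):
    """Return the readable label for a glass type code (int or numeric string)."""
    key = int(code)
    if key not in GLASS_TYPES:
        raise ValueError(f"unknown glass type code: {code!r}")
    return GLASS_TYPES[key]


# Count labels for the example codes listed in the data dictionary.
example_codes = [2, 5, 7]
label_counts = Counter(glass_label(code) for code in example_codes)
print(label_counts)
```

Because the codes are labels rather than measurements, they are best treated as categories when training a classifier.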
### Acknowledgement
This data set has been sourced from the Machine Learning Repository of University of California, Irvine [Glass Identification Data Set (UC Irvine)](
The UCI page mentions the USA Forensic Science Service as the original source of the data set.
Data Science Dojo <br/>
Copyright (c) 2016 - 2019
**Level:** Intermediate <br/>
**Recommended Use:** Regression/Clustering Models<br/>
**Domain:** Electricity<br/>
## Individual household electric power consumption Data Set
### Find a short term forecast on electricity consumption of a single home
This *intermediate* level data set has 2075259 rows and 9 columns.
This dataset provides measurements of electric power consumption in one household with a one-minute sampling rate over a period of almost 4 years.
Different electrical quantities and some sub-metering values are available.
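
As a quick sanity check, the row count and the one-minute sampling rate can be compared with the stated period of almost 4 years. This sketch assumes each row corresponds to exactly one minute with no gaps in the timeline, which the source does not confirm; the variable names are illustrative.

```python
# Row count stated in this README.
ROWS = 2_075_259
MINUTES_PER_DAY = 24 * 60

# Assumption: one row per minute, with no gaps in the timeline.
days_covered = ROWS / MINUTES_PER_DAY
years_covered = days_covered / 365.25

print(f"{days_covered:.1f} days, about {years_covered:.2f} years")
```

The result falls just short of four years, which is consistent with the description above.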
This data set is recommended for learning and practicing your skills in **exploratory data analysis**, **data visualization**, **clustering** and **regression modelling techniques**.
It also allows you to practice with a large number of features. Feel free to explore the data set with multiple **supervised** and **unsupervised** learning techniques. The following data dictionary gives more details on this data set:
### Data Dictionary
| Column Position | Attribute Name | Definition | Data Type | Example | % Null Ratios |
|------------------- |----------------------- |------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |-------------- |---------------------------------- |--------------- |
| 1 | Date | Date: Date in format dd/mm/yyyy | Quantitative | 16/12/2006, 10/5/2007, 24/9/2007 | ? |
| 2 | Time | Time: time in format hh:mm:ss | Quantitative | 17:27:00, 6:56:00, 10:00:00 | ? |
| 3 | Global_Active_Power | Global_active_power: Household global minute-averaged active power (in kilowatt) | Quantitative | 4.216, 5.412, 3.488 | ? |
| 4 | Global_Reactive_Power | Global_reactive_power: Household global minute-averaged reactive power (in kilowatt) | Quantitative | 0.418, 0.47, 0.228 | ? |
| 5 | Voltage | Voltage: Minute-averaged voltage (in volt) | Quantitative | 234.84, 232.78, 233.06 | ? |
| 6 | Global_Intensity | Global_intensity: Household global minute-averaged current intensity (in ampere) | Quantitative | 18.4, 23.2, 15 | ? |
| 7 | Sub_Metering_1 | Sub_metering_1: Energy sub-metering No. 1 (in watt-hour of active energy). It corresponds to the kitchen, containing mainly a dishwasher, an oven and a microwave (hot plates are not electric but gas powered). | Quantitative | 1, 38, 17 | ? |
| 8 | Sub_Metering_2 | Sub_metering_2: Energy sub-metering No. 2 (in watt-hour of active energy). It corresponds to the laundry room, containing a washing-machine, a tumble-drier, a refrigerator and a light. | Quantitative | 1, 36, 5 | ? |
| 9 | Sub_Metering_3 | Sub_metering_3: Energy sub-metering No. 3 (in watt-hour of active energy). It corresponds to an electric water-heater and an air-conditioner | Quantitative | 17, 0, 18 | ? |

The expression `global_active_power*1000/60 - sub_metering_1 - sub_metering_2 - sub_metering_3` represents the active energy consumed every minute (in watt-hours) in the household by electrical equipment not measured in sub-meterings 1, 2 and 3.
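
The expression above can be written as a small function. This is a minimal sketch: the function name `unmetered_energy_wh` is hypothetical, and the input values in the example are illustrative rather than a row taken from the data set.

```python
def unmetered_energy_wh(global_active_power_kw, sub_metering_1, sub_metering_2, sub_metering_3):
    """Energy (Wh) used in one minute by equipment outside the three sub-meters.

    global_active_power_kw is the minute-averaged active power in kilowatts;
    the sub-metering arguments are in watt-hours of active energy.
    """
    # kW -> W (x1000), then one minute of that power in watt-hours (/60).
    total_wh = global_active_power_kw * 1000 / 60
    return total_wh - sub_metering_1 - sub_metering_2 - sub_metering_3


# Illustrative values only: 4.216 kW with sub-meter readings of 0, 1 and 17 Wh.
remainder = unmetered_energy_wh(4.216, 0.0, 1.0, 17.0)
print(f"{remainder:.2f} Wh")
```

Rows with missing measurements would need to be filtered or imputed before applying the function.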
### Acknowledgement
This data set has been sourced from the Machine Learning Repository of University of California, Irvine [Individual household electric power consumption Data Set (UC Irvine)](
The UCI page mentions the following as the source of the data set:
*Georges Hebrail, Senior Researcher, EDF R&D, Clamart, France*
*Alice Berard, TELECOM ParisTech Master of Engineering Internship at EDF R&D, Clamart, France*