Data Science Dojo <br/>
Copyright (c) 2016 - 2019

---

**Level:** Intermediate <br/>
**Recommended Use:** Classification/Regression/Clustering Models<br/>
**Domain:** Healthcare<br/>

## Liver Disorders Data Set

### Patterns relating liver disorder and alcohol consumption


---
![](1265.jpg)
---

This *intermediate* level data set has 345 rows and 7 columns. The 7th column is not a variable but a train/test selector. The file has no header row, so the column names have to be added manually.
The dataset does not contain any variable representing presence or absence of a liver disorder.
This data set is recommended for learning and practicing your skills in **exploratory data analysis**, **data visualization**, **regression**, **classification** and **clustering** modelling techniques.
Feel free to explore the data set with multiple **supervised** and **unsupervised** learning techniques.
The following data dictionary gives more details on this data set:

---

### Data Dictionary

| Column Position   	| Attribute Name            	| Definition                                                           	| Data Type    	| Example       	| % Null Ratios 	|
|-------------------	|---------------------------	|----------------------------------------------------------------------	|--------------	|---------------	|---------------	|
| 1                 	| Mcv                       	| Mean Corpuscular Volume (Blood Test)                                 	| Quantitative 	| 85, 91, 96    	| 0             	|
| 2                 	| Alkphos                   	| Alkaline Phosphatase (Blood Test)                                    	| Quantitative 	| 78, 55, 70    	| 0             	|
| 3                 	| Sgpt                      	| Alanine Aminotransferase (Blood Test)                                	| Quantitative 	| 45, 12, 15    	| 0             	|
| 4                 	| Sgot                      	| Aspartate Aminotransferase (Blood Test)                              	| Quantitative 	| 27, 17, 33    	| 0             	|
| 5                 	| Gammagt                   	| Gamma-Glutamyl Transpeptidase (Blood Test)                           	| Quantitative 	| 31, 54, 36    	| 0             	|
| 6                 	| Drinks                    	| Number of half-pint equivalents of alcoholic beverages drunk per day 	| Quantitative 	| 0.5, 3.0, 8.0 	| 0             	|
| 7                 	| Selector (Not a variable) 	| Field used to split data into two sets                               	| Quantitative 	| 1, 2          	| 0             	|
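
Because the file ships without a header row, the names from the data dictionary have to be attached when the data is read. The following is a minimal sketch, assuming the file is comma-separated with one record per line. The helper names `load_liver` and `split_by_selector` are hypothetical, and the sample rows are illustrative values assembled from the Example column above, not actual records from the data set.

```python
import csv
import io

# Column names copied from the data dictionary, in column-position order.
COLUMNS = ["Mcv", "Alkphos", "Sgpt", "Sgot", "Gammagt", "Drinks", "Selector"]


def load_liver(source):
    """Read headerless comma-separated rows into dicts keyed by COLUMNS."""
    rows = []
    for record in csv.reader(source):
        if not record:  # skip blank lines
            continue
        values = [float(v) for v in record]
        rows.append(dict(zip(COLUMNS, values)))
    return rows


def split_by_selector(rows):
    """Split rows on the Selector column and drop it from each row.

    Which selector value marks the training set is not stated in this
    README, so the two groups are returned as-is (value 1, value 2).
    """
    def without_selector(row):
        return {k: v for k, v in row.items() if k != "Selector"}

    group_1 = [without_selector(r) for r in rows if r["Selector"] == 1]
    group_2 = [without_selector(r) for r in rows if r["Selector"] == 2]
    return group_1, group_2


# Illustrative rows only (built from the Example column, not real records).
sample = io.StringIO(
    "85,78,45,27,31,0.5,1\n"
    "91,55,12,17,54,3.0,2\n"
    "96,70,15,33,36,8.0,1\n"
)
rows = load_liver(sample)
group_1, group_2 = split_by_selector(rows)
```

To read the real file, pass an open file handle to `load_liver` in place of `sample`. With pandas, the same names can be supplied through `pd.read_csv(path, header=None, names=COLUMNS)`.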

### Acknowledgement

This data set has been sourced from the Machine Learning Repository of the University of California, Irvine: [Liver Disorders Data Set (UC Irvine)](https://archive.ics.uci.edu/ml/datasets/Liver+Disorders).
The UCI page mentions BUPA Medical Research Ltd as the original source of the data set.