Pablo Ruiz / tutorials
Commit ea4aab51 (Unverified), authored 6 years ago by ningxixu, committed by GitHub 6 years ago
Create dplyr1
parent c6c3bc6e
master
Showing 1 changed file with 28 additions and 0 deletions
dplyr1 (new file, mode 0 → 100644)
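# Install dplyr and ggplot2 only if they are missing, then load them.
if (!requireNamespace('dplyr', quietly = TRUE)) install.packages('dplyr')
if (!requireNamespace('ggplot2', quietly = TRUE)) install.packages('ggplot2')
library(dplyr)
library(ggplot2)
# Read the wine reviews; keep text columns as character vectors.
wine = read.csv('wine.csv', stringsAsFactors = FALSE, encoding = 'UTF-8')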
# Drop the first column (assumed to be a row index) and the free-text description.
wine = wine[, -1]
wine = wine %>% select(-description)
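# Number of reviews per country, most-reviewed first.
wine %>% group_by(country) %>% summarize(count = n()) %>% arrange(desc(count))
# Names of the 10 most-reviewed countries; the weight column is passed to top_n() explicitly.
selected_countries = wine %>% group_by(country) %>% summarize(count = n()) %>% arrange(desc(count)) %>% top_n(10, count) %>% select(country)
selected_countries = as.character(selected_countries$country)
# Points for those countries only.
select_points = wine %>% filter(country %in% selected_countries) %>% select(country, points) %>% arrange(country)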
# Price against points, with a smoothed trend line.
ggplot(wine, aes(points, price)) + geom_point() + geom_smooth()
# Points by country, with boxes ordered by median points.
ggplot(select_points, aes(x = reorder(country, points, median), y = points)) + geom_boxplot(aes(fill = country)) + xlab("Country") + ylab("Points") + ggtitle("Points Distribution for the Top 10 Wine Producing Countries") + theme(plot.title = element_text(hjust = 0.5))
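# Median points for countries outside the 10 most-reviewed.
wine %>% filter(!(country %in% selected_countries)) %>% group_by(country) %>% summarize(median = median(points)) %>% arrange(desc(median))
# All countries ranked by median points.
top = wine %>% group_by(country) %>% summarize(median = median(points)) %>% arrange(desc(median))
top = as.character(top$country)
# top holds every country, so this yields the selected countries in median-points order.
both = intersect(top, selected_countries)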
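# The 10 most-reviewed grape varieties; the weight column is passed to top_n() explicitly.
topwine = wine %>% group_by(variety) %>% summarize(number = n()) %>% arrange(desc(number)) %>% top_n(10, number)
topwine = as.character(topwine$variety)
# Median points for each of those varieties, as a bar chart with abbreviated labels.
wine %>% filter(variety %in% topwine) %>% group_by(variety) %>% summarize(median = median(points)) %>% ggplot(aes(reorder(variety, median), median)) + geom_col(aes(fill = variety)) + xlab('Variety') + ylab('Median Points') + scale_x_discrete(labels = abbreviate)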
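# Wines scoring above the 85th percentile of points, best first.
top15percent = wine %>% filter(points > quantile(points, probs = 0.85)) %>% arrange(desc(points))
# The same number of wines, taken from the cheapest end of the price range.
cheapest15percent = wine %>% arrange(price) %>% head(nrow(top15percent))
# Wines that appear in both sets: high scoring and cheap.
goodvalue = intersect(top15percent, cheapest15percent)
goodvalue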