Commit 61edf702 by Rahim Rasool

parent 028e956a

 Data Science Dojo
Copyright (c) 2016 - 2019

---

**Level:** Intermediate
**Recommended Use:** Clustering/Regression/Classification Models

## Dow Jones Index Data Set

### Predict which stock will provide the greatest rate of return

---

![](9.jpg)

---

This *intermediate* level data set has 750 rows and 16 columns. It contains weekly data for the Dow Jones Industrial Index and has been used in computational investing research. Each record (row) holds one week of data for one stock. Each record also includes the stock's percentage return in the following week (percent_change_next_weeks_price). Ideally, this could be used to determine which stock will produce the greatest rate of return in the following week.

This data set is recommended for learning and practicing your skills in **exploratory data analysis**, **data visualization**, **clustering** and **regression/classification modelling techniques**. Feel free to explore the data set with multiple **supervised** and **unsupervised** learning techniques. The following data dictionary gives more details on this data set:

---

### Data Dictionary

| Column Position | Attribute Name | Definition | Data Type | Example | % Null Ratios |
|---|---|---|---|---|---|
| 1 | quarter | Quarter: the yearly quarter (1: Jan-Mar; 2: Apr-Jun). | Quantitative | 1, 2 | 0 |
| 2 | stock | Stock: the stock symbol* | Qualitative | INTC, INTC, BA | 0 |
| 3 | date | Date: the last business day of the week (this is typically a Friday) | Quantitative | 40564, 40683, 40620 | 0 |
| 4 | open | Open: the price of the stock at the beginning of the week | Quantitative | \$21.03, \$23.32, \$71.17 | 0 |
| 5 | high | High: the highest price of the stock during the week | Quantitative | \$21.2, \$23.96, \$71.23 | 0 |
| 6 | low | Low: the lowest price of the stock during the week | Quantitative | \$20.62, \$23.08, \$67.34 | 0 |
| 7 | close | Close: the price of the stock at the end of the week | Quantitative | \$20.82, \$23.22, \$69.1 | 0 |
| 8 | volume | Volume: the number of shares of stock that traded hands in the week | Quantitative | 218479469, 387571150, 29746370 | 0 |
| 9 | percent_change_price | Percent_Change_Price: the percentage change in price throughout the week | Quantitative | -0.998573, -0.428816, -2.90853 | 0 |
| 10 | percent_change_volume_over_last_wk | Percent_Change_Volume_Over_Last_Week: the percentage change in the number of shares of stock that traded hands for this week compared to the previous week | Quantitative | -20.29526016, 12.41924755, 16.3954667 | 4 |
| 11 | previous_weeks_volume | Previous_Weeks_Volume: the number of shares of stock that traded hands in the previous week | Quantitative | 274111012, 344755154, 25556296 | 4 |
| 12 | next_weeks_open | Next_Weeks_Open: the opening price of the stock in the following week | Quantitative | \$21.03, \$22.92, \$70.29 | 0 |
| 13 | next_weeks_close | Next_Weeks_Close: the closing price of the stock in the following week | Quantitative | \$21.46, \$22.21, \$73.34 | 0 |
| 14 | percent_change_next_weeks_price | Percent_Change_Next_Weeks_Price: the percentage change in price of the stock in the following week | Quantitative | 2.0447, -3.09773, 4.33917 | 0 |
| 15 | days_to_next_dividend | Days_to_next_dividend: the number of days until the next dividend | Quantitative | 13, 75, 54 | 0 |
| 16 | percent_return_next_dividend | Percent_Return_Next_Dividend: the percentage of return on the next dividend | Quantitative | 0.864553, 0.904393, 0.607815 | 0 |

---

*Stock Symbols:

| Company | Symbol |
|---|---|
| 3M | MMM |
| American Express | AXP |
| Alcoa | AA |
| AT&T | T |
| Bank of America | BAC |
| Boeing | BA |
| Caterpillar | CAT |
| Chevron | CVX |
| Cisco Systems | CSCO |
| Coca-Cola | KO |
| DuPont | DD |
| ExxonMobil | XOM |
| General Electric | GE |
| Hewlett-Packard | HPQ |
| The Home Depot | HD |
| Intel | INTC |
| IBM | IBM |
| Johnson & Johnson | JNJ |
| JPMorgan Chase | JPM |
| Kraft | KRFT |
| McDonald's | MCD |
| Merck | MRK |
| Microsoft | MSFT |
| Pfizer | PFE |
| Procter & Gamble | PG |
| Travelers | TRV |
| United Technologies | UTX |
| Verizon | VZ |
| Wal-Mart | WMT |
| Walt Disney | DIS |
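
The data dictionary lists prices as dollar-formatted strings and dates as plain numbers, so a little cleaning is usually needed before any modelling. The sketch below is one possible way to do that in plain Python, not the method used by the data set's authors. It rests on three assumptions that the data dictionary does not state outright: that the i-th example value of every column belongs to the same record, that the numeric dates are Excel-style serial day counts, and that each percentage column is computed as `(end - start) / start * 100`. The helper names (`parse_price`, `serial_to_date`, `percent_change`) are hypothetical and exist only for this illustration.

```python
from datetime import date, timedelta

# Example values copied from the data dictionary's "Example" column.
# ASSUMPTION: the i-th example of every column belongs to the same record.
ROWS = [
    {"stock": "INTC", "date": 40564, "open": "$21.03", "close": "$20.82",
     "volume": 218479469, "previous_weeks_volume": 274111012,
     "next_weeks_open": "$21.03", "next_weeks_close": "$21.46"},
    {"stock": "INTC", "date": 40683, "open": "$23.32", "close": "$23.22",
     "volume": 387571150, "previous_weeks_volume": 344755154,
     "next_weeks_open": "$22.92", "next_weeks_close": "$22.21"},
    {"stock": "BA", "date": 40620, "open": "$71.17", "close": "$69.1",
     "volume": 29746370, "previous_weeks_volume": 25556296,
     "next_weeks_open": "$70.29", "next_weeks_close": "$73.34"},
]


def parse_price(text):
    """Turn a dollar-formatted string such as '$21.03' into a float."""
    return float(text.replace("\\", "").replace("$", "").replace(",", ""))


def serial_to_date(serial):
    """Convert a numeric date to a calendar date.

    ASSUMPTION: the value is an Excel-style serial (days since 1899-12-30).
    """
    return date(1899, 12, 30) + timedelta(days=serial)


def percent_change(start, end):
    """Percentage change from start to end, relative to start."""
    return (end - start) / start * 100.0


# Recompute the derived columns from the raw ones.
recomputed = []
for row in ROWS:
    recomputed.append({
        "stock": row["stock"],
        "week_ending": serial_to_date(row["date"]),
        "percent_change_price": percent_change(
            parse_price(row["open"]), parse_price(row["close"])),
        "percent_change_volume_over_last_wk": percent_change(
            row["previous_weeks_volume"], row["volume"]),
        "percent_change_next_weeks_price": percent_change(
            parse_price(row["next_weeks_open"]),
            parse_price(row["next_weeks_close"])),
    })

# The record with the greatest next-week return among these examples.
best = max(recomputed, key=lambda r: r["percent_change_next_weeks_price"])

for r in recomputed:
    print(r)
print("Highest next-week return:", best["stock"], best["week_ending"])
```

Comparing the recomputed values with the examples in the data dictionary is a quick sanity check on the assumptions above before applying the same cleaning to the full file.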

---

### Acknowledgement

This data set has been sourced from the Machine Learning Repository of the University of California, Irvine: [Dow Jones Index Data Set (UC Irvine)](https://archive.ics.uci.edu/ml/datasets/Dow+Jones+Index). The UCI page mentions the following publication as the original source of the data set:

*Brown, M. S., Pelosi, M. & Dirska, H. (2013). Dynamic-radius Species-conserving Genetic Algorithm for the Financial Forecasting of Dow Jones Index Stocks. Machine Learning and Data Mining in Pattern Recognition, 7988, 27-41*