Commit 10cd74d4 by Rebecca Merrett

Changing directory example and multi-line comments

parent 814cc5fb
@@ -6,7 +6,7 @@ library(tseries)
# Set your working directory to where your script and
# data files sit on your local computer
-setwd("C:\\Users\\Rebecca\\Desktop\\time_series")
+setwd("C:\\Your\\Directory\\Here")
# Read csv file of train dataset as a univariate
# (single variable) series, with datetime
@@ -145,36 +145,36 @@ mean_absolute_error
# is stationary
adf.test(hourly_sentiment_series_diff2)
-#-Need to better transform these data:
-# You could look at stabilizing the variance by applying
-# the cube root for neg and pos values and then
-# difference the data
-#-You might compare models with different AR and MA terms
-#-This is a very small sample size of 24 timestamps,
-# so might not have enough to spare for a holdout set
-# To get more use out of your data for training, rolling over time
-# series or timestamps at a time for different holdout sets
-# allows for training on more timestamps; doesn't stop the model from
-# capturing the last chunk of timestamps stored in a single holdout set
-#-The data only looks at 24 hours in one day
-# Would we start to capture more of a trend in hourly sentiment if we
-# collected data over several days?
-# How would you go about collecting more data?
-# Take on the challenge and further improve this model:
-# You have been given a head start, now take this example
-# and improve on it!
-# To study time series further:
-#-Look at model diagnostics
-#-Use AIC to search best model parameters
-#-Handle any datetime data issues
-#-Try other modeling techniques
-# Learn more during a short, intense bootcamp:
-# Time Series to be introduced in Data Science Dojo's
-# post bootcamp material
-# Data Science Dojo's bootcamp also covers some other key
-# machine learning algorithms and techniques and takes you through
-# the critical thinking process behind many data science tasks
-# Check out the curriculum: https://datasciencedojo.com/bootcamp/curriculum/
+"-Need to better transform these data:
+You could look at stabilizing the variance by applying
+the cube root for neg and pos values and then
+difference the data
+-You might compare models with different AR and MA terms
+-This is a very small sample size of 24 timestamps,
+so might not have enough to spare for a holdout set
+To get more use out of your data for training, rolling over time
+series or timestamps at a time for different holdout sets
+allows for training on more timestamps; doesn't stop the model from
+capturing the last chunk of timestamps stored in a single holdout set
+-The data only looks at 24 hours in one day
+Would we start to capture more of a trend in hourly sentiment if we
+collected data over several days?
+How would you go about collecting more data?
+Take on the challenge and further improve this model:
+You have been given a head start, now take this example
+and improve on it!
+To study time series further:
+-Look at model diagnostics
+-Use AIC to search best model parameters
+-Handle any datetime data issues
+-Try other modeling techniques
+Learn more during a short, intense bootcamp:
+Time Series to be introduced in Data Science Dojo's
+post bootcamp material
+Data Science Dojo's bootcamp also covers some other key
+machine learning algorithms and techniques and takes you through
+the critical thinking process behind many data science tasks
+Check out the curriculum: https://datasciencedojo.com/bootcamp/curriculum/"