R Tutorial Quick
Before R, there was S. S-Plus is a commercial (and expensive) implementation of the S language; R is a free implementation of S. We deal with sets of data, so the first thing we need is a vector of values. R is case-sensitive.
length(): useful for finding out how many values a vector holds.
We can do arithmetic with vectors, for example x+y.
Perform x 3
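A minimal sketch of the vector basics above. The values in x and y are made up for illustration, and x * 3 is only one possible scalar operation (an assumption, since the notes do not say which operator was used).

```r
# Hypothetical example values, not taken from the tutorial's data
x <- c(2, 4, 6, 8)
y <- c(1, 3, 5, 7)

length(x)   # number of values in x: 4
x + y       # element-wise sum: 3 7 11 15
x * 3       # each element multiplied by 3 (assumed operation): 6 12 18 24
```

Because R is case-sensitive, x and X would be two different objects.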
mean()
rnorm(100)
[1] indicates the first element, [6] indicates the 6th element, and so on.
mean(y)
If you have multiple commands on one line, separate them with semicolons.
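A short sketch of generating random values and summarising them. The seed is an assumption added here so the draw is repeatable; it is not part of the original notes.

```r
set.seed(1)        # assumed seed, only for repeatability
y <- rnorm(100)    # 100 draws from a standard normal distribution
y                  # each printed row starts with the index of its first value, e.g. [1]
mean(y); sd(y)     # two commands on one line, separated by a semicolon
```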
Pearson correlation
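A minimal sketch of computing a Pearson correlation in R. The x and y below are simulated stand-ins (an assumption), built so that they are correlated.

```r
set.seed(2)                          # assumed seed, only for repeatability
x <- rnorm(50)
y <- 2 * x + rnorm(50)               # y depends on x plus noise

r <- cor(x, y, method = "pearson")   # "pearson" is the default method
r
cor.test(x, y)                       # also tests H0: true correlation is 0
```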
Enter commands in the script file and see the output in the R window.
Output
Conclusion: Fail to reject the null hypothesis. The data are consistent with the sample having come from a population with mu=60.
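A conclusion like this would typically come from a one-sample t-test. The sketch below assumes that is the test in use; the sample is simulated (hypothetical), so its p-value will not match the tutorial's output.

```r
set.seed(3)                                # assumed seed, only for repeatability
scores <- rnorm(25, mean = 60, sd = 8)     # hypothetical sample of 25 values

result <- t.test(scores, mu = 60)          # H0: the population mean is 60
result
result$p.value                             # compare against the chosen level, e.g. 0.05

# TRUE means we fail to reject H0 at the 5% level
result$p.value > 0.05
```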
If you see the + sign in the R console, the command is incomplete: either type the missing closing bracket ) or press ESC to cancel.
The mean is close to 0 and the standard deviation is close to 1. You may not always want these defaults, so set mean and sd yourself:
Data=rnorm(n, mean = 50, sd = 10)
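A quick check of the call above. The sample size n = 1000 and the seed are assumptions chosen for illustration.

```r
set.seed(4)                            # assumed seed, only for repeatability
n <- 1000                              # assumed sample size
Data <- rnorm(n, mean = 50, sd = 10)

mean(Data)   # should land near 50
sd(Data)     # should land near 10
```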
Scatter Plots
d1 is a data frame.
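A minimal scatter plot sketch. The contents of d1 are not shown in the notes, so the columns x and y below are hypothetical.

```r
set.seed(5)                                    # assumed seed, only for repeatability
d1 <- data.frame(x = 1:20)                     # hypothetical columns
d1$y <- 3 + 0.5 * d1$x + rnorm(20)

is.data.frame(d1)                              # TRUE
plot(d1$x, d1$y, xlab = "x", ylab = "y",
     main = "Scatter plot of y against x")
```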
Using attach
Don't use it too often. Two data frames could have a column with the same name; in that case R picks the column from the most recently attached data frame. Similarly, use detach to remove a data frame from the search path.
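The masking problem can be sketched as follows. The data frames a and b are hypothetical; with() is shown as one common alternative that names the data frame explicitly.

```r
a <- data.frame(score = c(1, 2, 3))      # hypothetical data frames
b <- data.frame(score = c(10, 20, 30))

attach(a)
attach(b)               # R reports that score from a is now masked
first_seen <- score     # 10 20 30: the column from b, attached last
detach(b)
detach(a)

with(a, mean(score))    # 2: no ambiguity about which data frame is meant
```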
We'll put the data in Excel and save it as a CSV file. First change the working directory:
Change it to the directory where you created the CSV file. Use getwd() to confirm your working directory.
read.csv("Rdata_import",header=T)
Save it in a variable
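The import steps can be sketched end to end as below. To keep it self-contained, it first writes a small stand-in file; the file name example.csv, its columns, and the temporary folder are all assumptions, not the tutorial's files.

```r
# Stand-in for a CSV file exported from Excel (hypothetical contents)
csv_dir <- tempdir()
write.csv(data.frame(id = 1:3, value = c(2.5, 3.1, 4.8)),
          file.path(csv_dir, "example.csv"), row.names = FALSE)

old_dir <- setwd(csv_dir)    # change to the folder holding the file
getwd()                      # confirm the working directory
mydata <- read.csv("example.csv", header = TRUE)   # save the result in a variable
mydata                       # type the name to view it
setwd(old_dir)               # go back to the previous directory
```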
You can use the Tab key to autocomplete: type my and press Tab, and R fills in mydata.
Import Record2.csv
The file is too big to view in full, so I just want to see the column names: names(rec).
To see the first few rows:
head(rec): shows 6 rows by default
head(rec,10): shows the first 10 rows
tail(rec): shows the last 6 rows
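A small sketch of these inspection commands. The rec below is a made-up stand-in for the imported Record2.csv data, whose real columns are not shown in the notes.

```r
rec <- data.frame(id = 1:50, value = (1:50)^2)   # hypothetical stand-in data

names(rec)       # column names only
head(rec)        # first 6 rows
head(rec, 10)    # first 10 rows
tail(rec)        # last 6 rows
```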
Conclusion: Reject the null hypothesis.
Class2
1 patient-satis.csv
a) Plot histograms of age, severity, anxiety
b) Find the mean and sd of these 3 variables
Use par(mfrow=c(3,1)) to define a plotting window of 3 rows and 1 column.
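One way to approach this exercise is sketched below. The patients data frame is simulated (hypothetical values) and only borrows the three column names from the exercise; with the real file you would read patient-satis.csv instead.

```r
set.seed(6)                                   # assumed seed, only for repeatability
patients <- data.frame(                       # hypothetical stand-in data
  age      = rnorm(40, mean = 40,  sd = 9),
  severity = rnorm(40, mean = 50,  sd = 4),
  anxiety  = rnorm(40, mean = 2.3, sd = 0.3)
)

old_par <- par(mfrow = c(3, 1))               # 3 rows, 1 column of plots
hist(patients$age,      main = "Age",      xlab = "age")
hist(patients$severity, main = "Severity", xlab = "severity")
hist(patients$anxiety,  main = "Anxiety",  xlab = "anxiety")
par(old_par)                                  # restore the previous layout

# Mean and sd of each variable, one column per variable
stats <- sapply(patients, function(v) c(mean = mean(v), sd = sd(v)))
stats
```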
After entering the data, click Close; the data is saved automatically. Then type the variable name to view it.
The data are shown above. The model is mod = lm(y ~ x1+x2), where x1 and x2 are the predictors.
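A minimal sketch of fitting this kind of model. The data are simulated here (an assumption), with known coefficients so the fit can be checked by eye.

```r
set.seed(7)                                   # assumed seed, only for repeatability
x1 <- rnorm(30)
x2 <- rnorm(30)
y  <- 1 + 2 * x1 - 0.5 * x2 + rnorm(30, sd = 0.3)   # simulated response

mod <- lm(y ~ x1 + x2)    # x1 and x2 are the predictors
summary(mod)              # coefficients, standard errors, R-squared
coef(mod)                 # estimates should be near 1, 2 and -0.5
```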
Exercise: Build a linear regression model for the Record2.csv data. Use the . notation to keep things simple.
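If the . notation here means the formula shorthand (an assumption), then y ~ . regresses y on every other column of the data frame. The sketch below uses a hypothetical rec with made-up columns, not the real Record2.csv.

```r
set.seed(8)                                   # assumed seed, only for repeatability
rec <- data.frame(a = rnorm(30), b = rnorm(30))     # hypothetical columns
rec$y <- 1 + rec$a + rec$b + rnorm(30)

rec.mod <- lm(y ~ ., data = rec)   # same model as lm(y ~ a + b, data = rec)
names(coef(rec.mod))               # "(Intercept)" "a" "b"
```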
music.mod = lm(sales~adverts+airplay+attract, data=music)
Basically we are saying that this is a model object for our music dataset. data=music saves typing, because the columns can then be named without the music$ prefix. Use the .mod suffix in the name to keep bookkeeping easy.
Plot the residuals.
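A sketch of one common residual plot, residuals against fitted values. The music data frame below is simulated (hypothetical numbers); only the column names come from the notes.

```r
set.seed(9)                                   # assumed seed, only for repeatability
music <- data.frame(                          # hypothetical stand-in data
  adverts = runif(60, 0, 2000),
  airplay = rpois(60, 25),
  attract = sample(1:10, 60, replace = TRUE)
)
music$sales <- 50 + 0.08 * music$adverts + 3 * music$airplay +
  10 * music$attract + rnorm(60, sd = 40)

music.mod <- lm(sales ~ adverts + airplay + attract, data = music)

plot(fitted(music.mod), resid(music.mod),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)    # residuals should scatter evenly around this line
```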
Now we find that vac_rate isn't a good predictor, so we run the model again without it.
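Dropping a predictor can be sketched with update(). The dataset behind vac_rate is not shown in the notes, so the housing data frame and the names income and price are hypothetical.

```r
set.seed(10)                                  # assumed seed, only for repeatability
housing <- data.frame(                        # hypothetical stand-in data
  income   = rnorm(40, mean = 50, sd = 10),
  vac_rate = runif(40, 0, 10)
)
housing$price <- 20 + 3 * housing$income + rnorm(40, sd = 5)  # vac_rate plays no role

full.mod <- lm(price ~ income + vac_rate, data = housing)
summary(full.mod)$coefficients     # inspect the p-value in the vac_rate row

reduced.mod <- update(full.mod, . ~ . - vac_rate)   # refit without vac_rate
names(coef(reduced.mod))           # "(Intercept)" "income"
```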
Now make predictions. Use predict(name of the model, data frame which contains the new data).
Work on forest.csv
Predict
popdens    400  500  600  700
cropch      30   35   35    5
pasturech   10   12   15    5
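A sketch of predict() with the new values listed in this exercise. The training data are simulated and the response name loss is hypothetical, because the contents of forest.csv are not shown; with the real file you would fit the model on the imported data.

```r
set.seed(11)                                  # assumed seed, only for repeatability
forest <- data.frame(                         # hypothetical stand-in for forest.csv
  popdens   = runif(30, 300, 800),
  cropch    = runif(30, 0, 40),
  pasturech = runif(30, 0, 20)
)
forest$loss <- 5 + 0.01 * forest$popdens + 0.2 * forest$cropch +
  0.3 * forest$pasturech + rnorm(30)          # hypothetical response

forest.mod <- lm(loss ~ popdens + cropch + pasturech, data = forest)

# New data: column names must match the predictors in the model
new_data <- data.frame(
  popdens   = c(400, 500, 600, 700),
  cropch    = c(30, 35, 35, 5),
  pasturech = c(10, 12, 15, 5)
)
predict(forest.mod, newdata = new_data)       # one prediction per row of new_data
```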
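Install Packages
Install the foreign package, which lets you import data files written by other statistical software such as SPSS, Stata and SAS. (It does not read Excel workbooks; export those to CSV first.)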
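A minimal sketch of installing and loading the package. The check with requireNamespace() is an addition here so the install step only runs when needed; install.packages() downloads from CRAN and needs a network connection.

```r
# Install only if the package is not already available
if (!requireNamespace("foreign", quietly = TRUE)) {
  install.packages("foreign")
}

library(foreign)     # load the package for this session
search()             # "package:foreign" now appears in the search path
```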