0% found this document useful (0 votes)
284 views

R - How To Do Web Scraping of Key Stats From FinViz Tables Per Stock - Stack Overflow

1. The user wants to scrape key stats from FinViz tables for individual stocks into a tidy format. 2. The expert recommends using the rvest package to scrape the data, which retrieves the relevant table as a data frame. 3. Some data wrangling is required to reshape the data into a long format with one row per stock-variable pair. This includes using stack, cbind, and as.data.frame functions. 4. The final output is a tidy data frame with one column for the stock and another for the variable value.

Uploaded by

dopiorototo
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
284 views

R - How To Do Web Scraping of Key Stats From FinViz Tables Per Stock - Stack Overflow

1. The user wants to scrape key stats from FinViz tables for individual stocks into a tidy format. 2. The expert recommends using the rvest package to scrape the data, which retrieves the relevant table as a data frame. 3. Some data wrangling is required to reshape the data into a long format with one row per stock-variable pair. This includes using stack, cbind, and as.data.frame functions. 4. The final output is a tidy data frame with one column for the stock and another for the variable value.

Uploaded by

dopiorototo
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 3

4/28/2020 r - How to do web scraping of key stats from FinViz tables per stock?

- Stack Overflow

How to do web scraping of key stats from FinViz tables per stock?
Asked 2 years, 1 month ago Active 1 year, 6 months ago Viewed 416 times

Is anyone experienced in scraping data with R?

I would like to extract the corresponding data for given stocks. I did this in the following way:
0
library(XML)

stocks <- c("AAPL","MSFT")

for (s in stocks) {
url <- paste0("http://finviz.com/quote.ashx?t=", s)
webpage <- readLines(url)
html <- htmlTreeParse(webpage, useInternalNodes = TRUE, asText = TRUE)
tableNodes <- getNodeSet(html, "//table")

# ASSIGN TO STOCK NAMED DFS


assign(s, readHTMLTable(tableNodes[[9]],
header= c("data1", "data2", "data3", "data4", "data5",
"data6",
"data7", "data8", "data9", "data10", "data11",
"data12")))

# ADD COLUMN TO IDENTIFY STOCK


df <- get(s)
df['stock'] <- s
assign(s, df)
}

# COMBINE ALL STOCK DATA


stockdatalist <- cbind(mget(stocks))
stockdata <- do.call(rbind, stockdatalist)
# MOVE STOCK ID TO FIRST COLUMN
stockdata <- stockdata[, c(ncol(stockdata), 1:ncol(stockdata)-1)]

The problem, however, is that I have obtained it in the wrong format:

stock data1 data2 data3 data4 data5 data6 data7 data8


data9 data10 data11 data12
1 AAPL Index DJIA S&P500 P/E 16.13 EPS (ttm) 10.22 Insider Own 0.06%
Shs Outstand 5.09B Perf Week -7.35%
2 AAPL Market Cap 839.87B Forward P/E 12.50 EPS next Y 13.20 Insider Trans -7.80%
Shs Float 5.07B Perf Month -4.38%
3 AAPL Income 53.13B PEG 1.38 EPS next Q 2.71 Inst Own 63.20%
Short Float 1.16% Perf Quarter -5.40%
4 AAPL Sales 239.18B P/S 3.51 EPS this Y 10.80% Inst Trans 0.98%
Short Ratio 1.60 Perf Half Y 7.53%
5 AAPL Book/sh 27.42 P/B 6.02 EPS next Y 14.97% ROA 13.80%
Target Price 192.54 Perf Year 17.05%
6 AAPL Cash/sh 15.15 P/C 10.89 EPS next 5Y 11.68% ROE 37.40%
52W Range 138.62 - 183.50 Perf YTD -2.54%

What I would like to do is that a certain stock only appears once in the row name and that the
datanames are then displayed as column names where the columns then contain the
corresponding numbers...

r site,
By using our web-scraping rvest that
you acknowledge quantmod quandland understand our Cookie Policy, Privacy Policy, and
you have read
our Terms of Service.

https://stackoverflow.com/questions/49465312/how-to-do-web-scraping-of-key-stats-from-finviz-tables-per-stock 1/3
4/28/2020 r - How to do web scraping of key stats from FinViz tables per stock? - Stack Overflow

edited Mar 25 '18 at 13:05 asked Mar 24 '18 at 13:36


Jaap Frank
65.6k 25 145 159 11 5

Your question is similar to (although not an exact duplicate of) stackoverflow.com/questions/40490717/


… -- check over there and see if that helps. Advice: use rvest for scraping. – mysteRious Mar 24 '18
at 13:57

@mysteRious , I did see the particular post before posting, however it did not help me further with my
problem... :( – Frank Mar 24 '18 at 14:08

Oh, I finally see what it is you're trying to do. Will work on it. – mysteRious Mar 24 '18 at 18:38

@mysteRious The goal is to get it in some kind of standard wide format.. in order to make the data
analysis easier.. – Frank Mar 24 '18 at 18:44

1 Answer Active Oldest Votes

This is not super elegant but it gets the job done. It relies on the knowledge that the data
you're interested in is in the 6th table on this page:
1
library(rvest)
url <- read_html("http://finviz.com/quote.ashx?t=AAPL")
tables <- html_nodes(url,"table")
scraped <- tables %>% html_nodes("table") %>% .[6] %>%
html_table(fill=TRUE) %>% data.frame()

This yields data that looks like:

> str(scraped)
'data.frame': 12 obs. of 12 variables:
$ X1 : chr "Index" "Market Cap" "Income" "Sales" ...
$ X2 : chr "DJIA S&P500" "839.87B" "53.13B" "239.18B" ...
$ X3 : chr "P/E" "Forward P/E" "PEG" "P/S" ...
$ X4 : chr "16.13" "12.50" "1.38" "3.51" ...
... the rest is truncated

You can get the format you want using stack :

new_df <- as.data.frame(scraped[, seq(from=2, to=ncol(scraped), by=2)])


new_labels <- as.data.frame(scraped[, seq(from=1, to=ncol(scraped), by=2)])
results_df <- as.data.frame(cbind(stack(new_labels),stack(new_df))[,c(1,3)])
names(results_df) <- c("stock","AAPL")

This provides a results_df that looks like this, which contains all 72 variable/value pairs:

> head(results_df)
Variable Value
1 Index DJIA S&P500
2 Market Cap 839.87B
3 Income 53.13B
4 Sales 239.18B
5 Book/sh 27.42
6 Cash/sh 15.15

By using our site, you acknowledge that you have read and understand our Cookie Policy, Privacy Policy, and
Then you can transpose it and add stock labels with
our Terms of Service.
as.data.frame(t(rbind(c("stock","AAPL"),results_df))) .
https://stackoverflow.com/questions/49465312/how-to-do-web-scraping-of-key-stats-from-finviz-tables-per-stock 2/3
4/28/2020 r - How to do web scraping of key stats from FinViz tables per stock? - Stack Overflow

edited Mar 24 '18 at 19:16 answered Mar 24 '18 at 19:07


mysteRious
2,788 2 8 22

By using our site, you acknowledge that you have read and understand our Cookie Policy, Privacy Policy, and
our Terms of Service.

https://stackoverflow.com/questions/49465312/how-to-do-web-scraping-of-key-stats-from-finviz-tables-per-stock 3/3

You might also like