Posted by CANbike on Sat, 10 Aug 2013

Yahoo! Finance Key Statistics: Download to a CSV File

Yahoo! Finance Key Statistics provides a wealth of financial information for publicly traded companies. It’s a free and useful statistical resource for investors researching a company’s stock.

The following is my solution to downloading key statistics to a CSV file.

No Key Statistics Download?

Yahoo! Finance officially supports downloading stock data to a CSV (Comma-Separated Values) file. The download link can be found on a stock’s Summary page under the Toolbox header. There is also an unofficial list of tags for selecting which stock data (price, P/E, etc.) is downloaded.

Unfortunately, this feature is not available on a stock’s Key Statistics page. Moreover, the unofficial list of tags covers only some of the key statistics data. See Yahoo! Finance: URL Download to a CSV File for details.

As a result, the most feasible solution for personal use (the one requiring the least work) was to write a script that scrapes the information from a stock’s Key Statistics web page. A major disadvantage, though, is that the script must be rewritten whenever Yahoo! changes its finance web pages. However, that has not happened for a while.
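
To make the scraping idea concrete, here is a minimal sketch of pulling one label/value pair out of a table row. The class names yfnc_tablehead1 and yfnc_tabledata1 come from the sed patterns in the script further down; the sample row and the helper name extract_stat are made up for illustration, and the script itself uses a longer chain of sed commands rather than this single expression.

```shell
#!/bin/sh
# Hypothetical sample row, shaped like the markup the script's patterns target.
row='<tr><td class="yfnc_tablehead1" width="74%">Beta:</td><td class="yfnc_tabledata1">0.95</td></tr>'

# extract_stat (hypothetical helper): print "Label: value" for one table row.
# Lines that do not match the pattern print nothing (sed -n ... p).
extract_stat() {
	printf '%s\n' "$1" | sed -n 's/.*class="yfnc_tablehead1"[^>]*>\([^<]*\)<\/td><td class="yfnc_tabledata1">\([^<]*\)<.*/\1 \2/p'
}

extract_stat "$row"
```

Because the pattern is tied to specific class names, any change to the page markup silently produces no output, which is the maintenance cost mentioned above.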

Nonetheless, for lack of other options, a script named getKeystats was written. The script is provided as is, and obtains its data from Yahoo! Finance.


About getKeystats

A Yahoo! Finance Key Statistics URL is of the form http://finance.yahoo.com/q/ks?s=[TICKER SYMBOL]+Key+Statistics, where [TICKER SYMBOL] is a variable.
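
As a small illustration of the URL form, the sketch below substitutes a ticker into the pattern. The helper name ks_url is hypothetical; only the URL pattern itself comes from this post, and the sketch does not check that the page exists.

```shell
#!/bin/sh
# ks_url (hypothetical helper): build the Key Statistics URL for one ticker.
ks_url() {
	printf 'http://finance.yahoo.com/q/ks?s=%s+Key+Statistics\n' "$1"
}

ks_url MSFT
# prints: http://finance.yahoo.com/q/ks?s=MSFT+Key+Statistics
```

When passing such a URL to a downloader on the command line, it should be quoted, since the shell can treat the ? as a filename wildcard.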

The script getKeystats parses the key statistics and writes them to a CSV file, a text file, and a master spreadsheet file.

Output Structure:

  • Data is saved into a directory Stock Data. If the directory does not exist, then it is created.
./Stock\ Data/
 |
 |-csv/
 |  |-[TICKER SYMBOL].csv
 |
 |-txt/
 |  |-[TICKER SYMBOL].txt
 |
 |-ALL.csv

where
[TICKER SYMBOL].csv - contains the key statistics in a horizontal layout, separated by commas
[TICKER SYMBOL].txt - contains the key statistics in a vertical layout with labels
ALL.csv - contains key statistics for all downloaded ticker symbols

  • When the script getKeystats is run, new data is added to ALL.csv. Duplicate/outdated data is deleted.
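
The update rule for ALL.csv (drop the old row for a ticker, then append the fresh one) can be sketched as follows. This is a minimal sketch under stated assumptions, not the script's own code: update_master is a hypothetical helper, it assumes each row starts with the ticker followed by a comma, and it filters with awk on the first field instead of editing the file in place with sed.

```shell
#!/bin/sh
# update_master (hypothetical helper): replace or add one ticker's row.
#   $1 = path to the master CSV, $2 = ticker, $3 = new row ("TICKER,...")
update_master() {
	master=$1
	ticker=$2
	row=$3
	# Create the parent directory and the file if they do not exist yet.
	mkdir -p "$(dirname "$master")"
	touch "$master"
	# Keep every row whose first field is not exactly this ticker.
	awk -F, -v t="$ticker" '$1 != t' "$master" > "$master.tmp"
	# Append the fresh row and swap the temporary file into place.
	printf '%s\n' "$row" >> "$master.tmp"
	mv "$master.tmp" "$master"
}
```

Matching on the whole first field means that updating F leaves rows for FB or FSLR alone, and the header row (which starts with "Ticker") is never removed.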

Usage:

  • Command line usage is simple. Enter a ticker symbol, or multiple ticker symbols separated by a space.
usage: getKeystats [ticker symbol]
Separate multiple tickers with a space.
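
Before looking anything up, the script uppercases each argument and strips commas, so input such as "aapl, msft" still works. A minimal sketch of that normalization, using a hypothetical helper name normalize_ticker and tr character classes in place of the script's tr/sed pair:

```shell
#!/bin/sh
# normalize_ticker (hypothetical helper): uppercase a ticker and drop commas.
normalize_ticker() {
	printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]' | tr -d ','
}

# Normalize every argument, one ticker per output line.
for arg in aapl, msft; do
	normalize_ticker "$arg"
done
# prints:
# AAPL
# MSFT
```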

Script: getKeystats

#!/bin/bash
# (bash rather than sh: the script uses the "function" keyword and "&>")

# Variables
#
FOLDER=Stock\ Data
FILE=ALL.csv	# Spreadsheet

function getData(){
	# Date & Time
	DATETIME=$(date)

	if [ -n "$TICKER" ]
	then
		#==================== Get Data From Yahoo! Finance ========================
		#
		# Download key statistics from web
		# (URL quoted so the shell does not treat "?" as a wildcard)
		wget "http://finance.yahoo.com/q/ks?s=$TICKER+Key+Statistics" -O out.txt -nv &> /dev/null

		# Add newlines
		sed -i "s/<\/tr\>/\\`echo -e '\n\r'`/g" out.txt
		sed -i "s/<\/span\>/\\`echo -e '\n\r'`/g" out.txt

		# Adjust ":" for later replacement with ","
		sed -i "s/:<\/td>/: <\/td>/g" out.txt

		# Clean key statistics
		sed -i "s/><tr><td class=\"yfnc_tablehead1\" width=\"74%\">//g" out.txt
		sed -i "s/<\/td><td class=\"yfnc_tabledata1\">//g" out.txt
		sed -i "s/<\/td>//g" out.txt
		sed -i "s/<font size=\"-1\"><sup>[0-9]<\/sup><\/font>//g" out.txt
		sed -i "s/.*yfs_l84.*\">/Quote: /g" out.txt

		# Delete non key statistics
		sed -i "s/.*#yfi.*//g" out.txt
		sed -i 's/^[ \t].*//' out.txt
		sed -i "s/>.*//g" out.txt
		sed -i "s/<.*//g" out.txt
		sed -i "s/.*}.*//g" out.txt
		sed -i "s/.*{.*//g" out.txt
		sed -i "s/.*\*.*//g" out.txt

		# Delete empty lines
		sed -i "s/\r//g" out.txt
		sed -i '/^$/d' out.txt
		#--------------------------------------------------------------------------

		#==================== Parse Quote ==========================================
		#
		# Parse share price
		QUOTE=$(grep -Po 'Quote.*' out.txt | sed "s/Quote://g")
		#--------------------------------------------------------------------------

		#==================== Check Quote ==========================================
		#
		# Check if quote exists
		if [ -n "$QUOTE" ]
		then
			#==================== Output Data =========================================
			#
			# Output results to screen
			echo "Ticker $TICKER found"
			echo ""

			# Create "Stock Data" and its txt/ and csv/ subdirectories if they do not exist
			mkdir -p "$FOLDER/txt" "$FOLDER/csv"

			# Output key statistics to files "<Ticker Symbol>.csv" "<Ticker Symbol>.txt"
			# (QUOTE was already checked above, so no second test is needed)

			# Add ticker symbol and timestamp
			sed -i "1i Date and Time: $DATETIME" out.txt
			sed -i "1i Ticker: $TICKER" out.txt

			# Output to a text file (paths are quoted because "Stock Data" contains a space)
			cat out.txt > "$FOLDER/txt/$TICKER.txt"

			# Output to a csv file
			sed -i "s/,//g" out.txt
			sed -i "s/.*: //" out.txt
			tr '\n' ',' < out.txt > "$FOLDER/csv/$TICKER.csv"

			# Output to CSV list
			if [ -f "$FOLDER/$FILE" ]
			then
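				# Delete only the row whose first field is this ticker, so that a
				# short symbol such as "F" or "T" cannot wipe other rows or the header
				sed -i "/^$TICKER,/d" "$FOLDER/$FILE"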
				sed -i '/^$/d' "$FOLDER/$FILE"
				cat "$FOLDER/csv/$TICKER.csv" >> "$FOLDER/$FILE"
				echo "" >> "$FOLDER/$FILE"
			else
				echo "Ticker,Date and Time,Quote,Enterprise Value,Trailing P/E (ttm intraday),Forward P/E,PEG Ratio (5 yr expected),Price/Sales (ttm),Price/Book (mrq),Enterprise Value/Revenue (ttm),Enterprise Value/EBITDA (ttm),Fiscal Year Ends,Most Recent Quarter (mrq),Profit Margin (ttm),Operating Margin (ttm),Return on Assets (ttm),Return on Equity (ttm),Revenue (ttm),Revenue Per Share (ttm),Qtrly Revenue Growth (yoy),Gross Profit (ttm),EBITDA (ttm),Net Income Avl to Common (ttm),Diluted EPS (ttm),Qtrly Earnings Growth (yoy),Total Cash (mrq),Total Cash Per Share (mrq),Total Debt (mrq),Total Debt/Equity (mrq),Current Ratio (mrq),Book Value Per Share (mrq),Operating Cash Flow (ttm),Levered Free Cash Flow (ttm),Beta,52-Week Change,S&P500 52-Week Change,52-Week High,52-Week Low,50-Day Moving Average,200-Day Moving Average,Avg Vol (3 month),Avg Vol (10 day),Shares Outstanding,Float,% Held by Insiders,% Held by Institutions,Shares Short,Short Ratio,Short % of Float,Shares Short (prior month),Forward Annual Dividend Rate,Forward Annual Dividend Yield,Trailing Annual Dividend Yield,Trailing Annual Dividend Yield,5 Year Average Dividend Yield,Payout Ratio,Dividend Date,Ex-Dividend Date,Last Split Factor (new per old),Last Split Date," > "$FOLDER/$FILE"
				cat "$FOLDER/csv/$TICKER.csv" >> "$FOLDER/$FILE"
				echo "" >> "$FOLDER/$FILE"
			fi
			#--------------------------------------------------------------------------
		else
			echo "Ticker \""$TICKER"\" not found"
			echo ""
		fi

		# Remove temp files
		rm out.txt
		#--------------------------------------------------------------------------
	else
		# Missing argument. Output instructions.
		echo "usage: getKeystats [ticker symbol]"
		echo ""
	fi
}

if [ -z "$1" ]
then
	echo
	echo "usage: getKeystats [ticker symbol]"
	echo "Separate multiple tickers with a space."
	echo ""
else
	for var in "$@"
	do
		# Convert argument to uppercase
		TICKER=$(echo "$var" | tr "[a-z]" "[A-Z]")
		# Remove comma
		TICKER=$(echo "$TICKER" | sed -e 's/,//g')

		if [ -n "$TICKER" ]
		then
			getData
		fi
	done
fi
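
The step that turns the vertical "Label: value" text into one horizontal CSV row can be tried on its own. The sketch below is an illustration, not the script's exact code: to_csv_row is a hypothetical helper, the sample values are made up, and it joins lines with paste instead of tr, so the row has no trailing comma (the script's rows end with one).

```shell
#!/bin/sh
# to_csv_row (hypothetical helper): convert a "Label: value" file to one CSV row.
#   1. remove commas inside values so they cannot split a field
#   2. strip everything up to and including the last ": " on each line
#   3. join the remaining lines with commas
to_csv_row() {
	sed -e 's/,//g' -e 's/.*: //' "$1" | paste -sd, -
}

# Made-up sample input in the layout of the .txt output.
sample=$(mktemp)
printf 'Ticker: AAPL\nEnterprise Value: 1,234B\nBeta: 0.9\n' > "$sample"
to_csv_row "$sample"
# prints: AAPL,1234B,0.9
```

Stripping the commas first matters because values such as share counts contain thousands separators, which would otherwise shift every following column.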
