finds.structured.crsp

CRSP daily and monthly stock files

MIT License

class finds.structured.crsp.CRSP(sql: SQL, bd: BusDay, rdb: RedisDB | None = None, monthly: bool | None = None, verbose: int = 1)

Bases: Stocks

Implements an interface to CRSP structured stocks dataset

Parameters:

sql – Connection to MySQL database
bd – Business dates object
rdb – Optional connection to Redis for caching selected query results
monthly – Use monthly (True) or daily file; default (None) autoselects

Notes:

Earliest CRSP prc date is 19251231

build_lookup(source: str, target: str, date_field='date', dataset: str = 'names', fillna: Any = 0) → Any: Build lookup function to return target identifier from source
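
A minimal sketch of what such a lookup could look like, using plain Python rather than the library. The helper name `make_lookup`, the sample rows, and the as-of rule (use the latest record dated on or before the query date) are assumptions for illustration, not the library's implementation.

```python
from bisect import bisect_right
from collections import defaultdict

def make_lookup(rows, source, target, date_field="date", fillna=0):
    """Return a function mapping (source value, date) to the target value
    of the latest record dated on or before `date` (assumed as-of rule)."""
    history = defaultdict(list)
    for row in sorted(rows, key=lambda r: r[date_field]):
        history[row[source]].append((row[date_field], row[target]))

    def lookup(key, date):
        records = history.get(key, [])
        dates = [d for d, _ in records]
        i = bisect_right(dates, date)      # number of records dated <= date
        return records[i - 1][1] if i else fillna

    return lookup

# Hypothetical rows standing in for a 'names' table
names = [
    {"permno": 10001, "date": 19860109, "ticker": "AAA"},
    {"permno": 10001, "date": 19931122, "ticker": "BBB"},
]
to_ticker = make_lookup(names, source="permno", target="ticker")
print(to_ticker(10001, 19900101))   # AAA
print(to_ticker(10001, 20000101))   # BBB
print(to_ticker(99999, 20000101))   # 0 (the fillna value)
```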

cache_ret(dates: List[Tuple[int, int]], replace: bool, dataset: str = 'daily', field: str = 'ret', date_field: str = 'date'): Pre-generate compounded returns from the daily dataset for the Redis store
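
The idea of pre-generating compounded returns for a list of (beg, end) ranges can be sketched as follows. A plain dict stands in for the Redis store, and the function name `cache_returns`, the key format, and the handling of `replace` (skip ranges already stored unless it is True) are assumptions.

```python
from math import prod

def compound(returns):
    """Compound simple returns: prod(1 + r) - 1."""
    return prod(1.0 + r for r in returns) - 1.0

def cache_returns(daily, date_ranges, store, replace=False):
    """Pre-compute compounded returns per permno for each (beg, end) range.

    daily: {permno: {date: ret}}; store: a dict standing in for Redis.
    """
    for beg, end in date_ranges:
        key = f"ret_{beg}_{end}"            # hypothetical key format
        if key in store and not replace:
            continue                         # keep the existing entry
        store[key] = {
            permno: compound(r for d, r in sorted(rets.items()) if beg <= d <= end)
            for permno, rets in daily.items()
            if any(beg <= d <= end for d in rets)
        }
    return store

# Hypothetical daily returns for one permno
daily = {10001: {20200102: 0.01, 20200103: -0.02, 20200106: 0.03}}
store = cache_returns(daily, [(20200102, 20200103)], {})
```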

get_cap(date: int, cache_mode: str = 'rw', use_shares: bool = False, use_permco: bool = False) → Series

Compute a cross-section of market capitalization values

Parameters:

date – YYYYMMDD int date of market cap
cache_mode – ‘r’ to try reading from the cache first, ‘w’ to write the result to the cache (default ‘rw’ does both)
use_shares – If True, use shrout from ‘shares’ table, else ‘daily’
use_permco – If True, sum caps by permco, else by permno

Returns:

Series of market cap indexed by permno
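
A rough sketch of the calculation, assuming cap is computed as |prc| * shrout (CRSP records a negative prc when the price is a bid/ask average). The function name `market_caps`, the sample rows, and the choice to report the company total for every permno when summing by permco are assumptions, not the library's code.

```python
from collections import defaultdict

def market_caps(rows, use_permco=False):
    """Cap per permno as |prc| * shrout; optionally sum over each permco."""
    caps = {r["permno"]: abs(r["prc"]) * r["shrout"] for r in rows}
    if not use_permco:
        return caps
    by_permco = defaultdict(float)
    for r in rows:
        by_permco[r["permco"]] += caps[r["permno"]]
    # Assumption: every permno of a company reports the company total
    return {r["permno"]: by_permco[r["permco"]] for r in rows}

# Hypothetical cross-section: permnos 1 and 2 share permco 100
rows = [
    {"permno": 1, "permco": 100, "prc": 20.0, "shrout": 1000},
    {"permno": 2, "permco": 100, "prc": -5.0, "shrout": 400},
    {"permno": 3, "permco": 200, "prc": 10.0, "shrout": 50},
]
print(market_caps(rows))                   # {1: 20000.0, 2: 2000.0, 3: 500.0}
print(market_caps(rows, use_permco=True))  # {1: 22000.0, 2: 22000.0, 3: 500.0}
```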

get_divamt(beg: int, end: int) → DataFrame

Accumulates total dollar dividends between beg and end dates

Parameters:

beg – Inclusive start date (YYYYMMDD)
end – Inclusive end date (YYYYMMDD)

Returns:

DataFrame with accumulated divamts = per share divamt * shrout
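
The accumulation described above can be sketched in plain Python. The function name `total_dividends`, the tuple layout of the distribution records, and the shrout lookup keyed by (permno, date) are assumptions for illustration.

```python
from collections import defaultdict

def total_dividends(dist, shrout, beg, end):
    """Sum divamt * shrout per permno for dates in [beg, end] (inclusive)."""
    totals = defaultdict(float)
    for permno, date, divamt in dist:
        if beg <= date <= end:
            totals[permno] += divamt * shrout[(permno, date)]
    return dict(totals)

# Hypothetical distributions: (permno, date, per-share divamt)
dist = [(1, 20200315, 0.5), (1, 20200615, 0.5), (1, 20210315, 0.6)]
shrout = {(1, 20200315): 1000, (1, 20200615): 1100, (1, 20210315): 1100}
print(total_dividends(dist, shrout, 20200101, 20201231))   # {1: 1050.0}
```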

get_dlret(beg: int, end: int, dataset: str = 'delist') → Series

Compounded delisting returns from beg to end dates for all permnos

Parameters:

beg – Inclusive start date (YYYYMMDD)
end – Inclusive end date (YYYYMMDD)
dataset – either ‘delist’ or ‘monthly’ containing delisting returns

Returns:

Series of delisting returns

Notes:

Sets the delisting return to -0.3 if it is missing and the delisting code is in [500, 520, 551…574, 580, 584]
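
The fill rule in the note above can be sketched as a small helper. The names `FILL_CODES` and `fill_dlret`, and the use of None to mark a missing return, are assumptions; the code set itself is the one listed under dlstcodes_.

```python
# Codes from the note above: 500, 520, 551 through 574, 580, 584
FILL_CODES = {500, 520, 580, 584} | set(range(551, 575))

def fill_dlret(dlret, dlstcd, fill=-0.3):
    """Return dlret, or `fill` when dlret is missing and the code qualifies."""
    if dlret is None and dlstcd in FILL_CODES:
        return fill
    return dlret

print(fill_dlret(None, 574))    # -0.3
print(fill_dlret(None, 575))    # None (code not in the set, stays missing)
print(fill_dlret(-0.5, 500))    # -0.5 (reported return is kept)
```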

get_ret(beg: int, end: int, dataset: str = 'daily', field: str = 'ret', **kwargs) → Series

Get compounded returns, with option to include delist returns

Parameters:

beg – starting returns date
end – ending returns date
dataset – name of returns dataset (ignored if initialized as ‘monthly’)
field – Name of returns field in dataset, in {‘ret’, ‘retx’}
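
As an illustration of compounding with an optional delisting return, the sketch below chains the delisting return onto the compounded period return. The function name `total_return` and the chaining formula (1 + ret)(1 + dlret) - 1 are assumptions about how the two might be combined, not a statement of the library's method.

```python
from math import prod

def total_return(rets, dlret=None):
    """Compound period returns, then chain an optional delisting return."""
    total = prod(1.0 + r for r in rets) - 1.0
    if dlret is not None:
        total = (1.0 + total) * (1.0 + dlret) - 1.0
    return total

print(round(total_return([0.10, -0.05]), 6))          # 0.045
print(round(total_return([0.10, -0.05], -0.3), 6))    # -0.2685
```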

get_universe(date: int, cache_mode: str = 'rw') → DataFrame

Return standard CRSP universe of US-domiciled common stocks

Parameters:

date – Rebalance date (YYYYMMDD)
cache_mode – ‘r’ to try reading from the cache first, ‘w’ to write the result to the cache (default ‘rw’ does both)

Returns:

DataFrame of screened universe, indexed by permno, with columns market cap “decile” (1..10), “nyse” bool, “siccd”, “prc”, “cap”

Notes:

Market cap must be available on date, with non-missing prc
shrcd in [10, 11], exchcd in [1, 2, 3]
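
The screens in the notes above can be sketched as a filter over hypothetical rows. The function name `screen_universe` and the row layout are assumptions; the decile column is left out because the breakpoint rule is not stated here. The "nyse" flag relies on CRSP exchcd 1 denoting the NYSE.

```python
def screen_universe(rows):
    """Keep rows with a cap and price, shrcd in {10, 11}, exchcd in {1, 2, 3}."""
    out = {}
    for r in rows:
        if r["cap"] is None or r["prc"] is None:
            continue                          # cap and prc must be available
        if r["shrcd"] in (10, 11) and r["exchcd"] in (1, 2, 3):
            out[r["permno"]] = {
                "nyse": r["exchcd"] == 1,     # CRSP exchcd 1 is NYSE
                "siccd": r["siccd"], "prc": r["prc"], "cap": r["cap"],
            }
    return out

# Hypothetical cross-section on one date
rows = [
    {"permno": 1, "shrcd": 11, "exchcd": 1, "siccd": 3571, "prc": 50.0, "cap": 5e6},
    {"permno": 2, "shrcd": 10, "exchcd": 3, "siccd": 7372, "prc": 12.0, "cap": 8e5},
    {"permno": 3, "shrcd": 31, "exchcd": 1, "siccd": 6021, "prc": 30.0, "cap": 2e6},
    {"permno": 4, "shrcd": 11, "exchcd": 2, "siccd": 1311, "prc": None, "cap": None},
]
universe = screen_universe(rows)
print(sorted(universe))   # [1, 2]
```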

classmethod is_dlstcode(dlstcd: Series | int) → Series | int: Identify delisting codes for which a missing delisting return should be set to -0.3

dlstcodes_ = {500, 520, 551, 552, 553, 554, 555, 556, 557, 558, 559, 560, 561, 562, 563, 564, 565, 566, 567, 568, 569, 570, 571, 572, 573, 574, 580, 584}

class finds.structured.crsp.CRSPBuffer(stocks: Stocks, beg: int, end: int, fields: List[str], dataset: str)

Bases: StocksBuffer

Cache returns in memory and provide a Stocks-like interface

get_ret(beg: int, end: int, field: str = 'ret') → Series

Return compounded stock returns between beg and end dates

Parameters:

beg – Begin date to compound returns
end – End date (inclusive) to compound returns
field – Name of returns field in dataset, in {‘ret’, ‘retx’}

get_universe(date: int, cache_mode: str = 'rw') → DataFrame: Pass through to the original Stocks method to retrieve the universe
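
The buffering idea can be sketched as a small class that loads returns once and compounds them from memory on each request. The class name `ReturnsBuffer`, the (permno, date, ret) row layout, and treating both beg and end as inclusive are assumptions for illustration, not the library's implementation.

```python
from math import prod

class ReturnsBuffer:
    """Hold returns in memory and compound them on request (illustrative)."""

    def __init__(self, rows):
        # rows: iterable of (permno, date, ret), loaded once up front
        self.data = {}
        for permno, date, ret in rows:
            self.data.setdefault(permno, {})[date] = ret

    def get_ret(self, beg, end):
        """Compounded return per permno over dates in [beg, end]."""
        out = {}
        for permno, rets in self.data.items():
            window = [r for d, r in rets.items() if beg <= d <= end]
            if window:                        # skip permnos with no returns
                out[permno] = prod(1.0 + r for r in window) - 1.0
        return out

# Hypothetical daily returns for two permnos
buf = ReturnsBuffer([(1, 20200102, 0.1), (1, 20200103, 0.1), (2, 20200103, -0.5)])
result = buf.get_ret(20200102, 20200103)
```

Repeated calls over different windows then reuse the in-memory data instead of querying the database again.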