finds.structured.crsp
CRSP daily and monthly stock files
Copyright 2022-2024, Terence Lim
MIT License
- class finds.structured.crsp.CRSP(sql: SQL, bd: BusDay, rdb: RedisDB | None = None, monthly: bool | None = None, verbose: int = 1)
Bases: Stocks
Implements an interface to CRSP structured stocks dataset
- Parameters:
sql – Connection to MySQL database
bd – Business dates object
rdb – Optional connection to Redis for caching selected query results
monthly – Use the monthly file (True) or the daily file (False); default (None) selects automatically
Notes:
Earliest CRSP prc is 19251231
- build_lookup(source: str, target: str, date_field='date', dataset: str = 'names', fillna: Any = 0) -> Any
Build lookup function to return target identifier from source
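To illustrate the idea of a lookup function, here is a minimal sketch in plain Python. It assumes the lookup returns the target value from the most recent record dated on or before the query date, and falls back to `fillna` when nothing matches. The name `build_lookup_sketch`, the `(source, date, target)` row layout, and the sample identifiers are all hypothetical; the library's actual behaviour may differ.

```python
from bisect import bisect_right
from typing import Any, Callable, Dict, List, Tuple


def build_lookup_sketch(rows: List[Tuple[Any, int, Any]],
                        fillna: Any = 0) -> Callable[..., Any]:
    """Return a function mapping (source, date) to the target in effect then.

    rows holds hypothetical (source, date, target) records, e.g. drawn
    from a names table.
    """
    table: Dict[Any, Tuple[List[int], List[Any]]] = {}
    # Sort by date so each source's records are stored chronologically
    for source, date, target in sorted(rows, key=lambda r: r[1]):
        dates, targets = table.setdefault(source, ([], []))
        dates.append(date)
        targets.append(target)

    def lookup(source: Any, date: int = 99999999) -> Any:
        if source not in table:
            return fillna
        dates, targets = table[source]
        i = bisect_right(dates, date)  # count of records dated <= date
        return targets[i - 1] if i else fillna

    return lookup


# Hypothetical records: identifier 1 changes its ticker in 1995
rows = [(1, 19950601, "BBB"), (1, 19860102, "AAA"), (2, 19900101, "CCC")]
lookup = build_lookup_sketch(rows)
```

Calling `lookup(1, 19900101)` returns the ticker in effect on that date, while an unknown source or a date before the first record returns `fillna`.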
- cache_ret(dates: List[Tuple[int, int]], replace: bool, dataset: str = 'daily', field: str = 'ret', date_field: str = 'date')
Pre-generate compounded returns from the daily file and store them in Redis
- get_cap(date: int, cache_mode: str = 'rw', use_shares: bool = False, use_permco: bool = False) -> Series
Compute a cross-section of market capitalization values
- Parameters:
date – YYYYMMDD int date of market cap
cache_mode – ‘r’ to try reading from cache first, ‘w’ to write to cache
use_shares – If True, use shrout from ‘shares’ table, else ‘daily’
use_permco – If True, sum caps by permco, else by permno
- Returns:
Series of market cap indexed by permno
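The effect of `use_permco` can be sketched in plain Python. This sketch assumes market cap is `abs(prc) * shrout` (CRSP records a negative `prc` when the price is a bid/ask average) and that, when summing by permco, each permno is assigned its company's total. The function name and row layout are hypothetical, and the numbers are made up.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical row layout: (permno, permco, prc, shrout)
Row = Tuple[int, int, float, float]


def market_cap_sketch(rows: List[Row], use_permco: bool = False) -> Dict[int, float]:
    """Cross-section of market cap keyed by permno."""
    # Assumption: cap is abs(prc) * shrout for each security
    cap = {permno: abs(prc) * shrout for permno, _, prc, shrout in rows}
    if not use_permco:
        return cap
    # Sum security-level caps within each company (permco)
    total: Dict[int, float] = defaultdict(float)
    for permno, permco, _, _ in rows:
        total[permco] += cap[permno]
    # Assumption: every permno of a company carries the company total
    return {permno: total[permco] for permno, permco, _, _ in rows}


# Made-up data: permnos 1 and 2 are two share classes of company 100
rows = [(1, 100, 10.0, 5.0), (2, 100, -20.0, 2.0), (3, 200, 4.0, 10.0)]
by_permno = market_cap_sketch(rows)
by_permco = market_cap_sketch(rows, use_permco=True)
```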
- get_divamt(beg: int, end: int) -> DataFrame
Accumulates total dollar dividends between beg and end dates
- Parameters:
beg – Inclusive start date (YYYYMMDD)
end – Inclusive end date (YYYYMMDD)
- Returns:
DataFrame with accumulated divamts = per share divamt * shrout
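As a rough sketch of the accumulation described above, the following plain-Python function sums `divamt * shrout` per permno over an inclusive date range. The function name, the row layout, and the sample values are hypothetical, and units of `shrout` are left as they are in the data.

```python
from typing import Dict, List, Tuple

# Hypothetical row layout: (permno, date, per-share divamt, shrout)
DivRow = Tuple[int, int, float, float]


def accumulate_divamt_sketch(rows: List[DivRow], beg: int, end: int) -> Dict[int, float]:
    """Total dividends per permno with beg <= date <= end (both inclusive)."""
    total: Dict[int, float] = {}
    for permno, date, divamt, shrout in rows:
        if beg <= date <= end:
            total[permno] = total.get(permno, 0.0) + divamt * shrout
    return total


# Made-up distributions for two securities
div_rows = [
    (1, 20200115, 0.5, 100.0),
    (1, 20200415, 0.5, 110.0),
    (1, 20210115, 0.6, 110.0),  # falls outside the 2020 window
    (2, 20200301, 1.0, 10.0),
]
divs_2020 = accumulate_divamt_sketch(div_rows, 20200101, 20201231)
```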
- get_dlret(beg: int, end: int, dataset: str = 'delist') -> Series
Compounded delisting returns from beg to end dates for all permnos
- Parameters:
beg – Inclusive start date (YYYYMMDD)
end – Inclusive end date (YYYYMMDD)
dataset – either ‘delist’ or ‘monthly’ containing delisting returns
- Returns:
Series of delisting returns
Notes:
Sets the delisting return to -0.3 if it is missing and the delisting code is in [500, 520, 551…574, 580, 584]
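The fill rule in the note can be sketched as follows. The code set matches `dlstcodes_` listed for this class. The function name, the row layout, the sample rows, and the choice to skip records whose missing return has a code outside the set are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Same codes as CRSP.dlstcodes_: 500, 520, 551 through 574, 580, 584
DLSTCODES = {500, 520, 580, 584} | set(range(551, 575))

# Hypothetical row layout: (permno, date, dlstcd, dlret or None if missing)
DelistRow = Tuple[int, int, int, Optional[float]]


def compound_dlret_sketch(rows: List[DelistRow], beg: int, end: int) -> Dict[int, float]:
    """Compound delisting returns per permno over beg..end (inclusive)."""
    out: Dict[int, float] = {}
    for permno, date, dlstcd, dlret in rows:
        if not (beg <= date <= end):
            continue
        if dlret is None:
            if dlstcd not in DLSTCODES:
                continue  # assumption: leave other missing returns out
            dlret = -0.3  # replacement value stated in the note
        # Chain onto any earlier delisting return for the same permno
        out[permno] = (1 + out.get(permno, 0.0)) * (1 + dlret) - 1
    return out


# Made-up delisting records
delist_rows = [
    (1, 20200110, 574, None),   # missing, code in set: filled with -0.3
    (2, 20200110, 233, None),   # missing, code not in set: skipped
    (3, 20200110, 100, 0.1),    # reported return is used as is
    (4, 20190101, 500, None),   # outside the date range
]
dlret = compound_dlret_sketch(delist_rows, 20200101, 20201231)
```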
- get_ret(beg: int, end: int, dataset: str = 'daily', field: str = 'ret', **kwargs) -> Series
Get compounded returns, with option to include delist returns
- Parameters:
beg – starting returns date
end – ending returns date
dataset – Name of returns dataset (ignored if initialized as ‘monthly’)
field – Name of returns field in dataset, in {‘ret’, ‘retx’}
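For readers unfamiliar with the term, compounding simple returns means chaining them as a product of gross returns. The sketch below assumes that convention and assumes a delisting return is chained onto the holding-period return in the same way; the helper names are hypothetical and this is not the library's code.

```python
from math import prod
from typing import Iterable


def compound_sketch(returns: Iterable[float]) -> float:
    """Compound simple per-period returns: prod(1 + r) - 1."""
    return prod(1 + r for r in returns) - 1


def with_delist_sketch(ret: float, dlret: float) -> float:
    """Chain a delisting return onto a compounded holding-period return."""
    return (1 + ret) * (1 + dlret) - 1


# Made-up daily returns for one stock
total = compound_sketch([0.10, -0.05, 0.02])
total_with_delist = with_delist_sketch(total, -0.3)
```

An empty sequence of returns compounds to zero, since the empty product is one.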
- get_universe(date: int, cache_mode: str = 'rw') -> DataFrame
Return standard CRSP universe of US-domiciled common stocks
- Parameters:
date – Rebalance date (YYYYMMDD)
cache_mode – ‘r’ to try reading from cache first, ‘w’ to write to cache
- Returns:
DataFrame of screened universe, indexed by permno, with columns market cap “decile” (1..10), “nyse” bool, “siccd”, “prc”, “cap”
Notes:
Market cap must be available on date, with non-missing prc
shrcd in [10, 11], exchcd in [1, 2, 3]
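The screen in the notes can be sketched in plain Python. The sketch assumes the "nyse" flag corresponds to exchcd equal to 1, and it deliberately omits the decile assignment because the breakpoint method is not described here. The function name, the dict-based row layout, and the sample rows are hypothetical.

```python
from typing import Any, Dict, List


def screen_universe_sketch(rows: List[Dict[str, Any]]) -> Dict[int, Dict[str, Any]]:
    """Keep common stocks on the three exchanges with prc and cap available."""
    out: Dict[int, Dict[str, Any]] = {}
    for r in rows:
        if (r["shrcd"] in (10, 11)
                and r["exchcd"] in (1, 2, 3)
                and r["prc"] is not None
                and r["cap"] is not None):
            out[r["permno"]] = {
                "nyse": r["exchcd"] == 1,  # assumption: exchcd 1 marks NYSE
                "siccd": r["siccd"],
                "prc": r["prc"],
                "cap": r["cap"],
            }
    return out


# Made-up candidate rows
candidates = [
    {"permno": 1, "shrcd": 11, "exchcd": 1, "siccd": 3571, "prc": 10.0, "cap": 500.0},
    {"permno": 2, "shrcd": 10, "exchcd": 3, "siccd": 7372, "prc": 5.0, "cap": 80.0},
    {"permno": 3, "shrcd": 12, "exchcd": 1, "siccd": 6021, "prc": 7.0, "cap": 90.0},
    {"permno": 4, "shrcd": 11, "exchcd": 2, "siccd": 2834, "prc": None, "cap": None},
]
universe = screen_universe_sketch(candidates)
```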
- classmethod is_dlstcode(dlstcd: Series | int) -> Series | int
Test whether a delisting code is one for which a missing delisting return should be set to -0.3
- dlstcodes_ = {500, 520, 551, 552, 553, 554, 555, 556, 557, 558, 559, 560, 561, 562, 563, 564, 565, 566, 567, 568, 569, 570, 571, 572, 573, 574, 580, 584}
- class finds.structured.crsp.CRSPBuffer(stocks: Stocks, beg: int, end: int, fields: List[str], dataset: str)
Bases: StocksBuffer
Cache returns into memory, and provide Stocks-like interface