finds.structured.crsp

CRSP daily and monthly stock files

Copyright 2022-2024, Terence Lim

MIT License

class finds.structured.crsp.CRSP(sql: SQL, bd: BusDay, rdb: RedisDB | None = None, monthly: bool | None = None, verbose: int = 1)

Bases: Stocks

Implements an interface to the CRSP structured stocks dataset

Parameters:
  • sql – Connection to MySQL database

  • bd – Business dates object

  • rdb – Optional connection to Redis for caching selected query results

  • monthly – Use the monthly (True) or daily (False) file; the default (None) selects automatically

Notes:

  • The earliest CRSP prc (price) is dated 19251231

build_lookup(source: str, target: str, date_field='date', dataset: str = 'names', fillna: Any = 0) → Any

Build a lookup function that returns the target identifier for a given source identifier
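A minimal sketch of the idea, using a small in-memory list of records in place of the names table. The `make_lookup` helper, the record layout, the sample identifiers, and the rule of returning the latest record dated on or before the requested date are all assumptions for illustration, not the library's implementation.

```python
from bisect import bisect_right
from collections import defaultdict

def make_lookup(records, source, target, date_field="date", fillna=0):
    """Return a function mapping (source value, date) to a target value."""
    history = defaultdict(list)
    # Group (date, target) pairs by source value, in date order
    for rec in sorted(records, key=lambda r: r[date_field]):
        history[rec[source]].append((rec[date_field], rec[target]))

    def lookup(key, date=99999999):
        rows = history.get(key, [])
        dates = [d for d, _ in rows]
        pos = bisect_right(dates, date)   # count of records dated <= date
        # Assumed rule: latest record on or before date, else fillna
        return rows[pos - 1][1] if pos else fillna

    return lookup

# Hypothetical records; the identifiers are made up
names = [
    {"date": 19900101, "ticker": "ABC", "permno": 10001},
    {"date": 20000101, "ticker": "ABC", "permno": 20002},
]
to_permno = make_lookup(names, source="ticker", target="permno")
```

Calling `to_permno("ABC", 19950630)` here resolves to the 1990 record, while an unknown ticker falls back to the `fillna` value.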

cache_ret(dates: List[Tuple[int, int]], replace: bool, dataset: str = 'daily', field: str = 'ret', date_field: str = 'date')

Pre-generate compounded returns from the daily file for the Redis store
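A rough sketch of pre-generating compounded returns per date range, with a plain dict standing in for the Redis store. The `cache_returns` helper, the key format, and the handling of `replace` are assumptions for illustration only.

```python
from math import prod

def cache_returns(daily, periods, store, replace=False):
    """Compound daily returns over each (beg, end) period and save in store."""
    for beg, end in periods:
        key = f"ret_{beg}_{end}"          # hypothetical key format
        if key in store and not replace:  # assumed: keep existing entries
            continue
        store[key] = {
            permno: prod(1 + r for d, r in rets.items() if beg <= d <= end) - 1
            for permno, rets in daily.items()
        }
    return store

# Made-up daily returns for one hypothetical permno
daily = {10001: {20240102: 0.01, 20240103: -0.02, 20240104: 0.03}}
store = {"ret_20240102_20240103": {10001: 0.5}}   # pre-existing entry
cache_returns(daily, [(20240102, 20240103), (20240103, 20240104)], store)
```

With `replace=False` the pre-existing entry is left untouched and only the missing period is computed; passing `replace=True` recomputes both.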

get_cap(date: int, cache_mode: str = 'rw', use_shares: bool = False, use_permco: bool = False) → Series

Compute a cross-section of market capitalization values

Parameters:
  • date – YYYYMMDD int date of market cap

  • cache_mode – ‘r’ to try reading from cache first, ‘w’ to write to cache

  • use_shares – If True, use shrout from ‘shares’ table, else ‘daily’

  • use_permco – If True, sum caps by permco, else by permno

Returns:

Series of market cap indexed by permno
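A minimal sketch of the computation, assuming market cap is the absolute value of prc times shrout (CRSP stores a negative prc when the price is a bid/ask average) and that, with use_permco, every permno receives the total of its permco. The `market_caps` helper and the sample rows are hypothetical.

```python
def market_caps(rows, use_permco=False):
    """rows: dicts with permno, permco, prc, shrout for a single date."""
    # Assumed formula: cap = |prc| * shrout, skipping missing inputs
    caps = {
        r["permno"]: abs(r["prc"]) * r["shrout"]
        for r in rows
        if r["prc"] is not None and r["shrout"] is not None
    }
    if not use_permco:
        return caps
    # Sum caps across all permnos that share a permco
    totals = {}
    for r in rows:
        if r["permno"] in caps:
            totals[r["permco"]] = totals.get(r["permco"], 0.0) + caps[r["permno"]]
    # Still indexed by permno, each carrying its permco total
    return {r["permno"]: totals[r["permco"]] for r in rows if r["permno"] in caps}

# Made-up cross-section: permnos 1 and 2 are share classes of one permco
rows = [
    {"permno": 1, "permco": 100, "prc": 10.0, "shrout": 1000},
    {"permno": 2, "permco": 100, "prc": -5.0, "shrout": 400},
    {"permno": 3, "permco": 200, "prc": 20.0, "shrout": 500},
]
by_permno = market_caps(rows)
by_permco = market_caps(rows, use_permco=True)
```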

get_divamt(beg: int, end: int) → DataFrame

Accumulates total dollar dividends between beg and end dates

Parameters:
  • beg – Inclusive start date (YYYYMMDD)

  • end – Inclusive end date (YYYYMMDD)

Returns:

DataFrame of accumulated dividend amounts, where each amount = per-share divamt * shrout
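The accumulation can be sketched as follows. The `accumulate_divamt` helper, the tuple layout, and the use of a single shrout value per permno over the whole window are simplifying assumptions; the sample figures are made up.

```python
def accumulate_divamt(dist, shrout, beg, end):
    """Sum per-share divamt * shrout by permno over the inclusive date range."""
    totals = {}
    for permno, date, divamt in dist:
        if beg <= date <= end:            # both endpoints inclusive
            totals[permno] = totals.get(permno, 0.0) + divamt * shrout[permno]
    return totals

# Hypothetical distributions: (permno, date, per-share divamt)
dist = [
    (1, 20240115, 0.5),
    (1, 20240415, 0.5),
    (1, 20240715, 0.5),   # outside the window below
    (2, 20240301, 1.0),
]
shrout = {1: 1000, 2: 200}   # assumed constant over the window
divs = accumulate_divamt(dist, shrout, beg=20240101, end=20240630)
```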

get_dlret(beg: int, end: int, dataset: str = 'delist') → Series

Compounded delisting returns from beg to end dates for all permnos

Parameters:
  • beg – Inclusive start date (YYYYMMDD)

  • end – Inclusive end date (YYYYMMDD)

  • dataset – Dataset containing delisting returns, either ‘delist’ or ‘monthly’

Returns:

Series of delisting returns

Notes:

  • Missing delisting returns are set to -0.3 when the delisting code is in [500, 520, 551…574, 580, 584]
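The fill rule for missing delisting returns can be sketched in a few lines. The `fill_dlret` helper and the use of None for a missing value are assumptions for illustration; the code set mirrors the delisting codes listed in this documentation.

```python
# 500, 520, 551 through 574, 580 and 584
DLSTCODES = {500, 520, 580, 584} | set(range(551, 575))

def fill_dlret(dlret, dlstcd):
    """Return -0.3 when the delisting return is missing for a listed code."""
    if dlret is None and dlstcd in DLSTCODES:
        return -0.3
    return dlret   # reported returns and other codes pass through unchanged
```

For example, `fill_dlret(None, 574)` yields -0.3, while a missing return with any other code stays missing.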

get_ret(beg: int, end: int, dataset: str = 'daily', field: str = 'ret', **kwargs) → Series

Get compounded returns, with option to include delist returns

Parameters:
  • beg – starting returns date

  • end – ending returns date

  • dataset – Name of returns dataset (ignored if initialized as monthly)

  • field – Name of returns field in dataset, in {‘ret’, ‘retx’}
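For reference, compounding a sequence of simple returns means multiplying the gross returns and subtracting one. The sketch below shows that arithmetic; the `compound` and `with_delist` helpers are hypothetical, and folding in a delisting return as one extra compounding step is an assumption about how the option might behave.

```python
from math import prod

def compound(returns):
    """Compound simple returns: (1 + r1) * (1 + r2) * ... - 1."""
    return prod(1 + r for r in returns) - 1

def with_delist(ret, dlret):
    """Assumed treatment: compound the delisting return as a final step."""
    return (1 + ret) * (1 + dlret) - 1

total = compound([0.10, -0.05])          # 1.10 * 0.95 - 1
total_dl = with_delist(total, -0.3)      # then 1.045 * 0.70 - 1
```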

get_universe(date: int, cache_mode: str = 'rw') → DataFrame

Return standard CRSP universe of US-domiciled common stocks

Parameters:
  • date – Rebalance date (YYYYMMDD)

  • cache_mode – ‘r’ to try reading from cache first, ‘w’ to write to cache

Returns:

DataFrame of screened universe, indexed by permno, with columns: market cap “decile” (1..10), “nyse” bool, “siccd”, “prc”, “cap”

Notes:

  • Market cap must be available on date, with non-missing prc

  • shrcd in [10, 11], exchcd in [1, 2, 3]
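The screening conditions in the notes above can be sketched as a simple filter. The `screen_universe` helper and the sample rows are hypothetical, and deriving the “nyse” flag from exchcd == 1 is an assumption; the decile assignment is omitted.

```python
def screen_universe(rows):
    """Keep rows with a price and cap, shrcd in [10, 11], exchcd in [1, 2, 3]."""
    out = {}
    for r in rows:
        if r["prc"] is None or r["cap"] is None:
            continue                       # cap and prc must be available
        if r["shrcd"] in (10, 11) and r["exchcd"] in (1, 2, 3):
            out[r["permno"]] = {
                "cap": r["cap"],
                "prc": r["prc"],
                "siccd": r["siccd"],
                "nyse": r["exchcd"] == 1,  # assumed: exchcd 1 is NYSE
            }
    return out

# Made-up cross-section on one date
rows = [
    {"permno": 1, "shrcd": 10, "exchcd": 1, "prc": 50.0, "cap": 5e6, "siccd": 3571},
    {"permno": 2, "shrcd": 12, "exchcd": 1, "prc": 20.0, "cap": 1e6, "siccd": 6020},
    {"permno": 3, "shrcd": 11, "exchcd": 3, "prc": 8.0, "cap": 2e5, "siccd": 7372},
    {"permno": 4, "shrcd": 10, "exchcd": 2, "prc": None, "cap": None, "siccd": 1311},
]
universe = screen_universe(rows)
```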

classmethod is_dlstcode(dlstcd: Series | int) → Series | int

Flag delisting codes for which a missing delisting return should be set to -0.3

dlstcodes_ = {500, 520, 551, 552, 553, 554, 555, 556, 557, 558, 559, 560, 561, 562, 563, 564, 565, 566, 567, 568, 569, 570, 571, 572, 573, 574, 580, 584}

class finds.structured.crsp.CRSPBuffer(stocks: Stocks, beg: int, end: int, fields: List[str], dataset: str)

Bases: StocksBuffer

Caches returns in memory and provides a Stocks-like interface

get_ret(beg: int, end: int, field: str = 'ret') → Series

Return compounded stock returns between beg and end dates

Parameters:
  • beg – Begin date to compound returns

  • end – End date (inclusive) to compound returns

  • field – Name of returns field in dataset, in {‘ret’, ‘retx’}

get_universe(date: int, cache_mode: str = 'rw') → DataFrame

Passes the call through to the original method to retrieve the universe