finds.structured.stocks

Stocks subclass for defining stocks datasets

Copyright 2022, Terence Lim

MIT License

class finds.structured.stocks.Stocks(sql: SQL, bd: BusDay, tables: Dict[str, Table], identifier: str, name: str, rdb: RedisDB | None = None, verbose: int = 1)

Bases: Structured

Provide interface to structured stock price datasets

cache_ret(dates: List[Tuple[int, int]], replace: bool, dataset: str, field: str = 'ret', date_field: str = 'date')

Pre-generate compounded returns for redis store

get_compounded(periods: List[Tuple[int, int]], permnos: List[int], field: str = 'ret', cache_mode: str = 'rw') → DataFrame

Compound returns within list of periods, for given permnos

Parameters:
  • periods – Tuples of inclusive begin and end dates of returns period

  • permnos – List of permnos

  • cache_mode – ‘r’ to try reading from the cache first, ‘w’ to write to the cache; the default ‘rw’ does both

Returns:

DataFrame of compounded returns in rows, for permnos in cols
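To illustrate the output shape only, the following is a minimal sketch in plain pandas, not the library's implementation. The long-format `daily` frame, the helper name `compound_periods`, the choice to label each row by its period end date, and the assumption that compounding means the product of (1 + ret) minus 1 are all hypothetical.

```python
import pandas as pd

# Hypothetical long-format returns table: one row per (permno, date)
daily = pd.DataFrame({
    "permno": [10001, 10001, 10001, 10002, 10002, 10002],
    "date": [20200102, 20200103, 20200106, 20200102, 20200103, 20200106],
    "ret": [0.10, -0.05, 0.02, 0.00, 0.01, 0.03],
})

def compound_periods(df, periods, permnos, field="ret"):
    """Compound simple returns within each inclusive (beg, end) period."""
    rows = {}
    for beg, end in periods:
        # Integer YYYYMMDD dates compare correctly; both endpoints inclusive
        window = df[df["date"].between(beg, end) & df["permno"].isin(permnos)]
        gross = (1 + window[field]).groupby(window["permno"]).prod(min_count=1)
        rows[end] = (gross - 1).reindex(permnos)
    # Transpose so that periods are in rows and permnos in columns
    return pd.DataFrame(rows).T

out = compound_periods(daily,
                       [(20200102, 20200103), (20200106, 20200106)],
                       [10001, 10002])
```

Each row of `out` corresponds to one period and each column to one permno, matching the layout described above.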

get_many(permnos: List[str | int], fields: List[str], dates: List[int], dataset: str = 'daily', date_field: str = 'date', exact: bool = True) → DataFrame

Retrieve multiple fields for lists of permnos and dates

Parameters:
  • permnos – List of identifiers to retrieve

  • dates – List of corresponding dates of center of event window

  • fields – Names of fields to retrieve

  • dataset – Name of dataset, default ‘daily’

  • date_field – Name of date field in database, default ‘date’

  • exact – Whether to require an exact date match, or to allow the most recent record

Returns:

DataFrame with permno, date, and retrieved fields across columns
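The difference between an exact match and a most-recent match can be sketched with plain pandas. This is an illustration under assumptions, not the library's query logic: the `records` and `requests` frames and the `prc` field are hypothetical, and "most recent" is interpreted as the latest record on or before the requested date.

```python
import pandas as pd

# Hypothetical stored records; merge_asof requires sorting by the date key
records = pd.DataFrame({
    "permno": [10001, 10001, 10002],
    "date": [20200102, 20200106, 20200103],
    "prc": [10.0, 11.0, 20.0],
}).sort_values("date")

# Requested (permno, date) pairs, one row per pair, already sorted by date
requests = pd.DataFrame({
    "permno": [10001, 10002],
    "date": [20200103, 20200103],
})

# Analogue of exact=True: keep only rows whose date matches exactly
exact = requests.merge(records, on=["permno", "date"], how="inner")

# Analogue of exact=False: latest record on or before each requested date
recent = pd.merge_asof(requests, records, on="date", by="permno",
                       direction="backward")
```

Here permno 10001 has no record dated 20200103, so it drops out of `exact` but picks up its 20200102 record in `recent`.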

get_range(fields: List[str] | Dict[str, str], beg: int, end: int, dataset: str = 'daily', date_field: str = 'date', cache_mode: str = 'rw') → DataFrame

Return field values within a date range

Parameters:
  • fields – Names of columns to return (and optionally rename as)

  • beg – Inclusive start date in YYYYMMDD format

  • end – Inclusive end date in YYYYMMDD format

  • dataset – Name of dataset to extract from, default ‘daily’

  • date_field – Name of date column in the table, default ‘date’

  • cache_mode – ‘r’ to try reading from the cache first, ‘w’ to write to the cache; the default ‘rw’ does both

Returns:

DataFrame multi-indexed by permno, date
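The shape of the result, including the optional renaming when `fields` is a dict, can be sketched with plain pandas. The `daily` frame and the helper `range_sketch` are hypothetical stand-ins for the database table and the method; the real method reads from SQL and may use the cache.

```python
import pandas as pd

# Hypothetical in-memory stand-in for the 'daily' table
daily = pd.DataFrame({
    "permno": [10001, 10001, 10002, 10002],
    "date": [20200102, 20200103, 20200102, 20200107],
    "ret": [0.01, 0.02, -0.01, 0.03],
    "prc": [10.0, 10.2, 20.0, 20.6],
})

def range_sketch(df, fields, beg, end, date_field="date"):
    # Accept a list of names, or a dict mapping column name -> new name
    rename = fields if isinstance(fields, dict) else {f: f for f in fields}
    rows = df[df[date_field].between(beg, end)]  # both endpoints inclusive
    out = rows[["permno", date_field, *rename]].rename(columns=rename)
    return out.set_index(["permno", date_field])

res = range_sketch(daily, {"ret": "r", "prc": "price"}, 20200102, 20200103)
```

The row dated 20200107 falls outside the range and is excluded; the remaining rows are indexed by (permno, date).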

get_ret(beg: int, end: int, dataset: str = 'daily', field: str = 'ret', date_field: str = 'date', cache_mode: str = 'rw') → Series

Compounded returns between beg and end dates of all stocks

Parameters:
  • beg – Inclusive start date (YYYYMMDD)

  • end – Inclusive end date (YYYYMMDD)

  • dataset – Name of dataset to retrieve, default is daily

  • field – Name of returns field

  • date_field – Name of date field

  • cache_mode – ‘r’ to try reading from the cache first, ‘w’ to write to the cache; the default ‘rw’ does both

Returns:

Series of compounded returns (prod(min_count=1) of returns) labelled ret, indexed by permno
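The practical effect of `min_count=1` is that a stock with no valid returns in the period comes back as missing rather than as a zero return. The sketch below shows this with pandas, assuming simple returns compounded as the product of (1 + ret) minus 1; the `daily` frame is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: permno 10002 has only missing returns in the period
daily = pd.DataFrame({
    "permno": [10001, 10001, 10002, 10002],
    "date": [20200102, 20200103, 20200102, 20200103],
    "ret": [0.10, 0.10, np.nan, np.nan],
})

gross = 1 + daily.set_index("permno")["ret"]

# With min_count=1, a group with no valid values yields NaN
with_min = gross.groupby(level="permno").prod(min_count=1) - 1

# Without it, the empty product is 1, which looks like a 0% return
without_min = gross.groupby(level="permno").prod() - 1
```

For permno 10002, `with_min` is NaN while `without_min` is 0.0, which would be indistinguishable from a genuinely flat return.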

get_section(fields: List[str], date: int, dataset: str = 'daily', date_field: str = 'date', start: int = -1) → DataFrame

Return a cross-section of values of fields as of a single date

Parameters:
  • fields – list of columns to return

  • date – Desired date in YYYYMMDD format

  • date_field – Name of date column in the table, default ‘date’

  • dataset – Dataset to extract from, default ‘daily’

  • start – Non-inclusive date of starting range; if -1 then exact date

Returns:

Most recent row within date range, indexed by permno

Note:

  • If start is not -1, then the latest prevailing record for each permno between (non-inclusive) start and (inclusive) date is returned

Examples:

>>> t = crsp.get_section(['shrenddt', 'shrout'], dt, dataset='shares', date_field='shrsdt')
>>> u = crsp.get_section(['comnam'], dt - 10000, dataset='names', date_field='date')
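The "latest prevailing record" rule in the note can be sketched with plain pandas. This is an illustration only: the `shares` frame and the helper `section_sketch` are hypothetical, and the real method runs against the database.

```python
import pandas as pd

# Hypothetical stand-in for a 'shares' table keyed by permno and shrsdt
shares = pd.DataFrame({
    "permno": [10001, 10001, 10002],
    "shrsdt": [20190630, 20191231, 20180331],
    "shrout": [1000, 1200, 500],
})

def section_sketch(df, fields, date, date_field="date", start=-1):
    if start == -1:
        rows = df[df[date_field] == date]  # exact date only
    else:
        # start is exclusive, date is inclusive
        rows = df[(df[date_field] > start) & (df[date_field] <= date)]
    # Keep the latest record per permno within the selected rows
    latest = rows.sort_values(date_field).groupby("permno").tail(1)
    return latest.set_index("permno")[fields]

asof = section_sketch(shares, ["shrout"], 20200131, "shrsdt", start=20190101)
exact = section_sketch(shares, ["shrout"], 20191231, "shrsdt")
```

In `asof`, permno 10001 resolves to its 20191231 record, while permno 10002 is absent because its only record predates `start`.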
get_series(permnos: int | str | List[str | int], field: str, dataset: str = 'daily', date_field: str = 'date', beg: int = 19000000, end: int = 29001231) → DataFrame | Series

Return time series of a field for multiple permnos as DataFrame

Parameters:
  • permnos – Identifiers to filter

  • dataset – Name of dataset to retrieve from, default daily

  • field – Name of column to extract, e.g. ‘ret’

  • beg – Inclusive start date (YYYYMMDD)

  • end – Inclusive end date (YYYYMMDD)

Returns:

DataFrame indexed by date with permnos in columns
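The returned layout is a long table pivoted to wide form. A minimal pandas sketch follows, assuming a hypothetical long-format `daily` frame in place of the database table.

```python
import pandas as pd

# Hypothetical long-format data; permno 10002 has no row on 20200102
daily = pd.DataFrame({
    "permno": [10001, 10001, 10002],
    "date": [20200102, 20200103, 20200103],
    "ret": [0.01, 0.02, -0.01],
})

permnos, beg, end = [10001, 10002], 20200101, 20200131

# Filter by identifier and inclusive date range, then pivot to wide form
selected = daily[daily["permno"].isin(permnos)
                 & daily["date"].between(beg, end)]
wide = selected.pivot(index="date", columns="permno", values="ret")
```

Dates on which a permno has no record appear as NaN in that permno's column.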

get_window(field: str, permnos: List[Any], dates: List[int], left: int, right: int, dataset: str = 'daily', date_field: str = 'date', avg: bool = False) → DataFrame

Retrieve field values for permnos in window centered around dates

Parameters:
  • field – Name of field to retrieve

  • permnos – List of identifiers to retrieve

  • date_field – Name of date field in database

  • dates – List of corresponding dates of center of event window

  • left – Relative (inclusive) offset of start of event window

  • right – Relative (inclusive) offset of end of event window

  • dataset – Name of dataset, default ‘daily’

Returns:

DataFrame with columns [0:(right-left)] of field values in the event window
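To make the window layout concrete, here is a rough pandas sketch. It is built on assumptions rather than the library's implementation: offsets are counted in rows of the available data (the library presumably uses its BusDay calendar), the column labels run from 0 through right-left inclusive, and `daily` and `window_sketch` are hypothetical names. An event date missing from the data would raise a KeyError in this sketch.

```python
import pandas as pd

# Hypothetical daily data for one stock
daily = pd.DataFrame({
    "permno": [10001] * 5,
    "date": [20200102, 20200103, 20200106, 20200107, 20200108],
    "ret": [0.01, 0.02, 0.03, 0.04, 0.05],
})

def window_sketch(df, field, permnos, dates, left, right):
    rows = []
    for permno, date in zip(permnos, dates):
        series = (df[df["permno"] == permno]
                  .sort_values("date").set_index("date")[field])
        pos = series.index.get_loc(date)  # row position of the event date
        # Offsets left..right relative to the event row, both inclusive;
        # positions outside the available data are left as None
        values = [series.iloc[pos + k] if 0 <= pos + k < len(series) else None
                  for k in range(left, right + 1)]
        rows.append(values)
    return pd.DataFrame(rows, index=permnos, columns=range(right - left + 1))

win = window_sketch(daily, "ret", [10001], [20200106], left=-1, right=1)
```

With left=-1 and right=1, column 0 holds the value one row before the event date, column 1 the event date itself, and column 2 the row after.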

class finds.structured.stocks.StocksBuffer(stocks: Stocks, beg: int, end: int, dataset: str, fields: List[str], identifier: str, date_field: str = 'date')

Bases: Stocks

Cache daily returns into memory, and provide Stocks-like interface

get_ret(beg: int, end: int, field: str = 'ret') → Series

Return compounded stock returns between beg and end dates

Parameters:
  • beg – Begin date to compound returns

  • end – End date (inclusive) to compound returns

  • field – Name of returns field in dataset, in {‘ret’, ‘retx’}

get_section(fields: List[str], date: int, dataset: str = '', date_field: str = '', start: int = -1) → DataFrame

Return a cross-section of values of fields as of a single date

Parameters:
  • dataset – Dataset to extract from

  • fields – list of columns to return

  • date_field – Name of date column in the table

  • date – Desired date in YYYYMMDD format

  • start – Non-inclusive date of starting range (ignored)

Returns:

Most recent row within date range, indexed by permno

class finds.structured.stocks.StocksFrame(df: DataFrame, rsuffix: str = '', identifier: str = 'permno')

Bases: Stocks

Mimic Stocks object given an input DataFrame of returns

Parameters:
  • df – DataFrame of returns with date in index and permno in columns

  • rsuffix – replicate output columns and append rsuffix to column name

  • identifier – name of identifier column

Notes:

  • Limited interface to manipulate DataFrame of asset returns as Stocks-like. Use when sql and BusDay not available.

class bd

Bases: object

Class to mimic basic behavior of BusDay object

static begmo(date: int | List[int]) → int | List[int]

Returns same date

static date_tuples(dates: List[int]) → List[Tuple[int, int]]

Returns adjacent dates as the holding date tuples
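One plausible reading of "adjacent dates as the holding date tuples" is that each date is paired with the one that follows it. The sketch below shows that interpretation; the function name `date_tuples_sketch` is hypothetical and the actual method may differ in detail.

```python
from typing import List, Tuple

def date_tuples_sketch(dates: List[int]) -> List[Tuple[int, int]]:
    """Pair each date with the next one: (d0, d1), (d1, d2), ..."""
    return list(zip(dates[:-1], dates[1:]))

pairs = date_tuples_sketch([20200131, 20200228, 20200331])
# pairs == [(20200131, 20200228), (20200228, 20200331)]
```

A list of n dates yields n-1 tuples, and a list with fewer than two dates yields an empty list.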

static endmo(date: int | List[int]) → int | List[int]

Returns same date

get_ret(beg: int, end: int, field: str = 'ret', **kwargs) → Series

Compounded returns between beg and end (inclusive) dates

get_series(permnos: int | str | List[str | int], *arg, **kwarg) → Series

Return the series for target permnos