finds.structured.pstat
Compustat annual, quarterly and key developments
Copyright 2022-2024, Terence Lim
MIT License
- class finds.structured.pstat.PSTAT(sql: SQL, bd: BusDay, name: str = 'PSTAT', identifier: str = 'gvkey', verbose: int = 0)[source]
Bases:
Structured
Provide interface to Compustat structured data sets
- Parameters:
sql – Connection to mysql database
bd – Custom business days object
name – Name of dataset is “PSTAT”
identifier – Stocks identifier field name is “gvkey”
Notes:
Screen on (INDFMT= ‘INDL’, DATAFMT=’STD’, POPSRC=’D’, and CONSOL=’C’) keeps majority of records and uniquely identifies GVKEY, DATADATE. also include INDFMT= ‘FS’, CURRENCY=’USD+CAD’, STATUS=’ACTIVE+INACTIVE’)
- build_lookup(source: str, target: str, date_field='linkdt', dataset: str = 'links', fillna: Any = None) Any [source]
Build lookup function to return target identifier from source
- get_linked(dataset: str, fields: List[str], date_field: str = 'datadate', link_perm: str = 'lpermno', link_date: str = 'linkdt', where: str = '', limit: int | str | None = None) DataFrame [source]
Query a pstat table, and return with linked crsp permno
- Parameters:
dataset – pstat dataset to query
fields – Names of fields to return
date_field – Name of date field in pstat table to query
link_date – Name of link date field in ‘links’ table
link_perm – Name of permno field in ‘links’ table
where – Sql where clause, as sql string (optional)
limit – Maximum number of records to return (optional)
- Returns:
DataFrame containing result of query
Examples:
>>> df = pstat.get_linked(dataset='annual', date_field='datadate', fields=['ceq','pstkrv','pstkl','pstk'], where='ceq > 0 and datadate>=19930104 and datadate<=20991231') >>> df = keydev.get_linked(dataset='keydev', date_field='announcedate', fields=['companyname', 'keydeveventtypeid', 'keydevtoobjectroletypeid'], where='', limit=''):
Notes:
select keydev.companyname, keydev.keydeveventtypeid, keydev.keydevtoobjectroletypeid, keydev.announcedate, keydev.gvkey, lpermno as permno from keydev left join links on keydev.gvkey = links.gvkey and links.linkdt = (select max(c.linkdt) as linkdt from links c where c.gvkey = keydev.gvkey and c.linkdt <= keydev.announcedate) where lpermno is not null and keydev.gvkey > 0 and links.gvkey > 0 and announcedate >= 20180301 limit 100;
- get_permnos(keys: List[str], date: int, link_perm='lpermno', link_date: str = 'date', permno='permno') DataFrame [source]
Return list of permnos mapped to gvkeys as of a date
- Parameters:
keys – Input list of gvkeys to lookup
date – Prevailing date of link
- property event
Maps keydev event id to description
- property role
Maps keydev role id to description