For Monday’s PSL Demo Day, I showed how to use the scf and microdf PSL Python packages from Google Colab, a web-based Jupyter notebook interface.
The scf package extracts data from the Federal Reserve’s Survey of Consumer Finances, the canonical source of US wealth microdata.
scf has a single function, load(years, columns), which returns a DataFrame with the specified column(s), each record’s survey weight, and the year (when multiple years are requested).
The microdf package analyzes survey microdata, such as that returned by the scf package.
It offers a consistent paradigm for calculating statistics like means, medians, sums, and inequality statistics like the Gini index.
Most functions are structured as f(df, col, w, groupby), where df is a DataFrame of survey microdata, col is the name of the column(s) to summarize, w is the weight column, and groupby is the column(s) by which to group records before summarizing.
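To make the f(df, col, w, groupby) pattern concrete, here is a minimal sketch written with plain pandas and NumPy. It is not microdf's own implementation: the function names weighted_mean and weighted_sum, the toy data, and the column names networth and wgt are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd


def weighted_sum(df, col, w, groupby=None):
    """Weighted total of `col` using weights `w`, optionally per group."""
    if groupby is not None:
        # Apply the ungrouped calculation to each group's records.
        return pd.Series(
            {key: weighted_sum(g, col, w) for key, g in df.groupby(groupby)}
        )
    return float((df[col] * df[w]).sum())


def weighted_mean(df, col, w, groupby=None):
    """Weighted mean of `col` using weights `w`, optionally per group."""
    if groupby is not None:
        return pd.Series(
            {key: weighted_mean(g, col, w) for key, g in df.groupby(groupby)}
        )
    return float(np.average(df[col], weights=df[w]))


# Toy microdata (hypothetical values): one row per surveyed household,
# with a survey weight saying how many households the row represents.
toy = pd.DataFrame(
    {
        "year": [2016, 2016, 2019, 2019],
        "networth": [10.0, 100.0, 20.0, 200.0],
        "wgt": [3.0, 1.0, 1.0, 1.0],
    }
)

overall_mean = weighted_mean(toy, "networth", "wgt")
mean_by_year = weighted_mean(toy, "networth", "wgt", groupby="year")
total_by_year = weighted_sum(toy, "networth", "wgt", groupby="year")
print(mean_by_year)
```

Keeping the same four arguments across every statistic means that switching from a mean to a total, or from an overall figure to a per-year breakdown, changes only the function name or the groupby value.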
Using Google Colab, I showed how to use these packages to quickly calculate mean, median, and total wealth from the SCF data, without having to install any software or leave the browser.
I also demonstrated how to use the groupby argument of microdf functions to show how different measures of wealth inequality have changed over time.
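As a rough illustration of an inequality measure grouped by year, the sketch below computes a weighted Gini index from the Lorenz curve using plain pandas and NumPy. It is an assumption-laden stand-in rather than microdf's implementation: the function name weighted_gini, the toy data, and the column names are hypothetical, and the formula assumes non-negative values with a positive total.

```python
import numpy as np
import pandas as pd


def weighted_gini(df, col, w, groupby=None):
    """Weighted Gini index of `col`, optionally computed per group.

    Uses the trapezoid rule on the weighted Lorenz curve. Assumes
    non-negative values of `col` with a positive weighted total.
    """
    if groupby is not None:
        return pd.Series(
            {key: weighted_gini(g, col, w) for key, g in df.groupby(groupby)}
        )
    ordered = df.sort_values(col)
    x = ordered[col].to_numpy(dtype=float)
    wt = ordered[w].to_numpy(dtype=float)
    pop_share = wt / wt.sum()  # share of the population in each row
    lorenz = np.cumsum(x * wt) / np.sum(x * wt)  # cumulative share of the total
    lorenz_prev = np.concatenate(([0.0], lorenz[:-1]))
    # One minus twice the area under the Lorenz curve.
    return float(1.0 - np.sum(pop_share * (lorenz + lorenz_prev)))


# Toy data (hypothetical): wealth is equal in 2016 and fully
# concentrated in half of the population in 2019.
toy = pd.DataFrame(
    {
        "year": [2016, 2016, 2019, 2019],
        "networth": [5.0, 5.0, 0.0, 10.0],
        "wgt": [1.0, 3.0, 2.0, 2.0],
    }
)

gini_by_year = weighted_gini(toy, "networth", "wgt", groupby="year")
print(gini_by_year)
```

Real SCF net worth includes negative values, which this simple Lorenz-curve formula does not handle, so treat the sketch as a way to see the grouping pattern rather than as a production estimator.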
Finally, I previewed some of what’s to come with microdf: imputations, extrapolations, inflation, visualization, and documentation, to name a few priorities.