For Monday’s PSL Demo Day, I showed how to use the `scf` and `microdf` PSL Python packages from the Google Colab web-based Jupyter notebook interface.

The `scf` package extracts data from the Federal Reserve’s Survey of Consumer Finances, the canonical source of US wealth microdata. `scf` has a single function: `load(years, columns)`, which returns a `pandas` `DataFrame` with the specified column(s), each record’s survey weight, and the year (when multiple years are requested).

The `microdf` package analyzes survey microdata, such as that returned by the `scf.load` function. It offers a consistent paradigm for calculating statistics like means, medians, and sums, as well as inequality statistics like the Gini index. Most functions are structured as `f(df, col, w, groupby)`, where `df` is a `pandas` `DataFrame` of survey microdata, `col` is the name of the column(s) to summarize, `w` is the weight column, and `groupby` is the column(s) to group records by before summarizing.
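To make the `f(df, col, w, groupby)` shape concrete, here is a minimal sketch of a weighted mean written in plain `pandas`. It is an illustration of the paradigm, not `microdf`'s actual implementation. The helper `_keys`, the toy data, and the column names `networth` and `wgt` are assumptions made up for this example.

```python
import pandas as pd


def _keys(df, groupby):
    """Hypothetical helper: turn a column name or list of names into group keys."""
    cols = [groupby] if isinstance(groupby, str) else list(groupby)
    return [df[c] for c in cols]


def weighted_mean(df, col, w, groupby=None):
    """Weighted mean of `col` using weight column `w`, optionally per group."""
    weighted = df[col] * df[w]
    if groupby is None:
        return weighted.sum() / df[w].sum()
    keys = _keys(df, groupby)
    # Sum of value * weight divided by sum of weights, within each group.
    return weighted.groupby(keys).sum() / df[w].groupby(keys).sum()


# Toy microdata shaped like a multi-year extract: value, weight, and year.
toy = pd.DataFrame({
    "year": [2016, 2016, 2019, 2019],
    "networth": [100.0, 300.0, 200.0, 600.0],  # assumed column name
    "wgt": [3.0, 1.0, 1.0, 1.0],               # assumed weight column name
})

overall = weighted_mean(toy, "networth", "wgt")
by_year = weighted_mean(toy, "networth", "wgt", groupby="year")
```

Without `groupby` the function returns a single number; with it, the result is a `Series` indexed by group, which is what makes year-over-year comparisons a one-liner.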

Using Google Colab, I showed how to use these packages to quickly calculate mean, median, and total wealth from the SCF data, without having to install any software or leave the browser. I also demonstrated how to use the `groupby` argument of `microdf` functions to show how different measures of wealth inequality have changed over time. Finally, I previewed some of what’s to come with `scf` and `microdf`: imputations, extrapolations, inflation, visualization, and documentation, to name a few priorities.
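The inequality-over-time idea can be sketched the same way. The block below computes a weighted Gini index per year from the area under the Lorenz curve. It is a standalone approximation under stated assumptions, not the code from the demo or from `microdf`: the function names `weighted_gini` and `gini_by`, the toy data, and the column names are all hypothetical.

```python
import numpy as np
import pandas as pd


def weighted_gini(x, w):
    """Weighted Gini index via the trapezoid area under the Lorenz curve."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    order = np.argsort(x)
    x, w = x[order], w[order]
    pop_share = w / w.sum()
    # Cumulative share of the total held by records up to each point.
    lorenz = np.cumsum(x * w) / np.sum(x * w)
    lorenz_prev = np.concatenate(([0.0], lorenz[:-1]))
    return 1.0 - np.sum(pop_share * (lorenz + lorenz_prev))


def gini_by(df, col, w, groupby):
    """Apply weighted_gini within each group of a single `groupby` column."""
    return pd.Series({
        key: weighted_gini(g[col], g[w]) for key, g in df.groupby(groupby)
    })


# Toy data: perfectly equal in 2016, one record holds everything in 2019.
toy = pd.DataFrame({
    "year": [2016, 2016, 2019, 2019],
    "networth": [1.0, 1.0, 0.0, 1.0],  # assumed column name
    "wgt": [1.0, 1.0, 1.0, 1.0],       # assumed weight column name
})

trend = gini_by(toy, "networth", "wgt", "year")
```

On this toy data the index rises from 0 in 2016 to 0.5 in 2019. Real wealth data includes negative net worth, which this simple formula does not treat specially.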

Resources: