🐙 Evan Azevedo

Search

Search IconIcon to open search

28 - Using Meltano to connect DuckDB to Apache Superset

Last updated Oct 29, 2022 Edit Source

The awesome tool Meltano made it very easy for me to connect the data files I’ve downloaded with my DuckDB database and the Apache Superset dashboard that I’m going to create. It really simplifies the tak of creating an analytics data stack.

I also played around with Apache Superset a little bit, and noodled on the idea of how to represent the Opportunity Atlas data, which is separated by Census Tract with the other data which is separated by Zip code. The problem is that there can be multiple zip codes per census tract, and multiple census tracts per zip code (many-to-many). The simplest solution would be to average the census tracts up to the zip code.

So if census tract $A$ overlapped with zip codes $[z_1, z_2]$ and census tract $B$ overlaps with zip code $z_2$, then the data for zip code $z_2$ would be $\textrm{Opportunity Atlas}(z_2) = \frac{A + B}/{2}$. A more sophisticated calculation would weight each zip code based on the fraction belonging in each census tract.


I'm a freelance software developer located in Denver, Colorado. If you're interested in working together or would just like to say hi you can reach me at me@ this domain.