This project re-creates the UNData exploration exercise using live data from the World Bank API instead of static CSV files.
It demonstrates how to programmatically gather, clean, and merge international development indicators using Python and pandas.
- Retrieve GDP per capita (PPP, constant 2017 international $) data using the World Bank Indicators API.
- Retrieve life expectancy at birth (total years) data using the same API.
- Merge datasets to analyze the relationship between economic and health outcomes.
- Enhance the dataset by adding country metadata (region, income level, capital city) from the World Bank Country API.
- Filter and visualize specific country data such as the United States and Canada (2000–2021).
- Use pagination (the `page` parameter) to pull all available records without missing data.
- Identify and retrieve the indicator code for Public Expenditure on Education (% of GDP).
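The pagination step can be sketched as follows. This is a minimal illustration, not the notebook's exact code: the helper names `fetch_json` and `fetch_indicator`, the `per_page` value, and the injectable `get_json` argument are assumptions made for the example. It relies on the World Bank v2 API returning a two-element JSON list (paging metadata, then records) when `format=json` is requested.

```python
BASE_URL = "https://api.worldbank.org/v2"


def fetch_json(url, params):
    """Perform one GET request and return the decoded JSON payload."""
    # Imported here so the paging logic below can be exercised offline
    # with a stand-in for this function.
    import requests

    response = requests.get(url, params=params, timeout=30)
    response.raise_for_status()
    return response.json()


def fetch_indicator(indicator, date_range="2000:2021", per_page=1000,
                    get_json=fetch_json):
    """Collect every page of records for one indicator code.

    `get_json` is a hypothetical hook so a fake fetcher can be swapped in.
    """
    url = f"{BASE_URL}/country/all/indicator/{indicator}"
    records = []
    page = 1
    while True:
        params = {
            "format": "json",
            "per_page": per_page,
            "page": page,
            "date": date_range,
        }
        payload = get_json(url, params)
        # A successful response is [metadata, records]; anything else is
        # treated as an error message from the API.
        if not isinstance(payload, list) or len(payload) < 2:
            raise RuntimeError(f"Unexpected response: {payload!r}")
        meta, rows = payload
        records.extend(rows or [])
        # "pages" in the metadata gives the total number of pages.
        if page >= int(meta["pages"]):
            break
        page += 1
    return records
```

Calling `fetch_indicator("SP.DYN.LE00.IN")` would then request page 1, read the total page count from the metadata, and keep requesting until the last page has been collected.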
- Python 3
- Jupyter Notebook
- pandas
- requests
- matplotlib (optional, for visualization)
All data is retrieved live from the World Bank Open Data API:
| Indicator Name | Code | Description |
|---|---|---|
| GDP per capita, PPP (constant 2017 international $) | NY.GDP.PCAP.PP.KD | Measures economic productivity per person. |
| Life expectancy at birth, total (years) | SP.DYN.LE00.IN | Average lifespan at birth. |
| Public Expenditure on Education (% of GDP) | SE.XPD.TOTL.GD.ZS | Percentage of GDP spent on public education. |
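As a rough sketch of how the indicator codes above might be used, the snippet below maps each code to a column name and flattens raw API records into plain rows that `pd.DataFrame` can consume. The column names, the `INDICATORS` mapping, and the `flatten_records` helper are illustrative assumptions; the record fields (`countryiso3code`, `country.value`, `date`, `value`) follow the v2 indicator response as commonly returned for annual series.

```python
# Hypothetical mapping from World Bank indicator code to a short column name.
INDICATORS = {
    "NY.GDP.PCAP.PP.KD": "gdp_per_capita",
    "SP.DYN.LE00.IN": "life_expectancy",
    "SE.XPD.TOTL.GD.ZS": "education_spending",
}


def flatten_records(records, column):
    """Turn nested indicator records into flat dicts, one per country-year."""
    rows = []
    for rec in records:
        rows.append({
            "country_code": rec["countryiso3code"],
            "country": rec["country"]["value"],
            "year": int(rec["date"]),   # annual series use a "YYYY" string
            column: rec["value"],       # may be None when no data is reported
        })
    return rows


# Placeholder record for illustration only; the value is not real data.
sample = [{
    "countryiso3code": "CAN",
    "country": {"id": "CA", "value": "Canada"},
    "date": "2000",
    "value": 1.0,
}]
rows = flatten_records(sample, INDICATORS["SP.DYN.LE00.IN"])
```

Keeping `country_code` and `year` in every row gives each indicator table the same pair of keys, which makes the later merge straightforward.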
The final dataset (`final_df`) includes:
- Country information (name, region, income level, capital city)
- GDP per capita
- Life expectancy
- Education spending (optional)
- Yearly observations from 2000–2021
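One way the final dataset could be assembled is sketched below. It assumes each indicator has already been loaded into a DataFrame keyed by `country_code` and `year`, and that country metadata sits in a separate DataFrame keyed by `country_code`. The function names and column names are assumptions for the example, not the notebook's own.

```python
import pandas as pd


def build_final_df(gdp_df, life_df, countries_df):
    """Merge the two indicator tables, then attach country metadata."""
    # Inner join keeps only country-years present in both indicators.
    merged = gdp_df.merge(life_df, on=["country_code", "year"], how="inner")
    # Left join keeps every observation even if metadata is missing.
    return merged.merge(countries_df, on="country_code", how="left")


def select_countries(final_df, names, start=2000, end=2021):
    """Filter to the named countries within an inclusive year range."""
    mask = final_df["country"].isin(names) & final_df["year"].between(start, end)
    return final_df.loc[mask].sort_values(["country", "year"])
```

With these helpers, `select_countries(final_df, ["United States", "Canada"])` would return the 2000–2021 rows for the two countries, ready to plot with matplotlib. Education spending could be added with one more `merge` on the same keys.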