The tentative project title:
World Life Expectancy—Live to thrive
The motivation for this project:
A survival instinct keeps us asking how we live and how we can live
longer. People in different regions have different life expectancies.
Even within New York, life expectancy varies by area because of
differences in economic status, infrastructure, and so on. It is natural
to ask which factors have the greatest influence on life expectancy:
socioeconomic factors, public health factors, or both.
The intended final products:
- A website with a homepage giving an overview of the project and our
screencast, a page of exploratory plots, a page analyzing selected
factors and describing the approaches we used, a page with an
interactive dashboard built with R-Shiny, and a page with the report and
a link to GitHub.
- A report describing how we completed the project and any changes made
along the way.
- A short screencast describing our project.
The planned analyses / visualizations / coding challenges:
Analyses:
- Hypothesis test: difference in mean life expectancy among regions,
e.g. developed vs. developing countries
- Linear model: fit a linear model with Y = life expectancy and
X = economic / environmental / health determinants
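
The two analyses above can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the project's actual implementation (which may well be written in R). The choice of Welch's t statistic for the two-group comparison, the function names `welch_t` and `fit_line`, the single predictor, and all numbers are assumptions made for the example.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# Made-up illustrative values, NOT real life expectancy data
developed = [80.0, 82.0, 81.0, 79.0]
developing = [65.0, 70.0, 68.0, 61.0]
t, df = welch_t(developed, developing)

# Hypothetical single predictor (e.g. an economic indicator) vs. life expectancy
predictor = [1.0, 2.0, 3.0, 4.0]
life = [62.0, 64.0, 66.0, 68.0]
intercept, slope = fit_line(predictor, life)
```

A large absolute value of `t` relative to the t distribution with `df` degrees of freedom would suggest a difference in group means. The planned model has several determinants, so a multiple regression would replace the one-predictor `fit_line` shown here.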
Visualizations:
- EDA Plots: variable distributions / scatter plots with best-fit
lines / correlation plot of the variables
- Time Series Plot: life expectancy over time
- Interactive World Map: life expectancy by country and year
Challenges:
- Missing data: Many variables in our main dataset have missing
values.
- Selecting reasonable variables: We need to find more variables for
our dataset in order to cover society, economics, environment, health,
and so on.
- Website design: We want to build a more polished website with
advanced settings.
- R-Shiny interactive dashboard: We plan to use R-Shiny to build an
interactive platform.
- Model fitting: Our dataset contains many predictors of life
expectancy. Variable selection and interpretation could be
challenging.
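
One possible first step for the missing-data challenge listed above is to measure how much of each variable is missing before deciding which variables to keep. The sketch below is only an illustration in Python. The column names, the rows, the helper names `missing_share` and `keep_columns`, and the 50% cutoff are all hypothetical and do not come from our dataset.

```python
def missing_share(rows):
    """Return, for each column, the fraction of rows where the value is None."""
    columns = rows[0].keys()
    return {c: sum(r[c] is None for r in rows) / len(rows) for c in columns}


def keep_columns(shares, max_missing=0.5):
    """Keep columns whose missing share does not exceed the assumed cutoff."""
    return [c for c, s in shares.items() if s <= max_missing]


# Made-up rows with hypothetical column names, NOT real data
rows = [
    {"life_expectancy": 71.2, "gdp": 1.5, "schooling": None},
    {"life_expectancy": 65.0, "gdp": None, "schooling": None},
    {"life_expectancy": 80.1, "gdp": 4.2, "schooling": None},
    {"life_expectancy": 77.4, "gdp": 3.9, "schooling": 12.0},
]
shares = missing_share(rows)
kept = keep_columns(shares)
```

Dropping sparse variables is only one option. Imputing values or restricting the analysis to complete cases are alternatives, and the right choice depends on why the values are missing.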
The planned timeline:
- 11.12 Proposal
- 11.12~11.15 Find more related datasets to include more determinants
of life expectancy. Determine the variables/predictors. Explore the
data and decide exactly which plots to draw.
- 11.15 Meeting with TA.
- 11.15~11.20 Assign individual tasks and schedule weekly meetings.
Finish the hypothesis test part and begin the EDA section.
- 11.21~11.27 Finish the EDA section, the time series plot, and the
linear model analysis; begin the interactive world map.
- 11.28~12.4 Finish the world map, begin the website design, and begin
writing the report.
- 12.5~12.10 Finish the report and video.
- 12.11~ Enjoy life without the DS project. (So sad TAT)