Project Overview
This project aims to analyze the average temperatures of the United States over time using linear regression. The dataset used in this project is the "Global Land Temperatures By Country" dataset, which includes temperature records for various countries. The project involves data preprocessing, fitting a linear regression model to the data, and visualizing trends to make future predictions.
Objectives
The goal of this project was to demonstrate basic linear regression in Python using a dataset of global temperatures.
Design
The project was set up as a single main Python file that reads in the data, performs preprocessing, and fits a linear regression model.
Data & Analysis
Features
- Data Cleaning: Handling missing values with forward fill and filtering out the inaccurate records from before 1820.
- Yearly Resampling: Resampling the temperature data to obtain yearly average temperatures.
- Linear Regression: Using linear regression to fit a trend line and predict temperatures for the next 50 years.
- Uncertainty Calculation: Calculating the upper and lower bounds of uncertainty in the average temperature estimates.
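The features above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`dt`, `AverageTemperature`, `AverageTemperatureUncertainty`, `Country`) are assumed to match the Kaggle CSV, the function names are hypothetical, and the uncertainty bounds are assumed to be the yearly mean temperature plus or minus the yearly mean uncertainty.

```python
import numpy as np
import pandas as pd

# Assumed column names; adjust if the CSV differs.
COLS = ["AverageTemperature", "AverageTemperatureUncertainty"]


def yearly_temperatures(df, country="United States", start_year=1820):
    """Clean one country's monthly records and resample to yearly means."""
    data = df[df["Country"] == country].copy()
    data["dt"] = pd.to_datetime(data["dt"])
    data = data.set_index("dt").sort_index()

    # Data cleaning: forward-fill gaps, then drop records before start_year.
    data[COLS] = data[COLS].ffill()
    data = data[data.index.year >= start_year]

    # Yearly resampling: "YS" groups the monthly rows by calendar year.
    yearly = data[COLS].resample("YS").mean().dropna()

    # Uncertainty bounds (assumed definition: mean +/- mean uncertainty).
    yearly["upper"] = yearly["AverageTemperature"] + yearly["AverageTemperatureUncertainty"]
    yearly["lower"] = yearly["AverageTemperature"] - yearly["AverageTemperatureUncertainty"]
    return yearly


def fit_trend(yearly, horizon=50):
    """Fit a straight line to the yearly means and extrapolate `horizon` years."""
    years = yearly.index.year.to_numpy()
    temps = yearly["AverageTemperature"].to_numpy()

    # Ordinary least squares fit of a degree-1 polynomial.
    slope, intercept = np.polyfit(years, temps, deg=1)

    future_years = np.arange(years[-1] + 1, years[-1] + 1 + horizon)
    predicted = slope * future_years + intercept
    return slope, intercept, future_years, predicted
```

With the data loaded through `pd.read_csv` (the file name depends on the local copy of the dataset), calling `fit_trend(yearly_temperatures(df))` would return the fitted slope in degrees per year together with the 50 extrapolated values.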
Conclusion
The project successfully demonstrated the use of linear regression to analyze average United States temperatures over time, using a dataset of global temperatures.
Results
Challenges
N/A
Future Work
- Implement additional statistical tests to evaluate model assumptions such as linearity, independence, homoscedasticity, and normality.
- Explore different models such as polynomial regression or time series analysis to better capture non-linear patterns in temperature trends.
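As one possible starting point for these items, the sketch below compares polynomial fits of different degrees and computes a Durbin-Watson statistic as a rough check of residual independence. It is an illustration only: the function names are hypothetical, and the inputs are assumed to be NumPy arrays of years and yearly mean temperatures.

```python
import numpy as np


def polynomial_rmse(years, temps, degrees=(1, 2)):
    """Return the in-sample RMSE of a polynomial fit for each degree."""
    # Centering the years keeps the least squares problem well conditioned.
    x = years - years.mean()
    rmse = {}
    for degree in degrees:
        coeffs = np.polyfit(x, temps, deg=degree)
        residuals = temps - np.polyval(coeffs, x)
        rmse[degree] = float(np.sqrt(np.mean(residuals ** 2)))
    return rmse


def durbin_watson(residuals):
    """Durbin-Watson statistic: values near 2 suggest little autocorrelation."""
    residuals = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(residuals) ** 2) / np.sum(residuals ** 2))
```

A lower in-sample RMSE for a higher degree does not by itself show a better model, since extra terms always fit the training data at least as well; a held-out period or an information criterion would give a fairer comparison.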
References
- "Global Land Temperatures By Country" dataset, provided by Kaggle.