John Curran

Project Overview


This project aims to analyze the average temperatures of the United States over time using linear regression. The dataset used in this project is the "Global Land Temperatures By Country" dataset, which includes temperature records for various countries. The project involves data preprocessing, fitting a linear regression model to the data, and visualizing trends to make future predictions.

Objectives

The goal of this project was to demonstrate basic linear regression in Python, using a dataset of global land temperatures.

Design

The project was set up as a single main Python file that reads in the data, performs preprocessing, and fits a linear regression model.

Data & Analysis

Features

Data Cleaning: Handling missing values with forward fill and excluding records before 1820, which were treated as inaccurate.
Yearly Resampling: Resampling the temperature data to obtain yearly average temperatures.
Linear Regression: Using linear regression to fit a trend line and extend it 50 years past the last record to predict future temperatures.
Uncertainty Calculation: Calculating the upper and lower bounds of uncertainty in the average temperature estimates.
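
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`dt`, `AverageTemperature`, `AverageTemperatureUncertainty`, `Country`) are assumed to match the Kaggle CSV, the function name `yearly_trend` is hypothetical, `numpy.polyfit` stands in for whichever regression library the project used, and a small synthetic DataFrame replaces the real file so the example runs on its own.

```python
import numpy as np
import pandas as pd


def yearly_trend(df, country="United States", start_year=1820, horizon=50):
    """Clean monthly records, average them per year, and fit a straight line."""
    data = df[df["Country"] == country].copy()
    data["dt"] = pd.to_datetime(data["dt"])
    data = data.set_index("dt").sort_index()

    # Data cleaning: forward fill gaps, then drop the early records.
    cols = ["AverageTemperature", "AverageTemperatureUncertainty"]
    data[cols] = data[cols].ffill()
    data = data[data.index.year >= start_year]

    # Yearly resampling: mean of the monthly values in each calendar year.
    yearly = data[cols].resample("YS").mean().dropna()

    # Linear regression: least squares fit of temperature against year.
    years = yearly.index.year.to_numpy(dtype=float)
    temps = yearly["AverageTemperature"].to_numpy()
    slope, intercept = np.polyfit(years, temps, 1)

    # Extend the fitted line `horizon` years past the last observation.
    last_year = int(years[-1])
    future_years = np.arange(last_year + 1, last_year + 1 + horizon)
    forecast = slope * future_years + intercept

    # Uncertainty: bounds from the reported measurement uncertainty.
    yearly["Upper"] = yearly["AverageTemperature"] + yearly["AverageTemperatureUncertainty"]
    yearly["Lower"] = yearly["AverageTemperature"] - yearly["AverageTemperatureUncertainty"]
    return yearly, slope, intercept, future_years, forecast


# Synthetic stand-in for the Kaggle CSV (assumed column layout, invented values).
dates = pd.date_range("1800-01-01", "2012-12-01", freq="MS")
demo = pd.DataFrame({
    "dt": dates.strftime("%Y-%m-%d").to_numpy(),
    "AverageTemperature": 8.0 + 0.01 * (dates.year.to_numpy() - 1800),
    "AverageTemperatureUncertainty": 0.3,
    "Country": "United States",
})
demo.loc[300, "AverageTemperature"] = np.nan  # simulate one missing month

yearly, slope, intercept, future_years, forecast = yearly_trend(demo)
print(f"Trend: {slope:.4f} degrees C per year")
```

With the real data, `demo` would be replaced by the DataFrame loaded from the CSV. Note that the bounds here simply add and subtract the averaged uncertainty column; whether the project derived its bounds this way is an assumption.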

Conclusion

The project successfully demonstrated the use of linear regression to analyze average U.S. temperatures over time.

Results

Linear Regression Graph

Challenges

N/A

Future Work

Implement additional statistical tests to evaluate model assumptions such as linearity, independence, homoscedasticity, and normality.
Explore different models such as polynomial regression or time series analysis to better capture non-linear patterns in temperature trends.
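
As a rough starting point for both items, the sketch below compares a linear and a quadratic fit by in-sample RMSE and computes the Durbin-Watson statistic on the linear residuals as a simple check of independence. It is illustrative only: the helper names `fit_residuals` and `durbin_watson` are hypothetical, and the yearly series is synthetic rather than the project's data.

```python
import numpy as np


def fit_residuals(years, temps, degree):
    """Fit a polynomial of the given degree and return its residuals."""
    x = years - years.mean()  # centre the years to keep the fit well conditioned
    coeffs = np.polyfit(x, temps, degree)
    return temps - np.polyval(coeffs, x)


def durbin_watson(residuals):
    """Durbin-Watson statistic: near 2 suggests little first-order autocorrelation."""
    return float(np.sum(np.diff(residuals) ** 2) / np.sum(residuals ** 2))


# Synthetic yearly series with mild curvature (invented values).
rng = np.random.default_rng(0)
years = np.arange(1820, 2014, dtype=float)
temps = 8.0 + 0.00005 * (years - 1820) ** 2 + rng.normal(0.0, 0.2, years.size)

res_linear = fit_residuals(years, temps, 1)
res_quadratic = fit_residuals(years, temps, 2)
rmse_linear = float(np.sqrt(np.mean(res_linear ** 2)))
rmse_quadratic = float(np.sqrt(np.mean(res_quadratic ** 2)))
dw_linear = durbin_watson(res_linear)

print(f"RMSE linear: {rmse_linear:.3f}, quadratic: {rmse_quadratic:.3f}")
print(f"Durbin-Watson (linear residuals): {dw_linear:.2f}")
```

In-sample RMSE never gets worse when a higher-degree term is added, so a fair comparison would hold out the most recent years and score the forecasts on those instead.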

References

Dataset provided by Kaggle.

View on GitHub