## Polynomial Regression

### Introduction

Polynomial regression is a form of regression analysis in which the relationship between the independent variable *x* and the dependent variable *y* is modeled as an *n*-degree polynomial function. Unlike simple linear regression, which assumes a linear relationship between the variables, polynomial regression allows for more complex relationships to be modeled.

### How Polynomial Regression Works

**Linear Regression**

In simple linear regression, we model the relationship between one independent variable *x* and one dependent variable *y* as a straight line:

$$y = \beta_0 + \beta_1 x + \varepsilon$$

**where:**

$y$ is the dependent variable, $x$ is the independent variable, $\beta_0$ is the intercept, $\beta_1$ is the slope, and $\varepsilon$ is the error term.

**Polynomial Regression**

Polynomial regression extends this concept by allowing the relationship between *x* and *y* to be modeled as an *n*-degree polynomial function:

$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \dots + \beta_n x^n + \varepsilon$$

**where:**

$y$ is the dependent variable, $x$ is the independent variable, $\beta_0$ is the intercept, $\beta_1, \beta_2, \dots, \beta_n$ are the coefficients for each degree of the polynomial, and $\varepsilon$ is the error term.
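To make the formula concrete, here is a minimal sketch that evaluates a quadratic (*n* = 2) with NumPy. The coefficient values are made up for illustration, and the error term is left out.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Hypothetical coefficients [beta_0, beta_1, beta_2], lowest degree first
beta = np.array([1.0, 2.0, 0.5])
x = np.array([0.0, 1.0, 2.0])

# polyval computes beta_0 + beta_1 * x + beta_2 * x**2 for each x
y = P.polyval(x, beta)
print(y)  # values 1.0, 3.5 and 7.0
```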

**Degree of Polynomial**

The degree of the polynomial, *n*, determines how flexible the fitted curve is. A higher-degree polynomial can capture more intricate patterns in the data but may also overfit, learning noise rather than the underlying relationship, while a degree that is too low underfits. Choosing the degree is therefore a trade-off between bias and variance.
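The effect of the degree can be sketched on synthetic data. The ground-truth function, noise level, sample sizes, and the three degrees below are all assumptions chosen for illustration, not part of the ice cream example.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def sample(n):
    """Draw n noisy points from an assumed ground truth, sin(2*pi*x)."""
    x = rng.uniform(0.0, 1.0, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, n)
    return x, y

x_train, y_train = sample(20)
x_test, y_test = sample(200)

train_mse, test_mse = {}, {}
for degree in (1, 3, 12):
    # Least-squares fit of a polynomial of the given degree
    p = Polynomial.fit(x_train, y_train, degree)
    train_mse[degree] = np.mean((p(x_train) - y_train) ** 2)
    test_mse[degree] = np.mean((p(x_test) - y_test) ** 2)
    print(f"degree={degree:2d}  train MSE={train_mse[degree]:.4f}  "
          f"test MSE={test_mse[degree]:.4f}")
```

Training error cannot increase as the degree grows, because each lower-degree model is a special case of the higher-degree one. Test error typically stops improving and then worsens once the polynomial starts fitting noise.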

**Model Fitting**

Although the fitted curve is non-linear in *x*, the model is linear in its coefficients, so it can be fitted exactly like a multiple linear regression on the features $x, x^2, \dots, x^n$. The coefficients are estimated with ordinary least squares (OLS), or iteratively with gradient descent, by minimizing the sum of squared residuals between the observed and predicted values.
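As a minimal sketch of the OLS route, the following builds the design matrix by hand and solves the least-squares problem with NumPy. The quadratic and its coefficients are assumed values, and the data is noise-free so the recovered coefficients can be checked by eye.

```python
import numpy as np

# Noise-free synthetic data from an assumed quadratic: y = 1 - 2x + 0.5x^2
true_beta = np.array([1.0, -2.0, 0.5])
x = np.linspace(-2.0, 2.0, 9)
y = true_beta[0] + true_beta[1] * x + true_beta[2] * x**2

# Design matrix with columns [1, x, x^2]
X = np.vander(x, N=3, increasing=True)

# OLS: find beta that minimises ||X @ beta - y||^2
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)  # close to [1.0, -2.0, 0.5]
```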

**Model Evaluation**

After fitting the model, we evaluate it with metrics such as mean squared error (MSE) or R-squared (the coefficient of determination). Computed on the training data, these metrics show how well the model fits; computed on held-out data, or averaged over cross-validation folds, they estimate how well it generalizes to unseen data.
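One common way to combine evaluation with the choice of degree is to compare cross-validated MSE across candidate degrees. The sketch below uses synthetic data with an assumed quadratic trend; the fold count and the range of degrees are arbitrary choices.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (an assumption for this sketch): quadratic trend plus noise
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(60, 1))
y = 2.0 + 0.5 * X[:, 0] ** 2 + rng.normal(0.0, 0.5, size=60)

cv = KFold(n_splits=5, shuffle=True, random_state=0)
cv_mse = {}
for degree in range(1, 6):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    # scikit-learn reports negated MSE so that higher scores are better
    scores = cross_val_score(model, X, y, cv=cv,
                             scoring="neg_mean_squared_error")
    cv_mse[degree] = -scores.mean()

best_degree = min(cv_mse, key=cv_mse.get)
print(cv_mse)
print("Degree with lowest cross-validated MSE:", best_degree)
```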

In summary, polynomial regression is a flexible technique that allows us to capture non-linear relationships between variables by fitting polynomial functions to the data. However, care must be taken in choosing the appropriate degree of the polynomial to prevent overfitting and ensure the model's generalizability.

### Example: Predicting Ice Cream Sales

Let's illustrate polynomial regression with an example. Suppose we have a dataset containing the temperature and the number of ice creams sold on each of several days. We want to predict the number of ice creams sold from the temperature.

**Here's a step-by-step explanation of how to perform polynomial regression:**

**Import Necessary Libraries**

We need libraries like numpy for numerical computations, pandas for data manipulation, matplotlib for data visualization, and sklearn for polynomial regression modeling.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
```

**Load and Explore Data**

Let's assume we have a CSV file named 'ice_cream_sales.csv' containing the temperature and the number of ice creams sold on different days. We load and explore the data to understand its structure.

```python
data = pd.read_csv('ice_cream_sales.csv')
print(data.head())
```
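If no such file is at hand, a synthetic stand-in can be generated to follow along. The column names match the ones used in this walkthrough; the generating formula, the noise level, and the Celsius temperature range are invented for illustration and are not real sales figures.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Synthetic stand-in data, not real sales figures
temperature = rng.uniform(10.0, 35.0, size=100)  # assumed to be in degrees Celsius
sales = 20 + 0.3 * (temperature - 10) ** 2 + rng.normal(0.0, 10.0, size=100)

data = pd.DataFrame({
    "Temperature": temperature.round(1),
    # Clip at zero and round, since sales are non-negative whole numbers
    "IceCreamsSold": np.clip(sales, 0, None).round().astype(int),
})
print(data.head())
```

Calling `data.to_csv('ice_cream_sales.csv', index=False)` afterwards writes the frame to disk so that the `pd.read_csv` call can load it.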

**Prepare Data**

We extract the independent variable (temperature) and the dependent variable (number of ice creams sold) from the dataset.

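```python
# Independent variable (temperature), reshaped to the 2-D array sklearn expects
X = data['Temperature'].values.reshape(-1, 1)
# Dependent variable (number of ice creams sold)
y = data['IceCreamsSold']
```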

**Polynomial Features**

We use PolynomialFeatures from sklearn to create polynomial features up to a specified degree. With a degree of 3, the single temperature column *x* is expanded into four columns: a constant 1, *x*, *x*², and *x*³.

```python
degree = 3
poly_features = PolynomialFeatures(degree=degree)
X_poly = poly_features.fit_transform(X)
```
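To see what the transformation produces, here is a small sketch on two made-up input values. The names `X_demo`, `poly`, and `X_demo_poly` are used only for this illustration.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X_demo = np.array([[2.0], [3.0]])  # two made-up temperature values

poly = PolynomialFeatures(degree=3)
X_demo_poly = poly.fit_transform(X_demo)

# Columns are [1, x, x^2, x^3]; the leading 1 is the bias column
print(X_demo_poly)  # rows: [1, 2, 4, 8] and [1, 3, 9, 27]
```

Because LinearRegression fits its own intercept by default, the bias column is redundant and can be dropped with `include_bias=False`. Doing so shifts the remaining columns left by one, so any code that indexes the temperature as column 1 would need to use column 0 instead.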

**Split Data**

We split the data into training and testing sets to evaluate the model's performance.

```python
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X_poly, y, test_size=0.2, random_state=42)
```

**Create and Fit the Model**

We create a LinearRegression model and fit it to the polynomial features.

```python
model = LinearRegression()
model.fit(X_train, y_train)
```

**Make Predictions**

We use the trained model to make predictions on the test data.

```python
y_pred = model.predict(X_test)
```

**Evaluate the Model**

We evaluate the model's performance using metrics like mean squared error (MSE) or R-squared.

```python
from sklearn.metrics import mean_squared_error, r2_score

mse = mean_squared_error(y_test, y_pred)
r_squared = r2_score(y_test, y_pred)

print("Mean Squared Error:", mse)
print("R-squared:", r_squared)
```
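For reference, both metrics can be computed by hand. The sketch below uses four made-up values, chosen only to keep the arithmetic easy to follow, and compares the results with scikit-learn's functions.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Made-up values, chosen only to keep the arithmetic easy to follow
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_hat = np.array([2.5, 5.5, 7.0, 8.0])

residuals = y_true - y_hat                      # 0.5, -0.5, 0.0, 1.0
mse_manual = np.mean(residuals ** 2)            # (0.25 + 0.25 + 0 + 1) / 4 = 0.375
ss_res = np.sum(residuals ** 2)                 # 1.5
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # 9 + 1 + 1 + 9 = 20
r2_manual = 1 - ss_res / ss_tot                 # 1 - 1.5 / 20 = 0.925

print(mse_manual, mean_squared_error(y_true, y_hat))
print(r2_manual, r2_score(y_true, y_hat))
```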

**Visualize Results**

We visualize the actual vs. predicted values to understand how well the model fits the data.

```python
# Column 1 of X_test holds the original temperature (column 0 is the constant bias term)
plt.scatter(X_test[:, 1], y_test, color='blue', label='Actual')     # Observed test values
plt.scatter(X_test[:, 1], y_pred, color='red', label='Predicted')  # Model predictions
plt.xlabel("Temperature")
plt.ylabel("Ice Creams Sold")
plt.title("Polynomial Regression")
plt.legend()
plt.show()
```

In this example, we used polynomial regression to model the relationship between temperature and the number of ice creams sold. By transforming the original features into polynomial features, we were able to capture non-linear relationships between the variables. The resulting model can then be used to make predictions and understand how changes in temperature affect ice cream sales.
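To predict sales for a new temperature, the feature transformation and the regression can be chained so that raw temperatures are passed straight to `predict`. The sketch below is one way to do this with scikit-learn's `make_pipeline`; the data is a synthetic, noise-free stand-in with an invented formula, not real sales figures.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic, noise-free stand-in for the ice cream data (not real figures)
temps = np.linspace(10.0, 35.0, 30).reshape(-1, 1)
sold = 20 + 0.3 * (temps[:, 0] - 10) ** 2

# include_bias=False because LinearRegression already fits an intercept
pipeline = make_pipeline(
    PolynomialFeatures(degree=3, include_bias=False),
    LinearRegression(),
)
pipeline.fit(temps, sold)

# Raw temperatures go in; the pipeline applies the polynomial expansion itself
new_temps = np.array([[25.0]])
predicted = pipeline.predict(new_temps)
print(predicted)  # close to 20 + 0.3 * 15**2 = 87.5 for this noise-free data
```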