Residual Calculation:
Residuals represent the difference between observed values and predicted values in statistical models. They are crucial for assessing model fit and identifying patterns that the model doesn't capture.
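The basic calculation in R is residuals <- observed - predicted. Alternatively, R's built-in function residuals(model) extracts the residuals directly from a fitted model.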
Where:
observed - Actual measured values
predicted - Values predicted by your model
model - An R model object (e.g., from lm())

Details: Examining residuals helps verify model assumptions (linearity, homoscedasticity, normality) and identify outliers or influential points.
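The manual subtraction and the built-in extractor described above can be compared with a short sketch. The data values and the variable names x, manual_residuals, and builtin_residuals are made up for illustration and are not from the original page.

```r
# Hypothetical example data (assumed for illustration only)
x <- c(1, 2, 3, 4, 5)
observed <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Fit a simple linear model; `model` is an lm object
model <- lm(observed ~ x)

# Values predicted by the model for the same observations
predicted <- fitted(model)

# Manual calculation: observed minus predicted
manual_residuals <- observed - predicted

# Built-in extractor applied to the model object
builtin_residuals <- residuals(model)

# Both approaches should agree up to floating-point tolerance
print(all.equal(unname(manual_residuals), unname(builtin_residuals)))
```

For a least-squares fit with an intercept, the residuals also sum to approximately zero, which is a quick sanity check on either calculation.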
Tips: Enter comma-separated values for both observed and predicted. Ensure both lists have the same number of values.
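As a rough sketch of what the tip implies, the snippet below parses two comma-separated strings in R and checks that they have the same length before subtracting. The helper name parse_values and the input strings are hypothetical; the page does not show how the calculator handles its input.

```r
# Hypothetical helper: split a comma-separated string into numbers
parse_values <- function(text) {
  as.numeric(trimws(strsplit(text, ",", fixed = TRUE)[[1]]))
}

# Assumed example inputs, as they might be typed into the two fields
observed  <- parse_values("3.2, 4.1, 5.0, 6.3")
predicted <- parse_values("3.0, 4.4, 5.1, 6.0")

# Both lists must have the same number of values
stopifnot(length(observed) == length(predicted))

# Element-wise residuals
residuals_out <- observed - predicted
print(residuals_out)
```

Checking the lengths explicitly matters in R because vectors of unequal length are silently recycled during subtraction, sometimes with only a warning.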
Q1: What do positive/negative residuals mean?
A: Positive means observed > predicted (underprediction), negative means observed < predicted (overprediction).
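A small illustration of the sign convention, using made-up numbers. The labels in direction are just descriptive strings chosen for this sketch.

```r
# Hypothetical values for illustration
observed  <- c(10, 12, 15)
predicted <- c(8, 12, 17)

res <- observed - predicted

# Label each residual by its sign
direction <- ifelse(res > 0, "underprediction",
                    ifelse(res < 0, "overprediction", "exact"))

print(data.frame(observed, predicted, residual = res, direction))
```

Here the first observation is underpredicted (residual 2), the second is predicted exactly (residual 0), and the third is overpredicted (residual -2).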
Q2: How should residuals be distributed in a good model?
A: Ideally, residuals should be randomly distributed around zero with constant variance.
Q3: What's the difference between raw and standardized residuals?
A: Standardized residuals are raw residuals divided by an estimate of their standard deviation, which puts them on a common scale and makes them comparable across observations.
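For lm objects, base R provides rstandard() for this scaling. The sketch below, on made-up data, also reproduces the values by hand from the residual standard error and the hat values, assuming the usual definition for linear models.

```r
# Hypothetical example data
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(1.2, 2.3, 2.9, 4.1, 5.2, 5.8, 7.1, 9.5)
model <- lm(y ~ x)

raw <- residuals(model)            # raw residuals
sigma_hat <- summary(model)$sigma  # residual standard error
h <- hatvalues(model)              # leverage of each observation

# Manual scaling: residual / (sigma * sqrt(1 - leverage))
manual_std <- raw / (sigma_hat * sqrt(1 - h))

# Built-in standardized residuals
std <- rstandard(model)

print(all.equal(unname(manual_std), unname(std)))
```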
Q4: When should I use studentized residuals?
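A: Use them to identify outliers. They adjust each residual for the observation's leverage and, in the externally studentized form, leave that observation out when estimating the error variance, so an extreme point cannot hide by inflating the overall error estimate.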
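A minimal sketch of outlier screening with rstudent(), which returns externally studentized residuals for lm objects. The data are invented, with one value deliberately shifted, and the cutoff of 2 is only a common rule of thumb assumed here, not a threshold stated in the original.

```r
# Hypothetical data following roughly y = 2 * x, with observation 6 shifted upward
x <- 1:10
y <- c(2.1, 3.9, 6.2, 8.0, 9.8, 20.0, 14.1, 15.9, 18.2, 19.9)
model <- lm(y ~ x)

# Externally studentized residuals
stud <- rstudent(model)

# Assumed rule-of-thumb cutoff for flagging potential outliers
cutoff <- 2
flagged <- which(abs(stud) > cutoff)

# Index of the most extreme observation
most_extreme <- unname(which.max(abs(stud)))
print(most_extreme)
print(flagged)
```

A flagged point is a candidate for closer inspection rather than automatic removal.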
Q5: How can I visualize residuals in R?
A: Use plot(model) for diagnostic plots or ggplot2 to create custom residual plots.
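A short sketch of both ideas using base graphics only, so it runs without extra packages; a ggplot2 version would follow the same pattern of plotting fitted values against residuals. The data and the output file are assumptions for illustration, and the plots are written to a temporary PDF so the sketch also works in a non-interactive session.

```r
# Hypothetical example data
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(1.2, 2.3, 2.9, 4.1, 5.2, 5.8, 7.1, 9.5)
model <- lm(y ~ x)

# Write plots to a temporary PDF file
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)

# Built-in diagnostic plot: residuals vs fitted values
plot(model, which = 1)

# Custom residual plot with a reference line at zero
plot(fitted(model), residuals(model),
     xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs fitted")
abline(h = 0, lty = 2)

dev.off()
```

In an interactive session, the pdf() and dev.off() calls can be dropped to draw directly to the screen.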