
Title: UCSF - Simple Linear Regression

Lead Author(s): David Glidden, PhD


Slide 1: Simple Linear Regression

Example
  • HERS: Randomized clinical trial n=2763
  • Postmenopausal women with history of MI
  • Randomized to placebo v. hormone therapy
  • Outcome: Second MI
  • Wealth of baseline data
  • HDL associated w/ waist circumference?
  • Sample of 221 subjects - baseline data
Scatterplot

Slide 2: Why regression?

  • Take as given that mean is a good summary
  • How does mean HDL depend on waist circumference?
  • Other methods are not ideal
  • Try forming groups based on value of waist circumference...
Scatterplot: Mean & 95% CI

Slide 3: Grouping Approach

Scatterplot - Waist Circumference
  • Advantages:
    • Simplicity
    • Interpretable
  • Disadvantages:
    • Choice of groups
    • More groups: resolution but variability
    • Maybe important differences w/in groups
    • Hard to describe
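
The grouping approach can be sketched in a few lines of Python. This is only an illustration: the data below are made up (they are not the HERS sample), `group_means` is a hypothetical helper, and the 95% CI uses the normal multiplier 1.96 where a t multiplier would suit groups this small.

```python
from math import sqrt
from statistics import mean, stdev

# Made-up (waist cm, HDL mg/dL) pairs for illustration only; NOT the HERS data.
data = [(70, 62), (74, 58), (78, 60), (82, 55), (85, 57), (88, 50),
        (91, 52), (95, 49), (99, 51), (103, 47), (108, 45), (115, 46)]

def group_means(pairs, n_groups):
    """Split pairs into equal-sized groups by x and summarize y in each group."""
    ordered = sorted(pairs)                      # sort by waist circumference
    size = len(ordered) // n_groups
    summaries = []
    for g in range(n_groups):
        start = g * size
        # the last group absorbs any leftover observations
        end = len(ordered) if g == n_groups - 1 else start + size
        chunk = ordered[start:end]
        ys = [y for _, y in chunk]
        m = mean(ys)
        # rough 95% CI: mean +/- 1.96 * standard error (normal approximation)
        half = 1.96 * stdev(ys) / sqrt(len(ys))
        summaries.append((chunk[0][0], chunk[-1][0], m, m - half, m + half))
    return summaries

for lo, hi, m, ci_lo, ci_hi in group_means(data, 4):
    print(f"waist {lo}-{hi} cm: mean HDL {m:.1f} (95% CI {ci_lo:.1f}, {ci_hi:.1f})")
```

Changing `n_groups` shows the trade-off listed above: more groups give finer resolution, but each mean rests on fewer observations and its interval widens.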

Slide 4: Linear Regression

Linear Regression
  • No groups
  • Mean changes continuously with predictor
  • Uses all the data to predict mean at any point (borrows strength)
  • Can work around linear assumption
  • Has a straightforward interpretation!
(See also: Linear Regression)

Slide 5: Linear Regression Mean

Is Line Reasonable?

Slide 6: Example

  • Study of 52 HIV+ individuals
  • Recruited from the SFGH Neurology clinic
  • Studied cross-sectionally
  • Not on antiretroviral therapy
  • HIV-RNA determined in plasma and CSF
  • How are the two related?
Does line fit well?

Slide 7: Scatterplot Smoother

Scatterplot with line
  • Nonparametric method
  • Draws and connects a series of local lines
  • Result: flexible smooth curve
  • Mean of y as a function of x: a non-linear regression line
  • Useful tool for exploring association

Slide 8: Suppose - Let's consider 4 equal-sized groups

Slide 9: Scatterplot Smoother

  • Many methods for smoothing.
  • LOWESS (locally weighted scatterplot smoothing) is the most popular
  • How smooth the line is depends on user inputs
  • Oversmooth: linear fit
  • Undersmooth: connects the dots
  • Programs have defaults (80% smoothing)
  • Methods work by "local" regression
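
To make "local" regression concrete, here is a minimal Python sketch of a local linear smoother with tricube weights. It is not Stata's `lowess` implementation and it omits the robustness iterations of full LOWESS. The function name `local_linear_smooth` and the `frac` parameter are assumptions made for this example, and the data are made up.

```python
def local_linear_smooth(xs, ys, frac=0.8):
    """Fit a weighted straight line around each x and return the fitted values.

    frac is the share of observations used in each local fit
    (0.8 mirrors the 80% default mentioned above).
    """
    n = len(xs)
    k = max(2, int(round(frac * n)))          # neighbours per local fit
    fitted = []
    for x0 in xs:
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1.0               # bandwidth: distance to k-th nearest point
        # tricube weights: close points count most, points at distance >= h count zero
        ws = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(ws)
        xw = sum(w * x for w, x in zip(ws, xs)) / sw
        yw = sum(w * y for w, y in zip(ws, ys)) / sw
        sxx = sum(w * (x - xw) ** 2 for w, x in zip(ws, xs))
        sxy = sum(w * (x - xw) * (y - yw) for w, x, y in zip(ws, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0  # fall back to the weighted mean
        fitted.append(yw + slope * (x0 - xw))
    return fitted

# Made-up curved data: y = x squared
xs = list(range(11))
ys = [x ** 2 for x in xs]

wiggly = local_linear_smooth(xs, ys, frac=0.3)   # undersmoothed: follows the points
stiff = local_linear_smooth(xs, ys, frac=1.0)    # heavily smoothed: straighter
err_wiggly = max(abs(f - y) for f, y in zip(wiggly, ys))
err_stiff = max(abs(f - y) for f, y in zip(stiff, ys))
print(err_wiggly, err_stiff)
```

A small `frac` tracks the individual points, while a large `frac` moves the curve toward a single straight-line fit, matching the undersmooth and oversmooth cases above.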

Slide 10: STATA

  • twoway scatter csfrna hivrna (scatterplot)
  • lowess csfrna hivrna (lowess curve for data)
  • Menu: Graphics, Twoway Graphs
Linear regression variability

Slide 11: Next Example

  • Based on a 19 subject subsample of the HERS data
  • Makes it easier to visualize data
  • Effect of outliers is more vivid

Slide 12: Scatterplot + Fitted

Scatterplot and Fitted

Slide 13: Least Squares

Simple Linear Regression - Scatterplot
(See also: Least Squares)
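
Least squares picks the intercept and slope that minimize the sum of squared vertical distances between the points and the line. The Python sketch below uses the standard closed-form solution on made-up data (not the HERS subsample); `least_squares` and `sum_sq_resid` are hypothetical helper names.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar     # the line passes through (x_bar, y_bar)
    return intercept, slope

def sum_sq_resid(xs, ys, intercept, slope):
    """Sum of squared vertical distances from the points to a given line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Made-up waist (cm) and HDL (mg/dL) values for illustration only
waist = [80, 90, 100, 110]
hdl = [56, 52, 50, 46]

b0, b1 = least_squares(waist, hdl)
print(f"intercept {b0:.1f}, slope {b1:.2f}")

# Any other line has a larger sum of squared residuals
best = sum_sq_resid(waist, hdl, b0, b1)
other = sum_sq_resid(waist, hdl, b0, b1 + 0.05)
print(best, other)
```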

Slide 14: Effect of Outlier

Which fits better?
  • Effect of an outlier is large if:
    • Predictor value is far from mean
    • Large residual in regression
    • Relatively few values
  • Same outlier has little effect in big dataset
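
These conditions can be checked numerically. The sketch below is a hedged illustration on made-up, noise-free data: the line `80 - 0.3 * x`, the outlier positions, and the helper `fit_slope` are all assumptions chosen for the example. It adds the same 32-unit residual in three situations and compares how far the fitted slope moves.

```python
def fit_slope(points):
    """Least-squares slope for a list of (x, y) pairs."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    return sxy / sxx

def line(x):
    """Made-up 'true' line used to generate noise-free points."""
    return 80 - 0.3 * x

small = [(x, line(x)) for x in (80, 90, 100, 110)]   # 4 points, mean x = 95
large = [(x, line(x)) for x in range(70, 120)]       # 50 points

far_outlier = (140, line(140) + 32)      # far from the mean of x, large residual
central_outlier = (95, line(95) + 32)    # same residual, but at the mean of x

shift_small_far = fit_slope(small + [far_outlier]) - fit_slope(small)
shift_small_central = fit_slope(small + [central_outlier]) - fit_slope(small)
shift_large_far = fit_slope(large + [far_outlier]) - fit_slope(large)

print(f"small data, far outlier:     slope shifts by {shift_small_far:.3f}")
print(f"small data, central outlier: slope shifts by {shift_small_central:.3f}")
print(f"large data, far outlier:     slope shifts by {shift_large_far:.3f}")
```

The central outlier leaves the slope essentially unchanged (it moves only the intercept), and the far outlier moves the slope much less once there are 50 points instead of 4.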

Slide 15: Interpreting Regression

Less influence of outlier

Slide 16: Intercept/Slope

Intercept and Slope

Slide 17: Questions a Regression Can Answer

Questions a regression can answer

Slide 18: Answers - Question 1

Question 1: How does mean HDL vary with waist circumference?

Question 1 Answer

Slide 19: Answers - Question 2

Question 2: Is the association significant?

Question 2 Answer

Slide 20: Answers - Question 3

Question 3: How much of HDL variation is explained by variation in waist circumference?

Question 3 Answer

Slide 21: Answers - Question 3

Question 3
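
The proportion of variation explained (R-squared) is one minus the ratio of residual variation to total variation. Here is a small Python sketch on made-up data (not the HERS sample), with `r_squared` as a hypothetical helper name. In simple linear regression the result equals the squared correlation between the two variables, which the last lines check.

```python
def r_squared(xs, ys):
    """R-squared of the least-squares line: 1 - SS_residual / SS_total."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_resid / ss_total

# Made-up waist (cm) and HDL (mg/dL) values for illustration only
waist = [80, 90, 100, 110]
hdl = [56, 52, 50, 46]

r2 = r_squared(waist, hdl)

# Squared Pearson correlation, computed directly for comparison
n = len(waist)
xb, yb = sum(waist) / n, sum(hdl) / n
sxy = sum((x - xb) * (y - yb) for x, y in zip(waist, hdl))
sxx = sum((x - xb) ** 2 for x in waist)
syy = sum((y - yb) ** 2 for y in hdl)
corr_sq = sxy ** 2 / (sxx * syy)

print(f"R-squared {r2:.4f}, squared correlation {corr_sq:.4f}")
```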

Slide 22: Sample Paragraph

There is an inverse association between waist circumference and HDL (p=0.002), with each one cm increase in waist circumference associated with a 0.196 mg/dL decrease in HDL, 95% CI (-0.32, -0.07). Even though the relationship was significant, waist circumference accounted for only 4% of the observed variance in HDL.
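
The numbers in this paragraph can be re-expressed and cross-checked with a little arithmetic. The Python sketch below uses only the slope and CI limits quoted above. It assumes the interval was formed as slope ± 1.96 × SE (a normal approximation; the original analysis presumably used a t distribution), and the CI limits are rounded, so the recovered standard error and p-value are approximate.

```python
from math import erfc, sqrt

# Values quoted in the sample paragraph
slope = -0.196                   # mg/dL change in mean HDL per 1 cm of waist
ci_low, ci_high = -0.32, -0.07   # reported 95% CI (rounded)

# Rescale to a 10 cm difference: slope and CI limits scale by the same factor
per_10cm = 10 * slope
ci_10cm = (10 * ci_low, 10 * ci_high)

# Approximate standard error recovered from the CI width (assumption: +/- 1.96 * SE)
se_approx = (ci_high - ci_low) / (2 * 1.96)
z = slope / se_approx
p_approx = erfc(abs(z) / sqrt(2))   # two-sided p-value under the normal approximation

print(f"per 10 cm: {per_10cm:.2f} mg/dL, 95% CI ({ci_10cm[0]:.1f}, {ci_10cm[1]:.1f})")
print(f"approx SE {se_approx:.3f}, z {z:.2f}, p about {p_approx:.3f}")
```

The recovered p-value is close to the reported p=0.002, so the slope, interval, and p-value in the paragraph are mutually consistent.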

Slide 23: Good Paragraph

  • Doesn't focus solely on significance
  • Quotes slope and 95% CI
  • Perhaps R-squared
  • Don't bother with sum of squares

Slide 24: Confounding

  • Is the effect of waist on HDL causal? Would a -1 cm change in a person translate to +.45?
  • Maybe waist circumference reflects different populations, with different diet and exercise patterns?
  • How can we isolate the effect of waist? By adjusting for diet and exercise!
  • Solution: multiple linear regression
(See also: Bias or Confounding)

Slide 25: Summary

  • Linear regression is a powerful tool for interpreting associations
  • Extends familiar methods (t-test, ANOVA)
  • Series of "assumptions" (linearity, normality), all of which can be relaxed


Title UCSF - Simple Linear Regression
Contributor/Contact David Glidden, PhD
Institution UCSF
Acknowledgment Please cite the appropriate contributors/authors/contacts when using or adapting these materials.
Format PPT slides
Attachment Slide show above
URL_Web_Link

Type of Course Single Presentation
Level of Course Beginning
Audience Graduate Student, Clinical Researcher
Topics Description

Software Program Stata
Datasets

Data

Keywords Scatterplot, Mean, 95% CI, Linear regression, Least squares, Outlier, Bias, Confounding
See Also

Type of Activity Course Slides
Disclaimer The views expressed within CTSpedia are those of the author and must not be taken to represent policy or guidance on the behalf of any organization or institution with which the author is affiliated.
Topic revision: r7 - 08 Oct 2012 - 15:44:31 - MaryBanach