
2: Linear functions

Linear and affine functions

Affine approximation

Regression model

Source: Introduction to Applied Linear Algebra, Boyd & Vandenberghe

D. Matsypura, A. Wong, QBUS1040 1/25


Function notation

• $f : \mathbf{R}^n \to \mathbf{R}$ means f is a function mapping n-vectors to numbers

• If x is an n-vector, then f(x) is a scalar.

• Note that x is a vector.


This means that f(x) is really $f(x_1, x_2, \ldots, x_n)$

Example
$$f(x) = x_1 + x_2 - x_3^2$$
Given the vector $x = (1, 2, 3)$:

$$f(x) = 1 + 2 - 3^2 = -6$$
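The evaluation above can be reproduced in a few lines of Python. This is only a minimal sketch: the function name `f` mirrors the slide, and the vector is represented as a plain tuple, which is an assumption of this example.

```python
def f(x):
    """f(x) = x1 + x2 - x3^2 for a 3-vector x (the slide's entries are 1-indexed)."""
    x1, x2, x3 = x
    return x1 + x2 - x3 ** 2

# The vector x = (1, 2, 3) from the slide.
value = f((1, 2, 3))
print(value)
```

The function takes one argument, the whole vector, and returns a single number, which is exactly what $f : \mathbf{R}^n \to \mathbf{R}$ describes.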

2: Linear functions 2/25


Superposition and linear functions

• $f : \mathbf{R}^n \to \mathbf{R}$ means f is a function mapping n-vectors to numbers

• f satisfies the superposition property if

f (αx + βy) = αf (x) + βf (y)

for any numbers α, β and n-vectors x, y

• be sure to parse this very carefully!

• a function that satisfies superposition is called linear
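One way to get a feel for the definition is to spot-check it numerically. The sketch below uses hypothetical helper names (`combine`, `passes_superposition`) and random test inputs; passing such a check is evidence, not a proof, that a function is linear, while a single failure does prove it is not.

```python
import random

def combine(alpha, x, beta, y):
    """Return the vector alpha*x + beta*y, computed entry by entry."""
    return [alpha * xi + beta * yi for xi, yi in zip(x, y)]

def passes_superposition(f, n, trials=200, seed=0):
    """Spot-check f(alpha*x + beta*y) == alpha*f(x) + beta*f(y) on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-5, 5) for _ in range(n)]
        y = [rng.uniform(-5, 5) for _ in range(n)]
        alpha = rng.uniform(-5, 5)
        beta = rng.uniform(-5, 5)
        lhs = f(combine(alpha, x, beta, y))
        rhs = alpha * f(x) + beta * f(y)
        # Compare with a tolerance because floating-point sums are not exact.
        if abs(lhs - rhs) > 1e-9 * max(1.0, abs(lhs), abs(rhs)):
            return False
    return True

# A weighted sum of the entries versus a function with a squared entry.
print(passes_superposition(lambda z: 2 * z[0] + z[1], n=2))
print(passes_superposition(lambda z: z[0] + z[1] ** 2, n=2))
```

Note that α and β are drawn freely here: superposition must hold for any pair of scalars, not only for pairs that sum to one.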

2: Linear functions 3/25


Superposition and linear functions

By extension: if f is linear, then

$$f(\alpha_1 x_1 + \cdots + \alpha_k x_k) = \alpha_1 f(x_1) + \cdots + \alpha_k f(x_k)$$

for any n-vectors $x_1, \ldots, x_k$, and any scalars $\alpha_1, \ldots, \alpha_k$.

Why?
Consider the case with 3 terms: f(αx + βy + γz).
The sum βy + γz is itself an n-vector, and any vector equals 1 times itself. Let w = βy + γz and δ = 1. Applying superposition twice:

f(αx + βy + γz) = f(αx + δw) = αf(x) + δf(w) = αf(x) + f(βy + γz) = αf(x) + βf(y) + γf(z)

Hence we can just keep applying the superposition property, one term at a time.

2: Linear functions 4/25


Superposition and linear functions

What does this all mean...?


• f satisfies the superposition property if

f (αx + βy) = αf (x) + βf (y)

Example: Consider the function $f(z) = 2z_1$, which depends only on the first entry of z.


Our function satisfies the superposition property if applying it to the combination αx + βy gives the same result as applying it to x and y separately and then combining the results with the same weights α and β.

Let's say α = 2 and β = 3.

LHS: $f(2x + 3y) = 2(2x_1 + 3y_1) = 4x_1 + 6y_1$

RHS: $2f(x) + 3f(y) = 2(2x_1) + 3(2y_1) = 4x_1 + 6y_1$

LHS = RHS
The same computation goes through for any α and β, so our function is linear: it satisfies the superposition property.

2: Linear functions 5/25


Superposition and linear functions
• f satisfies the superposition property if
f (αx + βy) = αf (x) + βf (y)
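Another example: Consider the function $f(z) = 2z_1 + z_2$.

Let's say α = 2 and β = 3.

$$\begin{aligned}
\text{LHS: } f(2x + 3y) &= f\left(2\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + 3\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}\right) = f\left(\begin{bmatrix} 2x_1 + 3y_1 \\ 2x_2 + 3y_2 \end{bmatrix}\right) \\
&= 2(2x_1 + 3y_1) + (2x_2 + 3y_2) \\
&= 4x_1 + 6y_1 + 2x_2 + 3y_2
\end{aligned}$$

$$\begin{aligned}
\text{RHS: } 2f(x) + 3f(y) &= 2f\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) + 3f\left(\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}\right) \\
&= 2(2x_1 + x_2) + 3(2y_1 + y_2) \\
&= 4x_1 + 6y_1 + 2x_2 + 3y_2
\end{aligned}$$

LHS = RHS
Therefore our function is linear because it satisfies the superposition property.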
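The same comparison can be run on concrete numbers. In this sketch α = 2 and β = 3 come from the slide, while the vectors `x` and `y` are hypothetical values chosen only for illustration.

```python
def f(z):
    """f(z) = 2*z1 + z2 for a 2-vector z."""
    z1, z2 = z
    return 2 * z1 + z2

alpha, beta = 2, 3
x = (1, 4)    # hypothetical test vector
y = (-2, 5)   # hypothetical test vector

# Left-hand side: apply f to the combined vector alpha*x + beta*y.
combo = tuple(alpha * xi + beta * yi for xi, yi in zip(x, y))
lhs = f(combo)

# Right-hand side: apply f to x and y separately, then combine the results.
rhs = alpha * f(x) + beta * f(y)

print(lhs, rhs)
```

Both sides come out equal for these inputs, matching the symbolic result.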
2: Linear functions 6/25
Superposition and linear functions
• f satisfies the superposition property if
f (αx + βy) = αf (x) + βf (y)
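Another example: Consider the function $f(z) = z_1 + z_2^2$. Does it satisfy the superposition property?

$$\begin{aligned}
\text{LHS: } f(\alpha x + \beta y) &= f\left(\alpha\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \beta\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}\right) = f\left(\begin{bmatrix} \alpha x_1 + \beta y_1 \\ \alpha x_2 + \beta y_2 \end{bmatrix}\right) \\
&= (\alpha x_1 + \beta y_1) + (\alpha x_2 + \beta y_2)^2
\end{aligned}$$

$$\begin{aligned}
\text{RHS: } \alpha f(x) + \beta f(y) &= \alpha f\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) + \beta f\left(\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}\right) \\
&= \alpha(x_1 + x_2^2) + \beta(y_1 + y_2^2) \\
&= \alpha x_1 + \beta y_1 + \alpha x_2^2 + \beta y_2^2
\end{aligned}$$

LHS ≠ RHS in general, because $(\alpha x_2 + \beta y_2)^2$ is not the same as $\alpha x_2^2 + \beta y_2^2$.
Therefore our function is not linear because it does not satisfy the superposition property.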
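A single numerical counterexample is enough to show that superposition fails. The values of `alpha`, `beta`, `x` and `y` below are hypothetical; any choice where the squared term matters would do.

```python
def f(z):
    """f(z) = z1 + z2^2 for a 2-vector z."""
    z1, z2 = z
    return z1 + z2 ** 2

alpha, beta = 2, 3   # hypothetical scalars
x = (1, 2)           # hypothetical test vector
y = (3, 4)           # hypothetical test vector

combo = tuple(alpha * xi + beta * yi for xi, yi in zip(x, y))
lhs = f(combo)                      # f(alpha*x + beta*y)
rhs = alpha * f(x) + beta * f(y)    # alpha*f(x) + beta*f(y)

# The gap comes entirely from the squared entry:
# (alpha*x2 + beta*y2)^2 versus alpha*x2^2 + beta*y2^2.
gap = lhs - rhs
print(lhs, rhs, gap)
```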
2: Linear functions 7/25
The inner product function

• with a an n-vector, the function

$$f(x) = a^T x = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n$$

is the inner product function


• f(x) is a weighted sum of the entries of x
• the inner product function is linear:

$$\begin{aligned}
f(\alpha x + \beta y) &= a^T(\alpha x + \beta y) \\
&= a^T(\alpha x) + a^T(\beta y) \\
&= \alpha(a^T x) + \beta(a^T y) \\
&= \alpha f(x) + \beta f(y)
\end{aligned}$$
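The inner product function is short to write down in code. The sketch below uses a hypothetical helper `inner` and hypothetical vectors and scalars, and checks the chain of equalities above on one set of integers.

```python
def inner(a, x):
    """Inner product a^T x = a1*x1 + ... + an*xn."""
    return sum(ai * xi for ai, xi in zip(a, x))

a = [1, -2, 3]        # hypothetical weight vector
x = [1, 0, 2]         # hypothetical input
y = [4, 1, -1]        # hypothetical input
alpha, beta = 2, -3   # hypothetical scalars

combo = [alpha * xi + beta * yi for xi, yi in zip(x, y)]
lhs = inner(a, combo)                          # a^T (alpha*x + beta*y)
rhs = alpha * inner(a, x) + beta * inner(a, y)  # alpha*(a^T x) + beta*(a^T y)
print(lhs, rhs)
```

Integer inputs are used so that both sides agree exactly, without floating-point rounding.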

2: Linear functions 8/25


The inner product function

Writing out the proof explicitly


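We want to show that $f(\alpha x + \beta y) = \alpha f(x) + \beta f(y)$ for $f(x) = a^T x$.

$$\begin{aligned}
\text{LHS: } f(\alpha x + \beta y) &= a^T(\alpha x + \beta y) = a^T\left(\alpha\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + \beta\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}\right) \\
&= a^T\left(\begin{bmatrix} \alpha x_1 \\ \alpha x_2 \\ \vdots \\ \alpha x_n \end{bmatrix} + \begin{bmatrix} \beta y_1 \\ \beta y_2 \\ \vdots \\ \beta y_n \end{bmatrix}\right) = a^T\begin{bmatrix} \alpha x_1 + \beta y_1 \\ \alpha x_2 + \beta y_2 \\ \vdots \\ \alpha x_n + \beta y_n \end{bmatrix} \\
&= a_1(\alpha x_1 + \beta y_1) + a_2(\alpha x_2 + \beta y_2) + \cdots + a_n(\alpha x_n + \beta y_n) \\
&= \alpha(a_1 x_1 + a_2 x_2 + \cdots + a_n x_n) + \beta(a_1 y_1 + a_2 y_2 + \cdots + a_n y_n) \\
&= \alpha(a^T x) + \beta(a^T y) = \alpha f(x) + \beta f(y) = \text{RHS}
\end{aligned}$$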


2: Linear functions 9/25
. . . and all linear functions are inner products

• suppose $f : \mathbf{R}^n \to \mathbf{R}$ is linear

• then it can be expressed as $f(x) = a^T x$ for some a

• specifically: $a_i = f(e_i)$

• follows from

$$\begin{aligned}
f(x) &= f(x_1 e_1 + x_2 e_2 + \cdots + x_n e_n) \\
&= x_1 f(e_1) + x_2 f(e_2) + \cdots + x_n f(e_n) \\
&= x_1 a_1 + x_2 a_2 + \cdots + x_n a_n
\end{aligned}$$

Note: the representation of a linear function f as f (x) = aT x is unique.
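The recipe $a_i = f(e_i)$ can be tried on a function treated as a black box. In this sketch the linear function `f` and the helper names `unit_vector` and `coefficients` are hypothetical; $e_i$ is taken to be the vector with a 1 in position i and 0 elsewhere.

```python
def unit_vector(i, n):
    """Return e_i: the n-vector with 1 in position i (0-indexed) and 0 elsewhere."""
    return [1 if j == i else 0 for j in range(n)]

def coefficients(f, n):
    """Recover a from a linear function f by evaluating it at each unit vector."""
    return [f(unit_vector(i, n)) for i in range(n)]

def f(x):
    """A hypothetical linear function on 3-vectors, treated as a black box."""
    return 2 * x[0] - x[1] + 4 * x[2]

a = coefficients(f, 3)
print(a)

# With a in hand, f(x) should agree with the inner product a^T x.
x = [3, 5, -2]
print(f(x), sum(ai * xi for ai, xi in zip(a, x)))
```

Only n evaluations of f are needed to pin down a, which is one way to see why the representation is unique.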

2: Linear functions 10/25


Examples in R3

• $f(x) = \frac{1}{3}(x_1 + x_2 + x_3)$ is linear: $f(x) = a^T x$ with $a = (\frac{1}{3}, \frac{1}{3}, \frac{1}{3})$

• $f(x) = -x_1$ is linear: $f(x) = a^T x$ with $a = (-1, 0, 0)$

• $f(x) = \max\{x_1, x_2, x_3\}$ is not linear.


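Example where superposition does not hold:

$$x = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad y = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \quad \alpha = -1, \quad \beta = 1$$

$$\text{LHS: } f(\alpha x + \beta y) = f\left((-1)\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + (1)\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}\right) = f\left(\begin{bmatrix} -1 \\ 0 \\ 0 \end{bmatrix}\right) = 0$$

$$\text{RHS: } \alpha f(x) + \beta f(y) = (-1) f\left(\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\right) + (1) f\left(\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}\right) = -1$$

LHS ≠ RHS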
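The counterexample for the max function can be checked directly, using the same x, y, α and β as on the slide. Python's built-in `max` plays the role of f; representing vectors as lists is an assumption of this sketch.

```python
f = max  # f(x) = max{x1, x2, x3}

x = [1, 0, 0]
y = [0, 0, 0]
alpha, beta = -1, 1

combo = [alpha * xi + beta * yi for xi, yi in zip(x, y)]
lhs = f(combo)                      # largest entry of alpha*x + beta*y
rhs = alpha * f(x) + beta * f(y)    # weighted combination of the two maxima
print(lhs, rhs)
```

The negative scalar α is what breaks the property: scaling by −1 turns the largest entry into the smallest.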
2: Linear functions 11/25
Affine functions

• a function that is linear plus a constant is called affine

• general form is f (x) = aT x + b, with a an n-vector and b a scalar

• a function $f : \mathbf{R}^n \to \mathbf{R}$ is affine if and only if

f (αx + βy) = αf (x) + βf (y)

holds for all α, β with α + β = 1, and all n-vectors x, y

• sometimes (ignorant) people refer to affine functions as linear

2: Linear functions 12/25


Affine functions

Why do we have the condition α + β = 1 ?

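Recall from slide 9 for linear functions of the form $f(x) = a^T x$:

$$\begin{aligned}
\text{LHS: } f(\alpha x + \beta y) &= a^T(\alpha x + \beta y) = a^T\left(\alpha\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + \beta\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}\right) \\
&= a^T\left(\begin{bmatrix} \alpha x_1 \\ \alpha x_2 \\ \vdots \\ \alpha x_n \end{bmatrix} + \begin{bmatrix} \beta y_1 \\ \beta y_2 \\ \vdots \\ \beta y_n \end{bmatrix}\right) = a^T\begin{bmatrix} \alpha x_1 + \beta y_1 \\ \alpha x_2 + \beta y_2 \\ \vdots \\ \alpha x_n + \beta y_n \end{bmatrix} \\
&= a_1(\alpha x_1 + \beta y_1) + a_2(\alpha x_2 + \beta y_2) + \cdots + a_n(\alpha x_n + \beta y_n) \\
&= \alpha(a_1 x_1 + a_2 x_2 + \cdots + a_n x_n) + \beta(a_1 y_1 + a_2 y_2 + \cdots + a_n y_n) \\
&= \alpha(a^T x) + \beta(a^T y) = \alpha f(x) + \beta f(y) = \text{RHS}
\end{aligned}$$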

2: Linear functions 13/25


Affine functions
Why do we have the condition α + β = 1 ?

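Now consider affine functions of the form $f(x) = a^T x + b$:

$$\begin{aligned}
\text{LHS: } f(\alpha x + \beta y) &= a^T(\alpha x + \beta y) + b = a^T\left(\alpha\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + \beta\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}\right) + b \\
&= a^T\left(\begin{bmatrix} \alpha x_1 \\ \alpha x_2 \\ \vdots \\ \alpha x_n \end{bmatrix} + \begin{bmatrix} \beta y_1 \\ \beta y_2 \\ \vdots \\ \beta y_n \end{bmatrix}\right) + b = a^T\begin{bmatrix} \alpha x_1 + \beta y_1 \\ \alpha x_2 + \beta y_2 \\ \vdots \\ \alpha x_n + \beta y_n \end{bmatrix} + b \\
&= a_1(\alpha x_1 + \beta y_1) + a_2(\alpha x_2 + \beta y_2) + \cdots + a_n(\alpha x_n + \beta y_n) + b \\
&= \alpha(a_1 x_1 + a_2 x_2 + \cdots + a_n x_n) + \beta(a_1 y_1 + a_2 y_2 + \cdots + a_n y_n) + b \\
&= \alpha(a^T x) + \beta(a^T y) + b
\end{aligned}$$

$$\begin{aligned}
\text{RHS: } \alpha f(x) + \beta f(y) &= \alpha(a^T x + b) + \beta(a^T y + b) \\
&= \alpha(a^T x) + \alpha b + \beta(a^T y) + \beta b \\
&= \alpha(a^T x) + \beta(a^T y) + (\alpha + \beta) b
\end{aligned}$$

The two sides differ only in the constant term: b on the left, (α + β)b on the right. They agree for every b exactly when α + β = 1.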
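The role of the condition α + β = 1 shows up clearly on numbers. In the sketch below the affine function (a = (2, 1), b = 5), the test vectors, and the helper name `gap` are all hypothetical.

```python
A = [2, 1]   # hypothetical weight vector a
B = 5        # hypothetical offset b

def f(x):
    """Affine function f(x) = a^T x + b."""
    return sum(ai * xi for ai, xi in zip(A, x)) + B

def gap(alpha, beta, x, y):
    """Return LHS - RHS, i.e. f(alpha*x + beta*y) - (alpha*f(x) + beta*f(y))."""
    combo = [alpha * xi + beta * yi for xi, yi in zip(x, y)]
    return f(combo) - (alpha * f(x) + beta * f(y))

x = [1, 2]    # hypothetical test vector
y = [3, -1]   # hypothetical test vector

print(gap(3, -2, x, y))   # alpha + beta = 1
print(gap(2, 3, x, y))    # alpha + beta = 5
```

In the second call the mismatch equals (1 − (α + β))·b, which is the leftover constant term identified in the derivation.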
2: Linear functions 14/25
Affine functions and inner products

for fixed $a \in \mathbf{R}^n$, $b \in \mathbf{R}$, define a function $f : \mathbf{R}^n \to \mathbf{R}$ by

$$f(x) = a^T x + b = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n + b$$

i.e., an inner-product function plus a constant (offset)

• any function of this type is affine: if α + β = 1 then

$$a^T(\alpha x + \beta y) + b = a^T(\alpha x + \beta y) + (\alpha + \beta) b = \alpha(a^T x + b) + \beta(a^T y + b)$$

• every affine function can be written as $f(x) = a^T x + b$ with:

$$a = \big(f(e_1) - f(0),\ f(e_2) - f(0),\ \ldots,\ f(e_n) - f(0)\big)$$

$$b = f(0)$$
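These two formulas give a direct way to read off a and b from an affine function that is only available as a black box. The function `f` and the helper names in this sketch are hypothetical.

```python
def unit_vector(i, n):
    """Return e_i: the n-vector with 1 in position i (0-indexed) and 0 elsewhere."""
    return [1 if j == i else 0 for j in range(n)]

def affine_coefficients(f, n):
    """Recover (a, b) from an affine f using b = f(0) and a_i = f(e_i) - f(0)."""
    b = f([0] * n)
    a = [f(unit_vector(i, n)) - b for i in range(n)]
    return a, b

def f(x):
    """A hypothetical affine function on 2-vectors, treated as a black box."""
    return 3 * x[0] - 2 * x[1] + 7

a, b = affine_coefficients(f, 2)
print(a, b)

# Check the recovered form a^T x + b against f on another point.
x = [4, 5]
print(f(x), sum(ai * xi for ai, xi in zip(a, x)) + b)
```

Subtracting f(0) is what removes the offset, so that each difference isolates one entry of a.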

2: Linear functions 15/25


By extension: if f is affine, then

$$f(\alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_m u_m) = \alpha_1 f(u_1) + \alpha_2 f(u_2) + \cdots + \alpha_m f(u_m)$$

for all vectors $u_1, \ldots, u_m$ and all scalars $\alpha_1, \ldots, \alpha_m$ with

$$\alpha_1 + \alpha_2 + \cdots + \alpha_m = 1$$

2: Linear functions 16/25


Outline

Linear and affine functions

Affine approximation

Regression model

2: Linear functions 17/25


Problem statement
• You have a function f (x).
• You also know the value of the function at z, i.e. you know f (z).
• You want to know the value of f (x′ ).
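• Calculate the derivative $\frac{\partial f}{\partial x'}(z)$.
• Estimate $\hat f(x')$:

$$\hat f(x') = f(z) + \frac{\partial f}{\partial x'}(z)\,(x' - z)$$

[Figure: plot of f(x) against x, marking the points z and x′, the value f(z), the horizontal step (x′ − z), the rise h, and the estimate $\hat f(x')$. The annotations read $\frac{\partial f}{\partial x'}(z) = \frac{h}{x' - z}$ and $h = \frac{\partial f}{\partial x'}(z)\,(x' - z)$.]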
2: Linear functions 18/25
Affine approximation

• suppose $f : \mathbf{R}^n \to \mathbf{R}$

• first-order Taylor approximation of f, near point z:

$$\hat f(x) = f(z) + \frac{\partial f}{\partial x_1}(z)(x_1 - z_1) + \cdots + \frac{\partial f}{\partial x_n}(z)(x_n - z_n)$$

• $\hat f(x)$ is very close to f(x) when $x_i$ are all near $z_i$

• $\hat f$ is an affine function of x

• can write using inner product as

$$\hat f(x) = f(z) + \nabla f(z)^T (x - z)$$

where n-vector $\nabla f(z)$ is the gradient of f at z,

$$\nabla f(z) = \left(\frac{\partial f}{\partial x_1}(z), \ldots, \frac{\partial f}{\partial x_n}(z)\right)$$
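The approximation can be sketched in code when the partial derivatives are not available in closed form. The version below estimates the gradient with central finite differences, which is a swapped-in numerical technique and not something the slides prescribe; the test function `f`, the point `z`, the step size `h`, and the helper names are all hypothetical.

```python
def numerical_gradient(f, z, h=1e-6):
    """Estimate the gradient of f at z with central finite differences."""
    grad = []
    for i in range(len(z)):
        up = list(z)
        down = list(z)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

def taylor_approximation(f, z):
    """Return f_hat(x) = f(z) + grad_f(z)^T (x - z), an affine function of x."""
    fz = f(z)
    g = numerical_gradient(f, z)

    def f_hat(x):
        return fz + sum(gi * (xi - zi) for gi, xi, zi in zip(g, x, z))

    return f_hat

def f(x):
    """A hypothetical nonlinear function of a 2-vector."""
    return x[0] ** 2 + x[0] * x[1]

z = [1.0, 2.0]
f_hat = taylor_approximation(f, z)

# Compare f and f_hat at z, near z, and farther from z.
for x in ([1.0, 2.0], [1.1, 2.1], [2.0, 3.0]):
    print(x, f(x), f_hat(x))
```

For this f the exact gradient at z is (4, 1), so the approximation is 3 + 4(x1 − 1) + (x2 − 2); it matches f closely near z and drifts away as x moves farther out.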

2: Linear functions 19/25


Example with two variables

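$$f(x_1, x_2) = x_1 - 3x_2^2$$

Gradient

$$\nabla f(x) = \begin{bmatrix} 1 \\ -6x_2 \end{bmatrix}$$

First order Taylor approximation around z = 0

$$\begin{aligned}
\hat f(x) &= f(0) + \nabla f(0)^T (x - 0) \\
&= 0 + (1)x_1 + (0)x_2 = x_1
\end{aligned}$$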
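To see how good this approximation is, f and its Taylor approximation around z = 0 can be compared at a few points. The sample points below are hypothetical; the two functions are the ones from the slide.

```python
def f(x):
    """f(x1, x2) = x1 - 3*x2^2."""
    x1, x2 = x
    return x1 - 3 * x2 ** 2

def f_hat(x):
    """Taylor approximation around z = 0: f(0) + (1)*x1 + (0)*x2."""
    x1, x2 = x
    return 0 + 1 * x1 + 0 * x2

# The error f(x) - f_hat(x) is -3*x2^2, so it grows quickly with |x2|.
for x in [(0.0, 0.0), (0.1, 0.1), (0.5, 0.5), (1.0, 1.0)]:
    error = f(x) - f_hat(x)
    print(x, f(x), f_hat(x), error)
```

The approximation is exact at z = 0 and degrades only through the $x_2$ direction, since f is already affine in $x_1$.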

2: Linear functions 20/25


Outline

Linear and affine functions

Affine approximation

Regression model

2: Linear functions 21/25


Regression model

• regression model is (the affine function of x)

$$\hat y = x^T \beta + v = \beta_1 x_1 + \cdots + \beta_n x_n + v$$

• x is a feature vector

• elements $x_i$ are called regressors, independent variables or inputs

• $\beta = (\beta_1, \ldots, \beta_n)$ is the vector of weights or coefficients

• scalar v is the offset or intercept

• coefficients $\beta_1, \ldots, \beta_n, v$ are the parameters of the regression model

• scalar $\hat y$ is the prediction (of some actual outcome or dependent variable, denoted y)

2: Linear functions 22/25


Example

• y is selling price of a house in $1000 (in some location, over some period)

• regressor is

x = (house area in 1000 sq.ft., # bedrooms)

• regression model weight vector and offset are

β = (148.73, −18.85), v = 54.40

• we'll see later how to guess β and v from sales data (you should remember this from BUSS1020)

2: Linear functions 23/25


Example

ŷ = 148.73x1 − 18.85x2 + 54.40

• ŷ is predicted selling price in thousands of dollars


• $x_1$ is area (1000 square feet); $x_2$ is number of bedrooms

House   x1 (area)   x2 (beds)   y (price)   ŷ (prediction)
1       0.846       1           115.00      161.37
2       1.324       2           234.50      213.61
3       1.150       3           198.00      168.88
4       3.037       4           528.00      430.67
5       3.984       5           572.50      552.66
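The predictions in the table can be recomputed from the regression model. This is a minimal sketch with a hypothetical helper `predict`; the recomputed values may differ from the table in the second decimal place, presumably because the published coefficients are rounded (an assumption, not something the slides state).

```python
beta = [148.73, -18.85]   # weights for (area in 1000 sq.ft., bedrooms)
v = 54.40                 # offset

def predict(x, beta, v):
    """Regression model y_hat = x^T beta + v."""
    return sum(bi * xi for bi, xi in zip(beta, x)) + v

# (area, beds, actual price y, prediction printed on the slide), prices in $1000
houses = [
    (0.846, 1, 115.00, 161.37),
    (1.324, 2, 234.50, 213.61),
    (1.150, 3, 198.00, 168.88),
    (3.037, 4, 528.00, 430.67),
    (3.984, 5, 572.50, 552.66),
]

for area, beds, y, y_hat_slide in houses:
    y_hat = predict([area, beds], beta, v)
    # Prediction error: actual price minus predicted price.
    print(f"{area:6.3f} {beds:2d} {y:8.2f} {y_hat:8.2f} {y - y_hat:8.2f}")
```

The last printed column is the prediction error y − ŷ, which shows, for example, that the model overestimates the price of house 1 and underestimates that of house 4.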

2: Linear functions 24/25


Example
[Figure: scatter plot of sale prices y for 774 houses in Sacramento.]
2: Linear functions 25/25
