Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
Optimization Part 1 Basics and Least Squares
Published:
In common engineering problems, the optimization problem is to minimize the residual of a parametric model with respect to some observed data points.
Preliminaries
Given measurements where and . We define a prediction model parametrized by as and the residual of a single datapoint as , where . The full residual is simply a stack of every single residuals, i.e.,
Note that we require 𝑚≥𝑛 because otherwise the prediction model cannot be fully defined, i.e., there are infinitely many 𝜃 that perfectly fit the data.
The goal of an optimization problem is to find a model that best fits the data. To formalize this, we need to define some function that properly maps the residual vector to a minimizable scalar, which is known as the objective function. In general, the function is assumed to be continuously differentiable. Some of the common objective functions are
- Least Squares
- Least Squares with regularization
- Least Squares with regularization
- Weighted Least Squares
- Cauchy Loss
The following is a list of some of the theorems related to optimality conditions. Suppose is differentiable, and define the gradient (first-order derivative) and Hessian (second-order derivative) as
- Descent direction: If there is a vector such that , then for all sufficiently small , . Here, is a descent direction of at .
Intuitively, the dot product less than 0 means that vector is in the opposite direction as the gradient.
- First-order necessary condition: If is a local minimum, then .
- Second-order necessary condition: If is a local minimum, then and (positive semi-definite).
- Second-order sufficient condition: If and (positive definite), then is a local minimum.
- Convex function: is convex if and only if for all .
- Global minimum: Suppose is convex, then is a global minimum if an only if .
Linear Least Squares
Let be a matrix and be a vector, where and . The Linear Least Squares (LLS) optimization problem is to find the optimal solution to the linear system of equations . Define the objective function as
and the goal is to find .
First, we can write out the gradient and Hessian matrix of , which are
Derivation
Simply expand the objective function,
Note that the last step is true because and .
Further notice that because of , the Hessian matrix is positive definite (and thus invertible), i.e., . Then is the global minimum if and only if ,
Although the above equation gives a closed-form formula for finding the global minimum , computing the inverse of is not practical. Instead, we solve a linear system of equations
which is known as the normal equations.
Therefore, solving an LLS problem is basically solving a linear system of equations. Typical methods include Cholesky decomposition and QR factorization, rather than directly inverting .
Cholesky Solver
The Cholesky decomposition of a positive-definite matrix is a decomposition into the product of an upper triangular matrix and a lower triangular matrix, i.e., , where
The elements of can be obtained as follows
For a linear system of equations, , apply Cholesky decomposition and get . Then, we first let , and solve for using forward substitution (). Specifically,
Note that can be easily found as , and substituting into the second equation solves . In general,
Finally, solve for using backward substitution (). Specifically,
Now, can be easily found as , and substituting into the second to last equation solves . In general,
In practice, we can use numpy.linalg.cholesky or similar packages to perform Cholesky decomposition. The time complexity is .
QR Solver
The QR factorization of a matrix is a decomposition into an orthogonal matrix and an upper triangular matrix, i.e., , where
where the element of can be found via Gram-Schmidt diagonalization.
For a linear system of equations, , apply QR factorization and get . Multiply both sides by and get . Let , we can solve for by backward substitution, i.e.,
Non-linear Least Squares
In general, let be a smooth residual function. We can define an objective function as
The goal is still to find the minimizer of the function , i.e., . Unfortunately, there’s no closed-form formula for finding as we did for the LLS problem. Common methods usually start from an initial guess, and iteratively approximate the optimum based on “descent direction” mentioned before.
Specifically, if we start with an initial guess , linearize around , and find a descending direction . Then, we update the guess by moving along direction , and we can expect the objective will become smaller, i.e., and . Hopefully, this process will converge to a local minimum. Common methods include Gauss-Newton, Levenberg-Marquardt, or gradient descent, distinguished by how they compute .
Gauss-Newton Method
The steps of the Gauss-Newton method are as follows:
- Start from an initial guess , iterate until convergence
- Linearize the residual at the current guess using first-order Taylor expansion,
where
is the Jacobian matrix. Let .
- Solve the resulting linear least squares to find the step by solving
As discussed before, this is a linear system of equations, and should be solved by using Cholesky or QR decomposition rather than inverting .
- Update the current guess, .
Implementation example:
def gauss_newton(x0, max_iter, tol):
x = x0.copy()
for _ in range(max_iter):
r = residual(x)
J = jacobian(x)
d = np.linalg.solve(J.T @ J, -J.T @ r) # Solve (J^T J) d = -J^T r
x = x + d
if np.linalg.norm(d) < tol:
break
return xGradient Descent
The steps of the gradient descent method are as follows:
- Start from an initial guess , iterate until convergence
- Compute the gradient of the objective function
Derivation
Note that
and the -th entry of is
Then, stack all the entries of and we can get the result .
- Let the step be the negative of the gradient, scaled by a factor called the learning rate , i.e., .
- Update the current guess, .
Implementation example:
def gradient_descent(x0, max_iter, tol, eta):
x = x0.copy()
for _ in range(max_iter):
r = residual(x)
J = jacobian(x)
d = -eta * J.T @ r # Directly set d to be along the opposite gradient
x = x + d
if np.linalg.norm(d) < tol:
break
return x
Levenberg-Marquardt Method
The step of the Levenberg-Marquardt method is as follows:
- Start from an initial guess , iterate until convergence
- Linearize the residual at the current guess .
- Solve a regularized linear least squares problem by .
Derivation
Denote , the new objective function is
Take the gradient of , and let
- Update the current guess, .
Implementation example:
def levenberg_marquardt(x0, max_iter, tol, lam):
x = x0.copy()
for _ in range(max_iter):
r = residual(x)
J = jacobian(x)
A = J.T @ J + lam * np.eye(len(x))
d = np.linalg.solve(A, -J.T @ r) # Solve (J^T J + λI) d = -J^T r
x = x + d
if np.linalg.norm(d) < tol:
break
return xNote that the Levenberg-Marquardt method is an interpolation between Gauss-Newton and gradient descent.
- As , it reduces to Gauss-Newton: .
- As , it reduces to gradient descent: .
Example
Now we compare these three methods using an example. Consider the residual and objective as
Occupancy Grid Mapping
Published:
Besides state estimation or localization, which provide a robot with knowledge of where it is, it’s equally important for a mobile robot to perceive its surrounding and form a knowledge of where is occupied, that’s where mapping comes in. This note covers a fundamental mapping algorithm, occupancy grid mapping.
Problem Statement
In occupancy grid mapping, a map is discretized into different cells, where each cell is a binary random variable that models the occupancy, and is the probability of a cell being occupied. The map is assumed to be static. The goal is to find the belief , where is the observation and is the state of the robot.
Two assumptions are made for tractability
- The robot state are known.
- The occupancy of individual cells are independent, and thus
At each timestep , the sensor performs some measurement and provide the information of whether a cell is believed to be occupied or not, in terms of , which is known as the inverse sensor model.
One simple inverse sensor model is shown in the figure above. Consider a laser beam range sensor that provides the reading , where and are the range and bearing of the -th beam. For grid , let and be the range and bearing of the grid as observed from the current robot pose , and be the beam going through that gird. Based on the relation of a grid’s position relative to the robot, we have three different cases:
- If or , then return , the sensor has no information about this grid.
- Else if and , then return , the sensor think the grid is occupied.
- Else if and , then return , the sensor think the grid is free.
where is a sensor-specific parameter representing the maximum detectable range, is the grid size and is the width of each beam.
Mapping Algorithm
Following the Bayes filtering framework, a general algorithm can be derived for the occupancy grid mapping [2]. Starting from equation 1,
Further notice that the map is independent of the robot state , we obtain
Similarly, for the negation
Compute the ratio of equation (2) and (3), we get
Then, change to log odds form by defining
and the product chain turns into a sum
Bayes Inference using Conjugate Prior
The previous section derived an equation for updating the belief of the map under the Bayes framework, however, it lacks the notion of uncertainty (i.e., how confident we are about the map). With the notion of conjugate prior, a proper selection of the underlying distribution function for can both simplifies the updating rule and provide a closed form formula for uncertainty.
Recall that from Bayes rule
Bernoulli Likelihood and Beta Prior
First consider the case where the occupancy of a grid is binary, i.e., . Let be the unknown occupancy probability. From the inverse sensor model described above, we know that each sensor measurement provide a binary result, denoted as , where
Then, the likelihood is modeled as a Bernoulli with parameter , i.e.,
Furthermore, if we model the prior as a Beta distribution, i.e.,
where is the beta function.
Then, it follows from the Bayes rule that the posterior is
which still follows a Beta distribution, , where
In this case, we call beta distribution the conjugate prior for the Bernoulli likelihood.
Conjugate Distribution and Conjugate Prior [2]
- If the posterior distributions are in the same probability distribution family as the prior probability distribution, the prior and posterior are then called conjugate distributions.
- The prior is called a conjugate prior for the likelihood function.
Note that the choice of modeling the prior as a Beta distribution here is an algebraic convenience, which provides a closed-form expression for the posterior. Then the mean and covariance can be easily obtained
Lie Theory in Robot Motion
Published:
This note covers some of the fundamental concepts in Lie group and Lie algebra, and their applications to representing rigid body motion in robotics.
Matrix Lie Groups
A group is a set equipped with a binary map satisfying the following properties:
- Closure: .
- Associativity: .
- Identity: .
- Inverse: .
Here, it’s better to understand the element of a group as a kind of operation or transformation (e.g., a rotation or a rigid body transformation) rather than a number or a vector. These groups turn out to be also differentiable manifolds, which makes them Lie groups.
Special Orthogonal Group,
To represent a rotation formally, we need to consider the properties that define a rotation. In general, a rotation should be
- Linear, so it can be represented as a matrix .
- Preserves length and angles, so for any vector and , .
Note, from this property,
which means the matrix has to satisfy the following
- Preserves orientation (no mirroring), so .
Given the previous property
we have
In fact, these properties give rise to a group called the special orthogonal group,
which is a group of all rotations about the origin in an dimensional Euclidean space.
Special Euclidean Group,
The general way to represent the motion of a rigid body is a transformation that
- Preserves the Euclidean distance, .
- Preserves the orientation.
All such transformation forms a group called the special Euclidean group, .
Intuitively, such transformation can be represented by a function , where
However, the function is not linear and thus cannot be represented by a matrix, which causes inconvenience. The general trick to solve this problem is to augment the dimension by 1. For , define and such that
so the transformation becomes matrix vector multiplication.
The special Euclidean group now can be written as
which represents all poses an object can take in an dimensional space.
Lie Algebra
A Lie algebra is a vector space together with a binary operation called the Lie bracket satisfying the following properties
- Bilinearity: , .
- Alternating: .
- Jacobi identity: .
Importantly, every Lie algebra is associated to a Lie group , which is the tangent space of the identity of the Lie group, denoted as , and it completely captures the local structure of the group. In the case of robot motion, we can think of a trajectory that lives in the Lie group while the velocity lives in the Lie algebra.
Exponential Map
Let be a Lie group and be its Lie algebra, first we need to define a few notations:
- A curve in is defined as a function that maps each real number to an element of the group .
The derivative of , denoted as , evaluated at a specific time , , is a vector in the tangent space. Suppose , then .
- Left translation: For a specific element , define the map as the left translation. Example: if , and , then .
- Left-invariant vector field: For a vector , the left-invariant vector field is defined at any point by .
Basically, start from a velocity vector , which has to stars at the origin (identity). Then, you map this velocity vector to another vector in the tangent space to a group element , i.e., .
Now, suppose you have a curve that starts at the identity with velocity , then, for any , we have
The above equation can be thought of as a “differential equation” whose solution is
Let , the exponential map can be defined as
Bayes Filtering and State Estimation
Published:
In robot state estimation, the Bayes filter is a probabilistic approach that estimates the state from a sequence of controls and measurements by recursively performing prediction with a motion model and correction with a measurement model. In practice, some common instantiations of the Bayes framework are the Kalman filter, information filter, and particle filter.
Mathematical Formulation of Bayes Filter
Given a sequence of measurements and controls , and a measurement model and a motion model . The Bayes filter provides a recursive framework for estimating the state in terms of a probability distribution function, known as the belief , or posterior.

The filtering algorithm is built upon two assumptions of conditional independence:
- Motion model is (1st-order) Markov:
- The measurement model is memoryless:
Given these two assumptions, we can propagate the belief over time:
We can define the predicted belief of the state at time to be the probability distribution before knowing the current measurement, i.e., . Then, the above equation becomes
where is the normalization coefficient, and
which is essentially a correction applied to the belief after observing the current measurement.
What remains to be done, obviously, is to predict the belief from the previous state:
Then, recognizing , we have
In summary, the Bayes filter provides a two-step framework for updating the belief of the current state from the belief of the previous state :
- Prediction using the motion model (Equation 1);
- Correction using the measurement model (Equation 2).
These two steps can be performed recursively to estimate the state over time.
Kalman Filter
The Kalman filter is an instantiation of the Bayes filter with the assumption of linear Gaussian models. This means the motion model and the measurement model should be
- linear with respect to their arguments:
Specifically, suppose , , and . Let , , and be matrices (linear maps).
- Motion model:
- Measurement model:
where and are white noise with variance and , respectively.
White noise
A discrete-time random process (a sequence of random vectors), , is called white noise if
- following a multivariate Gaussian distribution:
Specifically,
- Motion model: ;
- Measurement model: .
Note: here we need to interpret as a realization rather than a random variable.
Multivariate Gaussian PDF
is the probability distribution function of a random variable that follows a Gaussian distribution with mean and covariance matrix ,
Before deriving the algorithm for the Kalman filter, there are two important properties of the multivariate Gaussian distribution that are useful here.
- Affine transformation:
Let be a random vector following a Gaussian distribution , then the random vector follows a Gaussian distribution .
Proof
Will come later
- Conditional distribution:
Let be a random vector of dimension . Partition into a vector of dimension , and a vector of dimension , , and partition and respectively,
then, the conditional distribution , where
Proof
Will come later
Now, if we assume the initial belief to be Gaussian, i.e., . Then, from equations (1) and (2), we can see that all subsequent beliefs will be Gaussian. Thus,
Then, we only need to track in the prediction step and in the correction step.
For the prediction step, use the joint distribution and realize
Derivation [2]
From equation (3),
Compute the mean and covariance,
Then can be obtained by marginalize ,
For the correction step, form a joint distribution and realize
Derivation [2]
Since and , where . From property 1, we have
Furthermore,
Thus, , where
Finally, normalize for numerical stability
Extended Kalman Filter (EKF)
One way to handle the case where motion model and measurement model are not linear is to approximate the functions via local linearization (Taylor expansion). Specifically, we extend the Kalman filter algorithm to cases where models are non-linear:
- Motion model
- Measurement model
Let
The EKF algorithm is
Unscented Kalman Filter (UKF)
Another non-linear Kalman filter is the unscented Kalman filter which uses deterministic sampling to approximate the distribution.
Let a function be a non-linear transformation. Suppose and . We draw a set of sigmapoints, , and compute a weight for each sigma point by executing
Further let , then , where
The UKF algorithm is
Reference
[1] S. Thrun, W. Burgard, and D. Fox, Probabilistic Robotics. Mit Press, 2010.
[2] M. Ghaffari, ROB530 Lecture Slides — Kalman Filtering. University of Michigan, 2026.
Acoustic Damping with a Helmholtz Resonator
Published:
Wave Theory of Sound
Two fundamental equations in acoustics are
- Continuity Equation
The continuity equation states that the rate at which mass enters a system is equal to the rate at which mass leaves the system plus the accumulation of mass within the system.
- is fluid density
- is the velocity vector of the fluid
This equation is essentially .
- Momentum Equation
The momentum equation states that the rate of change of momentum of a fluid element is equal to the sum of the pressure gradient force, convective acceleration, and body forces.
- is the pressure
- is the body forces per unit mass.
This equation is essentially .
Wave Equation
The ambient state is characterized by its pressure , density , and fluid velocity . Acoustic disturbances are regarded as perturbations to the ambient state.
It can be derived from the previous two equations that the acoustic pressure , which is the difference between the pressure at and the ambient pressure, satisfies the wave equation
where is the speed of propagation. For simplicity, we only consider the one-dimensional case.
Sinusoidal Plane-wave
Suppose the sound we use in this experiment is a sinusoidal wave with a constant angular frequency . By introducing phasors, then by separation of variable, the solution is
Plug back into the wave equation to get
or simply
which is a very familiar second-order linear ODE. The general solution is
where is called the wave number. Finally, the total solution is
where are determined by initial and boundary conditions.
Acoustical Two-ports
Consider a model in the figure below where an acoustic element is located between two straight ducts.
This two-port system is fully characterized by an acoustic transfer matrix, which can be written as a relation between the pressure and velocities on each side of the two-port system.
We need to consider two variables , the acoustic pressure and the volume velocity , which are analogous to voltage and current in electric circuits respectively. Then, similar to Kirchhoff’s voltage and current law, we have the continuity of pressure law and continuity of volume law.
Continuity of Pressure
This law states that acoustic pressure does not vary appreciably over distances much less than a wavelength. Consider a path from to ,
Starting from the momentum equation (Equ. 2), neglecting nonlinear term , and integrate along the path, we get,
where is the density of the medium, and is the particle velocity vector.
Continuity of Volume Velocity
This law states that the net volume velocity flowing out of a volume is zero.
Starting from the continuity equation (Equ. 2), with an application of Gauss’s theorem, we obtain
Acoustic Impedance
We can define the acoustic impedance as the ratio of acoustic pressure and volume velocity,
Notice that a proper definition can be found here.
For a Helmholtz Resonator, geometry shown below,
we have,
where is the acoustic compliance, and is the acoustic inertance, is the “effective neck length”.
Reflection and Transmission
The reason why Helmholtz Resonator can reduce noise is that it can “reflect” incident acoustic wave almost completely when the frequency of the wave is at the resonance frequency.
We define the reflection coefficient as the ratio of the intensity of the transmitted wave () to the incident wave (),
It can be derived that (but I don’t know how), for Helmholtz Resonator,
At the resonance frequency,
The acoustic impedance of HR is
Then,
Thus the resonator has the potentially useful property of causing nearly total reflection of acoustic waves at frequencies near its resonance frequency. That’s why it is useful for acoustic damping.
The transmission loss is defined as
By setting proper value for , we can plot the transmission loss vs. frequency
Experiment
In the experiment, we want to measure the sound transmission loss of a Helmholtz Resonator for different frequency. The setup is shown below
where test specimen is a HR connected as a side branch.
We use the loudspeaker to generate a sinusoidal wave at frequency , then on the left hand side (upstream),
and on the right hand side (downstream),
Here is the incident wave, is the reflected wave, and is the transmitted wave. By definition, the transmission loss is
To solve for and , we use four microphone to measure the acoustic pressure at four different places,
Solving for ,
For convenience, we can take , then the transmission loss is
The microphone record voltage signal, which is proportional to the acoustic pressure , apply Fourier transform, take
Thus, we can see that are just the Fourier coefficient at the given frequency. Since we only need to ratio, by applying a Fast Fourier Transform directly to the voltage signal, and take their ratio will give the same result.
Solving Basic PDEs
Published:
A partial differential equation (PDE) is a relation involving one or more functions of several variables, and their partial derivatives. The order of a partial differential equation is the order of the highest partial derivative that appears in the equation.
There are three classical partial differential equations of order two which are very common in physical applications. They are known as the heat equation (1), the wave equation (2), and the Laplace equation (3).
Heat Equation
Consider the boundary-value problem
First, it’s easy to see that the equation is linear, which means any linear combination of solutions is again a solution. This suggests the following “plan” for solving this boundary-value problem.
- Find as many solutions as we can that satisfies the boundary condition .
- Find a proper linear combination of to satisfies the initial condition .
An important method to solve partial differential equations is separation of variables, which means we first assume . Then
Plug this in the original equation and get
divide both side by gives
Notice that the left-hand side only depend on , while the right-hand side only depend on . The only way that they can equal for any and is that both of them are constant. Thus,
By separation of variable, we have transformed a PDE into two solvable ODEs,
Now, plug in the boundary condition.
Since (not identically zero, otherwise would be identically zero, and we have a trivial solution), we have .
Now, we only need to solve
and
The first one is a second-order, linear, constant coefficient ODE, which we can directly write out the solution. There are three different cases:
- : The general solution is for some constant . Plug in the boundary condition, we have . Thus, there are no non-trivial solution for .
- : The general solution is for some constant . The boundary condition implies
This linear system of equation has non-trivial solution if and only if
This implies , which is impossible since . Therefore, there are also no non-trivial solution for .
- : The general solution is for some constant The boundary condition implies
For , we need to have for positive integer , or . Then the equation has non-trivial solutions
Then, we have the second equation,
which has solution
Hence, we have
Finally, we need to use the initial condition to find a proper linear combination of ’s, i.e., find coefficients ’s such that
satisfies . This means that
It’s easily seen that ’s should be the Fourier coefficient of the pure sine expansion, which is
Finally,
is the desired solution.
Wave Equation
Consider the boundary-value problem
This problem can also be solved by separation of variable. Let , then
Plug into the original equation, we have
or two ODEs,
and
We’ve seen in the previous problem that the first equation has solutions
For the second equation, it becomes
The solutions are
Thus,
is a non-trivial solution of the BVP for every positive integer , and every suitable constants . The linear combination
satisfies
To satisfy the initial condition, we only need to have
By Fourier sine expansion, we have
Solving ODEs by Series
Published:
Now, we consider a somehow more general case, where we want to solve a linear second-order homogeneous equation with coefficients that are polynomials. Consider the following equation
where in some interval and are polynomials of .
Properties of Power Series
An infinite series
is called a power series about . The following are some of its properties.
- All power series have an interval of convergence. This means that there exists a nonnegative number such that the infinite series converges for , and diverges for . The number is called the radius of convergence of the power series.
- The power series can be differentiated and integrated term by term, and the resultant series have the same radius of convergence.
- One simple method to determine the radius of convergence of a power series is the Cauchy ratio test. Suppose
Then the radius of convergence is .
- Many functions that arise in applications can be expanded in power series, i.e., there exists a series of coefficients so that
Such functions are said to be analytic at , the series expansion is called the Taylor series of about , and the coefficients can be found by
- The radius of convergence can be determined if we treat the function in the complex plane. Let be the point that is closest to that fails to exist, then the radius of convergence is .
- The product of two power series is still a power series, this product is called the Cauchy product
where the coefficient is
which is called the convolution of and .
- Important series expansions (Remember them!)
Series Solution
For a second-order equation with variable coefficients (but are analytic functions, at least), although it’s not very rigorous, we can first expect that the solution is also a power series. Then, by doing series expansion and setting the polynomial coefficients properly, we can expect the equation to hold. A formal statement of this is as follows.
Let the functions and have convergent Taylor series expansions about , for . Then, every solution of the differential equation
is analytic at , and the radius of convergence of its Taylor series expansion about is at least . Set
the coefficients are determined by plugging the expression into the differential equation and setting the sum of the coefficients of like powers of in this expression equal to zero.
However, it is possible that sometime , and the above method of series expansion will fail to work, but we still want to find a solution around the point . One possible case is Euler’s equation.
Euler’s Equation
The differential equation
where are constant is known as Euler’s equation.
This equation has a singular point at and the series solution method above cannot work. However, we can treat and separately.
First, assume that .
For , let (), and let . Then the equation is transformed into
which should have the same solution as the case when . Thus, for the final solution, we only need to put a modulus on .
Observe that , , and will have the same order if . Our goal is to find two independent solutions. Therefore, we plug in into the equation.
The original equation then becomes,
Hence, is a solution if and only if is a solution of the quadratic equation
Two Different Real Roots,
The general solution is given by
where .
Two Equal Roots,
One solution that is obvious is
where .
To find the second independent solution, applying the differential operator
and rewrite the equation as . Plug in , we have
Taking partial derivatives of both sides with respect to gives
That is
We can see that if , then the right-hand side will vanish. Therefore, the second solution is
The general solution is
Two Complex Solutions,
Let
and
It can be shown that the real and imaginary parts are two independent solutions. Therefore,
and the general solution is
where , .
Method of Frobenius
Here, we want to find a class of singular differential equations that is more general than Euler’s equation. A very natural generalization is the following
where and can be expanded in series of the form
In this case, the equation is said to have a regular singular point at . Equivalently, an equation has a regular singular point at if functions and are analytic at . It has a regular singular point at if functions and are analytic at .
We focus on the interval and multiply through the equation by and get
Compare the above equation with Euler’s equation, and find that it seems like adding higher order terms, i.e., by changing to , and changing to . Therefore, we may be able to obtain a solution by plugging in higher order term to the original solution. Specifically, let
Then,
and
Plug these in the original equation at get
If we set
and set the coefficient of each like power equal to zero, we have an indicial equation
and a recurrence relation
From these two relations, we can (hopefully) obtain two sets of coefficients and and the two independent solutions are
if we can obtain two different real solutions and , with .
Next, we discuss the possible situations that the above conditions are not satisfied.
Two Complex Roots
If is a complex root of the indicial equation, then
is a complex-valued solution, whose real and imaginary part are both real-valued solutions.
Equal Roots
If is the only root of the indicial equation, then we have only one independent solution.
To find the other solution, use the similar method of dealing with Euler’s equation. The second solution is
Roots Differing by a Positive Integer
Suppose that and , where , are two roots of the indicial equation. Then, we can certainly find one solution
It may be impossible to find another solution with coefficients .
In this case, the equation will have a second solution of the form
Elementary Complex Analysis
Published:
Holomorphic Functions
A function is complex differentiable, or holomorphic, at if
exists. The following are some of the properties of holomorphic functions.
Cauchy-Riemann Equations
Suppose that is holomorphic, and
then, the functions must satisfies the Cauchy-Riemann differential equations,
In addition, we define
then
Cauchy’s Theorem
Let be an open set, holomorphic on and and oriented smooth curve. We define the integral of along by
where is a parametrization.
One of the most important properties of holomorphic function is known as Cauchy’s theorem.
If is holomorphic in a disc, then,
for any closed curve in that disc.
With this properties, we can actually calculate many integrals along the real axis by extending it into an integration along a contour in the complex plane. The situation that we integrate over a semi-circle of radius and want that part to vanish is quite common, and Jordan’s lemma present a useful general result.
Assume that for some the function is holomorphic. Let
for some . Let
be a semi-circle segment in the upper half-plane and assume that
as . Then
Cauchy’s Integral Formula
Suppose is a holomorphic function in an open set . If a closed disc defined as
is completely contained in . Let be the positively oriented circle forming the boundary of , i.e., . Then, for all
This formula is known as the Cauchy’s integral formula. Furthermore, has infinitely many complex derivatives in , and for all ,
It then follows that holomorphic functions are analytic, specifically, for all
where
Residue Calculus
If has a pole of order at , then there exists a neighborhood of , numbers and a holomorphic function such that
for all . The term
is called the principal part of at the pole . To calculate the coefficients of the principal part, we have
The coefficient of is called the residue of at the pole , denoted as
In fact, we can calculate the residue without knowing the series expansion. In the case where has a simple pole at , it is clear that
If has a pole of order at , then
Residues are very useful at calculating contour integration in the complex plane.
Suppose that is holomorphic in an open set containing a (positively oriented) toy contour and its interior, except for poles at points inside . Then,
In addition, let and be polynomials of degree and respectively, where . If for , if has a zero of order at most 1 at the orgin, and if
for , then
where are the non zero poles of .
Laplace Transform and Inverse
Published:
The Laplace transform was first introduced by Pierre-Simon Laplace in his work on probability theory. The transform turns out to have many applications in science and engineering, especially in solving linear differential equations.
The Laplace Transform
Let be defined for . The Laplace transform of , which is denoted by , or is given by
where the improper integral is understood as
Remarks:
Two conditions that should be satisfied to have a Laplace transform (for sufficiently large ) are
- The function is of exponential order, i.e., there exist constants and such that for all ,
- The function is piecewise continuous, i.e., has at most a finite number of discontinuities on any interval , and both the limit from the right and the limit from the left of exist at every point of discontinuity.
The Laplace transform doesn’t care how the function behaves when , this means that different functions can have the same Laplace transform as long as they agree on .
Properties
In the following, we will assume , , or denotes as , . We have the following important properties.
- Linearity
- Frequency domain derivatives
- Time domain derivatives
- Frequency shifting
- Time shifting
where is the Heaviside step function (unit step function), and .
- Time scaling
where .
- Convolution
The convolution between two functions is defined as
Then, the Laplace transform of the convolution is the ordinary product of the Laplace transforms,
Laplace Transform of Common Functions
The following lists the Laplace transform of some common functions:
- Dirac Delta function (unit impulse)
- Heaviside step function (unit step)
- Polynomial
For integers, the following two formulas hold
where
is the Euler’s Gamma function, with .
- Exponential decay
- Sine and Cosine
- Hyperbolic Sine and Cosine
Inverse Laplace Transform
The inversion of the Laplace Transform is given by the Mellin inversion formula. In particular, if is continuous on , continuously differentiable on , and satisfies the condition for the Laplace transform to exist, then
for all , where
is called the Bromwich integral of , and the (positively oriented) contour is defined as
where is any number such that is analytic for all with .
To calculate the Bromwich integral, we should consider the case when and separately.
For , the integration contour closed on the left.
Let be the straight line and be the semicircle (orientation indicated by the arrows). Then, by the residue theorem
Letting , the first integral be the Bromwich integral we want. For the second integral, by applying the substitution rule,
where is a semi-circle of radius in the upper half-plane. In most cases, we can use Jordan’s Lemma to show that the right-hand side goes to 0 as .
Then, the Bromwich integral of is just the sum of all the residues of ,
For , the integration contour is closed on the left,
Let be the straight line and be the semi-circle (orientation indicated by the arrows). Using a similar argument, we see that
Since the integral over the whole closed contour is zero, and the orientation of the straight line is in the opposite direction, the Bromwich integral is
which in most cases vanishes according to Jordan’s Lemma.
Reference
[1]. VV286 Lecture Slides, Horst Hohberger, UM-SJTU JI.
Linear System of ODEs Theory
Published:
An important kind of equation to learn is those linear equations, which are those can be written in the form of
for some linear differential operator and unknown . We begin by the example of second order homogeneous linear equations.
Second Order Linear Equations
Given a linear, second order, homogeneous ODE
Now, we are going to show the following two things:
- If are solution to (1), then is also a solution to (1) for some constants .
- If and are two linearly independent solutions to (1), then all solution to (1) have the form for some constants .
The first one relies on the linearity of differential operator
We can show that this operator is linear by observing that
- for any constant .
- .
Then, for any constant , we have
and the first one is proved.
The second one relies the Picard-Lindelof theorem also known as existence and uniqueness theorem:
Let , where is open and let , where be an interval.
Suppose is continuous in and Lipschitz continuous in , i.e. there exist a constant , such that for all and all ,
Then, the initial value problem
has one and only one solution in some interval containing .
An outline of the proof is as follows:
We can always construct a solution by using an iterative approximation method known as Picard iteration, which works as follows:
First, a solution to (2) will have to satisfy the integral equation
Then, by setting , and
yields a sequence of function . Then, under some condition on function , we can use Contraction Mapping Principle to show that the sequence converges to a unique function .
Algebraic Structure of Linear Equation
An interesting question to consider in the proof is that why is there exactly two independent solutions to a second linear order equation? A simple answer to this might be that since it contain second order derivative, when we solve it we shall integrate twice and each gives an integrating constant which makes there are two constants to determine.
In fact, if we apply some knowledge from linear algebra, we may easy to think of the solutions as a vector space, known as solution space which happens to be 2 dimensional. And the general case would be, for a linear equation of order , the dimension of its solution space is exactly . Now, we want to prove this.
First notice that given an explicit ODE of order ,
and set new variables
Equation (3) can be written as a linear system of equations of order 1,
or, simply denoted as
Therefore, for any linear equation of order , it is always equivalent to a system of first order linear equation.
Thence, we only need to consider the first order linear system of equations,
where , and .
Consider the associated homogeneous equation
Notice that given any two solution , and any number , their linear combination is also a solution.
Suppose that a basis of vectors , of is given and fix . Denote by the unique solution (theorem of Picard-Lindelof) to the initial value problem
and denote
Then, by superposition principle, any element of is a solution to (4)
Conversely, let be any solution to (4), and suppose that . We have
for some numbers . Therefore, by theorem of Picard-Lindelof,
This shows that the solution space is given by .
In addition, we have,
which means is a basis.
Now, let’s explain why a second order linear equation will have exactly two independent solutions. For equation (1), let , and denote , it is equivalent to
which is a first order linear system of equation in two dimension.
Solution to Linear System of Equations
Homogeneous, Constant Coefficients
First, consider the most basic case of linear system of equations, that is a homogeneous one with constant coefficients:
where and .
With proper definition, it can be shown that the solution to the initial value problem (6) is
where
Now, the problem is how to calculate this matrix exponential. Recall from linear algebra that diagonal matrix, or Jordan matrix, are relative easy to calculate the matrix exponential. Fortunately, we have the Jordan Normal Form theorem which states that every complex matrix is similar to a Jordan matrix, i.e., for any matrix , there exists a matrix and a Jordan matrix such that
Then, we can easily see that
and then
The problem now becomes calculating .
Let
be a Jordan matrix, where is a Jordan block of size and diagonal element , i.e.,
It’s easy to show that
Hence, it’s sufficient to calculate the exponential of Jordan blocks. Notice that
where is some nilpotent matrix whose power will vanish after finite many times, and that unit matrix commutes with any other matrix in terms of multiplication, i.e., . Therefore,
However, the above calculation seems too complicated that it would be impractical. If we trace back what we are actually doing, we know that we are actually finding a basis of the solution space.
From previous discussion of the structure of the solution, we know immediately that for any basis of , the system of functions is a fundamental system of solution. We collect them into one matrix and denoted as , and is also called the fundamental matrix.
It turns out that if we choose the basis wisely, we don’t even need to calculate the whole to get a fundamental matrix. From theory of Jordan normal form, we know that there exists a basis consists of (generalized) eigenvectors , such that the Jordan matrix of is given by . Using this basis, we have the fundamental matrix simply given by
The columns of will therefore be the fundamental system of solutions.
Inhomogeneous, Constant Coefficients
Next, we solve an inhomogeneous equation with constant coefficients. The general solution is the superposition of a particular solution and the solution to the associated homogeneous equation.
Consider the following linear system of differential equation
suppose is a fundamental system for the associated homogeneous equation, then the general solution is given by
Variable Coefficients
There is no general method of solving homogeneous equations with variable coefficients,
However, if we are given a fundamental system of solution , then we can find a particular solution to the inhomogeneous equation
by method of variation of parameter.
Let
and plug into equation (7), and get
leading to
Let be the fundamental matrix, and then the previous equation is indeed the following linear system of equation
Using Cramer’s rule, we can find the solution to .
where
is called the Wronskian.
Solving First and Second Order ODEs
Published:
First Order ODEs
First order ODE is equations that has the highest derivative of order one. It takes the form
The general solution to this equation is some function that satisfies the equation.
Separable Equation
A separable equation is of the form
Linear Equation
A linear equation is of the form
For now, we assume for all , then the equation above is simplified into
- If for all , it is called a homogeneous equation;
- Otherwise, it is called an inhomogeneous equation.
Homogeneous
A homogeneous linear equation is of the form
which is a separable equation. The general solution to it is
If given an initial condition , then the solution is
Inhomogeneous
An inhomogeneous linear equation is of the form
Let
and multiply , which is called the integrating factor, to both sides of the equation, we get . Therefore the solution is
If given initial condition , then we can find a specific constant .
Method of substitution
In addition to separable equations and linear equations for which we have a direct solving techniques, some other first order equations can be transformed into separable equations or linear equations by method of substitution. The following are some types of those transformable equations.
Equation:
Consider a equation of the form
where , .
Let , then,
which is a separable equation of .
Homogeneous Equation
A homogeneous equation has the form
This type of equation has an important characteristic that its invariant under zoom, i.e., if make change , , the equation does not change.
Tip:
Sometimes the equation is put into a strange form that it may not be obvious the right hand side is a function of . A quick judge would be to check whether the sum of power of and in the denominator and numerator of the right hand side are equal.
We solve this kind of equations by transforming them into a separable equation.
First, let , and therefore, , and . Substitute that in the equation and,
which is a separable equation. Here are some examples,
Example
Find the general solution to the following differential equation
Solution:
First, make the left hand side life a homogeneous equation,
and let . Then, the equation becomes,
integrate both side
which is
Change back into , and get
Now, suppose we are in the polar coordinate, where , and ,
or equivalently,
which is known as the exponential spiral.
Bernoulli’s Equation
A Bernoulli’s equation has the form
where . We solve this kind of equations by transforming them into a linear equation.
First, multiply the equation by , and get
Next, let , and observe that , thus the equation becomes,
which is a linear equation. Here is an example.
Example
Find the general solution of the following equation
Solution:
First, divide the equation by , which gives
Then, the substitution should be , and . Thus, the equation becomes
which is a linear equation.
Next, the integrating factor is
Multiply the last equation by gives
which is in fact,
Therefore,
and substitute back to get
Integral Curves
Even with the previous cases, there are not very much equations that we can solve. Now, we generalize the solution of a differential by not requiring it to be a explicit function. That is, we can solve (1) if we can found a function , such that (1) can be transformed into the form
Then, we integrate it with respect to and get a general solution
for some constant . This kind of solution is know as the integral curve.
Now, we discuss when can we write an equation into (2). By chain rule of partial differentiation, we have
compare this with equation (1), we know that for such a nice function to exist, we require
portfolio
INSIGHT: Smart Assistive Glass for Visually Impaired
In-device navigation and scene interpretation glasses for low-vision users.
Lumen Grid: Competitive Multi-Robot Parking Game
A fast-paced, interactive parking game using four Zumo robots on an LED-marked field.
Transformable Wheel
An autonomous vehicle equipped with transformable wheels for complex terrain locomotion.
research
Highly Deformable Proprioceptive Membrane for Real-Time 3D Shape Reconstruction
A soft, stretchable optical waveguide sensor with embedded LEDs and photodiodes for learning-based 3D deformation reconstruction.
talks
Talk 1 on Relevant Topic in Your Field
Published:
This is a description of your talk, which is a markdown file that can be all markdown-ified like any other post. Yay markdown!
Conference Proceeding talk 3 on Relevant Topic in Your Field
Published:
This is a description of your conference proceedings talk, note the different field in type. You can put anything in this field.
teaching
Teaching experience 1
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Teaching experience 2
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.












