A while ago i wrote a document on how you could use a Kalman Filter to merge the sensor readings of a gyro and an accelerometer into pitch and roll for a quadrocopter. This is pretty much a copy/paste of this document with some small adaptions for the blog format. I haven’t got my quadrocopter to fly yet using this filter but I’m pretty sure that it’s possible if I just find the time to trim all paramters of the filter and controller.
The aim of this post is not to present all the equations proving the optimality of a Kalman filter, there is enough papers and books doing that. Instead you should, after reading this, understand my filter implementation enough to be able to tune it for your purposes. Because of this the document may not always be 100% mathematically correct and concepts I do not find important may be left out completely.
To understand this paper you need to know some basic trigonometry and linear algebra. For those of you not comfortable with linear algebra I suggest you to read a little. The Wikipedia articles about matrix multiplication, matrix inversion and matrix transpose could be a good start. I will not go into detail on explaining exactly how the Kalman filter works but I will explain the steps needed to construct such a filter. For a more indepth view on the Kalman filter I recommend you to read the Wikipedia article or one of the several good books on the subject. This document will however use the same notations as in the Wikipedia article.
The reason I constructed this filter in the first place was to control the pitch and roll of a quadrocopter where neither the accelerometer nor the gyro could be used by itself to get accurate readings. The accelerometer can be used to estimate the direction of the gravity acceleration vector while stationary but this becomes difficult when the quadrocopter is moving since acceleration composants will appear in other directions than downwards. The gyro reading on the other hand can be integrated over time to get the angle relative to starting position but each measurement error will integrate into the angle estimation and the error will build up over time, especially since most gyros have a small bias. One of the solutions of this problem is to fuse the readings from both sensors using a Kalman filter. This filter uses the gyro readings for quick angular changes and the accelerometer to correct the error over longer periods of time. Other solutions like DCM or complementary filters exist but are not explained in this document.
The filter
A composite Kalman filter for estimating both pitch and roll could be created, taking into account the fact that they are coupled through the accelerometer gravity vector. Such a filter would be very complex and highly nonlinear because of the trigonometry involved forcing us to use an Extended Kalman Filter (EKF). It will probably require a large amount of trigonometry function evaluation and will not, on a microcontroller without hardware floatingpoint capabilities, yield enough performance to balance a quadrocopter. Instead the filter could be divided into two identical filters one for pitch and one for roll. This not only makes the filter linear but also removes almost all computationally heavy trigonometry functions.
This simplification is based on the fact that the changes in gravity acceleration vector for pitch and roll are more or less independent of each other for small angles when the angle approaches ±90° there will be errors in the output. The filter constructed will work best for small changes of pitch or roll which generally isn’t a problem when stabilizing a quadrocopter. Hence this filter may not be optimal for acrobatic flight performance and maybe more suitable for aerial photography or similar tasks.
Steps of evaluating the filter
The Kalman filter consists in six steps executed over and over again with a sample time between the executions. In the notation used in this document a variable is subscripted with it refers to the value during previous execution and if it is subscripted with it refers to the value during the current execution. The steps are:
1. State prediction based on dynamic model.
One of the most important parts of the Kalman filter is the model of the system. This is the part which keeps the output sane while the sensors are feeding the filter with insane readings. For example reducing the effect of measurement noise or the acceleration vector pointing somewhere else than down due to other acceleration than gravity. The model is used to predict the next state of the model based on the current state and the control signals.
Since we separate the estimation of pitch and roll the quadrocopter state x according to one of the filters can be described by three different variables. Keep in mind that this is for pitch OR roll. Two identical filters are created.
[latex]
\boldsymbol{x} = \left[ \begin{array}{c}
\theta \\
\dot{\theta} \\
\dot{\theta}_b \end{array} \right]
[/latex]
The angle the angular velocity and the bias in the angular velocity measurements Using this state representation the equation
[latex]
\boldsymbol{x}_k = \boldsymbol{Fx}_{(k1)} + \boldsymbol{Bu}_k + \boldsymbol{w}_k \: \: (1)
[/latex]
Can be used to make a good guess of the states in this sample based on the state in the previous sample and the current control signal The matrix represents the dynamics of the system. Multiplied with the old states the result is a guess of the new states. For the quadrocopter the following F matrix is used.
[latex]
\boldsymbol{F} = \left[ \begin{array}{ccc}
1 & \Delta t & \Delta t \\
0 & 1 & 0 \\
0 & 0 & 1 \end{array} \right]
[/latex]
If you expand the matrix equation for x_{k} the result will be
[latex]
\begin{array}{l}
\theta_k = \theta_{(k1)} + \Delta t(\dot{\theta}_{(k1)} – \dot{\theta}_{b_{(k1)}} ) \\
\dot{\theta}_k = \dot{\theta}_{(k1)} \\
\dot{\theta}_{b_k} = \dot{\theta}_{b_{(k1)}} \end{array}
[/latex]
If you think about it, this seems fairly reasonable. The angle in the next sample will be the angle in the sample before plus the unbiased angular velocity multiplied with the sampling time. The model also says that the other two states will remain the same as in the previous sample. This is a simplification saying that the angular velocity and angular velocity measuring bias will change slow enough to be barely noticeable between two sample times.
The next part of equation (1) involves the control signal input to the system. The multiplication with the B matrix can be expanded in exactly the same way as the F matrix but in the case of the quadrocopter inputs are ignored thus
[latex]
\boldsymbol{B} = 0
[/latex]
Inputs could be taken into consideration but this will increase the complexity and computational requirements of the filter and the accuracy gain will probably not be significant.
Just to explain how this works (you can skip this section if you want), let’s say you also feed the filter with the input to the motors affecting the filter angle u_{1} and u_{2} and you also know that the force generated by the motor is linearly dependent of the input by the constant k. You also know the inertia Jaround the axis of the angle. You could use
[latex]
\boldsymbol{u} = \begin{bmatrix}
u_1 \\
u_2
\end{bmatrix}
[/latex]
[latex]
\boldsymbol{B}=\begin{bmatrix}
0 & 0 \\
\frac{k}{J} & \frac{k}{J} \\
0 & 0
\end{bmatrix}
[/latex]
resulting in a model for the estimation of the angular velocity looking like this
[latex]
\dot{\theta}_k = \dot{\theta}_{k1} + \frac{k}{J}\left ( u_1 – u_2 \right )
[/latex]
Since the model is pretty coarse there is a need for something more to take model errors into consideration. It is easy to see that the model for the 2^{nd} and 3^{rd} state will not be 100% accurate all the time. This is where w, which is called model noise, comes in. A zero mean Gaussian noise is added to each state to represent changes that doesn’t agree with the model. In mathematical terms
[latex]
\boldsymbol{w} \sim N\left ( 0, \boldsymbol{Q} \right )
[/latex]
Q is the covariance matrix of the noise in other words how much noise there is and if the noise on one state depends on another state. For the quadrocopter the noise on each state is considered to be independent which makes Q diagonal
[latex]
\boldsymbol{Q} = \begin{bmatrix}
q_1 & 0 & 0 \\
0 & q_2 & 0 \\
0 & 0 & q_3
\end{bmatrix}
[/latex]
The elements on the diagonal represent how large model errors we expect in the different states and will be an important tuning parameter in the filter. The fact that this matrix is diagonal will allow us to do a lot of optimizations while realizing the filter in Ccode later on.
2. State covariance matrix update based on model
Another important part of the Kalman filter is the matrix P which represents how much we trust the current values of the state variables. A small P matrix means that the filter has converged close to the real value. When the model predictions in step 1 have been done the P matrix has to be updated according to the uncertainties induced by the model noise w with the covariance Q. This is done with the equation
[latex]
\boldsymbol{P}_k = \boldsymbol{FP}_{k1}\boldsymbol{F}^T + \boldsymbol{Q}
[/latex]
where the superscript T means that the matrix is transposed. You scale the current value of P with F^{2} and then add the covariance matrix of the model noise. This hopefully makes sense since you have made a guess on the current states based on a coarse model the uncertainty must increase.
3. Model and measurement differences
The next part of the filter is where the new measurements come in. In this step the difference between the states calculated from the model is compared to what is measured using the equation
[latex]
\boldsymbol{y}_k = \boldsymbol{z}_k – \boldsymbol{Hx}_k + \boldsymbol{v}_k
[/latex]
which introduces the H matrix. This is a matrix which maps the current state on the measurements. In the quadrocopter case we measure the angle based on the gravity vector by using the atan2(b, a) on the acceleration vectors perpendicular to the filters rotational axis and the angular rate measured by the gyro. This results in a measurement vector
[latex]
\boldsymbol{z} = \begin{bmatrix}
\theta \\
\dot{\theta}
\end{bmatrix}
[/latex]
Since the gyro bias will be a part of the angular rate and cannot be measured we will leave it to the Kalman filter to find out. To map or state vector x onto the measurement vector z we multiply by the matrix
[latex]
\boldsymbol{H} = \begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0
\end{bmatrix}
[/latex]
As with Q the zeros in this matrix reduces the number of multiplications needed when this is realized in Ccode allowing further optimizations.
The vector v is similar to w it represents errors in the measurements which are assumed to be zero mean and Gaussian distributed as well
[latex]
\boldsymbol{v} \sim N\left ( 0, \boldsymbol{R} \right )
[/latex]
R is similar to Q the covariance matrix of the measurements. This can be directly related to the sensor quality. As with the Q matrix the sensor errors are assumed to be independent which make the R matrix diagonal as well
[latex]
\boldsymbol{R} = \begin{bmatrix}
r_1 & 0 \\
0 & r_2
\end{bmatrix}
[/latex]
As you may have noticed by now, we like matrices with a lot of zeroes since this allow us to optimize the filter implementation.
4. Measurement covariance update
After including new data in form of measurements we need to calculate the covariance matrix of this data which is known as S and is similar in many ways to P. This is calculated with the equation
[latex]
\boldsymbol{S}_k = \boldsymbol{HP}_k \boldsymbol{H}^T + \boldsymbol{R}
[/latex]
You can see that the value of S depends on the covariance of the previous model predictions transformed to the measurement vector through H plus the covariance of the sensor readings. Once again I want to point out that a large covariance matrix means that we have low confidence in the values while a small covariance would mean that the values should be fairly accurate.
5. Calculate Kalman gain
Now we are getting close to the part where we merge the knowledge from the model with the measurements. This is done through a matrix called the Kalman gain K This matrix will help us weigh the measurements and the model together. K is calculated by
[latex]
\boldsymbol{K}_k = \boldsymbol{P}_k \boldsymbol{H}^T \boldsymbol{S}_k^{1}
[/latex]
What is done here may not be obvious at the first glance but makes sense if you think about it. You take the H matrix which maps the states onto the measurements and multiply it with the state covariance and the inverse of the measurement covariance. If we play with the thought that we have large confidence in our model and low confidence in our measurements the resulting K will small. On the other hand if we have low confidence in the model and high confidence in the measurements K will be large.
6. Improve model prediction
Now it is time to improve the model prediction by adding the difference between measurements and predictions to the states predicted in step 1 after scaling it with the gain matrix K
[latex]
\boldsymbol{x}_k = \boldsymbol{x}_k + \boldsymbol{K}_k \boldsymbol{y}
[/latex]
The result will be the filter output at this sample time. Remember from the previous step that a larger confidence in the model than the measurements will result in a small K decreasing the importance of the measurements and the opposite if we trust the measurements more than the model.
7. Update state covariance with new knowledge
Since we added data in form of measurements to the state vector we need to update the state covariance matrix P which is done by the equation
[latex]
\boldsymbol{P}_k = \left ( \boldsymbol{I} – \boldsymbol{K}_k \boldsymbol{H} \right )\boldsymbol{P}_k
[/latex]
Where I is the identity matrix
[latex]
\boldsymbol{I} = \begin{bmatrix}
1 & 0 & 0\\
0 & 1 & 0\\
0 & 0 & 1
\end{bmatrix}
[/latex]
in this case. This can be seen as scaling P down because we are a little bit more certain of the state values after adding measurement knowledge. As you can see P will be growing in step 2 after making predictions based on the model which of course makes us more uncertain of the states and then shrink again in this step after we corrected the predictions with measurements. After this step is done we can wait for the next sample and then start over at step 1.
Summary
To sum up the filter in the quadrocopter case consists in the following calculations performed at each sample
[latex]
\boldsymbol{x}_k = \boldsymbol{Fx}_{(k1)} + \boldsymbol{Bu}_k + \boldsymbol{w}_k
[/latex]
[latex]
\boldsymbol{P}_k = \boldsymbol{FP}_{k1}\boldsymbol{F}^T + \boldsymbol{Q}
[/latex]
[latex]
\boldsymbol{y}_k = \boldsymbol{z}_k – \boldsymbol{Hx}_k + \boldsymbol{v}_k
[/latex]
[latex]
\boldsymbol{S}_k = \boldsymbol{HP}_k \boldsymbol{H}^T + \boldsymbol{R}
[/latex]
[latex]
\boldsymbol{K}_k = \boldsymbol{P}_k \boldsymbol{H}^T \boldsymbol{S}_k^{1}
[/latex]
[latex]
\boldsymbol{x}_k = \boldsymbol{x}_k + \boldsymbol{K}_k \boldsymbol{y}
[/latex]
[latex]
\boldsymbol{P}_k = \left ( \boldsymbol{I} – \boldsymbol{K}_k \boldsymbol{H} \right )\boldsymbol{P}_k
[/latex]
The filter is tuned by modifying the Q and R matrices
[latex]
\boldsymbol{Q} = \begin{bmatrix}
q_1 & 0 & 0 \\
0 & q_2 & 0 \\
0 & 0 & q_3
\end{bmatrix}
[/latex]
[latex]
\boldsymbol{R} = \begin{bmatrix}
r_1 & 0 \\
0 & r_2
\end{bmatrix}
[/latex]
Variable 
Description 
q_{1} 
How well do the model represent the changes of 
q_{2} 
How well do the model represent the changes of 
q_{3} 
How well do the model represent the changes of 
r_{1} 
How much do we trust the measurement of 
r_{2} 
How much do we trust the measurement of 
These variables are relative meaning that increasing one of the variables yields the same result as decreasing all other variables. In the quadrocopter case some pointers could be given about the magnitude of these variables.
 r_{2} should be larger than r_{ 1} since the gyro readings will not be affected by the quadrocopters acceleration. This will lead to the gyro responding quicker than the accelerometer.
 q_{3} should be low since the gyro bias change extremely slow.
 r_{1} should be larger than q_{1} to prevent acceleration from affecting the angle.
 The relation between q_{2} and r_{2} will control how sensitive the angle output is to noise and how quick it will respond to changes.
Implementation
In some MATLAB inspired pseudocode you could write the filter as:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36

dt = 0.01; x = [0; 0; 0]; F = [1 dt dt; 0 1 0; 0 0 1]; Q = [q1 0 0; 0 q2 0; 0 0 q3]; R = [r1 0; 0 r2]; loop { x = F*x; P = F*P*F' + Q; z = readSensors(); y = z  H*x; S = H*P*H' + R K = P*H*inv(S); x = x + K*y; P = (IK*H)P; updateControl(x); wait(dt); } 
Since this filter is meant to be run on a microcontroller (a 16bit Microchip dsPIC in my case), together with some form of control loop for feedback control of the quadrocopter pitch and roll, it is essential to keep execution time down to a minimum. C also lack native support for matrix operations. Because of this I have put both time and effort in expanding and optimizing the equations above.
I also choose to use floating point math instead of fixed point for the cimplementation. There are two reasons for this, I once heard someone say that Kalman filtering needs the dynamic range provided by floating point math. Maybe Q15/Q16 implementation would be enough for this application but the main reason of using floating point match would be me being lazy and the resulting execution time is good enough.
The resulting filter were finally implemented in a dsPIC33 processor running at 40 MIPS allowing for at least 500 Hz sampling still leaving plenty of cycles for executing control algorithms.
Code example
The filter is divided into two different files. The first is a header file (kalman.h) which contains declarations and definitions. The second is the main file which contains all code (kalman.c).
I will also show an example on how this code can be called on from another file (main.c). Keep in mind that the direction of the axes differs from setup to setup.
kalman.h
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

#ifndef _KALMAN_H #define _KALMAN_H #define DT 0.01f // 100Hz // Q diagonal 3x3 with these elements on diagonal #define Q1 5.0f #define Q2 100.0f #define Q3 0.01f // R diagonal 2x2 with these elements on diagonal #define R1 1000.0f #define R2 1000.0f struct _kalman_data { float x1, x2, x3; float p11, p12, p13, p21, p22, p23, p31, p32, p33; }; typedef struct _kalman_data kalman_data; void kalman_innovate(kalman_data *data, float z1, float z2); void kalman_init(kalman_data *data); #endif 
kalman.c
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101

#include "kalman.h" // Setup the kalman data struct void kalman_init(kalman_data *data) { data>x1 = 0.0f; data>x2 = 0.0f; data>x3 = 0.0f; // Init P to diagonal matrix with large values since // the initial state is not known data>p11 = 1000.0f; data>p12 = 0.0f; data>p13 = 0.0f; data>p21 = 0.0f; data>p22 = 1000.0f; data>p23 = 0.0f; data>p31 = 0.0f; data>p32 = 0.0f; data>p33 = 1000.0f; data>q1 = Q1; data>q2 = Q2; data>q3 = Q3; data>r1 = R1; data>r2 = R2; } void kalman_innovate(kalman_data *data, float z1, float z2) { float y1, y2; float a, b, c; float sDet; float s11, s12, s21, s22; float k11, k12, k21, k22, k31, k32; float p11, p12, p13, p21, p22, p23, p31, p32, p33; // Step 1 // x(k) = Fx(k1) + Bu + w: data>x1 = data>x1 + DT*data>x2  DT*data>x3; //x2 = x2; //x3 = x3; // Step 2 // P = FPF'+Q a = data>p11 + data>p21*DT  data>p31*DT; b = data>p12 + data>p22*DT  data>p32*DT; c = data>p13 + data>p23*DT  data>p33*DT; data>p11 = a + b*DT  c*DT + data>q1; data>p12 = b; data>p13 = c; data>p21 = data>p21 + data>p22*DT  data>p23*DT; data>p22 = data>p22 + data>q2; //p23 = p23; data>p31 = data>p31 + data>p32*DT  data>p33*DT; //p32 = p32; data>p33 = data>p33 + data>q3; // Step 3 // y = z(k)  Hx(k) y1 = z1data>x1; y2 = z2data>x2; // Step 4 // S = HPT' + R s11 = data>p11 + data>r1; s12 = data>p12; s21 = data>p21; s22 = data>p22 + data>r2; // Step 5 // K = PH*inv(S) sDet = 1/(s11*s22  s12*s21); k11 = (data>p11*s22  data>p12*s21)*sDet; k12 = (data>p12*s11  data>p11*s12)*sDet; k21 = (data>p21*s22  data>p22*s21)*sDet; k22 = (data>p22*s11  data>p21*s12)*sDet; k31 = (data>p31*s22  data>p32*s21)*sDet; k32 = (data>p32*s11  data>p31*s12)*sDet; // Step 6 // x = x + Ky data>x1 = data>x1 + k11*y1 + k12*y2; data>x2 = data>x2 + k21*y1 + k22*y2; data>x3 = data>x3 + k31*y1 + k32*y2; // Step 7 // P = (IKH)P p11 = data>p11*(1.0f  k11)  data>p21*k12; p12 = data>p12*(1.0f  k11)  data>p22*k12; p13 = data>p13*(1.0f  k11)  data>p23*k12; p21 = data>p21*(1.0f  k22)  data>p11*k21; p22 = data>p22*(1.0f  k22)  data>p12*k21; p23 = data>p23*(1.0f  k22)  data>p13*k21; p31 = data>p31  data>p21*k32  data>p11*k31; p32 = data>p32  data>p22*k32  data>p12*k31; p33 = data>p33  data>p22*k32  data>p13*k31; data>p11 = p11; data>p12 = p12; data>p13 = p13; data>p21 = p21; data>p22 = p22; data>p23 = p23; data>p31 = p31; data>p32 = p32; data>p33 = p33; } 
main.c
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40

#include #include "kalman.h" // Structs for containing filter data kalman_data pitch_data; kalman_data roll_data; // Main function int main() { float acc_x, acc_y, acc_z; float gyr_x, gyr_y, gyr_z; float acc_pitch, acc_roll; float pitch, roll; kalman_init(&pitch_data); kalman_init(&roll_data); while(1) { // Read sensor read_accelerometer(&acc_x, &acc_y, &acc_z); read_gyro(&gyr_x, &gyr_y, &gyr_z); // Calculate pitch and roll based only on acceleration. acc_pitch = atan2(acc_x, acc_z); acc_roll = atan2(acc_y, acc_z); // Perform filtering kalman_innovate(&pitch_data, acc_pitch, gyr_x); kalman_innovate(&roll_data, acc_roll, gyr_y); // The angle is stored in state 1 pitch = pitch_data.x1; roll = roll_data.x1; delay(/* some time */); } return 0; } 
Thanks Ercan for pointing out my programming error in kalman.c