Introduction

I recently had the opportunity to use Padé approximants. There is a lot of good information available on line on the theory and applications of using Padé approximants, but I had trouble finding a good example explaining just how to calculate the co-efficients.

Basic Background

Hearken back to undergraduate calculus for a moment. For a given function, its Taylor series is the “best” polynomial representations of that function. If the function is being evaluated at 0, the Taylor series representation is also called the Maclaurin series. The error is proportional to the first “left-off” term. Also, the series is only a good estimate in a small radius around the point for which it is calculated (e.g. 0 for a Maclaurin series).

Padé approximants estimate functions as the quotient of two polynomials. Specifically, given a Taylor series expansion of a function $T(x)$ of order $m + n$ , there are two polynomials, $P(x)$ of order $m$ and $Q(x)$ of order $n$ , such that $\frac{P(x)}{Q(x)}$ , called the Padé approximant of order $[m/n]$ , “agrees” with the original function in order $m + n$ . You may ask, “but the Taylor series from whence it is derived is also of order $m + n$ ?” And you would be correct. However, the Padé approximant seems to consistently have a wider radius of convergence than its parent Taylor series, and, being a quotient, is composed of lower-degree polynomials. With the normalization that the first term of $Q(x)$ is always 1, there is a set of linear equations which will generate the unique Padé approximant coefficients. Letting $a_n$ be the coefficients for the Taylor series, one can solve:

(1) $\begin{align*} &a_0 &= p_0\\ &a_1 + a_0q_1 &= p_1\\ &a_2 + a_1q_1 + a_0q_2 &= p_2\\ &a_3 + a_2q_1 + a_1q_2 + a_0q_3 &= p_3\\ &a_4 + a_3q_1 + a_2q_2 + a_1q_3 + a_0q_4 &= p_4\\ &\vdots&\vdots\\ &a_{m+n} + a_{m+n-1}q_1 + \ldots + a_0q_{m+n} &= p_{m+n} \end{align*}$

remembering that all $p_k, k > m$ and $q_k, k > n$ are 0.

There is a lot of research on the theory of Padé approximants and Padé tables, how they work, their relationship to continued fractions, and why they work so well. For example, the interested reader is directed to Baker (1975), Van Assche (2006), Wikipedia, and MathWorld for more.

Practical Example

The function $\log(1 + x)$ will be used as the example. This function has a special representation in almost all computer languages, often called log1p(x), as naïve implementation as log(1 + x) will suffer catastrophic floating point errors for $x$ near 0.

The Maclaurin series expansion for $\log(1 + x)$ is:

$\sum_{k=1}^\infty -1^{k+1}\frac {x^k}{k} = x - \frac{1}{2}x^2 + \frac{1}{3}x^3 - \frac{1}{4}x^4 + \frac{1}{5}x^5 - \frac{1}{6}x^6\ldots$

The code below will compare a Padé[3,3] approximant with the 6-term Maclaurin series, which would actually be the Padé[6/0] approximant. First to calculate the coefficients. We know the Maclaurin coefficients, they are $0, 1, -\frac{1}{2}, \frac{1}{3}, -\frac{1}{4}, \frac{1}{5}, -\frac{1}{6}$ . Therefore, the system of linear equations looks like this:

(2) $\begin{align*} &0 &= p_0\\ &1 + 0q_1 &= p_1\\ &-\frac{1}{2} + q_1 + 0q_2 &= p_2\\ &\frac{1}{3} - \frac{1}{2}q_1 + q_2 + a_0q_3 &= p_3\\ &-\frac{1}{4} + \frac{1}{3}q_1 - \frac{1}{2} q_2 + q_3 &= 0\\ &\frac{1}{5} - \frac{1}{4}q_1 + \frac{1}{3} q_2 - \frac{1}{2} q_3 &= 0\\ &-\frac{1}{6} + \frac{1}{5}q_1 - \frac{1}{4} q_2 + \frac{1}{3} q_3 &= 0 \end{align*}$

Rewriting in terms of the known $a_n$ coefficients, we get:

(3) $\begin{align*} &-p_0 =& 0\\ &0q_1 - p_1 =& -1\\ &q_1 + 0q_2 -p_2 =& \frac{1}{2}\\ &-\frac{1}{2}q_1 + q_2 + a_0q_3 - p_3 =& -\frac{1}{3}\\ &\frac{1}{3}q_1 - \frac{1}{2} q_2 + q_3 =& \frac{1}{4}\\ &-\frac{1}{4}q_1 + \frac{1}{3} q_2 - \frac{1}{2} q_3 =& -\frac{1}{5}\\ &\frac{1}{5}q_1 - \frac{1}{4} q_2 + \frac{1}{3} q_3 =& \frac{1}{6} \end{align*}$

We can solve this in R using solve:

A &lt;- matrix(c(0, 0, 1, -.5 ,1 / 3 , -.25, .2, 0, 0, 0, 1, -.5, 1 / 3, -.25, 0, 0, 0, 0, 1, -.5, 1 / 3, -1, 0, 0, 0, 0, 0, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, -1, 0, 0, 0), ncol = 7)
B &lt;- c(0, -1, .5, -1/3, .25, -.2, 1/6)
P_Coeff &lt;- solve(A, B)
print(P_Coeff)

A <- matrix(c(0, 0, 1, -.5 ,1 / 3 , -.25, .2, 0, 0, 0, 1, -.5, 1 / 3, -.25, 0, 0, 0, 0, 1, -.5, 1 / 3, -1, 0, 0, 0, 0, 0, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, -1, 0, 0, 0), ncol = 7)

B <- c(0, -1, .5, -1/3, .25, -.2, 1/6)

P_Coeff <- solve(A, B)

print(P_Coeff)

## [1] 1.5000000 0.6000000 0.0500000 0.0000000 1.0000000 1.0000000 0.1833333

1	## [1] 1.5000000 0.6000000 0.0500000 0.0000000 1.0000000 1.0000000 0.1833333

Now we can create the estimating functions:

ML &lt;- function(x){x - .5 * x ^ 2 + x ^ 3 / 3 - .25 * x ^ 4 + .2 * x ^ 5 - x ^ 6 / 6}

PD33 &lt;- function(x){
  NUMER &lt;- x + x ^ 2 + 0.1833333333333333 * x ^ 3
  DENOM &lt;- 1 + 1.5 * x + .6 * x ^ 2 + 0.05 * x ^ 3
  return(NUMER / DENOM)
}

ML <- function(x){x - .5 * x ^ 2 + x ^ 3 / 3 - .25 * x ^ 4 + .2 * x ^ 5 - x ^ 6 / 6}

PD33 <- function(x){

NUMER <- x + x ^ 2 + 0.1833333333333333 * x ^ 3

DENOM <- 1 + 1.5 * x + .6 * x ^ 2 + 0.05 * x ^ 3

return(NUMER / DENOM)

}

Let’s compare the behavior of these functions around 0 with the naïve and sophisticated implementations of $\log(1+x)$ in R.

library(dplyr)
library(ggplot2)
library(tidyr)
D &lt;- seq(-1e-2, 1e-2, 1e-6)
RelErr &lt;- tbl_df(data.frame(X = D, Naive = (log(1 + D) - log1p(D)) / log1p(D), MacL = (ML(D) - log1p(D)) / log1p(D), Pade = (PD33(D) - log1p(D)) / log1p(D)))
RelErr2 &lt;- gather(RelErr, Type, Error, -X)
RelErr2 %&gt;% group_by(Type) %&gt;% summarize(MeanError = mean(Error, na.rm = TRUE)) %&gt;% knitr::kable(digits = 18)

library(dplyr)

library(ggplot2)

library(tidyr)

D <- seq(-1e-2, 1e-2, 1e-6)

RelErr <- tbl_df(data.frame(X = D, Naive = (log(1 + D) - log1p(D)) / log1p(D), MacL = (ML(D) - log1p(D)) / log1p(D), Pade = (PD33(D) - log1p(D)) / log1p(D)))

RelErr2 <- gather(RelErr, Type, Error, -X)

RelErr2 %>% group_by(Type) %>% summarize(MeanError = mean(Error, na.rm = TRUE)) %>% knitr::kable(digits = 18)

Type	MeanError
Naïve	-4.3280e-15
MacL	-2.0417e-14
Pade	-5.2000e-17

Graphing the relative error in a small area around 0 shows the differing behaviors. First, against the naïve implementation, both estimates do much better.

ggplot(RelErr2, aes(x = X)) + geom_point(aes(y = Error, colour = Type), alpha = 0.5)

1	ggplot(RelErr2, aes(x = X)) + geom_point(aes(y = Error, colour = Type), alpha = 0.5)

But when compared one against the other, the Pade approximant (blue) shows better behavior than the Maclaurin (red) and it’s relative error stays below EPS for a wider swath.

ggplot(RelErr, aes(x = X)) + geom_point(aes(y = MacL), colour = 'red', alpha = 0.5) + geom_point(aes(y = Pade), colour = 'blue', alpha = 0.5)

1	ggplot(RelErr, aes(x = X)) + geom_point(aes(y = MacL), colour = 'red', alpha = 0.5) + geom_point(aes(y = Pade), colour = 'blue', alpha = 0.5)

Just for fun, restricting the y axis to that above, overlaying the naïve formulation (green) looks like this:

ggplot(RelErr, aes(x = X)) + geom_point(aes(y = Naive), colour = 'green', alpha = 0.5) + scale_y_continuous(limits = c(-1.5e-13, 0)) + geom_point(aes(y = MacL), colour = 'red', alpha = 0.5) + geom_point(aes(y = Pade), colour = 'blue', alpha = 0.5)

1	ggplot(RelErr, aes(x = X)) + geom_point(aes(y = Naive), colour = 'green', alpha = 0.5) + scale_y_continuous(limits = c(-1.5e-13, 0)) + geom_point(aes(y = MacL), colour = 'red', alpha = 0.5) + geom_point(aes(y = Pade), colour = 'blue', alpha = 0.5)

There are certainly more efficient and elegant ways to calculate Padé approximants, but I found this exercise helpful, and I hope you do as well!

References

Baker, G. A. Essentials of Padé Approximants Academic Press, 1975
Van Assche, W. Pade and Hermite-Pade approximation and orthogonality ArXiv Mathematics e-prints, 2006

One Response

Distilled News | Data Analytics & R June 5, 2015 at 12:27 PM | Permalink

[…] A Practical Example of Calculating Padé Approximant Coefficients Using R I recently had the opportunity to use Padé approximants. There is a lot of good information available on line on the theory and applications of using Padé approximants, but I had trouble finding a good example explaining just how to calculate the co-efficients. […]

Comments are closed.