Strange Attractors
Wandering around chaotically until beauty appears
http://www.avrahamadler.com

Delaporte package: The SPARCmonster is sated
http://www.avrahamadler.com/2017/03/31/delaporte-package-the-sparcmonster-is-sated/
Fri, 31 Mar 2017

[Figure: Delaporte PMF]

Finally, finally, after months of pulling out my hair, the Delaporte project on CRAN passes all of its checks on Solaris SPARC. The last time it did that, it was still using serial C++. Now it uses OpenMP-based parallel Fortran 2003. What a relief! One of these days I should write up what I did and why, but for now, I’ll be glad to put the project down for a month or seventeen!
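For anyone who wants to kick the tires, a quick example; the parameter values are arbitrary, and the package follows R’s usual d/p/q/r naming convention:

library(Delaporte)
ddelap(0:4, alpha = 2, beta = 1, lambda = 3)        # PMF at 0 through 4
mean(rdelap(1e5, alpha = 2, beta = 1, lambda = 3))  # near alpha * beta + lambda = 5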

Enough trying to use LTO
http://www.avrahamadler.com/2017/01/31/enough-trying-to-use-lto/
Tue, 31 Jan 2017

After months and months of trying to get R on 64-bit Windows to work with link-time optimization (LTO), I’ve come to the conclusion that GCC 4.9.3’s implementation of LTO just isn’t polished enough to work across the entire R infrastructure. I have been able to get base R, nloptr, Rblas, and Rlapack to compile with LTO, but they are not really any faster than without it, so it isn’t worth the headache. I’ve removed the steps relating to LTO from my instructions on how to build R+OpenBLAS.

Updated OpenBLAS instructions for R-3.3+ and Rtools34
http://www.avrahamadler.com/2016/07/19/updated-openblas-instructions-for-r-3-3-and-rtools34/
Tue, 19 Jul 2016

I’ve just updated the instructions for building a 64-bit OpenBLAS-based Rblas.dll for Windows to reflect changes to R 3.3+ and Rtools34. Enjoy!

Casualty Actuarial Society Elections 2016: Why I am a candidate
http://www.avrahamadler.com/2016/06/29/cas-election-2016/
Wed, 29 Jun 2016

I am very fortunate to be a Fellow of the Casualty Actuarial Society: the terminal credential in my field of actuarial science. The society has been furthering the science, art, and standards of casualty actuarial work for over 100 years, and has consistently maintained, nay, enhanced the value of actuaries and the services they provide.

To paraphrase, and possibly expand on, my candidate statements, there are two primary reasons I would like to serve the CAS as a board member. We live in a very exciting era, especially as it relates to technological innovation. However, it is also a critical time for the field. Computing power and data collection have grown in ways that would have been unthinkable outside of science fiction even twenty-five years ago. This makes it a wonderful time to be an actuary. I’ve often claimed that actuaries were “doing data science” before that term was invented. While some of the methods and tools may have changed in the recent past, the concept of understanding all elements of risk—quantitative and qualitative—remains one of our core strengths. We have been dealing with most of the problems of data collection, scrubbing, imputation, and analysis for over a century. We are not just “data scientists for insurance”—we are experts in applying the critical combination of statistical techniques, business-specific domain knowledge, and awareness of the limitations of any model or methodology.

This leads to one of the issues about which I feel strongly. The proliferation of data-analysis-type jobs is both a boon and a burden. We can, and do, take advantage of advances in statistical theory, practice, algorithms, and computing power, to name just a few. Yet we may also be in danger of being pigeonholed, if not rendered obsolete, if we do not evolve. Data capture, amalgamation, cleaning, and analysis have undergone a veritable sea change in the last decade, and actuaries need to be current with, or at least conversant in, these techniques. As a member of the board, I would help the society “ride the wave” of the digital revolution and not only maintain but enhance the value that actuaries bring to our principals. I believe that in order to stay relevant and valuable in the 21st century, especially in light of other North American actuarial organizations entering the general insurance arena, we need to expand the actuarial toolkit beyond its traditional boundaries. I was a member of the Education Structure Task Force, which reviewed the then-current syllabus with an eye towards the baseline of what the actuary of the next 25 years should know. I continued as a member of both the Statistics and CERA implementation task forces. The current Exam S is one result of these task forces, and serves to better prepare actuaries of the 21st century for the problems they will face. I would like to see similar improvements to the education of actuaries in more advanced statistical concepts, especially the introduction of Bayesian hierarchical models and basic unsupervised learning techniques. Not all education needs to be officially enshrined in the syllabus, though. I do not want another exam for the sake of an exam; I do not think we need another exam. Rather, I want the CAS to encourage actuaries to write not only theoretical papers, but papers and monographs which present practical solutions and methods in a clear fashion. These can be used by all actuaries to expand their capabilities and thus enhance their value.

A second, related issue I feel strongly about: increasing the visibility of the actuarial role in non-traditional risk management venues is in the best interest of the CAS and its members. New corporate and risk paradigms have emerged in the past decade, and existing non-insurance entities have seen value in actuarial services. New corporate concepts, such as peer-to-peer companies, need risk management services, and some already use actuaries. Traditional companies are now exposed both to new kinds of risk, such as cyber, and to increased exposure to risks previously considered de minimis, such as domestic terrorism against property and infrastructure. Even non-insurance, non-classical risk-bearing entities have made use of actuarial services, as the NFL did as part of its response to the brain-trauma lawsuit. I believe that increasing awareness of the specific value actuaries bring to risk management roles in non-traditional venues will both solidify and expand opportunities for current and future actuaries.

Overall, I consider myself extremely fortunate to have this career at the crossroads of statistics and problem solving at this time. The CAS has been nurturing and advancing the science, art, education, and standards of casualty actuarial practice for over a century, providing the framework in which we actuaries perform with integrity and competence. For that alone we all owe the CAS a debt of gratitude. As a predominantly volunteer-driven and volunteer-run organization, we can demonstrate that gratitude by giving our time and expertise to the CAS and all of its members. I was also privileged to serve on the nominating committee for the past two years, and saw first-hand the care and concern the CAS leadership has for the growth and health of the organization. The strategic thinking and attention to the short-, medium-, and long-term needs of the society and its members inspired me. Serving on the board would allow me both to give back to the CAS and to “pay forward” the debt of gratitude I owe the society and all its members.

I would certainly appreciate your support come August 1, but more importantly, I would appreciate it if you took the time to review the candidates—they are all interested in bettering the CAS and its members—and made an informed decision about whom to vote for. This is your society, and as recent political events have shown, elections matter! Please use your vote, and use it as best you see fit. Thank you.

Updated R & BLAS Timings
http://www.avrahamadler.com/2016/03/31/updated-r-blas-timings/
Thu, 31 Mar 2016

With the recent releases of R 3.2.4 and OpenBLAS 2.17, I decided it was time to re-benchmark R’s speed. I’ve settled on a particular set of tests, based on my experience as well as some of Simon Urbanek’s work, which I separated into two groups: those focusing on BLAS-heavy operations and those that do not. I’ve posted the code I use to its own page, but I’ll copy it below for convenience:

set.seed(4987)
library(microbenchmark)
library(Matrix)    # for lu() and Hilbert()
A <- matrix(rnorm(1e6, 1e3, 1e2), ncol = 1e3)
B <- matrix(rnorm(1e6, 1e3, 1e2), ncol = 1e3)
A <- crossprod(A, A)       # make A symmetric positive definite
A <- A * 1000 / mean(A)    # rescale so the entries stay of manageable magnitude
colnames(A) <- colnames(B) <- NULL
options(scipen = 4, digits = 3)

BLAS <- microbenchmark(
    sort(c(as.vector(A), as.vector(B))),
    det(A),
    A %*% B,
    t(A) %*% B,
    crossprod(A, B),
    solve(A),
    solve(A, t(B)),
    solve(B),
    chol(A),
    chol(B, pivot = TRUE),
    qr(A, LAPACK = TRUE),
    svd(A),
    eigen(A, symmetric = TRUE),
    eigen(A, symmetric = FALSE),
    eigen(B, symmetric = FALSE),
    lu(A),
    fft(A),
    Hilbert(3000),
    toeplitz(A[1:500, 1]),
    princomp(A),
    times = 25L, unit = 'ms', control = list(order = 'block')
)

NotBLAS <- microbenchmark(
    A + 2,
    A - 2,
    A * 2,
    A / 2,
    A * 0.5,
    A ^ 2,
    sqrt(A[1:10000]),
    sin(A[1:10000]),
    A + B,
    A - B,
    A * B,
    A / B,
    A[1:1e5] %% B[1:1e5],
    A[1:1e5] %/% B[1:1e5],
    times = 5000L, unit = 'ms', control = list(order = 'block')
)

My machine is an i7-2600K overclocked to 4.65GHz with 16GB of RAM, running 64-bit Windows 7. I used RStudio version 0.99.893 for each of these tests.

I ran the tests on four versions of R:

  • Plain vanilla R version 3.2.4 Revised (2016-03-16 r70336), installed from the binary on CRAN
  • The same version of R as #1, but with Rblas.dll replaced by one based on OpenBLAS 2.17 compiled specifically for the SandyBridge CPU (as part of #3)
  • R compiled from source using the gcc 4.9.3 toolchain, passing -march=native to EOPTS (see the sketch after this list) and linking to a multi-threaded (4 max) SandyBridge-specific OpenBLAS 2.17
  • Same as #3, but activating link-time optimization for core R and the BLAS
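For reference, the EOPTS setting in #3 lives in the Windows build configuration. A minimal sketch of the relevant MkRules.local line follows; this is illustrative only, and the full set of options is described in the R Installation and Administration manual:

# MkRules.local (sketch; only the optimization-related line shown)
EOPTS = -march=native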

This time I was a bit wiser and saved the results to RDS objects so that I could combine them in one R session and compare them. I’ll post the raw output at the bottom of the page, but I think it’s more intuitive to look at graphical comparisons. The takeaway: install a fast BLAS if you can! The other optimization options had minor effects, sometimes better, sometimes worse, but using a fast BLAS made a significant difference. Adding a Platform variable to the four sets of BLAS and NotBLAS outputs and then combining them in one session allowed for the generation of comparison plots (a sketch of that combination step follows the plotting calls below). The calls for these graphs, for those following along at home, are:

library(ggplot2)
library(scales)    # provides the comma label formatter

ggplot(BLAS) + geom_boxplot(aes(y = time / 1e6, color = Platform, x = expr)) + scale_y_continuous(labels = comma) + ylab('milliseconds') + coord_flip()
ggplot(BLAS) + geom_jitter(aes(y = time / 1e6, color = Platform, x = expr), alpha = 0.3) + scale_y_continuous(labels = comma) + ylab('milliseconds') + coord_flip()
ggplot(NotBLAS) + geom_boxplot(aes(y = time / 1e6, color = Platform, x = expr)) + scale_y_continuous(labels = comma) + ylab('milliseconds') + coord_flip()
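For completeness, the combination step went something like this; a sketch in which the file names are placeholders for wherever each run’s results were saved:

# Hypothetical file names: one saved microbenchmark result per build
files <- c(Vanilla = "BLAS_vanilla.rds", OpenBLAS = "BLAS_openblas.rds",
           Native = "BLAS_native.rds", LTO = "BLAS_lto.rds")
BLAS <- do.call(rbind, lapply(names(files), function(p) {
  res <- as.data.frame(readRDS(files[[p]]))   # microbenchmark results are data frames
  res$Platform <- p                           # tag each timing with its build
  res
}))
# The same pattern produces the combined NotBLAS object.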

You may also have to right-click and select “View Image” to get a better picture. First is a boxplot comparison for the BLAS-related functions:
[Figure: BLAS boxplot comparison by platform]
As there are only 25 samples for each function, a jitter plot may be a bit simpler:
[Figure: BLAS jitter plot by platform]
It is clear that just swapping in the faster BLAS makes a noticeable difference for matrix-related functionality. The other optimizations tend to provide some benefit, but nothing significant.

The non-BLAS related functions, understandably, don’t respond the same way to the BLAS. As I ran 5000 samples per expression, a jitter plot would not be that clear, even with a low alpha.
[Figure: NotBLAS boxplot comparison by platform]

What is seen here, in my opinion, is that compiling for the local architecture sometimes improves performance, such as for division, and at other times retards it slightly, such as for sin and sqrt. I think the latter two depend on the optimization level as well; default R calls for -O3 for C code. The magnitude of the improvements seems to be greater than the magnitude of the slowdowns, at least when eyeballing the plot. If I recall from previous tests, strangely, it may be slightly more efficient to pass -mtune=native, at least for my test suite. I’m not sure why, as -march implies -mtune. I’d have to recompile R yet again to test it, though. Also, with 5000 iterations per expression, garbage collection has to run on occasion, which accounts for the outlier points.

In summary, if you deal with heavy computation in R, specifically matrix-related computation, you will notice a significant speed improvement by replacing the vanilla Rblas.dll with a faster one. Compiling or tuning for a specific architecture may yield some small gains, but nothing of the same magnitude.

While it’s actually easier now to compile OpenBLAS for R, and in R, on Windows, my instructions are a bit dated, so I’ll have to update them eventually. I have considered hosting some pre-compiled Rblas files, as Dr. Ei-ji Nakama does, but I’ve held back for two reasons: 1) I have to ensure I apply the proper licenses, and 2) I’m a bit concerned about someone suing, or the like, if for whatever reason a BLAS didn’t work properly. I have disclaimers and all, but you never know 8-).

In any event, I’m always interested in hearing comments about your experience with compiling R or a fast BLAS for Windows, and especially if you found any of my previous posts helpful. Thanks!

Raw Benchmark Output

R version 3.2.4 Revised (2016-03-16 r70336) (pure vanilla)
Unit: milliseconds

BLAS
                                  expr     Min      LQ  Median      UQ     Max    Mean    SD      CV     n
                                (fctr)   (dbl)   (dbl)   (dbl)   (dbl)   (dbl)   (dbl) (dbl)   (dbl) (int)
1  sort(c(as.vector(A), as.vector(B)))  264.56  266.56  267.13  270.50  327.81  270.39 12.15 0.04492    25
2                               det(A)  134.83  135.77  136.47  137.74  206.93  139.83 14.12 0.10095    25
3                              A %*% B  465.64  473.39  492.22  504.79  546.33  493.55 22.62 0.04583    25
4                           t(A) %*% B  468.96  501.26  506.00  535.00  564.67  514.08 24.71 0.04807    25
5                      crossprod(A, B)  737.52  746.47  759.38  768.71  813.29  762.19 18.66 0.02449    25
6                             solve(A)  607.80  615.18  621.48  639.45  679.81  628.14 18.80 0.02992    25
7                       solve(A, t(B))  847.49  853.39  855.29  859.28  881.07  858.57  8.89 0.01036    25
8                             solve(B)  617.06  621.25  622.45  624.65  683.30  625.33 12.56 0.02008    25
9                              chol(A)  116.61  117.19  117.40  118.45  123.19  118.46  1.99 0.01676    25
10               chol(B, pivot = TRUE)    2.38    2.45    2.52    2.59    6.91    3.15  1.51 0.48091    25
11                qr(A, LAPACK = TRUE)  423.32  424.46  425.25  426.35  432.92  425.62  2.07 0.00487    25
12                              svd(A) 2219.86 2242.51 2265.10 2294.88 2372.62 2277.14 45.47 0.01997    25
13          eigen(A, symmetric = TRUE)  939.63  946.36  952.16  963.85 1020.42  959.00 19.78 0.02063    25
14         eigen(A, symmetric = FALSE) 3623.95 3650.20 3657.46 3675.48 3740.09 3662.55 28.50 0.00778    25
15         eigen(B, symmetric = FALSE) 4072.00 4127.98 4175.12 4228.25 4344.13 4183.00 77.04 0.01842    25
16                               lu(A)  137.73  142.06  144.51  147.26  210.54  156.60 26.47 0.16901    25
17                              fft(A)  112.27  116.98  119.89  123.29  135.01  120.63  5.20 0.04308    25
18                       Hilbert(3000)  134.83  196.37  197.50  199.45  386.30  201.86 46.23 0.22905    25
19               toeplitz(A[1:500, 1])    4.92    5.06    5.09    5.45   10.49    5.80  1.72 0.29759    25
20                         princomp(A) 1834.20 1855.31 1903.78 1947.02 2127.45 1923.42 81.44 0.04234    25

NotBLAS
                        expr   Min    LQ Median    UQ   Max  Mean     SD    CV     n
                      (fctr) (dbl) (dbl)  (dbl) (dbl) (dbl) (dbl)  (dbl) (dbl) (int)
1                      A + 2 1.719 1.891  1.922 2.012 74.96 2.542 2.5880 1.018  5000
2                      A - 2 1.723 1.910  1.965 2.066 72.86 2.628 2.6389 1.004  5000
3                      A * 2 1.725 1.900  1.948 2.049 80.89 2.604 2.6526 1.019  5000
4                        A/2 3.014 3.162  3.179 3.205 74.44 3.776 2.5372 0.672  5000
5                    A * 0.5 1.718 1.884  1.908 1.971 74.80 2.538 2.5780 1.016  5000
6                        A^2 1.718 1.896  1.939 2.036 73.41 2.587 2.6111 1.009  5000
7           sqrt(A[1:10000]) 0.110 0.111  0.111 0.112  5.30 0.116 0.0833 0.717  5000
8            sin(A[1:10000]) 0.364 0.365  0.365 0.366  1.35 0.368 0.0429 0.117  5000
9                      A + B 1.306 1.384  1.448 1.542 68.83 1.582 1.8367 1.161  5000
10                     A - B 1.308 1.355  1.432 1.539 67.87 1.571 1.8412 1.172  5000
11                     A * B 1.310 1.410  1.471 1.571 69.73 1.610 1.8548 1.152  5000
12                       A/B 4.767 4.779  4.789 4.853 66.93 4.946 1.7605 0.356  5000
13  A[1:100000]%%B[1:100000] 2.528 2.544  2.558 2.583 65.80 2.639 1.2633 0.479  5000
14 A[1:100000]%/%B[1:100000] 2.271 2.296  2.318 2.370 65.33 2.423 1.2646 0.522  5000
----------
R version 3.2.4 Revised (2016-03-16 r70336) (pure vanilla)
With SandyBridge-specific OpenBLAS (2.17)
Unit: milliseconds

BLAS
                                  expr     Min      LQ  Median      UQ    Max    Mean    SD     CV     n
                                (fctr)   (dbl)   (dbl)   (dbl)   (dbl)  (dbl)   (dbl) (dbl)  (dbl) (int)
1  sort(c(as.vector(A), as.vector(B)))  264.10  265.06  265.77  266.93  323.4  268.39 11.52 0.0429    25
2                               det(A)   14.31   14.57   15.26   16.67   20.0   15.82  1.43 0.0903    25
3                              A %*% B   19.25   21.34   26.83   28.49   84.1   27.78 12.68 0.4563    25
4                           t(A) %*% B   26.46   30.04   32.20   38.76   43.6   34.25  5.37 0.1567    25
5                      crossprod(A, B)   17.63   21.14   22.82   27.48   35.5   24.68  5.10 0.2066    25
6                             solve(A)   40.67   45.92   49.58   51.48  107.1   51.29 12.37 0.2412    25
7                       solve(A, t(B))   46.32   50.88   51.63   56.40   63.2   53.00  4.24 0.0799    25
8                             solve(B)   47.09   51.22   51.72   56.44  118.6   56.32 13.71 0.2435    25
9                              chol(A)    6.59    6.65    6.77    7.01   10.6    7.38  1.39 0.1890    25
10               chol(B, pivot = TRUE)    2.36    2.47    2.50    2.58    6.8    3.25  1.58 0.4851    25
11                qr(A, LAPACK = TRUE)   70.19   71.59   72.47   74.69   79.1   72.96  2.22 0.0304    25
12                              svd(A)  375.74  378.56  389.32  415.33  444.7  399.66 26.19 0.0655    25
13          eigen(A, symmetric = TRUE)  171.72  175.04  176.94  177.97  239.8  179.80 13.20 0.0734    25
14         eigen(A, symmetric = FALSE)  849.09  859.77  872.76  891.57  915.7  876.86 20.89 0.0238    25
15         eigen(B, symmetric = FALSE) 1011.80 1065.85 1070.08 1075.52 1085.1 1065.76 18.14 0.0170    25
16                               lu(A)   17.91   18.91   20.78   22.62   82.8   32.68 25.43 0.7782    25
17                              fft(A)  110.52  111.17  113.16  113.34  113.8  112.33  1.18 0.0105    25
18                       Hilbert(3000)  128.99  194.32  194.70  195.01  377.4  196.80 45.44 0.2309    25
19               toeplitz(A[1:500, 1])    4.77    4.96    5.03    5.04   10.0    5.58  1.64 0.2937    25
20                         princomp(A)  285.96  302.37  316.33  329.08  400.5  322.00 29.13 0.0905    25

NotBLAS
                        expr   Min    LQ Median    UQ   Max  Mean     SD    CV     n
                      (fctr) (dbl) (dbl)  (dbl) (dbl) (dbl) (dbl)  (dbl) (dbl) (int)
1                      A + 2 1.712 1.887  1.904 1.938 72.64 2.514 2.6240 1.044  5000
2                      A - 2 1.723 1.893  1.913 1.962 68.52 2.540 2.5268 0.995  5000
3                      A * 2 1.721 1.894  1.918 1.974 68.97 2.543 2.5326 0.996  5000
4                        A/2 3.016 3.173  3.192 3.229 73.36 3.804 2.5315 0.665  5000
5                    A * 0.5 1.724 1.900  1.939 2.049 73.70 2.590 2.6048 1.006  5000
6                        A^2 1.722 1.890  1.910 1.974 71.55 2.543 2.5479 1.002  5000
7           sqrt(A[1:10000]) 0.110 0.111  0.111 0.112  5.59 0.116 0.0876 0.753  5000
8            sin(A[1:10000]) 0.364 0.365  0.366 0.366  1.32 0.370 0.0408 0.110  5000
9                      A + B 1.310 1.336  1.378 1.447 62.35 1.512 1.7318 1.146  5000
10                     A - B 1.310 1.345  1.411 1.491 63.70 1.542 1.7465 1.133  5000
11                     A * B 1.312 1.349  1.421 1.501 65.96 1.550 1.7753 1.145  5000
12                       A/B 4.766 4.781  4.804 4.959 73.53 4.992 1.8146 0.364  5000
13  A[1:100000]%%B[1:100000] 2.526 2.541  2.550 2.591 63.56 2.638 1.2305 0.467  5000
14 A[1:100000]%/%B[1:100000] 2.272 2.294  2.310 2.361 66.49 2.399 1.2681 0.529  5000
----------
R version 3.2.4 Patched (2016-03-28 r70390) -- "Very Secure Dishes"
Compiled with -march=native
With SandyBridge-specific OpenBLAS (2.17)

BLAS
                                  expr    Min      LQ  Median      UQ     Max    Mean    SD     CV     n
                                (fctr)  (dbl)   (dbl)   (dbl)   (dbl)   (dbl)   (dbl) (dbl)  (dbl) (int)
1  sort(c(as.vector(A), as.vector(B))) 266.79  267.24  268.46  271.46  329.37  272.04 12.36 0.0454    25
2                               det(A)  14.24   14.43   15.12   16.94   18.63   15.69  1.43 0.0911    25
3                              A %*% B  20.95   24.25   28.79   31.89   34.96   28.26  4.21 0.1490    25
4                           t(A) %*% B  26.70   29.53   33.74   41.83  103.68   39.71 18.48 0.4652    25
5                      crossprod(A, B)  17.65   19.46   28.94   31.99   36.97   26.53  6.56 0.2474    25
6                             solve(A)  41.47   45.97   49.44   52.75   56.85   49.38  4.19 0.0849    25
7                       solve(A, t(B))  46.34   51.00   53.22   55.86  113.34   55.53 12.45 0.2242    25
8                             solve(B)  47.90   52.74   54.88   57.05  126.30   58.01 15.05 0.2595    25
9                              chol(A)   6.57    6.72    6.93    7.43   12.90    7.74  1.84 0.2377    25
10               chol(B, pivot = TRUE)   2.37    2.47    2.49    2.53    6.26    3.22  1.53 0.4755    25
11                qr(A, LAPACK = TRUE)  70.84   71.54   73.10   74.91   78.28   73.59  2.19 0.0298    25
12                              svd(A) 375.93  384.60  399.36  454.94  563.73  421.43 49.29 0.1170    25
13          eigen(A, symmetric = TRUE) 170.29  174.23  177.88  185.33  244.08  182.23 14.50 0.0796    25
14         eigen(A, symmetric = FALSE) 839.89  851.24  861.85  874.42  961.61  870.93 31.10 0.0357    25
15         eigen(B, symmetric = FALSE) 985.55 1045.25 1097.99 1142.28 1200.26 1093.09 56.17 0.0514    25
16                               lu(A)  18.25   20.03   21.52   23.14   91.58   34.15 27.10 0.7936    25
17                              fft(A) 106.88  110.80  112.50  119.96  132.55  115.51  6.66 0.0576    25
18                       Hilbert(3000) 133.10  197.69  201.83  204.57  411.96  205.26 50.69 0.2470    25
19               toeplitz(A[1:500, 1])   5.04    5.20    5.35    5.49   10.66    5.94  1.65 0.2773    25
20                         princomp(A) 326.08  336.14  341.65  349.99  425.50  354.56 31.37 0.0885    25

NotBLAS
                        expr   Min    LQ Median    UQ   Max  Mean     SD     CV     n
                      (fctr) (dbl) (dbl)  (dbl) (dbl) (dbl) (dbl)  (dbl)  (dbl) (int)
1                      A + 2 1.731 1.890  1.911 1.956 73.12 2.538 2.6108 1.0286  5000
2                      A - 2 1.726 1.884  1.903 1.943 70.36 2.527 2.5691 1.0168  5000
3                      A * 2 1.727 1.891  1.910 1.945 70.49 2.531 2.5708 1.0156  5000
4                        A/2 2.040 2.193  2.210 2.236 71.35 2.816 2.5552 0.9075  5000
5                    A * 0.5 1.726 1.902  1.939 2.016 74.95 2.591 2.6501 1.0227  5000
6                        A^2 1.727 1.886  1.907 1.966 71.32 2.541 2.5889 1.0188  5000
7           sqrt(A[1:10000]) 0.509 0.516  0.516 0.517  4.36 0.521 0.0670 0.1285  5000
8            sin(A[1:10000]) 0.715 0.720  0.721 0.722  1.72 0.725 0.0416 0.0573  5000
9                      A + B 1.331 1.356  1.378 1.459 65.85 1.522 1.8010 1.1831  5000
10                     A - B 1.331 1.357  1.383 1.461 64.68 1.527 1.7970 1.1765  5000
11                     A * B 1.332 1.354  1.372 1.431 65.46 1.513 1.8009 1.1905  5000
12                       A/B 2.412 2.423  2.432 2.460 68.39 2.580 1.8492 0.7168  5000
13  A[1:100000]%%B[1:100000] 2.917 2.999  3.005 3.027 67.03 3.084 1.2797 0.4150  5000
14 A[1:100000]%/%B[1:100000] 2.683 2.759  2.769 2.816 67.74 2.861 1.2999 0.4543  5000
----------
R version 3.2.4 Patched (2016-03-28 r70390) -- "Very Secure Dishes"
Compiled with -march=native and partial LTO (R core, blas, and lapack only)
With SandyBridge-specific OpenBLAS (2.17)

BLAS
                                  expr    Min     LQ  Median      UQ     Max    Mean    SD     CV     n
                                (fctr)  (dbl)  (dbl)   (dbl)   (dbl)   (dbl)   (dbl) (dbl)  (dbl) (int)
1  sort(c(as.vector(A), as.vector(B))) 263.84 265.83  266.39  268.22  323.52  269.49 11.48 0.0426    25
2                               det(A)  16.55  17.15   17.72   19.73   23.69   18.55  1.82 0.0982    25
3                              A %*% B  19.55  21.52   25.02   33.33   96.59   29.20 15.17 0.5195    25
4                           t(A) %*% B  27.71  33.26   35.59   38.03   45.20   35.47  4.33 0.1220    25
5                      crossprod(A, B)  17.81  21.49   25.98   28.99   34.36   25.94  4.49 0.1730    25
6                             solve(A)  48.38  51.94   54.40   57.04  112.31   56.72 12.03 0.2122    25
7                       solve(A, t(B))  54.93  56.65   60.59   63.24   72.91   61.05  5.25 0.0859    25
8                             solve(B)  55.50  58.85   60.88   62.99  121.31   63.46 12.50 0.1969    25
9                              chol(A)   8.57   8.69    8.80    8.98   12.21    9.40  1.28 0.1361    25
10               chol(B, pivot = TRUE)   2.35   2.44    2.51    2.62    6.87    3.18  1.60 0.5035    25
11                qr(A, LAPACK = TRUE)  71.16  72.38   73.57   74.85   83.43   74.41  2.93 0.0393    25
12                              svd(A) 376.02 380.71  392.98  437.93  490.97  407.28 34.08 0.0837    25
13          eigen(A, symmetric = TRUE) 173.96 176.57  179.35  181.38  259.73  183.29 16.79 0.0916    25
14         eigen(A, symmetric = FALSE) 852.68 856.26  860.10  864.31  926.66  865.16 16.63 0.0192    25
15         eigen(B, symmetric = FALSE) 987.24 993.14 1001.10 1020.62 1079.71 1010.79 24.32 0.0241    25
16                               lu(A)  19.18  20.01   21.76   24.21   26.51   22.11  2.44 0.1103    25
17                              fft(A) 106.65 107.09  108.58  110.35  116.92  108.98  2.36 0.0216    25
18                       Hilbert(3000) 125.60 191.73  192.18  194.44  314.63  195.15 33.81 0.1732    25
19               toeplitz(A[1:500, 1])   4.99   5.18    5.21    5.24   10.22    5.79  1.62 0.2801    25
20                         princomp(A) 285.97 296.12  300.58  317.75  370.74  315.50 30.97 0.0982    25

NotBLAS
                        expr   Min    LQ Median    UQ   Max  Mean     SD     CV     n
                      (fctr) (dbl) (dbl)  (dbl) (dbl) (dbl) (dbl)  (dbl)  (dbl) (int)
1                      A + 2 1.828 1.995  2.022 2.091 74.52 2.615 2.4756 0.9468  5000
2                      A - 2 1.807 1.973  1.995 2.064 70.60 2.636 2.5624 0.9721  5000
3                      A * 2 1.827 1.991  2.009 2.054 71.57 2.637 2.5618 0.9714  5000
4                        A/2 2.147 2.293  2.309 2.335 68.18 2.912 2.4859 0.8537  5000
5                    A * 0.5 1.817 1.990  2.007 2.049 68.35 2.628 2.5134 0.9564  5000
6                        A^2 1.795 1.964  1.980 2.011 67.96 2.596 2.5082 0.9660  5000
7           sqrt(A[1:10000]) 0.531 0.535  0.536 0.536  2.51 0.540 0.0474 0.0878  5000
8            sin(A[1:10000]) 0.719 0.726  0.726 0.727  1.70 0.728 0.0413 0.0566  5000
9                      A + B 1.326 1.355  1.391 1.545 66.25 1.690 1.8104 1.0713  5000
10                     A - B 1.321 1.365  1.437 1.602 68.45 1.726 1.8618 1.0788  5000
11                     A * B 1.327 1.366  1.439 1.586 63.71 1.719 1.8044 1.0499  5000
12                       A/B 2.435 2.463  2.530 2.560 65.87 2.804 1.8057 0.6439  5000
13  A[1:100000]%%B[1:100000] 2.987 3.002  3.007 3.025 63.89 3.092 1.2231 0.3956  5000
14 A[1:100000]%/%B[1:100000] 2.718 2.732  2.736 2.745 63.70 2.809 1.2202 0.4344  5000

New version of OpenBLAS released
http://www.avrahamadler.com/2015/10/28/new-version-of-openblas-released/
Wed, 28 Oct 2015

OpenBLAS 0.2.15 has been released. I’ll be building Rblas.dll for Sandy and Ivy Bridge on Windows shortly, and would certainly like to hear from anyone else using OpenBLAS and R on Windows.

Padé approximants: CRAN package
http://www.avrahamadler.com/2015/06/10/pade-approximants-cran-package/
Wed, 10 Jun 2015

While working on the previous post about Padé approximants, a search on CRAN showed that there was only one package which calculated the coefficients, given the appropriate Taylor series: the pracma package. The method it uses seems rather sophisticated, but it does allow for calculating coefficients “beyond” what the Taylor series alone would allow. Therefore, I put together a new R package, Pade, which uses the simpler system of linear equations to calculate the coefficients and does not permit calculating coefficients beyond the order of the supplied Taylor series. Any and all feedback is appreciated, as always!

LaTeX Renderer Change
http://www.avrahamadler.com/2015/06/05/454/
Fri, 05 Jun 2015

I just changed the LaTeX renderer plugin on the blog from MathJax-LaTeX to WP QuickLaTeX, as the former stopped working; not only does the latter work, it also allows the {align} environment. If you notice any malformed math or missing text, please drop me a note.

A Practical Example of Calculating Padé Approximant Coefficients Using R
http://www.avrahamadler.com/2015/06/04/a-practical-example-of-calculating-pade-approximant-coefficients-using-r/
Thu, 04 Jun 2015

Introduction

I recently had the opportunity to use Padé approximants. There is a lot of good information available online on the theory and applications of Padé approximants, but I had trouble finding a good example explaining just how to calculate the coefficients.

Basic Background

Hearken back to undergraduate calculus for a moment. For a given function, its Taylor series is the “best” polynomial representation of that function. If the function is being expanded about 0, the Taylor series representation is also called the Maclaurin series. The error is proportional to the first “left-off” term. Also, the series is only a good estimate within a small radius of the point about which it is calculated (e.g. 0 for a Maclaurin series).

Padé approximants estimate functions as the quotient of two polynomials. Specifically, given a Taylor series expansion of a function T(x) of order m + n, there are two polynomials, P(x) of order m and Q(x) of order n, such that \frac{P(x)}{Q(x)}, called the Padé approximant of order [m/n], “agrees” with the original function to order m + n. You may ask, “but isn’t the Taylor series from which it is derived also of order m + n?” And you would be correct. However, the Padé approximant seems to consistently have a wider radius of convergence than its parent Taylor series, and, being a quotient, is composed of lower-degree polynomials. With the normalization that the first term of Q(x) is always 1, there is a set of linear equations which generates the unique Padé approximant coefficients. Letting a_n be the coefficients of the Taylor series, one can solve:

(1)

\begin{align*}
a_0 &= p_0\\
a_1 + a_0q_1 &= p_1\\
a_2 + a_1q_1 + a_0q_2 &= p_2\\
a_3 + a_2q_1 + a_1q_2 + a_0q_3 &= p_3\\
a_4 + a_3q_1 + a_2q_2 + a_1q_3 + a_0q_4 &= p_4\\
&\vdots\\
a_{m+n} + a_{m+n-1}q_1 + \ldots + a_0q_{m+n} &= p_{m+n}
\end{align*}

remembering that all p_k, k > m and q_k, k > n are 0.

There is a lot of research on the theory of Padé approximants and Padé tables, how they work, their relationship to continued fractions, and why they work so well. For example, the interested reader is directed to Baker (1975), Van Assche (2006), Wikipedia, and MathWorld for more.

Practical Example

The function \log(1 + x) will be used as the example. This function has a special implementation in almost all computer languages, often called log1p(x), as the naïve implementation log(1 + x) will suffer catastrophic floating-point error for x near 0.
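A quick demonstration of the problem; the exact printed values are platform-dependent, but the pattern is universal:

x <- 1e-17
log(1 + x)   # 0, because 1 + 1e-17 rounds to exactly 1 in double precision
log1p(x)     # 1e-17, computed to full precision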

The Maclaurin series expansion for \log(1 + x) is:

    \[ \sum_{k=1}^\infty (-1)^{k+1}\frac{x^k}{k} = x - \frac{1}{2}x^2 + \frac{1}{3}x^3 - \frac{1}{4}x^4 + \frac{1}{5}x^5 - \frac{1}{6}x^6 + \ldots \]

The code below will compare a Padé [3/3] approximant with the 6-term Maclaurin series, which is itself the Padé [6/0] approximant. First, we calculate the coefficients. We know the Maclaurin coefficients: they are 0, 1, -\frac{1}{2}, \frac{1}{3}, -\frac{1}{4}, \frac{1}{5}, -\frac{1}{6}. Therefore, the system of linear equations looks like this:

(2)

\begin{align*}
0 &= p_0\\
1 + 0q_1 &= p_1\\
-\frac{1}{2} + q_1 + 0q_2 &= p_2\\
\frac{1}{3} - \frac{1}{2}q_1 + q_2 + 0q_3 &= p_3\\
-\frac{1}{4} + \frac{1}{3}q_1 - \frac{1}{2}q_2 + q_3 &= 0\\
\frac{1}{5} - \frac{1}{4}q_1 + \frac{1}{3}q_2 - \frac{1}{2}q_3 &= 0\\
-\frac{1}{6} + \frac{1}{5}q_1 - \frac{1}{4}q_2 + \frac{1}{3}q_3 &= 0
\end{align*}

Moving the unknowns to the left-hand side and the known a_n values to the right, we get:

(3)

\begin{align*}
-p_0 &= 0\\
0q_1 - p_1 &= -1\\
q_1 + 0q_2 - p_2 &= \frac{1}{2}\\
-\frac{1}{2}q_1 + q_2 + 0q_3 - p_3 &= -\frac{1}{3}\\
\frac{1}{3}q_1 - \frac{1}{2}q_2 + q_3 &= \frac{1}{4}\\
-\frac{1}{4}q_1 + \frac{1}{3}q_2 - \frac{1}{2}q_3 &= -\frac{1}{5}\\
\frac{1}{5}q_1 - \frac{1}{4}q_2 + \frac{1}{3}q_3 &= \frac{1}{6}
\end{align*}

We can solve this in R using solve; the unknowns below are ordered (q_1, q_2, q_3, p_0, p_1, p_2, p_3):

A <- matrix(c(0, 0, 1, -.5, 1 / 3, -.25, .2,    # column for q1
              0, 0, 0, 1, -.5, 1 / 3, -.25,     # column for q2
              0, 0, 0, 0, 1, -.5, 1 / 3,        # column for q3
              -1, 0, 0, 0, 0, 0, 0,             # column for p0
              0, -1, 0, 0, 0, 0, 0,             # column for p1
              0, 0, -1, 0, 0, 0, 0,             # column for p2
              0, 0, 0, -1, 0, 0, 0),            # column for p3
            ncol = 7)
B <- c(0, -1, .5, -1 / 3, .25, -.2, 1 / 6)      # right-hand side of (3)
P_Coeff <- solve(A, B)
print(P_Coeff)

## [1] 1.5000000 0.6000000 0.0500000 0.0000000 1.0000000 1.0000000 0.1833333

Now we can create the estimating functions:
ML <- function(x){x - .5 * x ^ 2 + x ^ 3 / 3 - .25 * x ^ 4 + .2 * x ^ 5 - x ^ 6 / 6}

PD33 <- function(x){
  NUMER <- x + x ^ 2 + 0.1833333333333333 * x ^ 3
  DENOM <- 1 + 1.5 * x + .6 * x ^ 2 + 0.05 * x ^ 3
  return(NUMER / DENOM)
}

Let’s compare the behavior of these functions around 0 with the naïve and sophisticated implementations of \log(1+x) in R.
library(dplyr)
library(ggplot2)
library(tidyr)
D <- seq(-1e-2, 1e-2, 1e-6)
RelErr <- tbl_df(data.frame(X = D,
                            Naive = (log(1 + D) - log1p(D)) / log1p(D),
                            MacL = (ML(D) - log1p(D)) / log1p(D),
                            Pade = (PD33(D) - log1p(D)) / log1p(D)))
RelErr2 <- gather(RelErr, Type, Error, -X)
RelErr2 %>% group_by(Type) %>% summarize(MeanError = mean(Error, na.rm = TRUE)) %>% knitr::kable(digits = 18)

Type    MeanError
-----  -----------
Naive  -4.3280e-15
MacL   -2.0417e-14
Pade   -5.2000e-17

Graphing the relative error in a small area around 0 shows the differing behaviors. First, against the naïve implementation, both estimates do much better.

ggplot(RelErr2, aes(x = X)) + geom_point(aes(y = Error, colour = Type), alpha = 0.5)

Graph1

But when compared one against the other, the Padé approximant (blue) shows better behavior than the Maclaurin (red), and its relative error stays below machine epsilon for a wider swath.

ggplot(RelErr, aes(x = X)) + geom_point(aes(y = MacL), colour = 'red', alpha = 0.5) + geom_point(aes(y = Pade), colour = 'blue', alpha = 0.5)

Graph2

Just for fun, restricting the y-axis to the same range as above, overlaying the naïve formulation (green) looks like this:

ggplot(RelErr, aes(x = X)) + geom_point(aes(y = Naive), colour = 'green', alpha = 0.5) + scale_y_continuous(limits = c(-1.5e-13, 0)) + geom_point(aes(y = MacL), colour = 'red', alpha = 0.5) + geom_point(aes(y = Pade), colour = 'blue', alpha = 0.5)

Graph3

There are certainly more efficient and elegant ways to calculate Padé approximants, but I found this exercise helpful, and I hope you do as well!
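For instance, the bookkeeping above generalizes directly. Below is a minimal sketch of a function that builds and solves the same system for arbitrary m and n; the name and interface are my own illustration, not the API of the Pade package:

pade_coefficients <- function(a, m, n) {
  # a holds the Taylor coefficients a_0, ..., a_{m+n}
  stopifnot(length(a) == m + n + 1)
  N <- m + n + 1
  # Unknowns are ordered (q_1, ..., q_n, p_0, ..., p_m)
  A <- matrix(0, N, N)
  for (k in seq_len(N)) {          # row k is the equation at order k - 1
    for (j in seq_len(n)) {        # q_j multiplies a_{(k - 1) - j}, when that exists
      if (k - j >= 1) A[k, j] <- a[k - j]
    }
    if (k <= m + 1) A[k, n + k] <- -1   # the -p_{k - 1} term
  }
  x <- solve(A, -a)
  list(p = x[(n + 1):(n + m + 1)], q = c(1, x[seq_len(n)]))
}

# Reproduces the coefficients above for log(1 + x) with m = n = 3
pade_coefficients(c(0, 1, -1 / 2, 1 / 3, -1 / 4, 1 / 5, -1 / 6), 3, 3)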

References

  • Baker, G. A. Essentials of Padé Approximants. Academic Press, 1975.
  • Van Assche, W. Padé and Hermite-Padé approximation and orthogonality. ArXiv Mathematics e-prints, 2006.
New package on CRAN: lamW
http://www.avrahamadler.com/2015/05/26/lamw-package-cran/
Tue, 26 May 2015

Recently, in various research projects, the Lambert-W function arose a number of times. Somewhat frustratingly, there is no built-in function in R to calculate it. The only options were those in the gsl and LambertW packages, the latter merely importing the former. Importing the entire GNU Scientific Library (GSL) can be a bit of a hassle, especially for those of us restricted to a Windows environment.

Therefore, I spent a little time and built a package for R whose sole purpose is to calculate the real-valued branches of the Lambert-W function without the need for importing the GSL: the lamW package. It does depend on Rcpp, though. It could have been written in pure R, but as there are a number of loops involved which cannot be vectorized, and as Rcpp is fast becoming almost a base package, I figured the speed and convenience were worth it.
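A quick example of use; lambertW0 computes the principal branch and lambertWm1 the secondary real branch (check the package documentation to confirm the exported names):

library(lamW)
w <- lambertW0(1)    # the omega constant, approximately 0.5671433
w * exp(w)           # returns 1, since W(x) * exp(W(x)) = x by definition
lambertWm1(-0.1)     # secondary real branch, defined on [-1/e, 0)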

A welcome outcome of this was that I think I finally wrapped my head around basic Padé approximation, which I use when calculating some parts of the primary branch of the Lambert-W. Eventually, I’d like to write a longer post about Padé approximation; when that will happen, who knows 8-).
