Using Cellular Automata and Climate Data to Generate Visual Art

After the recording of the first demo with my band Mundo Kumo where we create music inspired on per reviewed papers about climate change, it was time to create some of the videos that we will be using on our live show. My initial idea was to use computer generated art combined with real climate data. I had experience with coding and complex artificial intelligence algorithms however not much experience using artificial intelligence (AI) to generate art.

The code used here was developed using the python language and it can be downloaded freely on my github repository called AlgorithmicArt. Although this was the initial idea about the videos,  the idea changed a lot after and we are not using all the videos explained here. However they are still a really good starting point.

When doing a little bit of research on the topic i found that Cellular automata was probably one of the easiest way to start. The concept of cellular automata was created by the mathematician Stephen Wolfram and some of the rules are very similar to the majority vote model used widely on statistical mechanics.

The first intuitive step is the use of the one-dimensional cellular automata (the easiest to implement and understand) . Thus picking a cellular automaton rule, it is possible to create a picture similarly to the pictures found on the internet.

To make things a little bit more interesting, a video can be made showing the evolution of the cellular automata grouping sequential gifs. This can be done using different alternatives. For example using ubuntu and the imagemagick package:

convert -delay 20 -loop 0 *.jpg myimage.gif

The final video is a thing like this:

The second step is to add the climate information in the code and combine with the cellular automata. I used global temperature anomalies from NOAA. I maped the colors based on the range of temperature anomalies. Thus, the colors (global temperature anomalies) and the evolution of the cellular automata are connected:

Next post I will talk about two-dimensional cellular automata.

Be careful what you wish for: Error measures for regression using R

Almost 10 years ago I was working with evolutionary strategies for tuning neural network for time series prediction when I became curious about error measures and the effects on the final forecast. In general, evolutionary algorithms use a fitness function that is based on a error measure. The objective is to get better individual(s) minimizing (or maximizing) the fitness function. Thus, to determine which model is “the best”, the performance of a trained model is evaluated against one or more criteria (e.g. error measure). However, the relation between the lowest error and “best model” is complex and should be applied according with the desirable goal (i.e. forecasting average, forecasting extremes, deviance measures, relative errors, etc.).

There is a journal paper which gives the description of the errors used in the ‘qualV’ package. This package has several implementations of quantitative validation methods. The paper (which is very interesting by the way), also has some examples of how the final results of the errors measures change when dealing with noise, shifts, nonlinear scaling, etc.

The objective of this post is just to show the problem, and raise the awareness when measuring the best model based on error only. Sometimes the minimization of one error measure does not guarantee the minimization of all other error measures and it even could lead to a pareto front. Here i am using some of the functions described in the paper and for simplicity i am comparing here only 4 errors measures: mean absolute error (MAE), root-mean-square error (RMSE), correlation coefficient (r) and mean absolute percentage error (MAPE). Each error measure is measuring a distinct characteristic of the time series and each of them has strong and weak points. I am using R version 3.3.2 (2016-10-31) on Ubuntu 16.04.

Case (i): Adding noise

Lets say we have the function with the original signal given by:
$y=a\sin(\pi x b)+s.$

Using x=[0,1], a=1.5, b=2, and s=0.75 then:

x = seq(0,1,by=.005)
ysignal = 1.5*sin(pi*2*x)+0.75
plot(x,ysignal,main = "Main signal")


We should change the original signal and check how this will affect the final result. If the “forecast” is the same as the signal then all the errors should be 0. Thus applying some noise to the signal s=(0.75+noise), where noise comes from a Gaussian function with mean=0 and standard deviation =0.2, and comparing with the original signal we get:

library(qualV)
n = 0.2 #noise level
noise = rnorm(length(x),sd=n)
ynoise = ysignal+noise
par(mfrow=c(1,2))
range.yy <- range(c(ysignal,ynoise))
plot(x,ysignal,type='l',main = "Adding noise"); lines(x,ynoise,col=2)
plot(ynoise,ysignal,ylim=range.yy,xlim=range.yy,main = "Signal vs Forecast") 

round(MAE(ysignal,ynoise),2)
## [1] 0.16
round(RMSE(ysignal,ynoise),2)
## [1] 0.21
round(cor(ysignal,ynoise),2)
## [1] 0.98
round(MAPE(ysignal,ynoise),2)
## [1] 40.65


Case (ii): Shifting the signal

Lets apply a shift on the values of the original signal. With s=0.95 we have:

yshift = ysignal+0.2


round(MAE(ysignal,yshift),2)
## [1] 0.2
round(RMSE(ysignal,yshift),2)
## [1] 0.2
round(cor(ysignal,yshift),2)
## [1] 1
round(MAPE(ysignal,yshift),2)
## [1] 60.95


Case (iii): shift + rescale

Lets apply a shift and also rescale the values of the original signal. Doing a=0.8 and s=0.95 we have:

yresshift = 0.8*ysignal+0.2


round(MAE(ysignal,yresshift),2)
## [1] 0.19
round(RMSE(ysignal,yresshift),2)
## [1] 0.22
round(cor(ysignal,yresshift),2)
## [1] 1
round(MAPE(ysignal,yresshift),2)
## [1] 61.66


Case (iv): Changing the frequency

In this case lets vary slightly the frequency of the original signal making b=2.11:

yfreq = 1.5*sin(pi*2.11*x)+0.75


round(MAE(ysignal,yfreq),2)
## [1] 0.17
round(RMSE(ysignal,yfreq),2)
## [1] 0.22
round(cor(ysignal,yfreq),2)
## [1] 0.98
round(MAPE(ysignal,yfreq),2)
## [1] 89.33


Each case has the original series (in black) and the possible “forecast” (in red). I also plotted the original series (signal) versus the residual series. Which case would you pick as the best forecast? What is your assumption?

Installing R package gputools and cuda 8.0 on Ubuntu 16.04

This is a quick tutorial of how to install the R package ‘gputools’ version 1.1 using R version 3.3.2 (2016-10-31) and cuda 8.0 on Ubuntu 16.04. Most of these versions are new so I did some search on the internet and I could not find a tutorial about that. However most of this tutorial is based on this page which is for ‘gputools’ version 0.28 and cuda 7.0 on Ubuntu 15.04. At the end I just changed a few lines.

I have tested it on a ASUS ROG G752VM with NVIDIA GeForce GTX 965M graphics card. The instruction assumes you have the necessary CUDA compatible hardware support. In my case I also installed the NVIDIA driver 367.57 first. My computer was new so I did not have any nvidia driver or compatibility issues. However I strongly recommend to look on the internet how to remove the old drivers first, before install the new ones (things like sudo apt-get purge nvidia-cuda*).

Installing CUDA 8.0

First, to install CUDA 8.0 we can do:

wget https://developer.nvidia.com/compute/cuda/8.0/prod/local_installers/cuda-repo-ubuntu1604-8-0-local_8.0.44-1_amd64-deb
sudo dpkg -i cuda-repo-ubuntu1604-8-0-local_8.0.44-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda

Or you can download the CUDA repository package for Ubuntu 16.04 from the CUDA download site and follow the instructions according to your necessities.

Environment Variables

I tried to install the gputools package without adding the variables to the environment and i got an error related to nvcc. Thus, as part of the CUDA environment, we should add the nvcc compiler in the .bashrc file of your home folder.

export CUDA_HOME=/usr/local/cuda-8.0
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64 PATH=${CUDA_HOME}/bin:${PATH} PATH=${CUDA_HOME}/bin/nvcc:${PATH} export PATH Installing gputools version 1.1 The fastest way to install gpuplots if you are using R version 3.3.2 is: install.packages('gputools') Now my tutorial differs a bit more from the tutorial I mentioned before. I received the message: rinterface.cu:1:14: fatal error: R.h: No such file or directory #include So we have to check where R header dir location is. First lets locate the file R.h: locate \/R.h ## /usr/share/R/include/R.h Then next step is to tell to gputools where the R.h is located. Thus it is necessary to change a line in the source package. First download and extract the source package: wget http://cran.r-project.org/src/contrib/gputools_1.1.tar.gz tar -zxvf gputools_1.1.tar.gz Look into the folder you just extracted then open the file configure on your favourite Ubuntu editor to replace the string R_INCLUDE="${R_HOME}/include" for R_INCLUDE="/usr/share/R/include" (which is the location of my R.h file).

The two finals steps are compress the modified source code

tar -czvf gputools_1.1_new.tar.gz gputools

and install the modified package

install.packages("~/gputools_1.1_new.tar.gz", repos = NULL, type = "source")

I had lots of warning messages but no error.

Testing performance

Now we can try some simple benchmarks and see how much time the CPU and gpu time will spend. First a small matrix multiplication:

library(gputools)

magnitude <- 10
dimA <- 2*magnitude;dimB <- 3*magnitude;dimC <- 4*magnitude
matA <- matrix(runif(dimA*dimB), dimA, dimB)
matB <- matrix(runif(dimB*dimC), dimB, dimC)

system.time(matA%*%matB);
##    user  system elapsed
##   0.000   0.000   0.001
system.time(gpuMatMult(matA, matB))
##    user  system elapsed
##   0.076   0.140   0.215

then using larger matrices:

magnitude <- 1000
dimA <- 2*magnitude;dimB <- 3*magnitude;dimC <- 4*magnitude
matA <- matrix(runif(dimA*dimB), dimA, dimB)
matB <- matrix(runif(dimB*dimC), dimB, dimC)

system.time(matA%*%matB);
##    user  system elapsed
##  15.552   0.028  15.579
system.time(gpuMatMult(matA, matB))
##    user  system elapsed
##   0.792   0.124   0.914

China and Brazil water pollution

Wow this is impressive. I imagine, how can they do that? The effort is huge. It requires a lot of laziness of the Governments, greedy, contempt…
WARNING: strong images (not for the respective Governments)!