Think Python AND R and not just PYTHON OR R: From NULL to String types


Continuing the saga of types in Python and R, in this short post I will talk briefly about the NULL object (NoneType in Python) and how Python and R handle the string type. I am using Python 2.7.13 and R 3.3.3, both 64-bit on Ubuntu 16.04, and I am using the free book A Hands-On Introduction to Using Python in the Atmospheric and Oceanic Sciences by Prof. Johnny Lin as a guide.

The NULL/None type

The NoneType in Python and the NULL object in R are used to signify the absence of a value. They are often returned by expressions and functions whose value is undefined. In R, because variables are dynamically typed, an object with value NULL can be changed by replacement operators and will be coerced to the type of the right-hand side. Both objects are also a good way to “safely” initialize a variable, as long as you make sure to set it to a real value later.
For example, if a variable is initialized to None in Python and the code later tries to do an operation with it before it has been reassigned to a real value, the interpreter raises a TypeError. R is more permissive: arithmetic on NULL, such as NULL + 1, returns a zero-length vector (numeric(0)) without an error, so the problem usually surfaces later, for example when an if statement receives a condition of length zero.
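
The Python side of this can be sketched as follows. This is only a minimal illustration: the variable name total is hypothetical, and the snippet is written to run on both Python 2.7 and Python 3.

```python
# `total` is a hypothetical placeholder, initialised to None on purpose.
total = None

try:
    total = total + 1          # arithmetic on None is not defined
    failed = False
except TypeError as err:
    failed = True
    print("TypeError: " + str(err))

# Once the variable holds a real value, the same operation works.
total = 0
total = total + 1
print(total)
```

The exact wording of the error message depends on the Python version, so it is not reproduced here.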

In addition, because NULL (None) is a special object, each language has a dedicated way to test for it. In R, comparing with == does not work, because it returns an empty logical vector, so use is.null() instead:

a = NULL
a == NULL
is.null(a)
## logical(0)
## [1] TRUE

And using Python:

a = None
print a == None
print a is None
## True
## True

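In Python it is possible to use the common relational operator ==, but it is not recommended; one should use is instead. None is a singleton, so is checks whether the variable refers to that one object, whereas == can be redefined by a class and may give a misleading answer. More information about why here and an example here.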
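
A small sketch of how == can mislead where is does not. The class AlwaysEqual is hypothetical and exists only for this illustration; real classes rarely behave this badly, but any class is free to redefine ==.

```python
class AlwaysEqual(object):
    """Hypothetical class whose == comparison always succeeds."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()

equals_none = (obj == None)   # the overridden __eq__ answers True
is_none = (obj is None)       # identity check: obj is not the None object

print(equals_none)
print(is_none)
```

Here == reports True even though obj is clearly not None, while is gives the expected False.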

String variables

In Python and R, string variables (character vectors in R) are created by placing text between paired single or double quotes.
Python uses the operator (+) to join strings together:

a = "Hello"
b = "World"
print a 
print b 
print a + b 
## Hello
## World
## HelloWorld

However, R does not use the same operator. There are different ways to concatenate strings in R, and one of the simplest is to use the function paste(), which by default inserts a space character between the strings (this can be changed with the sep argument), or paste0(), which inserts nothing.

paste(a,b)
paste0(a,b)
## [1] "Hello World"
## [1] "HelloWorld"
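
Python has no paste() function, but the string method join() gives similar control over the separator. The sketch below is one possible counterpart to the R calls above, not the only way to do it; the variable names with_space, no_space and with_dash are just illustrative.

```python
a = "Hello"
b = "World"

with_space = " ".join([a, b])   # similar to paste(a, b) in R
no_space = "".join([a, b])      # similar to paste0(a, b) in R
with_dash = "-".join([a, b])    # similar to paste(a, b, sep = "-") in R

print(with_space)
print(no_space)
print(with_dash)
```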

If you have any questions, suggestions or opinions about this post, please feel free to write a comment below.

Think Python AND R and not just PYTHON OR R: More about types


In the last post I talked a little bit about the two most used programming languages for machine learning (Python and R) and how they handle operators and types. In this post I will extend the discussion of types in Python and R, in particular logical operators and Booleans. Here I am using Python 2.7.13 and R 3.3.3, both 64-bit on Ubuntu 16.04, and I am using the free book A Hands-On Introduction to Using Python in the Atmospheric and Oceanic Sciences by Prof. Johnny Lin as a guide.

Logical operators

The comparison operators are <, <=, >, >=, == for exact equality and != for inequality (valid for both languages). However, there are some differences between Python and R when combining logical expressions. In R, if test1 and test2 are logical expressions, then test1 & test2 is their intersection (“and”), test1 | test2 is their union (“or”), and !test1 is the negation of test1. Thus:

a = TRUE
b = FALSE
a & b
a | b
!a
## [1] FALSE
## [1] TRUE
## [1] FALSE

In Python, test1 and test2 is their intersection (“and”), test1 or test2 is their union (“or”), and not test1 is the negation of test1.

a = True
b = False
print(a and b)
print(a or b)
print(not a)
## False
## True
## False

Boolean variables

Python and R are case sensitive, so capitalization matters! R's TRUE and Python's True are not interchangeable: each language accepts only its own spelling. R also allows the use of T and F, but this is not recommended. From the R documentation:

The elements of a logical vector can have the values TRUE, FALSE, and NA (for “not available”). The first two are often abbreviated as T and F, respectively. Note however that T and F are just variables which are set to TRUE and FALSE by default, but are not reserved words and hence can be overwritten by the user. Hence, you should always use TRUE and FALSE.

Thus doing a simple example in R:

c = T
a == c
## [1] TRUE

And this is why it is NOT recommended:

T = 10
a == T
## [1] FALSE

In some languages, the integer value zero is considered FALSE (or False) and the integer value one is considered TRUE (or True). From the R documentation:

Logical vectors may be used in ordinary arithmetic, in which case they are coerced into numeric vectors, FALSE becoming 0 and TRUE becoming 1.

Doing a simple example in R:

a == 1
b == 0
10 + a
10 + b
## [1] TRUE
## [1] TRUE
## [1] 11
## [1] 10

The Python version I am using here (2.7.13) follows the same convention:

print(1 == a)
print(0 == b)
print(10 + a)
print(10 + b)
## True
## True
## 11
## 10
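
The reason this works in Python is that bool is a subclass of int. A short sketch of what that implies, assuming a hypothetical list of flags; it runs on both Python 2.7 and Python 3.

```python
a = True
b = False

is_int = isinstance(a, int)   # bool is a subclass of int
doubled = a + a               # True counts as 1 in arithmetic

# Hypothetical list of flags: summing it counts the True values.
flags = [True, False, True, True]
count_true = sum(flags)

print(is_int)
print(doubled)
print(count_true)
```

Summing a list of Booleans is a common way to count how many conditions hold, much like sum() on a logical vector in R.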

How about the logical operators? Let's apply the same rule in R:

a & 1
a & 0
1 & a
2 & a
0 & a
## [1] TRUE
## [1] FALSE
## [1] TRUE
## [1] TRUE
## [1] FALSE

And using Python:

print(a and 1)
print(a and 0)
print(1 and a)
print(2 and a)
print(0 and a)
## 1
## 0
## True
## True
## 0

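Remember that Python and R are both dynamically typed but sometimes handle variables differently? In R, & and | coerce their operands to logical and return a logical value. In Python, and and or return one of their operands unchanged, which is why the output above mixes 1, 0 and True. You can click here to see how Python handles truth value testing.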
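
A minimal sketch of that operand-returning behaviour in Python. The variable names are illustrative only; wrapping the expression in bool() is one way to get a real Boolean back.

```python
a = True

first = a and 0            # a is truthy, so the second operand (0) is returned
second = 0 and a           # 0 is falsy, so it is returned and a is never evaluated
third = 2 or a             # 2 is truthy, so it is returned as-is
as_bool = bool(a and 0)    # bool() converts the result to True or False

print(repr(first))
print(repr(second))
print(repr(third))
print(repr(as_bool))
```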

Similarities not so similar

As a final remark, I’d like to mention a similarity that is not so similar: the operators & and | in R and Python. Yes, Python has the same operators, but there they are the bitwise operators, which is slightly different from what we are doing here. For example:

print(a & 1)
print(a & 0)
print(1 & a)
print(2 & a)
print(0 & a)
## 1
## 0
## 1
## 0
## 0

In R the bitwise operations are provided by the functions bitwAnd() and bitwOr(), but that is a topic for another post. Click here for more information about bitwise operators in Python and R.
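
To see why 2 & a gives 0 in Python while 2 and a gives True, it helps to look at the bit patterns. The sketch below uses the built-in bin() function for that; the variable names bitwise and logical are illustrative.

```python
a = True

# bin() shows the bit patterns: 2 is 0b10 and int(True) is 0b1.
print(bin(2))
print(bin(int(a)))

bitwise = 2 & a      # 0b10 & 0b01: no bit in common, so the result is 0
logical = 2 and a    # 2 is truthy, so `and` returns the second operand

print(bitwise)
print(logical)
```

The bitwise operator compares the numbers bit by bit, whereas and only asks whether each operand is truthy.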

If you have any questions, suggestions or opinions about this post, please feel free to write a comment below.