GIOVANNI CARMANTINI

A window on Numpy's views

2018-05-20T20:37:59+00:00

If you use Numpy for your everyday number crunching and you’ve never heard about views, you are not as effective as you could be. With this post, I want to help you take your Numpy skills to the next level.

Your mental model of what arrays are in Numpy is probably quite different from the underlying reality. To get you to the paradigm shift, we need a problem that can bring us down the rabbit hole. This is the windowing problem:

I have an input time series with \(k\) observations. I want to rearrange it in \(w\) windows of some size \(s\) and apply a function \(f\) to each window, to get an output array of \(w\) values. How can I do this fast and with a small memory footprint using Numpy?

Or – more visually – how can we use Numpy to do something like the following, while keeping things fast and small?

Two naive solutions to windowing end up being either too slow or too memory-hungry.

Specifically, using a pure Python for loop to iterate window-by-window over the original array is going to be slow, while creating a new windowed array from the old one requires up to \(w + 1\) times the amount of memory used by the old array. Both of these approaches could work for a one-off script on a small dataset, but they become infeasible as soon as the data size increases, or our data transformation needs to be called repeatedly in some data processing pipeline.

Enter views.

Data views in numpy

A numpy array is a pointer to a contiguous chunk of memory, bundled with information on how to interpret that memory. It turns out that you can modify this information to reinterpret the same contiguous chunk of memory, and see it in a different way. When you do that, you are creating a view. ¹ The original array is then akin to a “canonical” interpretation of the underlying memory, and it owns that memory. A view reinterprets the canon, but it doesn’t own its memory – it shares it with the original array.

The interpretation information is contained in key numpy.ndarray attributes:

data: a pointer to the chunk of memory
dtype: data type. Is each byte in the memory chunk representing a boolean, an 8-bit integer, or perhaps is part of the 8 bytes forming a 64-bit floating point number?
itemsize: number of bytes needed to represent a single element, e.g. 8 for a 64-bit float. This is included for clarity, as it is actually completely defined by dtype.
shape: the size of the array in each of the dimensions. For example, a 2D array with \(j\) rows and \(k\) columns would have shape=(j, k).
strides: for each dimension, the number of bytes to jump ahead to get to the next element in that dimension. More about this later.

Here’s an example of how this information comes together to form an interpretation of some memory chunk:

Numpy view
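To make the attributes listed above concrete, here is a small sketch that inspects them on a 2x4 array of 1-byte integers and then reinterprets the same bytes as 16-bit integers. The array values and the variable names are assumptions chosen for illustration.

```python
import numpy as np

# A 2x4 array of 1-byte integers: 8 bytes of memory in total.
arr = np.array([[1, 3, 3, 7], [8, 0, 0, 8]], dtype=np.int8)

print(arr.dtype)     # int8
print(arr.itemsize)  # 1
print(arr.shape)     # (2, 4)
print(arr.strides)   # (4, 1)

# Reinterpret the same 8 bytes as 16-bit integers: no data is copied,
# only the interpretation information changes.
as_int16 = arr.view(np.int16)
print(as_int16.shape)    # (2, 2)
print(as_int16.strides)  # (4, 2)
print(np.shares_memory(arr, as_int16))  # True
```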

By modifying the interpretation information bundled with every numpy array, we can keep the same underlying memory contents, yet

see the memory as containing one type of data or another (e.g. integers, floats, booleans, or arbitrary python objects);
decide how to move through this memory, ignoring some parts of it and only accessing others.

Numpy’s view approach is extremely powerful. To illustrate why, say I have a 1GB array big_arr, and I want to create another array smaller_arr from it, containing all elements in the second half of big_arr. By creating a view of the original big_arr, numpy will just reuse the same memory from big_arr instead of copying half of its contents to some newly allocated 500MB memory chunk.

>>> import numpy as np
>>> bytes_in_gigabyte = 1e9
>>> # This increases RAM usage by 1GB (check with a RAM monitor)
>>> big_arr = np.ones(int(bytes_in_gigabyte), dtype=np.int8)
>>> # No big increase in RAM for smaller_arr
>>> # (integer division: slice indices must be integers)
>>> smaller_arr = big_arr[big_arr.size // 2:]
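The same behaviour can be checked on a tiny array without a RAM monitor. The sketch below (array size and names are assumptions for illustration) shows that the slice does not own its memory, and that writing through the view changes the original array.

```python
import numpy as np

big_arr = np.arange(10, dtype=np.int8)
smaller_arr = big_arr[big_arr.size // 2:]  # second half, as a view

# The view does not own its data; it shares it with big_arr.
print(smaller_arr.flags.owndata)              # False
print(np.shares_memory(big_arr, smaller_arr))  # True

# Writing through the view is visible in the original array.
smaller_arr[0] = 42
print(big_arr[5])  # 42
```

This sharing is also the main thing to be careful about: if an independent array is needed, an explicit `.copy()` is required.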

This is interesting and all, but – you may be wondering – what does this have to do with windowing? Let’s talk about strides.

Moving through memory

As I wrote before, a numpy array contains information about how to move through the memory chunk which the array points to.

An array’s strides tuple \((i_0, i_1, \ldots, i_N)\) is the answer to the question “If I am moving along dimension \(j\), how many bytes of memory do I have to skip to get to the next element?”. For example, for the array of 1-byte integers in the figure with 2 rows and 4 columns, strides would be equal to (4, 1): if I am moving along dimension 0 (i.e. along rows), I need to skip 4 bytes of data to get to the next element from the current one, and if I am moving along dimension 1 (i.e. along columns), I need to skip 1 byte. An animation should help clarify the concept:

We can modify strides to get a memory-efficient windowed view of an array, and be able to apply fast numerical transformations to it. This is because we can rely on Numpy’s fast vectorization to apply functions to our windowed array rather than having to use slow pure Python.
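Before modifying strides, it may help to verify by hand how they locate an element. The following is a minimal sketch, assuming the 2x4 array of 1-byte integers from the figure; `byte_offset` is a hypothetical helper written for this illustration, not a Numpy function.

```python
import numpy as np

arr = np.array([[1, 3, 3, 7], [8, 0, 0, 8]], dtype=np.int8)
raw = arr.tobytes()  # a copy of the underlying 8 bytes


def byte_offset(index, strides):
    """Byte offset of an element: sum of index * stride over dimensions."""
    return sum(i * s for i, s in zip(index, strides))


# Element (1, 3): 1 * 4 bytes along rows + 3 * 1 byte along columns.
offset = byte_offset((1, 3), arr.strides)
print(offset)       # 7
print(raw[offset])  # 8
print(arr[1, 3])    # 8

# Transposing swaps the strides instead of moving any data.
print(arr.T.strides)  # (1, 4)
```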

Windowing with strides

Let’s take the chunk of memory in the previous figure, interpreted as a sequence of 1-byte integers:

[1, 3, 3, 7, 8, 0, 0, 8]

Say we want to window this such that each window contains 2 elements, with the windowing advancing by one element at a time; this would result in the following array:

[[1, 3], [3, 3], [3, 7], [7, 8], [8, 0], [0, 0], [0, 8]]

If we were to create the windowed array from scratch, we would end up with an array roughly 2 times as big as the original one. As I wrote before, this is not practical when one has a big array to begin with, and the problem only grows worse as the windowing size increases. Luckily, we can manipulate the strides and shape information to move intelligently through the original chunk of memory, simulating windowing but without the memory cost. Remember, strides contains, for each dimension, the number of bytes to jump ahead to get to the next element in that dimension.

For the chunk of memory in the example, we can iterate over it as if it were windowed, by setting its shape to be (7, 2), and specifying that the first dimension always advances by one element at a time (one byte in the case of an int8 element). In this way, every time we are finished looking at a row (a window), we advance to the next one by moving a byte forward from the beginning of the current row, and see the next couple of elements as the next row.

Interestingly, Numpy won’t actually just allow us to change strides and shape without putting up a fight, so we need to use the as_strided function under numpy.lib.stride_tricks, which circumvents numpy’s safety checks to create an array on the original data with our desired shape and strides.

>>> import numpy as np
>>> arr = np.array([1, 3, 3, 7, 8, 0, 0, 8], dtype=np.int8)
>>> np.lib.stride_tricks.as_strided(arr, strides=(1, 1), shape=(7,2))
array([[1, 3],
       [3, 3],
       [3, 7],
       [7, 8],
       [8, 0],
       [0, 0],
       [0, 8]], dtype=int8)
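As a cross-check, recent Numpy versions also ship a bounds-checked helper for this exact pattern, `numpy.lib.stride_tricks.sliding_window_view`. The sketch below assumes NumPy 1.20 or later (where that function was introduced) and compares it with the manual `as_strided` call; it is an aside, not the approach the rest of this post builds on.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided, sliding_window_view

arr = np.array([1, 3, 3, 7, 8, 0, 0, 8], dtype=np.int8)

# Manual version: shape and strides are our responsibility.
manual = as_strided(arr, shape=(7, 2), strides=(1, 1), writeable=False)

# Library version: shape and strides are derived from the window size.
builtin = sliding_window_view(arr, window_shape=2)

print(np.array_equal(manual, builtin))  # True
print(builtin.shape)                    # (7, 2)
print(builtin.strides)                  # (1, 1)
print(np.shares_memory(arr, builtin))   # True
```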

A word of warning: bypassing numpy’s checks on shape and strides, we can do some serious damage. For example, what happens when the new shape makes the view “spill over” the original memory chunk?

>>> np.lib.stride_tricks.as_strided(arr, strides=(1, 1), shape=(10,2))
array([[   1,    3],
       [   3,    3],
       [   3,    7],
       [   7,    8],
       [   8,    0],
       [   0,    0],
       [   0,    8],
       [   8,  -16],
       [ -16, -108],
       [-108,   30]], dtype=int8)

That’s strange. What are those weird values starting from the 8th row? We never defined those. Those values are memory we never allocated for our original array, and that we were not supposed to access. Who knows who is using it and what it actually contains.

To avoid getting garbage in our arrays and messing with memory that is not ours to mess with, we must be sure to compute shapes and strides properly while windowing.
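One way to reason about this is to compute the last byte a proposed view would touch and compare it with the size of the original buffer. The helper below is a hypothetical sketch (`view_fits` is not part of Numpy), and it assumes a contiguous array that owns its memory, non-negative strides, and a shape with no zero-length dimension.

```python
import numpy as np


def view_fits(arr, shape, strides):
    """Return True if a view with this shape/strides stays inside arr's buffer.

    Assumes arr is contiguous and owns its memory, strides are
    non-negative, and every entry of shape is at least 1.
    """
    # Offset of the last element, plus the bytes that element occupies.
    last_byte = sum((n - 1) * s for n, s in zip(shape, strides)) + arr.itemsize
    return last_byte <= arr.nbytes


arr = np.array([1, 3, 3, 7, 8, 0, 0, 8], dtype=np.int8)

print(view_fits(arr, shape=(7, 2), strides=(1, 1)))   # True: 6 + 1 + 1 = 8 bytes
print(view_fits(arr, shape=(10, 2), strides=(1, 1)))  # False: 9 + 1 + 1 = 11 bytes
```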

Efficient time series windowing in Numpy

The following function contains all the needed logic to make sure we can window efficiently and safely:

from __future__ import division
import numpy as np

def sliding_window(arr, window_size, step=0):
    """Assuming a time series with time advancing along dimension 0,
    window the time series with given size and step.

    :param arr : input array.
    :type arr: numpy.ndarray
    :param window_size: size of sliding window.
    :type window_size: int
    :param step: step size of sliding window. If 0, step size is set to obtain 
        non-overlapping contiguous windows (that is, step=window_size). 
        Defaults to 0.
    :type step: int

    :return: array 
    :rtype: numpy.ndarray
    """
    n_obs = arr.shape[0]

    # validate arguments
    if window_size > n_obs:
        raise ValueError(
            "Window size must be less than or equal "
            "the size of array in first dimension."
        )
    if step < 0:
        raise ValueError("Step must be non-negative.")
    if step == 0:
        # default: non-overlapping contiguous windows
        step = window_size

    n_windows = 1 + int(np.floor( (n_obs - window_size) / step))

    obs_stride = arr.strides[0]
    windowed_row_stride = obs_stride * step

    new_shape = (n_windows, window_size) + arr.shape[1:]
    new_strides = (windowed_row_stride, ) + arr.strides

    strided = np.lib.stride_tricks.as_strided(
        arr,
        shape=new_shape,
        strides=new_strides,
        writeable=False,
    )
    return strided

Let’s go through this bit by bit.

The first few lines are not particularly interesting, we just validate the passed arguments to make sure they make sense.

    n_obs = arr.shape[0]

    # validate arguments
    if window_size > n_obs:
        raise ValueError(
            "Window size must be less than or equal "
            "the size of array in first dimension."
        )
    if step < 0:
        raise ValueError("Step must be non-negative.")
    if step == 0:
        # default: non-overlapping contiguous windows
        step = window_size

Subsequently, we count the number of windows that the windowed array will end up containing.

    n_windows = 1 + int(np.floor( (n_obs - window_size) / step))

To understand the equation, think about it this way. We place a window at the beginning of the array. How many times can we slide this window forwards by step before the window goes over the array bounds? That’s equal to the number of times we can fit a step in n_obs - window_size, i.e. floor( (n_obs - window_size)/step ). For a more visual intuition:
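The count can also be checked numerically by enumerating every valid window start and comparing against the formula. This is a small sketch; `count_windows` and `window_starts` are hypothetical helpers used only for this check.

```python
def count_windows(n_obs, window_size, step):
    """Number of windows according to the formula: 1 + floor((n - w) / step)."""
    return 1 + (n_obs - window_size) // step


def window_starts(n_obs, window_size, step):
    """Explicitly list the start index of every window that fits."""
    return list(range(0, n_obs - window_size + 1, step))


for n_obs, window_size, step in [(8, 2, 1), (10, 4, 3), (10, 3, 4)]:
    starts = window_starts(n_obs, window_size, step)
    print(n_obs, window_size, step, starts, count_windows(n_obs, window_size, step))
    # The formula and the explicit enumeration agree.
    assert len(starts) == count_windows(n_obs, window_size, step)
```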

We are finally ready to create the windowed view. Whereas the old array was organized as a sequence of observations, the view is organized as a sequence of windows, each one containing window_size observations. We need to update the shape metadata to reflect this change in dimensionality, as well as strides, to let Numpy know how many bytes it needs to skip to go from one window to the next.

    obs_stride = arr.strides[0]
    windowed_row_stride = obs_stride * step

    new_shape = (n_windows, window_size) + arr.shape[1:]
    new_strides = (windowed_row_stride, ) + arr.strides

    strided = np.lib.stride_tricks.as_strided(
        arr,
        shape=new_shape,
        strides=new_strides,
        writeable=False,
    )
    return strided

And we are done. Let’s see it in action.

>>> big_arr = np.ones(int(1e9), dtype=np.int8)

Again, big_arr is an array of size ~1GB.

>>> sl_arr = sliding_window(big_arr, window_size=1000, step=1)
>>> out = np.empty(sl_arr.shape[0], dtype=np.int8)
>>> np.min(sl_arr, axis=1, out=out)

If we were to actually create the windowed array from scratch, we would need ~1TB worth of memory to accommodate it. Thanks to the power of views, the new array only needs enough memory to contain the interpretation information (11 orders of magnitude less), and is created instantly.
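For a version that is quick to run and easy to verify by eye, here is a sketch on the 8-element series used earlier. It includes a condensed copy of `sliding_window` (same logic, with the `step=0` default resolved to `window_size`) so the block runs on its own, and compares the result with a plain Python loop.

```python
import numpy as np


def sliding_window(arr, window_size, step=0):
    """Condensed copy of the function above, for a self-contained demo."""
    n_obs = arr.shape[0]
    if window_size > n_obs:
        raise ValueError("Window size must not exceed the first dimension.")
    if step < 0:
        raise ValueError("Step must be non-negative.")
    if step == 0:
        step = window_size  # non-overlapping contiguous windows
    n_windows = 1 + (n_obs - window_size) // step
    return np.lib.stride_tricks.as_strided(
        arr,
        shape=(n_windows, window_size) + arr.shape[1:],
        strides=(arr.strides[0] * step,) + arr.strides,
        writeable=False,
    )


series = np.array([1, 3, 3, 7, 8, 0, 0, 8], dtype=np.int8)
windows = sliding_window(series, window_size=3, step=2)
print(windows)  # rows: [1 3 3], [3 7 8], [8 0 0]

# Vectorized rolling minimum versus a pure Python loop.
rolling_min = windows.min(axis=1)
loop_min = [series[i:i + 3].min() for i in range(0, 6, 2)]
print(rolling_min.tolist(), [int(v) for v in loop_min])  # [1, 3, 0] [1, 3, 0]
```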

Conclusion

Numpy is great at hiding unnecessary complexity (such as strides or memory bounds) from day-to-day work. Yet, from time to time, it is good to open the clock and see what makes it tick. Windowing is the problem that had me open the clock. In this post, I used the windowing problem to abuse Numpy’s views a little and show you their inner workings, the same way I learned about them. Hope you enjoyed it.

In case you came here through a specific search for efficient windowing in Numpy, I can point you to some other useful resources.

Throughout this post, windowing was only applied to the first dimension of an array. This is because I assumed a time series as input, where iteration over subsequent observations is normally done on the first array dimension. That is, array[i] contains the value of some measurement(s) at time \(t_i\). If time series are not where you spend most of your number crunching, then your windowing needs might be different. You might want to have a look at more general windowing implementations, such as this one or this one.

If you are looking for time series windowing in Python, but you are not comfortable tinkering with stride tricks, consider using Pandas in your projects, and have a look at its rolling interface.

Finally, if you feel the need for even more advanced windowing capabilities than covered here or the links provided, you may find what you are looking for in Numba or Cython.

If you enjoyed this post, I’d be grateful if you’d help it spread by sharing it with your fellow data crunchers and numeric Python enthusiasts. I’d also love to read what you think, so go on and make a blogger happy – drop me a comment!

See this paper for an introduction ↩

Deriving the power of Wald test for a single parameter

2016-06-22T15:56:00+00:00

While studying from Larry Wasserman’s “All of Statistics”, I’ve found that the exposition of the Wald test was a little confusing to me, so that I struggled a bit in trying to derive a key result. Given that I didn’t find much on the internet to help me, and that I finally figured it out after a while, I thought of writing a small post for other confused students.

Given a scalar parameter \(\theta\) of the distribution underlying the data, the Wald test uses its estimate \(\hat{\theta}\) to compute a statistic \({W}\), which is then used to pit a null hypothesis \(H_0\) against an alternative hypothesis \(H_1\). The distribution of the estimate \(\hat{\theta}\) is assumed to be asymptotically normal, centered on the true value of \(\theta\). The null hypothesis is \(H_0: \theta = \theta_0\), which states that the true value of the parameter \(\theta\) is some scalar \(\theta_0\) (so that, if we assume that \(H_0\) is true and given that \(\hat{\theta}\) is asymptotically normal, \(\hat{\theta}\) would be centered on \(\theta_0\)). The alternative hypothesis is instead \(H_1: \theta \ne \theta_0\), which states the opposite.

Let \(\widehat{\text{se}}\) be the estimated standard error of \(\hat{\theta}\). If the null hypothesis is true, and thus the asymptotic distribution of \(\hat{\theta}\) is a normal centered on \(\theta_0\), then

\[\begin{equation} \frac{\hat{\theta} - \theta_0}{\widehat{\text{se}}} \rightsquigarrow N(0,1), \end{equation}\]

where \({N}(0,1)\) is the normal distribution with mean 0 and unit standard deviation.

The Wald statistic is defined as

\[\begin{equation} W = \frac{\hat{\theta} - \theta_0}{\widehat{\text{se}}} \end{equation}\]

and the size \(\alpha\) Wald test states: reject the null hypothesis when \(\vert W \vert >z_\frac{\alpha}{2}\), where \(z_\frac{\alpha}{2}\) is equal to \(\Phi^{-1}(1-\frac{\alpha}{2})\), \(\Phi^{-1}\) being the inverse of the normal CDF. \({W}\) is asymptotically distributed as \({N}(0,1)\), so that the probability of rejecting the null hypothesis asymptotically converges to \(\alpha\).

Now for the key result I wanted to prove:

Theorem: Let \(\theta_\ast\) be the true value of \(\theta\), \(\theta_\ast \ne \theta_0\) (i.e. the null hypothesis really is false). The power \(\beta(\theta_\ast) = P_{\theta_\ast}(\vert W \vert >z_\frac{\alpha}{2})\) of correctly rejecting the null hypothesis is then approximately equal to

\[\begin{equation} 1 - \Phi\left(\frac{\theta_0 - \theta_\ast}{\widehat{\text{se}}} + z_\frac{\alpha}{2}\right) + \Phi\left(\frac{\theta_0 - \theta_\ast}{\widehat{\text{se}}} - z_\frac{\alpha}{2}\right) \end{equation}\]

In proving this result I was initially confused by the fact that in Wasserman’s book \(\hat{\theta}\) is assumed to be asymptotically normal with center in \(\theta_0\) (Theorem 10.3). I was then further led astray by this question on math.stackexchange, where the user asking the question incorrectly applies the definition of power of a test.

So here’s the proof:

Proof: The true value of \(\theta\) is \(\theta_\ast\): given that \(\hat{\theta}\) is asymptotically normal and centered on \(\theta_\ast\), then \(\frac{\hat{\theta} - \theta_\ast}{\widehat{\text{se}}} \rightsquigarrow N(0,1)\). The power of the Wald test for \(\theta=\theta_\ast\) is equal to

\[\begin{align} \beta(\theta_\ast) & = P_{\theta_\ast}(\vert W \vert>z_\frac{\alpha}{2}) \\ & = P_{\theta_\ast}\left(\frac{\vert \hat\theta - \theta_0\vert}{\widehat{\text{se}}} > z_\frac{\alpha}{2}\right)\\ & = P_{\theta_\ast}\left(\frac{\hat\theta - \theta_0}{\widehat{\text{se}}} > z_\frac{\alpha}{2} \right) + P_{\theta_\ast}\left(\frac{\hat\theta - \theta_0}{\widehat{\text{se}}} < - z_\frac{\alpha}{2} \right) \\ & = P_{\theta_\ast}\left(\frac{\hat\theta - \theta_0}{\widehat{\text{se}}} + \frac{\theta_0 - \theta_\ast}{\widehat{\text{se}}} > \frac{\theta_0 - \theta_\ast}{\widehat{\text{se}}} + z_\frac{\alpha}{2} \right) +\\ & \hphantom{{}={}} P_{\theta_\ast}\left(\frac{\hat\theta - \theta_0}{\widehat{\text{se}}} + \frac{\theta_0 - \theta_\ast}{\widehat{\text{se}}} < \frac{\theta_0 - \theta_\ast}{\widehat{\text{se}}} - z_\frac{\alpha}{2} \right) \end{align}\]

Let \(Z = \frac{\hat\theta - \theta_0}{\widehat{\text{se}}} + \frac{\theta_0 - \theta_\ast}{\widehat{\text{se}}} = \frac{\hat{\theta} - \theta_\ast}{\widehat{\text{se}}}\), then \(Z \rightsquigarrow N(0,1)\), so that

\[\begin{align} & \hphantom{{}={}}P_{\theta_\ast}\left(Z > \frac{\theta_0 - \theta_\ast}{\widehat{\text{se}}} + z_\frac{\alpha}{2}\right) + P_{\theta_\ast}\left(Z < \frac{\theta_0 - \theta_\ast}{\widehat{\text{se}}} - z_\frac{\alpha}{2}\right) \\\\ & = 1 - P_{\theta_\ast}\left(Z < \frac{\theta_0 - \theta_\ast}{\widehat{\text{se}}} + z_\frac{\alpha}{2}\right) + P_{\theta_\ast}\left(Z < \frac{\theta_0 - \theta_\ast}{\widehat{\text{se}}} - z_\frac{\alpha}{2}\right) \\ & = 1 - \Phi\left(\frac{\theta_0 - \theta_\ast}{\widehat{\text{se}}} + z_\frac{\alpha}{2}\right) + \Phi\left(\frac{\theta_0 - \theta_\ast}{\widehat{\text{se}}} - z_\frac{\alpha}{2}\right) \end{align}\]

concluding the proof \(\blacksquare\)
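As a sanity check on the result, here is a small numerical sketch using only the Python standard library. It evaluates the power formula and compares it against a Monte Carlo estimate. Note the simplifying assumptions: the standard error is treated as a known constant rather than an estimate, \(\hat{\theta}\) is drawn exactly from a normal centered on \(\theta_\ast\), and the function names and parameter values are made up for illustration.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def wald_power(theta_star, theta_0, se, alpha=0.05):
    """Approximate power of the size-alpha Wald test, from the theorem."""
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = (theta_0 - theta_star) / se
    return 1 - STD_NORMAL.cdf(shift + z) + STD_NORMAL.cdf(shift - z)


def simulated_power(theta_star, theta_0, se, alpha=0.05, n_draws=100_000, seed=0):
    """Fraction of simulated estimates for which |W| > z_{alpha/2}."""
    rng = random.Random(seed)
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)
    rejections = sum(
        abs((rng.gauss(theta_star, se) - theta_0) / se) > z
        for _ in range(n_draws)
    )
    return rejections / n_draws


analytic = wald_power(theta_star=0.5, theta_0=0.0, se=0.2)
simulated = simulated_power(theta_star=0.5, theta_0=0.0, se=0.2)
print(analytic, simulated)  # the two values should be close

# When theta_star equals theta_0, the formula reduces to the size alpha.
print(wald_power(theta_star=0.0, theta_0=0.0, se=0.2))
```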

Minimum Edit Distance in Python

2016-01-21T11:50:00+00:00

While going through the NLP course by Jurafsky and Manning on coursera, I coded a small python implementation of the Wagner-Fischer algorithm presented in lectures 6, 7 and 8. And here it is! Please refer to the lectures for a more in-depth explanation of the algorithm. I’ll just go quickly through the basics and then present the code.

Introduction

How similar are these two strings?

Many people from different fields often end up asking themselves this question: the computational biologist comparing sequences of bases to see if they contain similar information; the computer scientist implementing speech recognition, trying to make sense of odd recognition results; me fighting with autocorrection for the control of my smartphone.

To actually answer this question, you first need to define some concept of distance between strings.

A useful definition is that of edit distance:

The edit distance is the number of operations (insertions, deletions or substitutions) needed to transform one string into another.

where an insertion means adding a symbol to the string, a deletion means removing one, and a substitution is a deletion followed by an insertion. Depending on your definition of edit distance, you may just consider insertion and deletion and do without substitution.

For example, we may want to calculate the distance between the strings spell and help:

s	p	e	l	l
h	e	l	p

One way to transform help into spell is to align the two el substrings, insert an s at the beginning of help, and perform the remaining substitutions. If we abbreviate insertion, deletion and substitution with i, d and s,

s	p	e	l	l
	h	e	l	p
i	s			s

so that, depending on whether we consider substitution to be a single operation or two, we end up with an edit distance between spell and help of respectively 3 or 5.
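The two totals can be reproduced by summing the cost of each operation in the alignment above. A minimal sketch, where `alignment_cost` is a hypothetical helper and a blank stands for "no operation":

```python
# Operations from the alignment of "help" to "spell": insert, substitute,
# two matches (no operation), substitute.
operations = ["i", "s", " ", " ", "s"]


def alignment_cost(operations, substitution_cost):
    """Sum the cost of an alignment; insertions and deletions cost 1."""
    costs = {"i": 1, "d": 1, "s": substitution_cost, " ": 0}
    return sum(costs[op] for op in operations)


print(alignment_cost(operations, substitution_cost=1))  # 3
print(alignment_cost(operations, substitution_cost=2))  # 5
```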

Minimum edit distance

Normally we are not interested in any edit distance, but we want the minimum edit distance between two strings. How to compute it?

There is an infinity of ways in which we can transform one string into another. We can get creative with alignments, inserting whole books’ worth of characters and then deleting the ones we don’t need, hiring a chiliad of monkeys randomly tapping on a keyboard until they manage to get from the first string to the second, and so on.

Luckily, if it is the minimum edit distance that we want, we don’t need to search this enormous space naively; we can be smart about it.

Say we have an initial state (the starting string) and an ending state (the final string). To go from one to the other, we apply a sequence of operations: a path connecting them. It turns out that to find the shortest path between two states, we just need to make sure that we are following the shortest possible path between each of the intermediate states between the two.

This problem can be solved elegantly by dynamic programming.

Dynamic Programming for Minimum Edit Distance: Wagner–Fischer algorithm

With dynamic programming, we solve a large problem by first solving smaller parts of it, and then building on the knowledge we gathered to solve bigger and bigger parts.

In the Wagner-Fischer algorithm, we define a distance matrix \(D_{i,j} = d(X[1\ldots i], Y[1\ldots j])\), the matrix in which index \((i, j)\) corresponds to the minimum edit distance \(d\) between the first \(i\) symbols in \(X\) and the first \(j\) symbols in \(Y\). We first compute \(D_{i,j}\) for small \((i,j)\), and then go for larger and larger \(i\) and \(j\) using the smaller bits that we already computed before.

By doing this, we end up with the minimum edit distance between \(X\) and \(Y\), that is \(d(X[1\ldots m], Y[1 \ldots n])\), where \(m = \vert X \vert\) is the length of the \(X\) string, and \(n = \vert Y \vert\) is the length of the \(Y\) string.
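Written out, the recurrence that fills the matrix looks as follows. This is a sketch under the cost convention used in the implementation in this post: insertions and deletions cost 1, and a substitution costs 2 (a deletion followed by an insertion). The symbol \(c\) is introduced here only for illustration.

\[\begin{align} D_{i,0} & = i, \qquad D_{0,j} = j, \\ D_{i,j} & = \min \begin{cases} D_{i-1,j} + 1 & \text{(deletion)} \\ D_{i,j-1} + 1 & \text{(insertion)} \\ D_{i-1,j-1} + c(X[i], Y[j]) & \text{(substitution or match)} \end{cases} \end{align}\]

where \(c(a, b) = 0\) if \(a = b\) and \(c(a, b) = 2\) otherwise. Each cell only depends on the cells above, to the left, and diagonally up-left, which is why the matrix can be filled row by row.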

I’ll now go through a python implementation of the algorithm. I’ll be using python3, as I wanted unicode support and I didn’t want to deal with the unicode nonsense in python2. To run the code in python2, just take out the unicode arrows.

First things first, let’s import some libraries. Numpy just makes things cleaner (not much going on here in terms of numerics), and we use tabulate to produce the final tables for backtracking and alignment.

import numpy as np
import tabulate as tb

We jump straight into defining the key function:

def wagner_fischer(word_1, word_2):
    n = len(word_1) + 1  # counting empty string
    m = len(word_2) + 1  # counting empty string

    # initialize D matrix
    # (np.int was removed from recent Numpy versions; the builtin int works)
    D = np.zeros(shape=(n, m), dtype=int)
    D[:,0] = range(n)
    D[0,:] = range(m)

    # B is the backtrack matrix. At each index, it contains a triple
    # of booleans, used as flags. if B(i,j) = (1, 1, 0) for example,
    # the distance computed in D(i,j) came from a deletion or a
    # substitution. This is used to compute backtracking later.
    B = np.zeros(shape=(n, m), dtype=[("del", 'b'),
                                      ("sub", 'b'),
                                      ("ins", 'b')])
    B[1:,0] = (1, 0, 0)
    B[0,1:] = (0, 0, 1)

    for i, l_1 in enumerate(word_1, start=1):
        for j, l_2 in enumerate(word_2, start=1):
            deletion = D[i-1,j] + 1
            insertion = D[i, j-1] + 1
            substitution = D[i-1,j-1] + (0 if l_1==l_2 else 2)

            mo = np.min([deletion, insertion, substitution])

            B[i,j] = (deletion==mo, substitution==mo, insertion==mo)
            D[i,j] = mo
    return D, B
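To cross-check the distances without the backtrack matrix, here is a minimal pure-Python sketch of the same recurrence that keeps only two rows of the table in memory. The name `min_edit_distance` and the `substitution_cost` parameter are assumptions introduced for this illustration; they are not part of the implementation above.

```python
def min_edit_distance(source, target, substitution_cost=2):
    """Minimum edit distance, with insertions and deletions costing 1."""
    # Distances from the empty prefix of source to every prefix of target.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # distance from source[:i] to the empty string
        for j, t_char in enumerate(target, start=1):
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            substitution = previous[j - 1] + (
                0 if s_char == t_char else substitution_cost
            )
            current.append(min(deletion, insertion, substitution))
        previous = current
    return previous[-1]


print(min_edit_distance("spell", "hello"))                      # 4
print(min_edit_distance("spell", "help"))                       # 5
print(min_edit_distance("spell", "help", substitution_cost=1))  # 3
```

The last two values match the distances of 5 and 3 worked out by hand for spell and help in the introduction.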

And here we implement a naive backtrace:

start from index \((m,n)\),
look at where the computed value in \((m,n)\) came from
In order of preference, follow a substitution, or a deletion, or an insertion (that is, go to the cell up and to the left if that’s where the value in \((m,n)\) was computed from, or to the cell above, or to the cell to the left)

def naive_backtrace(B_matrix):
    i, j = B_matrix.shape[0]-1, B_matrix.shape[1]-1
    backtrace_idxs = [(i, j)]

    while (i, j) != (0, 0):
        if B_matrix[i,j][1]:  # substitution: go up and to the left
            i, j = i-1, j-1
        elif B_matrix[i,j][0]:  # deletion: go up
            i, j = i-1, j
        elif B_matrix[i,j][2]:  # insertion: go left
            i, j = i, j-1
        backtrace_idxs.append((i,j))

    return backtrace_idxs

This next function takes a backtrace and computes the alignment between the two words. It goes through the operations and takes note of what has been applied at each step, while constructing the alignment.

def align(word_1, word_2, bt):

    aligned_word_1 = []
    aligned_word_2 = []
    operations = []

    backtrace = bt[::-1]  # make it a forward trace

    for k in range(len(backtrace) - 1):
        i_0, j_0 = backtrace[k]
        i_1, j_1 = backtrace[k+1]

        w_1_letter = None
        w_2_letter = None
        op = None

        if i_1 > i_0 and j_1 > j_0:  # either substitution or no-op
            if word_1[i_0] == word_2[j_0]:  # no-op, same symbol
                w_1_letter = word_1[i_0]
                w_2_letter = word_2[j_0]
                op = " "
            else:  # cost increased: substitution
                w_1_letter = word_1[i_0]
                w_2_letter = word_2[j_0]
                op = "s"
        elif i_0 == i_1:  # insertion
            w_1_letter = " "
            w_2_letter = word_2[j_0]
            op = "i"
        else: #  j_0 == j_1,  deletion
            w_1_letter = word_1[i_0]
            w_2_letter = " "
            op = "d"

        aligned_word_1.append(w_1_letter)
        aligned_word_2.append(w_2_letter)
        operations.append(op)

    return aligned_word_1, aligned_word_2, operations

Finally, this function formats the results from the Wagner–Fischer algorithm and backtracking to a table that is human-readable. In the table, each cell contains the computed minimum edit distance from the initial state to that state, and where it was computed from (that is, what operations could have produced it). The backtrace is highlighted with asterisks.

def make_table(word_1, word_2, D, B, bt):
    w_1 = word_1.upper()
    w_2 = word_2.upper()

    w_1 = "#" + w_1
    w_2 = "#" + w_2

    table = []
    # table formatting in emacs, you probably don't need this line
    table.append(["<r>" for _ in range(len(w_2)+1)])
    table.append([""] + list(w_2))

    max_n_len = len(str(np.max(D)))

    for i, l_1 in enumerate(w_1):
        row = [l_1]
        for j, l_2 in enumerate(w_2):

            # flags for deletion (vertical), substitution (diagonal)
            # and insertion (horizontal)
            v, d, h = B[i,j]
            direction = ("⇑" if v else "") +\
                ("⇖" if d else "") +\
                ("⇐" if h else "")
            dist = str(D[i,j])

            cell_str = "{direction} {star}{dist}{star}".format(
                                         direction=direction,
                                         star=" *"[((i,j) in bt)],
                                         dist=dist)
            row.append(cell_str)
        table.append(row)

    return table

Now we are ready to compute the minimum edit distance table, backtrace and alignment. Note that the “#+ATTR_HTML” print statements are there to format the table for this website, they don’t serve any other mysterious purpose.

What’s the minimum edit distance between “spell” and “hello”?

word_1 = "spell"
word_2 = "hello"

D, B = wagner_fischer(word_1, word_2)
bt = naive_backtrace(B)

edit_distance_table = make_table(word_1, word_2, D, B, bt)
alignment_table = align(word_1, word_2, bt)

print("Minimum edit distance with backtrace:")
print("#+ATTR_HTML: :border 2 :rules all :frame border "
      ":style text-align: right")  # org-babel export html properties
print(tb.tabulate(edit_distance_table, stralign="right", tablefmt="orgtbl"))

print("\nAlignment:")
print(tb.tabulate(alignment_table, tablefmt="orgtbl"))

Minimum edit distance with backtrace (the bold numbers):

	#	H	E	L	L	O
#		⇐ 1	⇐ 2	⇐ 3	⇐ 4	⇐ 5
S	⇑ 1	⇑⇖⇐ 2	⇑⇖⇐ 3	⇑⇖⇐ 4	⇑⇖⇐ 5	⇑⇖⇐ 6
P	⇑ 2	⇑⇖⇐ 3	⇑⇖⇐ 4	⇑⇖⇐ 5	⇑⇖⇐ 6	⇑⇖⇐ 7
E	⇑ 3	⇑⇖⇐ 4	⇖ 3	⇐ 4	⇐ 5	⇐ 6
L	⇑ 4	⇑⇖⇐ 5	⇑ 4	⇖ 3	⇖⇐ 4	⇐ 5
L	⇑ 5	⇑⇖⇐ 6	⇑ 5	⇑⇖ 4	⇖ 3	⇐ 4

Alignment:

s	p	e	l	l
	h	e	l	l	o
d	s				i

On the objectivity of Morality and Beauty

2016-01-07T17:05:00+00:00

A beautiful flower.

Sometime before Christmas, a friend pointed me to a great conversation between Sam Harris and Quantum Computation maven David Deutsch. Their discussion revolves around the ideas presented in the physicist’s book “The beginning of infinity”, seamlessly touching on science, knowledge, progress, the relation between humanity and the universe, and the objectivity of morality and beauty.

Yes, that’s right, the objectivity of morality and beauty. That’s why I had to read “The beginning of infinity”.

The book is composed in equal parts of insight and intelligent speculation. And who doesn’t love insight and intelligent speculation? Here are my highlights of the book, diving directly into the controversial bits!

Reality, Truth, Objectivity

Deutsch defines something to be objective if it figures in our best explanations of the world. This way, objectivity doesn’t directly refer to the reality out there anymore. That’s because there’s no way to access that reality in its raw form – no such thing as knowledge from raw sensory experience: all knowledge is theory-laden.

All we have is our theories. However, our theories can and do sometimes contain elements of truth, reflecting the properties of the reality out there. The tools we have to make sure that the truth content of our theories grows are conjecture and criticism. That is, we improve our theories by conjecturing competing explanations and criticizing them against some criteria. The criteria are theory-laden and are themselves subject to change, again through conjecture and criticism.

In physics, for example, we can consider a conjectured theory as an improvement with respect to the others if it better meets the criteria of explaining the available data, doing so simply, and producing testable predictions. These criteria weren’t born with physics itself, but emerged after centuries of discussion and criticism – and are not presently devoid of controversy either.

The power of conjecture and criticism – the power of reason itself – can be applied to more than just science. Philosophical theories that can be criticized against factual knowledge are also fair play. Take Morality and Aesthetics, for example.

Reason and Morality

Deutsch defines moral philosophy as tackling the problem of “what to do next”, that is to decide between a wide spectrum of options by reasoning about which options are better and which are worse. The Moral relativism point of view on the question is that there’s no real “better” or “worse”: these judgments only exist inside the arbitrary standards of culture.

Here’s where the factual knowledge in Deutsch’s claim makes its entrance. Moral theories do not just appear out of thin cultural air. They are connected with the physical world through the explanations we create to support them.

As an example, a person could consider gay adoption wrong because they think that homosexuality is a severe mental illness and, as a consequence of being raised by mentally ill parents, the child would suffer harm.

This moral theory can be pitted against physical reality. In fact, we observe plenty of adoptions by gay couples in which the child develops normally. At the same time, current science excludes the possibility that homosexuality is a severe mental illness, through its own reality-anchored cycle of conjecture and criticism. The explanation which underlies the moral theory is at odds with reality: we need a better moral theory. Luckily, we can create one through reality-anchored conjecture and criticism.

Morality is not the only philosophical theory that can be pitted against physical reality. Deutsch considers Aesthetics as another important example.

Beauty, the common structure

Aesthetic pleasure can be derived by a multitude of means, such as listening to great composers, looking at great paintings, or studying mathematics and physics.

Deutsch believes that there is a common structure underlying what we consider Aesthetically pleasing. If we want to explain why we find some things Aesthetically pleasing but not others, that structure has to be part of our explanation. Thus, given the definition of objectivity presented at the beginning – “objective is that which figures in our best explanations of the world” – the structure has to be considered objective.

To explain why he guesses that a common structure exists, Deutsch uses an argument which I really can’t bring myself to like. He thinks that there’s no good explanation yet for why we like flowers, and guesses that the reason is that flowers evolved to produce objective beauty, and we have some of the knowledge to recognize it. In his argument, flowers and bees co-evolved so that the flowers could produce (and the bees recognize) a code to signal past the genetic rift between them – a code with universal reach. Humans are also separated by rifts, due to the fact that the content of each human mind is radically different from that of the others. For this reason human artists produced knowledge of the same nature as the one contained in the flowers’ genes, knowledge about what we call Beauty – a common structure which can inform codes of universal reach.

This is certainly a piece of intelligent speculation, but not very insightful to me. The flowers argument is a way for Deutsch to anchor Aesthetics to factual knowledge about the physical world, and detach it from parochial human experiences: if we are not the only thing in the universe able to produce and recognize beauty, then beauty cannot be cultural – flowers and bees don’t participate in our culture. Still, this argument is not really convincing to me. Is the rift between humans really as profound as that between flowers and bees? Humans can create profound connections even when they cannot rely on a common culture or a common language to communicate; is art then really answering a need for universal communication? And why should codes with universal reach be biased to go towards beauty anyway? Would humans who never had contact with art not consider flowers beautiful? Given that bees evolved to (imperfectly) recognize beauty, and we created knowledge to (imperfectly) produce beauty, does that mean that bees would be attracted to our art if we adapted it to fit their way of experiencing the world?

I think we are in need of a better argument if we want to introduce Aesthetics to the physical world.

Conclusions

I absolutely loved “The beginning of infinity”. Deutsch’s views on the nature of knowledge and the means to attain it resonate with my own – I have never been a fan of either positivism or relativism. I think he is spot-on when he writes about science and how it is about better and better explanations rather than just predictions.

I still have to think carefully about his views about morality, as they imply that different cultures applying conjecture and criticism to theories, and their moral theories in particular, would in the limit converge towards the same moral theory. Then the aim of moral philosophy becomes to try and approximate a singular and specific moral reality. Can such a thing really exist?

Heteroclinic Switching simulator and visualizer

2015-11-15T16:53:26+00:00

I’ll soon be pleased to give a talk about my work on heteroclinic dynamics to other heteroclinic people with heteroclinic interests at the Heteroclinic dynamics in neuroscience workshop this December in Nice. In this heteroclinic setting, I thought it would be worth it to have a simple interactive simulation to play with while I illustrate my findings, as I don’t want the talk to be boring, and nothing boosts understanding like a good visualization!

So I spent the last few days programming a simple visualizer in Python for the switching dynamics in the system described here, built upon the simulator I am using to produce the Terabyte of data needed for my next paper with Fabio Neves. Ta-da:

In the meantime, I thought it would be nice to put everything on GitHub, so maybe other heteroclinic people can give it a heteroclinic try. You never know!

So, here it is! At the moment what it can do is to simulate the system I linked above, but I programmed it to be easily generalizable if needed. That is, if I find that somebody is actually interested in a general version – at the moment it contains bits that make it special-purpose.

Otherwise, it will just be an exercise in reproducible and open research, which I am happy with anyway.

Putting some make up on my org-mode flashcards

2015-07-16T00:00:00+00:00

Lately I’ve been playing with org-drill, an extension to Emacs org-mode implementing a spaced repetition algorithm for flashcard drills. That is, org-drill lets you write flashcards in the form of org outlines plus special syntax, and have study sessions where the cards are presented to you using some special algorithms that should improve retention. The flashcards can have hidden text/images, present hints, have multiple faces, together with other useful settings.

Example org-drill flashcard.

One of the main points of org-drill is of course its integration in org-mode, letting you keep your flashcards together with your other plain-text notes. The flashcards are no different from any other org outline, apart from some special properties which org-drill saves in the :PROPERTIES: drawer, and some syntax to indicate parts of text that should be hidden in the flashcard (to prompt a recall on the side of the student), and hints (to help the recall).

I find org-drill very useful to have quick sessions to strengthen key facts from what I’m studying. But I wanted to integrate it better with my org-mode workflow. That is, I wanted the key facts that I keep in my notes to be treated as flashcards when needed, and as normal notes otherwise. The org-drill clozes (with my customized delimiters) look something like this:

The answer to the ultimate question of life, the universe and everything is !| 42 || six by nine |! , as computed by !| Deep Thought |!

If you are using org-drill and didn’t customize the delimiters, you’ll have [ and ] instead of !| and |! . You can probably already see that the !| fact || hint |! syntax can be a little annoying when reading the notes.

To solve this problem I’ve written some functions that get the special syntax out of the way when exporting the org-mode file, and when reading the notes as notes – not as flashcards. Elisp to the rescue!

Here’s a before-after comparison:

An example org file with and without the org-drill special syntax.

Much neater! As a plus, the special syntax is also hidden when viewing flash-cards, making them neater as well:

Org-drill flashcard with no delimiters in visible clozes.

And the export works just as well:

Snapshot of the exported pdf.

I’m pretty happy with the result. It is a small thing but it makes a huge difference in that I can just write my notes as I normally do, very quickly make them flashcards if I need to, and seamlessly go back and forth between the two representations depending on the way I’m studying.

I’d like to push this further, as this little trick only works when my notes are a sequence of distinct and well-organized little pieces of information. This happens to be the case for the probability notes I used as an example here, but it needn’t always be. Sometimes it makes more sense to maybe have long-running paragraphs in a more discursive form, depending on the type of information the notes are capturing. It would be great to use a special syntax to create flashcards from tokenized concepts inside the paragraph, and hide/show that as well when need be. Who knows, maybe this will be a side project I’ll actually manage to work on!

Anyway, here is the small section of code accomplishing what I’ve described (and also my first attempt to write non-trivial emacs lisp… I hope I’m doing it the right way). Note that this code doesn’t assume the use of my custom delimiters: the default delimiters are supported too. From the countless times I stole elisp snippets from someone’s blog, I know that there’s somebody out there that will appreciate this:

(require 'org-drill)

;; remove clozes when exporting ;;
(defun gsc/drill-compute-cloze-regexp ()
"Same regular expression as in org-drill-cloze-regexp,
but adding a group for the first delimiter, so that it can be
distinguished easily in a match."
  (concat "\\("
          (regexp-quote org-drill-left-cloze-delimiter)
          "\\)\\([[:cntrl:][:graph:][:space:]]+?\\)\\(\\|"
          (regexp-quote org-drill-hint-separator)
          ".+?\\)\\("
          (regexp-quote org-drill-right-cloze-delimiter)
          "\\)"))

(defun gsc/drill-cloze-removal (backend)
  "Remove drill clozes in the current buffer.
BACKEND is the export back-end being used, as a symbol."
     (while (re-search-forward (gsc/drill-compute-cloze-regexp) nil t)
       ;; (Copy-pasted this from org-drill-el)
       ;; Don't delete:
       ;; - org links, partly because they might contain inline
       ;;   images which we want to keep visible.
       ;; - LaTeX math fragments
       ;; - the contents of SRC blocks
      (unless (save-match-data
                (or (org-pos-in-regexp (match-beginning 0)
                                       org-bracket-link-regexp 1)
                    (org-in-src-block-p)
                    (org-inside-LaTeX-fragment-p)))       
        (replace-match "\\2" nil nil))))

(add-hook 'org-export-before-processing-hook 'gsc/drill-cloze-removal)

;; hide clozes in text ;;
(defvar gsc/drill-groups-to-hide '(1 3 4) 
"Group 1 and 4 are the left and right delimiters respectively,
group 3 is the cloze hint.")

(setplist 'gsc/inv-cloze '(invisible t))

(defun gsc/hide-clozes-groups ()
  (save-excursion
    (goto-char (point-min))
    (let ((cloze-regexp (gsc/drill-compute-cloze-regexp)))
      (while (re-search-forward cloze-regexp nil t)
        ;; `dolist' is built in, so this does not depend on the `cl' package
        (dolist (group gsc/drill-groups-to-hide)
          (overlay-put
           (make-overlay (match-beginning group) (match-end group))
           'category 'gsc/inv-cloze))))))

(defun gsc/show-clozes-all ()
  (save-excursion
    (goto-char (point-min)) 
    (while (re-search-forward (gsc/drill-compute-cloze-regexp) nil t)
      (remove-overlays 
       (match-beginning 1) (match-end 4) 'category 'gsc/inv-cloze))))

(defun gsc/hide-show-clozes (arg)
  "If called with no argument, hides delimiters and hints
for org-drill clozes.
If called with the C-u universal argument, it shows them."
  (interactive "p")
  ;; with (interactive "p"), no prefix gives 1 and a single C-u gives 4
  (cond
   ((= arg 1) (gsc/hide-clozes-groups))
   ((= arg 4) (gsc/show-clozes-all))))
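
If you don’t live in Emacs, the regular expression built by gsc/drill-compute-cloze-regexp can be approximated in Python. This is only a sketch: the helper name strip_clozes and its default arguments are made up for illustration, and `[\s\S]` stands in for the Elisp character classes, so edge cases may behave differently from the Elisp version.

```python
import re


def strip_clozes(text, left="!|", right="|!", hint_sep="||"):
    """Keep only the cloze answer, dropping delimiters and hints.

    Hypothetical helper mirroring the four groups of the Elisp regexp.
    """
    pattern = "".join([
        "(", re.escape(left), ")",          # group 1: left delimiter
        r"([\s\S]+?)",                      # group 2: the cloze text itself
        "(|", re.escape(hint_sep), ".+?)",  # group 3: optional hint
        "(", re.escape(right), ")",         # group 4: right delimiter
    ])
    return re.sub(pattern, r"\2", text)


note = "The answer is !|42||six by nine|!, as computed by !|Deep Thought|!"
print(strip_clozes(note))
```

Passing left="[" and right="]" should cover the default org-drill delimiters in the same way.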

Floating point representation visualizer

2015-03-30T16:45:09+00:00

Hi everyone,

I’ve been teaching an introductory course on the Theory of Computation at Plymouth University. The topic of my last lecture was the representation of real numbers in computers, and the unavoidable errors introduced by any choice of representation. Particularly, the lecture was focused on the floating point representation of real numbers.

To make the nature of the float representation more intuitive to the students, I programmed a little visualizer which plots the numbers that can be represented with a floating point representation, given the number of bits in the exponent and in the mantissa. Within the visualizer, it is possible to change the number of bits in the exponent and the mantissa and observe how the set of representable numbers changes.

It is also possible to sum two numbers, a and b, and observe how errors come to be and get the visual intuition of why they are unavoidable (hopefully).
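
To give a flavour of what the visualizer plots, here is a minimal sketch of a toy floating point format. It is not the visualizer’s code: the function names are made up, and I am assuming a simplified format with positive normalized values only (no zero, subnormals, infinities or NaN), an implicit leading 1, and a bias of 2^(exp_bits-1) - 1.

```python
def representable_values(exp_bits, man_bits):
    """Positive normalized values of a toy float format (assumed layout)."""
    bias = 2 ** (exp_bits - 1) - 1
    values = []
    for e in range(2 ** exp_bits):
        for m in range(2 ** man_bits):
            # implicit leading 1, fractional mantissa, biased exponent
            values.append((1 + m / 2 ** man_bits) * 2.0 ** (e - bias))
    return sorted(values)


def nearest_representable(x, values):
    """Round x to the closest value the toy format can store."""
    return min(values, key=lambda v: abs(v - x))


values = representable_values(exp_bits=2, man_bits=2)
a, b = 3.5, 0.625                 # both are representable in this format
exact = a + b                     # 4.125 falls between 4.0 and 5.0
stored = nearest_representable(exact, values)
print(len(values), exact, stored, abs(exact - stored))
```

Both operands can be stored exactly, but their sum cannot, so it has to be rounded to a neighbouring value. The gaps between neighbours also grow with the exponent, which is what the plot makes visible.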

– Here, the ipython notebook I used during my lecture.

– Here, the zipped executable for the visualizer.

– Here the python code (you will need python 2.7, with matplotlib and numpy libraries).

At the moment of writing the visualizer is still lacking some features. In particular, I’d like to make the representation more IEEE-754 compliant by adding support for signed 0’s, signed infinity, and NaN values. I’ll do that soon, but for the moment, I think the visualizer is already useful as it is.

Bat Navigator prototype

2014-07-18T16:48:00+00:00

Hi everyone,

Since the last post, I have added a third ultrasonic sensor to the project.
The purpose of this sensor is to detect obstacles like tables or chairs that are too low to be detected by the other two sensors. If an obstacle is detected, the pitch of the two piezo transducers goes down. At the moment I have implemented it as an all-or-none detection, so the user doesn’t actually know how high the obstacle is, just that it is there.

I want to change it later so that the new central sensor has its own beeping using the two piezos, so I won’t post the code here, as it is just a temporary fix. The reason why I went with a temporary fix is that I actually couldn’t wait to build and test the glasses, see if they actually worked, and see how well they worked.

Today I built the glasses.

So, here I present you the Bat Navigator version 0.1!

Some close-ups of the glasses!

Nice, right? Me and my friend Valerio tested them out today.

Our testing procedure consisted of the following steps:

One of us wears the glasses, and closes his eyes.
The other one spins him around a few times.
The one who wears the glasses tries to walk around the building without hurting himself.
The other one makes sure that he succeeds in not dying falling down some stairs
A third person plays pranks on the one with the eyes closed (essential part of testing)

The Bat Navigator actually works much better than I expected: I was able to navigate the building, even if at a really slow pace. I got the hang of it almost immediately, recognizing walls and their orientation, corridors, and tables. I’m pretty confident that if I keep working on it, it could actually become a feasible way to navigate indoor environments without using eyesight.

It will be a while before I update the blog again, as the next steps will involve either adding another couple of sensors to the side and figuring out a way to convey that information without producing information overload for the user, or building a second prototype where everything is mounted on the glasses (so something that can actually be used normally). Both options require a good amount of work, and I won’t have that kind of time in the near future.
Maybe I will continue working on this from the second half of August, but, until then, I’ll have to focus on my central projects.

In the meantime, I want to thank Ricardo for giving me precious advice on how to go about building this prototype, and Valerio for his many tips, ideas and help in testing the Bat Navigator.

Bye bye now, I’ll keep you posted! (In a while)

Let's dance to the sound of piezos

2014-07-14T15:43:00+00:00

Hi everyone.

As I wrote in the last post, I found out that vibration motors do not have much “expressive power”, not as much as I thought anyway. For this reason, I am switching to piezoelectric transducers, which offer a greater range of expression, are easy to control and wire up, and consume almost nothing.

At first, I wanted the piezos to produce a fixed pitch, and vary the volume of the sound as a function of the distance of the obstacles detected by the ultrasound sensors. Turns out, this is not as easy to do with a piezoelectric transducer as it would be with a buzzer.

Then I tried to vary the pitch of the produced sounds as a function of the distance of the obstacles, but I didn’t like the result. I thought it would be quite annoying to listen to this continuous change of pitch while using the glasses for some time. The good thing is that it worked quite well as a theremin!

(Sorry about the noise, but piezo transducers produce very quiet sounds)

In the end, I decided to keep volume and pitch fixed, and vary the beeping rhythm, like in parking sensors:

The circuit here is quite simple – the mess you can see in the video is actually just because I was too lazy to tidy everything up. Each ultrasonic sensor is connected to Ground, to the 5V pin, and to a single digital pin (one for each sensor). In fact, with the NewPing library, you can use a single pin to ping the sensor, and to receive the reading.
The piezo transducers were just connected each to a digital pin and to ground. I think I will add some resistance to adjust the volume later on, and maybe a linear potentiometer to control the volume manually.

Now I can guide you through the code I’ve used here.
I’ll explain the key functions and then show the complete code. I’m assuming you already know the basics of C coding.

void loop(){

  curr_millis=millis(); //take time passed from arduino start

  // if a millisecond has passed
  if (curr_millis != prev_millis){
    debugprint("starting.");
    debugprintln(curr_millis);

    debugprint("\n");
    // ping ultrasound sensors and retrieve distances
    ping_us_sensors(curr_millis);
    l_read_array[reading_n] = distances[0]; //reading_n is incremented only when both sensors have been pinged
    r_read_array[reading_n] = distances[1];

    debugprint("now returning to loop \n");
    debugprint("distances are: \n");
    debugprintln(distances[0]);
    debugprintln(distances[1]);

    // average last READINGS readings
    int l_sum = 0;
    int r_sum = 0;
    for (int i=0; i<READINGS; i++){
      l_sum += l_read_array[i];
      r_sum += r_read_array[i];
    }
    float l_ave_distance = l_sum/float(READINGS) - MIN_DISTANCE;
    float r_ave_distance = r_sum/float(READINGS) - MIN_DISTANCE;
    if(l_ave_distance<0) l_ave_distance=0; // make sure not to have negative distances
    if(r_ave_distance<0) r_ave_distance=0;

    debugprint("and average distances:\n");
    debugprintln(r_ave_distance);
    debugprintln(l_ave_distance);
    debugprint("\n");

    // make piezos beep depending on the distances of the obstacles
    beep_piezos(curr_millis, l_ave_distance, r_ave_distance);
    prev_millis = curr_millis;
  }
}

This is the function which is called continuously while the Arduino is on. What it does is:
– Ping the US sensors.
– Average out the last READINGS readings, where READINGS is a number defined before.
– Shift the distance down by MIN_DISTANCE, so that the piezos already beep at the highest speed when an obstacle is at MIN_DISTANCE.
– Make the piezos beep.
The first and last actions in this list are actually carried out by two functions which are external to the loop. Let’s have a look at the function which pings the sensors.

// ping the us sensors and retrieve the distances from obstacles
void ping_us_sensors(long curr_millis){
  
  debugprint("Entered ping_us_sensor fn\n");
  debugprint("time passed from last ping:");
  debugprintln(curr_millis - last_ping_time);
  
  // time in microseconds it takes for the ultrasonic sound wave
  // to travel from the ultrasonic sensor, hit an obstacle, and return
  unsigned int uS; 
  
  // some time must pass between a ping and the next one
  if(curr_millis - last_ping_time >= PING_INTERVAL){
    
    // if reading_n is equal to READINGS -> reading_n = 0
    // used to populate reading arrays 
    reading_n%=READINGS;
    
    // used to ping sequentially all US sensors (in this case, 2)
    pinger_id++;
    pinger_id %= US_NUM;
    
    debugprint("Pinging sensor number \n");
    debugprintln(pinger_id);
    
    uS = sonar[pinger_id].ping();
    last_ping_time = curr_millis;
    
    // compute distance
    distances[pinger_id] = uS/US_ROUNDTRIP_CM;
    // when the distance is too much, the sensor sends a 0
    // better to just set that to MAX_DISTANCE
    if (distances[pinger_id]==0) distances[pinger_id]=MAX_DISTANCE;
    
    debugprint("Distance: \n");
    debugprintln(distances[pinger_id]);
    
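    // advance the write index once both sensors have been pinged; wrapping here
    // keeps reading_n a valid array index whenever loop() uses it
    if(pinger_id == 0) reading_n = (reading_n + 1) % READINGS;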
  }
  
    debugprint("\n");
}

This function pings the sensors one at a time, retrieving the distance for each one. Before pinging, it makes sure that enough time (PING_INTERVAL) has passed since the last ping.
Also, once both US sensors have been pinged, it updates the reading_n variable. That variable is used in the loop function to populate the reading arrays.
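
The resulting ping timing can be sketched in a few lines of Python. This is a simplified simulation, not the Arduino code: the function name ping_schedule is made up, one loop iteration stands for one millisecond, and the time taken by the ping itself is ignored.

```python
def ping_schedule(total_ms, ping_interval=40, n_sensors=2):
    """Return (time_ms, sensor_id) pairs for round-robin pinging."""
    last_ping_time = 0
    pinger_id = 0
    pings = []
    for now in range(1, total_ms + 1):        # one iteration per millisecond
        if now - last_ping_time >= ping_interval:
            pinger_id = (pinger_id + 1) % n_sensors   # move to the next sensor
            pings.append((now, pinger_id))
            last_ping_time = now
    return pings


print(ping_schedule(120))
```

With two sensors and a 40 ms interval, each sensor is refreshed every 80 ms under these assumptions.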

Now for the function that makes the piezos beep.

// This function makes the piezo beep based on the distance of the obstacles
void beep_piezos(long curr_millis, long l_ave_distance, long r_ave_distance) {
  
  int r_interval, l_interval;
  // from the distance, assign beep intervals for the piezo transducers
  l_interval = int(l_ave_distance*DIST_TO_BEEP_MULTIPLIER);
  r_interval = int(r_ave_distance*DIST_TO_BEEP_MULTIPLIER);
  
  debugprint("entering beep_piezos fn.\n");
  debugprint("time passed from last right beep:");
  debugprintln(curr_millis - r_last_beep);
  debugprint("time passed from last left beep:");
  debugprintln(curr_millis - l_last_beep);
  
  // if enough time has passed since the last beep of the right piezo,
  // a new beep is started in the right piezo
  if(curr_millis - r_last_beep > r_interval){
    debugprint("right beep sent\n");
    r_last_beep = curr_millis;
  }
  // controls the actual wave that constitutes the beep
  if(curr_millis - r_last_beep < BEEP_DURATION){
    if(r_phase == R_BEEP_LOW_RATIO){
      digitalWrite(R_PIEZO, HIGH);
      r_phase=0;
    }
    else{
      digitalWrite(R_PIEZO,LOW);
      r_phase++;
    }
  }
  
  // if enough time has passed since the last beep of the left piezo,
  // a new beep is started in the left piezo
  if(curr_millis - l_last_beep > l_interval){
    debugprint("left beep sent\n");
    l_last_beep = curr_millis;
  }
  // controls the actual wave that constitutes the beep
  if(curr_millis - l_last_beep < BEEP_DURATION){
    if(l_phase == L_BEEP_LOW_RATIO){
      digitalWrite(L_PIEZO, HIGH);
      l_phase=0;
    }
    else{
      digitalWrite(L_PIEZO,LOW);
      l_phase++;
    }
  }
  
  debugprint("\n");    
}

This function makes the piezo beep more or less frequently depending on the distance of the left and right obstacles.
When the right time interval has passed, the function starts a beep. A beep here is a square wave sent to a piezo for a certain BEEP_DURATION time.
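
As a rough illustration of how distance turns into beeping rhythm, here is a Python sketch of the arithmetic. The function name beep_interval_ms is made up, and the defaults are taken from the MIN_DISTANCE and DIST_TO_BEEP_MULTIPLIER defines in the complete code further down.

```python
def beep_interval_ms(readings, min_distance=20, multiplier=5):
    """Milliseconds between two beep starts for a list of readings in cm."""
    average = sum(readings) / len(readings) - min_distance
    average = max(average, 0.0)       # no negative distances
    # the Arduino function takes the average as a long, which drops the
    # fractional part before the multiplication
    return int(average) * multiplier


print(beep_interval_ms([100, 120, 80, 100]))
print(beep_interval_ms([10, 10, 10, 10]))
```

An obstacle at or below MIN_DISTANCE gives an interval of 0, so a new beep starts as soon as it can.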

The pitch of the beep sound is controlled by the R and L_BEEP_LOW_RATIO. Given that this function is called by the loop every 1 ms, if for example L_BEEP_LOW_RATIO = 2, then a series of HIGH – LOW – LOW is sent, where each signal lasts a millisecond.
The frequency of the resulting square wave is 1/3 of 1000 Hz (again, because the function is called every millisecond, 1000 times a second).
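
The same phase logic can be replayed in Python to check the frequency. This is a sketch with made-up function names, assuming the function really is called exactly once per millisecond.

```python
def square_wave(low_ratio, n_ticks):
    """Replay the phase counter: 1 stands for HIGH, 0 for LOW."""
    phase = 0
    levels = []
    for _ in range(n_ticks):
        if phase == low_ratio:
            levels.append(1)      # one HIGH tick, then restart the count
            phase = 0
        else:
            levels.append(0)      # LOW tick
            phase += 1
    return levels


def beep_frequency_hz(low_ratio, tick_rate_hz=1000):
    """One period lasts low_ratio LOW ticks plus one HIGH tick."""
    return tick_rate_hz / (low_ratio + 1)


print(square_wave(2, 6), beep_frequency_hz(2))
print(square_wave(1, 4), beep_frequency_hz(1))
```

Starting from phase 0 the cycle comes out as LOW, LOW, HIGH, which is the same wave as HIGH, LOW, LOW shifted in time.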

Finally, here is the complete code. Hopefully it should now be easy enough to read through.

#include <NewPing.h>

// pins
#define R_US 11 // right ultrasonic sensor pin
#define L_US 8  // left ultrasonic sensor pin
#define R_PIEZO 3 // right piezo pin
#define L_PIEZO 5 // left piezo pin

// parameters
#define US_NUM 2 // total number of US sensors
#define READINGS 4 // US readings to average
#define MIN_DISTANCE 20 // distance in cm which make the piezo beep like crazy
#define MAX_DISTANCE 300 // maximum distance for a valid reading from the US sensors
#define PING_INTERVAL 40 // interval between pings
#define BEEP_DURATION 30 // how much a beep lasts in ms
#define DIST_TO_BEEP_MULTIPLIER 5 // to adjust the beeping relative to distance
#define L_BEEP_LOW_RATIO 2 // this regulates the frequency of the note for the left piezo
#define R_BEEP_LOW_RATIO 1 //

// the next lines turn debugging on or off
//#define DEBUG
#ifdef DEBUG
  #define debugbegin(x) Serial.begin(x)
  #define debugprint(x) Serial.print(x)
  #define debugprintln(x) Serial.println(x)
#else
  #define debugbegin(x)
  #define debugprint(x)
  #define debugprintln(x)
#endif /*DEBUG*/


NewPing sonar[US_NUM] = {
  NewPing(R_US, R_US, MAX_DISTANCE), // NewPing setup.
  NewPing(L_US, L_US, MAX_DISTANCE) // NewPing setup.
};

// variables used to keep track of time
long curr_millis=0;
long prev_millis=0; 

// variables used to keep track of the phase of the square wave for each piezo
int r_phase = 0;
int l_phase = 0;

// keep track of the last time the piezos beeped
long r_last_beep = 0;
long l_last_beep = 0;


int reading_n = 0; // used to update l_read_array/right arrays
int pinger_id = 0; // used to alternate pinging between sensors
int l_read_array[READINGS]; // used to store and average readings
int r_read_array[READINGS]; 
long last_ping_time = 0; // last time a US was pinged

int distances[US_NUM]; // stores the distances retrieved by the US sensors

// This function makes the piezo beep based on the distance of the obstacles
void beep_piezos(long curr_millis, long l_ave_distance, long r_ave_distance) {
  
  int r_interval, l_interval;
  // from the distance, assign beep intervals for the piezo transducers
  l_interval = int(l_ave_distance*DIST_TO_BEEP_MULTIPLIER);
  r_interval = int(r_ave_distance*DIST_TO_BEEP_MULTIPLIER);
  
  debugprint("entering beep_piezos fn.\n");
  debugprint("time passed from last right beep:");
  debugprintln(curr_millis - r_last_beep);
  debugprint("time passed from last left beep:");
  debugprintln(curr_millis - l_last_beep);
  
  // if enough time has passed since the last beep of the right piezo,
  // a new beep is started in the right piezo
  if(curr_millis - r_last_beep > r_interval){
    debugprint("right beep sent\n");
    r_last_beep = curr_millis;
  }
  // controls the actual wave that constitutes the beep
  if(curr_millis - r_last_beep < BEEP_DURATION){
    if(r_phase == R_BEEP_LOW_RATIO){
      digitalWrite(R_PIEZO, HIGH);
      r_phase=0;
    }
    else{
      digitalWrite(R_PIEZO,LOW);
      r_phase++;
    }
  }
  
  // if enough time has passed since the last beep of the left piezo,
  // a new beep is started in the left piezo
  if(curr_millis - l_last_beep > l_interval){
    debugprint("left beep sent\n");
    l_last_beep = curr_millis;
  }
  // controls the actual wave that constitutes the beep
  if(curr_millis - l_last_beep < BEEP_DURATION){
    if(l_phase == L_BEEP_LOW_RATIO){
      digitalWrite(L_PIEZO, HIGH);
      l_phase=0;
    }
    else{
      digitalWrite(L_PIEZO,LOW);
      l_phase++;
    }
  }
  
  debugprint("\n");
    
}

// ping the us sensors and retrieve the distances from obstacles
void ping_us_sensors(long curr_millis){
  
  debugprint("Entered ping_us_sensor fn\n");
  debugprint("time passed from last ping:");
  debugprintln(curr_millis - last_ping_time);
  
  // time in microseconds it takes for the ultrasonic sound wave
  // to travel from the ultrasonic sensor, hit an obstacle, and return
  unsigned int uS; 
  
  // some time must pass between a ping and the next one
  if(curr_millis - last_ping_time >= PING_INTERVAL){
    reading_n%=READINGS; //used to populate the distance arrays
    
    // used to ping sequentially all US sensors (in this case, 2)
    pinger_id++;
    pinger_id %= US_NUM;
    
    debugprint("Pinging sensor number \n");
    debugprintln(pinger_id);
    
    uS = sonar[pinger_id].ping();
    last_ping_time = curr_millis;
    
    // compute distance
    distances[pinger_id] = uS/US_ROUNDTRIP_CM;
    // when the distance is too much, the sensor sends a 0
    // better to just set that to MAX_DISTANCE
    if (distances[pinger_id]==0) distances[pinger_id]=MAX_DISTANCE;
    
    debugprint("Distance: \n");
    debugprintln(distances[pinger_id]);
    
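    // advance the write index once both sensors have been pinged; wrapping here
    // keeps reading_n a valid array index whenever loop() uses it
    if(pinger_id == 0) reading_n = (reading_n + 1) % READINGS;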
  }
  
    debugprint("\n");
}
  

void setup() {
  
  debugbegin(115200);
  
  // set piezo pins to OUTPUT
  pinMode(R_PIEZO,OUTPUT);
  pinMode(L_PIEZO,OUTPUT);
  
  //initializing the arrays
  for (int i=0; i<READINGS; i++){
    l_read_array[i] = 0;
    r_read_array[i] = 0;
  }
}

void loop(){
  
  curr_millis=millis(); //take time passed from arduino start
    
  // if a millisecond has passed
  if (curr_millis != prev_millis){
    debugprint("starting.");
    debugprintln(curr_millis);

    debugprint("\n");    
    // ping ultrasound sensors and retrieve distances
    ping_us_sensors(curr_millis);
    l_read_array[reading_n] = distances[0]; //reading_n is incremented only when both sensors have been pinged 
    r_read_array[reading_n] = distances[1];
    
    debugprint("now returning to loop \n");
    debugprint("distances are: \n");
    debugprintln(distances[0]);
    debugprintln(distances[1]);
    
    
    // average last READINGS readings
    int l_sum = 0;
    int r_sum = 0;
    for (int i=0; i<READINGS; i++){
      l_sum += l_read_array[i];
      r_sum += r_read_array[i];
    }
    float l_ave_distance = l_sum/float(READINGS) - MIN_DISTANCE;
    float r_ave_distance = r_sum/float(READINGS) - MIN_DISTANCE;
    if(l_ave_distance<0) l_ave_distance=0; // make sure not to have negative distances
    if(r_ave_distance<0) r_ave_distance=0;
    
    debugprint("and average distances:\n");
    debugprintln(r_ave_distance);
    debugprintln(l_ave_distance);
    debugprint("\n");
      
    // make piezos beep depending on the distances of the obstacles
    beep_piezos(curr_millis, l_ave_distance, r_ave_distance);
    prev_millis = curr_millis;
  }
}

OK, that’s it for this post, hope you enjoyed it! For the next post, I will probably implement a third sensor to sense low-frontal objects (but I want to use only two piezo transducers).

I’ll keep you posted!

Adding the ultrasonic sensor

2014-06-29T11:37:00+00:00

Hi everyone.

In this post I’ll show the new circuit from the addition of the ultrasonic sensor, the code I uploaded to the Arduino to drive it, and a video to demonstrate its functioning.

Here is the circuit.

Nothing much going on here, the HC-SR04 needs 5V, so it's connected to the 5V pin, and obviously to ground. The Trig and Echo pins of the sensor are connected directly to two of the digital pins of the Arduino, D11 and D12.

An ultrasonic sensor works like this:

1 – You send a short pulse to trigger the vibration of the transducer. Like ringing a bell.
2 – The sound waves produced by the transducer travel, hit something and are reflected, travelling back to the sensor, and making the transducer vibrate again. For this sensor there are two separate transducers, one for production, one for reception.
3 – The transducer, because of the vibration, produces a current, which is a function of the distance travelled by the sound waves.
4 – Read out that current and, knowing the function f(current)=distance, you have your distance!

The sensor I’m using is quite smart, and instead of returning a continuous current to be read out and processed – the function f(current) = distance is easily nonlinear – it returns a digital signal, where the duration of the high value is linearly proportional to the distance of the sensed obstacle. Which makes things much easier.
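
The linear relation can be written down directly. The sketch below is an approximation with a made-up function name: it assumes that the high pulse lasts as long as the round trip of the sound wave, and that sound travels at about 343 m/s (air at room temperature).

```python
def echo_to_cm(echo_us, speed_of_sound_m_s=343.0):
    """Distance in cm from the duration of the echo pulse in microseconds."""
    cm_per_us = speed_of_sound_m_s * 100 / 1_000_000
    # the sound travels to the obstacle and back, so halve the path
    return echo_us * cm_per_us / 2


print(echo_to_cm(5831))   # roughly one metre
```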

To make things even easier, there’s a nice ready-to-use ultrasonic sensor library for Arduino which supports the HC-SR04. Cool!

Here’s a video demonstration of the project at this point.

And here’s the code I used, adapted from the basic example of the New Ping library.

// ------------------------------------------------------------------------
// This code reads the ultrasonic sensor about 20 times per second, and makes the motor vibrate 
// in inverse proportion to the distance (the less the distance, the more the vibration).
// Also, to take out some noise, the last six readings are averaged.
// The code is adapted from the New Ping library basic example.
// ------------------------------------------------------------------------

#include <NewPing.h>

#define TRIGGER_PIN  12  // Arduino pin tied to trigger pin on the ultrasonic sensor.
#define ECHO_PIN     11  // Arduino pin tied to echo pin on the ultrasonic sensor.
#define MAX_DISTANCE 200 // Maximum distance we want to ping for (in centimeters). Maximum sensor distance is rated at 400-500cm.

#define MOTOR_PIN 3 // Arduino pin tied to motor control.

#define N_READINGS 6 // Number of readings to average to get rid of some noise.

NewPing sonar(TRIGGER_PIN, ECHO_PIN, MAX_DISTANCE); // NewPing setup of pins and maximum distance.

int read_array[N_READINGS]; //Array that stores the last N_READINGS to average

int read_count = 0;
float ave_distance = 0;

void setup() {
  for (int i=0; i<N_READINGS; i++){ // Initialize readings to 0
    read_array[i] = 0;
  }
}

void loop() {
  
  if(read_count>N_READINGS-1) read_count=0;
  
  delay(50); // Wait 50ms between pings (about 20 pings/sec). 29ms should be the shortest delay between pings.
  unsigned int uS = sonar.ping(); // Send ping, get ping time in microseconds (uS).
  
  int distance = uS / US_ROUNDTRIP_CM; // Compute distance from known constant (defined in New Ping library).
  
  read_array[read_count] = distance; // Populate readings array.
  
  int sum = 0; // needed for average
  for (int i=0; i<N_READINGS; i++){
    sum += read_array[i];
  }
  ave_distance = sum/float(N_READINGS); //average of last n readings
  
  
  if (ave_distance < 100 && distance!=0){ // 0 corresponds also to distance over sensor limit
    analogWrite(MOTOR_PIN, 255-int(ave_distance*1.6)); //1.6 is found empirically, by experimenting
  }
  else{
    analogWrite(MOTOR_PIN, 0); // needed, otherwise the pin will keep sending the last value sent
  }
  
  read_count++;
}
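As an aside, the smoothing and the distance-to-vibration mapping in the sketch above can be pulled out into plain C++ and checked on a computer, without any hardware. The names below (`RunningAverage`, `motor_duty`) are made up for illustration, and this is only a sketch of the same arithmetic. It leaves out the extra `distance != 0` check that the Arduino code uses for out-of-range readings.

```cpp
// Hypothetical helpers mirroring the arithmetic of the Arduino sketch.
// Names are invented for illustration; nothing here touches hardware.

const int N_READINGS = 6;  // same window size as the sketch

// Fixed-size ring buffer holding the last N_READINGS distances (cm).
struct RunningAverage {
  int readings[N_READINGS] = {0};  // starts at 0, like setup() does
  int next = 0;                    // slot the next reading goes into

  // Store a reading, overwriting the oldest, and return the new mean.
  float add(int distance_cm) {
    readings[next] = distance_cm;
    next = (next + 1) % N_READINGS;  // wrap around instead of an if
    int sum = 0;
    for (int i = 0; i < N_READINGS; i++) sum += readings[i];
    return sum / float(N_READINGS);
  }
};

// Map an averaged distance to a PWM duty value (0-255):
// closer obstacle -> stronger vibration, off at 100 cm and beyond.
int motor_duty(float ave_distance_cm) {
  if (ave_distance_cm >= 100) return 0;
  return 255 - int(ave_distance_cm * 1.6);
}
```

Because the buffer starts full of zeros, the first few averages are pulled towards 0, so the motor briefly vibrates harder than it should right after power-up. That is the price of keeping the initialization this simple.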

The Arduino sketch is pretty straightforward, so there isn’t much to explain. The 1.6 factor I used to scale the vibration of the motor was obtained experimentally, by trial and error. It is low enough that the motor still vibrates when the distance is at the maximum: just below the 100 cm cutoff, the value written to the motor pin is still about 255 - 160 = 95. If a higher factor were used, the motor wouldn’t receive enough current to start vibrating at distances near the limit.

I’m starting to think that maybe having a vibration motor is not that great an idea. Its expressive range is very limited: as you can hear in the video, the vibration doesn’t seem to change much from the nearest distance to the farthest. It also consumes a great deal of power, which I could use to add more sensors to the glasses.

To explore other options, I bought two piezoelectric transducers, which should need very little power and allow much more control over their vibration.

For the next post, I’ll experiment with the piezoelectric transducers. I think it won’t be a circuit post, as a piezoelectric transducer only needs a resistor and nothing more, as far as I understand. Instead, given that there are more things you can do with a piezoelectric transducer, like controlling the amplitude and frequency of vibration, more attention will be given to the code. I’ll keep you posted!