Growing Words on Tries

Give me a place to stand and I will move the Earth.

Thus says Archimedes. Leverage. Brute force can only get you somewhere but true power is in leverage. But leverage does not come easily. It involves a lot of thinking and training. In this post, we are going to explore how we use leverage in order to solve a common programming problem.

Suppose I have an array of words or phrases (or String, as used in programming) that I want to search. The search is incremental in the sense that whenever I press the first few letters of the word, I am given an incremental search result. These first few letters can occur in any part of the string. I want to develop an algorithm to to search the array of strings in an efficient manner.

Let’s say we have an array of words: “far”, “for”, “force”, “forced”, “forces”, “force field”. Suppose I want to search for the word “force”. What I do is type in sequence “f”, “o”, “r”,… When I type “f”, the algorithm will then search for all strings containing “f”.

trie_blog2

When I type “o”, it will search for all strings containing “fo”. When I type “r”, it will search for all strings containing “for”, and so on.

trie_blog2.2

The first thing that comes to mind is to match each string in the array for each keystroke that we type. All those that match will then be collected and will be used as the new array where we search the new substring formed when the next letter is pressed. If there are a million strings in the array, we will have to search the entire array when the first string is pressed. Hopefully, we will get a small subset that will help us find quickly what we are looking for. This is a brute force approach because it will have to compare a million times just to come up with the set of matches.

Leverage

Nature has always be a source of innovations for science and engineering. We turn to the birds to learn how planes can fly. We turn to the fishes to help our submarines dive. We turn to ants to help us find the shortest path to food. For this problem, we turn to trees.

Imagine representing the array of words as leaves of a tree as shown in the diagram below.

TrieDiagram-v2

In the diagram, we have 6 words, namely: “far”, “for”, “force”, “forced”, “forces”, “force field” that are arranged in a tree structure. Each non-leaf node contains a letter and the leaf nodes contain the words. Typing the first letter “f” already focuses on all words containing “f” thereby preventing the entire array to be traversed. After forming the sequence “for”, we immediately focus on the words “force”, “forced”, “forces”, and “force field”. The more we type letters, the more focused we are on the item we are searching. This is a very efficient search structure. Thanks to the trees for the inspiration.
http://ajax.googleapis.com/ajax/libs/jquery/1.10.2/jquery.min.js
https://ajax.googleapis.com/ajax/libs/angularjs/1.0.8/angular.min.js
http://bobbycorpus.net/wp-content/uploads/2013/09/trie-min2.js

To appreciate this structure, you can find below an example that uses this structure to power the word search. In the text box below, type the word “fr”, without quotes, and you should see a filtering of the words.

Enter Search String

  • {{ result }}

This is an example of what is known as a typeahead and is very convenient when you are searching for a specific word or phrase in a long list. Thanks also to the javascript technology, we are able to immediately see our algorithm in action.