
Data Structures
Trie
Available Words Using A Trie Datastructure
In this first post we will be building a project using a Trie data structure. This is an exercise from the book called "A Common-Sense Guide to Data Structures and Algorithms" by Jay Wengrow. The original implementation was written in Ruby but we will rewrite it using JavaScript. The exercise in the book built this project to demonstrate an autocomplete feature. I will rework this slightly and it will be used to show words in a library as you type.
A Trie (pronounced like try) is a tree-like data structure usually used for storing a dynamic set of strings. Before we get started building I will briefly explain the Trie data structure
.png)
In the image above there is a root node. This root node will contain a property called children (these are identified by the arrows). These children point to the next node and that node contains its own children property that points to the next node, you continue that pattern until you hit an *, which indicates the end of the word.
Walking through the Trie from the top root node. The root node contains R and C in its children property. We will run down the left side that starts with R. This child node has E in its children which means so far we have "RE". We haven't hit an * yet which indicates the end of the word so we continue. E points to a node that has D and A as its children so we have "RED" and "REA". We check each nodes children, the D points to an * indicating the end of the word "RED" but A points to a another Node with D in its children so we continue down the right side. D points to an * signaling the end of the word. We also have "CAT" on the other side and it follows the same rules as the left side.
Creating the Trie Class
class TrieNode {
constructor() {
this.children = {};
}
}
class Trie {
constructor() {
this.root = new TrieNode();
}
}
In this code we are just creating the Trie Class. Its fairly straight forward, We just create a TrieNode class and initialize it with a children property that is an empty object then create the actual trie with the root set to a new node.
Our First Function: Insert
class Trie {
constructor() {
this.root = new TrieNode();
}
insert(word) {
let currentNode = this.root;
for (let char of word) {
if (!currentNode.children[char]) {
currentNode.children[char] = new TrieNode();
}
currentNode = currentNode.children[char];
}
currentNode.children["*"] = new TrieNode();
}
}
Here we are creating the insert method on the Trie class. This is how you actually add words to the Trie, which acts as a sort of dictionary. This takes in a string (word) as a variable and initializes the 'currentNode' to the root. We iterate over each character in the word and check if the current nodes 'children' property contains this node. Remember from the chart above that "READ" and "RED" shared the letter R. If the current letter already exists we do not duplicate it. If the letter does not currently exist we add it to the children object. This makes sense because there is no need to have 2 letter R's one for Read and one for Red. We can just have one letter R that they both use.
Regardless if the character exists we increment the current node to the child so we can repeat the process again with the next letter. After the loop has finished iterating through each character in the string the signal is added to the end of the word. This is so that when we get to the end of the string we know we are at the end of the word so we add the *. We don't actually check to see if the word is valid so if you insert a fake or misspelled word to the trie it will still be added and the end indicated with an *. Thats all you have to do to add a string to the Trie data structure.
Search Method
class Trie {
constructor() {
this.root = new TrieNode();
}
search(word) {
let currentNode = this.root;
for (let char of word) {
if (currentNode.children[char]) {
currentNode = currentNode.children[char];
} else {
return false;
}
}
return currentNode;
}
}
The next method we will add is the search method. This takes in a string and sets the current node to the root. The for loop iterates over every letter in the string that was passed in. We then check if that character is located in the nodes children. If we find the letter we set the current node to the child node if we dont find the letter that means the word does not exists and we return false. If we make it all the way through the loop that means we have found the word and instead of returning true we return the current node. We do this because we will use the current node in a another method later on.
Final Methods: Find and Collect
class Trie {
constructor() {
this.root = new TrieNode();
}
collectAllWords(root, word = "", words = []) {
for (let char in root.children) {
if (char === "*") {
words.push(word);
} else {
let newWord = word + char;
this.collectAllWords(root.children[char], newWord, words);
}
}
return words;
}
findWords(prefix) {
let currentNode = this.search(prefix);
if (!currentNode) {
return false;
}
return this.collectAllWords(currentNode);
}
}
We only have 2 methods left to build out until we have our working Trie class. We will start with our collect method called CollectAllWords. We will use recursion in this method so we initilize the method with the current node (root), an empty string for the current word, and lastly an array so we can store all the words we find. This collectAllWords method takes in the rootnode and iterates over all the children in that node. If we were to just pass in the root node this function would find and collect all words in the entire Trie. This is actually where we will be using the 'currentNode' returned in our search function, which is the last node in the string we passed in. For example in our root node in our first diagram loop would iterate over "R" and "C". The function checks if the current character is the end of the word if its not we add the current character to our word string. We then pass all this info into a recursive call and do it all over again until we hit the *. At the end of the word we push our current word into the words array.
We are basically building a string here one letter at a time. In the Word "CAT" we pass in the root node and check its children. We are ignoring the R child node for now but we get to the letter "C" in the loop. We add that to the string and pass the C node (with its children), the newWord we have so far wich newWord="C" and our empty array of words. Then we do this recursively over again. We look at the children, check if its the end of the word, if not we add the character to the current word string which is "C". We Do our recursive call again with the child node as the root and "CA" as the word. We do this again as we build the word. Check its children, add that letter to the the string and increment the root node. As you can see we are building the string and incrementing the current node to the child node as we traverse the Trie. We keep doing this until we reach an *. at that point we take what we have saved in the word string and push it onto the array. After all our recursive calls have finished we return the array of words. If you are confused by the recursion part I recommend going online and doing a bit of studying on recursion in general. It can be a bit confusing at first but its essentially another way to do a loop. We have a function that repeats itself until it meets a specific case and then stops.
Our final method we call our search function with the prefix, this prefix is the letters we are typing until we type out the actual word. If we get false back from the search function that means it does not exist in our Trie and we end return false. If we do find something we know it exists so we call our collect all Words method with what we return from search. If you remember Search returns the last node in the string so this is our "root" node in our collect all words function. So in our diagram example if the prefix is "RE" we start the collect all words method with the root node of "E" and then we start iterating over its children "D" and "A." This will return all words in our Trie that start with "RE"
Now to actually use this thing
on CodePen.
I have an event listener on the input that each time you type in the input field it runs our find words method with the input value. You get back an array with all the words and create an unordered list with the values and then display them in a basic way. Tab over to the javascript section to see exactly how I put this together.
To use the Trie class you can add the words to the trie by typing in a word into the input and then clicking the button. After you add the word try searching for it by typing into the input field.
I hope you enjoyed learning about this as much as I did. Feel free to take the code and play around with it however you want. I will also link to the GitHub repo here: https://github.com/nmelms/available-words
If you have any specific questions feel free to Email me!