An Introduction to Linked Lists and Nodes

TAD

Introduction

This subject can be a little difficult to understand especially if you haven't used nodes before, or don't understand the powerful advantages that they create.

The simplest way to think of "nodes" is like a road sign where each road is a "link" to another road sign. The node is just a small data structure with a number of address-pointers (the "links") stored inside it, the "link" is just the memory address of the next node.

I always find that using some paper and a pen/pencil makes explaining things one million times easier than trying to use words, so I will try to use ASCII diagrams as much as possible, and try to make them look pretty.

Terminology

"Node": A small data structure which is used to store "links" to other nodes. In this article I will used boxes to represent nodes.

"Link": A variable which is used to point to somewhere in memory. I will use lines and arrow characters to represent links to items in memory, this should help make them more clear to understand.

"Linked List": A number of nodes which are connected together like a metal chain.

"Chain": This is used to mean the same as a linked list, where a node is the point at which two chain links cross. You can think of the actual metal loops as being the "next" and "previous" links for a particular node.

Why use a linked-list?

Well, unlike a continuous list in memory, like a table of bytes, linked-lists can describe lists which are scattered throughout memory and can be used to describe circular lists. They also make inserting, deleting and sorting items much, much, much faster.

The vital part of any linked-list is the "node". This is just a collection of "links" stored together in a small data structure.

The number of links for a node depends on the task at hand, but usually there is just 2 links per node:

"Previous Link": This is used to navigate our way back to the previous item in the chain of linked items.

"Next Link": This is used - yeah, you guessed it - to find the next item in the list.

Using two links in this way will make inserting items and deleting items much easier later on. Remember the list is a one-dimensional object in which we can move forwards (by following the "next" links) or move backwards (by following the "previous" links). But there is no reason why you can't extend this technique to 2, 3 or N dimensions, just add more links for each node.

What does a Node look like?

Well, as I have said before it is best to think of a node as being the junction at a road where a number of roads meet. The junction is the "node" and the roads are the "links".

Say we have four nodes labelled A, B, C and D. For each node we have a previous link ("Prev") and a next link ("Next"). To make things easy I won't use memory address, instead I will using A, B, C and D to refer to the address of each of the 4 nodes.

        Node                Node                Node                Node
   <---  A   --->      <---  B   --->      <---  C   --->      <---  D   --->

     Prev   Next         Prev   Next         Prev   Next         Prev   Next
   ┌-----┐┌-----┐      ┌-----┐┌-----┐      ┌-----┐┌-----┐      ┌-----┐┌-----┐
   │xxxx ││  B  │      │  A  ││  C  │      │  B  ││  D  │      │  C  ││xxxx │
   └------└------      └------└------      └------└------      └------└------

The 'xxxx' entries can be ignored as they are not important for the time being.

You should be able to see that we can navigate our way from node to node by using the "Prev" or "Next" links.

Deleting an item from the linked-list.

To remove an item from the chain we need to update the links of the previous and next nodes from our current one.

Say we wanted to delete node B.

                            Node
                       <---  B   --->

                         Prev   Next
                       ┌-----┐┌-----┐
                       │  A  ││  C  │
                       └------└------
        Node                                    Node                Node
   <---  A   --->                          <---  C   --->      <---  D   --->

     Prev   Next                             Prev   Next         Prev   Next
   ┌-----┐┌-----┐                          ┌-----┐┌-----┐      ┌-----┐┌-----┐
   │xxxx ││  B  │                          │  B  ││  D  │      │  C  ││xxxx │
   └------└------                          └------└------      └------└------

To close the gap in the linked list, we need to update Node A "Next" and Node C "Prev" links. We should then end up with this:

        Node                                    Node                Node
   <---  A   --->                          <---  C   --->      <---  D   --->

     Prev   Next                             Prev   Next         Prev   Next
   ┌-----┐┌-----┐                          ┌-----┐┌-----┐      ┌-----┐┌-----┐
   │xxxx ││  C  │                          │  A  ││  D  │      │  C  ││xxxx │
   └------└------                          └------└------      └------└------
            ***                              ***

In pseudo-code this would be:

              p = node[B].prev        ; get the previous & next links
              n = node[B].next        ; from our current node, B.

              node[p].next = n        ; update node p "next" --> node n

              node[n].prev = p        ; update node n "prev" --> node p

In 80x86 this would look like this:

              [DS:BX] --> node B

              MOV SI, [BX+node.Prev]  ; SI = p
              MOV DI, [BX+node.Next]  ; DI = n

              MOV [SI+node.Next], DI  ; update node p "next" --> node n
              MOV [DI+node.Prev], SI  ; update node n "next" --> node p

Where the node structure has been previously defined:

 node        STRUC
  Next       dw      ?
  Prev       dw      ?
 node        ENDS

Inserting an item into the linked-list.

Let's suppose we wanted to insert node B (the one we just removed) back into the linked-list. But this time let's insert it between node C and node D.

                                                Node
                                           <---  B   --->

                                             Prev   Next
                                           ┌-----┐┌-----┐
                                           │  A  ││  C  │
                                           └------└------
        Node                Node                                    Node
   <---  A   --->      <---  C   --->                          <---  D   --->

     Prev   Next         Prev   Next                             Prev   Next
   ┌-----┐┌-----┐      ┌-----┐┌-----┐                          ┌-----┐┌-----┐
   │xxxx ││  B  │      │  B  ││  D  │                          │  C  ││xxxx │
   └------└------      └------└------                          └------└------

This time we need two pieces of information, the node to be inserted (node B) and the position where it should be inserted (node C). We want to end up with this linked-list:

        Node                Node                Node                Node
   <---  A   --->      <---  C   --->      <---  B   --->      <---  D   --->

     Prev   Next         Prev   Next         Prev   Next         Prev   Next
   ┌-----┐┌-----┐      ┌-----┐┌-----┐      ┌-----┐┌-----┐      ┌-----┐┌-----┐
   │xxxx ││  C  │      │  A  ││  B  │      │  C  ││  D  │      │  B  ││xxxx │
   └------└------      └------└------      └------└------      └------└------
                                ***                              ***

In pseudo-code this would be:

              n = node[C].next        ; get next node n from position

              node[C].next = B        ; update node C "next" --> node B
              node[B].prev = C        ; update node B "prev" --> node C

              node[B].next = n        ; update node B "next" --> node D
              node[n].prev = B        ; update node D "prev" --> node B

In 80x86 this would be:

              [DS:BX] --> node B
              [DS:SI] --> node C (position to insert from)

              MOV DI, [BX+node.Next]  ; DI = n

              MOV [SI+node.Next], BX  ; update node C "next" --> node B
              MOV [BX+node.Prev], SI  ; update node B "prev" --> node C

              MOV [BX+node.Next], DI  ; update node B "next" --> node D
              MOV [DI+node.Prev], BX  ; update node D "prev" --> node B

Why bother with nodes?

All this updating of node links looks like a lot of work, but in fact it can save a great deal of time. Suppose that each node was used before a 50 Kb block of data and we have 100 such blocks in memory. Let's suppose we wanted to delete the 2nd block, using the normal insert/delete method we would need to move 98 blocks which is 4900 Kb!!!

But using the node method we only have to update two links (in the 1st and 3rd nodes) which means about 8 bytes, quite a saving.

The real power in linked-list and nodes is the fact that they can be ANYWHERE in memory, they don't have to be in a continuous order, the "links" make this possible. And it is these links which makes sorting or memory-allocating so much faster.

When allocating a block of memory some extra bytes are added to the amount needed, these are used to build a node structure. This new block of memory is added to all the others using the node to tie them all together and keep track of what areas of memory have already been allocated.

Going around in circles.

Because the nodes allow neighbouring items to be located anywhere in memory we can have them scattered throughout memory. If we point the links in the last node back to the first then we have a circular chain.

Moving around this circular chain does mean we can visit every possible node simply by following a single link, either "next" or "previous" link.

        Node                Node                Node                Node
   <---  A   --->      <---  B   --->      <---  C   --->      <---  D   --->

     Prev   Next         Prev   Next         Prev   Next         Prev   Next
   ┌-----┐┌-----┐      ┌-----┐┌-----┐      ┌-----┐┌-----┐      ┌-----┐┌-----┐
   │  D  ││  B  │      │  A  ││  C  │      │  B  ││  D  │      │  C  ││  A  │
   └------└------      └------└------      └------└------      └------└------
     ***                                                                ***

Look at the above diagram and nodes A and D, their links close the circle making this a circular linked-list.

Try following the 'next' link in each node. No matter at which node you start as you always visit each and every node in this list. The same is also true for the 'prev' links.

Now, there is a problem.

How do we know when we have reached the end of the list?

The simple solution is to store the node at which you start from and follow the links until you encounter the starting node again.

Problems.

Perhaps the biggest problem with linked-lists as that they are sequential data structures. There is NO way to randomly access the Nth item. Instead you must follow N links and this can be slow depending on the value of N.

Closing Words

I advise any newbie coder to spend some time learning about nodes and linked-lists because of their sheer speed and the fact that binary and multi-way trees rely on these nice data structures.

Good Luck.

Regards

TAD #:o)