Greedy Approach

Greedy is an algorithmic paradigm that builds up a solution piece by piece, always choosing the next piece that offers the most obvious and immediate benefit.

Where Greedy Algorithms are used ?

Standard Greedy Algorithm Problems:

  1. Kruskal’s Minimum Spanning Tree (MST):
    • Create a MST by picking edges one by one.
    • The Greedy Choice is to pick the smallest weight edge that doesn’t cause a cycle in the MST constructed so far.
  2. Prim’s Minimum Spanning Tree (MST):
    • Create a MST by picking edges one by one.
    • Maintain two sets: a set of the vertices already included in MST and the set of the vertices not yet included.
    • The Greedy Choice is to pick the smallest weight edge that connects the two sets.
  3. Dijkstra’s Shortest Path (SPT):
    • The Dijkstra’s algorithm is very similar to Prim’s algorithm.
    • The shortest path tree is built up, edge by edge.
    • Maintain two sets: a set of the vertices already included in the tree and the set of the vertices not yet included.
    • The Greedy Choice is to pick the edge that connects the two sets and is on the smallest weight path from source to the set that contains not yet included vertices.
  4. Huffman Coding:
    • A loss-less compression technique.
    • It assigns variable-length bit codes to different characters.
    • The Greedy Choice is to assign least bit length code to the most frequent character.
  5. Approximation for Hard optimization problems:
    • The greedy algorithms are sometimes also used to get an approximation for Hard optimization problems.
    • Example: Traveling Salesman Problem is a NP-Hard problem.
    • A Greedy choice for this problem is to pick the nearest unvisited city from the current city at every step.
    • This solutions don’t always produce the best optimal solution but can be used to get an approximately optimal solution.

Standard Greedy Algorithm Problems

1. Activity Selection Problem***

Problem:

Given n activities with their start and finish times.

Select the maximum number of activities that can be performed by a single person, assuming that a person can only work on a single activity at a time.

Example-1: Consider the following 3 activities sorted by finish time.

start[] = {10, 12, 20}

finish[] = {20, 25, 30}

Output: (10, 20) (20, 30)

Example-2: Consider the following 6 activities sorted by finish time.

start[] = {3 , 0 , 5 , 8 , 5, 1}

finish[] = {4 , 6 , 9 , 9 , 7, 2}

Output: (1, 2) (3, 4) (5, 7) (8, 9)

Algorithm:
Implementation
def activity_selection(start, finish):
    activities = []
    n = len(start)
    for i in range(n):
        activities.append((start[i], finish[i]))
    
    # Sort the activities by finish time in increasing order and if finish times same then start time in decreasing order
    activities.sort(key=lambda k: (k[1], -k[0]))

    print("Selected Activities:")
    prev_finish = 0
    for current_start, current_finish in activities:
        if(current_start >= prev_finish):
            print("({}, {})".format(current_start, current_finish))
            prev_finish = current_finish



print("Example-1:")
start =  [10, 12, 20] 
finish = [20, 25, 30] 
activity_selection(start, finish)

print("\nExample-2:")
start =  [3 , 0 , 5 , 8 , 5, 1] 
finish = [4 , 6 , 9 , 9 , 7, 2] 
activity_selection(start, finish)


# Time : O(n log n) if input activities not sorted. 
#      : O(n) if input activities are sorted.
# Auxilliary Space: O(n) :-> Creating a new array but can be done in O(1)

Output:

activity_selection_output

Complexity:



2. Minimum Number of Platforms Required***

Problem:

Given arrival and departure times of all trains that reach a railway station, find the minimum number of platforms required for the railway station so that no train waits.

Example:

Input:

arr[] = {9:00, 9:40, 9:50, 11:00, 15:00, 18:00}

dep[] = {9:10, 12:00, 11:20, 11:30, 19:00, 20:00}

Output: 3

Algorithm:
Implementation:
def min_number_of_platforms_required(arrivals, departures):
    n = len(arrivals)
    arrivals.sort()
    departures.sort()

    i =0; j=0; count = 0; platforms =0
    while(i<n and j<n):
        if(arrivals[i] < departures[j]):
            count += 1
            i += 1
        else:
            if(count > platforms):
                platforms = count
            count -= 1
            j += 1

    print("Minimum Number of Platforms required = {}".format(platforms))



print("Example-1:")
arrivals    = [9.00,  9.40, 9.50,  11.00, 15.00, 18.00]
departures  = [9.10, 12.00, 11.20, 11.30, 19.00, 20.00]
min_number_of_platforms_required(arrivals, departures)


# Time Complexity : O(n log n) if arrivals & departures are not sorted. 
#                 : O(n) if arrivals & departures are sorted.
# Auxilliary Space: O(1)

Output:

min_platform_required_output

Complexity:



3. Huffman Coding***

Algorithm to build Huffman Tree:

Input is array of unique characters along with their frequency of occurrences and output is Huffman Tree.

  1. Create a leaf node for each unique character and build a min heap of all leaf nodes.
  2. Extract two nodes with the minimum frequency from the min heap.
  3. Create a new internal node with frequency equal to the sum of the two nodes frequencies.
    • Make the first extracted node as its left child and the other extracted node as its right child.
    • Add this node to the min heap.
  4. Repeat steps#2 and #3 until the heap contains only one node. The remaining node is the root node and the tree is complete.
Build Huffman Tree Example

Example:

character Frequency

​ a ————————> 5

​ b ————————> 9

​ c ————————> 12

​ d ————————> 13

​ e ————————> 16

​ f ————————> 45

Step-1: Build a min heap that contains 6 nodes where each node represents root of a tree with single node.

Step-2: Extract two minimum frequency nodes from min heap. Add a new internal node with frequency 5 + 9 = 14.

Now min heap contains 5 nodes where 4 nodes are roots of trees with single element each, and one heap node is root of tree with 3 elements.

character Frequency

​ c ————————> 12

​ d ————————> 13

Internal Node ———> 14

​ e ————————> 16

​ f ————————> 45

Step 3: Extract two minimum frequency nodes from heap. Add a new internal node with frequency 12 + 13 = 25.

character Frequency

Internal Node ———> 14

e ————————> 16

Internal Node ———> 25

f ————————> 45

Step 4: Extract two minimum frequency nodes. Add a new internal node with frequency 14 + 16 = 30.

Now min heap contains 3 nodes.

character Frequency

Internal Node ———>25

Internal Node ———> 30

​ f ————————> 45

Step 5: Extract two minimum frequency nodes. Add a new internal node with frequency 25 + 30 = 55.

huffman_4

Now min heap contains 2 nodes.

character Frequency

f ———————> 45

Internal Node ——>55

Step 6: Extract two minimum frequency nodes. Add a new internal node with frequency 45 + 55 = 100

huffman_5

Now min heap contains only one node.

character Frequency

Internal Node ——>100

Since the heap contains only one node, the algorithm stops here.

Algorithm to print codes from Huffman Tree:
  1. Traverse the tree formed starting from the root and maintain a code starting with empty string.
  2. While moving to the left child, add 0 to the code string and while moving to the right child, add 1 to code string.
  3. Print the code when a leaf node is encountered.

huffman_coding

The Codes are as follows:

character Codes

​ f —————> 0

​ c —————> 100

​ d —————> 101

​ a —————> 1100

​ b —————> 1101

​ e —————> 111

Implementation
import heapq


class HeapNode:
    def __init__(self, char, frequency):
        self.char = char
        self.frequency = frequency
        self.left = None
        self.right = None
    
    def __lt__(self, other):
        return self.frequency < other.frequency


def build_huffman_tree(chars, frequencies):
    # Create a leaf node for each unique character and build a min heap of all leaf nodes
    n = len(chars)
    custom_heap = []
    for i in range(n):
        heap_node = HeapNode(chars[i], frequencies[i])
        heapq.heappush(custom_heap, heap_node)
    
    # Extract two nodes with the minimum frequency from the min heap and start merging
    while(len(custom_heap) > 1):
        node_left  = heapq.heappop(custom_heap)
        node_right = heapq.heappop(custom_heap)
        merged_node = HeapNode(None, node_left.frequency+node_right.frequency)
        merged_node.left = node_left
        merged_node.right = node_right
        heapq.heappush(custom_heap, merged_node)
    
    # Root Node of the Heap
    root_node = heapq.heappop(custom_heap)

    return root_node


def print_huffman_code(node, code):
    if(node.left == None and node.right == None):
        print("{}: {}".format(node.char, code))
        return
    
    print_huffman_code(node.left, code+"0")
    print_huffman_code(node.right, code+"1")


def generate_huffman_code(chars, frequencies):
    root_node = build_huffman_tree(chars, frequencies)
    print_huffman_code(root_node, "")




print("Huffman Codes for characters:")
chars = ['a', 'b', 'c', 'd', 'e', 'f']
frequencies = [5, 9, 12, 13, 16, 45]
generate_huffman_code(chars, frequencies)


# Time Complexity : O(nlogn) where n is the number of unique characters. 
#     To extract Min using heappop() takes O(logn) and as there are n nodes, it is called 2*(n – 1) times.
# Auxilliary Space : O(n) :-> Storing n nodes in heap

Output:

huffman_coding_output

Complexity:



4. Connect ‘n’ ropes with minimum cost***

Problem:

There are given n ropes of different lengths, we need to connect these ropes into one rope.

The cost to connect two ropes is equal to sum of their lengths, calculate the minimum cost.

Example:

Input: 4 ropes of lengths 4, 3, 2 and 6.   Output: 29

Explanation:

1) First connect ropes of lengths 2 and 3. Now we have three ropes of lengths 4, 6 and 5. 2) Now connect ropes of lengths 4 and 5. Now we have two ropes of lengths 6 and 9. 3) Finally connect the two ropes and all ropes have connected.

Total cost = 5 + 9 + 15 = 29

Approach: Huffman Coding
Algorithm:
  1. Create a min heap and insert all lengths into the min heap.
  2. Do following while number of elements in min heap is not one.
    • a) Extract the minimum and second minimum from min heap
    • b) Add the above two extracted values and insert the added value to the min-heap.
    • c) Maintain a variable for total cost and keep incrementing it by the sum of extracted values.
  3. Return the value of this total cost.
Implementation:
import heapq

def connect_n_ropes_min_cost(ropes):
    total_cost = 0
    heapq.heapify(ropes)

    while(len(ropes) >= 2):
        new_connected_rope = heapq.heappop(ropes) + heapq.heappop(ropes)
        total_cost += new_connected_rope
        heapq.heappush(ropes, new_connected_rope)
    
    print(total_cost)


print("Example-1: connect_n_ropes_min_cost([4, 3, 2, 6])")
connect_n_ropes_min_cost([4, 3, 2, 6])

Output:

connect_n_ropes_min_cost_output

Complexity:




← Previous: Recursion and Backtrack Approach

Next: Dynamic Programming Approach →