How to Find the K Largest Elements in an Unsorted Array Using a Min Heap | Day #17

How to Find the K Largest Elements in an Unsorted Array Using a Min Heap | Day #17

A Step-by-step Process of Finding k Largest Elements with Min Heap

Finding the k largest elements in an unsorted array is a common problem in computer science, especially when dealing with large datasets. In this guide, we’ll explore how to efficiently solve this problem using a Min Heap, with code examples in both Python and JavaScript.


Introduction

When you need to extract the largest k elements from a large dataset, a Min Heap provides an efficient way to do so. Sorting the entire array would take O(n log n) time, but with a Min Heap, we can reduce the complexity to O(n log k). This makes it ideal for scenarios where only a subset of the top values is needed.

Why Min Heap Works Best for This Use Case?

  • A Min Heap of size k ensures that the smallest element is always at the root.

  • As you iterate through the array, only elements larger than the root replace the minimum value in the heap. This guarantees that the heap always contains the top k largest elements.

Real-World Applications

  • Top-k recommendations (e.g., showing top products or movies)

  • Leaderboards in games or competitions

  • Social media analytics (e.g., top trending hashtags)

  • Task scheduling systems based on priorities


Algorithm Overview

To solve this problem using a Min Heap, follow these steps:

Step-by-Step Process

  1. Initialize a Min Heap with the first k elements of the array.

  2. Heapify the initial elements to form a valid Min Heap.

  3. Iterate through the remaining elements of the array:

    • If the current element is larger than the root of the Min Heap, replace the root with the current element.

    • Reheapify to maintain the Min Heap property.

  4. Once all elements have been processed, the Min Heap will contain the k largest elements.

Flowchart Overview

Start → Initialize Min Heap with first k elements
        ↓
  Heapify to create a valid Min Heap
        ↓
  For each remaining element:
    ↓
    Is element > root of heap?
        ↓
      Yes → Replace root → Heapify
        ↓
      No → Continue to next element
        ↓
End → Min Heap contains k largest elements

Code Implementation

Python Implementation

import heapq

def find_k_largest_elements(arr, k):
    # Step 1: Create a Min Heap with the first k elements
    min_heap = arr[:k]
    heapq.heapify(min_heap)

    # Step 2: Process remaining elements in the array
    for num in arr[k:]:
        if num > min_heap[0]:
            heapq.heapreplace(min_heap, num)  # Replace root and heapify

    return min_heap

# Example usage
arr = [3, 2, 1, 5, 6, 4]
k = 2
print(find_k_largest_elements(arr, k))  # Output: [5, 6]

JavaScript Implementation

class MinHeap {
    constructor() {
        this.heap = [];
    }

    // Insert a new element and sort the heap
    insert(val) {
        this.heap.push(val);
        this.heap.sort((a, b) => a - b);
    }

    // Replace the root element with a new value and re-sort
    replaceRoot(val) {
        this.heap[0] = val;
        this.heap.sort((a, b) => a - b);
    }
}

function findKLargestElements(arr, k) {
    const minHeap = new MinHeap();

    // Step 1: Initialize the Min Heap with the first k elements
    arr.slice(0, k).forEach(num => minHeap.insert(num));

    // Step 2: Process remaining elements
    for (let i = k; i < arr.length; i++) {
        if (arr[i] > minHeap.heap[0]) {
            minHeap.replaceRoot(arr[i]);  // Replace root and re-sort
        }
    }

    return minHeap.heap;
}

// Example usage
const arr = [3, 2, 1, 5, 6, 4];
const k = 2;
console.log(findKLargestElements(arr, k)); // Output: [5, 6]

Complexity Analysis

Time Complexity

  • Heapify the first k elements: O(k)

  • Iterate through the rest of the elements: O((n - k) log k)

    • Each replacement operation in the heap takes O(log k) time.

Thus, the overall time complexity is O(n log k), where n is the size of the input array and k is the number of largest elements to find.

Space Complexity

  • The space complexity is O(k) for storing the Min Heap.

Edge Cases to Consider

  • Array size smaller than k: Return the entire array.

  • k = 0: Return an empty array.

  • k = array length: Return the sorted array.

  • Handling duplicates: The algorithm will work correctly, and duplicates will be treated as individual elements.


Comparison with Other Methods

Sorting Approach

  • Time Complexity: O(n log n)

  • Sorting the entire array and taking the last k elements is straightforward but not optimal for large datasets.

Using a Max Heap

  • A Max Heap could also be used, but it requires additional effort to extract the k largest elements from the heap.

Using a Min Heap is more efficient because it allows us to maintain only k elements throughout the process.


Conclusion

Using a Min Heap to find the k largest elements in an unsorted array is an efficient and scalable solution, especially for large datasets. The approach ensures that we maintain only the necessary elements, saving both time and space.


Call to Action

Try implementing the Min Heap approach yourself! Experiment with different datasets and values of k to see how the algorithm performs. Share your results or ask any questions in the comments below.