The Question
Coding

Subarrays with K Distinct Integers

Given an integer array and an integer k, implement an efficient algorithm to calculate the total number of subarrays that contain at most k distinct integers. Discuss the time and space complexity, and explain how this approach can be extended to find the number of subarrays with exactly k distinct integers.
Java
Sliding Window
Two Pointers
HashMap
Questions & Insights

Clarifying Questions

What is the maximum size of the input array and the range of integers within it? (e.g., 10^5 elements, integers up to 10^9).
How should we handle the case where k = 0?
Should the function return a long to prevent integer overflow for large arrays?
Are the integers in the array always non-negative?
Assumptions:
The array length N can be up to 10^5.
Elements are integers; a HashMap is safer than a frequency array unless the range is small (e.g., 0-10^5).
k \ge 1 (if k=0, the answer is 0 since a non-empty subarray must have at least one distinct element).
The result will be long to accommodate N(N+1)/2 possible subarrays.

Thinking Process

Utilize the Sliding Window (Two Pointers) technique. For every fixed right boundary r, find the smallest left boundary l such that the window [l, r] contains at most k distinct integers.
The number of subarrays ending at index r that satisfy the "at most k" condition is exactly r - l + 1. This is because if the window [l, r] is valid, then [l+1, r], [l+2, r], ..., [r, r] are also valid.
Maintain a frequency map of the current window to track the count of distinct integers.
Sum up the valid subarray counts for every r from 0 to N-1 to get the total result.
Implementation Breakdown

Problem Set

Functional Requirement: Given an array nums and an integer k, return the total number of subarrays having at most k distinct integers.
Constraints:
Time complexity should be O(N) to handle large inputs.
Space complexity should be O(K) or O(N) for the frequency tracking.

Approach

Algorithm: Sliding Window (Two Pointers).
Data Structure: HashMap<Integer, Integer> to track frequencies of elements in the current window.
Complexity:
Time: O(N) because each pointer travels the array at most once.
Space: O(min(N, K)) to store the frequency map.

Implementation

Wrap Up

Advanced Topics

Exact K Distinct Elements: This problem is often a building block to solve "Count subarrays with exactlyk distinct integers." The solution for "exactly k" is atMost(k) - atMost(k - 1).
Memory Optimization: If the range of integers in nums is known and small (e.g., 0 \le nums[i] \le 10^5), replacing HashMap with a primitive int[] array will significantly improve performance by reducing hashing overhead and garbage collection pressure.
Parallelization: This specific sliding window problem is inherently sequential. However, for massive datasets (Distributed Systems), one could use a MapReduce approach to count frequencies in chunks, though merging window boundaries across nodes requires careful handling of cross-border subarrays.