The Question
Coding

Count Subarrays with at Most K Distinct Integers

Given an integer array nums and an integer k, return the number of contiguous subarrays that contain at most k distinct integers. Example: Input: nums = [1,2,1,2,3], k = 2 Output: 12 Explanation: Subarrays formed with at most 2 distinct integers: [1], [2], [1], [2], [3], [1,2], [2,1], [1,2], [2,3], [1,2,1], [2,1,2], [1,2,1,2]. Constraints: - 1 <= nums.length <= 10^5 - 0 <= nums[i] <= 10^9 - 0 <= k <= nums.length
Java
Sliding Window
Two Pointers
Hash Map
Questions & Insights

Clarifying Questions

What is the range of the input array size $N$ and the value $K$?
Assumption: $N$ can be up to $10^5$, and $0 \le K \le N$. We need an $O(N)$ solution.
What is the range of the integers within the array?
Assumption: Integers are non-negative. If the range is small (e.g., $1$ to $N$), we can use a frequency array; otherwise, a `HashMap` is required.
Can $K$ be zero?
Assumption: If $K=0$, the number of subarrays is 0, as every non-empty subarray has at least one distinct integer.
What is the expected return type for the count?
Assumption: For an array of size $10^5$, the number of subarrays can exceed the range of a 32-bit integer ($N(N+1)/2 \approx 5 \times 10^9$), so we should return a `long`.

Thinking Process

Sliding Window Intuition: A subarray [i, j] containing at most K distinct elements implies that any subarray within it ending at j (i.e., [i, j], [i+1, j], \dots, [j, j]) also contains at most K distinct elements.
Window Expansion: We iterate through the array using a right pointer. For every new element added, we update our count of distinct elements (using a frequency map).
Window Contraction: If the number of distinct elements exceeds K, we increment the left pointer and update the frequency map until the number of distinct elements is \le K.
Contribution Calculation: At each step where the window [left, right] is valid, the number of valid subarrays ending exactly at right is (right - left + 1). We sum these contributions to get the total count.
Implementation Breakdown

Problem Set

Functional Requirements:
Given an integer array nums and an integer k.
Return the total number of subarrays that contain at most k distinct integers.
Constraints:
1 \le nums.length \le 10^5
0 \le nums[i] \le 10^9
0 \le k \le nums.length
Time Complexity: O(N)
Space Complexity: O(min(N, K))

Approach

Algorithm: Sliding Window (Two Pointers)
Data Structure: HashMap<Integer, Integer> (Frequency Map)
Complexity:
Time: O(N) as each pointer moves from 0 to N exactly once.
Space: O(K) to store frequencies of up to K+1 distinct elements.

Implementation

Wrap Up

Advanced Topics

Optimization for Small Integer Ranges: If nums[i] is bounded by a small value M, using an array int[] freq = new int[M+1] is significantly faster than HashMap due to cache locality and avoiding object overhead.
Finding "Exactly K": This logic is the building block for "Subarrays with exactly K distinct integers." That problem is solved by calculating atMost(K) - atMost(K-1).
Parallelization: While sliding window is inherently sequential, for massive datasets, one could use a divide-and-conquer approach combined with a merge step to count subarrays crossing the midpoint, though it is significantly more complex.