The Question

Median of Two Sorted Arrays

Given two sorted arrays nums1 and nums2 of size m and n respectively, return the median of the two sorted arrays. The solution must achieve an overall run time complexity of O(log(min(m, n))). Example 1: Input: nums1 = [1,3], nums2 = [2] Output: 2.00000 Example 2: Input: nums1 = [1,2], nums2 = [3,4] Output: 2.50000 Constraints: - nums1.length == m - nums2.length == n - 0 <= m, n <= 1000 - 1 <= m + n <= 2000 - -10^6 <= nums1[i], nums2[i] <= 10^6

Python

Binary Search

Questions & Insights

Clarifying Questions

What are the constraints on the sizes of the two arrays ($m$ and $n$)? (e.g., Can they be zero? What is the maximum length?)

What are the value ranges for the integers in the arrays? (To handle potential overflow if calculating sums, though not strictly necessary for medians).

Is it guaranteed that at least one array is non-empty?

What is the required time complexity? (Usually, for this problem, the expectation is

O(\log(m+n))

rather than

O(m+n)

Assumptions

0 \leq m, n \leq 10^4

The input arrays are already sorted.

1 \leq m + n \leq 2 \cdot 10^4

The time complexity must be logarithmic.

Thinking Process

To achieve

O(\log(\min(m, n)))

, we must use a Binary Search approach. Instead of searching for a value, we search for a partition point in the smaller array.

We aim to partition both arrays into a "left side" and a "right side" such that the total number of elements in the left side is equal to (or one more than) the right side.

Let

i

be the partition in And

j

be the partition in B. For a valid median partition:

i + j = \text{half\_total\_length}

\text{max}(A[i-1], B[j-1]) \leq \text{min}(A[i], B[j])

A[i-1] > B[j]

, it means our partition in

A

is too far to the right, so we move the binary search range left. If

B[j-1] > A[i]

, we move the range right.

Implementation Breakdown

Problem Set

Functional Requirement: Return the median as a float.

Constraints:

Time complexity:

O(\log(\min(m, n)))

Space complexity:

O(1)

Edge Cases: One array is empty; arrays have different lengths; all elements in one array are smaller than the other.

Approach

Algorithm: Binary Search on Partition Index.

Data Structure: None (utilizes input arrays).

Complexity:

Time:

O(\log(\min(m, n)))

where

m

and

n

are lengths of the input arrays.

Space:

O(1)

additional space.

Implementation

Wrap Up

Advanced Topics

Readability vs. Performance: The

O(\log(\min(m, n)))

approach is highly performant but significantly more complex to read and debug than the

O(m+n)

two-pointer approach. In a production environment, if

m

and

n

are small (e.g., < 1000), the two-pointer approach might be preferred for maintainability.

K-th Element Generalization: This problem is a specific case of finding the

k

-th smallest element in two sorted arrays. The binary search partition logic can be generalized to find any

k

-th element, which is useful for calculating percentiles in distributed datasets.

Handling Large Data: If the arrays were too large to fit in memory, we would use a modified binary search that performs remote "seek" operations or works on block-level metadata.