Prefix Sum with HashMap: Time Complexity Optimization
A very cool way of using Prefix Sum with Hash-map is shown in this post. I am going to break down this LeetCode 560 problem to reveal the idea of lowering down the time complexity by adopting Hash-map (or dictionary in Python).
LeetCode 560
Reviewing question 560, an array nums
and a target k
is given to determine the number of sub-arrays res
which having sum of interval equals k
. Such that, giving array nums = [1, 1, 1, 1, 1, 1]
and k = 4
. The result res
will be 3, target sub-array set contains nums[0:3]
, nums[1:4]
and nums[2:5]
.
Adopting Prefix Sum
Since the time complexity of calculating sum of interval for a specific interval is $O(n)$:
If we want to calculate interval of sum more than one time, such as the problem 560, we should utilize Prefix Sum, which, $O(n)$ for constructing pref
array and $O(1)$ for querying a specific interval, say [L, R]
:
Back to the Problem
In the problem, we need to check sum of intervals of each specific pair of L and R, since there are negative elements in nums
.
The brute force code can be easily written using two for loop, which incurs a square time complexity $O(n^2)$.
From observation, we notice that:
When we considering each R, to find sum of interval starts with each L ranging from 0 to R, a linear time is needed to deduct the pref[R]
by each pref[L]
.
Since scanning each pref[L]
linearly is a little bit waste of time, so we can consider utilizing some spaces to trade time. We can adopt a dictionary (HashMap).
Consider that:
- In the inner for loop of L, we want to know how many L's can satisfy
pref[R] - pref[L] == k
, equivalent topref[L] == pref[R] - k
. - We adopt a dictionary
d
to maintain number of L's having Prefix Sum as the key ofd
, henced[i]
means number of L satisfyingpref[L] == i
. - Then we break the inner loop by
res += d[pref[R] - k]
, that's wise, isn't it? - Finally we update the dictionary
d
by the loop of R (refer to the following code). - Note that we should initalize
d
byd = {0: 1}
, think about why we should do so. (That is similar to we initialize pref usingpref=[0]
. Because there can be a sum of interval starting from the first element)
We have cancelled out pref[L]
. We can also discard pref[R]
. The reason of it is R is in the outer loop, and it can totally combined with the previous loop for calculating pref
. We also discard the pref
array to make it calculate-on-use.
Conclusion
We use a dictionary (HashMap) to replace the inner for loop for variable L, and we modify the code to eliminate the need for the 'pref' array. As a result, the time complexity is reduced from quadratic time to linear time. This technique is quite useful and can be applied to more LeetCode problems.