Leetcode 3. Longest Substring Without Repeating Characters
Very High Frequency
Clarification
Okay, so first I’ll restate the problem. We are given a string s, and we need to return the length of the longest substring without repeating characters. A substring has to be contiguous, so we cannot skip characters.
Should I return the length, or the substring itself?
Can the string be empty?
What kind of characters does the string have? Are characters case-sensitive? For example, is "aA" a repeating substring or not?
Brute-force Approach
The brute force idea is to check every substring and see whether it contains duplicate characters. That works, but the time would be O(n²), so it’s not ideal.
Optimized Approach
Since this is a substring problem, we can consider to use sliding window approach trying to optimize it.
What I’ll do is to maintain a window [left, right] where all characters are unique. As I move right through the string, if I see a repeated character, I move left forward until the window becomes valid again. To do that efficiently, I’ll use a hash set to store the characters currently inside the window.
So why does it work? There are three most important things for a sliding window: when to increase the window; when to shrink the window; when to update the result. The key observation for this question is that my window always contains unique characters. So when to increase the window? Whenever we see an unseen character, we should add it to make the substring longer. But when I encounter a duplicate character, there is no point to keep increasing the window because any substring that still includes the previous duplication must be invalid, so I have to shrink the window from the left until the duplicate is removed, which is the start of next valid substring. Last, since we are supposed to find the longest substring, we have to update the result whenever the window gets bigger, i.e, when we increase the window from the right side.
Walk through an example.
Since both pointers only move forward and never backward, each character enters and leaves the window at most once, this gives us O(n) time. The space complexity should be O(k), where k is the length of the longest valid answer, because we use a hash set that contains at most k elements.
Code
class Solution {
public int lengthOfLongestSubstring(String s) {
int left = 0, right = 0;
Set<Character> chars = new HashSet<>();
int maxLen = 0;
while (right < s.length()) {
char c = s.charAt(right);
if (chars.contains(c)) {
while (left < right && chars.contains(c)) {
chars.remove(s.charAt(left));
left++;
}
}
chars.add(c);
right++;
maxLen = Math.max(maxLen, right - left);
}
return maxLen;
}
}Follow-up
I could optimize the left movement using a hash map from character to its latest index. Then instead of removing one by one, I can jump left directly past the previous duplicate.
