# 139. Word Break I

Reference: LeetCode
Difficulty: Medium

## Problem

Given a non-empty string s and a dictionary wordDict containing a list of non-empty words, determine if s can be segmented into a space-separated sequence of one or more dictionary words.

Note:

• The same word in the dictionary may be reused multiple times in the segmentation.
• You may assume the dictionary does not contain duplicate words.

Example:

## Analysis

### Backtracking

First, we put wordSet into a hash set for quick contains examination.

For each character S[depth], we consider substrings S[depth, i] including S[depth], S[depth, depth + 1], …, S[depth, n - 1]. If one of them is in wordDict, we then redo the task on character S[i + 1]; otherwise, return false.

One of the most difficult part is to write the reject & accept code. Think about it carefully, we notice that if the recursive call goes into a situation where n == s.length(), it means the string s can be successfully segmented; otherwise, it won’t go into that situation. We don’t write reject case in this backtracking function.

Time: $O(N^N)$ since each time it has at most $N$ choices and the depth (problem size) is $N$. (this is an upper bound)
Space: $O(N)$ (string length and call stack depth)

Marked For the time complexity, if we count substring operation ($O(N)$), it would be $O(N \times N^N) = O(N^(N + 1))$.

### Backtracking (Memoization)

Let’s use an example to see if we can optimize the above method.

For the first substring "a", it is in wordDict, so depth becomes 1 and we would examine if "bcde" is breakable.

In the process of checking if "bcde" is breakable, we may check if "cde", "de", "e" are breakable. Once we know the answers, we can cache them for future usage no matter they are true or false.

In future when we’ve done processing the first substring "a", we will examine "b", and you will see there could be a lot of repeated computation for "cde", "de", "e".

Difficulty: Using memoization in a backtracking-style recursive function is quite uncommon than other DP memoization. This is the new form I learned. There are three places that we need to set and get memo[].

Note: We use Boolean since initially we want values to be null.

Time: $O(N^2)$
Space: $O(N)$

### DP

The idea is that given a problem(s) we can divide it into two subproblems s1 and s2. If both of them are breakable, s is breakable (by saying breakable I mean it satisfies the required conditions).

Note: Substring s(i, j) (character i to j) in Java is denoted by s.substring(i, j + 1).

First, we define our dp[] array, where dp[i] is true if the substring s(0, i - 1) or s.substring(0, i) is breakable; otherwise, it should be false.

Then, we process string length from 1 to n in dp[i]. For each substring s(0, i - 1), we examine each combination of substrings s(0, j - 1) (s.substring(0, j)) and s(j, i - 1) (s.substring(j, i)). The first subproblem can be calculated directly by dp[j] while the second one can be checked by wordDict set. Question: Why don’t we break the second substring and examine it further? (e.g. abc is not in wordDict, but ab and c could be in wordDict)

The question is: what happen if "cdef" is not in wordDict while "cd" and "ef" are both in wordDict?

It is handled previously! If "cd" is in wordDict, in the previously fourth round for substring "abcd", we would examine "ab" and "cd". If dp("ab") is true and "cd" is in wordDict, we would mark dp("abcd") as true!

Then in the final round, dp("abcd") is true and "ef" is in wordDict, so we have the whole string breakable.

Note: In addition, please think about the initialization step and the break statement in the code.

Time: $O(N^3)$ since substring takes $O(N)$.
Space: $O(N)$

### BFS

Go To: LeetCode Solution

Comment Junhao Wang
a software engineering cat