POJ #3267 The Cow Lexicon: a simple DP of the form "E[j] = opt{D+w(i,j)}", string matching

Description


 

Few know that the cows have their own dictionary with W (1 ≤ W ≤ 600) words, each containing no more than 25 of the characters 'a'..'z'. Their cowmunication system, based on mooing, is not very accurate; sometimes they hear words that do not make any sense. For instance, Bessie once received a message that said "browndcodw". As it turns out, the intended message was "browncow" and the two letter "d"s were noise from other parts of the barnyard.

The cows want you to help them decipher a received message (also containing only characters in the range 'a'..'z') of length L (2 ≤ L ≤ 300) characters that is a bit garbled. In particular, they know that the message has some extra letters, and they want you to determine the smallest number of letters that must be removed to make the message a sequence of words from the dictionary.

Input

Line 1: Two space-separated integers, respectively: W and L 
Line 2: L characters (followed by a newline, of course): the received message 
Lines 3.. W+2: The cows' dictionary, one word per line

Output

Line 1: a single integer that is the smallest number of characters that need to be removed to make the message a sequence of dictionary words.

Sample Input

6 10
browndcodw
cow
milk
white
black
brown
farmer

Sample Output

2

 

Approach


 

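  People on the DISCUSS board keep calling this an easy problem, but I think it is a decent one (maybe I am just weak, ORZ). Among the DP problems I have done, I would rate it medium difficulty.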

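  The problem gives you a main string and a set of words. The words must be matched against the main string so that it becomes a sequence of words; while matching, any character of the main string that cannot be matched may be deleted. The answer is the minimum number of deleted characters.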

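  For example, let the main string be carmsr and the words be car and mr. Matching only car leaves three unmatched characters (m, s, r), so three characters are deleted. Matching car and then mr leaves only s unmatched, so one character is deleted. The output for this example is therefore 1.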
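  The carmsr example can be checked mechanically with a brute-force sketch that tries every subset of kept characters. This is only an illustration of exhaustive enumeration, not the solution used in this post; the names canSegment and bruteForceMinDeletions are hypothetical, and the approach is only feasible for very short messages.

```cpp
#include <algorithm>
#include <string>
#include <vector>
using namespace std;

// Can s be split exactly into a sequence of dictionary words?
bool canSegment(const string& s, const vector<string>& words) {
    vector<bool> ok(s.size() + 1, false);
    ok[0] = true;  // the empty string is an empty sequence of words
    for (size_t i = 1; i <= s.size(); ++i) {
        for (const string& w : words) {
            if (w.size() <= i && ok[i - w.size()] &&
                s.compare(i - w.size(), w.size(), w) == 0) {
                ok[i] = true;
            }
        }
    }
    return ok[s.size()];
}

// Try all 2^n subsets of kept characters (assumes n is small, say n <= 20).
int bruteForceMinDeletions(const string& mes, const vector<string>& words) {
    int n = (int)mes.size();
    int best = n;  // deleting every character is always allowed
    for (int mask = 0; mask < (1 << n); ++mask) {
        string kept;
        for (int b = 0; b < n; ++b) {
            if (mask & (1 << b)) kept += mes[b];
        }
        if (canSegment(kept, words)) {
            best = min(best, n - (int)kept.size());
        }
    }
    return best;
}
```

  Calling bruteForceMinDeletions("carmsr", {"car", "mr"}) returns 1, matching the hand count above. The 2^n loop is what makes this hopeless for L up to 300.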

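  Brute-force enumeration takes exponential time, so it is ruled out immediately; I decided to try a simple search from back to front instead. Along the way it turned out that the problem has optimal substructure: the optimal solution is built from optimal solutions of subproblems, and the minimum number of deletions at the current position depends directly on the minimum at earlier positions. If the current position is not matched, its minimum is that of the previous position plus one. If it is matched, the relationship is a little more convoluted, so an example explains it best: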

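// Main string: browndcodw, word: cow

browndcodw
      co w
      l  r

// The current position is the final w; the word cow lands on the positions shown
// So: deletions at the current position = r - l + 1 - 3 + minimum deletions for the first 6 characters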
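  The backward match in the diagram can be sketched as a small helper. This is a minimal illustration assuming 0-based indices into the main string; the names matchBackward and deletedInWindow are hypothetical and do not appear in the solution.

```cpp
#include <string>
using namespace std;

// Scan mes leftwards from index r, matching word from its last letter to its first.
// Returns the index l where the first letter of word lands, or -1 if word does not fit.
int matchBackward(const string& mes, const string& word, int r) {
    if (word.empty()) return -1;  // assumption: dictionary words are non-empty
    int j = (int)word.size() - 1;
    for (int l = r; l >= 0; --l) {
        if (mes[l] == word[j]) {
            if (j == 0) return l;  // whole word matched
            --j;
        }
    }
    return -1;
}

// Number of characters deleted inside mes[l..r] when word is kept there,
// i.e. the "r - l + 1 - len" part of the formula; -1 if word does not fit.
int deletedInWindow(const string& mes, const string& word, int r) {
    int l = matchBackward(mes, word, r);
    if (l < 0) return -1;
    return r - l + 1 - (int)word.size();
}
```

  For the diagram, matchBackward("browndcodw", "cow", 9) returns 6 and deletedInWindow gives 9 - 6 + 1 - 3 = 1. Adding the minimum for the first 6 characters ("brownd", which costs 1) gives the sample answer 2.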

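  The example also shows that the original problem contains many shared subproblems. If another word, cod, is matched as well, the part of the main string left over after either match has the same length, so both matches lead to the very same sub-subproblem. Optimal substructure and overlapping subproblems are the two hallmarks of dynamic programming, so that is what we use.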

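  The state is the number of characters that must be deleted from the first i characters, stored in the array dp[i]. This gives the state transition equation, written here in the notation of the code below:

    dp[0] = 0
    dp[i] = dp[i-1] + 1                                            if no word fits in the first i characters
    dp[i] = min over fitting words k of { dp[l] + (i - l) - len[k] }   otherwise

  Here l is the index where word k starts after being matched backwards from position i-1, and len[k] is the length of word k.
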

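   With the equation in hand, the algorithm is easy to write. It is implemented iteratively and runs in O(W·L^2) time, which at the limits W = 600 and L = 300 is at most 600 × 300^2 = 5.4 × 10^7 basic steps.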

#include <algorithm>
#include <iostream>
#include <string>
using namespace std;

const int INF = 600000;           // larger than any possible answer (L <= 300)
const int MAX_WORD_NUM = 600;
const int MAX_MES_LENGTH = 300;
int dp[MAX_MES_LENGTH + 1];       // dp[i]: fewest deletions for the first i characters of mes
int len[MAX_WORD_NUM + 1];        // len[k]: length of word k
string dir[MAX_WORD_NUM + 1];     // dir[k]: dictionary word k (1-based)

int main(void) {
    int word_num, mes_length;
    string mes;
    cin >> word_num >> mes_length;
    cin >> mes;
    dp[0] = 0;
    for (int i = 1; i <= mes_length; ++i) {
        dp[i] = INF;
    }
    for (int i = 1; i <= word_num; i++) {
        cin >> dir[i];
        len[i] = dir[i].size();
    }

    for (int i = 1; i <= mes_length; ++i) {
        bool match_flag = false;
        for (int k = 1; k <= word_num; k++) {
            int l = i - 1;
            // Match word k from back to front, scanning mes leftwards from position i-1
            int j;
            for (j = len[k] - 1; l >= 0; ) {
                if (dir[k][j] == mes[l]) {
                    j--;
                    if (j == -1) // whole word matched; l stays on its first letter
                        break;
                }
                l--;
            }

            // Word k fits inside the first i characters of mes
            if (j == -1) {
                // mes[l..i-1] holds i - l characters, of which len[k] are kept
                dp[i] = std::min(dp[i], dp[l] + i - l - len[k]);
                match_flag = true;
            }
        }
        // No word fits inside the first i characters of mes
        if (match_flag == false) {
            dp[i] = dp[i - 1] + 1;
        }
    }
    cout << dp[mes_length] << endl;
    return 0;
}

 

  

posted @ 2018-02-05 20:25  bw98