Customized English word breaker for SQL Server 2008
- Open Registry Editor by:
- Clicking Start, and clicking Run.
- In the Run dialog box, in the Open box, type Regedit.
- In Registry Editor, select the following registry key for the first instance of SQL Server: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSearch\CLSID (For a named instance, replace MSSQLSERVER with the actual instance name.)
- On the menu bar, click Edit, click New, and click Key.
- Type {9DAA54E8-CD95-4107-8E7F-BA3F24732D95}.
- Press ENTER.
- In the right pane, right-click the Default registry value, and then click Modify.
- In the Edit String dialog box, in the Value data box, type NaturalLanguage6.dll, and then click OK.
- In Registry Editor, select the following registry key for the first instance of SQL Server: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSearch\Language\enu
- Replace the 'WBreakerClass' and 'StemmerClass' values with the new values below:
WBreakerClass: {9DAA54E8-CD95-4107-8E7F-BA3F24732D95}
StemmerClass: {61A48126-EF74-4d4a-9DDA-43FD542CAD1E}
- Copy the following files from "C:\Windows\System32" to "C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\Binn":
NlsData0009.dll
NlsLexicons0009.dll
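
For repeatable setups, the registry edits above could also be captured in a .reg file instead of being typed into Registry Editor by hand. The following is only a minimal sketch: the function name build_reg_file and its instance_id parameter are hypothetical names chosen for illustration, and it assumes the key layout is exactly as described in the steps above. It only builds the text; review the output before importing it on a server.

```python
# Sketch: build the text of a .reg file that mirrors the manual steps above.
# build_reg_file and instance_id are illustrative names, not part of SQL Server.

WORD_BREAKER_CLSID = "{9DAA54E8-CD95-4107-8E7F-BA3F24732D95}"
STEMMER_CLSID = "{61A48126-EF74-4d4a-9DDA-43FD542CAD1E}"


def build_reg_file(instance_id="MSSQL10_50.MSSQLSERVER"):
    """Return .reg file text for the given SQL Server instance ID."""
    base = (
        "HKEY_LOCAL_MACHINE\\SOFTWARE\\Microsoft\\Microsoft SQL Server\\"
        + instance_id
        + "\\MSSearch"
    )
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        # New CLSID key whose default value points at the word breaker DLL.
        f"[{base}\\CLSID\\{WORD_BREAKER_CLSID}]",
        '@="NaturalLanguage6.dll"',
        "",
        # English (enu) language settings: new word breaker and stemmer classes.
        f"[{base}\\Language\\enu]",
        f'"WBreakerClass"="{WORD_BREAKER_CLSID}"',
        f'"StemmerClass"="{STEMMER_CLSID}"',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(build_reg_file())
```

For a named instance, pass the matching instance ID (the MSSQL10_50.<instance name> part of the key path) as instance_id.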
Now we are going to create our own customized word breaker:
- Log on to the SQL Server machine with a Windows administrator account.
- Open Notepad.
- Enter the words below, following the rules listed in the article http://technet.microsoft.com/en-us/library/cc263242.aspx#Rules
red/bl
-st/fl
red/
24-
- On the File menu, click Save As.
- In the Save as type list, select All Files.
- In the Encoding list, select Unicode.
- In the File name box, type Custom0009.lex. (Please do not change the file name.)
- Put the file in the SQL Server instance's Binn folder, for example: C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\Binn.
- Restart fdhost by executing "exec sp_fulltext_service 'restart_all_fdhosts'" on the SQL Server instance.
- The customized word breaker now works. Here is a screenshot of the test:
Please note that before we configured the word breaker, the result was as below:
select * from sys.dm_fts_parser('red/bl', 1033, 0, 0)
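
The Notepad steps for creating the lexicon file can also be scripted, which avoids saving with the wrong encoding by accident. The sketch below is only an illustration: write_lexicon is a hypothetical helper name, and it assumes that Notepad's "Unicode" option corresponds to UTF-16 little-endian with a byte order mark and Windows line endings. The entries are the same four lines shown earlier.

```python
import codecs
from pathlib import Path


def write_lexicon(path, entries):
    """Write one lexicon entry per line as UTF-16 LE with a BOM.

    Returns the bytes that were written so they can be inspected.
    """
    text = "\r\n".join(entries) + "\r\n"
    data = codecs.BOM_UTF16_LE + text.encode("utf-16-le")
    Path(path).write_bytes(data)
    return data


if __name__ == "__main__":
    # Writes into the current directory; copy the file to the instance's
    # Binn folder afterwards, as described in the steps above.
    write_lexicon("Custom0009.lex", ["red/bl", "-st/fl", "red/", "24-"])
```

After copying the file to the Binn folder, fdhost still has to be restarted as described above before the parser query picks up the new entries.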