Summary: This walkthrough demonstrates how to use the Microsoft Web browser control and the Microsoft Document Object Model (DOM) to programmatically access the elements of any Web page. (3 pages)
To access the DOM programmatically, you import both the Web browser component and references to the methods, properties, and events of the DOM into your C# project. You direct the Web browser to a URL by calling its Navigate method, and you must then wait for the DocumentComplete event before reading the page. You obtain the document by casting the Web browser's Document property to the IHTMLDocument2 interface. You can query this object for its collections, such as its links or images collections, which are returned as IHTMLElementCollection objects.
In this walkthrough, you will use the Web browser and DOM to obtain and display all anchors found in a Web page.
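The overview mentions that the same IHTMLDocument2 object exposes other collections besides links. As a minimal sketch, assuming the Microsoft.mshtml reference added in the procedure that follows and a document that has finished loading, a helper along these lines could list the images on a page. The ImageLister class and its Describe and FillList methods are hypothetical names chosen for illustration; they are not part of the walkthrough.

```csharp
using System.Windows.Forms;
using mshtml;

// Hypothetical helper, not part of the original walkthrough.
public class ImageLister
{
    // Builds one display line from an image's src and alt attributes.
    public static string Describe(string src, string alt)
    {
        if (src == null)
        {
            src = "(no src)";
        }
        if (alt == null || alt.Length == 0)
        {
            return src;
        }
        return src + " (" + alt + ")";
    }

    // Adds one line per image in the document to the given list box.
    public static void FillList(IHTMLDocument2 document, ListBox target)
    {
        target.Items.Clear();
        foreach (object item in document.images)
        {
            // Entries in the images collection are expected to be IMG elements;
            // the "as" cast simply skips anything that is not.
            IHTMLImgElement image = item as IHTMLImgElement;
            if (image != null)
            {
                target.Items.Add(Describe(image.src, image.alt));
            }
        }
    }
}
```

Once the form from this walkthrough exists, calling ImageLister.FillList((IHTMLDocument2) axWebBrowser1.Document, listBox1) from the DocumentComplete handler would fill the list box with image sources instead of anchors.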
To access the DOM programmatically
- Create a new Visual C# Windows Application project named DOM.
The form name defaults to Form1.
- In Solution Explorer, right-click the References folder and select Add Reference.
The Add Reference dialog box opens.
- Click on the .NET tab and double-click the component named Microsoft.mshtml.
- Click OK.
References to the methods, events, and properties of the Microsoft DOM are added to the project.
- Open the Toolbox, right-click any tool, and choose Customize Toolbox.
The Customize Toolbox dialog box opens.
- Click on the COM Components tab and check Microsoft Web Browser.
The Web browser control labeled Explorer is added to the Toolbox components.
- Select the Explorer component and click the open form.
A Web browser component named axWebBrowser1 is added to the form.
- Add a TextBox component above the browser component and a ListBox component below it, accepting the default names of textBox1 and listBox1.
- Add a Button component to the right of listBox1. Change the Text property to "Submit," and accept the default name of button1.
The resulting form should look similar to the following screen shot:
- Double-click on button1.
The button1_Click method is added to the project.
- Edit the button1_Click method so that it reads as follows:
private void button1_Click(object sender, System.EventArgs e)
{
    // Navigate takes its optional arguments by reference,
    // so each one must be passed as a variable.
    object Zero = 0;
    object EmptyString = "";
    axWebBrowser1.Navigate(textBox1.Text,
        ref Zero, ref EmptyString, ref EmptyString, ref EmptyString);
}
- Return to the form designer, select the browser component, and click the Events icon in the Properties window.
A list of Web browser events appears.
- Double-click the Document Complete event.
The axWebBrowser1_DocumentComplete event handler is added to the project.
- At the beginning of the file Form1.cs, add the using mshtml directive (the second line below) after the existing using directives:
using System.Data;
using mshtml;
- Edit the axWebBrowser1_DocumentComplete event handler so that it reads as follows:
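private void axWebBrowser1_DocumentComplete(
    object sender,
    AxSHDocVw.DWebBrowserEvents2_DocumentCompleteEvent e)
{
    // The Document property is typed as object; cast it to the DOM interface.
    IHTMLDocument2 HTMLDocument =
        (IHTMLDocument2) axWebBrowser1.Document;
    IHTMLElementCollection links = HTMLDocument.links;
    listBox1.Items.Clear();
    // The links collection holds every A and AREA element that has an href,
    // so iterate as IHTMLElement, which both element types support.
    foreach (IHTMLElement el in links)
    {
        listBox1.Items.Add(el.outerHTML);
    }
}
- Press F5 to build and run the project.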
The Form1 application appears.
- Type a URL—such as www.msn.com—into the text box and click Submit.
The Web page is displayed in the browser, and a list of its anchors appears in the list box, as shown in the following screen shot.
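The list box above shows each anchor's full outerHTML. If only the link text and target are of interest, the anchor-specific IHTMLAnchorElement interface could be queried instead. The following is a minimal sketch under that assumption; the LinkLister class and its FormatLink and FillList methods are hypothetical names, not part of the walkthrough, and the "text -> href" display format is an arbitrary choice.

```csharp
using System.Windows.Forms;
using mshtml;

// Hypothetical helper, not part of the original walkthrough.
public class LinkLister
{
    // Builds one display line from a link's visible text and its target URL.
    public static string FormatLink(string text, string href)
    {
        if (text == null || text.Trim().Length == 0)
        {
            // Anchors that wrap only an image have no text of their own.
            text = "(no text)";
        }
        return text.Trim() + " -> " + href;
    }

    // Adds one "text -> href" line per anchor in the document to the list box.
    public static void FillList(IHTMLDocument2 document, ListBox target)
    {
        target.Items.Clear();
        foreach (object item in document.links)
        {
            // Skip entries that are not anchors, such as AREA elements.
            IHTMLAnchorElement anchor = item as IHTMLAnchorElement;
            if (anchor == null)
            {
                continue;
            }
            IHTMLElement element = (IHTMLElement) item;
            target.Items.Add(FormatLink(element.innerText, anchor.href));
        }
    }
}
```

Calling LinkLister.FillList((IHTMLDocument2) axWebBrowser1.Document, listBox1) inside axWebBrowser1_DocumentComplete, in place of the foreach loop, would produce the shorter listing.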
For more information, see the following topics in the MSDN Library:
- Dynamic HTML (http://msdn.microsoft.com/library/en-us/iisref/html/psdk/asp/eadg39v0.asp)
- IHTMLDocument2 interface (http://msdn.microsoft.com/workshop/browser/mshtml/reference/ifaces/document2/document2.asp)