DOM Parser Example

The DOM extension in PHP provides extensive functionality for working with XML and HTML documents. We can construct a DOM object dynamically, or load a DOM document from an HTML file or from a string containing an HTML tag tree. We can also save the DOM document to an XML file, or extract the DOM tree from an XML document.
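As a rough illustration of loading a document from a string, the sketch below parses a small HTML snippet, reads one element, and serializes that element as XML. The snippet and the variable names are assumptions made up for this illustration.

```php
<?php
// Hypothetical HTML snippet, used only for illustration.
$html = '<html><body><p id="greeting">Hello</p></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);                 // parse HTML from a string

// Fetch the first <p> element and read its text.
$p = $dom->getElementsByTagName('p')->item(0);
$text = $p->nodeValue;
echo $text, PHP_EOL;

// Serialize only that node using XML formatting.
$fragment = $dom->saveXML($p);
echo $fragment, PHP_EOL;
```

Passing a node to saveXML() limits the output to that node; calling it without arguments dumps the whole document.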

The DOMDocument class is one of the most important classes defined in the DOM extension.

$obj = new DOMDocument($version = "1.0", $encoding = "");

It represents an entire HTML or XML document and serves as the root of the document tree. The DOMDocument class defines a number of methods, called on a DOMDocument object, some of which are introduced here −

Sr.No   Method                  Description
1       createElement           Create new element node
2       createAttribute         Create new attribute
3       createTextNode          Create new text node
4       getElementById          Searches for an element with a certain id
5       getElementsByTagName    Searches for all elements with given local tag name
6       load                    Load XML from a file
7       loadHTML                Load HTML from a string
8       loadHTMLFile            Load HTML from a file
9       loadXML                 Load XML from a string
10      save                    Dumps the internal XML tree back into a file
11      saveHTML                Dumps the internal document into a string using HTML formatting
12      saveHTMLFile            Dumps the internal document into a file using HTML formatting
13      saveXML                 Dumps the internal XML tree back into a string
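To show how the node-creation methods in the table fit together, here is a minimal sketch that builds a tiny XML document from scratch and dumps it to a string. The element names, the id value, and the text are hypothetical and chosen only for illustration.

```php
<?php
// Build a small XML document dynamically.
$dom = new DOMDocument('1.0', 'UTF-8');
$dom->formatOutput = true;              // indent the serialized output

// Root element <course>
$course = $dom->createElement('course');
$dom->appendChild($course);

// Attribute id="101" (hypothetical value)
$id = $dom->createAttribute('id');
$id->value = '101';
$course->appendChild($id);

// Child element <title> holding a text node
$title = $dom->createElement('title');
$title->appendChild($dom->createTextNode('Android'));
$course->appendChild($title);

// Dump the tree into a string and print it.
$xml = $dom->saveXML();
echo $xml;
```

Nodes created with createElement(), createAttribute() or createTextNode() are not part of the tree until they are attached, which is why each one is followed by an appendChild() call.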

Example

Let us use the following HTML file, saved as hello.html, for this example −

<html>
<head><title>Tutorialspoint</title></head>
<body>
<h2>Course details</h2>
<table border = "0">
<tbody>
<tr><td>Android</td><td>Gopal</td><td>Sairam</td></tr>
<tr><td>Hadoop</td><td>Gopal</td><td>Satish</td></tr>
<tr><td>HTML</td><td>Gopal</td><td>Raju</td></tr>
<tr><td>Web technologies</td><td>Gopal</td><td>Javed</td></tr>
<tr><td>Graphic</td><td>Gopal</td><td>Satish</td></tr>
<tr><td>Writer</td><td>Kiran</td><td>Amith</td></tr>
<tr><td>Writer</td><td>Kiran</td><td>Vineeth</td></tr>
</tbody>
</table>
</body>
</html>

We shall now extract the Document Object Model from the above HTML file by calling the loadHTMLFile() method in the following PHP code −

<?php

   /*** a new dom object ***/
   $dom = new DOMDocument();

   /*** discard white space (must be set before loading) ***/
   $dom->preserveWhiteSpace = false;

   /*** load the html into the object ***/
   $dom->loadHTMLFile("hello.html");

   /*** the table by its tag name ***/
   $tables = $dom->getElementsByTagName('table');

   /*** get all rows from the first table ***/
   $rows = $tables->item(0)->getElementsByTagName('tr');

   /*** loop over the table rows ***/
   foreach ($rows as $row) {

      /*** get each column by tag name ***/
      $cols = $row->getElementsByTagName('td');

      /*** echo the values ***/
      echo 'Designation: ' . $cols->item(0)->nodeValue . '<br />';
      echo 'Manager: ' . $cols->item(1)->nodeValue . '<br />';
      echo 'Team: ' . $cols->item(2)->nodeValue;
      echo '<hr />';
   }
?>
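The same extraction can also be wrapped in a function that returns the rows as plain arrays instead of echoing them. The sketch below is one possible variant, not part of the original example: the helper name tableRows() is hypothetical, and it uses loadHTML() on a shortened copy of the table so that it runs without an external file.

```php
<?php
/**
 * Hypothetical helper: returns every table row in the given HTML
 * as an array of trimmed cell strings.
 */
function tableRows(string $html): array
{
    $dom = new DOMDocument();
    $dom->loadHTML($html);

    $rows = [];
    foreach ($dom->getElementsByTagName('tr') as $tr) {
        $cells = [];
        foreach ($tr->getElementsByTagName('td') as $td) {
            $cells[] = trim($td->textContent);
        }
        $rows[] = $cells;
    }
    return $rows;
}

// Two rows taken from the sample table above.
$html = '<table><tr><td>Android</td><td>Gopal</td><td>Sairam</td></tr>'
      . '<tr><td>Hadoop</td><td>Gopal</td><td>Satish</td></tr></table>';

$result = tableRows($html);
foreach ($result as [$designation, $manager, $team]) {
    echo "$designation | $manager | $team", PHP_EOL;
}
```

Returning arrays keeps the parsing separate from the presentation, so the same rows can be printed as HTML, written to a file, or passed on to other code.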
