Category: XML

  • DOM Parser Example

    The DOM extension in PHP comes with extensive functionality for performing various operations on XML and HTML documents. We can dynamically construct a DOM object, or load a DOM document from an HTML file or from a string containing an HTML tag tree. We can also save the DOM document to an XML file, or extract the DOM tree from an XML document.

    The DOMDocument class is one of the most important classes defined in the DOM extension.

    $obj = new DOMDocument($version = "1.0", $encoding = "")

    It represents an entire HTML or XML document and serves as the root of the document tree. The DOMDocument class defines a number of methods, which are called on a DOMDocument object. Some of them are introduced here −

    Sr.No   Method                 Description
    1       createElement          Create new element node
    2       createAttribute        Create new attribute
    3       createTextNode         Create new text node
    4       getElementById         Searches for an element with a certain id
    5       getElementsByTagName   Searches for all elements with given local tag name
    6       load                   Load XML from a file
    7       loadHTML               Load HTML from a string
    8       loadHTMLFile           Load HTML from a file
    9       loadXML                Load XML from a string
    10      save                   Dumps the internal XML tree back into a file
    11      saveHTML               Dumps the internal document into a string using HTML formatting
    12      saveHTMLFile           Dumps the internal document into a file using HTML formatting
    13      saveXML                Dumps the internal XML tree back into a string
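
    As a minimal sketch of how the creation methods listed above fit together, the following builds a tiny document in memory and serializes it with saveXML(). The element name "course", the attribute "category" and the text values are illustrative assumptions, not part of the original example.

```php
<?php
// Build a small document in memory (all names and values are illustrative).
$dom = new DOMDocument("1.0", "UTF-8");

$course = $dom->createElement("course");           // <course>
$category = $dom->createAttribute("category");     // category="..."
$category->value = "PHP";
$course->appendChild($category);

$title = $dom->createElement("title");             // <title>
$title->appendChild($dom->createTextNode("DOM basics"));
$course->appendChild($title);

$dom->appendChild($course);

// Passing a node to saveXML() serializes only that node, without the XML declaration.
$xml = $dom->saveXML($dom->documentElement);
echo $xml, PHP_EOL;
```

    Nodes created with createElement(), createAttribute() and createTextNode() are not part of the tree until they are attached with appendChild().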

    Example

    Let us use the following HTML file for this example −

    <html><head><title>Tutorialspoint</title></head><body><h2>Course details</h2><table border = "0"><tbody><tr><td>Android</td><td>Gopal</td><td>Sairam</td></tr><tr><td>Hadoop</td><td>Gopal</td><td>Satish</td></tr><tr><td>HTML</td><td>Gopal</td><td>Raju</td></tr><tr><td>Web technologies</td><td>Gopal</td><td>Javed</td></tr><tr><td>Graphic</td><td>Gopal</td><td>Satish</td></tr><tr><td>Writer</td><td>Kiran</td><td>Amith</td></tr><tr><td>Writer</td><td>Kiran</td><td>Vineeth</td></tr></tbody></table></body></html>

    We shall now extract the Document Object Model from the above HTML file by calling the loadHTMLFile() method in the following PHP code −

    <?php

       /*** a new dom object ***/
       $dom = new DOMDocument;

       /*** discard white space (set before loading so it can take effect) ***/
       $dom->preserveWhiteSpace = false;

       /*** load the html into the object ***/
       $dom->loadHTMLFile("hello.html");

       /*** the table by its tag name ***/
       $tables = $dom->getElementsByTagName('table');

       /*** get all rows from the first table ***/
       $rows = $tables->item(0)->getElementsByTagName('tr');

       /*** loop over the table rows ***/
       foreach ($rows as $row) {

          /*** get each column by tag name ***/
          $cols = $row->getElementsByTagName('td');

          /*** echo the values ***/
          echo 'Designation: '.$cols->item(0)->nodeValue.'<br />';
          echo 'Manager: '.$cols->item(1)->nodeValue.'<br />';
          echo 'Team: '.$cols->item(2)->nodeValue;
          echo '<hr />';
       }
    ?>
  • SAX Parser Example

    PHP's XML Parser extension is enabled by default. This parser implements a SAX-style API, which is an event-based approach to parsing.

    An event-based parser doesn't load the entire XML document into memory. Instead, it reads in one node at a time, and lets you interact with that node as it is read. Once you move on to the next node, the old one is removed from memory.

    SAX-based parsing generally uses less memory than tree-based parsing and is often faster for large documents. PHP includes functions for handling the XML events, as explained in this chapter.

    The first step in parsing an XML document is to create a parser object with the xml_parser_create() function.

    xml_parser_create(?string $encoding = null): XMLParser

    This function creates a new XML parser and returns an object of XMLParser to be used by the other XML functions.

    The xml_parse() function starts parsing an XML document.

    xml_parse(XMLParser $parser, string $data, bool $is_final = false): int

    xml_parse() parses an XML document. The handlers for the configured events are called as many times as necessary.
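
    The return value of xml_parse() is 1 on success and 0 on failure, so it is worth checking. The following is a minimal sketch, assuming a deliberately malformed input string, of how the error can be reported with xml_get_error_code(), xml_error_string() and xml_get_current_line_number(). The input and variable names are illustrative.

```php
<?php
// Deliberately malformed input: <b> is never closed.
$data = "<a><b>text</a>";

$parser = xml_parser_create();

// The third argument tells the parser that this is the last chunk of data.
$ok = xml_parse($parser, $data, true);

$message = null;
if ($ok !== 1) {
    $message = xml_error_string(xml_get_error_code($parser));
    $line = xml_get_current_line_number($parser);
    echo "XML error: $message at line $line", PHP_EOL;
}
```

    The exact wording of the message depends on the underlying XML library, so it should not be matched literally in application code.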

    The XML Parser extension provides functions for registering different event handlers.

    xml_set_element_handler()

    This function sets the element handler functions for the XML parser. Element events are issued whenever the XML parser encounters start or end tags. There are separate handlers for start tags and end tags.

    xml_set_element_handler(XMLParser $parser, callable $start_handler, callable $end_handler): true

    The start_handler() function is called when an XML element is opened, and the end_handler() function is called when an XML element is closed.
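
    To illustrate how the start and end handlers are driven by the parser, here is a small sketch that records the nesting of elements in an inline XML string. The handler names onStart and onEnd and the sample string are assumptions made for this illustration.

```php
<?php
$depth = 0;     // current nesting level
$lines = [];    // one entry per start tag, indented by depth

// Called for every start tag; $attrs holds the element's attributes.
function onStart($parser, $name, $attrs) {
    global $depth, $lines;
    $lines[] = str_repeat("  ", $depth) . $name;
    $depth++;
}

// Called for every end tag.
function onEnd($parser, $name) {
    global $depth;
    $depth--;
}

$parser = xml_parser_create();
xml_set_element_handler($parser, "onStart", "onEnd");
xml_parse($parser, "<tutors><course><name>Android</name></course></tutors>", true);

echo implode(PHP_EOL, $lines), PHP_EOL;
```

    The names are reported as TUTORS, COURSE and NAME because case folding is enabled by default, which converts element names to upper case before they reach the handlers.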

    xml_set_character_data_handler()

    This function sets the character data handler function for the XML parser. Character data is roughly all the non-markup content of an XML document, including whitespace between tags.

    xml_set_character_data_handler(XMLParser $parser, callable $handler): true

    xml_set_processing_instruction_handler()

    This function sets the processing instruction (PI) handler function for the XML parser. <?php ?> is a processing instruction, where php is called the "PI target". The handling of processing instructions is application-specific.

    xml_set_processing_instruction_handler(XMLParser $parser, callable $handler): true

    A processing instruction has the following format −

    <?target
       data
    ?>
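
    As a rough sketch of the format above, the handler below receives the target and the data of each processing instruction as separate strings. The "render" target, its data and the handler name onInstruction are made up for this illustration.

```php
<?php
$seen = [];   // maps each PI target to its data

// The handler receives the parser, the PI target, and the PI data.
function onInstruction($parser, $target, $data) {
    global $seen;
    $seen[$target] = $data;
}

$parser = xml_parser_create();
xml_set_processing_instruction_handler($parser, "onInstruction");
xml_parse($parser, '<doc><?render mode="compact"?></doc>', true);

print_r($seen);
```

    What to do with the data is left entirely to the application; the parser does not interpret it.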

    xml_set_default_handler()

    This function sets the default handler function for the XML parser. Whatever is not passed to another handler goes to the default handler. You will get things like the XML and document type declarations in the default handler.

    xml_set_default_handler(XMLParser $parser, callable $handler): true

    Example

    The following example demonstrates the use of the SAX API for parsing an XML document. We shall use the sax.xml file shown below −

    <?xml version = "1.0" encoding = "utf-8"?><tutors><course><name>Android</name><country>India</country><email>[email protected]</email><phone>123456789</phone></course><course><name>Java</name><country>India</country><email>[email protected]</email><phone>123456789</phone></course><course><name>HTML</name><country>India</country><email>[email protected]</email><phone>123456789</phone></course></tutors>

    Example

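    The PHP code to parse the above document is given below. It opens the XML file and calls the xml_parse() function until the end of the file is reached. The event handlers store the data in the tutors array, which is then echoed element by element. The handlers compare against upper-case names such as COURSE because the parser's case folding is enabled by default.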

    <?php

       // Reading XML using the SAX (Simple API for XML) parser
       $tutors   = array();
       $elements = null;

       // Called when a start tag is encountered
       function startElements($parser, $name, $attrs) {
          global $tutors, $elements;
          if (!empty($name)) {
             if ($name == 'COURSE') {
                // creating an array to store information
                $tutors[] = array();
             }
             $elements = $name;
          }
       }

       // Called when an end tag is encountered
       function endElements($parser, $name) {
          global $elements;
          if (!empty($name)) {
             $elements = null;
          }
       }

       // Called on the text between the start and end of the tags
       function characterData($parser, $data) {
          global $tutors, $elements;
          if (!empty($data)) {
             if ($elements == 'NAME' || $elements == 'COUNTRY' || $elements == 'EMAIL' || $elements == 'PHONE') {
                $tutors[count($tutors) - 1][$elements] = trim($data);
             }
          }
       }

       $parser = xml_parser_create();
       xml_set_element_handler($parser, "startElements", "endElements");
       xml_set_character_data_handler($parser, "characterData");

       // open xml file
       if (!($handle = fopen('sax.xml', "r"))) {
          die("could not open XML input");
       }

       // feed the file to the parser in chunks; flag the last chunk as final
       while ($data = fread($handle, 4096)) {
          xml_parse($parser, $data, feof($handle));
       }
       fclose($handle);
       xml_parser_free($parser);

       $i = 1;
       foreach ($tutors as $course) {
          echo "course No - ".$i.'<br/>';
          echo "course Name - ".$course['NAME'].'<br/>';
          echo "Country - ".$course['COUNTRY'].'<br/>';
          echo "Email - ".$course['EMAIL'].'<br/>';
          echo "Phone - ".$course['PHONE'].'<hr/>';
          $i++;
       }
    ?>
  • Simple XML Parser

    The SimpleXML extension of PHP provides a very simple and easy-to-use toolset to convert XML to an object that can be processed with normal property selectors and array iterators. It is a tree-based parser and works well with simple XML files, but it may face issues when working with larger and more complex XML documents.

    The following functions are defined in SimpleXML extension −

    simplexml_load_file

    The simplexml_load_file() function interprets an XML file into an object −

    simplexml_load_file(string $filename, ?string $class_name = SimpleXMLElement::class, int $options = 0, string $namespace_or_prefix = "", bool $is_prefix = false): SimpleXMLElement|false

    A well-formed XML document in the given file is converted into an object.

    The filename parameter is a string representing the XML file to be parsed. class_name is the optional parameter. It specifies the class whose object will be returned by the function. The function returns an object of class SimpleXMLElement with properties containing the data held within the XML document, or false on failure.

    Example

    Take a look at the following example −

    <?php
       $xml = simplexml_load_file("test.xml") or die("Error: Cannot create object");
       print_r($xml);
    ?>

    It will produce the following output −

    SimpleXMLElement Object
    (
       [Course] => Android
       [Subject] => Android
       [Company] => TutorialsPoint
       [Price] => $10
    )
    

    simplexml_load_string

    The simplexml_load_string() function interprets a string of XML into an object.

    simplexml_load_string(string $data, ?string $class_name = SimpleXMLElement::class, int $options = 0, string $namespace_or_prefix = "", bool $is_prefix = false): SimpleXMLElement|false

    A well-formed XML document in the given string is converted into an object.

    The $data parameter is a string representing the XML document to be parsed. class_name is an optional parameter. It specifies the class whose object will be returned by the function. The function returns an object of class SimpleXMLElement with properties containing the data held within the XML document, or false on failure.

    Example

    Take a look at the following example −

    <?php
       $data = "<?xml version = '1.0' encoding = 'UTF-8'?>
       <note>
          <Course>Android</Course>
          <Subject>Android</Subject>
          <Company>TutorialsPoint</Company>
          <Price>$10</Price>
       </note>";

       $xml = simplexml_load_string($data) or die("Error: Cannot create object");
       print_r($xml);
    ?>

    It will produce the following output −

    SimpleXMLElement Object
    (
       [Course] => Android
       [Subject] => Android
       [Company] => TutorialsPoint
       [Price] => $10
    )
    

    simplexml_import_dom

    The simplexml_import_dom() function constructs a SimpleXMLElement object from a DOM node.

    simplexml_import_dom(SimpleXMLElement|DOMNode $node, ?string $class_name = SimpleXMLElement::class): ?SimpleXMLElement

    This function takes a node of a DOM document and makes it into a SimpleXML node. This new object can then be used as a native SimpleXML element.

    The node parameter is a DOM Element node. The optional class_name may be given so that simplexml_import_dom() will return an object of the specified sub class of the SimpleXMLElement class. The value returned by this function is a SimpleXMLElement or null on failure.

    Example

    Take a look at the following example −

    <?php
       $dom = new DOMDocument;

       // loadXML() returns false on failure, so test its return value
       if (!$dom->loadXML('<books><book><title>PHP Handbook</title></book></books>')) {
          echo 'Error while parsing the document';
          exit;
       }

       $s = simplexml_import_dom($dom);
       echo $s->book[0]->title;
    ?>

    It will produce the following output −

    PHP Handbook
    

    Get the Node Values

    The following code shows how to get the node values from an XML file. The XML file, books.xml, should be as follows −

    <?xml version = "1.0" encoding = "utf-8"?>
    <tutorialspoint>
       <course category = "JAVA"><title lang = "en">Java</title><tutor>Gopal</tutor><duration></duration><price>$30</price></course>
       <course category = "HADOOP"><title lang = "en">Hadoop</title><tutor>Satish</tutor><duration>3</duration><price>$50</price></course>
       <course category = "HTML"><title lang = "en">html</title><tutor>raju</tutor><duration>5</duration><price>$50</price></course>
       <course category = "WEB"><title lang = "en">Web Technologies</title><tutor>Javed</tutor><duration>10</duration><price>$60</price></course>
    </tutorialspoint>

    Example

    PHP code should be as follows −

    <?php
       $xml = simplexml_load_file("books.xml") or die("Error: Cannot create object");

       foreach ($xml->children() as $books) {
          echo $books->title . "<br> ";
          echo $books->tutor . "<br> ";
          echo $books->duration . "<br> ";
          echo $books->price . "<hr>";
       }
    ?>

    It will produce the following output −

    Java
    Gopal
    
    $30
    ________________________________________
    Hadoop
    Satish
    3
    $50
    ________________________________________
    html
    raju
    5
    $50
    ________________________________________
    Web Technologies
    Javed
    10
    $60
    ________________________________________
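
    The course and title elements in this document also carry attributes (category and lang). As a hedged sketch, SimpleXML exposes attributes through array-style access on the element. The shortened inline document and the $summary variable below are assumptions made to keep the example self-contained.

```php
<?php
// A shortened, inline version of the course document (illustrative data).
$data = <<<'XML'
<tutorialspoint>
   <course category="JAVA"><title lang="en">Java</title><price>$30</price></course>
   <course category="HADOOP"><title lang="en">Hadoop</title><price>$50</price></course>
</tutorialspoint>
XML;

$xml = simplexml_load_string($data);

$summary = [];
foreach ($xml->course as $course) {
    // Child elements use property syntax, attributes use array syntax.
    // Casting to string turns the SimpleXMLElement into its plain text value.
    $summary[] = (string) $course['category'] . ': ' . (string) $course->title
               . ' (' . (string) $course->title['lang'] . ')';
}

echo implode(PHP_EOL, $summary), PHP_EOL;
```

    The explicit string casts matter when the values are stored or compared, because both property and attribute access return SimpleXMLElement objects rather than strings.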
    
  • XML Introduction

    With the help of PHP's built-in functions and libraries, we can handle manipulation of XML data. XML, which stands for eXtensible Markup Language, is a data format for structured document interchange, especially on the Web.

    XML is a popular file format used for the serialization of data: storing the data, transmitting it to another location, and reconstructing it at the destination.

    In this chapter, we shall learn about the basics of XML processing with PHP.

    Features of XML

    One of the features of XML is that it is both human readable and machine readable. The specifications of XML are defined and standardized by the World Wide Web Consortium. PHP's XML parsers can perform read/write operations on XML data.

    XML Tags

    Like HTML, an XML document is composed with the help of tags. However, you can define your own tags, unlike HTML, where you must use predefined tags to compose a document.

    HTML tags essentially apply formatting to text, images, multimedia resources and so on. XML tags instead describe the data elements with user-defined names and attributes.

    XML Document

    An XML document has a hierarchical structure of tags that define the elements and attributes of data within a document. Each XML document consists of a root element that encloses other elements. Elements can have attributes, which provide additional information or properties about the element. The data within elements are enclosed by opening and closing tags.

    Example

    An example of a typical XML document is given below −

    <?xml version = '1.0' encoding = 'UTF-8'?><note><Course>Android</Course><Subject>Android</Subject><Company>TutorialsPoint</Company><Price>$10</Price></note>

    Types of XML Parsers

    In PHP, there are two types of XML parsers available −

    • Tree based parsers
    • Event based parsers

    Tree-based Parsers

    With this type of a parser, PHP loads the entire XML document in the memory and transforms the XML document into a Tree structure. It analyzes the whole document, and provides access to the Tree elements.

    For smaller documents, a tree-based parser works well, but for a large XML document it can cause major performance issues. The SimpleXML parser and the DOM XML parser are examples of tree-based parsers.

    Simple XML Parser

    The SimpleXML parser is a tree-based XML parser suited to simple XML files. It provides the simplexml_load_file() function to read the XML from a specified path.

    DOM Parser

    The DOM parser, also described as a complex node parser, is used to parse highly complex XML files. It also serves as an interface for modifying an XML document. The DOM extension works with UTF-8 character encoding.
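
    To make the point about modifying XML concrete, here is a minimal sketch that loads a small document, changes one value, appends a new element and serializes the result. The new price is an illustrative value, not taken from the original text.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML('<note><Course>Android</Course><Price>$10</Price></note>');

// Change the text of the first <Price> element.
$price = $dom->getElementsByTagName('Price')->item(0);
$price->nodeValue = '$12';

// Append a new element to the root element.
$dom->documentElement->appendChild($dom->createElement('Company', 'TutorialsPoint'));

// Serialize the root element only, without the XML declaration.
$result = $dom->saveXML($dom->documentElement);
echo $result, PHP_EOL;
```

    Calling save() with a file name instead of saveXML() would write the modified tree back to disk.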

    Event-based Parsers

    An event-based parser doesn't load the entire XML document into memory. Instead, it reads in one node at a time, and lets you interact with that node as it is read. Once you move on to the next node, the old one is removed from memory.

    Because the whole document is never held in memory, this type of parser is suitable for large XML documents, and it typically parses them faster than a tree-based parser. XMLReader and the XML Expat Parser are examples of event-based parsers.

    XML Parser

    The XML Parser extension is based on SAX-style parsing. It is typically faster than the tree-based parsers described above. You create a parser object and then feed the XML to it. The XML parser supports the ISO-8859-1, US-ASCII and UTF-8 character encodings.

    XML Reader

    XMLReader is also called a pull XML parser. It reads an XML file quickly, one node at a time, and it works with highly complex XML documents, including XML validation.
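
    A minimal sketch of the pull style follows: the code asks the reader for the next node with read() and decides what to do with it. It reuses the note document from the example above as an inline string; the $values array is an assumption made for this illustration.

```php
<?php
$data = '<note><Course>Android</Course><Subject>Android</Subject><Price>$10</Price></note>';

$reader = new XMLReader();
$reader->XML($data);   // load the XML from a string

$values = [];
// read() advances the cursor to the next node and returns false at the end.
while ($reader->read()) {
    // Only handle element nodes that are direct children of the root.
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->depth === 1) {
        $values[$reader->name] = $reader->readString();
    }
}
$reader->close();

print_r($values);
```

    Unlike the SAX-style handlers, no callbacks are registered here; the calling code controls when the parser moves forward.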