DOMDocument is part of PHP's DOM extension, which uses the libxml library.
The DOMDocument class extends the DOMNode class, which is also extended by the DOMElement class. So the properties and methods of DOMNode are available to both DOMDocument and DOMElement, while the DOMElement methods are available on the element nodes a document contains.
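The relationship between the three classes can be checked with `instanceof`. This is only a small sketch; the sample markup and the variable names are my own and are not taken from the parser itself.

```php
<?php
// Sample markup and variable names are assumptions for illustration.
$doc = new DOMDocument();
$doc->loadHTML('<p>Hello <strong>world</strong></p>');

// getElementsByTagName() returns the matching DOMElement nodes.
$p = $doc->getElementsByTagName('p')->item(0);

$docIsNode    = $doc instanceof DOMNode;    // DOMDocument extends DOMNode
$elemIsNode   = $p instanceof DOMNode;      // DOMElement extends DOMNode
$docIsElement = $doc instanceof DOMElement; // a document is not an element

var_dump($docIsNode, $elemIsNode, $docIsElement);
```

The first two checks are true and the last is false, which is why DOMNode methods such as getNodePath work on both documents and elements.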
The hardest part of developing this parser was figuring out how to deal with nested elements. A nested element, such as <strong>, would throw off the parser so that the parent's text following the nested element was wrongly attributed to the nested child.
The solution came when I found the getNodePath method in DOMNode, which I use in this class as a unique key for each element. With that key I was finally able to match each text part with its correct element.
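Here is a rough sketch of the idea, not the actual code of this class. It assumes the text nodes are collected with a DOMXPath query and grouped by the node path of their parent element; the `$parts` array and the sample markup are hypothetical.

```php
<?php
// Hypothetical sample: text before and after a nested <strong>.
$html = '<p>Some <strong>bold</strong> text after.</p>';

$doc = new DOMDocument();
$doc->loadHTML($html);

$xpath = new DOMXPath($doc);
$parts = [];

// Visit every text node in document order.
foreach ($xpath->query('//text()') as $textNode) {
    // The parent's node path serves as a unique key for the element.
    $key = $textNode->parentNode->getNodePath();
    $parts[$key][] = $textNode->nodeValue;
}

print_r($parts);
```

Because loadHTML wraps the fragment in html and body elements, the keys here are /html/body/p and /html/body/p/strong. The text after the closing </strong> tag lands under the paragraph's key rather than under the nested element's key.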
Have you ever run across a web designer who believes (or was taught) that tables should never be used? I have, and they are always surprised when their CSS techniques don't work very well with tabular data. Use CSS for design; use tables for tabular data.
Annual Sales to Date

Salesman | Sales | Commissions |
---|---|---|
Jim | 25,585.88 | 2,580.00 |
Lisa | 68,356.22 | 6,830.00 |
Bubba | 3,369.42 | 330.00 |
The real test comes with poorly formatted markup. For example, & is a special character and should be written as an entity.
Starting another paragraph without closing the old one is a common mistake.
Using [brackets] can cause some issues.
However, because the notices generated by DOMDocument have been suppressed, the parser isn't quite as picky.
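One common way to suppress those notices is libxml_use_internal_errors, which collects the parse errors instead of emitting them as PHP warnings. This is a sketch under that assumption; the class may suppress them differently, and the sample markup is made up to mirror the problems listed above.

```php
<?php
// Hypothetical bad markup: a bare ampersand, unclosed paragraphs, brackets.
$html = '<p>Fish & chips<p>Second paragraph with [brackets]';

// Collect libxml errors internally rather than raising PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

// Keep the collected errors for inspection, then reset libxml's state.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

// The parser closes the first paragraph when the second one starts.
$paragraphs = $doc->getElementsByTagName('p');
$count = $paragraphs->length;
echo $count, "\n";

foreach ($errors as $error) {
    echo trim($error->message), "\n";
}
```

Which messages end up in `$errors` depends on the installed libxml version, so the sketch prints whatever was collected instead of assuming a particular list.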