Perl for XML Processing
Technologies of Interest
XML-LibXML
XML-LibXML is the de facto standard for XML processing in Perl. It is a comprehensive CPAN distribution built on the libxml2 library that provides a DOM (Document Object Model) API, a SAX (Simple API for XML) stream parser, a pull parser (XML::LibXML::Reader), and XPath support. XSLT is available through the companion XML-LibXSLT distribution. XML-LibXML has good documentation and is actively maintained.
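As a minimal sketch of the DOM and XPath interfaces described above, the following parses a small inline document and reads an attribute and a child element from each match. The document itself (the library, book and title elements and the id attribute) is made up for illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

# A small inline document; element and attribute names are hypothetical.
my $xml = <<'XML';
<library>
  <book id="b1"><title>Programming Perl</title></book>
  <book id="b2"><title>Learning Perl</title></book>
</library>
XML

# Parse the string into a DOM tree.
my $doc = XML::LibXML->load_xml(string => $xml);

# Select nodes with XPath, then read an attribute and the text of a child.
my @titles;
for my $book ($doc->findnodes('/library/book')) {
    my $id    = $book->getAttribute('id');
    my $title = $book->findvalue('./title');
    push @titles, "$id: $title";
}

print "$_\n" for @titles;
```

The same load_xml method also accepts location => $path_or_url or IO => $filehandle in place of string when the XML lives in a file.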
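Before using this library, make sure you understand XML namespaces and how they interact with the DOM and the XML-LibXML API. For example, an XPath expression matches elements that sit in a default namespace only if you register a prefix for that namespace and use it in the expression.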
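The namespace caveat can be illustrated with a short sketch. It assumes a tiny Atom-like document declared in a default namespace, and the atom prefix is an arbitrary choice made here: only the namespace URI has to match the document.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# The root element declares a default namespace, so every element is in it.
my $xml = <<'XML';
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Feed</title>
</feed>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# Unprefixed names in XPath match only elements in no namespace,
# so this query finds nothing in the document above.
my @plain = $doc->findnodes('/feed/title');

# Register a prefix for the namespace URI and use it in the expression.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(atom => 'http://www.w3.org/2005/Atom');
my @qualified = $xpc->findnodes('/atom:feed/atom:title');
my $title     = $xpc->findvalue('/atom:feed/atom:title');

printf "unprefixed matches: %d\n", scalar @plain;
printf "prefixed matches:   %d\n", scalar @qualified;
print  "title: $title\n";
```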
Web Pages about Perl and XML
The Perl XML Project Home Page
Their Frequently Asked Questions List (FAQ)
What to Avoid
XML-Simple
XML-Simple is not so simple when used properly, and it takes the wrong approach to dealing with XML. Please avoid it: even its own documentation discourages its use in new code. Use XML-LibXML instead, as an easy and fast alternative.
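A common reason people reach for XML-Simple is to turn a small configuration file into a Perl hash. The sketch below shows one way to do that with XML-LibXML instead. The config, server, host and port names are hypothetical, and the hash layout is just one reasonable choice.

```perl
use strict;
use warnings;
use XML::LibXML;

# A hypothetical configuration document.
my $xml = <<'XML';
<config>
  <server name="alpha"><host>10.0.0.1</host><port>8080</port></server>
  <server name="beta"><host>10.0.0.2</host><port>9090</port></server>
</config>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# Build the hash explicitly, so its shape does not depend on
# how many <server> elements the document happens to contain.
my %servers;
for my $server ($doc->findnodes('/config/server')) {
    $servers{ $server->getAttribute('name') } = {
        host => $server->findvalue('./host'),
        port => $server->findvalue('./port'),
    };
}

for my $name (sort keys %servers) {
    print "$name => $servers{$name}{host}:$servers{$name}{port}\n";
}
```

Because the structure is built by hand, a file with a single server and a file with several produce the same kind of hash.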
Modules for Dealing with Specific Grammars
In addition to generic XML parsers and manipulators, there are many specialised modules on CPAN for dealing with specific XML grammars. Many of them reside under the XML:: namespace. Some prominent examples include:
- XML-RSS - manipulate RSS (Really Simple Syndication) 0.9, 0.91, 1.0 and 2.0.
- XML-Atom - manipulate Atom feeds (Atom is an alternative syndication format to RSS).
- XML-Feed - generate, parse, mix and match web feeds (Atom or RSS).
- OpenOffice-OODoc - manipulate ODF (OpenDocument Format) files, such as those produced by OpenOffice.org.
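To give a feel for what a grammar-specific module looks like in use, here is a minimal sketch with XML-RSS that builds an RSS 2.0 feed in memory and parses it back. The channel and item values are invented for illustration.

```perl
use strict;
use warnings;
use XML::RSS;

# Build a small RSS 2.0 feed; all titles and URLs are made up.
my $rss = XML::RSS->new(version => '2.0');
$rss->channel(
    title       => 'Example News',
    link        => 'http://example.com/',
    description => 'A made-up channel for illustration',
);
$rss->add_item(
    title       => 'First post',
    link        => 'http://example.com/first',
    description => 'Hello, feed readers',
);
my $feed_xml = $rss->as_string;

# Parse the generated XML back and collect the item titles.
my $parsed = XML::RSS->new;
$parsed->parse($feed_xml);
my @item_titles = map { $_->{title} } @{ $parsed->{items} };

print $parsed->channel('title'), "\n";
print " - $_\n" for @item_titles;
```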
