I'd like to reiterate that this is a BAD IDEA. Whilst XML looks like plain text, it isn't plain text. If you treat it as such, you are creating brittle, unmaintainable and unsupportable code, which may well break one day because someone changes the XML format in a perfectly valid way.
I would strongly suggest that your first port of call is to go back to your project and point out that parsing XML without an XML parser is rather like using a hammer to put screws into a piece of wood: it sort of works, but the results are rather shoddy, and frankly it's completely unnecessary, because screwdrivers exist, do the job properly and easily, and are widely available.
E.g., in answer to the question:
"can you tell me how I can print the author, title and price for each book id for the above XML file with a XML module ?"
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
my $twig = XML::Twig -> new -> parsefile ( 'your_file.xml' );
foreach my $book ( $twig -> get_xpath ( '//book' ) ) {
    print join ("\n",
        $book -> att('id'),
        $book -> field('author'),
        $book -> field('title'),
        $book -> field('price'), ),"\n----\n";
}
However:
Given your very specific sample, you may be able to get away with treating it as 'plain text'. Before you do this, you should point out to your project lead that this is a risky approach - you're putting in screws with a hammer - and therefore creating ongoing risk of support problems, which is trivially resolved by just installing a bit of freely available, open source code.
I am only suggesting this AT ALL because I've had to deal with ludicrously unreasonable similar project demands.
Like this:
#!/usr/bin/env perl
use strict;
use warnings;
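while ( <> ) {
    # Assumes the opening <book> tag and its id attribute share one line.
    if ( m/<book/ ) {
        my ( $id ) = ( m/id="(\w+)"/ );
        print $id,"\n";
    }
    # Assumes <author>...</author> sits alone on its own line.
    if ( m/<author/ ) {
        my ( $author ) = ( m/>(.*)</ );
        print $author,"\n";
    }
}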
Now, the reason this doesn't work in general is that your sample above can be perfectly validly formatted as:
<?xml version="1.0"?>
<catalog><book id="bk101"><author>Gambardella, Matthew</author><title>XML Developer's Guide</title><genre>Computer</genre><price>44.95</price><publish_date>2000-10-01</publish_date><description>An in-depth look at creating applications
with XML.</description></book><book id="bk102"><author>Ralls, Kim</author><title>Midnight Rain</title><genre>Fantasy</genre><price>5.95</price><publish_date>2000-12-16</publish_date><description>A former architect battles corporate zombies,
an evil sorceress, and her own childhood to become queen
of the world.</description></book></catalog>
Or:
<?xml version="1.0"?>
<catalog>
<book id="bk101">
<author>Gambardella, Matthew</author>
<title>XML Developer's Guide</title>
<genre>Computer</genre>
<price>44.95</price>
<publish_date>2000-10-01</publish_date>
<description>An in-depth look at creating applications
with XML.</description>
</book>
<book id="bk102">
<author>Ralls, Kim</author>
<title>Midnight Rain</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2000-12-16</publish_date>
<description>A former architect battles corporate zombies,
an evil sorceress, and her own childhood to become queen
of the world.</description>
</book>
</catalog>
Or:
<?xml version="1.0"?>
<catalog
><book
id="bk101"
><author
>Gambardella, Matthew</author><title
>XML Developer's Guide</title><genre
>Computer</genre><price
>44.95</price><publish_date
>2000-10-01</publish_date><description
>An in-depth look at creating applications
with XML.</description></book><book
id="bk102"
><author
>Ralls, Kim</author><title
>Midnight Rain</title><genre
>Fantasy</genre><price
>5.95</price><publish_date
>2000-12-16</publish_date><description
>A former architect battles corporate zombies,
an evil sorceress, and her own childhood to become queen
of the world.</description></book></catalog>
Or:
<?xml version="1.0"?>
<catalog>
<book id="bk101"><author>Gambardella, Matthew</author><title>XML Developer's Guide</title><genre>Computer</genre><price>44.95</price><publish_date>2000-10-01</publish_date><description>An in-depth look at creating applications
with XML.</description></book>
<book id="bk102"><author>Ralls, Kim</author><title>Midnight Rain</title><genre>Fantasy</genre><price>5.95</price><publish_date>2000-12-16</publish_date><description>A former architect battles corporate zombies,
an evil sorceress, and her own childhood to become queen
of the world.</description></book>
</catalog>
This is why you have so many comments that say 'use a parser': of the snippets above, the simplistic example I gave you will only work on one, and will break messily on the others.
But the XML::Twig solution handles them all correctly. XML::Twig is freely available on CPAN. (There are other libraries that do the job just as well.) It is also pre-packaged in many operating systems' default repositories.
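To make the fragility concrete, here is a minimal sketch that runs the same line-by-line regexes from the 'plain text' script over two layouts of the same data. The sub name extract_line_by_line and the trimmed two-book sample are my own, purely for illustration; it uses core Perl only.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Same regex logic as the 'plain text' script, wrapped in a sub
# (hypothetical name) so it can be applied to a string.
sub extract_line_by_line {
    my ($xml) = @_;
    my @found;
    foreach ( split /\n/, $xml ) {
        if ( m/<book/ ) {
            my ($id) = ( m/id="(\w+)"/ );
            push @found, defined $id ? $id : '(no id)';
        }
        if ( m/<author/ ) {
            my ($author) = ( m/>(.*)</ );
            push @found, defined $author ? $author : '(no author)';
        }
    }
    return @found;
}

# One element per line: the layout the regexes were written for.
my $pretty = <<'XML';
<catalog>
<book id="bk101">
<author>Gambardella, Matthew</author>
</book>
<book id="bk102">
<author>Ralls, Kim</author>
</book>
</catalog>
XML

# The same data with the whitespace between elements removed.
my $compact = '<catalog><book id="bk101"><author>Gambardella, Matthew</author></book>'
            . '<book id="bk102"><author>Ralls, Kim</author></book></catalog>';

print "pretty:  ", join( ' | ', extract_line_by_line($pretty) ),  "\n";
print "compact: ", join( ' | ', extract_line_by_line($compact) ), "\n";
```

On the one-element-per-line layout this prints the two ids and two authors as intended. On the compact layout it finds only the first id, and the 'author' it reports is a long run of markup, because the greedy >(.*)< match spans most of the line.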