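The XML modules on CPAN fall into three broad categories: modules that provide a unique interface to XML data (usually concerned with converting between XML instances and Perl data), modules that implement one of the standard XML APIs, and special-purpose modules that simplify a specific XML-related task. This month we look at the first group, the Perl-specific XML interfaces.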
Dromedary, or Arabian Camel
300 to 690 kg.
The dromedary camel is characterized by a long-curved neck, deep-narrow chest, and a single hump. ...
The dromedary camel is an herbivore. ...
The dromedary camel has a lifespan of about 40-50 years ...
With the exception of rutting males, dromedaries show very little aggressive behavior. ...
The camels prefer desert conditions characterized by a long dry season and a short rainy season. ...
Since the dromedary camel is domesticated, the camel has no special status in conservation. ...
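Now assume that the complete document (available with this month's sample code) contains information on every member of the camelid family, not just the dromedary shown above. To illustrate how each module extracts a subset of data from this file, we will write a very short script that processes the camelids.xml document and prints to STDOUT the common name of each species we find, its Latin name (in parentheses), and its current conservation status. After processing the whole document, the output of each script should be:

    Bactrian Camel (Camelus bactrianus) endangered
    Dromedary, or Arabian Camel (Camelus dromedarius) no special status
    Llama (Lama glama) no special status
    Guanaco (Lama guanicoe) special concern
    Vicuna (Vicugna vicugna) endangered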
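The line format described above can be pinned down with a few lines of plain Perl before any XML module is involved. This is only a sketch: the in-memory records and the `report_line` helper are hypothetical stand-ins for whatever each module hands back after parsing camelids.xml, and the values are taken from the expected output.

```perl
use strict;
use warnings;

# Hypothetical records standing in for data parsed from camelids.xml.
my @species = (
    { common => 'Bactrian Camel', latin => 'Camelus bactrianus', status => 'endangered' },
    { common => 'Llama',          latin => 'Lama glama',         status => 'no special status' },
);

# Build one report line: common name, Latin name in parentheses, status.
sub report_line {
    my ($rec) = @_;
    return sprintf "%s (%s) %s\n", $rec->{common}, $rec->{latin}, $rec->{status};
}

print report_line($_) for @species;
```

Each of the reading scripts in this article is, in effect, a different way of filling in those three fields.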
The hash looks like this:

    my %camelid_links = (
        one   => { url => 'http://www.online.discovery.com/news/picture/may99/photo20.html',
                   description => 'Bactrian Camel in front of Great ' .
                                  'Pyramids in Giza, Egypt.'},
        two   => { url => 'http://www.fotos-online.de/english/m/09/9532.htm',
                   description => 'Dromedary Camel illustrates the ' .
                                  'importance of accessorizing.'},
        three => { url => 'http://www.eskimo.com/~wallama/funny.htm',
                   description => 'Charlie - biography of a narcissistic llama.'},
        four  => { url => 'http://arrow.colorado.edu/travels/other/turkey.html',
                   description => 'A visual metaphor for the perl5-porters ' .
                                  'list?'},
        five  => { url => 'http://www.galaonline.org/pics.htm',
                   description => 'Many cool alpacas.'},
        six   => { url => 'http://www.thpf.de/suedamerikareise/galerie/vicunas.htm',
                   description => 'Wild Vicunas in a scenic landscape.'}
    );

And the document we hope to create from the hash looks like this:

    <?xml version="1.0"?>
    <html>
      <body>
        <a href="http://www.eskimo.com/~wallama/funny.htm">Charlie - biography of a narcissistic llama.</a>
        <a href="http://www.online.discovery.com/news/picture/may99/photo20.html">Bactrian Camel in front of Great Pyramids in Giza, Egypt.</a>
        <a href="http://www.fotos-online.de/english/m/09/9532.htm">Dromedary Camel illustrates the importance of accessorizing.</a>
        <a href="http://www.galaonline.org/pics.htm">Many cool alpacas.</a>
        <a href="http://arrow.colorado.edu/travels/other/turkey.html">A visual metaphor for the perl5-porters list?</a>
        <a href="http://www.thpf.de/suedamerikareise/galerie/vicunas.htm">Wild Vicunas in a scenic landscape.</a>
      </body>
    </html>
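Nicely indented XML output (like that shown above) matters for readability, but this kind of whitespace handling is not required for our task. What we care about is that the resulting document is well-formed and that it correctly represents the data in the hash.

With the tasks defined, it's time for the code examples.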
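To make the well-formedness requirement concrete: any URL or description containing `&`, `<`, `>`, or a double quote has to be escaped before it is written out, or the result will not parse. The sketch below does this by hand in plain Perl. It is not how any of the modules discussed here work internally, and `xml_escape`, `link_element`, and the example.com URL are hypothetical names chosen for illustration.

```perl
use strict;
use warnings;

# Escape characters that would break well-formedness in text or attribute values.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;     # must run first so later entities are not re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Render one link as an <a> element with an escaped href and escaped text.
sub link_element {
    my ($url, $description) = @_;
    return sprintf '<a href="%s">%s</a>', xml_escape($url), xml_escape($description);
}

print link_element('http://www.galaonline.org/pics.htm', 'Many cool alpacas.'), "\n";
print link_element('http://example.com/?a=1&b=2', 'Camels & "friends"'), "\n";
```

The writing examples that follow leave this work to the module, which is one of the practical reasons to use one rather than concatenating strings.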
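    # Reading camelids.xml with XML::Simple
    use XML::Simple;

    my $file = 'files/camelids.xml';
    my $xs1  = XML::Simple->new();

    my $doc = $xs1->XMLin($file);

    # XML::Simple folds the <species> elements into a hash keyed on
    # their "name" attribute, so each key is the Latin name.
    foreach my $key (keys (%{$doc->{species}})){
        print $doc->{species}->{$key}->{'common-name'} . ' (' . $key . ') ';
        print $doc->{species}->{$key}->{conservation}->{status} . "\n";
    }
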
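    # Writing the links hash with XML::Simple
    use XML::Simple;

    require "files/camelid_links.pl";
    my %camelid_links = get_camelid_data();

    my $xsimple = XML::Simple->new();

    print $xsimple->XMLout(\%camelid_links,
                           noattr  => 1,
                           xmldecl => '<?xml version="1.0"?>');

The requirements of this data-to-document task expose a weakness of XML::Simple: it does not let us decide which keys in the hash should be returned as elements and which as attributes. The output of the example above comes close to what we asked for but still falls well short. For cases where you prefer to manipulate the content of an XML document directly as Perl data structures but need finer control over the output, XML::Simple and XML::Writer work well together.

The following example shows how to use XML::Writer to meet our output requirements.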
    # Writing the links hash with XML::Writer
    use XML::Writer;

    require "files/camelid_links.pl";
    my %camelid_links = get_camelid_data();

    my $writer = XML::Writer->new();

    $writer->xmlDecl();
    $writer->startTag('html');
    $writer->startTag('body');

    foreach my $item ( keys (%camelid_links) ) {
        $writer->startTag('a', 'href' => $camelid_links{$item}->{url});
        $writer->characters($camelid_links{$item}->{description});
        $writer->endTag('a');
    }

    $writer->endTag('body');
    $writer->endTag('html');
    $writer->end();

    # Reading camelids.xml with XML::SimpleObject
    use XML::Parser;
    use XML::SimpleObject;

    my $file = 'files/camelids.xml';

    my $parser = XML::Parser->new(ErrorContext => 2, Style => "Tree");
    my $xso    = XML::SimpleObject->new( $parser->parsefile($file) );

    foreach my $species ($xso->child('camelids')->children('species')) {
        print $species->child('common-name')->value;
        print ' (' . $species->attribute('name') . ') ';
        print $species->child('conservation')->attribute('status');
        print "\n";
    }

    # Reading camelids.xml with XML::TreeBuilder
    use XML::TreeBuilder;

    my $file = 'files/camelids.xml';
    my $tree = XML::TreeBuilder->new();
    $tree->parse_file($file);

    foreach my $species ($tree->find_by_tag_name('species')){
        print $species->find_by_tag_name('common-name')->as_text;
        print ' (' . $species->attr_get_i('name') . ') ';
        print $species->find_by_tag_name('conservation')->attr_get_i('status');
        print "\n";
    }

    # Writing the links hash with XML::Element
    use XML::Element;

    require "files/camelid_links.pl";
    my %camelid_links = get_camelid_data();

    my $root   = XML::Element->new('html');
    my $body   = XML::Element->new('body');
    my $xml_pi = XML::Element->new('~pi', text => 'xml version="1.0"');
    $root->push_content($body);

    foreach my $item ( keys (%camelid_links) ) {
        my $link = XML::Element->new('a', 'href' => $camelid_links{$item}->{url});
        $link->push_content($camelid_links{$item}->{description});
        $body->push_content($link);
    }

    print $xml_pi->as_XML;
    print $root->as_XML();

    # Reading camelids.xml with XML::Twig
    use XML::Twig;

    my $file = 'files/camelids.xml';
    my $twig = XML::Twig->new();

    $twig->parsefile($file);

    my $root = $twig->root;

    foreach my $species ($root->children('species')){
        print $species->first_child_text('common-name');
        print ' (' . $species->att('name') . ') ';
        print $species->first_child('conservation')->att('status');
        print "\n";
    }

    # Writing the links hash with XML::Twig
    use XML::Twig;

    require "files/camelid_links.pl";
    my %camelid_links = get_camelid_data();

    my $root = XML::Twig::Elt->new('html');
    my $body = XML::Twig::Elt->new('body');
    $body->paste($root);

    foreach my $item ( keys (%camelid_links) ) {
        my $link = XML::Twig::Elt->new('a');
        $link->set_att('href', $camelid_links{$item}->{url});
        $link->set_text($camelid_links{$item}->{description});
        $link->paste('last_child', $body);
    }

    print qq|<?xml version="1.0"?>|;
    $root->print;

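These examples illustrate the basic usage of these general-purpose Perl XML modules. My goal was to provide enough examples to give you a feel for what writing code with each module is like. Next month we will look at the modules that implement one of the standard XML APIs, specifically XML::DOM, XML::XPath, and the many SAX and SAX-like modules.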