I call return simplexml_load_string($data, 'SimpleXMLElement', LIBXML_COMPACT | LIBXML_NOCDATA | LIBXML_NOBLANKS | LIBXML_NOEMPTYTAG);
and parse the XML response.
The problem is that the content of [description] is really messy, and I need to pick the relevant data out of it.
[description] =>
<a href="http://www.metacafe.com/watch/cb-YpE1z5IhjWrmCM62DSTU8jQ9X4IZryVR/the_dish_with_doc_willoughby/"><img src="http://s4.mcstatic.com/thumb/8000947/21507982/4/directors_cut/0/1/the_dish_with_doc_willoughby.jpg?v=8" align="right" border="0" alt="THE Dish with Doc Willoughby" vspace="4" hspace="4" width="134" height="78" /></a>
<p>
Doc Willoughby, guru of "America's Test Kitchen," stopped by "CBS The Morning: Saturday" to share his ultimate dish with Rebecca Jarvis and Jeff Glor: Roast Beef Tenderloin with Dried Fruit and Nut Stuffing. <br>Ranked <strong>4.00</strong> / 5 | 2 views | <a href="http://www.metacafe.com/watch/cb-YpE1z5IhjWrmCM62DSTU8jQ9X4IZryVR/the_dish_with_doc_willoughby/">0 comments</a><br/>
</p>
<p>
<a href="http://www.metacafe.com/watch/cb-YpE1z5IhjWrmCM62DSTU8jQ9X4IZryVR/the_dish_with_doc_willoughby/"><strong>Click here to watch the video</strong></a> (04:58)<br/>
Submitted By: <a href="http://www.metacafe.com/channels/CBS/">CBS</a><br/>
Tags:
<a href="http://www.metacafe.com/topics/cbsepisode/">Cbsepisode</a> <a href="http://www.metacafe.com/topics/dish/">Dish</a> <a href="http://www.metacafe.com/topics/doc_willoughby/">Doc Willoughby</a> <a href="http://www.metacafe.com/topics/america%27s_test_kitchen/">America's Test Kitchen</a> <a href="http://www.metacafe.com/topics/roast_beef_tenderloin/">Roast Beef Tenderloin</a> <a href="http://www.metacafe.com/topics/dried_fruit/">Dried Fruit</a> <a href="http://www.metacafe.com/topics/nut_stuffing/">Nut Stuffing</a> <a href="http://www.metacafe.com/topics/cbs_this_morning/">CBS This Morning</a> <br/>
Categories: <a href='http://www.metacafe.com/videos/news_and_events/'>News & Events</a> </p>
As you can see, it is really messy, and I was wondering how I can get, for example, the text of the first <p> up to "Ranked ...", and the tags as well.
Edit:
OK, here is the PHP code I am using:
$dom = new DOMDocument();
@$dom->loadHTML($result->description); // or you can use loadXML
$dom->normalizeDocument();
/*$dom->resolveExternals = false;
$dom->substituteEntities = false;*/
$xml = simplexml_import_dom($dom);
$data['viewData']['data']['description'] = $xml;
or
$paragraph = $dom->getElementsByTagName('p'); // this doesn't work
//$xml = simplexml_import_dom($dom);
$data['viewData']['data']['description'] = $paragraph;
and here is the output:
[description] => SimpleXMLElement Object
(
[body] => SimpleXMLElement Object
(
[a] => SimpleXMLElement Object
(
[@attributes] => Array
(
[href] => http://www.metacafe.com/watch/cb-YpE1z5IhjWrmCM62DSTU8jQ9X4IZryVR/the_dish_with_doc_willoughby/
)
[img] => SimpleXMLElement Object
(
[@attributes] => Array
(
[src] => http://s4.mcstatic.com/thumb/8000947/21507982/4/directors_cut/0/1/the_dish_with_doc_willoughby.jpg?v=8
[align] => right
[border] => 0
[alt] => THE Dish with Doc Willoughby
[vspace] => 4
[hspace] => 4
[width] => 134
[height] => 78
)
)
)
[p] => Array
(
[0] =>
Doc Willoughby, guru of "America's Test Kitchen," stopped by "CBS The Morning: Saturday" to share his ultimate dish with Rebecca Jarvis and Jeff Glor: Roast Beef Tenderloin with Dried Fruit and Nut Stuffing. Ranked / 5 | 2 views |
[1] => SimpleXMLElement Object
(
[a] => Array
(
[0] => SimpleXMLElement Object
(
[@attributes] => Array
(
[href] => http://www.metacafe.com/watch/cb-YpE1z5IhjWrmCM62DSTU8jQ9X4IZryVR/the_dish_with_doc_willoughby/
)
[strong] => Click here to watch the video
)
[1] => CBS
[2] => Cbsepisode
[3] => Dish
[4] => Doc Willoughby
[5] => America's Test Kitchen
[6] => Roast Beef Tenderloin
[7] => Dried Fruit
[8] => Nut Stuffing
[9] => CBS This Morning
[10] => News & Events
)
[br] => Array
(
[0] => SimpleXMLElement Object
(
)
[1] => SimpleXMLElement Object
(
)
[2] => SimpleXMLElement Object
(
)
)
)
)
Is there any way to make the output "prettier"? I mean better organized... I also tried using getElementsByTagName('p'), but without success.
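
For illustration, here is a minimal sketch of one way the two pieces asked about (the text before "Ranked ..." and the tag names) might be pulled out with DOMXPath instead of converting the whole fragment to SimpleXML. It assumes the fragment always has the shape shown above: the summary is the first non-empty text node of the first <p>, and tag links contain "/topics/" in their href. The function name parseDescription and the shortened sample in $html are hypothetical and stand in for $result->description. It may also help to know that getElementsByTagName('p') returns a DOMNodeList, which has to be iterated (or indexed with item()) rather than stored directly.

```php
<?php
// Hypothetical, shortened sample standing in for $result->description.
$html = <<<'HTML'
<a href="http://www.metacafe.com/watch/x/"><img src="http://example.invalid/t.jpg" alt="THE Dish" /></a>
<p>
Doc Willoughby shared his ultimate dish. <br>Ranked <strong>4.00</strong> / 5 | 2 views | <a href="http://www.metacafe.com/watch/x/">0 comments</a><br/>
</p>
<p>
<a href="http://www.metacafe.com/watch/x/"><strong>Click here to watch the video</strong></a> (04:58)<br/>
Submitted By: <a href="http://www.metacafe.com/channels/CBS/">CBS</a><br/>
Tags:
<a href="http://www.metacafe.com/topics/dish/">Dish</a> <a href="http://www.metacafe.com/topics/nut_stuffing/">Nut Stuffing</a> <br/>
Categories: <a href='http://www.metacafe.com/videos/news_and_events/'>News &amp; Events</a> </p>
HTML;

// Hypothetical helper: returns the summary text and the list of tag names.
function parseDescription(string $html): array
{
    $dom = new DOMDocument();

    // Collect parser warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);

    // First non-whitespace text node of the first <p>: everything before the <br>.
    $summary = trim($xpath->evaluate('string((//p)[1]/text()[normalize-space()][1])'));

    // Assumption: every tag link points at a ".../topics/..." URL.
    $tags = [];
    foreach ($xpath->query('//a[contains(@href, "/topics/")]') as $link) {
        $tags[] = trim($link->textContent);
    }

    return ['summary' => $summary, 'tags' => $tags];
}

print_r(parseDescription($html));
```

The result is a plain array, so it can be assigned to $data['viewData']['data']['description'] without the nested SimpleXMLElement objects seen in the dump above.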