DOMDocument saveHTML() adds extra quotes and some URL-encoded characters
August 19, 2011

I have been using PHP's DOMDocument extension to find image tags that have no alt attribute or an empty alt attribute. Here is the HTML I am using for testing:

<span style="font-weight:bold;">Blender</span> is an Open Source 3D modelling and animation software. 
This is a very popular software among hobbyists.<i>Blender</i> has a vast list of features which include bones and meshing, textures, particle physics etc.
<u>Blender</u> was originally a proprietary software which was eventually made opensource. 
Blender is known to be difficult to learn because its interface is very intimiding to a newbie. 
But on the other hand, <a href="http://www.blender.org">Blender</a> is so much customizable that you can actually modify your workspace according to your personal preference. 
Also blender interface has been developed in the OpenGL graphics library, so blender looks all the same on all platforms whether you use Windows, Linux, BSD or even Mac. 
3D is a very interesting field to work with but 3D is somewhat tough to start with. You can <a href="http://www.google.com"" target="_blank">Google</a> for numerous tutorials on Blender. 
There are quite some awesome websites dedicated to blender development, such as BlenderGuru.com. <img src="http://www.cochinsquare.com/wp-content/uploads/2010/08/Blender.jpg">

And here is the DOMDocument code I use to find the img tags and add an alt attribute to them:

// Method-body fragment: $content holds the HTML above, $this->keyword is the fallback alt text.
$dom = new DOMDocument();
$dom->loadHTML($content);
$dom->formatOutput = true;
$imgs = $dom->getElementsByTagName('img');
foreach ($imgs as $img) {
    // getAttribute() returns '' when alt is missing or empty.
    $alt = $img->getAttribute('alt');
    $k_alt = ($alt == '') ? $this->keyword : $alt;
    $img->setAttribute('alt', $k_alt);
}
// Strip the doctype and the <html>/<body> wrappers that loadHTML() adds.
$html_mod = preg_replace('/^<!DOCTYPE.+?>/', '', str_replace(array('<html>', '</html>', '<body>', '</body>'), '', $dom->saveHTML()));
return $html_mod;
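
For comparison, the same alt-filling step can be sketched as a standalone function that serializes only the children of <body> instead of stripping the wrapper tags with string replacement. This is a minimal sketch, not the code from the post: the function name addMissingAlt and its $keyword parameter are hypothetical stand-ins for the method and $this->keyword, and passing a node to saveHTML() assumes PHP 5.3.6 or later.

```php
<?php
// Hypothetical helper; $keyword stands in for $this->keyword from the post.
function addMissingAlt($content, $keyword)
{
    $dom = new DOMDocument();
    // Collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($content);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('img') as $img) {
        // getAttribute() returns '' both when alt is missing and when it is empty.
        if ($img->getAttribute('alt') === '') {
            $img->setAttribute('alt', $keyword);
        }
    }

    // Serialize only the children of <body>, so the doctype and the
    // <html>/<body> wrappers added by loadHTML() never reach the output.
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }
    $html = '';
    foreach ($body->childNodes as $child) {
        $html .= $dom->saveHTML($child); // node argument assumes PHP 5.3.6+
    }
    return $html;
}

echo addMissingAlt('<p>Intro <img src="a.jpg"> and <img src="b.jpg" alt="B"></p>', 'Blender');
```

An existing non-empty alt value is left untouched, and only images without one receive the keyword.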

And this is the HTML I get back:

<span style='"font-weight:bold;"'>Blender</span> is an Open Source 3D modelling and animation software. 
This is a very popular software among hobbyists.<i>Blender</i> has a vast list of features which include bones and meshing, textures, particle physics etc.
<u>Blender</u> was originally a proprietary software which was eventually made opensource. 
Blender is known to be difficult to learn because its interface is very intimiding to a newbie. 
But on the other hand, <a href=""http://www.blender.org"">Blender</a> is so much customizable that you can actually modify your workspace according to your personal preference. 
Also blender interface has been developed in the OpenGL graphics library, so blender looks all the same on all platforms whether you use Windows, Linux, BSD or even Mac. 
3D is a very interesting field to work with but 3D is somewhat tough to start with. You can <a href=""http://www.google.com""" target='"_blank"'>Google</a> for numerous tutorials on Blender. 
There are quite some awesome websites dedicated to blender development, such as BlenderGuru.com. 
<img src=""http://www.cochinsquare.com/wp-content/uploads/2010/08/Blender.jpg"" alt="Blender">

Note the extra quotes (single and double) in the img src and anchor href attributes, as well as in the style attribute of the span.

Please help! I want to get the HTML back unchanged, with only the new alt attribute added.

I should also mention that I am using PHP 5.3.2 with the Suhosin patch on Ubuntu 10.04.
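
One possible explanation, which the post does not confirm, is that $content reaches loadHTML() with its quotes already backslash-escaped (for example by magic_quotes_gpc or an addslashes() call somewhere upstream). The parser would then treat the escaped quotes as part of each attribute value. A minimal sketch under that assumption, with a made-up input string, unescapes the markup before parsing:

```php
<?php
// Assumption: the input arrived with backslash-escaped quotes.
// Inside a single-quoted PHP string, \" stays a literal backslash plus quote.
$escaped = '<a href=\"http://www.blender.org\">Blender</a> <img src=\"Blender.jpg\">';

// stripslashes() turns \" back into " before the parser sees the markup.
$clean = stripslashes($escaped);

$dom = new DOMDocument();
$dom->loadHTML($clean);
foreach ($dom->getElementsByTagName('img') as $img) {
    // Fill in alt only when it is missing or empty.
    if ($img->getAttribute('alt') === '') {
        $img->setAttribute('alt', 'Blender');
    }
}
$out = $dom->saveHTML();
echo $out;
```

If the raw input turns out not to contain backslashes, this step does not apply and the cause lies elsewhere.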

...