javascript - Strip all unwanted tags from an HTML string but preserve whitespace in JS
I am trying to strip HTML content of unwanted tags and return either text with basic formatting (ul, b, u, p, etc.) or plain text (but preserving new lines, spacing, etc.). I am having trouble creating a catch-all solution that lets me keep the structure of the content as it was pasted.
Example string:
<p class="bodytext" style="color: rgb(51, 51, 51);background-color: rgb(255, 255, 255);"> <span lang="en-gb">hello <span class="apple-converted-space"> world, </span> <span class="cross-reference"> <a href="" style="color: rgb(66, 139, 202);background-color: transparent;">cough </a> </span> <span class="apple-converted-space"></span>and <span class="apple-converted-space"></span> <span class="cross-reference"> <a href="" style="color: rgb(66, 139, 202);background-color: transparent;">feverish - risk assessment</a> </span>. <span class="apple-converted-space"></span> </span> </p> <p class="bodytext" style="color: rgb(51, 51, 51);background-color: rgb(255, 255, 255);"> <span lang="en-gb">fin. </span> </p>
Here is a plain JavaScript solution that removes span elements within the HTML but leaves their inner content:
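// getElementsByTagName returns a live collection, so it shrinks as spans are removed.
var span = document.getElementsByTagName('span');
while (span.length) {
  var parent = span[0].parentNode;
  // Move each child of the span in front of the span itself.
  while (span[0].firstChild) {
    parent.insertBefore(span[0].firstChild, span[0]);
  }
  // Remove the now-empty span.
  parent.removeChild(span[0]);
}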
You can do more using jQuery, as shown in this example, which removes span tags as well as p, b, ul, and li tags but leaves their inner content:
$("span, p, b, ul, li").contents().unwrap();
See also: Remove HTML tag but keep innerHTML
It may be beneficial to note that any time you have 2 or more consecutive spaces, a modern browser will typically collapse them to 1 space when displaying them. If you want to preserve the spacing of multiple spaces, replace regularly typed space " " characters with "&nbsp;" HTML-encoded spaces. Ordinary JavaScript has a string replace method that you can use for that, if desired.
Edit: If you wish to remove HTML tags within a JavaScript string, try the following:
mystring.replace(/<(?:.|\n)*?>/gm, '');
See also: Strip HTML from text JavaScript
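The two string-level ideas in this answer, removing tags with a regular expression and encoding consecutive spaces, could be combined roughly as follows. This is only a sketch under stated assumptions: the names `ALLOWED_TAGS`, `stripTags`, and `preserveSpaces` are hypothetical, the allowlist is taken from the tags mentioned in the question, and the regular expression assumes no attribute value contains a `>` character. It is not a sanitizer for untrusted HTML.

```javascript
// Hypothetical allowlist of tags to keep; adjust as needed.
const ALLOWED_TAGS = new Set(['p', 'b', 'u', 'ul', 'li']);

// Remove every tag not in `allowed`; kept tags lose their attributes.
// Assumption: no attribute value contains a '>' character.
function stripTags(html, allowed = ALLOWED_TAGS) {
  return html.replace(/<(\/?)([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>/g, (match, slash, name) => {
    const tag = name.toLowerCase();
    return allowed.has(tag) ? `<${slash}${tag}>` : '';
  });
}

// Encode runs of two or more spaces so the browser does not collapse them.
// Single spaces are left alone so that lines can still wrap.
function preserveSpaces(text) {
  return text.replace(/ {2,}/g, (run) => '&nbsp;'.repeat(run.length));
}

const sample = '<p class="bodytext"><span lang="en-gb">hello  <a href="">world</a></span></p>';

// Keep basic formatting tags:
console.log(preserveSpaces(stripTags(sample)));
// prints: <p>hello&nbsp;&nbsp;world</p>

// Plain text only (empty allowlist):
console.log(preserveSpaces(stripTags(sample, new Set())));
// prints: hello&nbsp;&nbsp;world
```

Stripping the tags first means the space encoding only ever touches text content, since the kept tags no longer carry attributes.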