You first test the #text nodes to see whether their text contains the word you are trying to highlight, but you then perform the replacement on the .innerHTML of the parent element. There are several problems with this.
- Infinite replacements: When you modify the .innerHTML of the parent element, you change its childNodes list (a live NodeList, not a fixed array). The change adds a node, further along in the list, which contains the text that is to be replaced. Thus, as you continue scanning childNodes, you always find a (new) node that contains the text you want to replace. So you replace it again, creating another node at a higher index in childNodes. This repeats indefinitely.
- Using a RegExp to replace text in the .innerHTML property: While you have already tested that the text you want to replace is actually contained in a text node, this does not prevent your RegExp from also replacing any matching words within the actual HTML of the element (e.g. in src="yourWord", href="http://foo.com/yourWord/bar.html", or when attempting to highlight words like style, color, background, span, id, height, width, button, form, input, etc.).
- You are not checking whether the text you are changing is inside a <script> or <style> element.
- You are checking that you only make changes in text nodes (i.e. you check for node.nodeType === 3). If you weren't checking for this, you would also have the following possible problems due to using .innerHTML to change HTML:
  - You could end up changing attributes, or actual HTML tags, depending on what you are changing with .replace(). This could completely disrupt the page layout and functionality.
  - When you change .innerHTML, the DOM for that portion of the page is completely recreated. While the new elements might be of the same type and have the same attributes, any event listeners which were attached to the old elements will not be attached to the new ones. This can significantly disrupt the functionality of a page.
  - Repeatedly changing large portions of the DOM can be quite compute-intensive, because the page has to be re-rendered each time. Depending on how you do this, you may run into significant user-perceived performance issues.
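To make the attribute problem from the list above concrete, here is a small string-only sketch. The markup in parentHtml is a made-up example, and the helper name highlight is just for illustration; the RegExp is the same one used later in this answer. It applies the replacement once to a parent's HTML and once to only the text of its #text node:

```javascript
// Hypothetical markup, used only to illustrate the problem.
const parentHtml = '<a href="http://foo.com/teste/bar.html">teste</a>';
// The content of the <a> element's single #text node.
const linkText = 'teste';

// Illustrative helper: wrap every match in a highlighted <span>.
function highlight(s) {
  return s.replace(/(teste)/gi, '<span style="background-color: yellow">$1</span>');
}

// Replacing in the parent's HTML also rewrites the href attribute value,
// which now contains a <span> tag and no longer points at the original URL.
const fromInnerHtml = highlight(parentHtml);

// Replacing in the text node's content touches only the visible text.
const fromTextNode = highlight(linkText);

console.log(fromInnerHtml);
console.log(fromTextNode);
```

The first result has markup injected into the middle of the href value, which is exactly the kind of damage that operating only on #text nodes avoids.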
Thus, if you are going to use a RegExp to replace the text, you need to perform the operation only on the contents of the #text node, not on the .innerHTML of the parent node. Because you want to create additional HTML elements (e.g. new <span style=""> elements, with child #text nodes), there are some complications.
Cannot assign HTML text to a text node to create new HTML nodes:
There is no way to assign new HTML directly to a text node and have it evaluated as HTML, creating new nodes. Assigning to a text node's .innerHTML property will create such a property on the Object (just like it would on any Object), but will not change the text displayed on the screen (i.e. the actual value of the #text node). Thus, it will not accomplish what you want to do: it will not create any new HTML children of the parent node.
The way to do this that has the least impact on the page's DOM (i.e. is least likely to break existing JavaScript on the page) is to create a <span> that contains both the new text nodes you are creating (the text that was in the #text node but is not in your colored <span>) and the potentially multiple colored <span> elements you are creating. This results in replacing a single #text node with a single <span> element. While this will create additional descendants, it leaves the number of children in the parent element unchanged. Thus, any JavaScript which was relying on that number will not be affected. Given that we are changing the DOM, there is no way to rule out breaking other JavaScript, but this should minimize that possibility.
Some examples of how you can do this: See this answer (replaces a list of words with those words in buttons) and this answer (places all text in <p> elements which is separated by spaces into buttons) for full extensions that perform regex replace with new HTML. See this answer which does basically the same thing, but makes a link (it has a different implementation which traverses the DOM with a TreeWalker to find #text nodes instead of a NodeIterator as used in the other two examples).
Here is code which will perform the replacement you want on each text node in document.body and create the new HTML needed to give a portion of the text a different style:
function handleTextNode(textNode) {
    if(textNode.nodeName !== '#text'
        || textNode.parentNode.nodeName === 'SCRIPT'
        || textNode.parentNode.nodeName === 'STYLE'
    ) {
        //Don't do anything except on text nodes, which are not children
        //  of <script> or <style>.
        return;
    }
    let origText = textNode.textContent;
    let newHtml = origText.replace(/(teste)/gi,
        '<span style="background-color: yellow">$1</span>');
    //Only change the DOM if we actually made a replacement in the text.
    //Compare the strings, as it should be faster than a second RegExp operation and
    //  lets us use the RegExp in only one place for maintainability.
    if(newHtml !== origText) {
        let newSpan = document.createElement('span');
        newSpan.innerHTML = newHtml;
        textNode.parentNode.replaceChild(newSpan, textNode);
    }
}

let textNodes = [];
//Create a NodeIterator to get the text nodes in the body of the document
let nodeIter = document.createNodeIterator(document.body, NodeFilter.SHOW_TEXT);
let currentNode;
//Add the text nodes found to the list of text nodes to process.
while(currentNode = nodeIter.nextNode()) {
    textNodes.push(currentNode);
}
//Process each text node
textNodes.forEach(function(el){
    handleTextNode(el);
});
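One caveat with the code above: it passes the text node's content through .innerHTML, so if the visible text itself happens to contain something that looks like a tag, it will be parsed as HTML when the new <span> is filled. Below is a minimal sketch of a variation that keeps the single wrapper <span> but builds its children with createTextNode() instead. The function names splitByMatch, buildWrapper and highlightTextNode are illustrative and not part of the original code, and the RegExp passed in is assumed to have the g flag (matchAll() requires it).

```javascript
// Split text into plain and matching segments. Pure function, no DOM needed.
// Assumes `regex` is global (has the g flag); matchAll() throws otherwise.
function splitByMatch(text, regex) {
  const segments = [];
  let last = 0;
  for (const m of text.matchAll(regex)) {
    if (m.index > last) {
      segments.push({ text: text.slice(last, m.index), match: false });
    }
    segments.push({ text: m[0], match: true });
    last = m.index + m[0].length;
  }
  if (last < text.length) {
    segments.push({ text: text.slice(last), match: false });
  }
  return segments;
}

// Build one wrapper <span> holding plain text nodes and highlighted <span>s.
// `doc` is the document that owns the text node being replaced.
function buildWrapper(doc, segments) {
  const wrapper = doc.createElement('span');
  for (const seg of segments) {
    const textNode = doc.createTextNode(seg.text);
    if (seg.match) {
      const mark = doc.createElement('span');
      mark.style.backgroundColor = 'yellow';
      mark.appendChild(textNode);
      wrapper.appendChild(mark);
    } else {
      wrapper.appendChild(textNode);
    }
  }
  return wrapper;
}

// Replace a text node with the wrapper only when it contains a match.
function highlightTextNode(textNode, regex) {
  const segments = splitByMatch(textNode.textContent, regex);
  if (segments.some((seg) => seg.match)) {
    const wrapper = buildWrapper(textNode.ownerDocument, segments);
    textNode.parentNode.replaceChild(wrapper, textNode);
  }
}
```

In the loop above you would call highlightTextNode(el, /teste/gi) in place of the .innerHTML assignment, keeping the existing <script> and <style> checks. The parent still ends up with the same number of children, since one #text node is replaced by one <span>.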
There are other ways to do this. However, they will generate more significant changes to the structure of the children for that specific element (e.g. multiple additional nodes on the parent). Doing so has a higher potential of breaking any JavaScript already on the page which is relying on the current structure of the page. Actually, any change like this has the potential to break current JavaScript.
The code in this answer was modified from the code in this other answer of mine.