Regex - replace entire HTML tag with content - script, style, link, iframe

Asked by: Kate_C

How can I replace an entire HTML tag together with its content?

For example:

<!doctype html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Title</title>
</head>
<body>

    <div>some text 1</div>

    <script type="text/javascript">
        var tmp = 1;
    </script>

    <div>some text 2</div>

    <link rel="stylesheet" type="text/css" href="/resources/editor.css" />

    <p>some text 3</p>
    
    <script type="text/javascript" src="/resources/editor.js"></script>

    <p>some text 4</p>
    
    <iframe name="test" id="test" src="https://accounts.google.com/o/oauth2/test"
            style="width: 1px; height: 1px; position: absolute; top: -100px;">
    </iframe>

    <p>some text 5</p>

</body>
</html>

I want to remove these tags together with their content:

  • script with tmp
  • link
  • script with src
  • iframe

Expected HTML output:

<!doctype html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Title</title>
</head>
<body>

    <div>some text 1</div>

    <div>some text 2</div>

    <p>some text 3</p>
    
    <p>some text 4</p>
    
    <p>some text 5</p>

</body>
</html>

I'd like to create a regex that matches each of these tags in full. Other tags, such as div and p, should be left untouched, as in the example above.

1 answer
Answered by: Kate_C

Here is a regex that matches all of these tags:

(?s)<script.*?</script>|(?s)<iframe.*?</iframe>|(?s)<style.*?</style>|(?s)<link.*?>

Regex for each single tag:

(?s)<script.*?</script>
(?s)<iframe.*?</iframe>
(?s)<style.*?</style>
(?s)<link.*?>
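
As a rough illustration of using the combined regex outside an editor, here is a minimal Python sketch. The function name strip_tags and the shortened sample markup are assumptions made for this example, not part of the original answer. The pattern is the same as above except that (?s) appears only once, at the very start, because recent Python versions reject inline global flags placed later in the pattern.

```python
import re

# Same alternatives as the combined regex above, with a single leading (?s)
# so that "." also matches line breaks inside multi-line tags.
TAG_PATTERN = re.compile(
    r"(?s)<script.*?</script>|<iframe.*?</iframe>|<style.*?</style>|<link.*?>"
)


def strip_tags(html: str) -> str:
    """Remove script, iframe, style and link tags together with their content."""
    return TAG_PATTERN.sub("", html)


# Shortened version of the HTML from the question (hypothetical sample).
sample = """<div>some text 1</div>
<script type="text/javascript">
    var tmp = 1;
</script>
<div>some text 2</div>
<link rel="stylesheet" type="text/css" href="/resources/editor.css" />
<p>some text 3</p>
<script type="text/javascript" src="/resources/editor.js"></script>
<p>some text 4</p>
<iframe name="test" id="test" src="https://accounts.google.com/o/oauth2/test"
        style="width: 1px; height: 1px;">
</iframe>
<p>some text 5</p>
"""

cleaned = strip_tags(sample)
print(cleaned)
```

Note that the regex removes only the tags themselves, so the blank lines and indentation around them stay in the output. For untrusted or irregular HTML, a real HTML parser is likely to be more reliable than a regex.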

[Screenshots: example with IntelliJ IDEA, and the output]

I like to use IntelliJ IDEA because it highlights the matching text, but any other editor with regex search works too; Notepad++ is also a great one. The regex can be used from any programming language as well, but test it there first. With Java, for example, I remember there was something about multi-line matching that required setting some additional parameters. The question here was only about the regex itself, so that problem can be dealt with in follow-up questions.
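
The multi-line concern mentioned above usually comes down to whether "." is allowed to match line breaks. The sketch below, written in Python with a made-up html string, shows the difference; the (?s) prefix in the regexes is the inline form of this option. In Java the corresponding setting is Pattern.DOTALL, though the inline (?s) form should work there too.

```python
import re

# Hypothetical snippet with a script tag that spans several lines.
html = "<p>a</p>\n<script>\n  var tmp = 1;\n</script>\n<p>b</p>"

# Without the dot-all option, "." stops at line breaks,
# so the multi-line script tag is not matched and nothing is removed.
without_flag = re.sub(r"<script.*?</script>", "", html)

# With re.DOTALL (equivalent to the inline (?s)), "." also matches "\n",
# so the whole tag including its content is removed.
with_flag = re.sub(r"<script.*?</script>", "", html, flags=re.DOTALL)

print(without_flag == html)
print(with_flag)
```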
