Node.js - get text content from HTML with htmlparser2 library

Created by: Dollie-Rutledge

In this short article, we would like to show how to get the text content of HTML nodes using the htmlparser2 library under Node.js.

Quick solution (example index.js file):

const htmlparser2 = require('htmlparser2');

const getText = html => {
    const handler = new htmlparser2.DomHandler();
    const parser = new htmlparser2.Parser(handler);

    parser.write(html);
    parser.end();

    return htmlparser2.DomUtils.textContent(handler.root.childNodes);  // or from handler.dom
};


// Example usage:

const html = '<div><p>This is example text 1</p><br /><p>This is example text 2</p></div>';
const text = getText(html);

console.log(text);

Running with:

node ./index.js

Output:

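This is example text 1This is example text 2

Note: textContent() does not add a line break for the <br /> element, so the two paragraphs are joined without a separator.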
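If the line break coming from the <br /> element should be kept in the result, one option is to walk the parsed tree manually. The snippet below is only a minimal sketch: collectText is a hypothetical helper name, and the dom array is a hand-written stand-in that mimics the node shape htmlparser2 produces (type, name, data, children), so the example runs without any dependencies.

```javascript
// Hypothetical helper (not part of htmlparser2): collects text from a list
// of nodes and turns each <br> element into a newline character.
const collectText = nodes =>
    nodes.map(node => {
        if (node.type === 'text') {
            return node.data;
        }
        if (node.type === 'tag') {
            return node.name === 'br' ? '\n' : collectText(node.children);
        }
        return '';  // comments, directives, <script>, <style>, etc. are skipped
    }).join('');

// Hand-written stand-in (assumption) for the tree parsed from:
// <div><p>This is example text 1</p><br /><p>This is example text 2</p></div>
const dom = [
    {
        type: 'tag', name: 'div', children: [
            { type: 'tag', name: 'p', children: [{ type: 'text', data: 'This is example text 1' }] },
            { type: 'tag', name: 'br', children: [] },
            { type: 'tag', name: 'p', children: [{ type: 'text', data: 'This is example text 2' }] }
        ]
    }
];

const textWithBreaks = collectText(dom);

console.log(textWithBreaks);
```

In a real project, handler.dom from the quick solution could be passed to collectText() instead of the hand-written tree.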

 

htmlparser2 installation

Run the following command in the Node.js project directory:

npm install --save htmlparser2

 

See also

  1. Node.js - parse html to DOM with htmlparser2 library

References

  1. htmlparser2 - npm 