Readability Service

This is a small Node.js server that processes HTML content with Readability, the library Mozilla developed for Firefox.

See: https://github.com/mozilla/readability/

The goal of this project is to provide an endpoint that uses the Readability library to extract the most relevant content from a rendered website.

Docker

Run the Docker container:

docker run -p8080:8080 ese7en/node-readability

Request

The request object must contain the following:

data: the HTML source code as a JSON-escaped string

HTTP PUT /
HTTP HEADER: Content-Type: application/json

{
    "data": "...HTML SOURCE CODE AS STRING ..."
}
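
Escaping the HTML by hand is error-prone, so it is usually easier to let a JSON serializer do it. A minimal sketch in Node.js, where `buildRequestBody` is a hypothetical helper name and not part of this service:

```javascript
// Hypothetical helper: wraps raw HTML in the JSON body described above.
function buildRequestBody(html) {
  // JSON.stringify escapes quotes, backslashes, and line breaks in the HTML.
  return JSON.stringify({ data: html });
}

const body = buildRequestBody('<p class="intro">Hello</p>\n');
console.log(body);
// → {"data":"<p class=\"intro\">Hello</p>\n"}
```

The resulting string can be sent as the request body as-is.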

Response

The response object contains the following properties:

title: article title
content: HTML string of processed article content
textContent: text content of the article (all HTML removed)
length: length of an article, in characters
excerpt: article description, or short excerpt from the content
byline: author metadata
dir: content direction
siteName: name of the site
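
Several of these properties can be null, as the end-to-end example shows for byline and dir, so client code should not assume they are always strings. A small sketch of reading a response object defensively; `summarize` is a hypothetical helper and the sample values are copied from the end-to-end example:

```javascript
// Hypothetical helper: builds a one-line summary from a response object.
function summarize(article) {
  const title = article.title ?? "(untitled)";
  // byline may be null when the page carries no author metadata.
  const byline = article.byline ? ` by ${article.byline}` : "";
  return `${title}${byline} (${article.length} characters): ${article.excerpt ?? ""}`;
}

// Sample values taken from the end-to-end example in this README.
const sample = {
  title: "Hello World",
  byline: null,
  dir: null,
  length: 55,
  excerpt: "With some text",
};
console.log(summarize(sample));
// → Hello World (55 characters): With some text
```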

Environment Variables

PORT: the port on which the server listens
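
How the server reads PORT internally is not shown in this README. One common Node.js pattern is sketched below. `resolvePort` is a hypothetical helper, and the fallback of 8080 is an assumption based on the port used in the Docker example, not a documented default.

```javascript
// Hypothetical helper: picks the listening port from the environment.
// The fallback of 8080 is an assumption, not a documented default.
function resolvePort(env, fallback = 8080) {
  const port = Number.parseInt(env.PORT ?? "", 10);
  // Fall back when PORT is missing, non-numeric, or out of range.
  return Number.isInteger(port) && port > 0 && port < 65536 ? port : fallback;
}

console.log(resolvePort(process.env));
console.log(resolvePort({ PORT: "3000" })); // → 3000
```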

End-to-end example

Website

<html>
    <head>
        <title>Hello World</title>
    </head>
    <body>
        <h1>This is a website</h1>
        <p>With some text</p>
    </body>
</html>

HTTP PUT Request to http://localhost:8080

{
    "data": "<html>\r\n    <head>\r\n        <title>Hello World<\/title>\r\n    <\/head>\r\n    <body>\r\n        <h1>This is a website<\/h1>\r\n        <p>With some text<\/p>\r\n    <\/body>\r\n<\/html>"
}

with curl

curl --request PUT \
  --url http://localhost:8080/ \
  --header 'Content-Type: application/json' \
  --data '{
    "data": "<html>\r\n    <head>\r\n        <title>Hello World<\/title>\r\n    <\/head>\r\n    <body>\r\n        <h1>This is a website<\/h1>\r\n        <p>With some text<\/p>\r\n    <\/body>\r\n<\/html>"
}'

HTTP Response

{
  "title": "Hello World",
  "byline": null,
  "dir": null,
  "content": "<div id=\"readability-page-1\" class=\"page\">\n        <h2>This is a website</h2>\n        <p>With some text</p>\n    \n</div>",
  "textContent": "\n        This is a website\n        With some text\n    \n",
  "length": 55,
  "excerpt": "With some text",
  "siteName": null
}
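
The same round trip can be made from Node.js. The sketch below assumes Node.js 18 or later (for the global `fetch`) and a service reachable at http://localhost:8080/. `extractArticle`, `baseUrl`, and `fetchImpl` are hypothetical names introduced for illustration, and the error handling is an assumption, since this README does not describe error responses.

```javascript
// Hypothetical client: sends HTML to the service and resolves with the parsed article.
// baseUrl and fetchImpl are illustrative parameters; fetch is global in Node.js 18+.
async function extractArticle(html, baseUrl = "http://localhost:8080/", fetchImpl = fetch) {
  const response = await fetchImpl(baseUrl, {
    method: "PUT",
    headers: { "Content-Type": "application/json" },
    // JSON.stringify takes care of escaping the HTML.
    body: JSON.stringify({ data: html }),
  });
  if (!response.ok) {
    // Assumption: any non-2xx status is treated as a failure.
    throw new Error(`Readability service returned HTTP ${response.status}`);
  }
  return response.json();
}
```

With the container running, `extractArticle(html).then((article) => console.log(article.title))` should print the title of the page passed in as `html`. Accepting `fetchImpl` as a parameter keeps the function easy to exercise without a live server.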
