forked from scrapy/parsel

Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors

Digenis/parsel

 
 

Parsel

Parsel is a library for extracting data from HTML and XML documents using XPath and CSS selectors.

Features

  • Extract text using CSS or XPath selectors
  • Regular expression helper methods

Example:

>>> from parsel import Selector
>>> sel = Selector(text=u"""<html>
        <body>
            <h1>Hello, Parsel!</h1>
            <ul>
                <li><a href="http://example.com">Link 1</a></li>
                <li><a href="http://scrapy.org">Link 2</a></li>
            </ul>
        </body>
        </html>""")
>>>
>>> sel.css('h1::text').extract_first()
u'Hello, Parsel!'
>>>
>>> sel.css('h1::text').re(r'\w+')
[u'Hello', u'Parsel']
>>>
>>> for e in sel.css('ul > li'):
...     print(e.xpath('.//a/@href').extract_first())
http://example.com
http://scrapy.org
