scrapy - How to exclude certain paths of xpath without getting scraped?

- January 15, 2013

I am trying to scrape only the data I need, but when I try to exclude the part I do not need, I am unable to do so. Can anyone please help me scrape only the necessary data?

case - 1:

<div class="abc xyz">
    <div class="aaaaaa bbbbbb">
        "i dont want include this"
    </div>
    "i want scrap this"
</div>

case - 2:

<div class="abc xyz">
    <div class="aaaaaa bbbbbb">
    </div>
    "i want scrap this"
</div>

In both cases, the output I want is "i want scrap this".

I have already tried scraping with './/div[contains(@class,"abc")]//text()'. In the first case it gives the output "i dont want include thisi want scrap this"; in the second case the expected output is scraped.

This one may leave some garbage in the result, but it does the job:

result = response.xpath('//div[@class="abc xyz"]/text()').extract()
result = "".join(result)
