Getting text of children tags with CSS selector with Scrapy returns nothing

Problem Description:

I know this is a very common question, but I have tried many different approaches to scrape all the text recursively from the following HTML, and for some reason none of them worked:

<span class="coupon__logo coupon__logo--for-shops">
      <span class="amount"><b>20</b>%</span>
      <span class="type">Cupom</span>
</span>

What I tried:

p.css('span.coupon__logo coupon__logo--for-shops *::text').get()

p.css('span.amount ::text').get()

p.css('span.amount *::text').get()

And even XPath:

p.xpath('//span[@class="coupon__logo coupon__logo--for-shops"]//text()').get()
p.xpath('//span[@class="amount"]//text()').get()

The best I got was p.css('span.amount *::text').getall(), but it extracts the text from all of the occurrences, which means I have to write code to sort them out individually. It would be much better to get only the text of the current instance, especially because I'm looping through many of them and because that extra sorting code would be vulnerable to any changes on the website.

Solution – 1

Instead of getting all the text of all the children of <span class="coupon__logo coupon__logo--for-shops">, you can get the text of specific children.

CSS:

scrapy shell file:///path/to/file.html

In [1]: ' '.join(response.css('span.coupon__logo.coupon__logo--for-shops span *::text').getall())
Out[1]: '20 % Cupom'

XPath:

scrapy shell file:///path/to/file.html

In [1]: ' '.join(response.xpath('//span[@class="coupon__logo coupon__logo--for-shops"]/span//text()').getall())
Out[1]: '20 % Cupom'

If you have more span tags and you only want amount and type, you can use this:

CSS:

scrapy shell file:///path/to/file.html

In [1]: ' '.join(response.css('span.coupon__logo.coupon__logo--for-shops span.amount *::text, span.coupon__logo.coupon__logo--for-shops span.type::text').getall())
Out[1]: '20 % Cupom'

XPath:

scrapy shell file:///path/to/file.html

In [1]: ' '.join(response.xpath('//span[@class="coupon__logo coupon__logo--for-shops"]/span[@class="amount" or @class="type"]//text()').getall())
Out[1]: '20 % Cupom'