How to grab an output result from a website using Selenium

Problem Description:

There is some code that I want to try. If a website exists, the lookup page outputs the available domain names. I used this website:

Website available

If the website does not exist, is currently parked, or is registered, it says this:

not valid

The code I have in mind uses Selenium and ChromeDriver to enter the text and run the search.

from selenium import webdriver
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager

cli = ['', '', '', '']
Exists = []

for i in cli:
    Name = driver.find_element(By.CSS_SELECTOR, "input[name='whoisDomainName']")

Is there a way to do something like this: if the website is available, Exists.append(i); elif the website is not valid, print('Not valid')? That way I could separate the websites that exist from those that do not. I was thinking of using BeautifulSoup to read the output, but I'm not sure how to use it properly.

Thank you!

Solution – 1

There is no need to use other libraries.

Avoid XPath expressions like that, because they break when the structure of the page changes. Always try to locate an element by its ID if it has one (IDs should by nature be unique on the page), or by class name (if it appears to be unique), or by its name attribute.

Some notes on the algorithm:

  1. We can visit the homepage once and then submit each URL in turn from the same page. This saves execution time.

  2. Whenever we submit a URL, we only need to check whether the error message appears (the URL is not valid) or does not appear (the URL is valid).

  3. Give the variables more descriptive names.

  4. Be careful about sending too many requests to the site in quick succession. It may block you. Perhaps this is not the right approach for your task? Is there no API that can be used for this kind of lookup?

Your code becomes:

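from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
import time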

opts = Options()

# make web scraping 'invisible' if GUI is not required
opts.add_argument("--headless")

user_agent = "user-agent=[Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.63 Safari/537.36]"
opts.add_argument(user_agent)
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=opts)

urls = ['', '', '', '']
exists = []

# open the lookup page once, before the loop
# (the page URL is not preserved in this post, so it is left empty here)
site_url = ''
driver.get(site_url)


for url in urls:

    # send url to textarea
    textarea = driver.find_element(By.NAME, "whoisDomainName")
    textarea.clear()  # make sure to clear textarea
    textarea.send_keys(url)

    # click 'WHOIS LOOKUP' button
    driver.find_element(By.ID, "submitBasic").click()

    # try to find error message (wait 3 sec)
    try:
        WebDriverWait(driver, 3).until(EC.presence_of_element_located((By.CLASS_NAME, 'whoisSearchError')))
        print(f'URL {url} is not valid')
    except TimeoutException:
        # no error message appeared within 3 seconds, so treat the URL as valid
        print(f'URL {url} is valid')
        exists.append(url)

    time.sleep(30)  # wait 30 seconds to avoid '429 too many requests'

print("\nURLs that exist:\n", exists)

Output will be:

URL is valid
URL is valid
URL is valid
URL is not valid

URLs that exist:
 ['', '', '']
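
If you want to test the filtering part without opening a browser, the loop from the notes above (check each URL, keep the valid ones, pause between requests) could be pulled out into a small function. This is only a sketch: filter_domains, is_invalid, pause_seconds and the sleep parameter are names I made up for illustration, not part of Selenium. In real use, is_invalid would wrap the Selenium steps shown above (type the URL, click the button, wait for the error element).

```python
import time
from typing import Callable, Iterable, List, Tuple


def filter_domains(
    domains: Iterable[str],
    is_invalid: Callable[[str], bool],
    pause_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[List[str], List[str]]:
    """Split domains into (valid, invalid) using the supplied check.

    is_invalid is a hypothetical callback: it should return True when
    the lookup page shows its error message for the given domain.
    """
    valid: List[str] = []
    invalid: List[str] = []
    for index, domain in enumerate(domains):
        if index > 0:
            # pause between lookups (not before the first one)
            # so the site is not flooded with requests
            sleep(pause_seconds)
        if is_invalid(domain):
            invalid.append(domain)
        else:
            valid.append(domain)
    return valid, invalid


if __name__ == "__main__":
    # Stand-in check for demonstration only; no website is contacted.
    known_bad = {"bad.example"}
    valid, invalid = filter_domains(
        ["good.example", "bad.example"],
        is_invalid=lambda d: d in known_bad,
        pause_seconds=0,
    )
    print("valid:", valid)
    print("invalid:", invalid)
```

Passing the check in as a function keeps the browser code in one place, and lets you swap in a fake check (as in the demo at the bottom) while you are still working out the selectors.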
