SEO and marketing specialists often monitor search engine results and popular user queries (commonly referred to as "keywords"). Understanding trends allows them to build a more effective market positioning strategy, refining the wording of article titles, section names, product categories, meta tags, internal headings, and subheadings, developing a comprehensive content plan, etc.
Sometimes, however, a primary keyword alone gives an incomplete picture. In such cases, related search phrases come in handy. Google shows these suggestions in a special block alongside the main search results.
Below is a guide on how to parse the Google PAA (People Also Ask) block.
What Is People Also Ask in Google?
Google has long been enriching search results with additional blocks (meaning supplementary content beyond ads). These additional blocks include:
- A brief information snippet (often sourced from Wikipedia, but it can also be details about an information source, a short definition of a term, company/brand information, etc.).
- Thematic images and links to sections with related media.
- A topic-related news block.
- A video block featuring content related to the search query.
- Related category blocks: similar locations, famous personalities, etc.
- Product listings, online maps, etc.
While some extra features are displayed for specific thematic queries only, others are consistently present in search results. These include:
People also ask (Google PAA). This is an expandable "accordion" block containing related search queries. It is always displayed at the top of the search results (after the first answer, or, for certain types of thematic queries, after specialized thematic blocks). The questions within the PAA block add detail and context to the primary search query, effectively forming a structured, in-depth answer.
For example, searching for "proxy" on Google may display a People Also Ask block with related questions like: "What is a proxy used for?", "What was a proxy?", "Is proxy the same as VPN?", and "Are proxies illegal?". The structure and number of questions in this block can vary depending on the search term, but they always remain contextually relevant to the topic.
People also search for. This block appears at the bottom of the organic search results. It contains related search queries that focus not on the main query itself but on adjacent niche topics, helping users explore other subjects that people with similar interests look for.
The People Also Ask block was introduced in 2015 and serves as a comprehensive response to a user’s query. After browsing it, you may not even need to visit the websites listed in the organic search results (SERP): the PAA section often provides a complete answer on its own.
Parsing Google’s People Also Ask can be extremely valuable for:
- Marketing Tasks: It helps you understand what users actually mean when they type a query. A deeper understanding of the audience's intent is always a significant advantage.
- Content Creation Tasks: The PAA block provides a ready-made content structure, reflecting how Google perceives and organizes the topic.
- SEO Tasks: Awareness of Google's interpretation of related queries allows for better content optimization, increasing the chances of ranking high in search results. Analyzing the PAA block can also help expand a website’s semantic core.
Exploration of the People Also Search For block can help with similar tasks.
How to Scrape Google People Also Ask?
Previously, we discussed how to scrape Google SERP. In that guide, however, we used the Froxy SERP Scraper API (you can find more details in the documentation). This article will review the step-by-step process of creating our own parser in Python.
Python makes writing such code extremely convenient and provides numerous ready-to-use libraries for web scraping. These include the Beautiful Soup parser as well as web drivers like Playwright and Selenium, which are essential for driving headless browsers and working with anti-detection mechanisms via an API.
A Few Words about Scraping Google People Also Ask
- Firstly, Google does not provide an official API for search results. While it offers APIs for many of its services, search results are not among them. This means the only way to collect this data is web scraping.
- Secondly, Google actively fights automated traffic. Instead of blocking connections outright (it doesn’t ban or blacklist IPs), Google challenges suspicious connections with CAPTCHAs. So, if you don’t want to waste time and effort solving them, it is more convenient to use rotating proxies with "clean" IP addresses, preferably residential proxies (IP addresses of home users) or mobile proxies.
- Thirdly, Google’s search results use dynamic elements (AJAX and JavaScript). To process this dynamic content, you will need a headless browser. It will also be useful for managing cookies and digital fingerprints (anti-detect browsers handle this task even better).
- Fourthly, the structure of Google’s search results page is not static. If the CSS classes and DOM attributes your parsing script relies on change, your Google People Also Ask scraper will stop working. In such cases, you’ll need to inspect the latest HTML structure of the search results and update the script accordingly.
Below, we’ll review a sample of the simplest possible script. It works well for ad-hoc tasks. You can build more complex logic on top of it: export to the required data format, proxy rotation, parallel execution, etc. That, however, belongs to the enterprise segment and more complicated projects.
Let’s start with the basics — environment setup.
Step 1. Installing Python and Downloading Libraries
We have covered the basic setup before. You can download the Python distribution from the official website or install it via an integrated package manager (for Linux users).
Since we are using Windows, we downloaded and installed the .exe package.
During installation, check the option to add Python to the Path environment variable automatically.
If you encounter errors when running pip because the executable cannot be found, specify the path manually.
In our case, the pip cmdlet was added along with the path to the executable file: "C:\Program Files\Python313\python.exe". If you installed Python for a specific user only, pip will most likely be located at "C:\Users\LOGIN\AppData\Local\Programs\Python\Python313\python.exe".
Note: Adjust the path to your installed Python version; the folder name will contain different numbers if it is newer than Python313 (3.13).
The path to the python.exe executable is added in the same way (the cmdlet will be called "python", but you can name it whatever you like; just don't forget how to address it when entering commands in the console).
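To check that both cmdlets are reachable, print their versions in the console:
python --version
pip --version
If either command is not recognized, recheck the Path variable or call the executable by its full path.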
Now, install the Beautiful Soup library. To do this, first launch the Windows console (PowerShell) and run the command:
pip install beautifulsoup4
If you couldn’t add pip to the Path environment variable, run instead:
python -m pip install beautifulsoup4
In other package managers, the syntax may vary.
Wait until the installer completes all operations (dependency modules will be downloaded along the way).
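To quickly confirm the installation, you can ask pip for the package metadata:
pip show beautifulsoup4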
If you plan to work with the search results directly, also install the library for HTTP requests:
pip install requests
If you intend to use a headless or anti-detect browser, the requests library is not needed. Instead, install a web driver.
Selenium, for example:
pip install selenium
For Playwright, the algorithm will be somewhat different. First, you need to install the library and then headless browsers:
pip install pytest-playwright
playwright install
If you don’t need all browsers at once, you can install specific ones:
playwright install chrome firefox
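For reference, here is what a minimal Playwright launch looks like (just a sketch; the rest of this guide uses nodriver, so you only need this if you pick Playwright instead):
from playwright.sync_api import sync_playwright

# Launch a headless Chromium, open a page, and print its title
with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
    page.goto("https://www.google.com")
    print(page.title())
    browser.close()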
Below, we’ll use nodriver, which is installed with the command:
pip install nodriver
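Before moving on, you can run a quick smoke test to confirm that nodriver launches a browser on your machine. A minimal sketch (it uses the same start/get/evaluate calls as the full script later in this guide):
import nodriver

async def main():
    browser = await nodriver.start()
    page = await browser.get("https://example.com")
    # Print the page title pulled straight from the DOM
    print(await page.evaluate("document.title"))
    browser.stop()

if __name__ == "__main__":
    nodriver.loop().run_until_complete(main())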
Once all the required libraries are installed, you can write your script for scraping Google People Also Ask.
Step 2. Studying the HTML Structure of Google Search Results
Open your favorite browser in a new tab. Go to Google.com and enter any search query. Google will display the search results. Look for the People Also Ask section. Right-click on the first question and select "Inspect" (in Chrome; in other browsers, the option may have a different name).
We see unreadable class names. They vary with every new browser instance and every new request. Here is what we got: class="L3Ezfd".
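Because these class names are randomized, they are useless as selectors. A more reliable approach, and the one used later in this guide, is to anchor on the visible block title and walk up the DOM from there. A minimal sketch with BeautifulSoup (the html variable is a stand-in for markup you have already obtained):
import re
from bs4 import BeautifulSoup

# 'html' is a placeholder for markup obtained from a real page
html = "<div><div>People also ask</div><div>What is a proxy used for?</div></div>"

soup = BeautifulSoup(html, "html.parser")
# Find the text node with the block title, then climb up to its container
title_node = soup.find(string=re.compile("People also ask"))
if title_node:
    container = title_node.find_parent("div").find_parent("div")
    print(container.get_text(separator=" | "))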
Let’s now check if working with the search results directly and getting the related HTML code is possible.
Create a separate directory for your Python scripts. For example: "C:\scrape-google-people-also-ask".
Create a new text file inside the directory and name it "scrape-google-people-also-ask.py".
(Change the .txt extension to .py if needed.)
Open the file in any text editor and fill it with the following content:
# Import the requests and BeautifulSoup libraries
import requests
from bs4 import BeautifulSoup

# Define the simplest function for scraping Google SERP
def get_soup_from_google_search(query):
    # Replace spaces with pluses, as the URL query string requires
    query = query.replace(' ', '+')
    # The URL used to submit the search query; the query itself is passed via the variable
    url = f"https://www.google.com/search?q={query}"
    # Pretend to be a current Chrome browser (copy the user-agent from your own browser)
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36"
    }
    # Send the GET request to Google.com
    response = requests.get(url, headers=headers)
    # If everything is OK, the website returns code 200
    if response.status_code == 200:
        print("Server OK, there are no errors")
        # Parse the content with BeautifulSoup
        soup = BeautifulSoup(response.content, "html.parser")
        print("Starting to scrape Google People Also Ask...")
        print("Query: ", query)
        return soup
    else:
        print(f"The server returned an error: {response.status_code}")
        return None

# Define your search query here; you can replace it with any other
query = "how to scrape google people also ask"
# Send the request for scraping
soup = get_soup_from_google_search(query)
# Print the entire HTML page code to inspect it
if soup:
    print("Page HTML-code:")
    print(soup)
Save the file and launch it in the console:
cd C:\scrape-google-people-also-ask
python scrape-google-people-also-ask.py
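If the request goes through, the console output will look roughly like this (the HTML dump is truncated here):
Server OK, there are no errors
Starting to scrape Google People Also Ask...
Query:  how+to+scrape+google+people+also+ask
Page HTML-code:
<html ...>...</html>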
As you can see, the returned code is unreadable: it contains almost no HTML tags (except for HEAD and BODY) and consists mostly of JS scripts. This means you cannot do without a headless browser.
Step 3. The Simplest Script for Google People Also Ask Scraping
We used nodriver, but you can use any other tool, including Chrome’s native automation interface, the CDP (Chrome DevTools Protocol).
Here is how our script for collecting data from the Google People Also Ask block will look:
# Import libraries
import asyncio
import nodriver as hdlschrome
from bs4 import BeautifulSoup
import json
import re

# Create an asynchronous function that receives our search query for processing
async def scrape_google_people_also_ask(search_query):
    try:
        # Log to the console that the scraping process has started
        print(f"Search query: {search_query}")
        # Replace spaces with pluses, as Google URL syntax requires
        search_query_plus = search_query.replace(' ', '+')
        # Launch the browser instance with the headless flag disabled
        browser = await hdlschrome.start(headless=False)
        print("Starting browser")
        # Pass the URL that the browser should open
        page = await browser.get(f"https://www.google.com/search?q={search_query_plus}")
        print("Open Search Page")
        # Ask the browser to wait for 5 seconds. Adjust if needed; a longer timeout may be required for slow connections
        await asyncio.sleep(5)
        print("Wait 5 sec...")
        # Store the resulting HTML code in the html_content variable for further parsing
        html_content = await page.evaluate('document.documentElement.outerHTML')
        print(f"HTML-content extracted: {len(html_content)} symbols")
        # Pass the HTML code to the BeautifulSoup library for processing
        soup = BeautifulSoup(html_content, 'html.parser')
        # Search for the Google People Also Ask block:
        # find the element with the matching title text,
        # then walk up the structure to reach the parent DIV.
        # Alternative block names for other languages and locations can be listed via the vertical bar
        print("BeautifulSoup parse block Google People Also Ask")
        paa_main = soup.find(string=re.compile('People also ask|Thematic questions')).find_parent().find_parent().find_parent().find_parent().find_parent().find_parent()
        # Fill the resulting structure
        paa_info = {
            # Keep the text content only, with no HTML tags
            'Content': paa_main.get_text(),
        }
        print("Information retrieved successfully")
        # Return the data
        return paa_info
    # Process exceptions and errors
    except Exception as err:
        print(f"An error occurred while parsing: {str(err)}")
        return None
    # Stop the browser to free resources
    finally:
        if 'browser' in locals():
            browser.stop()
            print("Closing the browser")

async def main():
    # Here you can redefine the search query
    search_query = "What is the color of night"
    # Replace spaces with underscores to produce a safe file name
    search_query_minus = search_query.replace(' ', '_')
    # Run the asynchronous Google scraping process with our query
    paa_info = await scrape_google_people_also_ask(search_query)
    # If the data is not empty, print the scraping result to the console
    if paa_info:
        print("\nPeople Also Ask:")
        # Walk the dictionary by keys, replacing underscores with spaces for readability
        for key, value in paa_info.items():
            print(f"{key.replace('_', ' ').title()}: {value}")
        # Save the data to a JSON file
        with open(f"{search_query_minus}_query.json", 'w', encoding='utf-8') as f:
            json.dump(paa_info, f, ensure_ascii=False, indent=4)
        print(f"Information saved to file {search_query_minus}_query.json")
    else:
        print("Unable to parse information from Google People Also Ask.")

if __name__ == "__main__":
    hdlschrome.loop().run_until_complete(main())
This is a ready-made implementation, so read the comments in the code and modify the script according to your needs.
The retrieved data is conveniently saved in the folder with the script as a JSON file (the file name will include your search query).
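The structure of the file mirrors the paa_info dictionary. For illustration only (the actual text will be whatever the block contains for your query), it will look something like:
{
    "Content": "People also ask What is a proxy used for? ..."
}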
The main challenge of scraping Google People Also Ask is navigating the tree of HTML elements. Additionally, Google actively protects itself by displaying a CAPTCHA. For this reason, we do not recommend running in headless mode (it is intentionally disabled in the browser launch parameters: headless=False).
Google reliably detects Selenium and always generates a CAPTCHA. The scraping script will only work if you manually solve the challenges (if any).
Attention! In EU countries, Google also displays a consent window for processing cookies (following GDPR policies).
Since Google determines the language and other regional settings from your location, the phrase "People also ask" may be displayed differently. First check the output of your headless browser and update the reference string accordingly (it serves as the anchor for locating the required elements).
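One way to reduce localization surprises is to pin the interface language with the hl URL parameter. Google treats it as a hint rather than a guarantee, so keep the text check in place:
page = await browser.get(f"https://www.google.com/search?q={search_query_plus}&hl=en")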
Step 4. Connecting the Scraper via Rotating Proxies
To avoid solving a CAPTCHA every time, it makes sense to route the parser’s connection through a proxy.
Nodriver does not natively support working through proxies, so you must either configure them at the browser level (via Chromium launch flags or the CDP protocol) or use another library with a web driver. If you need to rotate proxies at the level of individual tabs, you will have to rely on third-party extensions such as NodriverProxy (at least for nodriver, the library we used in the code example).
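At the browser level, the simplest option is to pass a Chromium launch flag when starting nodriver. A minimal sketch, assuming your nodriver version supports the browser_args parameter and the proxy authenticates by IP whitelist (Chromium ignores inline user:pass credentials):
browser = await hdlschrome.start(
    headless=False,
    # PROXY_ADDRESS and PROXY_PORT are placeholders for your proxy details
    browser_args=["--proxy-server=http://PROXY_ADDRESS:PROXY_PORT"]
)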
For example, in Selenium, proxies can be configured at the level of launch parameters for a headless browser instance:
# Here the libraries are imported, including the Options module
from selenium import webdriver
from selenium.webdriver.chrome.service import Service as ChromeService
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
# Specify your current proxy connection parameters here, replacing the ALL-CAPS placeholders
proxy_host = "PROXY_ADDRESS"
proxy_port = "PROXY_PORT"
proxy_user = "YOUR_USERNAME"
proxy_pass = "YOUR_PASSWORD"
# Here you can specify other options and browser launch flags…
options = Options()
# Note: Chromium-based browsers ignore the inline user:pass credentials in --proxy-server,
# so this flag alone only works for proxies authenticated by IP whitelist
proxy_server_url = f"https://{proxy_user}:{proxy_pass}@{proxy_host}:{proxy_port}"
options.add_argument(f'--proxy-server={proxy_server_url}')
# The browser is then launched with the specified options
driver = webdriver.Chrome(
    service=ChromeService(ChromeDriverManager().install()),
    options=options
)
# Your scraper logic goes below
…
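Because Chromium ignores inline credentials in --proxy-server, authenticated proxies will still trigger a login prompt with the snippet above. A common workaround is the third-party selenium-wire package (pip install selenium-wire; note the project is no longer actively maintained, so verify compatibility with your Selenium version), which handles proxy authentication itself. A minimal sketch:
from seleniumwire import webdriver

# selenium-wire accepts the proxy with credentials embedded in the URL
proxy_options = {
    "proxy": {
        "http": "http://YOUR_USERNAME:YOUR_PASSWORD@PROXY_ADDRESS:PROXY_PORT",
        "https": "https://YOUR_USERNAME:YOUR_PASSWORD@PROXY_ADDRESS:PROXY_PORT",
    }
}
driver = webdriver.Chrome(seleniumwire_options=proxy_options)
driver.get("https://www.google.com")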
Conclusion and Recommendations
Scraping Google search results has become significantly more challenging: the company has moved away from static HTML pages (results are now rendered entirely by JavaScript) and has intentionally randomized CSS class names and other attributes.
Overcoming the new markup requires considerable effort, but that's not all. The search engine actively protects its pages with CAPTCHAs. To bypass Google's restrictions, rotating proxies and specialized tools capable of masking the use of a headless browser are necessary.
You can find high-quality residential and mobile proxies at Froxy. A budget-friendly trial package is available for convenient testing. Proxies are connected to the parser once; further rotation settings are managed in the user dashboard.
Additionally, we offer a ready-to-use SERP scraper. There is no need to write scripts: the data is delivered as a table or in JSON format.