How to Scrape Google People Also Ask: Guide with 4 Steps

Learn how to effectively parse Google People Also Ask (PAA) section using Python and automation tools. Extract valuable insights for your SEO strategy!

Team Froxy 1 May 2025 8 min read

SEO and marketing specialists often monitor search engine results and popular user queries (commonly referred to as "keywords"). Understanding these trends helps them build a more effective market positioning strategy: refine the wording of article titles, section names, product categories, meta tags, headings, and subheadings, develop a comprehensive content plan, and so on.

Sometimes, however, the primary keyword alone gives an incomplete picture. In such cases, related search phrases come in handy: Google provides these suggestions in a dedicated block alongside the main search results.

Below is a guide on how to parse the Google PAA (People Also Ask) block.

What Is People Also Ask in Google?

Google has long been enriching search results with additional blocks (i.e., supplementary content other than ads). These additional blocks include:

  • A brief information snippet (often sourced from Wikipedia, but it can also be data about an information source, a short definition of a term, company/brand information, etc.).
  • Thematic images and links to sections with related media.
  • A topic-related news block.
  • A video block featuring content related to the search query.
  • Related category blocks: similar locations, famous personalities, etc.
  • Product listings and online maps.
  • And so on.

While some extra features are displayed for specific thematic queries only, others are consistently present in search results. These include:

People also ask (Google PAA). This is an expandable "accordion" block containing related search queries. It is always displayed at the top of the search results (after the first answer or, for certain thematic queries, after specialized thematic blocks). The questions within the PAA block add detail and context to the primary search query, effectively forming a structured, in-depth answer.

For example, searching for "proxy" on Google may display a People Also Ask block with related questions like: "What is a proxy used for?", "What was a proxy?", "Is proxy the same as VPN?", and "Are proxies illegal?". The structure and number of questions in this block can vary depending on the search term, but they always remain contextually relevant to the topic.


People also search for. This block appears at the bottom of the organic search results and basically consists of related search queries. Rather than refining the main query, it points to related niche queries, helping users explore other relevant topics that people are also interested in.


The People Also Ask block was introduced in 2015 and serves as a comprehensive response to a user's query. After browsing it, you may not even need to visit the websites listed in the organic search results (SERP): the PAA section already provides a complete answer.

Parsing Google’s People Also Ask can be extremely valuable for:

  • Marketing Tasks: It helps you understand what users actually mean when they type a query. A deeper understanding of the audience's intent is always a significant advantage.
  • Content Creation Tasks: The PAA block provides a ready-made content structure, reflecting how Google perceives and organizes the topic.
  • SEO Tasks: Awareness of Google's interpretation of related queries allows for better content optimization, increasing the chances of ranking high in search results. Analyzing the PAA block can also help expand a website’s semantic core.

Exploration of the People Also Search For block can help with similar tasks.

How to Scrape Google People Also Ask?

Previously, we discussed how to scrape Google SERP. In that guide, however, we used the Froxy SERP Scraper API (you can find more details in the documentation). This article will review the step-by-step process of creating our own parser in Python.

Python makes writing such code extremely convenient and provides numerous ready-to-use libraries for web scraping. These include the Beautiful Soup parser as well as web drivers like Playwright and Selenium, which are essential for controlling headless browsers and handling anti-detection mechanisms via API.

A Few Words about Scraping Google People Also Ask

  • Firstly, Google does not provide an official API for search results. While Google offers APIs for many of its services, there is no direct API for the SERP, which means the only way to collect this data is web scraping.
  • Secondly, Google actively fights automated traffic. Rather than blocking connections outright (it does not ban or blacklist IPs), Google challenges suspicious connections with CAPTCHAs. If you don't want to waste time and effort solving them, it is more convenient to use rotating proxies with "clean" IP addresses, preferably residential proxies (IP addresses of home users) or mobile proxies.
  • Thirdly, Google's search results rely on dynamic elements (AJAX and JavaScript). To handle dynamic content, you will need a headless browser. It will also be useful for managing cookies and digital fingerprints (anti-detect browsers handle this task even better).
  • Fourthly, the structure of Google's search result pages is not static. If the CSS classes or DOM attributes your parsing script relies on change, your Google People Also Ask scraper will stop working. In such cases, you'll need to inspect the latest HTML structure of the search results and update your script accordingly.

Below, we'll review a sample of the simplest possible script. It works well for one-off, custom tasks. You can build more complex logic on top of it: export in a required format, proxy rotation, parallel execution, etc. That, however, is the corporate segment and a more complicated set of tasks.

Let’s start with the basics — environment setup.

Step 1. Installing Python and Downloading Libraries

We have covered the basic setup before. You can download the Python distribution from the official website or install it via an integrated package manager (for Linux users).

Since we are using Windows, we downloaded and installed the .exe package.

During installation, check the option to add Python to the Path environment variable automatically.

If you encounter errors when running pip because the executable cannot be found, specify the path to it manually.


In our case, the pip command was registered along with the path to the executable file: «C:\Program Files\Python313\python.exe». If you installed Python for a single user only, the executable will most likely be located at «C:\Users\LOGIN\AppData\Local\Programs\Python\Python313\python.exe».

Note: adjust the path to match your Python version. The folder name will contain different numbers if your version is other than 3.13 (Python313).

Similarly, the path to python.exe itself is added (the command will be called "python", but you can name it whatever you like; just remember how to address it when entering commands in the console).

Now, install the Beautiful Soup library. To do this, launch the Windows console (PowerShell) and run the command:

pip install beautifulsoup4

If pip was not added to the PATH environment variable, run instead:

python -m pip install beautifulsoup4

In other package managers, the syntax may vary.

Wait until the script completes all operations (other modules will be downloaded simultaneously).

If you plan to work with the search results directly, you will also need to install a library for HTTP requests:

pip install requests

If you intend to use a headless or anti-detect browser, the requests library is not needed. Instead, the web driver should be installed.

Selenium, for example:

pip install selenium

For Playwright, the algorithm will be somewhat different. First, you need to install the library and then headless browsers:

pip install pytest-playwright

playwright install

If you don’t need all browsers at the same time, you may install only separate ones:

playwright install chrome firefox

Below, we will use nodriver, which is installed with the command:

pip install nodriver

When all the required libraries are installed, you may write your script for Google People Also Ask scraping.
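Before moving on, you can quickly verify that everything was installed correctly. This is an optional check (a minimal sketch; it assumes you installed beautifulsoup4, requests, and nodriver as described above):

# Each import should complete without errors
import bs4
import requests
import nodriver

print("beautifulsoup4:", bs4.__version__)
print("requests:", requests.__version__)
print("nodriver imported successfully")

If any import fails, re-run the corresponding pip command.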

Step 2. Studying the HTML Structure of Google Search Results

Access your favorite browser and add a new tab. Open Google.com and enter any search query. Google will display the search results. Look for the People Also Ask section. Right-click on the first question and select "Inspect" (in Chrome; in other browsers, the option may have a different name).

You will see unreadable class names. They vary with every new browser instance and every new request. Here is what we got: class="L3Ezfd".
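Because these class names are generated dynamically, it is more reliable to anchor the parser on visible text rather than on classes. Here is a minimal illustration of the idea (the HTML string is purely illustrative, not copied from a real SERP):

from bs4 import BeautifulSoup
import re

# Purely illustrative markup with an obfuscated class name
html = '<div class="L3Ezfd"><span>People also ask</span><div>...</div></div>'
soup = BeautifulSoup(html, 'html.parser')

# Locate the block by its visible label instead of the unstable class name
label = soup.find(string=re.compile('People also ask'))
print(label.find_parent('div'))

This is exactly the approach used in the full script in Step 3.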

Let’s now check if working with the search results directly and getting the related HTML code is possible.

Create a separate folder for your Python scripts, for example: «C:\scrape-google-people-also-ask».

Create a new text file inside that folder and name it «scrape-google-people-also-ask.py».

If your editor created the file with a .txt extension, change it to .py.

Open the file in any text editor and fill it with content:

# Import the requests library and BeautifulSoup
import requests
from bs4 import BeautifulSoup

# Define the simplest function for scraping the Google SERP
def get_soup_from_google_search(query):
    # Replace spaces with pluses, as the URL syntax requires
    query = query.replace(' ', '+')

    # The URL to which the search query is sent; the query is passed via the variable
    url = f"https://www.google.com/search?q={query}"

    # Present a current Chrome browser version (copy the User-Agent from your own browser)
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36"
    }

    # Send the GET request to Google.com
    response = requests.get(url, headers=headers)

    # If everything is OK, the server returns code 200
    if response.status_code == 200:
        print("Server OK, there are no errors")

        # Parse the content with BeautifulSoup
        soup = BeautifulSoup(response.content, "html.parser")
        print("Starting to scrape Google People Also Ask...")
        print("Query: ", query)
        return soup
    else:
        print(f"The server returned an error: {response.status_code}")
        return None

# Define your search query here; you can replace it with any other query
query = "how to scrape google people also ask"

# Send the request
soup = get_soup_from_google_search(query)

# Print the entire HTML code of the page to inspect it
if soup:
    print("Page HTML code:")
    html_content = soup
    print(html_content)

Save the file and launch it in the console:

cd C:\scrape-google-people-also-ask

python scrape-google-people-also-ask.py

As you can see, the returned code is unreadable: apart from HEAD and BODY, it contains almost no HTML tags, only JavaScript. This means you cannot do without a headless browser.
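You can confirm this programmatically by checking whether the PAA label appears in the raw response. A minimal sketch that reuses the get_soup_from_google_search() function defined above:

soup = get_soup_from_google_search("what is a proxy")

# If the label is missing from the static HTML, the block is rendered by JavaScript
if soup and "People also ask" not in soup.get_text():
    print("The PAA block is not in the static HTML - a headless browser is required")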


Step 3. The Simplest Script for Google People Also Ask Scraping

We used nodriver, but you can use any other driver, including the native API for Chrome browsers: the CDP (Chrome DevTools Protocol).

Here is how our script for collecting data from the Google People Also Ask block will look:

# Import the libraries
import asyncio
import nodriver as hdlschrome
from bs4 import BeautifulSoup
import json
import re

# Create an asynchronous function that receives our search query for processing
async def scrape_google_people_also_ask(search_query):
    try:
        # Log to the console that the scraping process has started
        print(f"Search query: {search_query}")

        # Replace spaces with pluses in the query, as the Google URL syntax requires
        search_query_plus = search_query.replace(' ', '+')

        # Launch the browser instance with the headless flag disabled
        browser = await hdlschrome.start(headless=False)
        print("Starting browser")

        # Pass the URL that the browser has to open
        page = await browser.get(f"https://www.google.com/search?q={search_query_plus}")
        print("Open Search Page")

        # Ask the browser to wait for 5 seconds. The delay can be changed; a longer timeout may be needed for slow connections.
        await asyncio.sleep(5)
        print("Wait 5 sec...")

        # Pass the resulting HTML code to the html_content variable for further parsing
        html_content = await page.evaluate('document.documentElement.outerHTML')
        print(f"HTML content extracted: {len(html_content)} symbols")

        # Hand the HTML code over to the BeautifulSoup library for processing
        soup = BeautifulSoup(html_content, 'html.parser')

        # Search for the Google People Also Ask block:
        # find the element with the corresponding label text,
        # then walk up the structure to reach the parent DIV.
        # Alternative block names (for other languages and regions) can be listed, separated by a vertical bar.
        print("BeautifulSoup parse block Google People Also Ask")
        paa_main = soup.find(string=re.compile('People also ask|Thematic questions')).find_parent().find_parent().find_parent().find_parent().find_parent().find_parent()

        # Fill the resulting dictionary
        paa_info = {
            # Keep only the text content, without HTML tags
            'Content': paa_main.get_text(),
        }
        print("Information retrieved successfully")

        # Return the data
        return paa_info

    # Handle exceptions and errors
    except Exception as err:
        print(f"An error occurred while parsing: {str(err)}")
        return None

    # Stop the browser to free up resources
    finally:
        if 'browser' in locals():
            browser.stop()
            print("Closing the browser")

async def main():
    # Here you can redefine the search query
    search_query = "What is the color of night"

    # Replace spaces with underscores in the query to build a valid file name
    search_query_minus = search_query.replace(' ', '_')

    # Run the asynchronous scraping function and pass it the search query
    paa_info = await scrape_google_people_also_ask(search_query)

    # If the data is not empty, print the scraping result to the console
    if paa_info:
        print("\nPeople Also Ask:")

        # Iterate over the dictionary, replacing underscores in keys with spaces for readability
        for key, value in paa_info.items():
            print(f"{key.replace('_', ' ').title()}: {value}")

        # Save the data to a JSON file
        with open(f"{search_query_minus}_query.json", 'w', encoding='utf-8') as f:
            json.dump(paa_info, f, ensure_ascii=False, indent=4)
        print(f"Information saved to file {search_query_minus}_query.json")
    else:
        print("Unable to parse information from Google People Also Ask.")

if __name__ == "__main__":
    hdlschrome.loop().run_until_complete(main())

This is a ready-made implementation, so read the comments in the code and modify the script according to your needs.

The retrieved data is conveniently saved in the folder with the script as a JSON file (the file name will include your search query).

The main challenge of scraping Google People Also Ask is navigating the tree of HTML elements. Additionally, Google actively protects itself by displaying a CAPTCHA. For this reason, headless mode is intentionally disabled in the browser launch parameters: with a visible window, you can solve the challenge manually if one appears.

Google reliably detects Selenium and always generates a CAPTCHA. The scraping script will only work if you manually solve the challenges (if any).

Attention! In EU countries, Google also displays a consent window for processing cookies (following GDPR policies).

Since Google chooses the interface language and other regional settings based on your location, the phrase "People also ask" may appear in a different language. First check the output of your headless browser and update the label in the script accordingly (it serves as the anchor for locating the required elements).
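In practice, this means widening the regular expression used in the script. A minimal sketch (LOCALIZED_LABEL_1 and LOCALIZED_LABEL_2 are placeholders; replace them with the exact label text you see in your localized SERP):

import re

# Alternative labels are separated by a vertical bar; replace the placeholders with real localized labels
paa_label = re.compile('People also ask|LOCALIZED_LABEL_1|LOCALIZED_LABEL_2')

# ...and use it in the lookup instead of the hard-coded string:
# paa_main = soup.find(string=paa_label).find_parent()...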

Step 4. Connecting the Scraper via Rotating Proxies

To avoid solving a CAPTCHA every time, setting up the parser connection through a proxy makes sense.

If you need to rotate proxies at the level of individual tabs, you will have to use third-party add-ons such as NodriverProxy (at least for nodriver, the library used in our code example).

Nodriver itself does not natively support proxies, so you must either configure them at the browser level (via the CDP protocol) or use another library with a web driver.
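If your proxy is authorized by IP allowlist (no username and password), the simplest option is to pass the proxy as a Chrome launch flag via nodriver's browser_args parameter. A minimal sketch, with PROXY_ADDRESS and PROXY_PORT as placeholders:

import asyncio
import nodriver as hdlschrome

async def main():
    # The --proxy-server flag works for proxies authorized by IP allowlist (no login/password prompt)
    browser = await hdlschrome.start(
        headless=False,
        browser_args=['--proxy-server=http://PROXY_ADDRESS:PROXY_PORT']
    )
    page = await browser.get("https://www.google.com/search?q=what+is+a+proxy")
    await asyncio.sleep(5)
    browser.stop()

if __name__ == "__main__":
    hdlschrome.loop().run_until_complete(main())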

For example, in Selenium, proxies can be configured at the level of launch parameters for a headless browser instance:

# Import the libraries, including the Options module
from selenium import webdriver
from selenium.webdriver.chrome.service import Service as ChromeService
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

# Specify your proxy connection parameters here; replace the placeholders written in capital letters
proxy_host = "PROXY_ADDRESS"
proxy_port = "PROXY_PORT"
proxy_user = "YOUR_USERNAME"
proxy_pass = "YOUR_PASSWORD"

# Here you can specify other options and browser launch flags
options = Options()
proxy_server_url = f"https://{proxy_user}:{proxy_pass}@{proxy_host}:{proxy_port}"
options.add_argument(f'--proxy-server={proxy_server_url}')

# The browser is then launched with the specified options
driver = webdriver.Chrome(
    service=ChromeService(ChromeDriverManager().install()),
    options=options
)

# Your scraper logic goes below
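Keep in mind that Chrome itself tends to ignore the username and password embedded in the --proxy-server flag: for proxies that require login/password authentication, it shows a credentials prompt instead. A common workaround (not part of the original script) is the third-party selenium-wire library, which handles proxy authentication on its own. A minimal sketch with the same placeholders as above:

# pip install selenium-wire
from seleniumwire import webdriver  # a drop-in replacement for selenium's webdriver

seleniumwire_options = {
    'proxy': {
        'http': 'http://YOUR_USERNAME:YOUR_PASSWORD@PROXY_ADDRESS:PROXY_PORT',
        'https': 'https://YOUR_USERNAME:YOUR_PASSWORD@PROXY_ADDRESS:PROXY_PORT',
    }
}

# selenium-wire passes the credentials to the proxy itself, so no browser prompt appears
driver = webdriver.Chrome(seleniumwire_options=seleniumwire_options)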

Conclusion and Recommendations


Scraping Google search results has become significantly more challenging: the company has moved away from static HTML pages (they are now rendered almost entirely by JavaScript) and has deliberately introduced unique, obfuscated identifiers for CSS classes and other attributes.

Overcoming the new markup requires considerable effort, but that's not all. The search engine actively protects its pages with CAPTCHAs. To bypass Google's restrictions, rotating proxies and specialized tools capable of masking the use of a headless browser are necessary.

You can find high-quality residential and mobile proxies at Froxy. A budget-friendly trial package is available for convenient testing. You connect the proxies to your parser once; further rotation settings are managed in the user dashboard.

Additionally, we offer a ready-to-use SERP scraper. There is no need to write scripts: the data is exported as a table or in JSON format.
