If you have a simple script and a small number of pages to scrape, the parser can operate in a semi-manual mode, meaning you'll be around and ready to solve CAPTCHAs at any moment.
The more extensive the scale of parsing becomes, the more time you’ll have to spend bypassing bot protection systems. A simpler and more effective solution is to use proxy servers, preferably with automatic rotation.
If you're writing your script in Python and using the Requests library, this material is for you. We’ll explain how to use a proxy with Python Requests: how, what, and where to configure, whether you need to rotate proxies, what pitfalls or errors you may encounter, and how to work around them.
What Is Python Requests?
Python Requests is a library for sending HTTP requests and handling the responses: headers, status codes, errors, and content.
The official Python Requests website hosts detailed technical documentation.
According to the library’s developers, Python Requests is HTTP "for humans."
Normally, Python Requests works in tandem with another library — urllib3. This library is responsible for creating an HTTP client in Python. It handles SSL/TLS certificate verification (used for encrypting traffic and establishing secure HTTPS connections) and it also compresses and decompresses content in formats like gzip, deflate, brotli, and zstd.
Python Requests makes the entire functionality more accessible through a simple and easy-to-understand syntax.
Key features of the library include:
- Connection persistence (keep-alive) and connection pooling;
- Ability to work with any domains and URLs;
- Processing of cookies, sessions, SSL certificates (similar to real browsers), server response codes, headers, forms, files, etc.;
- Full support for request methods such as GET, POST, PUT, DELETE, HEAD, OPTIONS, etc.;
- Built-in support for working with JSON;
- Ability to handle exceptions and errors;
- Automatic decoding of response bodies into Unicode;
- Support for connection timeouts;
- Ability to connect through proxies;
- Support for the most in-demand authentication types: Basic, Digest and, via add-ons, OAuth (versions 1 and 2), OpenID Connect, etc.;
- Detailed documentation and API;
- Clean and human-friendly syntax, especially compared to libraries like urllib2.
Python Requests is one of the most popular libraries used for building custom web scrapers.
Understanding the Way Proxies Work in Python Requests
A proxy is an intermediary server (a computer or any other network device) that sits between the user and the target resource (a website or web application).
When requests are sent through a proxy, the proxy server forwards them to the target resource on its own behalf, usually hiding your data in the process. As a result, the target resource responds to the proxy, unaware of your presence.
Proxies are often used to increase privacy, bypass regional restrictions, speed up network performance through caching systems (the proxy can also modify your requests and content), and manage access or distribute resources/traffic in corporate networks.
There are different types of proxies. For example, proxies can forward your requests “as is” (these are called transparent proxies), add special headers but still hide your data (semi-anonymous proxies), and completely hide all traces of proxy use (anonymous or elite proxies).
- Based on connection type, proxies are categorized by the protocols they support: HTTP(S) proxies, SOCKS proxies, SSH proxies, etc. You can read more in our comparison of HTTP and SOCKS proxies.
- Based on the type of IP address used, proxies can be residential (based on the IPs of home users and their devices), mobile (IPs from pools that belong to mobile carriers), or server proxies. Sometimes corporate proxies are also singled out; these use IPs owned by companies or legal entities.
- Based on IP rotation, proxies can be static (each proxy has a fixed IP assigned to the client) or rotating (also called BackConnect proxies: they use a fixed entry point but rotate the exit IPs automatically).
As you might guess, different proxy types can be combined. For example, residential proxies can use a backconnect setup over the HTTP protocol and still remain fully anonymous.
Now let's figure out how to use a proxy in Python Requests and what it takes to do that.
The requests library works with the HTTP/HTTPS protocol, so it makes the most sense to use HTTP/HTTPS proxies with it.
However, if you specifically need support for SOCKS proxies, you'll need to install an additional extension requests[socks].
Proxies in Python Requests can be used at the individual request level or for entire sessions (when your script connects to a site and exchanges a series of requests/responses with it).
Enough theory — let’s proceed to the practical part!
Basic Proxy Setup in Python Requests
Suppose you already have your Python environment installed and configured. If not, download the Python distribution from the official website and install it on your operating system. On Linux, you can use the default package manager (APT, Pacman, DPKG, RPM) and default repositories. In many cases, Python is already pre-installed.
On Windows, make sure that Python is added to your system’s environment variables (Path). Similarly, ensure the pip package manager is accessible from the command line.
Next, install the library itself by running the following command in the console:
pip install requests
If the pip command is not available on its own, you can invoke it through Python instead:
python -m pip install requests
To use the library in your scripts, it has to be imported. Add the following line at the top of your code:
# Your first Python script, the lines starting with # are comments and won't affect your code
import requests
# The rest of your script goes below...
…
Python Requests Using a Proxy
Here is the simplest Python Requests proxy integration into your parser:
# Import the requests library first
import requests
# Define your Python proxy server settings in a dictionary (a set of variables)
my_proxies = {
# Create the variables in the dictionary and set their parameters
'http': 'http://123.58.199.17:8168',
'https': 'http://123.58.199.17:8168',
}
# Add your proxies to the request when sending it
# The configuration below uses the GET method, but it can be replaced with POST or other methods, depending on the task
response = requests.get('http://google.com', proxies=my_proxies)
# Print the HTTP status code from the server's response
# The expected code for stable work is 200
print(f"The server returned code: {response.status_code}")
# The rest of the Python parser code below…
Let’s make the script more complicated by adding a User-Agent header (to mimic a browser) and printing the server’s status code:
# Import the requests library first
import requests
# Define the proxy list, then
proxies = {
'http': 'http://198.49.68.80:80', # Example HTTP proxy
'https': 'http://198.49.68.80:80' # The same HTTP proxy is also used for HTTPS traffic
}
# User-Agent headers
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36'
}
# Sending the request via proxy
try:
    response = requests.get('http://www.google.com', proxies=proxies, headers=headers, timeout=5)
    # Server response returns to the console
    # We need code 200, which means everything is OK
    print(f"Server response code: {response.status_code}")
# The errors are processed separately and then printed in the console for debugging
except requests.exceptions.RequestException as e:
    print(f"An error occurred: {e}")
How to Set Proxy in Python Requests at the Session Level
So, how do you add a proxy to Python Requests at the session level? The Requests library supports working through a proxy for an entire session.
The Python parser code would look like this:
# Import the Requests library first
import requests
# Next, define your proxy server parameters by creating a dictionary (a set of variables)
proxies = {
'http': 'http://45.155.226.176:3128',
'https': 'http://172.67.182.20:1080',
}
# Create a session object
session = requests.Session()
# Add proxy parameters to the session
session.proxies.update(proxies)
# Now use the session mechanism to access the desired site
# "example-site.com" should be replaced with your own address
session.get('http://example-site.com')
# Continue with the rest of your Python script logic...
Note: Back in 2014, developers discovered that the session.proxies setting does not behave as expected. The issue is that, at request time, the session merges in the environment variables returned by the urllib.request.getproxies function, and those values take precedence over the session-level settings. Because of this, it's recommended to define proxies at the request level from the start, using the structure we mentioned at the very beginning.
This issue with sessions is still relevant today.
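If you still prefer working through a session, a common workaround is to pass the proxies explicitly with each request made via the session and, if needed, to stop the session from reading environment variables (trust_env). Here is a minimal sketch using the example addresses from above:
# Import the Requests library first
import requests
# The same example proxy parameters as above
proxies = {
    'http': 'http://45.155.226.176:3128',
    'https': 'http://172.67.182.20:1080',
}
# Create a session object
session = requests.Session()
# Optionally ignore HTTP_PROXY/HTTPS_PROXY environment variables for this session
session.trust_env = False
# Passing proxies at the request level overrides both session and environment settings
response = session.get('http://example-site.com', proxies=proxies, timeout=5)
print(f"Server response code: {response.status_code}")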
Using Proxies with Authentication
Proxies don’t necessarily have to be specified as IP addresses. You can also use domain names in the string format. For example, instead of the construction:
'http': 'http://45.155.226.176:3128',
You can write something like:
'http': 'http://my-proxies-for-python-requests.com:3128',
Even if your proxy server requires mandatory authentication with a username and password, that’s not a problem. You just need to format the string correctly. The structure should look like:
http://[YOUR_USERNAME]:[PASSWORD]@[HOST]:[PORT_NUMBER]
A more detailed example:
'http': 'http://my-real-login:1234PASS4321@45.155.226.176:3128',
Or:
'http': 'http://my-real-login:1234PASS4321@my-proxies-for-python-requests.com:3128',
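Putting it together, here is a minimal sketch of a request through an authenticated proxy (the login, password, and address are the same placeholder values as above):
import requests
# Placeholder credentials and proxy address from the examples above
proxies = {
    'http': 'http://my-real-login:1234PASS4321@45.155.226.176:3128',
    'https': 'http://my-real-login:1234PASS4321@45.155.226.176:3128',
}
response = requests.get('http://example-site.com', proxies=proxies, timeout=5)
print(f"Server response code: {response.status_code}")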
Using Proxies for Python Requests at the Environment Variables Level
Since proxy parameters can be considered sensitive (almost confidential) information, it makes sense to store them separately from your main code.
The Python Requests library supports using environment variables for proxy settings.
You can define your proxies like this (console commands):
export HTTP_PROXY="http://45.155.226.176:80"
export HTTPS_PROXY="http://45.155.226.176:8080"
If you’re working in a Windows environment, use the set command instead of export:
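For example, in the classic cmd.exe console (using the same example addresses):
set HTTP_PROXY=http://45.155.226.176:80
set HTTPS_PROXY=http://45.155.226.176:8080
In PowerShell, the equivalent would be $env:HTTP_PROXY = "http://45.155.226.176:80".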
Once the environment variables are set, you can remove proxy mentions entirely from your scripts — they will be picked up automatically.
Just like with local proxy parameters in code, environment variables support the following format: [YOUR_USERNAME]:[PASSWORD]@[HOST]:[PORT_NUMBER]
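For example, reusing the placeholder credentials from the authentication section:
export HTTPS_PROXY="http://my-real-login:1234PASS4321@45.155.226.176:3128"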
Using SOCKS Proxy with Python Requests
Starting from version 2.10, the Python Requests library supports working with SOCKS proxies.
To use them, however, you’ll need to install an extra extension:
pip install requests[socks]
SOCKS proxies are defined just like HTTP proxies in your parsers or scripts:
proxies = {
'http': 'socks5://your-user:pass@proxy-host:port',
'https': 'socks5://your-user:pass@proxy-host:port'
}
Actually, there’s no difference other than specifying the SOCKS5 protocol.
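For completeness, here is a minimal sketch of sending a request through such a SOCKS proxy (the host, port, and credentials are placeholders). If you want domain names to be resolved by the proxy rather than locally, the socks5h:// scheme can be used instead of socks5://:
# Requires the requests[socks] extra to be installed
import requests
proxies = {
    'http': 'socks5://your-user:pass@proxy-host:port',
    'https': 'socks5://your-user:pass@proxy-host:port'
}
response = requests.get('http://example-site.com', proxies=proxies, timeout=5)
print(f"Server response code: {response.status_code}")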
If you want to set a SOCKS5 proxy for Python Requests on the level of environment variables, use ALL_PROXY. For example:
export ALL_PROXY="socks5://45.155.226.176:3434"
For Windows:
set ALL_PROXY="socks5://45.155.226.176:3434"
Proxy Error Handling and Troubleshooting
All the classes for handling errors and exceptions in the Requests library can be found in the file:
Python313\Lib\site-packages\requests\exceptions.py
Here, Python313 is the directory where Python is installed (usually named after the current version, 3.13 in our case).
Specifically, this file defines exception classes for the following (a short example follows the list):
- timeouts,
- URL requests (if the request format is incorrect),
- redirects,
- incorrect schemes and headers,
- connections (including SSL-specific connection errors),
- encoding issues, etc.
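For instance, here is a minimal sketch that catches a few of these classes separately (a proxy failure, a timeout, and any other Requests error), assuming a proxies dictionary like the ones defined earlier:
import requests
proxies = {
    'http': 'http://45.155.226.176:3128',
    'https': 'http://45.155.226.176:3128',
}
try:
    response = requests.get('http://example-site.com', proxies=proxies, timeout=5)
    print(f"Server response code: {response.status_code}")
# The proxy refused or dropped the connection
except requests.exceptions.ProxyError as e:
    print(f"Proxy error: {e}")
# The server (or proxy) did not answer within the timeout
except requests.exceptions.Timeout as e:
    print(f"Timeout: {e}")
# Any other error raised by the Requests library
except requests.exceptions.RequestException as e:
    print(f"An error occurred: {e}")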
A simpler example just prints the server's response code or a generic error:
try:
    response = requests.get('http://example-site.com', proxies=proxies, timeout=5)
    # Print the server's response code
    print(f"Server response code: {response.status_code}")
# Print any errors or exceptions from the library
except requests.exceptions.RequestException as e:
    print(f"An error occurred: {e}")
Another example, where we proceed to parsing only if the server's response code is 200 (meaning the requested page is accessible). The snippet is assumed to live inside a parsing function (hence the return statements) and requires BeautifulSoup to be imported (from bs4 import BeautifulSoup):
# Check if the server response is code 200
if response.status_code == 200:
    # If the code matches, print confirmation
    print("Server OK, there are no errors")
    # Then, parse the content using BeautifulSoup
    soup = BeautifulSoup(response.content, "html.parser")
    # Return the page content (i.e., the HTML code)
    return soup
# In all other cases, print only the error code
else:
    print(f"The server returned an error: {response.status_code}")
    # And return nothing
    return None
As mentioned above, the parsing here is handled by BeautifulSoup; see our separate material on working with BeautifulSoup for details.
Proxy Rotation for Web Scraping with Python Requests
In the example below, you need to add all your proxies to a list. The script will count the number of items in the list and select a random one. The final request will be sent through the chosen proxy.
A new random proxy is selected on every call (and on every retry if the previous proxy fails).
Here is the script itself:
# Import the Python Requests and Random libraries (for working with random numbers)
import requests
import random
# Insert your proxy connection data here. The number of entries can be arbitrary.
proxy_list = [
"http://222.252.194.204:5000",
"http://my-proxy-service.com:5001",
"http://162.223.90.130:5100",
"http://my-real-login:1234PASS4321@45.155.226.176:3128",
"http://50.217.226.41:5050",
"http://211.128.96.206:8080",
"http://72.10.160.170:8001"
]
# Define the function logic that will handle sending requests through rotating proxies
# It takes the HTTP request type, the target URL, and optional keyword arguments (**kwargs)
def proxy_request(request_type, url, **kwargs):
    while True:
        try:
            # Pick the current proxy:
            # choose a random integer between 0 and the number of proxies minus one
            # (list elements are numbered from 0, so the last index is N-1)
            current_proxy = random.randint(0, len(proxy_list) - 1)
            # Set up the proxies dictionary for both HTTP and HTTPS using the standard scheme
            proxies = {"http": proxy_list[current_proxy], "https": proxy_list[current_proxy]}
            # Send the request to the target page with the selected proxy
            response = requests.request(request_type, url, proxies=proxies, timeout=5, **kwargs)
            # Print the currently used proxy to the console
            print(f"Proxy used: {proxies['https']}")
            # Exit the loop
            break
        # Process exceptions and errors
        except Exception as err:
            # If the proxy didn't work, report it; the loop retries with another random proxy
            print(f"Proxy error: {err}, retrying with another one")
    # Return the response object
    return response
# Run our function and pass in the necessary parameters: at minimum, the request type "GET" and the target URL
proxy_request('GET', 'http://site.com')
Instead of selecting a random number, you could organize a sequential rotation, going from the first to the last item in the list. In that case, you'd increment a starting counter with each new request. Once the list is exhausted, you can restart from zero.
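Here is a minimal sketch of that sequential (round-robin) approach, reusing the proxy_list defined above; the helper name is arbitrary, and itertools.cycle simply restarts from the first element once the list is exhausted:
import itertools
import requests
# Cycle endlessly through the proxy_list defined above
proxy_cycle = itertools.cycle(proxy_list)
def next_proxy_request(request_type, url, **kwargs):
    # Take the next proxy in order (wraps around to the start automatically)
    proxy = next(proxy_cycle)
    proxies = {"http": proxy, "https": proxy}
    return requests.request(request_type, url, proxies=proxies, timeout=5, **kwargs)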
A simpler and more logical approach is to use rotating proxies, such as those provided by services like Froxy.
In this case, proxy rotation for Python Requests will be handled on the service side. The rotation logic can be configured in your account dashboard (you can choose rotation based on time, on every new request, by maximizing IP retention, etc.). You can also set the geographical location of the proxy server here.
For the Python Requests library (and your entire scraper), the proxy connection is made only once. You just need to specify the connection parameters of the entry point (host, port, and credentials): it looks like a regular proxy and connects in the same way, but it's not the final exit point; it's an entry point into the proxy network. Find out the details in the material on BackConnect proxies.
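In code, nothing changes compared to a regular proxy: you point Requests at the single entry point, and the service rotates the exit IPs on its side. The hostname, port, and credentials below are purely illustrative placeholders:
import requests
# A single rotating (BackConnect) entry point; exit IPs change on the provider's side
proxies = {
    'http': 'http://your-login:your-password@proxy.example-provider.com:10000',
    'https': 'http://your-login:your-password@proxy.example-provider.com:10000',
}
response = requests.get('http://example-site.com', proxies=proxies, timeout=10)
print(f"Server response code: {response.status_code}")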
Best Practices and Security Considerations for Web Scraping with Python Requests
Modern websites actively combat unwanted traffic and use various protection mechanisms.
Yes, sending requests via proxies using the Python Requests library is one of the most effective ways to bypass such protections and improve your anonymity (privacy).
However, proxies alone may not be enough. Here are some best practices to help you scrape efficiently and avoid getting blocked:
- Use random delays between requests. Consistent intervals between requests are a clear sign of bots or scrapers (see the sketch at the end of this section).
- Avoid high request frequency to the same website. Sending too many requests can strain the target site and increase the risk of detection. If you need to make many requests, spread them out over different proxies (parallel processing helps both speed and stealth).
- If you have any doubts regarding the quality of your proxy in Python requests, take the time to test it in advance.
- Set a proper User-Agent and headers.
- Emulate cookies and digital fingerprints correctly. If your scraping involves multiple user profiles or sessions, consider using anti-detect browsers to simulate realistic behavior.
- If a site is fully built with JavaScript or has dynamic elements, Python Requests can’t render the content. A headless browser should be used instead.
- Watch out for honeypot traps in HTML.
- Consider integrating CAPTCHA-solving services for automated CAPTCHA handling (CAPTCHA is the most popular anti-bot protection measure, the so-called "first line of defense").
- If a website or a web service has an official API, use it. By the way, the Python Requests library supports JSON format out of the box.
Find out more details in the article Best Practices for Web Scraping without Blocks.
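As promised above, here is a minimal sketch of adding random delays between requests; the page list is hypothetical, and the bounds (2 to 7 seconds) are arbitrary and should be tuned to the target site:
import random
import time
import requests
proxies = {
    'http': 'http://45.155.226.176:3128',
    'https': 'http://45.155.226.176:3128',
}
# Hypothetical list of pages to scrape
urls = ['http://example-site.com/page1', 'http://example-site.com/page2']
for url in urls:
    response = requests.get(url, proxies=proxies, timeout=5)
    print(url, response.status_code)
    # Wait a random amount of time so the intervals don't look machine-generated
    time.sleep(random.uniform(2, 7))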
Conclusion and Recommendations
The Requests library is a powerful tool for building web scrapers in Python. Its user-friendly syntax and wide range of features make it a great choice for working with HTTP/HTTPS requests.
However, the weakest point of any scraper is the risk of being detected and blocked. To avoid this, you should at least use proxy servers. While it’s easy to connect proxies to Python Requests in just a couple of lines of code, the type and quality of the proxy matter significantly. The best solution is to use anonymous rotating proxies based on mobile or residential IPs.
You can find such proxies through our service. Froxy offers an affordable trial package for testing.
Our IP pool exceeds 10 million addresses. Rotation can be set by time or for each new request, and targeting is available down to the city and mobile carrier.