In this guide, we are going to use the very popular Axios Node module to make HTTP requests. But, we are going to do so via a proxy server. We are going to explore both free and paid proxies. With paid proxies, you are also usually given authentication credentials. So, the Axios example will show you how you can use the Axios library in proxy mode with auth.
If you are going to try to do any sort of web-scraping activity using Node.js, this guide will have all the information you need.
A proxy is usually used in web scraping because the target website might not accept requests coming from a data center server and its IP address. With a residential proxy, the target website sees a residential IP address instead, so you are much less likely to get blocked when scraping.
The Example In The Axios Docs Does Not Work
According to the official docs, you are supposed to simply pass in a proxy object in the config like so:
// `proxy` defines the hostname, port, and protocol of the proxy server.
// You can also define your proxy using the conventional `http_proxy` and
// `https_proxy` environment variables. If you are using environment variables
// for your proxy configuration, you can also define a `no_proxy` environment
// variable as a comma-separated list of domains that should not be proxied.
// Use `false` to disable proxies, ignoring environment variables.
// `auth` indicates that HTTP Basic auth should be used to connect to the proxy, and
// supplies credentials.
// This will set an `Proxy-Authorization` header, overwriting any existing
// `Proxy-Authorization` custom headers you have set using `headers`.
// If the proxy server uses HTTPS, then you must set the protocol to `https`.
proxy: {
  protocol: 'https',
  host: '127.0.0.1',
  // hostname: '127.0.0.1' // Takes precedence over 'host' if both are defined
  port: 9000,
  auth: {
    username: 'mikeymike',
    password: 'rapunz3l'
  }
},
However, this did not work for me, and I am not really sure why. The request hangs for ages and then finally fails with:
AxiosError.call(axiosError, error.message, code, config, request, response);
           ^
AxiosError: socket hang up
    at AxiosError.from (/home/khoj/livefiredev/axios proxy with auth/app/node_modules/axios/dist/node/axios.cjs:825:14)
    at RedirectableRequest.handleRequestError (/home/khoj/livefiredev/axios proxy with auth/app/node_modules/axios/dist/node/axios.cjs:2965:25)
    at RedirectableRequest.emit (node:events:513:28)
    at eventHandlers.<computed> (/home/khoj/livefiredev/axios proxy with auth/app/node_modules/follow-redirects/index.js:14:24)
I do not know exactly what causes this or how to fix it. So, let’s move on to a way that works.
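If you do experiment with the built-in option, it can help to catch the failure and print only the useful fields instead of the full stack trace. The sketch below is one way to do that. The helper name summarizeRequestError is made up for this example, and it only assumes the error object carries the code, message, and optional response.status fields that Axios errors have.

```javascript
// Hypothetical helper: condense an Axios-style error into one readable line.
// Assumes the error may carry `code`, `message`, and `response.status`.
function summarizeRequestError(error) {
  const parts = [];
  if (error.code) parts.push(`code=${error.code}`);
  if (error.response && error.response.status) {
    parts.push(`status=${error.response.status}`);
  }
  parts.push(`message=${error.message || 'unknown error'}`);
  return parts.join(' ');
}

// Usage sketch (not executed here). `timeout` makes a hanging request fail sooner:
// axios.get(url, { proxy: { /* ... */ }, timeout: 10000 })
//   .catch((error) => console.error(summarizeRequestError(error)));
```

A timeout does not fix the underlying problem, but it saves you from waiting a long time for every failed attempt.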
Working Way To Use A Proxy With Axios
So, in order to get things to work, you need to use another NPM package along with Axios. It’s called https-proxy-agent (you may also see it under its original repository name, node-https-proxy-agent).
So, start by adding the module to your project with something like:
> npm install https-proxy-agent
Then, you are ready to use the package in a script like so:
const axios = require('axios');
const { HttpsProxyAgent } = require('https-proxy-agent'); // recent versions export the class by name

const httpsAgent = new HttpsProxyAgent('http://USERNAME:PASSWORD@PROXY_HOST:PROXY_PORT');

(async () => {
    let url = 'URL_TO_SCRAPE';
    const response = await axios.get(url, {
        headers: {
            "accept-language": "en-GB,en-US;q=0.9,en;q=0.8",
            "cache-control": "no-cache",
            "pragma": "no-cache",
            "sec-ch-ua": "\"Chromium\";v=\"109\", \"Not_A Brand\";v=\"99\"",
            "sec-ch-ua-mobile": "?0",
            "sec-ch-ua-platform": "\"macOS\"",
            "sec-fetch-dest": "document",
            "sec-fetch-mode": "navigate",
            "sec-fetch-site": "none",
            "sec-fetch-user": "?1",
            "upgrade-insecure-requests": "1",
            "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 13_2) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.3 Safari/605.1.15"
        },
        httpsAgent: httpsAgent // send HTTPS requests through the proxy agent
    });
    console.log(response.data);
})();
Line 4, the line that creates httpsAgent, is where you need to make all your updates. All the details of your proxy go there. Just replace the placeholders with your host, port, username, and password, keeping the format below. You might also want to change http to https, depending on what your proxy supports.
http://USERNAME:PASSWORD@PROXY_HOST:PROXY_PORT
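One thing to watch for with this format: a username or password that contains characters such as @ or : will break the URL unless it is percent-encoded. Below is a minimal sketch of a helper that assembles the URL safely. The name buildProxyUrl and the example host are made up for illustration, and the sketch assumes the proxy client decodes the credentials again, as standard URL handling does.

```javascript
// Hypothetical helper: assemble a proxy URL in the format shown above.
// Credentials are percent-encoded so characters such as "@" or ":" in a
// password do not break the URL.
function buildProxyUrl({ username, password, host, port, protocol = 'http' }) {
  const auth = username
    ? `${encodeURIComponent(username)}:${encodeURIComponent(password || '')}@`
    : '';
  return `${protocol}://${auth}${host}:${port}`;
}

// Example with placeholder values (proxy.example.com is not a real proxy)
console.log(buildProxyUrl({
  username: 'user',
  password: 'p@ss:word',
  host: 'proxy.example.com',
  port: 8080
}));
// prints: http://user:p%40ss%3Aword@proxy.example.com:8080
```

The returned string can be passed to the proxy agent constructor in place of the hand-written URL.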
As you might have noticed in the above script, I have added other headers that a real browser would add. Because without those it’s a dead giveaway that it’s a script that is trying to scrape data. You can use those headers or replace them to suit your needs.
Also notice line 23, the httpsAgent option at the end of the config. That is where the httpsAgent we set up on line 4 is passed to Axios as part of the request config.
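If your machine has http_proxy or https_proxy environment variables set, Axios may try to apply them in addition to the agent. The docs quoted earlier say that setting proxy to false disables that behaviour. The sketch below builds a request config along those lines. The helper name proxiedRequestConfig and the 15 second timeout are my own assumptions, not something Axios requires.

```javascript
// Hypothetical helper: build an Axios request config that routes HTTPS
// traffic through the given agent only.
function proxiedRequestConfig(httpsAgent, headers = {}) {
  return {
    httpsAgent,      // agent that sends HTTPS requests through the proxy
    proxy: false,    // skip Axios's built-in / environment-variable proxy handling
    timeout: 15000,  // assumed value: give up after 15 seconds
    headers,
  };
}

// Usage sketch (not executed here):
// const response = await axios.get(url, proxiedRequestConfig(httpsAgent, browserHeaders));
```

Keeping the config in one place also makes it easy to reuse the same settings across several requests.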
Trying To Use Free Proxies. Most Do Not Work. So, let’s find working ones fast!
If you need a super low-cost option, you can use free proxies. In my testing, though, most free proxies just don’t work. To help you find the ones that do, I have made a script.
The script starts by fetching a massive list of free proxies from Geonode.
They provide a URL (the one used in the script below) that returns a huge proxy list in JSON format. The script downloads that list and then tests every proxy in it.
const axios = require('axios');
const { HttpsProxyAgent } = require('https-proxy-agent');

// Geonode endpoint that returns up to 500 recently checked anonymous HTTP proxies as JSON
const urlWithAllProxies = "https://proxylist.geonode.com/api/proxy-list?limit=500&page=1&sort_by=lastChecked&sort_type=desc&protocols=http&anonymityLevel=anonymous";

axios.get(urlWithAllProxies).then((response) => {
    if (response.status === 200) {
        const proxiesJson = response.data;
        proxiesJson.data.forEach(proxyRecord => {
            let ip = proxyRecord.ip;
            let port = proxyRecord.port;
            // Free proxies need no credentials, so the URL is just host and port
            const httpsAgent = new HttpsProxyAgent(`http://${ip}:${port}`);
            axios.get('https://ipv4.icanhazip.com/', {
                httpsAgent: httpsAgent
            }).then((response) => {
                if (response.status === 200) {
                    console.log(`Proxy ${ip}:${port} works! The IP we got is ${response.data}`);
                }
            }).catch((error) => {
                // Ignore proxies that fail; only the working ones are reported
            });
        });
    }
});
Once you run the above script, you start to get some results printed on the console like so:
Proxy 182.253.105.123:8080 works! The IP we got is 180.251.232.226
Proxy 193.138.178.6:8282 works! The IP we got is 193.138.178.6
Proxy 173.212.200.30:3128 works! The IP we got is 173.212.200.30
Proxy 122.155.165.191:3128 works! The IP we got is 122.155.165.191
Proxy 1.1.189.58:8080 works! The IP we got is 1.1.189.58
Proxy 185.131.172.51:5050 works! The IP we got is 185.131.172.51
The URL we are using to test the proxies is:
https://ipv4.icanhazip.com/
It’s a very simple URL that just echoes back the IP address of the machine making the request. You can open it in a browser to check it out.
As you can see above, these proxies do not need auth. They are free and open. So, the username and password section is dropped from the httpsAgent line when setting up the proxy.
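The script fires off a request for every proxy in the list at the same time, which can mean hundreds of open connections. If that turns out to be a problem, one option is to cap how many checks run at once. The sketch below shows a generic way to do that. The helper name mapWithConcurrency is made up for this example, and checkProxy in the usage note stands in for the axios.get call from the script.

```javascript
// Hypothetical helper: run `worker` over `items`, at most `limit` at a time.
// Results keep the order of `items`; an item whose worker throws yields null.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length).fill(null);
  let next = 0; // index of the next item that has not been picked up yet

  async function runner() {
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = await worker(items[index]);
      } catch (error) {
        results[index] = null; // treat any failure as "proxy does not work"
      }
    }
  }

  // Start up to `limit` runners that pull items from the shared counter
  const runners = Array.from({ length: Math.min(limit, items.length) }, () => runner());
  await Promise.all(runners);
  return results;
}

// Usage sketch (not executed here), assuming `checkProxy(proxyRecord)` wraps the
// axios.get test request with a `timeout` set and resolves with the echoed IP:
// const echoed = await mapWithConcurrency(proxiesJson.data, 20, checkProxy);
// const workingCount = echoed.filter(Boolean).length;
```

The limit of 20 is an arbitrary starting point. Adding a timeout to each test request also keeps dead proxies from holding a slot for a long time.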
If you have time, and are not super concerned about the reliability of the proxies and can afford to re-try with another proxy if one fails, then you can use the above method. It should be noted that many of these will be “data center” proxies. That means that many of them are going to get blocked when web scraping.
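The retry idea mentioned above can be sketched as a small loop that walks through a list of proxies until one request succeeds. This is only an illustration. The helper name firstSuccessful is made up, and the attempt function is whatever request you want to make through a given proxy.

```javascript
// Hypothetical helper: try each proxy in turn until one attempt succeeds.
// `attempt` is any async function that takes a proxy URL and either
// resolves with a result or throws.
async function firstSuccessful(proxyUrls, attempt) {
  let lastError = new Error('no proxies supplied');
  for (const proxyUrl of proxyUrls) {
    try {
      const result = await attempt(proxyUrl);
      return { proxyUrl, result };
    } catch (error) {
      lastError = error; // remember the failure and move on to the next proxy
    }
  }
  throw lastError; // every proxy failed
}

// Usage sketch (not executed here), reusing the names from the scripts above:
// const { proxyUrl, result } = await firstSuccessful(workingProxyUrls, (proxyUrl) =>
//   axios.get('URL_TO_SCRAPE', { httpsAgent: new HttpsProxyAgent(proxyUrl), timeout: 10000 })
// );
```

Because free proxies come and go quickly, it makes sense to refresh the list of working proxies shortly before using it.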
In any case, those are the pros and cons. You can figure out what needs to be done in your case after considering all this.
Best Place To Get A Low-Cost Residential Proxy
There is a service called IPRoyal, and I have had a good experience with it. They charge per GB, which makes it very affordable for web scraping, because we are usually just requesting a single page or an internal API. So, the data sent back is usually measured in KB.
Just to do a little math and get a sense of what things are like, using the network tab of your browser, you can look at the size of the response from the website you are trying to scrape.
Let’s say it’s about 137 KB. If you are paying IPRoyal $7 per GB, that means you can make about 7,299 requests for $7.
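The estimate above can be reproduced with a few lines of code, which is handy if you want to plug in your own numbers. This is a rough sketch. The function name requestsForBudget is made up, it treats 1 GB as 1,000,000 KB, and it counts only the response size, so real usage will be somewhat higher once request headers and connection overhead are included.

```javascript
// Estimate how many requests a budget buys when paying per GB.
// Assumes 1 GB = 1,000,000 KB (decimal units) and counts response bytes only.
function requestsForBudget(responseSizeKb, pricePerGbUsd, budgetUsd) {
  const kbPerGb = 1000000;
  const gbBought = budgetUsd / pricePerGbUsd;
  return Math.floor((gbBought * kbPerGb) / responseSizeKb);
}

console.log(requestsForBudget(137, 7, 7)); // prints 7299
```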
Many services in this space charge per request, which is easy to calculate and estimate but can get more expensive.
Starts at $7/GB: But There Is No Trial
To check the pricing for the “residential proxies”, see the pricing page on the IPRoyal website. The more GB you buy, the lower the cost per GB.
Residential proxies, as the name suggests, give you the IP address of somebody’s home broadband connection, 4G mobile phone, and so on. Websites are far less likely to block these IPs, because doing so would risk cutting off their service for real users.
The downside of IPRoyal is that there is no trial. That was a bit annoying. But since it was so cheap, I decided to take a leap of faith and make the first $7 purchase. Things worked out well thankfully.
Next, let me take you through the steps of buying from IPRoyal.
Buying Residential Proxies
The process is pretty simple. Go to IPRoyal and create a new account. It’s free.
After logging in, go to the “Royal Residential Proxies” section and create a new order. You can start by just buying 1 GB for $7.
You will be asked to make a purchase. Once you have paid, you will be able to configure your proxies, username, and password, and also set the country from which you will be assigned an IP address. It seems the password changes based on the country you choose.
Also, note the “Rotation” option. You can choose to have:
- Randomize IP – the IP changes on each request.
- Sticky IP – the IP stays the same for as long as possible.
If you need to make a few requests one after the other, for example logging in and then making another request, you might choose the “Sticky IP” option.
Finally, once you have chosen all the options you want, you can copy the proxy URL that the page generates for you. That is exactly what you need in order to use the proxy.
Using The Credentials With Axios
Using the URL given by IPRoyal works exactly as we have seen before. Here is the code snippet once again for reference.
const axios = require('axios');
const { HttpsProxyAgent } = require('https-proxy-agent');

// The line below is where you will stick in
// the URL you get from IPRoyal
const httpsAgent = new HttpsProxyAgent('http://USERNAME:PASSWORD@PROXY_HOST:PROXY_PORT');

(async () => {
    let url = 'URL_TO_SCRAPE';
    const response = await axios.get(url, {
        headers: {
            "accept-language": "en-GB,en-US;q=0.9,en;q=0.8",
            "cache-control": "no-cache",
            "pragma": "no-cache",
            "sec-ch-ua": "\"Chromium\";v=\"109\", \"Not_A Brand\";v=\"99\"",
            "sec-ch-ua-mobile": "?0",
            "sec-ch-ua-platform": "\"macOS\"",
            "sec-fetch-dest": "document",
            "sec-fetch-mode": "navigate",
            "sec-fetch-site": "none",
            "sec-fetch-user": "?1",
            "upgrade-insecure-requests": "1",
            "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 13_2) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.3 Safari/605.1.15"
        },
        httpsAgent: httpsAgent
    });
    console.log(response.data);
})();
Conclusion
I hope this guide answered your questions about how to use Axios with a proxy and auth. I have tried to make it a one-stop resource for everything you need when using free and paid proxy services.