My Walmart scraper using CDP Mode worked reliably (10% captcha rate) but now gets blocked 90% of the time, even with residential proxies. Looking for guidance on what changed or what I'm missing.
Environment
SeleniumBase version: [your version - run pip show seleniumbase]
Deployment: GitHub Actions (Ubuntu runner)
Mode: CDP Mode (sb.activate_cdp_mode())
Proxies: Webshare residential (tested with/without - same failure rate)
Browser: Chrome (UC mode enabled)
Code Structure
Based on SeleniumBase examples (walmart examples in particular):
from seleniumbase import SB

# `proxy` is the Webshare residential proxy string, defined elsewhere.
with SB(uc=True, test=True, ad_block=True, proxy=proxy) as sb:
    sb.activate_cdp_mode("https://www.walmart.com/")
    sb.sleep(2.8)
What Changed
Before (2-3 months ago): ~10% captcha rate, scraper ran successfully
Now: ~90% captcha rate (PerimeterX #px-captcha)
Observation: Same code, same approach, drastically different results
What I've Tried
✅ Residential proxies (Webshare) - no improvement
✅ Longer and random sleep timings (3-7 seconds) - no improvement
✅ Running locally vs GitHub Actions - running locally works much better than GitHub Actions: locally I get a captcha only sometimes (well under 20% of runs), mostly after clicking the search button, while on GitHub Actions I get it right after opening the first page
✅ gui_click_and_hold() for captcha (from examples) - works when the captcha appears, but the captcha appears too often
✅ Removing detection scripts (skyline-ad, sba-container)
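To put numbers on the comparisons above (local vs GitHub Actions, with vs without proxy), here is a minimal sketch of how the captcha rate could be measured over several runs with the random 3-7 second pauses. The names `jittered_delay`, `measure_captcha_rate`, and `visit` are hypothetical helpers for illustration and are not part of SeleniumBase.

```python
import random
import time
from typing import Callable


def jittered_delay(low: float = 3.0, high: float = 7.0) -> float:
    """Pick a random pause between low and high seconds (3-7 s as tried above)."""
    return random.uniform(low, high)


def measure_captcha_rate(
    visit: Callable[[], bool],
    runs: int,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Call visit() `runs` times and return the fraction that hit a captcha.

    `visit` is a hypothetical callable: it should load the page once and
    return True when the #px-captcha element was shown.
    """
    if runs <= 0:
        raise ValueError("runs must be positive")
    hits = 0
    for i in range(runs):
        if visit():
            hits += 1
        # Pause between runs, but not after the last one.
        if i < runs - 1:
            sleep(jittered_delay())
    return hits / runs
```

In the scraper, `visit` could wrap the code under "Code Structure" and return whether `#px-captcha` is visible after the page loads (for example via `sb.is_element_visible("#px-captcha")`, assuming that check behaves the same in CDP Mode). Running it once locally and once on the runner would give directly comparable rates.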
I am still a learner and am trying to add this scraper to a pipeline I am building. If the GitHub Actions IP ranges are being flagged, I don't mind if you could point me to other alternatives (even paid ones (: ).
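One way to test the IP-range suspicion is to check which address the site actually sees. The sketch below assumes the runner ranges are read from the `actions` key of GitHub's public meta endpoint (`https://api.github.com/meta`); the helper names are hypothetical, and the exit IP would have to be obtained separately (for example from an IP echo service loaded through the same browser session).

```python
import ipaddress
import json
import urllib.request
from typing import Iterable, List


def fetch_actions_cidrs(url: str = "https://api.github.com/meta") -> List[str]:
    """Download the CIDR blocks GitHub publishes for Actions runners."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp).get("actions", [])


def ip_in_ranges(ip: str, cidrs: Iterable[str]) -> bool:
    """Return True if `ip` falls inside any of the given CIDR blocks."""
    addr = ipaddress.ip_address(ip)
    for cidr in cidrs:
        net = ipaddress.ip_network(cidr, strict=False)
        # Only compare addresses of the same family (IPv4 vs IPv6).
        if addr.version == net.version and addr in net:
            return True
    return False
```

If the exit IP seen from the runner still falls inside the Actions ranges while the proxy is enabled, the traffic is probably not going through the proxy at all, which would be consistent with the "with/without - same failure rate" observation.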
Thank you, guys, and special thanks to mdmintz.