Tag Archive for 'google-hack'

05
Aug

Forum SPAMMER Site Mirrored

I used GNU wget to mirror the forum SPAMMER’s site and ended up with about 3G of data. I’m going to mirror it every day to see if anything new pops up. After the first run it gets easier, since wget only downloads new and changed files.

As far as the proxies go, it was just as I suspected. Out of 2879 unique proxies (from 268M of log data), not including SOCKS, there was nothing but TIMEOUT and CLOSED proxies. Not one single live box. Probably 85% were already in the database, but that’s just an eyeball estimate.

There were some hints at the scams this guy is running SEO on, along with some telling graphics. It was a good education on these guys, but I’ve barely scratched the surface. I have a lot going on today and I’m going to be very busy until next week, so I’ll save my findings for later.

04
Aug

Google Hack: Not Dead Yet

No sooner do I declare the Google Hack played out than it turns me into a liar.

Shortly after my last post, I ran the Hack one more time before putting it out to pasture. A few minutes into the run, it came up with one of “those” sites. It seems to have every kind of scam on the Net (lose weight, make money from Google/Craigslist, payday loans, etc.) and some type of blackhat SEO angle. There are dozens of text files listing “users” of various online forums, the accounts they use, URLs for their profiles, and…

… the proxies they use to post.

I ran a few Google searches on some of these names and they are absolutely everywhere. There’s hardly an online forum that hasn’t been hit by these accounts, so it’s definitely a forum or comment SPAM operation of some sort.

I’m not sure if this is good news or bad news. I’ve done a few runs against what they’ve got and so far it appears they don’t have anything that isn’t already in my database, although I am getting a few new TIMEOUTs and port CLOSED results. In other words, dead crap.

Anyway, there is so much stuff here that I’m going to mirror the site and bang at it at my leisure. It’s well-indexed by Google, so it’s no Big Secret, but the last time I ran across a site like this it didn’t last more than a couple of months. And the information is so… varied… that an offline study is the best way to mine it.

Check back in to see how it goes!

04
Aug

Proxy List Going On Full Auto

Seventeen months and two million proxies later, I think I finally have this proxy business sorted out.

I’m pretty sure I have everything I can possibly get out of the Google Hack and I’m hitting every proxy list that has anything to offer. The recent hack of the proxy recheck code (with the addition of an HTTP Referer header, picked from a selection of over 40,000 random URLs) made a big, BIG difference. The last recheck only carved the list down by a third, so it’s finally going into the daily cron schedule.

Previously, the recheck hacked the list down by two-thirds, to about 250 or so non-CoDeeN servers (and usually taking a chunk of CoDeeNs with it, but who cares about those anyway?). Running the resurrection code on the dead proxies generally doubled that number.

Now, at 11:15AM EST every day, whenever there are more than 925 “live” proxies, the recheck code will kick in and purge the dead ones. At 12:15PM, if there are fewer than 275 in the list, the proxy resurrection code will trigger to beef up the list.

The net result should be a decent equilibrium with about 400-650 LIVE proxies at any given time.
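The two thresholds in that schedule can be sketched as a small decision function. This is only an illustration under assumptions: the name `scheduled_action`, the job labels, and the returned strings are hypothetical and are not the actual cron jobs. Only the numbers 925 and 275 come from the schedule described here.

```python
# Hypothetical sketch of the daily schedule; not the real cron code.
RECHECK_ABOVE = 925     # purge only when more than this many "live" proxies
RESURRECT_BELOW = 275   # resurrect only when fewer than this many remain


def scheduled_action(job: str, live_count: int) -> str:
    """Return what a given daily job would do for the current list size.

    job is "recheck" (the 11:15AM run) or "resurrect" (the 12:15PM run).
    """
    if job == "recheck":
        return "purge dead proxies" if live_count > RECHECK_ABOVE else "skip"
    if job == "resurrect":
        return "resurrect dead proxies" if live_count < RESURRECT_BELOW else "skip"
    raise ValueError(f"unknown job: {job}")


if __name__ == "__main__":
    for job, count in [("recheck", 1000), ("recheck", 600), ("resurrect", 250)]:
        print(job, count, "->", scheduled_action(job, count))
```

Keeping the two triggers far apart means neither job fires while the list sits in the middle of the range, which is what should let it settle instead of flapping.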

Although the Google Hack has run its course, there is still the possibility that new proxy lists will pop up and old lists will die, so I’m going to run it about once a month. Right now I run it every other week and I’m getting one or two new proxies for the trouble.

After that I may move it from the mrhinkydink.com domain and put it here on ProxyObsession for good. I’m already looking into my hosting options to put everything here anyway. That may happen next week.

What happens after that? IPv6 (Internet Protocol version 6) is coming sooner or later, with a long period of IPv4 backward compatibility required. 4-to-6 proxies are going to be with us for a long time.

25
Jul

Refining The Process

Since I’ve been keeping track of where I’ve been going with the Google Hack, I’ve managed to harvest over 14,000 unique proxy list URLs. There seem to be very few new lists and now I’m down to running The Hack once a week at most.

However, the top-level URLs are a different story.  I can hit these once a day and get a handful of new proxies.  So, that’s exactly what I have been doing for the last week.

There is a good enough influx of new proxies that I’ve dropped back on resurrecting the old ones.  And, now that I have a database of 14,000 URLs, I have leveraged them to get more precise results.

I have begun using a random URL from this database as a fake Referer in my requests through the proxy judges. The presence of the Referer in the page returned from the judge (almost) guarantees I haven’t hit a vanilla Web server. It is always present if the page was from a judge. Its absence means I have hit useless junk.
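A minimal sketch of that Referer test, assuming the judge page simply echoes the request headers back as text. The function names `pick_referer` and `looks_like_judge` and the sample URLs are hypothetical; the real recheck code is not shown in this post.

```python
import random
from typing import Optional, Sequence


def pick_referer(known_urls: Sequence[str],
                 rng: Optional[random.Random] = None) -> str:
    """Choose a random harvested URL to send as the fake Referer header."""
    rng = rng or random.Random()
    return rng.choice(known_urls)


def looks_like_judge(response_body: str, referer: str) -> bool:
    """A judge echoes the request headers, so the Referer should be in the page.

    If the Referer is missing, the response most likely came from a plain
    Web server (or some other junk) rather than from a proxy judge.
    """
    return referer in response_body


if __name__ == "__main__":
    urls = ["http://example.com/list1.txt", "http://example.org/proxies.html"]
    referer = pick_referer(urls)
    fake_judge_page = "REMOTE_ADDR = 192.0.2.10\nHTTP_REFERER = " + referer
    print(looks_like_judge(fake_judge_page, referer))           # judge-like page
    print(looks_like_judge("<html>It works!</html>", referer))  # vanilla server
```

Because the Referer changes on every request, a canned page is unlikely to contain it by accident. A page that reflects whatever it is sent would still pass this test.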

After I added that, things got interesting.

I have found that there are proxies out there that toggle between being a Web server and being a proxy!  That’s actually a nice cover when you think about it.  For what?  Who knows, but there it is.

There are still issues with what I like to call “false proxy judges”.  Some jokers like to run Web pages that look like the output of a proxy judge.  They always display exactly what you want to see when you’re testing proxies.  For some reason this was popular in Japan last year.

For now, this is all development code. It only runs when the proxies are rechecked or resurrected, not when the proxy is originally discovered. Until I move it into production there will still be some crap in the list.

03
Jul

1.99 Million Proxies

It happened around 9:30PM EST on the 29th.

It’s been a long march since the first million, which happened in August 2008.  That took five months and it only happened because the original Google Hack stumbled across a single file with over 700,000 (dead) proxies in it.

The site is finally getting some traffic, even though most of it is still from Cameroon.  Here’s the latest:

[Image: june-2009]

Very cool.  Traffic nearly doubled since May.  I find this odd since, according to “general wisdom” on proxies, traffic drops off during the Summer months when kids are out of school.  Of course, with Cameroon Puppy Scammers driving 60+% of the traffic, the List bucks the usual demographics.

Iran, which made a lot of proxy news in the last month, pretty much ignored Hinky.  They checked in at #43, with only 97 unique users.

Well, screw them anyway!

21
Jun

Garbage In

Here is a site the Google Hack barfed up the other day.  At first glance it would appear to have dozens of proxies listed, but on closer examination you will notice there are only two distinct IP addresses.  The only difference is the ports.

[Image: Garbage]

When refreshed, the page is updated.

If you go to the trouble of scanning the IPs, at least one has a possible proxy port (8080).  Apparently, these are connections from proxies.  That is, the listed port is the dynamic port the proxy is using to connect to this (?) site.

This is not helpful.

This site added about a thousand rows of junk to the database.  In the Grand Scheme of Things, that’s not a lot, especially if your definition of “junk” includes “dead proxies”.  If so, the database is 99.95172% junk.  However, my junk is required to have been a proxy at some time in the past, so these had to go and the URL has been banned from subsequent scans.
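A list like this one could be flagged before it pollutes the database by comparing the number of ip:port entries with the number of distinct IPs. The sketch below is an assumption about how such a check might look, not the check actually used. The names `summarize_list` and `looks_like_garbage` and the default thresholds are hypothetical.

```python
import re

# Matches things shaped like 192.0.2.1:8080 (no validation of octet ranges).
PROXY_RE = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3}):(\d{1,5})\b")


def summarize_list(text: str) -> tuple:
    """Return (number of unique ip:port entries, number of distinct IPs)."""
    entries = set(PROXY_RE.findall(text))
    distinct_ips = {ip for ip, _port in entries}
    return len(entries), len(distinct_ips)


def looks_like_garbage(text: str, min_entries: int = 10, max_ips: int = 2) -> bool:
    """Flag pages with many entries that all share a tiny set of IPs.

    The thresholds are illustrative guesses, not tuned values.
    """
    entries, ips = summarize_list(text)
    return entries >= min_entries and ips <= max_ips


if __name__ == "__main__":
    # Two IPs, many ephemeral-looking ports: the pattern described above.
    page = "\n".join(f"192.0.2.1:{40000 + i}" for i in range(6))
    page += "\n" + "\n".join(f"198.51.100.7:{50000 + i}" for i in range(6))
    print(summarize_list(page), looks_like_garbage(page))
```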

06
Jun

1.9 Million Proxies

While twiddling around with the settings on the Google Hack today I hit on a winning combination.  Two URLs in a dot-org domain yielded thousands of “new” proxies.    

We hit the 1.9M mark on the 8PM run.   1,900,269 to be exact. 

It’s been banging away for about four hours now, all on two lousy URLs.  There may be a third, fourth, and fifth… who knows?

As usual they’re all dead proxies so far.

06
Jun

List of “Proxy” Domains by Google Hack Hits

This list is probably skewed by the way the Google Hack works and I’ve only been taking these numbers for a few weeks, but anyway all these domains have “prox” somewhere in the 2nd or 3rd level of the domain name.  They are ordered by the number of times seen in a Google search, from most often to least often.  This has nothing to do with “popularity”.

www.forum.freeproxy.ru
blog.qualityproxylist.com
www.proxyconf.net
www.aliveproxy.com
proxies.my-proxy.com
www.proxyfire.net
www.downloadproxylist.com
www.freshproxy.org
www.anonymousproxyserverlists.com
www.proxytm.com
www.anonymousproxylists.net
www.proxycn.net
www.antiproxy.net
www.proxylistservice.com
www.cnproxy.com
www.proxytr.net
www.proxytop.net
community.aliveproxy.com
www.proxyforest.com
www.proxyfire.org
www.antiproxy.org
proxyjudge1.proxyfire.net
www.allproxyinfo.co.cc
www.proxy4free.info
www.anonymousproxylists.com
forum.my-proxy.com
www.proxyguru.info
bbs.proxycn.com
www.proxynet24.com
www.proxycemetery.com
www.multiproxy.org
www.proxy-servers.org
http.proxy-world.org
www.proxylists.net
www.proxyhunter.net
www.proxylisteleri.com
www.fresh-proxy-list.net
www.textproxylists.com
www.dcsproxy.com
forum.proxy.net.pl
www.antiproxy.com
www.godlikeproxy.com
www.proxybox.com
www.proxy-list.org
www.freshproxylist.org
www.proxyserverlists.net
www.proxycn.com
info.proxyfire.org
www.bypassproxyserverlist.com
www.proxyt.net
www.proxyblind.org
www.mydrtproxy.com
www.freshproxylist.com
www.proxysecurity.com
www.proxys.com.ar
www.proxylists.us
www.5uproxy.net
www.cool-proxy.net
www.gravyproxy.com
www.freeproxylist.cn
www.proxy360.cn
www.proxy4free.com
checker.proxyfire.net
www.1proxyfree.com
www.proxycity.com
www.myproxyservers.com
en.proxy.net.pl
www.newproxy.org.ru
www.findproxy.org
www.proxytrade.net
us1.proxy12345.net.ru
www.onlinechecker.freeproxy.ru
www.runproxy.info
www.dartproxy.info
www.proxyforfree.hp-lp.net
www.speedy-proxy.cn
www.proxyshell.com
www.bestproxylist.cn
www.x-proxy.info
www.fire-proxy.com
www.proxylist.free-web-proxy.de
www.webproxy.com.es
www.goproxytoday.co.cc
www.proxy-listen.de
www.proxyonline.co.cc
jp5.proxy12345.cn
cz.proxy12345.cn
www.fresh-proxy.net
ip.proxyfire.net
www.proxybucket.com
www.changeallproxy.com
www.blog-proxy.com
www.microproxy3.blogfa.com
www.proxyanonsurf.com
www.mamproxy.com
www.url-proxy.com
www.92proxy.com
us.proxy12345.net.ru
www.proxydb.ru
jp2.proxy12345.cn
www.chinaproxy.cn
www.proxylist.net
proxy.freeproxyland.info
www.proxyplanet.blogfa.com
www.ipproxylists.com
www.proxyserverfinder.com
www.free-web-proxy.de
onlinechecker.freeproxy.ru
jp16.proxy12345.cn
www.freeproxylists.com
www.proxypad.com
www.proxy-hispano.com
edu.proxycn.com
www.proxybase.de
www.aplusproxy.com
www.proxyleech.com
www.proxymore.net
www.proxyway.com
www.proxyserverprivacy.com
www.advproxy.net
www.freeproxy.ru
www.freeproxy.info
www.proxy-servers.biz

Again, whatever they have, I have too.

I got it all, bitches!
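For anyone who wants to build a similar ranking, here is a rough sketch of the filter and the count. It assumes hostnames have already been pulled out of the search results; `is_prox_domain` and `rank_prox_domains` are hypothetical names, and the real Google Hack scripts may work quite differently.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def is_prox_domain(host: str) -> bool:
    """True if "prox" appears in the 2nd- or 3rd-level label of the host."""
    labels = host.lower().split(".")
    # labels[-1] is the top level; labels[-2] and labels[-3] are levels 2 and 3.
    return any("prox" in label for label in labels[-3:-1])


def rank_prox_domains(hosts: Iterable[str]) -> List[Tuple[str, int]]:
    """Count how often each qualifying host was seen, most frequent first."""
    counts = Counter(h.lower() for h in hosts if is_prox_domain(h))
    return counts.most_common()


if __name__ == "__main__":
    seen = [
        "www.aliveproxy.com",
        "www.example.com",       # no "prox" anywhere: ignored
        "www.aliveproxy.com",
        "www.proxys.com.ar",     # "prox" in the 3rd-level label
    ]
    for host, hits in rank_prox_domains(seen):
        print(hits, host)
```

As noted above, the counts only say how often a host turned up in the searches, so the order reflects the search terms as much as anything else.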

06
Jun

Google Hack Breaks

Since I started tracking proxy list URLs a few weeks back I’ve only been running the Google Hack about once a week.  That seems to be the right frequency to catch new stuff.

Yesterday I fired it up and… nothing.

At first I thought maybe they banned my ass again, but Google had changed the way they displayed their search results and my kiddie scripts stopped working.

After about ten minutes of tinkering I got it working again.  I also managed to slash a dozen or so lines and simplified everything.  Now it runs better than ever.

I’m actually getting good at this shit.

31
May

Chinese Junkbuster

The List was up to 850 proxies this morning, many Chinese, so I ran the China Recheck.  By the next page publish, about a hundred of them dropped out.

Since it’s my goal to have active proxies (a very rare commodity) rather than dead ones in the list, I’m going to run the recheck/purge after the page is published (every other hour). This isn’t really going to help because it means that dead Chinese proxies will be in the list anyway. The way I move things around in the database, I can’t really do a recheck unless the address has already been published. They shouldn’t be there for more than a couple of hours, when, if things keep going the way they have been, a new set of dead Chinese proxies will take their place.

Hopefully this problem will eventually work itself out.

As an experiment, I ran the Resurrection Hack on the dead Chinese proxies to see exactly how dead they are.   The vast majority time out.  The rest are closed.  A small handful came back from the dead.
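The difference between a timeout and a closed port can be seen with a plain TCP connect. The sketch below is an assumption about how those labels might be produced, not the Resurrection Hack itself. The function name `classify` and the five-second default are hypothetical, and an "OPEN" result only means the port accepted a connection, not that a working proxy is behind it.

```python
import socket


def classify(host: str, port: int, timeout: float = 5.0) -> str:
    """Return "OPEN", "CLOSED", or "TIMEOUT" for one TCP connect attempt."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "OPEN"
    except socket.timeout:
        # No answer at all within the time limit (filtered or dead host).
        return "TIMEOUT"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "CLOSED"
    except OSError:
        # Unreachable network, name lookup failure, and so on.
        # Lumped in with CLOSED here for simplicity.
        return "CLOSED"


if __name__ == "__main__":
    # 192.0.2.1 is a documentation address, so this should not reach a real host.
    print(classify("192.0.2.1", 8080, timeout=2.0))
```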

Using the SwitchProxy Tool for Firefox, I pulled one of the resurrected proxies, 58.17.3.2:80, and I’m putting it through its paces.  The speed is reasonable, but the first time I tried a Google search through it I got the “looks like you have a virus” page.  You know what that means.

I’m not sure how representative 58.17.3.2 is of the rest of the Chinese bunch. I first encountered that address back in February (on four different ports – it may also be a SOCKS proxy). It appears to be a business, registered to “Nanchang Jianmin Nuitrition Products Factory” (proud makers of melamine, I’m sure), does not reverse-resolve, and the IP itself can be found on no fewer than “about 9,270” Web sites, according to Google (very good results there – that particular search is going into the Google Hack).

Obviously a well-known, heavily abused proxy, judging by the Google warning and a permanent IP ban at 4chan.org (which is always an excellent abuse acid test).

I think a combination of aggressive purging and selective resurrection of the Chinese Junk will result in having only the most available proxies show up in the list.

We’ll see what happens with that theory.