Python script to get links from yahoo search

This is a quick script I wrote to pull links from Yahoo search using the BOSS search API and then list the unique domains.

If you want the full links rather than just the domains, modify the script so that the whole URL is appended to the list. Yahoo does not let you fetch all the results, only a predefined number, so this code extracts only about 800 domains. That is still good enough for a start and for most uses.

I am also working on getting citation values from Google Scholar for a friend. I will post that here soon. Here's the code for now.

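#! /usr/bin/python
import urllib,json
from urlparse import urlparse

# Assumed setup: fill in your own BOSS application id.
yahoo_application_id = "YOUR_APPLICATION_ID"
links = []
nextresult = 0
#print yahoo_application_id

#print ""+yahoo_application_id+"&format=xml"
while True:
	print "trying result from " + str(nextresult)
	f = urllib.urlopen(""+yahoo_application_id+"&format=json&count=100&start="+str(nextresult))
	ss = f.read()
	# Assumed: parse the response with json.loads and read totalhits from it.
	ssjson = json.loads(ss)
	totalhits = ssjson["ysearchresponse"]["totalhits"]
	print totalhits
	for x in ssjson["ysearchresponse"]["resultset_web"]:
		url= x["url"]
		o = urlparse(url)
		# Keep only scheme://host so each domain is listed once.
		link = o[0]+"://"+o[1]
		if link not in links:
			links.append(link)
	# Move on to the next page of 100 results.
	nextresult = nextresult + 100
	if (nextresult>10000):
		break
print "Obtained results: " + str(nextresult) + " of which " + str(len(links)) + " were unique."
for x in links:
	print x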

Cool huh? If you want any help modifying this, drop me a line.
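If you are on Python 3, the domain-collecting step might look roughly like this. It is only a sketch of the deduplication idea, not the script above: the `unique_domains` name and the sample URLs are made up for illustration, and it does not talk to Yahoo at all.

```python
from urllib.parse import urlparse


def unique_domains(urls):
    """Return scheme://host for each URL, keeping only the first occurrence."""
    seen = set()
    domains = []
    for url in urls:
        parts = urlparse(url)
        link = parts.scheme + "://" + parts.netloc
        if link not in seen:
            seen.add(link)
            domains.append(link)
    return domains


# Hypothetical sample input standing in for the search results.
sample = [
    "http://example.com/a",
    "http://example.com/b?x=1",
    "https://example.org/",
]
for domain in unique_domains(sample):
    print(domain)
```

To keep the whole links instead, append `url` rather than `link` and check `url` against the set.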

Python script to get addresses from google maps

A simple Python script to get the addresses of businesses in a city. It is just a quick demo I wrote for a client in an hour.

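import urllib
# Return the value after matchstr, up to the next double quote (body reconstructed, an assumption).
def getdata(idstr, matchstr):
	line = ""
	pos = idstr.find(matchstr)
	if pos > -1:
		pos = pos + len(matchstr)
		line = idstr[pos:idstr.find("\"", pos)]
	return line
# url (the maps search page) and data (its base query string) must be set before running.
location="&q=" + urllib.quote("airport loc: New Delhi, India") + "&btnG=" + urllib.quote("Search Maps")

fp=urllib.urlopen(url + "?" + data + location)
filecontents=""
for line in fp.readlines():
	filecontents=filecontents + line
startloc=0
morecontent=True
while morecontent==True:
	startpos=filecontents.find("id:", startloc)
	if startpos>-1:
		endpos=filecontents.find("}}}", startpos)
		if endpos>-1:
			# Each record runs from "id:" up to the closing "}}}".
			section=filecontents[startpos:endpos]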
			sxti=getdata(section, "sxti:\"")
			sxsn=getdata(section, "sxsn:\"")
			sxst=getdata(section, "sxst:\"")
			sxpr=getdata(section, "sxpr:\"")
			sxpo=getdata(section, "sxpo:\"")
			sxph=getdata(section, "sxph:\"")
			actual_url=getdata(section, "actual_url:\"")
			print sxti + ", " + sxsn + ", " + sxst + ", " + sxpr + ", " + sxpo + ", " + sxph + ", " + actual_url
			# Continue the search after this record; stop when nothing more is found.
			startloc=endpos
		else:
			morecontent=False
	else:
		morecontent=False
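The parsing idea here is easier to see on a small string than on a live page, so here is a rough Python 3 sketch of it. The helper names (`get_field`, `iter_sections`) and the sample text are invented for illustration. It assumes each record starts at `id:`, ends at `}}}`, and stores fields as `key:"value"`, which is what the script above looks for. The real page may well be laid out differently.

```python
def get_field(section, key):
    """Return the quoted value that follows key + ':"' in section, or ''."""
    marker = key + ':"'
    start = section.find(marker)
    if start == -1:
        return ""
    start += len(marker)
    end = section.find('"', start)
    if end == -1:
        return ""
    return section[start:end]


def iter_sections(text):
    """Yield each chunk that starts at 'id:' and runs up to the next '}}}'."""
    pos = 0
    while True:
        start = text.find("id:", pos)
        if start == -1:
            return
        end = text.find("}}}", start)
        if end == -1:
            return
        yield text[start:end]
        pos = end


# Made-up sample text in the assumed record format.
sample = 'x{id:"A",sxti:"Airport Cafe",sxph:"011-555"}}}y{id:"B",sxti:"Runway Hotel",sxph:""}}}'
for section in iter_sections(sample):
    print(get_field(section, "sxti") + ", " + get_field(section, "sxph"))
```

Using a generator for the records keeps the field extraction separate from the scanning, so adding another field is a one-line change.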

Solving Error with bzr in debian etch. PathNotChild Stack Trace

I installed Bazaar on my Debian box in order to host the ‘Leave Status Board’ project on Launchpad. Following the instructions to init, add and commit worked fine. But when I tried to push to the Launchpad account (I had to add my RSA public key to Launchpad first), I got a whole bunch of Python errors and a stack trace.

bzr: ERROR: bzrlib.errors.PathNotChild: Path ‘bzr+ssh://’ is not a child of path ‘bzr+ssh://’

Traceback (most recent call last):
File “/usr/lib/python2.4/site-packages/bzrlib/”, line 611, in run_bzr_catch_errors
return run_bzr(argv)
File “/usr/lib/python2.4/site-packages/bzrlib/”, line 573, in run_bzr
ret = run(*run_argv)
File “/usr/lib/python2.4/site-packages/bzrlib/”, line 282, in run_argv_aliases
File “/usr/lib/python2.4/site-packages/bzrlib/”, line 601, in run
relurl = to_transport.relpath(location_url)
File “/usr/lib/python2.4/site-packages/bzrlib/transport/”, line 375, in relpath
raise errors.PathNotChild(abspath, self.base)
PathNotChild: Path ‘bzr+ssh://’ is not a child of path ‘bzr+ssh://’

bzr 0.11.0 on python (linux2)
arguments: ['/usr/bin/bzr', 'push', 'bzr+ssh://']

** please send this report to
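For context, the message comes from a relative-path check: the trace shows `relpath` raising `PathNotChild` when the location is not underneath the transport's base. The sketch below is a simplified stand-in for that kind of check, not bzrlib's actual code, and the URLs in the usage lines are hypothetical.

```python
class PathNotChild(Exception):
    """Raised when a path does not sit underneath the base path."""


def relpath(base, abspath):
    """Return abspath relative to base, or raise PathNotChild."""
    base = base.rstrip("/")
    if abspath == base:
        return ""
    if not abspath.startswith(base + "/"):
        raise PathNotChild(
            "Path %r is not a child of path %r" % (abspath, base)
        )
    return abspath[len(base) + 1:]


# Hypothetical URLs, only to show both outcomes.
print(relpath("bzr+ssh://host.example/~user/project",
              "bzr+ssh://host.example/~user/project/trunk"))
try:
    relpath("bzr+ssh://host.example/~user/project",
            "bzr+ssh://other.example/elsewhere")
except PathNotChild as err:
    print(err)
```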

A helpful soul on IRC mentioned that the 0.11-1.1 version of Bazaar I had installed was quite old. Considering that 1.11 is the current version, it really was.

A quick trip to fetch the stable version for i386 with wget got me the deb installer.

I uninstalled the old version and installed the deb file. A minor hiccup was that python-celementtree needed to be installed first, which I did. The installer also complained that Python 2.5 was required while my system had 2.4, but that didn’t stop the install.

Pushing to Launchpad works fine for now, so I take it that Python 2.4 is quite OK for this version of bzr.

Getting Skype to work on Debian Etch

I was having some trouble getting the new Skype 2.0 to work on my system for testing. The package I downloaded was skype-debian_2.0.0.63-1_i386.deb.

Double-clicking the file just opened gdebi, which immediately closed again.

Run from the terminal, gdebi crashes with the following error:
Traceback (most recent call last):
File “/usr/bin/gdebi”, line 31, in ?
if not[1]):
File “/var/lib/python-support/python2.4/GDebi/”, line 31, in open
if not self._deb.checkDeb():
File “/var/lib/python-support/python2.4/GDebi/”, line 185, in checkDeb
if arch != “all” and arch != apt_pkg.CPU:
AttributeError: ‘module’ object has no attribute ‘CPU’

A bit of research later, I figured out it was a problem with the Python libraries and found a patch for it.

It simply involves opening /var/lib/python-support/python2.4/GDebi/

and replacing this line in the #check arch section:

if arch != "all" and arch != apt_pkg.CPU:

with:

if arch != "all" and arch != apt_pkg.Config["APT::Architecture"]:
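If you would rather script the edit than do it by hand, something like the following Python 3 sketch would work. The function names are made up, and it assumes the file uses plain double quotes around `all` exactly as in the line to be replaced. You have to pass in the path of the GDebi module yourself. Back the file up before running it.

```python
OLD_CHECK = 'if arch != "all" and arch != apt_pkg.CPU:'
NEW_CHECK = 'if arch != "all" and arch != apt_pkg.Config["APT::Architecture"]:'


def patch_arch_check(source):
    """Return source with the apt_pkg.CPU comparison swapped for the config lookup."""
    return source.replace(OLD_CHECK, NEW_CHECK)


def patch_file(path):
    """Rewrite the file at path in place; return True if anything changed."""
    with open(path) as handle:
        original = handle.read()
    patched = patch_arch_check(original)
    if patched == original:
        return False
    with open(path, "w") as handle:
        handle.write(patched)
    return True
```

`patch_file` returns False when the old line is not found, so running it twice does no harm.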

Skype now installs without any trouble.