Replacing New Words in the Media
I am writing a script that reads in a text and, if it contains a word that only recently entered the English language (i.e., was entered into the Oxford English Dictionary as of 2010), prints that word out. That is the latest update to the OED, and it comes with a list of new words. I created a text file from all the new words of 2010.
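A minimal sketch of how that check could work, assuming the word list is stored one entry per line. The function names are hypothetical, and the three sample entries are placeholders rather than entries copied from the actual OED list.

```python
import re

def load_new_words(lines):
    """Build a lowercase set of new words from an iterable of lines (one entry per line)."""
    return {line.strip().lower() for line in lines if line.strip()}

def find_new_words(text, new_words):
    """Return the new words that occur in text, in order of first appearance."""
    found = []
    # Lowercase the text and pull out runs of letters and apostrophes as tokens.
    for token in re.findall(r"[a-z']+", text.lower()):
        if token in new_words and token not in found:
            found.append(token)
    return found

# Placeholder entries for illustration; the real list would be read from the text file,
# e.g. load_new_words(open("new_words_2010.txt")) with a hypothetical file name.
sample_list = ["vuvuzela", "bromance", "staycation"]
new_words = load_new_words(sample_list)
print(find_new_words("Our staycation ended with a vuvuzela. A VUVUZELA!", new_words))
```

This only matches single-word entries; multi-word entries in the list would need a different matching step.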
To start messing around with Python and see whether I could pick a certain word out of a list, I first tried replacing every letter in a text file with a number: a to z corresponds to 1 to 26.
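One way the letter-to-number replacement might look, as a rough sketch. The `encode` helper and the choice of hyphens between numbers are my own assumptions, added so that multi-digit numbers stay readable.

```python
import string

# Map 'a'..'z' to 1..26.
letter_to_number = {
    letter: index
    for index, letter in enumerate(string.ascii_lowercase, start=1)
}

def encode(text):
    """Replace each letter with its alphabet position.

    Numbers within a word are joined with '-' (an assumed separator) and
    words are separated by spaces; non-letter characters are dropped.
    """
    encoded_words = []
    for word in text.split():
        numbers = [str(letter_to_number[ch]) for ch in word.lower() if ch in letter_to_number]
        encoded_words.append("-".join(numbers))
    return " ".join(encoded_words)

print(encode("Hi zoo!"))  # 8-9 26-15-15
```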
I am still working on this code (it shouldn't be too complicated…). I am also still working out, conceptually, what to do with it and why it might be interesting. More to come…
Here is the code (I will shortly install a plugin that lets me post legible code):
i=0
num = len(alphabet)
#print num
alphadict = {}
while i
Here is the result:
What I have mostly been working on for the past week is taking a file of carpool data, where each record is a pair of zip codes in the format |origin, destination|,
and turning it into this (FINALLY I cleaned up the data and got latitude/longitude points):
[origin_latitude origin_longitude destination_latitude destination_longitude]
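As a compact illustration of that conversion, here is a sketch that parses the `origin,destination|` records and looks each code up in a coordinate table. The function names and the coordinate values are placeholders I made up for the example; real coordinates would come from a zip code file.

```python
def parse_pairs(record_line):
    """Split 'origin,destination| origin,destination| ...' into (origin, destination) tuples."""
    pairs = []
    for chunk in record_line.split("|"):
        fields = [field.strip() for field in chunk.split(",")]
        # Keep only well-formed records with two non-empty codes.
        if len(fields) == 2 and all(fields):
            pairs.append((fields[0], fields[1]))
    return pairs

def to_coordinates(pairs, zip_coordinates):
    """Return [origin_lat, origin_lon, dest_lat, dest_lon] rows, skipping unknown codes."""
    rows = []
    for origin, destination in pairs:
        if origin in zip_coordinates and destination in zip_coordinates:
            rows.append(list(zip_coordinates[origin] + zip_coordinates[destination]))
    return rows

# Placeholder (latitude, longitude) values for illustration only, not real data.
zip_coordinates = {"11111": (10.0, -20.0), "22222": (30.0, -40.0)}
rows = to_coordinates(parse_pairs("11111,22222| 11111,99999| "), zip_coordinates)
print(rows)  # [[10.0, -20.0, 30.0, -40.0]]
```

The second record is skipped because "99999" has no entry in the table, which is the same guard needed for codes that are not US zip codes.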
See the code:
import sys

# define a function
# (note: this shadows Python's built-in len; kept as a first exercise in writing a function)
def len(array):
    i = 0
    for entry in array:
        i = i + 1
    return i

def clean(line):
    line = line.replace("\'", "")
    line = line.replace(" ", "")
    line = line.replace(",", ".")
    line = line.replace("(", "")
    line = line.replace(")", "")
    line = line.replace("...", ",")
    line = line.strip("(),'\"")
    return line

def zipToLat(origin, destination, zipCodeMap):
    # for each origin and destination zipcode that exists in the zipCodeMap,
    # print it out in the format: latOrigin lonOrigin latDestination lonDestination
    if origin in zipCodeMap and destination in zipCodeMap:
        print("%s %s %s %s" % (zipCodeMap[origin][0], zipCodeMap[origin][1],
                               zipCodeMap[destination][0], zipCodeMap[destination][1]))

def maxloc(locationsDict, zipmap):
    # takes a list of (zipcode, count) pairs;
    # prints the most visited location with its count and returns its name
    tmp_max_count = -1
    tmp_max_value = ''
    zipMapRF = {}
    for entry in locationsDict:
        if len(entry) > 1:
            zipcode = entry[0]  # entry[0] is the zipcode of the location
            count = entry[1]    # entry[1] is the number of times it appears
            # check beforehand - does zipmap have this key?
            if zipcode in zipmap:
                zipMapRF[zipcode] = count
                if tmp_max_count < count:
                    tmp_max_count = count
                    tmp_max_value = zipmap[zipcode]
    print("popular location is %s which was visited %s times." % (tmp_max_value, tmp_max_count))
    return tmp_max_value

# now get value and the map
zipNames = {}
zipCoordinates = {}
starts = {}
destinations = {}
locations = {}

for line in open('zips.txt'):  # can i just load it from http://www.census.gov/tiger/tms/gazetteer/zips.txt
    tup = line.split(",")
    if len(tup) >= 6:  # fields 4 and 5 (lon, lat) are read below
        zipcode = tup[1][1:-1]
        state = tup[2][1:-1]
        town = tup[3][1:-1]
        lon = tup[4]
        lat = tup[5]
        point = lat, lon  # latitude first, to match the output format
        zipNames[zipcode] = town + ", " + state
        zipCoordinates[zipcode] = point

for line in open('phpErid.txt'):
    for pointPair in line.split("|"):
        # build concordance
        pointPair = pointPair.strip()
        points = pointPair.split(",")
        if len(points) >= 2:
            dest = points[1]
            loc = points[0]
            destinations[dest] = destinations.get(dest, 0) + 1
            starts[loc] = starts.get(loc, 0) + 1
            locations[dest] = locations.get(dest, 0) + 1
            locations[loc] = locations.get(loc, 0) + 1
            zipToLat(loc, dest, zipCoordinates)

# turn the counts into sorted (zipcode, count) pairs before looking for the maximum
locations = sorted(locations.items())
if len(locations) > 1:
    topZipCode = maxloc(locations, zipNames)

# sample of the data in phpErid.txt:
# 60647,60120| 60647,60120| L1X2W8,M5S1N5| L1X2W8,M5S1N5| 22601,20041| 22601,20041| 19446,19477| 19446,19477| 19446,19477| 19446,19477| 21224,20019| 90840,92660| 02114,02132| 17047,17128| 94612,94903| 94612,94903| 94612,94903| 62034,62035| 62034,62035| 94110,94403| 94110,94403| 07306,07059|
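The tallying of the most popular location could also be written with `collections.Counter` from the standard library. This is only a sketch of an alternative, not the approach used in the listing: `most_visited` is a hypothetical helper, the codes are taken from the sample data, and the town names are placeholders.

```python
from collections import Counter

def most_visited(pairs, zip_names):
    """Return (location name, visit count) for the busiest known code, or None.

    Every appearance as an origin or a destination counts as one visit.
    Codes missing from zip_names (e.g. non-US postal codes) are ignored.
    """
    counts = Counter()
    for origin, destination in pairs:
        counts[origin] += 1
        counts[destination] += 1
    known = [(code, n) for code, n in counts.most_common() if code in zip_names]
    if not known:
        return None
    code, n = known[0]
    return zip_names[code], n

pairs = [
    ("19446", "19477"), ("19446", "19477"), ("19446", "60120"),
    ("60647", "60120"),
    ("L1X2W8", "M5S1N5"), ("L1X2W8", "M5S1N5"), ("L1X2W8", "M5S1N5"),
]
# Placeholder names for illustration; real names would come from the zip code file.
zip_names = {"19446": "Town A, ST", "19477": "Town B, ST", "60647": "Town C, ST"}
print(most_visited(pairs, zip_names))  # ('Town A, ST', 3)
```

The Canadian postal code also appears three times here, but it is filtered out because it has no entry in the name table.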