Tuesday, April 24, 2012
numpy: combining 'less' and 'more' operators in where expression, bitwise AND
Efficiency 101: reading large FITS table with Python
Last week I hastily wrote some code to search the NSAtlas. It was almost ok for a single galaxy: the array is huge, and iterating through it 1000 times...it took 1 minute for each loop, so I would have to wait all day for the full query to complete.
I've changed the 'if' loop to numpy.where statements and used the collections module: here. It is almost usable, though it takes 15 minutes to search for some variable for 1000 galaxies, and the machine slows down to an almost complete halt.
There is a significant overhead associated with opening and closing files, especially as the one in question was quite huge (0.5 GB) in this case. Not calling the function from within a loop, but passing an array of search arguments and then looping through it within the function reduced the running time to 16 seconds. A 60-fold improvement in two minutes.
I'd also like to mention a fine module called collections, which helped me find an overlap between two lists, like that:
a_multiset = collections.Counter(list(ras[0]))
b_multiset = collections.Counter(list(decs[0]))
overlap = list((a_multiset & b_multiset).elements())
Wednesday, April 18, 2012
NSAtlas, surface photometry
Monday, April 16, 2012
Surface photometry
Friday, April 13, 2012
SDSS database utilities
Thursday, April 12, 2012
Latex degree symbol
Wednesday, April 11, 2012
shell: counting files in the current directory
Wednesday, April 4, 2012
Creating a discrete Gaussian kernel with Python
def gauss_kern(size, sizey=None):
""" Returns a normalized 2D gauss kernel array for convolutions """
size = int(size)
if not sizey:
sizey = size
else:
sizey = int(sizey)
x, y = mgrid[-size:size+1, -sizey:sizey+1]
g = exp(-(x**2/float(size)+y**2/float(sizey)))
return g / g.sum()
Note that 'size' here refers to (shape[0]/2 - 1), i.e. the resulting kernel side is size*2 + 1 long. I need to use a kernel with a central value of 1, so I adjusted the normalisation accordingly:
def gauss_kern(size, sizey=None):
""" Returns a normalized 2D gauss kernel array for convolutions """
size = int(size)
if not sizey:
sizey = size
else:
sizey = int(sizey)
x, y = scipy.mgrid[-size:size+1, -sizey:sizey+1]
g = scipy.exp(-(x**2/float(size)+y**2/float(sizey)))
return g / g.max()
Tuesday, April 3, 2012
A quick way to check if a list of files is complete
# -*- coding: utf-8 -*-
import os
import csv
import string
def touch(fname, times = None):
with file(fname, 'a'):
os.utime(fname, times)
for i in range(0, 939):
with open('../data/maskFilenames.csv', 'rb') as f:
mycsv = csv.reader(f)
mycsv = list(mycsv)
fname = string.strip(mycsv[i][1])
touch(fname)
print fname
Check if files in a list exist with Python
wrongFilenames = []
for i in range(0, noOfGalaxies):
maskFile = GalaxyParameters.getMaskUrl(listFile, dataDir, simpleFile, i)
print i, 'maskfile: ', maskFile
try:
f = open(maskFile)
print "maskfile exists:", maskFile
except IOError as e:
wrongFilenames.append(maskFile)
Creating masks with GIMP
masked = np.where(mask != 0)
mask[masked] = mask[masked]/mask[masked]
Monday, April 2, 2012
WCS to pixel values and vice versa in Python
from astLib import astWCS
WCS=astWCS.WCS("Galaxy.fits")
center = WCS.wcs2pix(ra, dec)
print center
print 'brightness at the galaxy center', img[center[1], center[0]]
#thanks, numpy -- python indexes arrays ass-backward (or, rather, column-backward) for some reason. One have to be careful not to swap them.
Getting tsField files from SDSS
wget http://das.sdss.org/imaging/'+run+'/'+rerun+'/calibChunks/'+camcol+'/tsField-'+runstr+'-'+camcol+'-'+rerun+'-'+field_str+'.fit
After