Wednesday October 08, 2008
A reader suggestion on how to solve the Perl script

Neil doesn't like that our comment section wipes out whitespace. His concern is certainly valid when it comes to the way Python uses indentation.

He suggested the following implementation:

#!/usr/bin/env python

import csv

def main(dfile,format,delimiter=","):
        db=open(dfile,'U')
        start=0
        for line in db:
                if line.startswith(format):
                        db.seek(start+len(format))
                        return csv.DictReader(db,delimiter=delimiter)
                else:
                        start+=len(line)+(len(db.newlines)==2) #windows hackery
        # String exceptions are deprecated; raise a real exception class instead
        raise ValueError("There is no %s header line in %s" % (format,dfile))


if __name__ == "__main__":
        for row in main('data.txt','!'):
                print "%s - %s: %s for %s\n\t%s\n\n" % \
                                tuple([row[column] for column in ['started','ended','title','company','jobdesc']])

And he provided the following note:

So what about something like this?

The csv module should take care of delimiters within columns
Simplification is possible if you don't need to deal with windows or
unix style line terminators
Changing delimiters is easy too.

I like that he caught on to making the separator an argument. It makes the code much more portable. I'm not sure it is as robust with respect to error handling, but in all fairness that could easily be handled and I did add those after posting the Perl script. Oh, and it does easily handle the addition of a new column in the data file.

I like the use of raise, I'm certainly not used to exception handlers any more.

I can see part of what is going on here:

>>> for row in neil.main("r4.txt",'!'):
...     print row
...
{'description': 'NFS development', 'title': 'Staff Engineer Software', 'started': '1/05', 'company': 'Sun Microsystems', 'ended': 'present', 'mad': 'money'}
{'description': 'WAFL and NFS development', 'title': 'File System Engineer', 'started': '6/01', 'company': 'Network Appliance', 'ended': '12/05', 'mad': 'honey'}
{'description': 'Manager of Engineering Internal Test', 'title': 'Manager', 'started': '4/01', 'company': 'Network Appliance', 'ended': '6/01', 'mad': 'scot'}
{'description': 'Perl hacker and filer administrator', 'title': 'System Administrator', 'started': '10/99', 'company': 'Network Appliance', 'ended': '4/01', 'mad': 'dam'}

And I think the stuff with 'start' is what gets over the '!' in the first line???

>>> import csv
>>> help(csv.DictReader)

>>> for row in csv.DictReader(file("r4.txt")):
...     print row
...
{'!started': '1/05', 'description': 'NFS development', 'title': 'Staff Engineer Software', 'company': 'Sun Microsystems', 'ended': 'present', 'mad': 'money'}
{'!started': '6/01', 'description': 'WAFL and NFS development', 'title': 'File System Engineer', 'company': 'Network Appliance', 'ended': '12/05', 'mad': 'honey'}
{'!started': '4/01', 'description': 'Manager of Engineering Internal Test', 'title': 'Manager', 'company': 'Network Appliance', 'ended': '6/01', 'mad': 'scot'}
{'!started': '10/99', 'description': 'Perl hacker and filer administrator', 'title': 'System Administrator', 'company': 'Network Appliance', 'ended': '4/01', 'mad': 'dam'}

But I haven't figured out yet how the result is being built up. Okay, yes I have. I was fixated on the 'if' and 'else', thinking they were handling the header line versus the data lines. But no, all the loop does is get you to the header line (i.e., there may be comments in the file before it). Then the 'db.seek' moves to the start of the header line plus 1 (via 'len(format)') to skip the format character. Then, just as in my interactive example, 'csv.DictReader' does the magic for you!
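
The same idea can be had without the seek arithmetic. Here is a minimal sketch, assuming Python 3 (the scripts in this post are Python 2); read_records is a hypothetical name and the sample data is made up. It skips lines until one starts with the marker, strips the marker, and hands the rest of the file to csv.DictReader with explicit field names:

```python
import csv
import io

def read_records(lines, marker="!", delimiter=","):
    """Skip lines until one starts with `marker`, then parse the rest as CSV."""
    it = iter(lines)
    for line in it:
        if line.startswith(marker):
            # Parse the header with the csv module too, minus the marker.
            header = next(csv.reader([line[len(marker):]], delimiter=delimiter))
            # DictReader keeps consuming the same iterator, so it starts
            # at the first line after the header.
            return csv.DictReader(it, fieldnames=header, delimiter=delimiter)
    raise ValueError("no %r header line found" % marker)

# Stand-in for the data file: a comment, the header, one record.
sample = io.StringIO(
    "# a comment before the header\n"
    "!started,ended,title\n"
    "1/05,present,Staff Engineer Software\n"
)
rows = list(read_records(sample))
print(rows[0]["started"], rows[0]["title"])
```

Because nothing is seeking by byte offset, the line-terminator bookkeeping goes away.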

Sweet, Neil's code does what I had the Perl script doing!

It also shows I'm not used to all of the Python way of doing things. But it was fun to figure out what his script was doing!


Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Branch merge to a specific tagged revision

I need to do a branch merge between nfs41-gate and onnv-clone. And specifically, I want to not get the 'tip', but rather the tag for release 100. I found a good reference - Chapter 8 Managing releases and branchy development.

So I'll follow along with it. I need the tag:

[th199096@jhereg onnv-play]> hg tags
tip                             7782:716c23b2ce2e
onnv_100                        7757:bf4a45ecb669
onnv_99                         7613:e49de7ec7617
onnv_98                         7473:fad192e9bc57

It turns out I don't need much more:

[th199096@jhereg nfs41-100]> hg reparent ssh://onnv.eng//export/onnv-clone
[th199096@jhereg nfs41-100]> hg tags | more
tip                             7744:763bfa203d1a
closedv1                        7742:9fab48a31a4a
onnv_99                         7652:e49de7ec7617

So I haven't merged yet:

[th199096@jhereg nfs41-100]> hg pull -u -r onnv_100
pulling from ssh://onnv.eng//export/onnv-clone
searching for changes
adding changesets
adding manifests
adding file changes
added 64 changesets with 475 changes to 462 files (+1 heads)
not updating, since new heads added
(run 'hg heads' to see heads, 'hg merge' to merge)

The tag can be used as a revision!


Getting around a tool repository which is not updating

With the introduction of Mercurial, we have a need to keep our tools directory up to date. We could simply NFS mount the one in Menlo Park, but for WAN and build performance, that sucks. So, the Austin Labs have a local copy. And it is not being kept up to date. We've all been bitten by an old copy of the BFU script.

To get around this, we've built our own local repository and made sure that our paths all take this into account. Well, that just failed for me:

[th199096@jhereg mms]> hg outgoing -v
running ssh onnv.eng "hg -R /export/onnv-clone serve --stdio"
comparing with ssh://onnv.eng//export/onnv-clone
searching for changes
abort: style not found: /ws/onnv-tools/onbld/etc/hgstyle

I know the 'hgstyle' stuff is new; I saw Flag Day info on it. And sure enough:

[th199096@jhereg mms]> df -k /ws/onnv-tools/onbld/etc
Filesystem            kbytes    used   avail capacity  Mounted on
mool-ha1-nfs.central:/export/ds01/d531/tools/01/elpaso.eng/opt/onbld
                     140454588 109105801 29944242    79%    /ws/onnv-tools/onbld

I don't want to hack on the script, which I think shouldn't be using the full path. So I'll have to change where I'm getting my copy of the tools in /ws.

Okay, I don't have permissions on the NIS server, but I can get the map:

[th199096@jhereg ~]> ypcat -k auto.ws | grep onnv-tool
onnv-tools /SUNWspro   -ro mool-ha1-nfs.central:/export/ds01/d531/tools/01/slug-17.eng/export/$CPU/opt/SUNWspro    /teamware   -ro mool-ha1-nfs.central:/export/ds01/d531/tools/01/slug-17.eng/export/$CPU/opt/SUNWspro/SOS8    /onbld      -ro mool-ha1-nfs.central:/export/ds01/d531/tools/01/elpaso.eng/opt/onbld

And I can add it to my local /etc/auto_ws:

#
# Local copies of /ws workspaces
#
# For /ws/on10-clone use:
# /ws/on10-patch-clone-auspen or on10-feature-clone-auspen
#
on10-clone-aus          iquad:/pool/ws/on10-clone
on10-patch-clone-aus    iquad:/pool/ws/on10-patch-clone
onnv-clone-aus          iquad:/pool/ws/onnv-clone
on10-test-aus           iquad:/pool/ws/on10-test
onnv-test-aus           iquad:/pool/ws/onnv-test
onnv-stc2-aus           iquad:/pool/ws/onnv-stc2
on10-tools-aus  -ro     iquad:/pool/ws/on10-tools-$CPU
onnv-tools-aus  -ro     aus1500-home:/pool/ws/onnv-tools-$CPU
onnv-tools      /SUNWspro       -ro     /opt/SUNWspro /teamware       -ro     /opt/SUNWspro/SOS8    /onbld     -ro     /opt/onbld

And no go:

[th199096@jhereg /etc]> sudo svcadm restart autofs
...
[th199096@jhereg th199096]> ls -la /ws/onnv-tools
/ws/onnv-tools: Permission denied
total 1
[th199096@jhereg th199096]> dmesg
...
Oct  8 16:54:23 jhereg automountd[883428]: [ID 406441 daemon.error] parse_entry: mapentry parse error: map=auto_ws key=onnv-tools
Oct  8 16:55:55 jhereg automountd[883477]: [ID 406441 daemon.error] parse_entry: mapentry parse error: map=auto_ws key=onnv-tools

I turn spaces into tabs, no luck. I check other machines and they do the hierarchy locally for other things. Well, I then convert the pathnames from /opt/SUNWspro to localhost:/opt/SUNWspro. And that turns the trick:

[th199096@jhereg th199096]> ls -la /ws/onnv-tools
total 5
dr-xr-xr-x   4 root     root           4 Oct  8 17:04 .
dr-xr-xr-x   2 root     root           2 Oct  8 17:04 ..
dr-xr-xr-x   1 root     root           1 Oct  8 17:04 SUNWspro
dr-xr-xr-x   1 root     root           1 Oct  8 17:04 onbld
dr-xr-xr-x   1 root     root           1 Oct  8 17:04 teamware

I probably need to put a real fix into our jumpstart servers and make the path dependent on $CPU, but I think I was doing something when this happened.


pdf to jpg via ImageMagick

I'm the volunteer webmaster for my son's soccer club: Blitz United Soccer Club. We occasionally get logos and such from sponsors. We want jpeg images for the website and they want high quality pdf for printing. Until now, I've simply asked them for the images in a format we can handle.

I got tired of doing that and googled 'pdf to jpg'. There were a lot of hits of sites that either wanted to install to my windows box or get an email address. I added 'linux' to my search parameter and found a nice hit: Batch converting PDF to JPG/JPEG using free software.

Having heard of ImageMagick vaguely in the past, and since they had many download sites, I installed it on my WinXP desktop. And it didn't convert for me:

C:\Documents and Settings\thud\Desktop\Downloads\97red>convert cooper.pdf cooper.jpg
convert: `%s': %s "gswin32c.exe" -q -dQUIET -dPARANOIDSAFER -dBATCH -dNOPAUSE -d NOPROMPT -dMaxBitmap=500000000 -dEPSCrop -dAlignToPixels=0 -dGridFitTT=0 "-sDEVICE=pnmraw" -dTextAlphaBits=4 -dGraphicsAlphaBits=4 "-r72x72"  "-sOutputFile=C:/DOCUME~1/thud/LOCALS~1/Temp/magick-UtqkGDcw" "-fC:/DOCUME~1/thud/LOCALS~1/Temp/magick-MpE4YxWI" "-fC:/DOCUME~1/thud/LOCALS~1/Temp/magick-z6ByBicB".
convert: Postscript delegate failed `cooper.pdf': No such file or directory.
convert: missing an image filename `cooper.jpg'.

Well, I solved that fairly quickly by:

[thud@adept ~/tmp]> sudo yum install ImageMagick
Setting up Install Process
Parsing package install arguments
Package ImageMagick-6.3.5.9-1.fc8.i386 already installed and latest version
Nothing to do
[thud@adept ~/tmp]> convert -density 600 cooper.pdf cooper.jpg

Which is probably what I should have tried in the first place.


Not able to mount from Fedora Core 9

Helen Chao, a colleague who had never really used Linux, asked me to help configure a kernel. I asked why and she said she needed to test NFSv4 over RDMA. It turns out that the stock 2.6.25 kernel in Fedora Core 9 already had the support in it. We followed the directions in nfs-rdma.txt and were not able to get it running.

Helen (a great test engineer) proceeded to investigate from there and couldn't get a simple loopback or NFS mount to succeed.

So I exported the root to all hosts and went to work debugging this issue. A 'rpcinfo -p' on the server showed the expected registered services. The same call from a client failed, but a ping worked:

[th199096@jhereg ~]> rpcinfo -p pnfs-9-30
^C
[th199096@jhereg ~]> rpcinfo -p pnfs-9-30
^C
[th199096@jhereg ~]> sudo mount -o vers=3 pnfs-9-30:/ /mnt
^C
[th199096@jhereg ~]> sudo mount -o vers=3 pnfs-9-30:/ /mnt
nfs mount: pnfs-9-30: : RPC: Rpcbind failure - RPC: Timed out
nfs mount: retrying: /mnt
nfs mount: pnfs-9-30: : RPC: Rpcbind failure - RPC: Timed out
^C
[th199096@jhereg ~]> ping pnfs-9-30
pnfs-9-30 is alive

I thought that perhaps it was a firewall issue and disabled IPTABLES.

No luck and I knew the mount should succeed - I tried it with my home Core 8 box and an OpenSolaris server. It worked, but then again, that Linux box has been configured for ages. Long story short, I asked Chuck Lever for help.

His only suggestion was to turn off selinux or as he puts it:

Also disable selinux, just so your systems behave like normal Unix.

So I followed the directions I found here: How to Disable SELinux and now the mount works:

# mount -o vers=3 pnfs-9-30:/ /mnt
nfs mount: pnfs-9-30: : RPC: Rpcbind failure - RPC: Timed out
nfs mount: retrying: /mnt
nfs mount: pnfs-9-30: : RPC: Rpcbind failure - RPC: Timed out
nfs mount: pnfs-9-30: : RPC: Rpcbind failure - RPC: Timed out
nfs mount: /mnt: mounted OK
# 

Most of the help I found with google on the RPC messages wasn't informative. Either the suggestion was to turn off IPTABLES or there was no reply.


Finally, the Python version of the old Perl script

I played about in the interactive Python shell trying to understand the data and how to tie it together. I learned about the difference between exec and eval for Python. I learned about capturing stdio and stdout for exec, but I couldn't figure out a way to automatically create variables in the proper scope in Python.

I even finally found a good quote on this at http://mail.python.org/pipermail/tutor/2005-January/035253.html:

> This is something I've been trying to figure out for some time.  Is
> there a way in Python to take a string [say something from a
> raw_input] and make that string a variable name?  I want to to this so
> that I can create class instances on-the-fly, using a user-entered
> string as the instance name.

This comes up regularly from beginners and is nearly always a bad
idea!

The easy solution is to use a dictionary to store the instances.

Nice to know I'm not the first to want to do this. But it did get me thinking, I have been calling this set of Perl scripts 'data dictionaries' for longer than I care to remember. And the code is not very legible at times. So, I decided to redo the script as:

#!/usr/bin/python

import sys

first_line = True

lang = []
iCounter = 0
for line in open(sys.argv[1]):
        line2 = line.lstrip()
        iCounter += 1

        if line2.startswith("!") or line2.startswith("#"):
                if first_line:
                        lang = line2[1:].split(",")
                        first_line = False
                continue
        splity = line2.split(",")
        dtemp = {}

        if len(splity) != len(lang):
                print "Error - args do not match header on line %d" % (iCounter)
                continue

        for i in range(len(splity)):
                dtemp[lang[i]] = splity[i]

        print "%s - %s: %s for %s\n\t%s\n" % (
                dtemp['started'],
                dtemp['ended'],
                dtemp['title'],
                dtemp['company'],
                dtemp['description'])

dtemp['started'] is more verbose than $started, but it is clearer how I am generating the data. And I have more error checking (which I have yet to sanity check :->).

Anyway, this fails and I knew why almost right off the bat:

> ./r3.py r2.txt
Traceback (most recent call last):
  File "./r3.py", line 33, in <module>
    dtemp['description'])
KeyError: 'description'

I was suspicious about that extra newline I mentioned way back in The simple version of the old perl script. I suspected that the header line still had one that I needed to remove. I.e., the data dictionary has a key for 'description\n' and not 'description'.

The following change proved that:

for line in open(sys.argv[1]):
        line1 = line.lstrip()
        line2 = line1.rstrip()
        iCounter += 1
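
A quick way to see the problem for yourself, sketched in Python 3 syntax with a made-up header string: split(",") leaves the trailing newline attached to the last field, so the key stored in the dictionary is not the one the print statement asks for.

```python
header = "!started,ended,title,company,description\n"

# Without stripping, the newline rides along on the last column name.
raw_names = header[1:].split(",")
print(repr(raw_names[-1]))

# strip() removes leading and trailing whitespace in one call.
clean_names = header.strip()[1:].split(",")
print(repr(clean_names[-1]))
```

A single strip() would do the work of the lstrip()/rstrip() pair, if you don't need the intermediate value.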

And some quick sanity checking of removing a column in one row and adding one in another row shows that my error checking works:

> ./r3.py r3.txt
Error - args do not match header on line 2
Error - args do not match header on line 3
4/01 - 6/01: Manager for Network Appliance
        Manager of Engineering Internal Test

10/99 - 4/01: System Administrator for Network Appliance
        Perl hacker and filer administrator

So I learned what I set out to do. I may never use this script, but it helped me learn some things the hard way. I didn't show all of the little syntax errors I had to fix (forgetting the ':', not indenting in the interactive shell, etc). But hopefully, I'll remember them.

I'll also claim that the script does meet my needs as did the old one. If I add a new field to the flat file, I won't have to change the script to get the current output! And yes, I just tried that and I didn't have a problem.

I could do some more error checking (i.e., don't access an entry unless it is set), but I've already gone above the error checking in the Perl script.
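
That extra check might look something like the sketch below. It is in Python 3 syntax, and format_entry and the "(missing)" placeholder are hypothetical names, not part of the script. The idea is that dict.get returns a default instead of raising KeyError when a column is absent:

```python
FIELDS = ("started", "ended", "title", "company", "description")

def format_entry(record, missing="(missing)"):
    """Format one record, substituting a placeholder for absent columns."""
    values = tuple(record.get(name, missing) for name in FIELDS)
    return "%s - %s: %s for %s\n\t%s\n" % values

# A record that lacks the 'description' column.
partial = {"started": "4/01", "ended": "6/01",
           "title": "Manager", "company": "Network Appliance"}
print(format_entry(partial))
```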

Final Copy

#!/usr/bin/python

import sys

first_line = True

lang = []
iCounter = 0
for line in open(sys.argv[1]):
        line1 = line.lstrip()
        line2 = line1.rstrip()
        iCounter += 1

        if line2.startswith("!") or line2.startswith("#"):
                if first_line:
                        lang = line2[1:].split(",")
                        first_line = False
                continue
        splity = line2.split(",")
        dtemp = {}

        if len(splity) != len(lang):
                print "Error - args do not match header on line %d" % (iCounter)
                continue

        for i in range(len(splity)):
                dtemp[lang[i]] = splity[i]

        print "%s - %s: %s for %s\n\t%s\n" % (
                dtemp['started'],
                dtemp['ended'],
                dtemp['title'],
                dtemp['company'],
                dtemp['description'])

Analyzing that old perl script

Guess I have to understand that script to rewrite it in Python. :->

First, gethead.pl reads through the file until it finds a line that starts with a '!'. From that line it builds a list of variable names, each column name prefixed with a '$':

        $format = '$' . join(', $', split(/,/, $first_line));
        print $format . "\n";

Yields:

> ./r.pl r.txt
$started, $ended, $title, $company, $description

The magic really occurs in the main processing loop:

do main'read_txtfile_format(*LNG_FILE, *languages);

lang: while (<LNG_FILE>) {
        next lang if (/^#/ || /^!/);
        eval "($languages) = split(/[,\n]/)";

        print "$started - $ended: $title for $company\n\t$description\n\n";
}

The first line gets '$languages' setup to the 'variable' names. Each time through the while loop, we call the eval to translate/associate the columns to variable names.
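
For comparison, the same column-to-name association can be had in Python without eval. A minimal sketch in Python 3 syntax, with the header and one data line hard-coded as assumptions: zip pairs each column name with its value, and a dictionary takes the place of the generated variables.

```python
header = "!started,ended,title,company,description"
line = "1/05,present,Staff Engineer Software,Sun Microsystems,NFS development\n"

names = header[1:].split(",")            # drop the '!' marker
values = line.rstrip("\n").split(",")    # drop the newline before splitting
record = dict(zip(names, values))

# %(name)s pulls values out of the dictionary by key.
entry = "%(started)s - %(ended)s: %(title)s for %(company)s\n\t%(description)s\n" % record
print(entry)
```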


Python strings are Immutable

So I added some code to my simple script that wasn't in the Perl:

for line in open(sys.argv[1]):
        line.lstrip()

And my intent was to strip out all of the leading spaces. I didn't have to, but I created a simple test case with the first line pushed over by a tab and the second line pushed over by 8 spaces. The first one worked correctly and the second did not:

Update, I wasn't thinking correctly here, I knew I had two bad lines here and I didn't know why. After solving the coding problem, I can see that both lines of input failed. The header line is being treated here as if it were a normal line and being processed.

> ./simple2.py r2.txt
        !started - ended: title for company
        description



     1/05 - present: Staff Engineer Software for Sun Microsystems
        NFS development

Well d'oh, even in Perl I just told it to strip out one character. I've got to tell it that while there is whitespace, strip it out:

for line in open(sys.argv[1]):
        while line.isspace(): line.lstrip()

And this doesn't work either. At which point I realize it must be because strings are immutable, right? I mean it is never changing! Note I get to the right conclusion, but for the wrong reasons. If it were immutable and the string had whitespace at the start, I should be stuck in an endless loop here. See the ending section for that analysis.

It also points out that I never did anything with that line.lstrip(). It never changes line, but does create a reference to a new string. Which we can see here:

>>> st2 = "     This is the radio clash!"
>>> print st2.lstrip()
This is the radio clash!
>>> print st2
     This is the radio clash!
>>>

See, st2.lstrip() actually works!

I've fixed up the script (in a boring way) and it works:

for line in open(sys.argv[1]):
        line2 = line.lstrip()
        if line2.startswith("!") or line2.startswith("#"): continue
        print "%s - %s: %s for %s\n\t%s\n\n" % tuple(line2.split(","))

Another mistake I just made

Okay, to try to understand this, I did the following in the shell:

>>> st2 = "     This is the radio clash!"
>>> while st2.isspace():
...     print st2
...     st2.lstrip()
...

Which should be an endless loop according to what I know now. But nothing gets done. Which means that st2.isspace() is FALSE. And a help(st2.isspace) shows that:

Help on built-in function isspace:

isspace(...)
    S.isspace() -> bool

    Return True if all characters in S are whitespace
    and there is at least one character in S, False otherwise.

I.e., my misunderstanding of st2.lstrip() being immutable made me think that st2.isspace() worked on the first character of the string. Actually, I made a bad assumption based on what I thought C would do. My bad.

So I don't ever want to do that while loop on a string which is really all whitespace.

All the reading in the world about Python strings will not help me understand the immutability of them as much as this simple example.
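
The whole lesson fits in a few lines. A sketch in Python 3 syntax: lstrip() returns a new string and leaves the original untouched, so the result has to be bound to a name, and isspace() tests the entire string rather than its first character.

```python
st2 = "     This is the radio clash!"

st2.lstrip()                     # result is discarded; st2 is unchanged
unchanged = st2.startswith(" ")

stripped = st2.lstrip()          # keep the new string by binding it to a name

print(unchanged)
print(stripped)
print(st2.isspace())             # False: not every character is whitespace
print("   \t".isspace())         # True: whitespace only, and non-empty
```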

