I played about in the interactive Python shell, trying to understand the data and how to tie it together. I learned the difference between exec and eval in Python. I learned about capturing standard I/O (stdout in particular) for exec, but I couldn't figure out a way to automatically create variables in the proper scope in Python.
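For reference, here is a minimal sketch of the exec/eval difference and of capturing stdout. It assumes Python 3 (where exec is a function and `contextlib.redirect_stdout` exists), unlike the Python 2 script further down, and the names `result`, `namespace`, `buffer` and `captured` are made up for the illustration.

```python
import contextlib
import io

# eval() evaluates a single expression and hands back its value.
result = eval("2 + 3")

# exec() runs statements and returns None. Anything it prints goes to
# stdout, so redirect stdout into a buffer to capture it.
namespace = {}  # explicit namespace: exec stores its variables here
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    exec("x = 2 + 3\nprint(x)", namespace)
captured = buffer.getvalue()

print(result)          # value returned by eval
print(repr(captured))  # text printed by the exec'd code
print(namespace["x"])  # variable created by the exec'd code
```

Note that the variable x ends up in the namespace dictionary passed to exec, not as a plain variable in the calling code, which is exactly the scope problem I kept running into.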
I even finally found a good quote on this at http://mail.python.org/pipermail/tutor/2005-January/035253.html:
> This is something I've been trying to figure out for some time. Is
> there a way in Python to take a string [say something from a
> raw_input] and make that string a variable name? I want to to this so
> that I can create class instances on-the-fly, using a user-entered
> string as the instance name.

This comes up regularly from beginners and is nearly always a bad idea! The easy solution is to use a dictionary to store the instances.
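The dictionary approach from the quote might look something like this. It is only a sketch: the `Job` class, the `add_instance` helper and the sample names are hypothetical, standing in for whatever would be created on the fly.

```python
class Job:
    """Hypothetical class used only for this illustration."""

    def __init__(self, title):
        self.title = title


# Instead of turning a user-entered string into a variable name,
# use the string as a key in an ordinary dictionary.
instances = {}


def add_instance(name, title):
    """Create a Job and store it under the given (user-entered) name."""
    instances[name] = Job(title)
    return instances[name]


add_instance("manager", "Manager")
add_instance("sysadmin", "System Administrator")

# Look an instance up by the same string later on.
print(instances["manager"].title)
```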
Nice to know I'm not the first to want to do this. But it did get me thinking: I have been calling this set of Perl scripts 'data dictionaries' for longer than I care to remember, and the code is not very legible at times. So I decided to redo the script as:
#!/usr/bin/python
import sys
first_line = True
lang = []
iCounter = 0
for line in open(sys.argv[1]):
    line2 = line.lstrip()
    iCounter += 1
    if line2.startswith("!") or line2.startswith("#"):
        if first_line:
            lang = line2[1:].split(",")
            first_line = False
        continue
    splity = line2.split(",")
    dtemp = {}
    if len(splity) != len(lang):
        print "Error - args do not match header on line %d" % (iCounter)
        continue
    for i in range(len(splity)):
        dtemp[lang[i]] = splity[i]
    print "%s - %s: %s for %s\n\t%s\n" % (
        dtemp['started'],
        dtemp['ended'],
        dtemp['title'],
        dtemp['company'],
        dtemp['description'])
dtemp['started'] is more verbose than $started, but it is clearer how I am generating the data. And I have more error checking (which I have yet to sanity check :->).
Anyway, this fails and I knew why almost right off the bat:
> ./r3.py r2.txt
Traceback (most recent call last):
  File "./r3.py", line 33, in <module>
    dtemp['description'])
KeyError: 'description'
I was suspicious about that extra newline I mentioned way back in "The simple version of the old perl script". I suspected that the line read from the file still had a trailing one that I needed to remove. I.e., the data dictionary has a key for 'description\n' and not 'description'.
The following change proved that:
for line in open(sys.argv[1]):
    line1 = line.lstrip()
    line2 = line1.rstrip()
    iCounter += 1
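To see why the first version failed, here is a small sketch of what lstrip() alone leaves behind. It is written for Python 3, and the header line is an assumption: the field names come from the script, but their order in the real flat file is a guess.

```python
# Assumed header line; the real file's column order may differ.
header = "!started,ended,title,company,description\n"

# lstrip() only removes leading whitespace, so the newline at the end
# of the line stays attached to the last field name.
keys_lstrip = header.lstrip()[1:].split(",")

# strip() (or lstrip() followed by rstrip()) removes it.
keys_strip = header.strip()[1:].split(",")

print(repr(keys_lstrip[-1]))  # last key still carries the newline
print(repr(keys_strip[-1]))   # last key is clean
```

With the first form, looking up 'description' raises KeyError because the only matching key is 'description\n'.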
And some quick sanity checking of removing a column in one row and adding one in another row shows that my error checking works:
> ./r3.py r3.txt
Error - args do not match header on line 2
Error - args do not match header on line 3
4/01 - 6/01: Manager for Network Appliance
	Manager of Engineering Internal Test

10/99 - 4/01: System Administrator for Network Appliance
	Perl hacker and filer administrator
So I learned what I set out to do. I may never use this script, but it helped me learn some things the hard way. I didn't show all of the little syntax errors I had to fix (forgetting the ':', not indenting in the interactive shell, etc). But hopefully, I'll remember them.
I'll also claim that the script does meet my needs as did the old one. If I add a new field to the flat file, I won't have to change the script to get the current output! And yes, I just tried that and I didn't have a problem.
I could do some more error checking (e.g., don't access an entry unless it is set), but I've already gone beyond the error checking in the Perl script. The final script:
#!/usr/bin/python
import sys
first_line = True
lang = []
iCounter = 0
for line in open(sys.argv[1]):
    line1 = line.lstrip()
    line2 = line1.rstrip()
    iCounter += 1
    if line2.startswith("!") or line2.startswith("#"):
        if first_line:
            lang = line2[1:].split(",")
            first_line = False
        continue
    splity = line2.split(",")
    dtemp = {}
    if len(splity) != len(lang):
        print "Error - args do not match header on line %d" % (iCounter)
        continue
    for i in range(len(splity)):
        dtemp[lang[i]] = splity[i]
    print "%s - %s: %s for %s\n\t%s\n" % (
        dtemp['started'],
        dtemp['ended'],
        dtemp['title'],
        dtemp['company'],
        dtemp['description'])
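The extra error checking I mentioned (don't access an entry unless it is set) could be sketched roughly as follows. This is not the script I ran: it is a Python 3 rework, the function names `parse_records` and `describe` are made up, the "(not set)" placeholder is arbitrary, and the sample column order is assumed.

```python
def parse_records(lines):
    """Return (records, bad_line_numbers) parsed from an iterable of lines."""
    header = []
    records = []
    bad_lines = []
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if line.startswith("!") or line.startswith("#"):
            # The first comment line names the fields; later ones are skipped.
            if not header:
                header = line[1:].split(",")
            continue
        fields = line.split(",")
        if len(fields) != len(header):
            bad_lines.append(number)
            continue
        # Pair each field name with its value.
        records.append(dict(zip(header, fields)))
    return records, bad_lines


def describe(record):
    """Format one record; a missing field shows a placeholder instead of raising KeyError."""
    wanted = ("started", "ended", "title", "company", "description")
    return "%s - %s: %s for %s\n\t%s\n" % tuple(
        record.get(key, "(not set)") for key in wanted)


sample = [
    "!started,ended,title,company,description\n",
    "4/01,6/01,Manager,Network Appliance,Manager of Engineering Internal Test\n",
    "10/99,4/01,System Administrator\n",  # too few columns on purpose
]
records, bad_lines = parse_records(sample)
for record in records:
    print(describe(record))
print("bad lines:", bad_lines)
```

Using record.get() means a header that lacks one of the five fields degrades to a placeholder rather than a traceback, at the cost of hiding a typo in a field name.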
...
for line in open(sys.argv[1]):
...line = line.strip()
...iCounter += 1
...
Posted by Anonymous on October 08, 2008 at 06:45 AM CDT #
for i in range(len(splity)):
dtemp[lang[i]] = splity[i]
could also be
for index,value in enumerate(splity):
...dtemp[lang[index]] = value
I think this would be considered more pythonic, as it avoids iterating on an integer and iterates on the data instead.
Posted by Neil McCallum on October 15, 2008 at 04:49 PM CDT #