textpad tech support

I have not used complex regular expressions before, and I’ll be darned if I can figure out TextPad’s regular expressions help file. Here’s what I want to do. I have a codebook with entries like this:

99999  Something County

and I want to use search and replace in TextPad to turn all these lines into this:

replace countyname="Something County" if fips==99999

Anybody know how?

Author: olderwoman

I'm a sociology professor but not only a sociology professor. I keep my name out of this blog because I don't want my name associated with it in a Google search. Although I never write anything in a public forum like a blog that I'd be ashamed to have associated with my name (and you shouldn't either), it is illegal for me to use my position as a public employee to advance my religious or political views, and the pseudonym helps to preserve the distinction between my public and private identities. The pseudonym also helps to protect the people I may write about in describing public or semi-public events I've been involved with. You can read about my academic work on my academic blog http://www.ssc.wisc.edu/soc/racepoliticsjustice/ --Pam Oliver

12 thoughts on “textpad tech support”

  1. I’d convert to another application. You can always convert it back into a text document after you’ve made your changes if you need that for formatting purposes. You can probably open the file in Word or something similar (if you just click on the file, it’ll open as text).

  2. me too — i’ve done exactly this with census data. i opened the text file in excel and added columns with the text i wanted to add. then, i pasted it back into textpad and got rid of the spaces if needed with search/replace. there are probably many better ways to do it but this is always what i end up doing…

  3. Thanks, I figured out how to do it in a spreadsheet myself after posting, but I’m still irritated at not being able to understand the regular expression stuff in the textpad documentation, so if there is someone who does speak that language, I’d appreciate an example, as it seems likely this will come up in the future & others might want to know.

  4. That’s the kind of situation where I’d use awk, especially if there are lots of files that need manipulating. It’s made to chew up and rearrange delimited files. Assuming your datafile is tab-delimited, this should do the trick and dump the rearranged output into out.txt:

    awk -F '\t' '{printf "replace countyname=\"%s\" if fips==%s\n", $2, $1}' datafile.txt > out.txt

  5. TextWrangler (a Mac text editor) has an excellent regexp help file (which i will send you as a pdf if you email me and ask for it). Since regexps are a standard thing, they tend to be the same across programs.

    anyway, this code should work
    find:
    ^([0-9]+)\t([A-Za-z ]+)\n
    replace:
    replace countyname="\2" if fips==\1\n

    To briefly explain the code it means find a line that starts with a number, then a tab, a string (with embedded spaces), and a return
    keep the number in memory as subpattern 1 and the string as subpattern 2
    replace it with the stata syntax you want with #1 and #2 in the appropriate places.
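
    As a rough illustration of the same find and replace outside a text editor, here is a minimal Python sketch. It assumes the codebook really is tab-delimited; the sample text and the variable names are made up for the example.

```python
import re

# Hypothetical sample: FIPS code, a tab, then the county name, one entry per line.
codebook = "99999\tSomething County\n99998\tAnother County\n"

# Same idea as the find pattern above: group 1 is the digits, group 2 is the name.
# re.MULTILINE makes ^ match at the start of every line, not just the first one.
pattern = re.compile(r"^([0-9]+)\t([A-Za-z ]+)\n", re.MULTILINE)

# \1 and \2 in the replacement refer back to the two captured groups.
result = pattern.sub(r'replace countyname="\2" if fips==\1\n', codebook)
print(result, end="")
```

    Lines that do not fit the pattern (a name with punctuation, say) are left untouched, so it is worth checking the output for leftovers.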

  6. Gabriel’s code should work if your text strings are separated by tabs. I have a slightly less elegant solution that I just tested in TextPad:

    Assuming that all you have on the relevant lines of the codebook is fips code and the county string:

    99999 Something County

    Search Box: ^\([0-9]+\) +\(.*$\)

    note: I don’t know why, but textpad forces you to backslash before the parentheses to tell it you want to capture that text.

    This search tells textpad to capture all the digits in register 1, skip any space and then capture anything after the space to the end of the line (dollar sign) in register two. If my code doesn’t work, substitute text pad’s {ws} code for the space +.

    Replace Box: replace countyname = "\2" if fips==\1

    In my test:

    Before:

    99999 Something County
    99999 Something County
    99999 Something County

    After
    replace countyname = "Something County" if fips==99999
    replace countyname = "Something County" if fips==99999
    replace countyname = "Something County" if fips==99999
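
    For anyone running the same substitution outside TextPad, here is a rough Python sketch of the search and replace above. Python's re module uses bare parentheses for capture groups (a backslashed parenthesis matches a literal one there), so the pattern is written without the backslashes. The variable names are invented for the example.

```python
import re

# The three test lines from the "Before" block above.
before = ["99999 Something County"] * 3

# Digits go into group 1; one or more spaces are skipped; everything up to
# the end of the line goes into group 2.
pattern = re.compile(r"^([0-9]+) +(.*)$")

after = [
    pattern.sub(r'replace countyname = "\2" if fips==\1', line)
    for line in before
]
for line in after:
    print(line)
```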

  7. Helpful computing advice along the lines of 1-6 above always reminds me of both of the stereotypical responses to asking for directions in Ireland: a) “I wouldn’t start from here” or b) “You can’t get there from here.” I think there is probably a moral here about the state of our computing platforms.

  8. Corey: your syntax worked except I had to take the ^ out, probably because the text to be changed is indented. Gabriel, the help file you sent is a LOT clearer than what is in the TextPad help files, although once I started understanding what this is, I could make sense of the TextPad files, which give its own particular syntax twists. I could not get your (Gabriel’s) syntax to work as written. After examining your syntax and Corey’s (including the fact that those are spaces, not tabs), I got the following to work:
    \([0-9]+\) \([a-z ]+\)\n
    replace countyname = "\2" if fips==\1\n

    Experimenting with what happened when I left out the + helped me to understand the syntax. (The + takes as many repetitions of the characters as are there.)

    \([0-9]+\) \([a-z ]+\)\n
    and
    \([0-9]+\) \([a-z A-Z]+\)\n
    were equivalent because I had match case turned off.

    The TextPad help file says to prefer $, which anchors the match to the end of the line, over the \n line feed, so revising Gabriel’s code would give this:
    \([0-9]+\) \([a-z A-Z]+$\)
    which contrasts to Corey’s
    \([0-9]+\) +\(.*$\)
    which (I’m now understanding) says in the second part that anything, including all blanks, matches the pattern, whereas Gabriel’s syntax (revised) requires that there be some actual letters in the string, but would not match punctuation.
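
    The difference between the two patterns can be checked with a short Python sketch. The patterns are rewritten in Python's notation (no backslashes before the parentheses), and the sample strings are hypothetical.

```python
import re

# Letters-and-spaces version (the revised pattern) and the match-anything version.
letters_only = re.compile(r"([0-9]+) ([a-z A-Z]+)$")
anything = re.compile(r"([0-9]+) +(.*)$")

samples = [
    "99999 Something County",      # plain letters and spaces
    "99998 St. Example's County",  # contains punctuation
    "99997 ",                      # code followed by a single trailing space
]

# For each sample, record whether each pattern finds a match.
results = {
    s: (bool(letters_only.search(s)), bool(anything.search(s)))
    for s in samples
}
for s, (strict, loose) in results.items():
    print(repr(s), "letters_only:", strict, "anything:", loose)

# Note: the letters_only class includes a space, so a code followed by two or
# more spaces would still match it even though there are no letters.
```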

    Thanks again to both of you! This is what I needed. Once you get the idea, it isn’t that hard, but the first step is confusing.

  9. Olderwoman, here is a website that I have found especially useful dealing with regular expressions–it has a nice tutorial that explains the logic behind most of the terms, symbols, etc.

    http://www.regular-expressions.info/

    Also, it seems like another way you could do this would be through reading the file line-by-line into Stata and then writing a .do file as an output. That would have the advantage that you could change things in the future without having to write down the search/replace code anywhere…
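
    The line-by-line idea could be sketched roughly as follows. This uses Python rather than Stata, purely as an illustration; the function name, the sample lines, and the pattern are assumptions, not anything from the codebook itself.

```python
import re

# Optional indentation, the FIPS digits, whitespace (spaces or a tab),
# then the county name with surrounding whitespace trimmed.
LINE = re.compile(r"^\s*([0-9]+)\s+(.*\S)\s*$")

def codebook_to_do(lines):
    """Return one Stata replace command per line that looks like a codebook entry."""
    commands = []
    for line in lines:
        m = LINE.match(line)
        if m:  # skip headers, blank lines, and anything else that does not fit
            fips, name = m.groups()
            commands.append(f'replace countyname = "{name}" if fips=={fips}')
    return commands

# Hypothetical input, standing in for lines read from the codebook file.
sample = [
    "99999  Something County\n",
    "   99998\tAnother County  \n",
    "not a codebook line\n",
]
do_lines = codebook_to_do(sample)
print("\n".join(do_lines))
```

    With a real file, the same function could be fed an open file object, and the resulting lines written out to a .do file.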
