Thursday, December 31, 2009

Craziest

Craziest - a dictionary for Scrabble and other word games.

It works on the premise that you have some assortment of letters, and maybe a blank or two (which can be used as any letter in the alphabet), and you want to find out what words you can make out of those letters. And that's what this script is for.

For instance, you have the letters: GZXATA and a blank tile, so you go:

craziest.py GZXATA?

(the question mark represents a single blank) and the script will print out a list of words and their value in points, e.g.:

14   ZAP
14 GAZE
13 ZETA
13 WAX
13 FAX
13 ADZ
12 ZIT
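The point values in that list are just the sums of the letter scores, with a blank counted as the letter it stands in for. Here's a minimal Python 3 sketch of that arithmetic, using the same letter values as the script's SCORES table. The helper name word_score is made up for illustration and is not part of the script:

```python
# Standard English Scrabble letter values, as in the script's SCORES table.
SCORES = {
    'A': 1, 'B': 3, 'C': 3, 'D': 2, 'E': 1, 'F': 4, 'G': 2, 'H': 4,
    'I': 1, 'J': 8, 'K': 5, 'L': 1, 'M': 3, 'N': 1, 'O': 1, 'P': 3,
    'Q': 10, 'R': 1, 'S': 1, 'T': 1, 'U': 1, 'V': 4, 'W': 4, 'X': 8,
    'Y': 4, 'Z': 10,
}

def word_score(word, scores=SCORES):
    """Sum the letter values of a word (hypothetical helper)."""
    return sum(scores[letter] for letter in word.upper())

# Print a few of the words from the example together with their scores.
for word in ("ZAP", "GAZE", "ZETA", "ZIT"):
    print(word_score(word), word)
```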

By default, the program uses the master aspell dictionary, but you can force it to use any list of words (the -D switch) or any command that produces it (the -d switch).

Also, it's possible to specify what length the word should be by manipulating the maximum and minimum number of letters (-M and -m switches, respectively) - the default values are 7 and 2.
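One way to picture how the length limits and the blanks interact: a word fits if its length is within the bounds and whatever letters it needs beyond those on the rack can be covered by blanks. The Python 3 sketch below illustrates that idea with letter counts. It is not how craziest.py does it (the script permutes the letters and intersects the result with the dictionary), and can_make with its parameters is a hypothetical name:

```python
from collections import Counter

def can_make(word, rack, blank='?', min_length=2, max_length=7):
    """Check whether word can be laid out from rack (illustrative helper)."""
    word, rack, blank = word.upper(), rack.upper(), blank.upper()
    if not min_length <= len(word) <= max_length:
        return False
    available = Counter(rack)
    blanks = available.pop(blank, 0)
    # Counter subtraction keeps only positive counts, so this is exactly
    # the set of letters the rack cannot supply by itself.
    missing = Counter(word) - available
    return sum(missing.values()) <= blanks

print(can_make("GAZE", "GZXATA?"))  # the blank stands in for the E
print(can_make("JAZZ", "GZXATA?"))  # would need two blanks
```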

You can also choose to provide your own alphabet, or letter scores, using an alphabet file (switch -a) where each line would contain a single letter and its score, separated by whitespace.
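To make the file format concrete, here's a small Python 3 sketch that parses such an alphabet from a string. The function name parse_alphabet and the sample values are invented for illustration; only the "letter, whitespace, score" layout comes from the description above:

```python
def parse_alphabet(text):
    """Parse 'LETTER SCORE' lines into a dict (illustrative helper)."""
    scores = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        letter, value = line.split(None, 1)
        scores[letter.upper()] = int(value)
    return scores

# A made-up three-letter alphabet, one letter and its score per line.
example = "a 1\nb 3\nz 10\n"
print(parse_alphabet(example))
```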

Oh, and as a rule, the entire script is case insensitive, except for when the dictionary is read in, where the words that start with a capital letter are filtered out.
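In other words, capitalized entries (typically proper nouns) are dropped while the dictionary is loaded, and everything else is normalized to upper case. A tiny Python 3 sketch of that rule, with load_words as a made-up name:

```python
def load_words(lines):
    """Keep words starting with a lowercase letter, upper-cased (illustrative)."""
    words = set()
    for line in lines:
        for word in line.split():
            if word[0].islower():
                words.add(word.upper())
    return words

# "Zeus" starts with a capital letter, so it is filtered out.
print(sorted(load_words(["zeta Zeus", "wax"])))
```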

Also, the name craziest is a reference to the triple-triple that is the highest achievable single word score in Scrabble (that I know of). It's all explained in this vidlit.

And finally, ladies and gentlemen, the code:
 
#!/usr/bin/python
#
# Craziest
#
# A script to find all legal Scrabble words that can be made using some list
# of letters and a dictionary of all legal words, and sort the output by the
# total score of each word.
#
# Parameters:
#   -m, --min-length=           Specify the minimum length of a word.
#   -M, --max-length=           Specify the maximum length of a word.
#   -d, --dictionary-command=   Provide a command that gives the list
#                               of correct words.
#   -D, --dictionary-file=      Provide a file with the list of correct words.
#   -a, --alphabet=             Specify available letters and their weights.
#   -n, --no-scores             Do not display word scores.
#   -b, --blank-symbol=         Define the symbol used to represent a blank.
#   -h, --usage                 Command usage information.
#
# Author:
#   Konrad Siek
#
# License:
#   Copyright 2009 Konrad Siek
#
#   This program is free software: you can redistribute it and/or modify
#   it under the terms of the GNU General Public License as published by
#   the Free Software Foundation, either version 3 of the License, or
#   (at your option) any later version.
#
#   This program is distributed in the hope that it will be useful,
#   but WITHOUT ANY WARRANTY; without even the implied warranty of
#   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
#   GNU General Public License for more details.
#
#   You should have received a copy of the GNU General Public License
#   along with this program. If not, see <http://www.gnu.org/licenses/>.

SCORES = {
    'A': 1, 'B': 3, 'C': 3, 'D': 2, 'E': 1, 'F': 4, 'G': 2, 'H': 4,
    'I': 1, 'J': 8, 'K': 5, 'L': 1, 'M': 3, 'N': 1, 'O': 1, 'P': 3,
    'Q': 10, 'R': 1, 'S': 1, 'T': 1, 'U': 1, 'V': 4, 'W': 4, 'X': 8,
    'Y': 4, 'Z': 10,
}

def usage(program):
    """ Print the usage information."""
    print """Usage: %s [OPTIONS] LETTERS [LETTERS ...]
    OPTIONS
        -m, --min-length=           Specify the minimum length of a word.
        -M, --max-length=           Specify the maximum length of a word.
        -d, --dictionary-command=   Provide a command that gives the list
                                    of correct words.
        -D, --dictionary-file=      Provide a file with the list of correct words.
        -a, --alphabet=             Specify available letters and their weights.
        -n, --no-scores             Do not display word scores.
        -b, --blank-symbol=         Define the symbol used to represent a blank.
        -h, --usage                 Command usage information.
    """ % program

class AspellDictionary:
    def __init__(self, command='aspell dump master', keep_capitals = False):
        """ Create an instance of the dictionary using a command."""
        import commands
        self.words = set([])
        for word in commands.getoutput(command).split():
            # Words starting with a capital letter are skipped by default.
            if keep_capitals or word[0].islower():
                self.words.add(word.upper())

class FileDictionary:
    def __init__(self, path, keep_capitals = False):
        """ Create a dictionary from a file."""
        self.words = set([])
        for line in file(path).readlines():
            for word in line.split():
                # Words starting with a capital letter are skipped by default.
                if keep_capitals or word[0].islower():
                    self.words.add(word.upper())

def read_scores(path):
    """ Read a list of available letters and their scores from a file."""
    score = {}
    for line in file(path).readlines():
        letter, value = line.split(None, 1)
        score[letter.upper()] = int(value)
    return score

def create_words(letters, dictionary, min_length=2, max_length=8):
    """ Create a list of all possible words that can be created from the given
    list of letters. All the words will be correct according to the provided
    dictionary.
    """
    from itertools import permutations
    possible_words = set([])
    for length in range(min_length, max_length + 1):
        for permutation in permutations(letters, length):
            possible_words.add("".join(permutation).upper())
    return dictionary.words.intersection(possible_words)

def weigh_word(word, scores):
    """ Add the score of each letter to produce the score of the word."""
    total = 0
    for letter in word:
        total += scores[letter]
    return total

def expand_blanks(word, blank_symbol = '?', letters = SCORES.keys()):
    """ Expand the symbol representing a blank into all possible letters."""
    if word.find(blank_symbol) < 0:
        return [word]
    # Use product rather than permutations, so that two blanks can stand
    # for the same letter.
    from itertools import product
    words = []
    template = word.replace(blank_symbol, "%s")
    for combination in product(letters, repeat=word.count(blank_symbol)):
        words.append(template % combination)
    return words

def generate(argument, dictionary, min_length, max_length, blank, scores):
    """ Generate a list of possible words."""
    words = set([])
    for expanded_set in expand_blanks(argument, blank, scores.keys()):
        word_list = create_words(expanded_set, dictionary, min_length, max_length)
        word_set = set(word_list)
        words = words.union(word_set)
    return words

def weigh(words, scores = SCORES):
    """ Add scores to the words in the list."""
    weighed_words = []
    for word in words:
        weighed_words.append((weigh_word(word, scores), word))
    # Highest scores first.
    weighed_words.sort(reverse=True)
    return weighed_words

def printout(weighed_words, show_weights = True):
    """ Print out the list of words."""
    if show_weights:
        for pair in weighed_words:
            print "%-4s\t%s" % pair
    else:
        for _, word in weighed_words:
            print word

if __name__ == '__main__':
    import sys
    from getopt import getopt

    # Default values for parameters. The default dictionary is created
    # after the options are read, so aspell is run only if it is needed.
    dictionary = None
    min_length, max_length = (2, 7)
    blank_symbol = '?'
    show_scores = True
    scores = SCORES

    # Definitions of switches.
    shorts = [ 'm:', 'M:', 'd:', 'D:', 'h', 'n', 'b:', 'a:']
    longs = [
        'min-length=', 'max-length=', 'dictionary-command=',
        'dictionary-file=', 'usage', 'no-scores', 'blank-symbol=',
        'alphabet='
    ]

    # Process user-supplied commandline parameters.
    opts, args = getopt(sys.argv[1:], ''.join(shorts), longs)
    for opt in opts:
        if opt[0] in ['-m', '--min-length']:
            min_length = int(opt[1])
        elif opt[0] in ['-M', '--max-length']:
            max_length = int(opt[1])
        elif opt[0] in ['-d', '--dictionary-command']:
            dictionary = AspellDictionary(opt[1])
        elif opt[0] in ['-D', '--dictionary-file']:
            dictionary = FileDictionary(opt[1])
        elif opt[0] in ['-a', '--alphabet']:
            scores = read_scores(opt[1])
        elif opt[0] in ['-b', '--blank-symbol']:
            blank_symbol = opt[1]
        elif opt[0] in ['-n', '--no-scores']:
            show_scores = False
        elif opt[0] in ['-h', '--usage']:
            usage(sys.argv[0])
            sys.exit(0)
        else:
            usage(sys.argv[0])
            sys.exit(1)

    # Fall back to the master aspell dictionary.
    if dictionary is None:
        dictionary = AspellDictionary()

    # Process each of the arguments.
    for argument in args:
        words = generate(
            argument, dictionary, min_length, max_length, blank_symbol, scores
        )
        weighed_words = weigh(words, scores)
        printout(weighed_words, show_scores)


The code is also available at GitHub as python/craziest.py.

Sunday, November 1, 2009

Crayon

Colorful text, right? What else could you possibly want?

When you're playing around with your shell you can typically use a bunch of control sequences to set colors and styles to your text, but these are hard to remember... at least, in my experience.

This script should provide a slightly easier way to use those control sequences. E.g. when you want to have red bold text on a white background, you'd normally go:

\033[31m\033[1m\033[47mSome text\033[m

With this script, you can just go:

./crayon.py -c red -s bold -b white "Some text"

You have to admit, it's more intelligible.
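To show how the names map onto the raw sequence, here's a short Python 3 sketch that builds the same string from the red/bold/white example. Foreground colors are 30 plus the color code and backgrounds are 40 plus the color code; the function name colorize and its keyword arguments are made up for this illustration:

```python
# Color and style codes, matching the tables used by the script.
COLORS = {'black': 0, 'red': 1, 'green': 2, 'yellow': 3,
          'blue': 4, 'magenta': 5, 'cyan': 6, 'white': 7}
STYLES = {'bold': 1, 'underline': 4, 'blink': 5, 'reverse': 7, 'concealed': 8}

def colorize(text, color=None, style=None, background=None):
    """Wrap text in control sequences and reset at the end (illustrative)."""
    parts = []
    if color is not None:
        parts.append('\033[%dm' % (30 + COLORS[color]))       # foreground
    if style is not None:
        parts.append('\033[%dm' % STYLES[style])              # style
    if background is not None:
        parts.append('\033[%dm' % (40 + COLORS[background]))  # background
    return ''.join(parts) + text + '\033[m'

# repr() shows the control characters instead of applying them.
print(repr(colorize("Some text", color='red', style='bold', background='white')))
```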

Enough banter; the code:
 
#!/usr/bin/python
#
# Crayon
#
# A simple script to color your text and wotnot. This basically uses
# \033-type sequences and color codes that may or may not work with your
# shell.
#
# Parameters:
#   -s STYLE | --style=STYLE        Specify a style (see list below).
#   -c COLOR | --color=COLOR        Specify a color (see list below).
#   -b COLOR | --background=COLOR   Specify a background color (see list below).
#   -k | --keep-style               Do not reset styles after echo-ing.
#   -d | --default                  Reset settings.
#   -n | --no-newline               Do not append a new line to the result.
#   -r | --raw                      Print control sequences without applying them.
#   -h | --help                     Print usage information (this).
#
# Author:
#   Konrad Siek <konrad.siek@gmail.com>
#
# License:
#   Copyright 2009 Konrad Siek
#
#   This program is free software: you can redistribute it and/or modify
#   it under the terms of the GNU General Public License as published by
#   the Free Software Foundation, either version 3 of the License, or
#   (at your option) any later version.
#
#   This program is distributed in the hope that it will be useful,
#   but WITHOUT ANY WARRANTY; without even the implied warranty of
#   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
#   GNU General Public License for more details.
#
#   You should have received a copy of the GNU General Public License
#   along with this program. If not, see <http://www.gnu.org/licenses/>.

import sys
from getopt import getopt

# Styles.
BOLD = 1
UNDERLINE = 4
BLINK = 5
REVERSE = 7
CONCEALED = 8

# Colors.
BLACK = 0
RED = 1
GREEN = 2
YELLOW = 3
BLUE = 4
MAGENTA = 5
CYAN = 6
WHITE = 7

# Color string to code map.
COLORS = {
    'black': BLACK,
    'red': RED,
    'green': GREEN,
    'yellow': YELLOW,
    'blue': BLUE,
    'magenta': MAGENTA,
    'cyan': CYAN,
    'white': WHITE,
}

# Style string to code map.
STYLES = {
    'bold': BOLD,
    'underline': UNDERLINE,
    'blink': BLINK,
    'reverse': REVERSE,
    'concealed': CONCEALED,
}

def escape(string):
    r"""Change the \033 character to its string representation."""
    return string.replace('\033', '\\033')

def command(c):
    """Add the command character sequence to the specified code."""
    return '\033[%sm' % c

def style(s):
    """Apply the given code as a style command sequence."""
    return command(s)

def fg(c):
    """Apply the given code as a foreground color command sequence."""
    return command(30 + c)

def bg(c):
    """Apply the given code as a background color command sequence."""
    return command(40 + c)

def default():
    """Apply an empty command sequence - reset setting to default."""
    return command('')

def echo(string, out = sys.stdout):
    """Print out the string to standard output or some other stream."""
    out.write(string)

def color_of_str(s):
    """Translate the string to a color code."""
    if s.isdigit() and int(s) in COLORS.values():
        return int(s)
    if s.lower() in COLORS:
        return COLORS[s.lower()]
    raise Exception("Unrecognized color: %s" % s)

def style_of_str(s):
    """Translate the string to a style code."""
    if s.isdigit() and int(s) in STYLES.values():
        return int(s)
    if s.lower() in STYLES:
        return STYLES[s.lower()]
    raise Exception("Unrecognized style: %s" % s)

def usage(command_name):
    """Print script usage information."""
    print 'Usage: %s [OPTIONS] TEXT' % command_name
    print 'Options:'
    print "\t-s STYLE | --style=STYLE\tSpecify a style (see list below)."
    print "\t-c COLOR | --color=COLOR\tSpecify a color (see list below)."
    print "\t-b COLOR | --background=COLOR\tSpecify a background color (see list below)."
    print "\t-k | --keep-style\t\tDo not reset styles after echo-ing."
    print "\t-d | --default\t\t\tReset settings."
    print "\t-n | --no-newline\t\tDo not append a new line to the result."
    print "\t-r | --raw\t\t\tPrint control sequences without applying them."
    print "\t-h | --help\t\t\tPrint usage information (this)."
    print "Styles (use either names or codes):"
    for item in STYLES.items():
        print "\t%s - %s\t" % item
    print "Colors (use either names or codes):"
    for item in COLORS.items():
        print "\t%s - %s\t" % item

if __name__ == '__main__':
    shorts = ['s:', 'c:', 'b:', 'r', 'd', 'h', 'k', 'n']
    longs = [
        'style=', 'color=', 'background=', 'keep-style',
        'default', 'help', 'raw', 'no-newline'
    ]

    opts, args = getopt(sys.argv[1:], ''.join(shorts), longs)

    # By default nothing is prepended, and styles are reset at the end.
    prefix = ''
    postfix = default()
    raw = False
    newline = True

    for opt in opts:
        if opt[0] in ['-c', '--color']:
            prefix += fg(color_of_str(opt[1]))
        elif opt[0] in ['-b', '--background']:
            prefix += bg(color_of_str(opt[1]))
        elif opt[0] in ['-s', '--style']:
            prefix += style(style_of_str(opt[1]))
        elif opt[0] in ['-k', '--keep-style']:
            postfix = ''
        elif opt[0] in ['-r', '--raw']:
            raw = True
        elif opt[0] in ['-d', '--default']:
            prefix = default()
            postfix = ''
        elif opt[0] in ['-n', '--no-newline']:
            newline = False
        elif opt[0] in ['-h', '--help']:
            usage(sys.argv[0])
            sys.exit(0)
        else:
            usage(sys.argv[0])
            sys.exit(1)

    string = prefix + ' '.join(args) + postfix
    if raw:
        string = escape(string)
    if newline:
        string += "\n"
    echo(string)


The code is also available at GitHub as python/crayon.py.

Friday, September 18, 2009

Concordancer

Concordancers are a sort of pet project for me - I'm often in the process of making one. They're simple enough to be fun, and complex enough to be interesting.

An additional perk is that nobody knows what concordancers are. If you want to know about them, you probably want to start here and dig on.

This particular instance is meant to be very simple, and passes any pre- and post-processing straight onto the user. The text is split into words just by whitespaces, for instance, so all the punctuation sticks to words, and distorts the result, but if you want that fixed, you have to do it before passing it on to the concordancer.
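As an example of the kind of pre-processing you might do yourself, here's a small Python 3 sketch that strips punctuation from the edges of each word before the text goes anywhere near the concordancer. The function name strip_punctuation is hypothetical, and punctuation inside a word is deliberately left alone:

```python
import string

def strip_punctuation(text):
    """Strip leading and trailing punctuation from each word (illustrative)."""
    words = (word.strip(string.punctuation) for word in text.split())
    # Words that consisted of punctuation only are dropped.
    return ' '.join(word for word in words if word)

print(strip_punctuation('perish either by the sword or by famine." XXXI.--They rise'))
```

Note that "XXXI.--They" stays glued together, because only the edges of each whitespace-separated word are cleaned.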

Ok, so here's a quick walkthrough.

Basically, the concordancer reads text from stdin and the keyword list from the arguments. Here's an example of searching for the word Sword in the text file de-bello-gallico.txt:

cat de-bello-gallico.txt | ./concordancer.py Sword

That should output something like:

then much reduced by the *sword* of the enemy, and by
rest, perish either by the *sword* or by famine." XXXI.--They rise
rushes on briskly with his *sword* and carries on the combat
Therefore, having put to the *sword* the garrison of Noviodunum and
had advanced up the hill *sword* in hand, and had forced
labour, should put to the *sword* all the grown-up inhabitants, as
made a blow with his *sword* at his naked shoulder and
by the wound of a *sword* in the mouth; nor was
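The core of it is just slicing a window of words around every match, up to 5 on each side by default, with matching done case-insensitively. A condensed Python 3 sketch of that step, with concordances as an illustrative name and a made-up scrap of text:

```python
def concordances(words, keyword, context_size=5):
    """Yield the window of words around each case-insensitive match."""
    for i, word in enumerate(words):
        if word.lower() == keyword.lower():
            # Clamp the start, so matches near the beginning still work.
            start = max(0, i - context_size)
            yield words[start:i + context_size + 1]

text = "he rushes on briskly with his sword and carries on the combat hand to hand"
for window in concordances(text.split(), "Sword"):
    print(' '.join(window))
```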

Actually, an even simpler invocation is available, if you want to create concordances for all the words in the text - in that case, you needn't provide a list of keywords, and go:

cat de-bello-gallico.txt | ./concordancer.py

... but I'm not sure how useful that'll be to you.

Typically, you'd probably want to find concordances for a word in all its forms. You can do that using aspell to generate the list of keywords from a given root:

cat de-bello-gallico.txt | ./concordancer.py `aspell dump master | grep sword`

And that'll produce output for all words containing the substring sword.
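Bear in mind that substring matching is a blunt instrument. Here's a Python 3 sketch of what the grep step does to a word list; the function name expand_keyword and the word list are made up for illustration:

```python
def expand_keyword(root, word_list):
    """Return every word containing root as a substring, like a plain grep."""
    return [word for word in word_list if root in word]

# A made-up sample of dictionary words.
sample = ["sword", "swords", "swordsman", "password", "word"]
print(expand_keyword("sword", sample))
```

The list that comes back includes "password", which has nothing to do with swords, so it may be worth eyeballing the keyword list before using it.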

Now, there are probably better ways to use aspell for that purpose, but honestly, I played around with it and this is the only one that got me the result I wanted...

You can play around with different formats too, by just converting them to text prior to creating the concordance. For instance, for PDFs, use pdftotext:

pdftotext de-bello-gallico.pdf - | ./concordancer.py `aspell dump master | grep sword`

Right. You can also play around with the output of the concordancer. By default it marks the keywords in concordances with asterisks, but you can change that, to e.g. some HTML tags, by going:

cat de-bello-gallico.txt | ./concordancer.py -p '<b>' -s '</b>' Sword

And that'll produce something like:

then much reduced by the <b>sword</b> of the enemy, and by
rest, perish either by the <b>sword</b> or by famine." XXXI.--They rise
...
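The marking itself is nothing more than gluing a prefix and a suffix onto every word in the window that matches the keyword. A Python 3 sketch of that step, where mark is a hypothetical helper name:

```python
def mark(window, keyword, prefix='*', suffix='*'):
    """Join a window of words, wrapping case-insensitive keyword matches."""
    return ' '.join(
        prefix + word + suffix if word.lower() == keyword.lower() else word
        for word in window
    )

window = "much reduced by the sword of the enemy".split()
print(mark(window, "Sword"))                 # default asterisks
print(mark(window, "Sword", '<b>', '</b>'))  # HTML tags
```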

Another thing you can do is change the size of the context, here to up to 3 words on each side:

cat de-bello-gallico.txt | ./concordancer.py -c 3 Sword 

That will output:

reduced by the *sword* of the enemy,
either by the *sword* or by famine."
briskly with his *sword* and carries on
put to the *sword* the garrison of
up the hill *sword* in hand, and
put to the *sword* all the grown-up
blow with his *sword* at his naked
wound of a *sword* in the mouth;

Also, you can group the output by keywords:

cat de-bello-gallico.txt | ./concordancer.py -d group reserves declares

And that gives you something like this:

*reserves*:
    neither could proper *reserves* be posted, nor
*declares*:
    suddenly assaulted; he *declares* himself ready to
    that council he *declares* Cingetorix, the leader
    Hispania Baetica, _Carmone_; *declares* for Caesar, and
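Under the hood, grouping amounts to keeping a map from each keyword to the list of windows found for it. Here's a Python 3 sketch of that data structure; group_concordances is an illustrative name, and the single-pass lookup is my own shortcut rather than the way the script loops:

```python
from collections import defaultdict

def group_concordances(words, keywords, context_size=5):
    """Map each keyword to the list of windows in which it occurs."""
    groups = defaultdict(list)
    # Look keywords up case-insensitively, but keep their original spelling.
    lowered = {keyword.lower(): keyword for keyword in keywords}
    for i, word in enumerate(words):
        keyword = lowered.get(word.lower())
        if keyword is not None:
            start = max(0, i - context_size)
            groups[keyword].append(words[start:i + context_size + 1])
    return groups

words = "he declares himself ready and then declares for Caesar".split()
for keyword, windows in group_concordances(words, ["declares"], 2).items():
    print(keyword + ":")
    for window in windows:
        print("\t" + ' '.join(window))
```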

Enough rambling, here's the code:
 
#!/usr/bin/python
#
# Concordancer
#
# A script for finding concordances for given keywords in the
# specified text.
#
# A concordance is a keyword with its context (here, the closest
# n words), a combination used, for instance, in lexicography to
# deduce the meaning of the keyword based on the way it is used
# in text.
#
# Parameters:
#   -c  the number of words that surround a keyword in context
#   -p  the string that is stuck in front of keywords
#   -s  the string that is stuck at the ends of keywords
#   -d  formatting of the display,
#           'simple' - one concordance per line (default)
#           'group' - group concordances by keywords
#
# Example:
#   to find concordances for the words 'arguments' and 'options'
#   in the bash manual:
#       man bash | concordancer.py arguments options
#
# Author:
#   Konrad Siek <konrad.siek@gmail.com>
#
# License:
#
#   Copyright 2009 Konrad Siek
#
#   This program is free software: you can redistribute it and/or modify
#   it under the terms of the GNU General Public License as published by
#   the Free Software Foundation, either version 3 of the License, or
#   (at your option) any later version.
#
#   This program is distributed in the hope that it will be useful,
#   but WITHOUT ANY WARRANTY; without even the implied warranty of
#   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
#   GNU General Public License for more details.
#
#   You should have received a copy of the GNU General Public License
#   along with this program. If not, see <http://www.gnu.org/licenses/>.

# Imports.
import getopt
import sys

# Option sigils - the characters associated with various options.
CONTEXT_SIZE = 'c'
PREFIX = 'p'
SUFFIX = 's'
DISPLAY = 'd'

# Option default values, represented as a map for convenience.
OPTIONS = {
    CONTEXT_SIZE: str(5),
    PREFIX: '*',
    SUFFIX: '*',
    DISPLAY: 'simple',
}

# Character constants, also for convenience.
EMPTY = ""
SPACE = " "
NEWLINE = "\n"
TAB = "\t"
COLON = ":"
SWITCH = "-"

def display_help(program_name):
    """ Display usage information.

    @param program_name - the name of the script"""

    help_string = \
"""Usage:
    %s [OPTION] ... [WORD] ...
Options:
    -%s  the number of words that surround a keyword in context
    -%s  the string that is stuck in front of keywords
    -%s  the string that is stuck at the ends of keywords
    -%s  formatting of the display,
            'simple' - one concordance per line (default)
            'group' - group concordances by keywords
Words:
    The list of words that concordances will be searched for. If
    no list is provided, a complete concordance is made - that is,
    one using all input words.""" \
    % (program_name, CONTEXT_SIZE, PREFIX, SUFFIX, DISPLAY)
    print(help_string)

def find_concordances(keywords, words, context_size):
    """ Finds concordances for keywords in a list of input words.

    @param keywords - list of keywords,
    @param words - input text as a list of words
    @param context_size - number of words that should surround a keyword
    @return map of keywords to lists of concordances"""

    # Initialize the concordance map with empty lists, for each keyword.
    concordances = prep_concordance_map(keywords)

    # If any word in the text matches a keyword, create a concordance.
    for i in range(0, len(words)):
        for keyword in keywords:
            if matches(keyword, words[i]):
                concordance = form_concordance(words, i, context_size)
                concordances[keyword].append(concordance)

    return concordances

def find_all_concordances(words, context_size):
    """ Make a complete concordance - assume all words match.

    @param words - input text as a list of words
    @param context_size - number of words that should surround a keyword
    @return map of keywords to lists of concordances"""

    concordances = {}

    for i in range(0, len(words)):
        word = words[i]
        if word not in concordances:
            concordances[word] = []
        concordance = form_concordance(words, i, context_size)
        concordances[word].append(concordance)

    return concordances

def print_concordances(concordances, simple, prefix, suffix):
    """ Print the concordances to screen.

    @param concordances - map of keywords to concordances to display
    @param simple - True: display only concordances, False: group by keywords
    @param prefix - prefix to keywords
    @param suffix - suffix to keywords"""

    # For each concordance, mark the keywords in the sentence and print it out.
    for keyword in concordances:
        if not simple:
            sys.stdout.write(prefix + keyword + suffix + COLON + NEWLINE)
        for words in concordances[keyword]:
            if not simple:
                sys.stdout.write(TAB)
            for i in range(0, len(words)):
                if matches(keyword, words[i]):
                    sys.stdout.write(prefix + words[i] + suffix)
                else:
                    sys.stdout.write(words[i])
                if i < len(words) - 1:
                    sys.stdout.write(SPACE)
                else:
                    sys.stdout.write(NEWLINE)

def prep_concordance_map(dict_words):
    """ Prepare a map with keywords as keys and empty lists as values.

    @param dict_words - list of keywords"""

    # Put an empty list value for each keyword as key.
    concordances = {}
    for word in dict_words:
        concordances[word] = []

    return concordances

def matches(word_a, word_b):
    """ Case insensitive string equivalence.

    @param word_a - first string
    @param word_b - second string (duh)
    @return True or False"""

    return word_a.lower() == word_b.lower()

def form_concordance(words, occurrence, context_size):
    """ Creates a concordance.

    @param words - list of all input words
    @param occurrence - index of keyword in input list
    @param context_size - number of preceding and following words
    @return a sublist of the input words"""

    # Do not reach back past the beginning of the text.
    start = occurrence - context_size
    if start < 0:
        start = 0

    return words[start : occurrence + context_size + 1]

def read_stdin():
    """ Read everything from standard input as a list.

    @return list of strings"""

    words = []
    for line in sys.stdin:
        # Add all the words on the line to the list.
        words.extend(line.split())

    return words

def read_option(key, options, default):
    """ Get an option from a list of pairs, or use a default.

    @param key - option key
    @param options - list of (option, value) pairs, as returned by getopt
    @param default - default value, used if the list does not contain that key
    @return value from the list or default"""

    for option, value in options:
        if option == SWITCH + key:
            return value

    return default

def get_configuration(arguments):
    """ Retrieve the entire configuration of the script.

    @param arguments - script runtime parameters
    @return map of options with defaults included
    @return list of arguments (keywords)
    @return list of words from standard input"""

    # All possible option sigils are concatenated into an option string.
    option_string = EMPTY.join([("%s" + COLON) % i for i in OPTIONS.keys()])
    # Read all the options.
    options, arguments = getopt.getopt(arguments, option_string)

    # Apply default values if no values were set.
    fixed_options = {}
    for key in OPTIONS.keys():
        fixed_options[key] = read_option(key, options, OPTIONS[key])

    # Read the list of words at standard input.
    input_words = read_stdin()

    return (fixed_options, arguments, input_words)

def process(options, arguments, input_words):
    """ The main function.

    @param options - map of options with defaults included
    @param arguments - list of arguments (keywords)
    @param input_words - list of words from standard input"""

    # Extract some key option values.
    context_size = int(options[CONTEXT_SIZE])
    simple = options[DISPLAY] == OPTIONS[DISPLAY]

    # Conduct main processing - find the concordances.
    concordances = {}
    if arguments == []:
        # If no arguments are specified, construct a concordance for all
        # possible keywords.
        concordances = find_all_concordances(input_words, context_size)
    else:
        # And if there are, make a concordance for only those words.
        concordances = find_concordances(arguments, input_words, context_size)

    # Display the results.
    print_concordances(concordances, simple, options[PREFIX], options[SUFFIX])

# The processing starts here.
if __name__ == '__main__':
    # Read all user-supplied information.
    options, arguments, input_words = get_configuration(sys.argv[1:])

    # The configuration is not full - display usage information.
    if arguments == [] and input_words == []:
        display_help(sys.argv[0])
        sys.exit(1)

    # If everything is in order, start concordancing.
    process(options, arguments, input_words)


The code is also available at GitHub as python/concordancer.py.

Zentube

Another variation on the theme of zenity. I honestly like the way you can make simple front-ends. In addition, I'm doing something with youtube again, or more precisely, I'm doing stuff with youtube-dl.

So, the problem with youtube is that if you don't have Internet access, you obviously can't really use it, and there are certain instances where it'd come in handy. One such instance is when you're doing language teaching in an Internet-bereft classroom.

So there's youtube-dl to get some videos downloaded, but a person is not always in the mood for fiddling with the shell when preparing their lesson material.

Hence, this script provides the simplest of interfaces to download videos via youtube-dl. That's pretty much it. Anyway, I think it's simple and does its job.

Oh, yeah, I played around with the idea of automatically installing a package if it is not available at the time of execution. It's a sort of experiment, to see if it can be done at all. I'm not sure how effective this is though. And it depends on apt and gksudo.
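The check behind that experiment is simply "can this command be found on the PATH?". For comparison with the other scripts here, this is roughly what the same test looks like as a Python 3 sketch using the standard shutil.which; the function name is_installed is made up, and the actual installation step is left out:

```python
import shutil

def is_installed(command):
    """Return True if command can be found on the PATH."""
    return shutil.which(command) is not None

if not is_installed("youtube-dl"):
    print("Can't find youtube-dl, it would have to be installed first.")
```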

The code:
 
#!/bin/bash
#
# Zentube
#
# A simple GUI front-end to youtube-dl. All you need to do is run it,
# and put in the address of the video, and the back-end tries to
# download the video.
#
# Parameters:
#   None
#
# Requires:
#   youtube-dl
#   zenity
#   gksudo & apt (if you want youtube-dl installed automatically)
#
# Author:
#   Konrad Siek <konrad.siek@gmail.com>
#
# License:
#
#   Copyright 2008 Konrad Siek.
#
#   This program is free software: you can redistribute it and/
#   or modify it under the terms of the GNU General Public
#   License as published by the Free Software Foundation, either
#   version 3 of the License, or (at your option) any later
#   version.
#
#   This program is distributed in the hope that it will be
#   useful, but WITHOUT ANY WARRANTY; without even the implied
#   warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
#   PURPOSE. See the GNU General Public License for more
#   details.
#
#   You should have received a copy of the GNU General Public
#   License along with this program. If not, see
#   <http://www.gnu.org/licenses/>.

# The downloader backend.
PACKAGE=youtube-dl

# Output information.
OUTPUT_DIR=~/Videos/
EXTENSION=.flv
TEMP_FILE="/tmp/$(basename "$0").XXXXXXXXXX"

# The quality of the output file can be adjusted here, or you can comment
# out this setting altogether, to get the default.
QUALITY=--best-quality

# Exit code constants.
SUCCESS=0
INSTALLATION_ABORTED=1
INSTALLATION_FAILED=2
INVALID_VIDEO_ADDRESS=4
INVALID_OUTPUT_DIRECTORY=8
BACKEND_ERROR=16

# This is a convenience installer for apt-using distros, e.g. Ubuntu.
if [ -z "$(which $PACKAGE)" ]
then
    # Ask whether to attempt automatic install of the missing package.
    # If the answer is no, then quit with an error.
    zenity --question \
        --title="Automatic installation" \
        --text="Can't find <b>$PACKAGE</b>. Should I try installing it?" \
        || exit $INSTALLATION_ABORTED

    # Try installing the missing package, or quit with an error if the
    # attempt fails.
    gksudo "apt-get install $PACKAGE" || exit $INSTALLATION_FAILED
fi

# Ask user for the URL of the video.
url=$(\
    zenity --entry \
        --title="Video address" \
        --text="What is the address of the video?" \
)
# If no URL is given, then quit.
[ -z "$url" ] && exit $INVALID_VIDEO_ADDRESS

# Move to the output directory, create it if necessary.
mkdir -p "$OUTPUT_DIR" || exit $INVALID_OUTPUT_DIRECTORY
cd "$OUTPUT_DIR" || exit $INVALID_OUTPUT_DIRECTORY

# Make a temporary file to collect error messages from the downloader.
temp_file=$(mktemp "$TEMP_FILE")

# Run the downloader.
$PACKAGE $QUALITY --title "$url" 2>"$temp_file" | \
    zenity --progress --pulsate --auto-kill --auto-close --text="Downloading..."

# Check for errors, and display a success or error dialog at the end.
errors=$(cat "$temp_file")

if [ -z "$errors" ]
then
    # Display successful info.
    zenity --info --text="Download successful!"

    # Remove temporary file.
    unlink "$temp_file"

    # Exit successfully.
    exit $SUCCESS
else
    # Display error dialog.
    zenity --error --text="$errors"

    # Remove temporary file.
    unlink "$temp_file"

    # Exit with an error code.
    exit $BACKEND_ERROR
fi


The code is also available at GitHub as bash/zentube.

Wednesday, September 16, 2009

Read selection

And onto further adventures!

After making that zenspeak script, I got told that it'd be more useful if you could enter a whole lot of text into it. So then it dawned on me that maybe having a Gedit plugin like that could come in handy.

So, if you've got the External Tools plug-in installed in Gedit, you can put this script in there somewhere, tweak it a bit to use your favorite speech generator, voice, etc., and you're all set to never read a single word again.

One drawback: once it's running, there's no easy way to stop it, so making it read a lot of text might be troublesome.
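
One way around that drawback might be to remember the process ID of the speech command and kill it from a second external tool. The sketch below is only that, a sketch: the PID file location and the `start_reading` and `stop_reading` helpers are my assumptions and are not part of the script in this post.

```shell
#!/bin/bash
# Hypothetical location of the PID file; any writable path would do.
PID_FILE="${TMPDIR:-/tmp}/read_selection.pid"

# Run the given command in the background and remember its process ID.
start_reading() {
    "$@" &
    echo $! > "$PID_FILE"
}

# Stop the remembered process (if any) and forget about it.
stop_reading() {
    [ -f "$PID_FILE" ] || return 1
    kill "$(cat "$PID_FILE")" 2>/dev/null
    rm -f "$PID_FILE"
}
```

The reading tool would then call something like `start_reading padsp espeak "$text" -v en+f3`, and a second tool would just call `stop_reading`.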

I suppose I don't have it in me to really write stuff here today.

The code:
 
#!/bin/bash
# Read the whole selection from standard input as a single string.
text=$(xargs -0 echo)

# Speech generator to use: espeak or festival.
SYSTEM=espeak

if [ -n "$text" ]
then
    echo "Reading \"$text\" with $SYSTEM."

    if [ "$SYSTEM" == espeak ]
    then
        # padsp routes the sound from espeak through PulseAudio.
        padsp espeak "$text" -v en+f3
    elif [ "$SYSTEM" == festival ]
    then
        echo "$text" | festival --tts
    fi
fi


The code is also available at GitHub as gedit/gedit_read_selection.

Zenspeak

You can give this one to children. It makes them noisier.

This one's just a simple interface to either espeak or festival: it asks you what to say via zenity and then says it. It doesn't need any arguments (espeak is the default, and you can pass festival as the first argument to switch), so you start it up with a simple:

./zenspeak

In summary, it's not exactly dragon magic.

The code:
 
#!/bin/bash
#
# Zenspeak
#
# Provides a simple graphical (Gtk) interface to a speech production system:
# either espeak or festival. It's really simple too: you put in some text,
# the text is spoken. When you put in zero text, the program ends.
#
# Parameters:
#   [system]    Optional: espeak (the default) or festival.
#
# Depends:
#   espeak      (apt:espeak)
#   festival    (apt:festival)
#   zenity      (apt:zenity)
#
# Author:
#   Konrad Siek <konrad.siek@gmail.com>
#
# License information:
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
# Copyright 2009 Konrad Siek

# System for production of sound is selected by the parameter,
# or the default is used if none was specified.
SYSTEM_DEFAULT=espeak
SYSTEM=${1:-$SYSTEM_DEFAULT}
echo "$SYSTEM"

# System dependent settings for espeak:
espeak_speed=120        # default: 160
espeak_pitch=60         # 0-99, default: 50
espeak_amplitude=20     # 0-20, default: 10
espeak_voice=english    # list of voices: `espeak --voices`
espeak_variant=f2       # m{1,6}, f{1,4}

# I'm completely pants when it comes to setting up festival, so I won't
# even attempt it here.

while true
do
    # Show dialog and get user input.
    something=$(zenity --entry --title="Say something..." --text="Say:")

    # If no user input or cancel: bugger off (and indicate correct result).
    if [ -z "$something" ]
    then
        exit 0
    fi

    # Put the input through either espeak or festival.
    if [ "$SYSTEM" == "espeak" ]
    then
        # Note: the sound is padded within pulse, so that it can be
        # played simultaneously with other sources.
        padsp espeak \
            -a "$espeak_amplitude" \
            -p "$espeak_pitch" \
            -s "$espeak_speed" \
            -v "$espeak_voice+$espeak_variant" \
            "$something"
    elif [ "$SYSTEM" == "festival" ]
    then
        # Incidentally, that's all I know about festival.
        echo "$something" | festival --tts
    fi
done


The code is also available at GitHub as bash/zenspeak.
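
One thing the script doesn't do is check the system name it gets: hand it anything other than espeak or festival and it'll keep asking what to say without ever saying it. Below is a rough sketch of how the selection could be validated. `pick_system` is a hypothetical helper of mine, not something in the script.

```shell
#!/bin/bash
SYSTEM_DEFAULT=espeak

# Print the speech system to use: the first argument, or the default if
# none was given. Fails for anything other than espeak or festival.
# (pick_system is a hypothetical helper, not part of zenspeak.)
pick_system() {
    local system="${1:-$SYSTEM_DEFAULT}"
    case "$system" in
        espeak|festival)
            echo "$system"
            ;;
        *)
            echo "Unknown speech system: $system" >&2
            return 1
            ;;
    esac
}
```

In the script this could be used as `SYSTEM=$(pick_system "$1") || exit 1`, so that a typo in the argument ends the program right away instead of leaving you with a silent dialog.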