
The visual HTML and PNG Skimpy generators translate words or phrases into strings containing HTML preformatted text or PNG image files respectively. The preformatted text contains "ASCII art" representing the input phrase, which looks like somewhat sloppy handwriting on a speckled page, when viewed in an HTML browser. The PNG image contains a "pixelized" version of the word or phrase with added noise.
The intent is that a human can easily read the word or phrase when rendered as HTML, while a program has difficulty extracting it automatically. The program uses a number of techniques, including the addition of noise, to make the output difficult for a computer to interpret. The preformatted text version is probably more secure than the PNG version, but, not surprisingly, it is also harder to read.
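As a rough illustration of the noise idea, the sketch below flips a fraction of the cells in a small character bitmap. It is not Skimpy's actual algorithm: the function name `add_speckle`, the hand-drawn glyph, and the reading of the ratio as a per-cell flip probability are all assumptions made for the example.

```python
import random

def add_speckle(rows, ratio, rng=None, ink="#", blank=" "):
    """Return a copy of rows with each cell flipped with probability ratio.

    Hypothetical helper: Skimpy's real noise model may differ.
    """
    rng = rng or random.Random()
    noisy = []
    for row in rows:
        out = []
        for cell in row:
            if rng.random() < ratio:
                # flip ink to blank, or blank to ink
                out.append(blank if cell == ink else ink)
            else:
                out.append(cell)
        noisy.append("".join(out))
    return noisy

# A hand-drawn 5x5 letter "T" standing in for a rendered glyph.
GLYPH_T = [
    "#####",
    "  #  ",
    "  #  ",
    "  #  ",
    "  #  ",
]

if __name__ == "__main__":
    # Print the glyph with roughly one cell in ten flipped.
    for line in add_speckle(GLYPH_T, 0.1, random.Random(0)):
        print(line)
```

With a small ratio the letter stays legible to a person while the exact pixel pattern changes from run to run, which is the property the generators rely on.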
To allow CAPTCHA tests that are usable by people with visual impairment, Skimpy also provides an audio implementation. The audio WAVE Skimpy program uses a compiled audio sample file, waveIndex.zip. The input to the program is a word or phrase, and the output is that word or phrase spelled out as individual spoken characters in an audio stream.
It is intended that a human can understand the audio stream but a computer program will not be able to analyse the stream and extract the letters. To make the audio stream more difficult to automatically analyse (without making it unintelligible) the program randomly overlaps and stretches/shrinks the input samples, among other things.
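The stretching and overlapping described above can be sketched on plain lists of numbers. This is only an illustration under stated assumptions: the function names, the linear resampling, and the ranges used for the random stretch factor and overlap are hypothetical and are not taken from the Skimpy source.

```python
import random

def stretch(samples, factor):
    """Resample a non-empty list of numbers to about len(samples) * factor
    points, using linear interpolation between neighbours."""
    n_out = max(2, int(round(len(samples) * factor)))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * last / float(n_out - 1)   # position in the input
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out

def overlap_join(clips, overlap):
    """Concatenate clips, summing the last `overlap` samples of the
    stream so far with the first `overlap` samples of the next clip."""
    stream = []
    for clip in clips:
        k = min(overlap, len(stream), len(clip))
        for j in range(k):
            stream[len(stream) - k + j] += clip[j]
        stream.extend(clip[k:])
    return stream

def scramble(clips, rng, max_overlap=3):
    """Randomly stretch or shrink each clip, then join with a random overlap."""
    stretched = [stretch(c, rng.uniform(0.8, 1.25)) for c in clips]
    return overlap_join(stretched, rng.randint(0, max_overlap))
```

For example, `scramble([[0.0, 1.0, 0.0], [0.0, -1.0, 0.0]], random.Random())` returns one combined list whose length and shape vary from call to call.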
The Skimpy tools are far easier to install, use, and embed than other similar technologies.
Also included with the package is a PNG canvas implementation which allows easy programmatic creation of graphical PNG images. This interface allows for the construction of PNG images from textual fonts or geometric shapes, with transparent or non-transparent backgrounds. The canvas can also generate JavaScript data structures which allow HTML pages to respond intelligently to mouse events over the image.
Below are examples of a bar chart and a pie chart generated by the Skimpy canvas. Mouse over the images to see responses to mouse events (on supported browsers with JavaScript enabled).
Install the package using the setup.py install script in the root directory:

    prompt> setup.py install
    skimpy.py --pre myAddress@example.com --filename pre/address.html
    skimpy.py --wave myAddress@example.com --filename wave/address.wav --indexfile ../wave/waveIndex.zip
    skimpy.py --png myAddress@example.com --filename png/address.png

These commands create an HTML page, a Wave audio file, and a PNG image containing the Skimpy encodings for myAddress@example.com. The HTML may be included in web pages, like this one, and the PNG or Wave may be linked from the web page as I've done here:

pre/address.html preformatted text
png/address.png PNG image
wave/address.wav audio encoding
With any luck, human readers of the web page will be able to read the HTML or image, or hear and understand the audio, and use the email address, but evil spammer programs will fail to recognize and understand the address.
Web applications also frequently use CAPTCHA tools like Skimpy to verify that the agent interacting with a web application is a human and not an automated program or "robot". To use Skimpy to foil robots, include a Skimpy-encoded text (obtained either by using the Python API or by capturing the output from the Skimpy command line program) near a web form and ask the "user" to type the text into the form. When the user submits the form, verify that the text entered by the user matches the text used to generate the Skimpy preformatted text.
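The server-side bookkeeping for such a challenge might look like the following minimal sketch. It does not call Skimpy and is not the logic of any script in the distribution: the names `new_challenge` and `check_response`, the in-memory dictionary, and the choice to ignore case and surrounding whitespace are assumptions made for illustration.

```python
import hmac
import random
import string

# In-memory store mapping a challenge id to the expected text.
# A real web application would keep this in its session store instead.
_PENDING = {}

def new_challenge(rng=None, length=6):
    """Pick a random word and remember it under a fresh challenge id.

    Returns (challenge_id, word). The word is what would be handed to
    Skimpy for encoding; only the id should travel with the form.
    """
    rng = rng or random.SystemRandom()
    word = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
    challenge_id = "%032x" % rng.getrandbits(128)
    _PENDING[challenge_id] = word
    return challenge_id, word

def check_response(challenge_id, typed):
    """Return True if typed matches the stored word.

    Each challenge is single use: it is removed from the store whether
    or not the guess is right.
    """
    expected = _PENDING.pop(challenge_id, None)
    if expected is None:
        return False
    guess = typed.strip().lower()
    return hmac.compare_digest(expected.encode("utf-8"), guess.encode("utf-8"))
```

Keeping the expected text on the server and sending only an opaque id with the form avoids exposing the answer in the page source, and discarding a challenge after one attempt stops a robot from guessing repeatedly against the same encoding.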
A Python CGI script which implements a challenge like this is included in the distribution: guess.cgi. The demo page at http://www.xfeedme.com/skimpyGimpy/guess.py/go provides a live demo of this kind of use. Below is a screen shot of the page:
Either --filename or --stdout must be specified.
| command | option | explanation |
|---|---|---|
| skimpy.py --pre WORD | | Generate preformatted text HTML for WORD |
| | --filename FILENAME | Direct output to FILENAME (eg, word.html) |
| | --stdout | Direct output to standard output. |
| | --speckle RATIO | Use speckle noise ratio RATIO (eg, 0.1) |
| | --scale NUMBER | Scale font by NUMBER (eg, 0.7 or 2.3) |
| | --color HEX6DIGITS | Use color HEX6DIGITS in format RRGGBB (eg, 77ff77 for light green or 0000aa for dark blue) |
| skimpy.py --png WORD | | Generate PNG image for WORD |
| | --filename FILENAME | Direct output to FILENAME (eg, word.png) |
| | --stdout | Direct output to standard output. |
| | --speckle RATIO | Use speckle noise ratio RATIO (eg, 0.1) |
| | --scale NUMBER | Scale font by NUMBER (eg, 0.7 or 2.3) |
| | --color HEX6DIGITS | Use color HEX6DIGITS in format RRGGBB (eg, 77ff77 for light green or 0000aa for dark blue) |
| | --fontpath PATH_TO_BDF_FONT_FILE | Use the BDF font specification located in PATH_TO_BDF_FONT_FILE (eg, /usr/local/fonts/timesRoman.bdf) |
| skimpy.py --wave WORD | | Generate Wave audio file for WORD |
| | --filename FILENAME | Direct output to FILENAME (eg, word.wav) |
| | --stdout | Direct output to standard output. |
| | --indexfile PATH_TO_WAVE_INDEX | Required for Wave generation. Use the zipped wave index located at PATH_TO_WAVE_INDEX, usually the location of the wave index file provided in the distribution (eg, ../wave/waveIndex.zip) |
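The option rules in the table above can be captured in a small helper that assembles an argument list and rejects combinations the table does not list. This is a sketch, not part of the Skimpy distribution: the function `skimpy_args` and the `ALLOWED` mapping are hypothetical, and only the flags themselves come from the table.

```python
# Options the table lists for each mode (besides --filename / --stdout).
ALLOWED = {
    "pre": ("speckle", "scale", "color"),
    "png": ("speckle", "scale", "color", "fontpath"),
    "wave": ("indexfile",),
}

def skimpy_args(mode, word, filename=None, stdout=False, **options):
    """Build an argument list for skimpy.py (hypothetical helper)."""
    if mode not in ALLOWED:
        raise ValueError("unknown mode: %r" % (mode,))
    if filename is None and not stdout:
        raise ValueError("one of --filename or --stdout must be specified")
    if mode == "wave" and "indexfile" not in options:
        raise ValueError("--indexfile is required for Wave generation")
    args = ["skimpy.py", "--" + mode, word]
    if filename is not None:
        args += ["--filename", filename]
    if stdout:
        args.append("--stdout")
    # Emit the mode's options in the order the table lists them.
    for name in ALLOWED[mode]:
        if name in options:
            args += ["--" + name, str(options.pop(name))]
    # Anything left over is not listed for this mode.
    if options:
        raise ValueError("not listed for --%s: %s"
                         % (mode, ", ".join(sorted(options))))
    return args
```

The resulting list could then be handed to `subprocess.call`, with the Python interpreter prepended if skimpy.py is not directly executable on the system in question.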
    "example of skimpyAPI usage"

    # import the API interface module
    from skimpyGimpy import skimpyAPI

    # this is the text we want to encode
    WORD = "example text"

    # HTML GENERATION:
    # these are the parameters we want to use
    # for preformatted text
    HTMLSPECKLE = 0.1
    HTMLSCALE = 0.7
    HTMLCOLOR = "001199"
    HTMLFILE = "pre/WORD.html"

    # create an HTML generator
    htmlGenerator = skimpyAPI.Pre(WORD,
                                  speckle=HTMLSPECKLE,  # optional
                                  scale=HTMLSCALE,      # optional
                                  color=HTMLCOLOR,      # optional
                                  )

    # store the preformatted text as htmlText
    htmlText = htmlGenerator.data()

    # store the preformatted text as htmlText
    # and also write text to file
    htmlText = htmlGenerator.data(HTMLFILE)

    # PNG GENERATION:
    # these are the parameters we want to use
    # for PNG image output
    PNGSPECKLE = 0.11
    PNGSCALE = 2.1
    PNGCOLOR = "00EEAA"
    PNGFONTPATH = "../fonts/radon-wide.bdf"
    PNGFILE = "png/WORD.png"

    # create a PNG generator
    pngGenerator = skimpyAPI.Png(WORD,
                                 speckle=PNGSPECKLE,   # optional
                                 scale=PNGSCALE,       # optional
                                 color=PNGCOLOR,       # optional
                                 fontpath=PNGFONTPATH  # optional
                                 )

    # store the PNG data as pngText
    pngText = pngGenerator.data()

    # store the PNG data as pngText
    # and also write PNG to file
    pngText = pngGenerator.data(PNGFILE)

    # WAVE GENERATION:
    # these are the parameters we want to use
    # for WAVE audio output
    INDEXFILEPATH = "../wave/waveIndex.zip"
    WAVEFILE = "wave/WORD.wav"

    # create a wave generator
    waveGenerator = skimpyAPI.Wave(WORD,
                                   indexFile=INDEXFILEPATH  # required!
                                   )

    # generate the wave data as waveText
    waveText = waveGenerator.data()

    # generate the wave data as waveText
    # and also save to file
    waveText = waveGenerator.data(WAVEFILE)
The data() methods are each called twice for illustrative purposes; only one call is needed.
guess.cgi implements a simple challenge/response which tests whether the web client agent can understand a word encoded by Skimpy.
skimpytest.cgi provides a web interface for generating Skimpy representations for strings: the user can type in a string and see or hear a Skimpy representation of it.
The interpolate.cgi CGI script aids in the construction of these control point representations. It is possible to modify the default choice of character mappings or to use several alternative character mappings. Since it's hard to explain, suffice it to say that if you want to change the character glyphs and can't figure out how to do it for yourself, send questions via email using the address below.
Audio character pronunciations are stored as piecewise linear approximations of input samples. It is possible to build and use alternative indices of alternative sample sets, but this document does not explain that procedure either at this time, sorry.
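To make the idea of a piecewise linear approximation concrete, here is a minimal sketch that reduces a list of samples to a few control points and rebuilds the samples by drawing straight lines between them. It says nothing about how waveIndex.zip is actually laid out: the function names and the naive rule of keeping every `step`-th sample are assumptions chosen for brevity.

```python
def approximate(samples, step):
    """Keep every `step`-th sample (plus the last one) as (index, value)
    control points. A deliberately naive way to choose the points;
    samples must be non-empty."""
    points = [(i, samples[i]) for i in range(0, len(samples), step)]
    if points[-1][0] != len(samples) - 1:
        points.append((len(samples) - 1, samples[-1]))
    return points

def reconstruct(points):
    """Rebuild a sample list by drawing straight lines between
    consecutive control points."""
    out = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        for x in range(x0, x1):
            t = (x - x0) / float(x1 - x0)
            out.append(y0 + t * (y1 - y0))
    out.append(points[-1][1])
    return out
```

A triangular wave such as `[0, 1, 2, 3, 4, 3, 2, 1, 0]` is recovered exactly from the three control points produced by `approximate(samples, 4)`; for real speech the reconstruction is only approximate, and the number of control points trades storage size against fidelity.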
