Showing posts with label expandafter. Show all posts

Thursday, March 10, 2011

Counting the number of characters in TeX

I was recently asked whether it is possible to know if a TeX "string" contains only one character, so that its style could be changed accordingly. The text usually did not fit in a box of fixed size, so the font had to be made smaller, unless there was only one character.
This is a good example to introduce the notions of scanning in TeX and the usage of \let. We want to create a macro \countchar so that for instance \countchar{Hello world!} would set a TeX counter to 12.
In this first part, I will explain how to catch one character with a control sequence (macro) \gobblechar in TeX. In a second post I describe how to use this macro to count the number of characters.

Gobbling one character using \def


First of all, we need to be able to look at one character at a time. Let us see the different options. One possibility is to define a macro with one argument, then use the macro without {}:
\def\gobblechar#1{\def\char{#1}}
\gobblechar Hello world!. Char is [\char].

This produces the following output: "ello world!. Char is [H]." This is the expected result: without curly brackets around the "Hello world!" sentence, the argument of the macro is the first non-blank token, i.e., H. It is saved in the \char control sequence and the remaining "ello world!" is printed as usual.
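To see why the braces matter, here is a minimal plain TeX sketch that calls the same macro with and without them. The name \firstchar is a hypothetical substitute for \char, used here only because \char is also a TeX primitive and overwriting it can break other macros; it is not part of the code above.

```tex
% Plain TeX sketch; compile with `tex` or `pdftex`.
% \firstchar is a hypothetical name used instead of \char (a TeX primitive).
\def\gobblechar#1{\def\firstchar{#1}}

% Braced argument: the whole group "Hello" is grabbed as #1.
\gobblechar{Hello} world!. Grabbed: [\firstchar].

% Unbraced: only the first non-blank token, H, is grabbed as #1.
\gobblechar Hello world!. Grabbed: [\firstchar].
\bye
```

With braces, the whole word ends up in the macro; without them, only its first letter does, which is the behavior this post relies on.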
    Let us try now with a control sequence as argument:
    \def\text{Hello world!}
    \gobblechar \text. Char is [\char].

    This produces ". Char is [Hello world!]." because TeX now takes the macro \text as a single token and feeds it as the argument to \gobblechar. To prevent this, \text must be expanded before \gobblechar reads its argument. This is what \expandafter does: it expands the token after the next one first.
    \expandafter\gobblechar \text. Char is [\char].
    This again produces "ello world!. Char is [H]." There is still one problem with this macro: it cannot catch spaces.
    \def\text{ Hello world!}
    
    \expandafter\gobblechar \text. Char is [\char].
    This produces "ello world!. Char is [H]." and the space before the `H' is lost. As stated above, a macro with an undelimited argument looks for the first non-blank token. This is where we leave the \def solution and go with the \let construct.
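The three \def-based cases can be checked without looking at the typeset page by writing the meaning of the saved macro to the terminal and log. This is only a sketch: \firstchar is a hypothetical stand-in for \char, and the expected messages in the comments are what \meaning should report for a macro.

```tex
% Plain TeX sketch; \firstchar is a hypothetical stand-in for \char.
\def\gobblechar#1{\def\firstchar{#1}}
\def\text{ Hello world!}  % note the leading space

% Without \expandafter: the single token \text is the argument.
\gobblechar \text
\message{[\meaning\firstchar]}  % expected to log something like macro:->\text

% With \expandafter: \text is expanded first, then the leading space
% is skipped while TeX looks for the undelimited argument.
\expandafter\gobblechar \text
\message{[\meaning\firstchar]}  % expected to log something like macro:->H
\bye
```

In the second call, the remaining "ello world!" is still typeset as ordinary text, as described above.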

    Gobbling one character using \let

    \let is a powerful TeX construct that many people do not know about. While \def builds a control sequence (or macro) that expands to something else, \let creates an alias for something else by copying its current meaning. Compare for instance:
    \def\deftext{\text}
    \let\lettext=\text
    \ifx\deftext\text yes\else no\fi.
    \ifx\lettext\text yes\else no\fi.
    
    This produces "no. yes." The \ifx construct can tell whether two macros are "the same" (it is in fact a bit more complicated; I may write about it in the future). Using \let, it is for instance possible to swap the meanings of \a and \b:
    \let\tmp=\a \let\a=\b \let\b=\tmp
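The swap works because \let copies the meaning a token has at that moment rather than creating a live link to it. A small plain TeX sketch of the difference from \def follows; the names \original, \viadef and \vialet are hypothetical and chosen only for this illustration.

```tex
% Plain TeX sketch; \original, \viadef and \vialet are hypothetical names.
\def\original{first}
\def\viadef{\original}   % expands to whatever \original means when used
\let\vialet=\original    % takes the meaning \original has right now

\def\original{second}    % redefine after the fact

\viadef, \vialet.        % typesets "second, first."
\bye
```

The macro built with \def follows the redefinition, while the one built with \let keeps the earlier meaning.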
    It is also possible to \let a control sequence to a character, for instance:
    \let\a=a
    This is \a\ sentence th\a t h\a s some \a's.
    
    produces "This is a sentence that has some a's." So, let us now redefine our \gobblechar as follows:
    \def\gobblechar{\let\char=}
    \expandafter\gobblechar \text. Char is [\char].
    
    This again produces "ello world!. Char is [H]." Indeed, TeX expands the beginning to \let\char= Hello world!, and the definition of \let allows one optional space between the equal sign and the token. The leading space is therefore skipped and \char is assigned `H'. Let us modify the macro to:
    \def\gobblechar{\let\char= }
    \expandafter\gobblechar \text. Char is [\char].
    
    This now produces "Hello world!. Char is [ ]." with the leading space correctly caught in \char. Note that simply writing
    \let\char=  Hello world!. Char is [\char].
    
    would not work, since TeX collapses the two spaces between `=' and `H' into a single space when it reads the line, skips that space as the optional one, and then assigns `H' to \char.
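Both \let-based variants can be compared side by side by logging what was caught. This is a sketch under the same assumption as before: \firstchar is a hypothetical stand-in for \char, used to avoid overwriting the TeX primitive of that name.

```tex
% Plain TeX sketch; \firstchar is a hypothetical stand-in for \char.
\def\text{ Hello world!}          % leading space is part of the text

% Version 1: the space from \text is absorbed as the optional space after `='.
\def\gobblechar{\let\firstchar=}
\expandafter\gobblechar \text
\message{[\meaning\firstchar]}    % expected: the letter H

% Version 2: the macro supplies the optional space itself,
% so the space from \text becomes the token that is assigned.
\def\gobblechar{\let\firstchar= }
\expandafter\gobblechar \text
\message{[\meaning\firstchar]}    % expected: blank space
\bye
```

Because \let makes \firstchar an implicit character token rather than a macro, \meaning describes it as a letter or a blank space instead of printing a macro body.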