I was recently asked whether it is possible to know if there is only one character in a TeX "string," so that the style could be changed. The reason was that the text usually did not fit in a box of fixed size, so the font had to be made smaller, unless there was only one character.
This is a good example to introduce the notions of scanning in TeX and the usage of \let. We want to create a macro \countchar so that for instance \countchar{Hello world!} would set a TeX counter to 12.
In this first part, I will explain how to catch one character into a control sequence (macro) \gobblechar in TeX. In a second post I describe how to use this macro to count the number of characters.
Gobbling one character using \def
First of all, we need to be able to look at one character at a time. Let us see the different options. One possibility is to define a macro with one argument, then use the macro without {}:
\def\gobblechar#1{\def\char{#1}}
\gobblechar Hello world!. Char is [\char].
This produces the following output: "ello world!. Char is [H]." This is the expected result: without the curly brackets around the "Hello world!" sentence, the argument of the macro becomes the first non-blank token, i.e., H; it is saved in the \char control sequence and the remaining "ello world!" is printed as usual. Now suppose the text is stored in a macro:
\def\text{Hello world!}
\gobblechar \text. Char is [\char].
This produces ". Char is [Hello world!]." because TeX now takes the control sequence \text as a single token and feeds it as the argument to \gobblechar. To prevent this behavior, \text must be expanded before \gobblechar reads its argument. We achieve this with \expandafter:
\expandafter\gobblechar \text. Char is [\char].
Which again produces "ello world!. Char is [H]." There is still one problem with this macro: it cannot catch spaces:
\def\text{ Hello world!}
\expandafter\gobblechar \text. Char is [\char].
This produces "ello world!. Char is [H]." and the space before the `H' is lost. As stated above, the macro takes the first non-blank token as its argument, so the space is skipped. This is where we leave the \def solution and move to the \let construct.
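One way to double-check what \gobblechar stored, without looking at the typeset page, is to print \meaning\char on the terminal. The following is a minimal plain TeX sketch, assuming the definitions above; the use of \message and \meaning is an addition for inspection only and is not part of the original examples.

```tex
% Plain TeX sketch (run with `tex`); assumes the \def-based \gobblechar.
% Note: as in the article, this overwrites the primitive \char.
\def\gobblechar#1{\def\char{#1}}
\def\text{ Hello world!}% note the leading space

% \expandafter expands \text first, then \gobblechar grabs its argument.
% The undelimited argument skips the space token and takes `H'.
\expandafter\gobblechar\text

% \meaning shows the definition of \char on the terminal and in the log.
\message{[\meaning\char]}% shows [macro:->H]
\bye
```

The remaining "ello world!" is still typeset on the page; only the \message line goes to the terminal and the log file.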
Gobbling one character using \let
\let is a powerful TeX construct that many people do not know. While \def builds a control sequence (or macro) that expands to something else, \let creates an alias for the current meaning of something else. Compare for instance:
\def\deftext{\text}
\let\lettext=\text
\ifx\deftext\text yes\else no\fi. \ifx\lettext\text yes\else no\fi.
Which produces "no. yes." The \ifx construct can tell if two macros are "the same" (it is in fact a bit more complicated, I'll maybe write about it in the future). Using \let, it is for instance possible to swap the meanings of \a and \b by using:
\let\tmp=\a \let\a=\b \let\b=\tmp
And it is possible to \let a control sequence to a character, for instance,
\let\a=a
This is \a\ sentence th\a t h\a s some \a's.
produces "This is a sentence that has some a's." So, let us now redefine our \gobblechar as follows:
\def\gobblechar{\let\char=}
\expandafter\gobblechar \text. Char is [\char].
This produces again "ello world!. Char is [H]." Indeed, TeX will expand the beginning to
\let\char= Hello world!
and the definition of \let states that there can be one optional space between the equal sign and the token, so the space is skipped and \char becomes an alias for `H'. Let us modify the macro to put a space after the equal sign:
\def\gobblechar{\let\char= }
\expandafter\gobblechar \text. Char is [\char].
This now produces "Hello world!. Char is [ ]." with the leading space correctly caught in \char: the space inside the macro is used up as the optional space, and the space from \text becomes the token. Note that simply writing
\let\char=  Hello world!. Char is [\char].
would not work, since TeX would directly convert the two spaces between `=' and `H' to a single space token, skip it as the optional space, and then make \char an alias for `H'.
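The difference between the two \let-based definitions can be checked by printing \meaning\char on the terminal. This is a sketch assuming plain TeX and a \text that starts with a space, as above; \message and \meaning are used here only for inspection and are not part of the original examples.

```tex
% Plain TeX sketch (run with `tex`); \text starts with a space token.
\def\text{ Hello world!}

% Version 1: no space after `='. The space from \text is skipped as
% the one optional space, so \char becomes an alias for `H'.
\def\gobblechar{\let\char=}
\expandafter\gobblechar\text
\message{[\meaning\char]}% reports: the letter H

% Version 2: a space after `='. That space is the optional one, so the
% space from \text is the token that \char is aliased to.
\def\gobblechar{\let\char= }
\expandafter\gobblechar\text
\message{[\meaning\char]}% reports: blank space
\bye
```

In the second case the whole "Hello world!" is typeset, because only the leading space was taken from \text.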
Hello, this post is a couple of years old, but very useful.
Thank you very much.
Now, I have a question:
I tested all your commands here, and they all worked the way you described, both here and in Part 2. But I cannot understand why this passage:
\def\gobblechar{\let\char= }
\expandafter\gobblechar \text. Char is [\char].
and the following:
\let\char=  Hello world!. Char is [\char].
do not work the same way. I noticed that in the second one you added two spaces. But isn't this passage the expansion of the first?
Surely there is a subtle difference, but I cannot see it... Could you clarify this?
The way TeX does expansion is *not* the same as, say, macro replacement in C, where the name is just replaced with the body.
In TeX, something already happens to the body of the control sequence during the definition: in particular, it is converted to tokens.
During the conversion to tokens, one or more consecutive space *characters* are converted to a single space *token*. But consecutive space tokens are never merged together. Consider for instance the following:
\def\sp{ }
Sentence with \sp double space.
There will be two spaces between "with" and "double": the space character after "with" becomes one space token, and \sp expands to a second one. (The space after `\sp' is ignored because it follows a control sequence, which is normal behaviour.)
So, in your second example, between the '=' and the 'H', there are two space characters that are converted to one space token, which is then skipped by the \let.
In the first example, between the '=' and the 'H', there is one space token (from the \gobblechar), one ignored space character (because it is after the control sequence), and one other space token (from the beginning of \text). In the end, there are two space tokens, one of them is ignored by the \let, and the other is aliased to \char...
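To see the difference between space characters and space tokens directly, one can look at the body of a macro with \meaning. This is a minimal plain TeX sketch; the names \onespace, \typed and \built are hypothetical and chosen only for this illustration.

```tex
% Plain TeX sketch; \onespace, \typed and \built are hypothetical names.
\def\onespace{ }% body is a single space token

% Two space characters in the source become ONE space token.
\def\typed{a  b}
\message{[\meaning\typed]}% shows [macro:->a b]

% One typed space plus one space token coming from \onespace stay as
% TWO tokens; \edef expands \onespace while building the body.
\edef\built{a \onespace b}
\message{[\meaning\built]}% shows [macro:->a  b]
\bye
```

The second macro is the same situation as the \gobblechar example: one space token comes from a macro body, the other from the text, and they are never merged.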
If you are confused about the conversion to tokens, remember this is the reason why you can define the following:
\makeatletter
\def\mytmp{\another@tmp}
\makeatother
And then use `\mytmp' in your text even though you cannot use `\another@tmp' directly.