Wednesday, December 28, 2005

Re: st: Macro substitution in Mata

Partha Deb <partha.deb@hunter.cuny.edu> said thanks to Kit and me and, in the process, asked another question about macros and the use of Mata.

The general rule is that macros and Mata do not go together, or at least do not go together in the ado way. Let me explain what I mean. Let us consider an ado-file with Mata code in it:

    ------------------------- myfile.ado ---
    program myfile          ---------------
            ...                   |  macros okay here
            mata: myfunc(`...')   |
            ...                   |  Even okay in
    end                           |  mata: ...
                                  |  statements
    program mysub                 |
            ...                   |
    end                           |
                            ---------------
    mata:                         |
    function myfunc(...)          |  macros *NOT*
    {                             |  okay here
            ...                   |
            myfuncsub(...)        |  (They work
            ...                   |  differently
    }                             |  from how you
                                  |  expect)
    function myfuncsub(...)       |
    {                             |
            ...                   |
    }                             |
                                  |
    end                     ----------------
    ------------------------- myfile.ado ---
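To make the layout concrete, here is a minimal sketch of such a file. The argument name, the body of myfunc(), and the use of st_numscalar() are assumptions made for illustration; they are not from Partha's code. The macro appears only in the ado part, on the "mata:" statement, and reaches the Mata function as an ordinary string argument.

```stata
*! myfile.ado -- hypothetical contents, for illustration only
program define myfile
        version 9
        args name
        // The macro is expanded here, in ado code, each time myfile runs.
        mata: myfunc("`name'")
end

mata:
// No macros below this point: the name arrives as an argument.
void myfunc(string scalar name)
{
        // Add 1 to the Stata scalar whose name was passed in.
        st_numscalar(name, st_numscalar(name) + 1)
}
end
```

After "scalar bill = 0", typing "myfile bill" should leave bill equal to 1, and "myfile fred" would act on fred instead, because the name is looked up at execution time.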

What you must remember is that ado code is interpreted whereas Mata code is compiled. In interpreted code, the computer looks at a string such as "x=x+1" when it needs to execute it, and says to itself:

"Oh, equals sign; that means assignment. Okay, to the left of the equals sign should appear a name: x must be the name. I'm to assign to x. What am I to assign to x? "x+1". I see a plus sign, so evidently the user wants me to assign "x" plus "1" to x. To the left of a plus sign can appear a name or number. "x" looks like a name. Can I find something named x? Great. To the right of the plus sign can appear a name or number. "1" looks like a number. Great, it is a number! Now, let's find the current value of x, add 1.0 to it, and store the result in x."

The important part is that the computer does this when it needs to execute the code. If the line "x=x+1" appears in a loop, the computer will go through the above reasoning each and every time. Fortunately, computers do not get bored.

In compiled code, the same reasoning as above is broken into two parts. The first part is the understanding:

"Oh, equals sign; that means assignment. Okay, to the left of the equals sign ..."

Rather than actually performing the request as it is interpreted, however, the computer makes terse notes on what it is to do when the time comes to execute the statement. The computer could even make those notes in the language understood by the computer itself (machine language), or it could make them in a language that will be easy for the computer to understand later, but will require a little interpretation (p-code). Either way, the notes made read, in effect,

Get the 8-byte value at address 0x489a8, increment, and re-store it.

The actual note might read

21489a82a32489a8

The accumulation of all the notes the compiler makes is called the compiled version of the program, or the object code. What's neat about notes like "21489a82a32489a8" is that they can be executed quickly, and that is what makes Mata so fast relative to ado-files: the understanding of what is to be done is performed only once rather than over and over again.

Now let's tie this into macros. Consider a statement like

`x'=`x'+1

Let's start with ado-files. Say the first time we execute the statement, macro `x' contains "bill". Then the statement reads

bill=bill+1

and off the interpreter goes with its interpretation:

"Oh, equals sign; that means assignment. Okay, to the left of the equals sign should appear a name: bill must be the name. I'm to assign to bill. What am I to assign to bill? "bill+1". I see a plus sign, so evidently the user wants me to assign "bill" plus "1" to bill. To the left of a plus sign can appear a name or number. "bill" looks like a name. Can I find something named bill? Great. To the right of the plus sign can appear a name or number. "1" looks like a number. Great, it is a number! Now, let's find the current value of bill, add 1.0 to it, and store the result in bill."

Say that later on, the line is reexecuted, but this time, macro `x' contains "fred". Then:

"Oh, equals sign; that means assignment. Okay, to the left of the equals sign should appear a name: fred must be the name. I'm to assign to fred. What am I to assign to fred? "fred+1". I see a plus sign, so evidently the user wants me to assign "fred" plus "1" to fred. To the left of a plus sign can appear a name or number. "fred" looks like a name. Can I find something named fred? Great. To the right of the plus sign can appear a name or number. "1" looks like a number. Great, it is a number! Now, let's find the current value of fred, add 1.0 to it, and store the result in fred."

Now let's consider what happens in a compiler. Let us assume that at compile time, that is, when the ado-file is loaded and before the program is actually executed, macro `x' contains "junk". The compiler goes through its interpretation logic and translates the statement to

Get the 8-byte value at address 0x23b70, increment, and re-store it.

and it stores this note as

2123b702a3223b70

Now comes execution time. Just as before, we will assume that the first time the statement is executed, macro `x' contains "bill". Nonetheless, the note reads 2123b702a3223b70, and that increments junk. The second time, macro `x' might contain "fred", but it still doesn't matter, because the note is unchanged, and junk is incremented again.
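The contrast can be tried in a short do-file. This is only a sketch: the names incr and bump() and the use of Stata scalars with st_numscalar() are assumptions chosen to mirror the bill/fred/junk story above, not code from the original question.

```stata
* Hypothetical do-file; incr, bump(), junk, bill, and fred are invented names.
capture program drop incr
capture mata: mata drop bump()

program define incr
        version 9
        args x
        // Ado: the macro is expanded every time this line is executed.
        scalar `x' = `x' + 1
end

scalar junk = 0
scalar bill = 0
scalar fred = 10

incr bill                       // executes: scalar bill = bill + 1
incr fred                       // executes: scalar fred = fred + 1

local x junk                    // the macro's contents at compile time
mata:
void bump()
{
        // Compiled once, as st_numscalar("junk", st_numscalar("junk") + 1)
        st_numscalar("`x'", st_numscalar("`x'") + 1)
}
end

local x bill
mata: bump()                    // increments junk, not bill
local x fred
mata: bump()                    // increments junk again

display scalar(bill)            // 1
display scalar(fred)            // 11
display scalar(junk)            // 2
```

The ado program follows whatever the macro contains when the line runs; the Mata function keeps acting on junk because the substitution was made once, when bump() was compiled.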

If you use macros in your Mata code, it is the value of the macro at compile time that matters, not the value at execution time. So you cannot use macros the same way in Mata code as in ado code, and my recommendation is simply not to use them even though, sometimes, in advanced Mata code, you will see a programmer violate that rule. In those cases, it is done with full understanding of the compile-time interpretation. Let's say the programmer is writing a routine that will update the file "stata.trk" and, the way things work out, the filename will appear three times in the final code. The name is fixed today, but it might change sometime in the future. Rather than leaving a note in a comment at the top of the file that says, "If the name stata.trk ever changes, make sure you change all three occurrences below," the programmer might introduce a macro which is defined at compile time to contain "stata.trk". This way, there would be only one place to change, and that one place advertises the filename's fixedness.
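A sketch of that deliberate compile-time use might look like the following. The macro name trkfile, the function trk_note(), and what it does with the file (it merely checks for it with fileexists()) are assumptions for illustration; a real routine would update the file.

```stata
* Hypothetical fragment: trkfile and trk_note() are invented names.
capture mata: mata drop trk_note()

local trkfile "stata.trk"       // the one place to change the filename

mata:
void trk_note()
{
        // Each use of the macro below is compiled as the literal "stata.trk".
        printf("checking %s\n", "`trkfile'")
        if (fileexists("`trkfile'")) printf("found\n")
        else                         printf("%s is missing\n", "`trkfile'")
}
end

mata: trk_note()
```

Changing the local after the "end" line would have no effect on trk_note(), which is exactly the fixedness being advertised.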

I hope this helps Partha. I did not fully understand Partha's question, so instead I have tried to supply the ingredients so that he could answer it himself. Perhaps I went in the wrong direction.

-- Bill wgould@stata.com


